johnburnsonline.com

Unlocking Memory for ChatGPT API: A Comprehensive Guide

Chapter 1: Understanding the Memory Issue

If you've interacted with the OpenAI API, you might have encountered a significant limitation: the model lacks memory of prior requests. Each API call operates as an isolated event, making it challenging for applications like chatbots that rely on context for follow-up queries.

To address this, we will investigate methods to enable ChatGPT to recall previous exchanges when utilizing the OpenAI API.

Warm-Up Interaction

Let's initiate some interactions to illustrate this memory gap:

prompt = "My name is Andrea"
response = chatgpt_call(prompt)
print(response)

# Output: Nice to meet you, Andrea! How can I assist you today?

Now, let’s pose a follow-up question:

prompt = "Do you remember my name?"
response = chatgpt_call(prompt)
print(response)

# Output: I'm sorry, as an AI language model, I don't have the ability to remember specific information about individual users.

As demonstrated, the model fails to retain my name from the initial interaction. Note that chatgpt_call() is simply a wrapper for the OpenAI API. For more details on initiating calls to GPT models, check out "ChatGPT API Calls: A Gentle Introduction."

Many users work around this by sending the full conversation history with each new API call. However, this approach grows more expensive with every turn, and long conversations eventually run into the model's context-length limit.
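A minimal sketch of that workaround: the entire message history is re-sent on every turn. Here send_to_model() is a hypothetical stand-in for the real API call (e.g. via the openai client), so the flow is runnable without a key:

```python
# Sketch of the manual workaround: resend the whole history on each call.
# send_to_model() is a placeholder for a real chat-completions API call.

def send_to_model(messages):
    # Placeholder: a real implementation would call the OpenAI API here.
    return f"(model saw {len(messages)} messages)"

history = []

def chat(user_input):
    history.append({"role": "user", "content": user_input})
    reply = send_to_model(history)  # the entire history goes out each time
    history.append({"role": "assistant", "content": reply})
    return reply

chat("My name is Andrea")
chat("Do you remember my name?")  # the model now *sees* the first exchange
```

The cost problem is visible in the sketch: every new call pays for the whole transcript again.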

To implement memory for ChatGPT, we'll leverage the widely-used LangChain framework, which simplifies the management of conversation history and allows you to select the appropriate memory type for your specific needs.

Chapter 2: Exploring the LangChain Framework

LangChain is designed to aid developers in creating applications that harness the capabilities of Large Language Models (LLMs).

LangChain Framework Overview

According to their GitHub page:

> Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app — the real power comes when you can combine them with other sources of computation or knowledge.

This library aims to facilitate the development of such applications.

Framework Setup

Setting up the LangChain library in Python is straightforward. Like any other Python library, it can be installed via pip:

pip install langchain

LangChain operates by calling the OpenAI API in the background. Thus, you must configure your OpenAI API key as an environment variable named OPENAI_API_KEY. If you need assistance in acquiring your API key, refer to "A Step-by-Step Guide to Getting Your API Key."
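For example, on macOS or Linux the key can be exported in the shell before launching your script (the value below is a placeholder, not a real key):

```shell
# Placeholder value -- substitute your own key from the OpenAI dashboard.
export OPENAI_API_KEY="sk-your-key-here"

# Confirm the variable is set for child processes:
echo "key set: ${OPENAI_API_KEY:+yes}"
```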

Basic API Calls with LangChain

Now, let's execute a basic API call to ChatGPT using LangChain:

from langchain.llms import OpenAI

chatgpt = OpenAI()

Once the desired model is loaded, we can initiate a conversation chain:

from langchain.chains import ConversationChain

conversation = ConversationChain(llm=chatgpt)

We can pass verbose=True when creating the chain to observe the full prompt LangChain assembles and sends to the model. The .predict() method sends a prompt and returns the model's response:

conversation.predict(input="Hello, we are ForCode'Sake! A Medium publication with the objective of democratizing the knowledge of data!")

# Output: "Hi there! It's great to meet you. I'm an AI that specializes in data analysis. I'm excited to hear more about your mission in democratizing data knowledge. What inspired you to do this?"

Now, let's ask a follow-up:

conversation.predict(input="Do you remember our name?")

This time the model handles the follow-up correctly: LangChain resends the conversation history behind the scenes, so it remembers who we are.

Chapter 3: Memory Types in LangChain

LangChain's conversation chains automatically track the .predict() calls made during a conversation. However, the default behavior retains every interaction, so each new prompt carries the entire history and consumes more tokens as the conversation grows. Since ChatGPT has a context-window limit per request, this can become costly over time, and long conversations can eventually exceed the limit.
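A rough back-of-the-envelope count shows why this gets expensive: if every turn adds about 50 tokens and the full history is resent each time, the cumulative tokens sent grow quadratically with the number of turns (the per-turn figure here is purely illustrative):

```python
# Illustrative token arithmetic for resending the full history each turn.
TOKENS_PER_TURN = 50  # assumed average; real counts vary by message

def cumulative_tokens(num_turns):
    # Turn i resends all i-1 previous turns plus the new one.
    return sum(TOKENS_PER_TURN * i for i in range(1, num_turns + 1))

print(cumulative_tokens(10))   # 2750 tokens billed across 10 turns
print(cumulative_tokens(100))  # 252500 tokens billed across 100 turns
```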

To address these challenges, LangChain offers various memory types:

1. Complete Interactions

While the default is to remember all interactions, you can explicitly set this with the ConversationBufferMemory, which retains a complete log of prior exchanges:

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
conversation = ConversationChain(llm=chatgpt, memory=memory)

With this memory type, you can inspect the buffer's contents at any time with memory.buffer or add extra context without interacting with the model:

memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})

If you don't require direct manipulation of the buffer, the default memory may suffice, although explicit declaration is advisable for debugging.
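Conceptually, this memory type is just a growing transcript. A plain-Python analogue of what the buffer holds (a conceptual sketch, not LangChain's actual implementation, which adds prompt formatting and token accounting) looks like this:

```python
# Plain-Python analogue of a full conversation buffer (conceptual only).

class FullBuffer:
    def __init__(self):
        self.exchanges = []  # every (human, ai) pair is kept forever

    def save_context(self, human, ai):
        self.exchanges.append((human, ai))

    @property
    def buffer(self):
        # Render the transcript the way a chat prompt would see it.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.exchanges)

mem = FullBuffer()
mem.save_context("Hi", "What's up")
mem.save_context("Not much, just hanging", "Cool")
print(mem.buffer)
```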

2. Window of Interactions

An alternative approach is to store only a limited number of recent interactions (k) with the model using the ConversationBufferWindowMemory:

from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)

With k=1, only the single most recent exchange is retained; older interactions are discarded as new ones arrive.
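The behavior can be mimicked with a fixed-size deque: once more than k exchanges arrive, the oldest ones silently drop off. Again, this is a conceptual sketch, not LangChain's implementation:

```python
from collections import deque

# Conceptual sketch of a window memory: keep only the last k exchanges.
class WindowBuffer:
    def __init__(self, k):
        self.exchanges = deque(maxlen=k)  # deque discards the oldest entry

    def save_context(self, human, ai):
        self.exchanges.append((human, ai))

mem = WindowBuffer(k=1)
mem.save_context("My name is Andrea", "Nice to meet you, Andrea!")
mem.save_context("What's the weather?", "I can't check live weather.")
print(list(mem.exchanges))  # only the second exchange survives
```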

3. Summary of Interactions

For applications that need long-running context without resending every exchange verbatim, such as customer support bots, a more efficient option is the ConversationSummaryBufferMemory. It keeps recent exchanges verbatim and summarizes older ones, reducing token usage without discarding important context.

from langchain.memory import ConversationSummaryBufferMemory

memory = ConversationSummaryBufferMemory(llm=chatgpt, max_token_limit=100)

This memory type allows you to keep track of significant details while minimizing irrelevant data.
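The underlying idea can be sketched in plain Python: recent exchanges are kept verbatim, and once a token budget is exceeded, the overflow is compressed by a summarizer. The summarize() function below is a stub; LangChain's real class prompts the LLM itself to write the summary:

```python
# Conceptual sketch of a summary-buffer memory (not LangChain's code).

def summarize(old_summary, exchanges):
    # Stub: a real summarizer would ask the LLM to condense the content.
    return f"{old_summary} [+{len(exchanges)} older exchange(s) summarized]"

class SummaryBuffer:
    def __init__(self, max_token_limit):
        self.max_tokens = max_token_limit
        self.summary = ""
        self.recent = []

    def save_context(self, human, ai):
        self.recent.append((human, ai))
        # Crude token estimate: whitespace-separated words.
        while sum(len(f"{h} {a}".split()) for h, a in self.recent) > self.max_tokens:
            self.summary = summarize(self.summary, [self.recent.pop(0)])

mem = SummaryBuffer(max_token_limit=10)
mem.save_context("Hi, I'm Andrea and I love hiking in the Alps", "Great to meet you!")
mem.save_context("Any trail tips?", "Try the Tour du Mont Blanc.")
print(mem.summary)  # older exchange has been folded into the summary
print(mem.recent)   # the short recent exchange is kept verbatim
```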

Conclusion

In this article, we explored various methods to implement memory in ChatGPT applications, utilizing the LangChain framework. By moving beyond simple API calls to incorporate memory, we can enhance the user experience and facilitate seamless follow-up interactions.

I encourage you to evaluate your average conversation length and compare token usage with the summary memory type to optimize performance and costs.

LangChain offers a wealth of functionality for GPT models. Have you discovered any other useful features?

Thank you for reading! I hope this guide assists you in developing ChatGPT applications.

You might also be interested in my newsletter for updates on new content, especially regarding ChatGPT:

  • ChatGPT Moderation API: Input/Output Control
  • Unleashing the ChatGPT Tokenizer
  • Mastering ChatGPT: Effective Summarization with LLMs
  • What ChatGPT Knows about You: OpenAI’s Journey Towards Data Privacy
