Unlocking Memory for ChatGPT API: A Comprehensive Guide
Chapter 1: Understanding the Memory Issue
If you've interacted with the OpenAI API, you might have encountered a significant limitation: the model lacks memory of prior requests. Each API call operates as an isolated event, making it challenging for applications like chatbots that rely on context for follow-up queries.
To address this, we will investigate methods to enable ChatGPT to recall previous exchanges when utilizing the OpenAI API.
Warm-Up Interaction
Let's initiate some interactions to illustrate this memory gap:
prompt = "My name is Andrea"
response = chatgpt_call(prompt)
print(response)
# Output: Nice to meet you, Andrea! How can I assist you today?
Now, let’s pose a follow-up question:
prompt = "Do you remember my name?"
response = chatgpt_call(prompt)
print(response)
# Output: I'm sorry, as an AI language model, I don't have the ability to remember specific information about individual users.
As demonstrated, the model fails to retain my name from the initial interaction. Note that chatgpt_call() is simply a wrapper for the OpenAI API. For more details on initiating calls to GPT models, check out "ChatGPT API Calls: A Gentle Introduction."
Many users work around this memory issue by resending the entire conversation history with each new API call. This works, but the payload grows with every exchange, so it becomes increasingly costly and eventually runs into the model's token limit.
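A minimal sketch of that workaround in plain Python: `call_fn` below is a hypothetical stand-in for an API wrapper that accepts a list of chat messages (such as a variant of the chatgpt_call() wrapper mentioned above), stubbed here so the example runs without an API key:

```python
# A sketch of the manual workaround: resend the ENTIRE history with every
# request. `call_fn` stands in for an API wrapper that accepts a list of
# chat messages.

def make_conversation(call_fn):
    history = []

    def ask(user_prompt):
        history.append({"role": "user", "content": user_prompt})
        reply = call_fn(history)  # the full history goes out each time
        history.append({"role": "assistant", "content": reply})
        return reply

    return ask, history

# Stub "model" that just reports how many messages it was sent:
ask, history = make_conversation(lambda msgs: f"(saw {len(msgs)} messages)")
ask("My name is Andrea")
print(ask("Do you remember my name?"))  # → "(saw 3 messages)"
```

Note how the second call already carries three messages; the request size keeps growing with every exchange, which is exactly why this approach gets expensive.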
To implement memory for ChatGPT, we'll leverage the widely-used LangChain framework, which simplifies the management of conversation history and allows you to select the appropriate memory type for your specific needs.
Chapter 2: Exploring the LangChain Framework
LangChain is designed to aid developers in creating applications that harness the capabilities of Large Language Models (LLMs).
According to their GitHub page:
> Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app — the real power comes when you can combine them with other sources of computation or knowledge.
This library aims to facilitate the development of such applications.
Framework Setup
Setting up the LangChain library in Python is straightforward. Like any other Python library, it can be installed via pip:
pip install langchain
LangChain operates by calling the OpenAI API in the background. Thus, you must configure your OpenAI API key as an environment variable named OPENAI_API_KEY. If you need assistance in acquiring your API key, refer to "A Step-by-Step Guide to Getting Your API Key."
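For example, the key can be exported in your shell, or set from within the script itself before any LangChain import is used ("sk-..." below is a placeholder for your actual key):

```python
import os

# Set the key for the current process only; "sk-..." is a placeholder
# for your actual OpenAI API key.
os.environ["OPENAI_API_KEY"] = "sk-..."
```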
Basic API Calls with LangChain
Now, let's execute a basic API call to ChatGPT using LangChain:
from langchain.llms import OpenAI
chatgpt = OpenAI()
Once the desired model is loaded, we can initiate a conversation chain:
from langchain.chains import ConversationChain
conversation = ConversationChain(llm=chatgpt, verbose=True)
Setting verbose=True lets us observe the full prompt, including any stored history, that LangChain sends to the model on each call. The .predict() method allows us to send prompts and receive responses:
conversation.predict(input="Hello, we are ForCode'Sake! A Medium publication with the objective of democratizing the knowledge of data!")
# Output: "Hi there! It's great to meet you. I'm an AI that specializes in data analysis. I'm excited to hear more about your mission in democratizing data knowledge. What inspired you to do this?"
Now, let's ask a follow-up:
conversation.predict(input="Do you remember our name?")
This time the model recalls our name, because LangChain feeds the previous exchanges back into the prompt automatically. The chain handles follow-up queries with no manual bookkeeping on our side.
Chapter 3: Memory Types in LangChain
LangChain's conversation chains automatically track the .predict() calls made during a conversation. By default, however, they retain every interaction, so each new prompt carries the full history and consumes more and more tokens. Since ChatGPT models also impose a token limit per request, this default can become costly over time, and very long conversations may eventually stop fitting altogether.
To address these challenges, LangChain offers various memory types:
1. Complete Interactions
While the default is to remember all interactions, you can explicitly set this with the ConversationBufferMemory, which retains a complete log of prior exchanges:
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
With this memory type, you can inspect the buffer's contents at any time with memory.buffer or add extra context without interacting with the model:
memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
If you don't require direct manipulation of the buffer, the default memory may suffice, although explicit declaration is advisable for debugging.
2. Window of Interactions
An alternative approach is to store only a limited number of recent interactions (k) with the model using the ConversationBufferWindowMemory:
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=1)
With k=1, only the single most recent exchange is kept: each new interaction pushes the oldest one out of the buffer.
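The behavior can be illustrated in plain Python (this is a sketch of the idea, not LangChain's actual implementation): a fixed-size deque keeps the last k exchanges and silently drops older ones.

```python
from collections import deque

# Plain-Python illustration of what a window memory does:
# keep only the k most recent exchanges.
class WindowMemory:
    def __init__(self, k):
        self.exchanges = deque(maxlen=k)  # deque drops the oldest beyond k

    def save_context(self, user_msg, ai_msg):
        self.exchanges.append((user_msg, ai_msg))

    def buffer(self):
        return list(self.exchanges)

memory = WindowMemory(k=1)
memory.save_context("My name is Andrea", "Nice to meet you, Andrea!")
memory.save_context("Not much, just hanging", "Cool")
print(memory.buffer())  # → [('Not much, just hanging', 'Cool')]
```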
3. Summary of Interactions
For applications with long conversations where older context still matters, such as customer support bots, a more efficient memory type is the ConversationSummaryBufferMemory. It keeps recent exchanges verbatim and summarizes older ones, reducing token usage without discarding important information.
from langchain.memory import ConversationSummaryBufferMemory
memory = ConversationSummaryBufferMemory(llm=chatgpt, max_token_limit=100)
Exchanges that no longer fit within max_token_limit are summarized by the model itself (hence the llm argument), so older context survives in compressed form while the most recent turns stay verbatim.
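The underlying idea can be sketched in plain Python (this is an illustration of the concept, not LangChain's implementation): once the raw buffer exceeds a token budget, the oldest exchanges are folded into a running summary. Here `summarize` is a hypothetical stand-in for the LLM summarization call, and a word count stands in for real token counting.

```python
# Illustration of the summary-buffer idea: when the buffer exceeds a
# token budget, fold the oldest exchanges into a running summary.

def words(text):
    return len(text.split())  # crude proxy for a real tokenizer

def enforce_budget(exchanges, summary, summarize, max_tokens=100):
    while exchanges and sum(words(u) + words(a) for u, a in exchanges) > max_tokens:
        u, a = exchanges.pop(0)             # oldest exchange first
        summary = summarize(summary, u, a)  # compress it into the summary
    return exchanges, summary

# Stub summarizer that just notes what it absorbed:
stub = lambda s, u, a: s + f" [summarized: {u[:20]}...]"
exchanges = [("tell me about " + "data " * 30, "sure " * 30),
             ("thanks", "welcome")]
exchanges, summary = enforce_budget(exchanges, "", stub, max_tokens=40)
print(len(exchanges), summary)
```

After the call, the long opening exchange has been collapsed into the summary while the short recent one remains verbatim, which is the trade-off this memory type makes.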
Conclusion
In this article, we explored various methods to implement memory in ChatGPT applications, utilizing the LangChain framework. By moving beyond simple API calls to incorporate memory, we can enhance the user experience and facilitate seamless follow-up interactions.
I encourage you to evaluate your average conversation length and compare token usage with the summary memory type to optimize performance and costs.
LangChain offers a wealth of functionality for GPT models. Have you discovered any other useful features?
Thank you for reading! I hope this guide assists you in developing ChatGPT applications.
You might also be interested in my newsletter for updates on new content, especially regarding ChatGPT:
- ChatGPT Moderation API: Input/Output Control
- Unleashing the ChatGPT Tokenizer
- Mastering ChatGPT: Effective Summarization with LLMs
- What ChatGPT Knows about You: OpenAI’s Journey Towards Data Privacy