Building an SEC 10-K Analyst with Mem0: Leveraging Memory Layers for Enhanced Financial Analysis
Code clinic - Reading SEC filings with a dash of waste of money
After the launch of OpenAI’s new model (“o1”), I almost forgot that I wanted to cover memory layers this month. Thankfully, I keep a tracking document that helps me remember what I am planning to write about. With that in mind, I am beating all current-gen LLMs, be it the new OpenAI model, Claude, or Qwen2.5, since none of them has memory out of the box. ( ̄~ ̄;)
Well, that’s not entirely accurate. Within a session, the front-ends for ChatGPT or Claude usually keep the history of the conversation. However, perhaps for privacy reasons, there is not much persistent storage of preferences and other personally identifying information across conversations.
But details like the user’s name, dietary preferences, or allergies will help your agent serve you better. Obviously, if your agent were buying stocks for you, it would be critical that it understands your investment profile.
Mem0 calls this concept memory customization. Mem0.ai is a recent YC startup that aims to solve memory management for LLMs.
Memory customization matters because it stores only relevant information, where “relevant” is judged semantically. It can improve accuracy by curating the right memories. You don’t need to repeat private information in every chat (the system would ‘just know’), and the response can still be tailored to your needs. This holds especially true in specialized domains like finance, where the context of your question is really important.
For this exercise, I want to build an agent that
can talk to an SEC filing,
remembers what was in the document,
takes my query, and
provides a response
Moreover, I want to implement the SEC agent using Mem0.
Some words before we start. After working on this for far too many days, I found Mem0 annoying. Not that the solution is bad. It’s actually quite the opposite. It’s quite elegant and simple.
What annoyed me is that it HAS to use OpenAI.
OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
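The error above shows up as soon as the OpenAI client is created without a key. A minimal sketch of a guard that fails earlier and with a clearer message; the helper name `require_api_key` is my own and not part of Mem0 or the OpenAI SDK.

```python
import os


def require_api_key(name="OPENAI_API_KEY", env=None):
    """Return the API key stored under `name`, or raise a readable error.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before creating the Mem0 Memory object."
        )
    return key
```

Calling `require_api_key()` at the top of the notebook surfaces the missing key before any Mem0 object is built.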
On paper, there is an Ollama integration, and the documentation points you to it, but at least in my experience it doesn’t actually work well. My opinion is that focusing on OpenAI is a deliberate decision by the business team at Mem0. I believe this does make sense: OpenAI’s models are reliably good, so an engineer’s first touchpoint when building a proof-of-concept is reliably good as well.
I don’t know how you feel about this, but I think that no single tech company should be the sole gateway to Artificial Intelligence.
With that out of the way. Let’s dive in.
YMMV, but I have always found implementations that hide the stage where they communicate with the LLM incredibly annoying, whatever the reason for hiding it. Yes, they are easy solutions for getting basic tasks done, but the more complex the tasks get, the more this abstraction gets in the way. And Mem0 is no different.
Building the SEC Agent
For another project that you can find here, I created an SEC filing extractor that takes a ticker as input, collects the most recent 10-K from the SEC Edgar API, and then extracts all the different sections of the 10-K into a nicely formatted JSON dictionary.
If you want to have a tour through the script let me know in the chat or comments (paying users only)
Anyway, as usual, we begin building the SEC agent by importing several libraries.
import json
import os
import openai
from mem0 import Memory, MemoryClient
from openai import OpenAI
Then we load the JSON data into the notebook so we can work with it.
user_id = "APLD"  # the stock ticker doubles as the Mem0 user_id
file_path = f"{user_id}-10-K.json"
with open(file_path, 'r') as file:
    document = json.load(file)
If you read the code carefully, you will notice a parameter “user_id”. We will get to that in more detail later, but in short, Mem0 uses user_id to organize its data. Since I am the only user, I use the stock’s ticker as user_id instead. It makes a satisfactory primary identifier, so the LLM does not mix up the filings of different stocks it might receive over time.
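A small sketch of how the ticker-as-identifier convention can be wrapped into a loader. The function name `load_filing` and the `directory` parameter are assumptions of mine; only the `<ticker>-10-K.json` naming comes from the snippet above.

```python
import json
from pathlib import Path


def load_filing(ticker, directory="."):
    """Load the extracted 10-K for `ticker` from <directory>/<ticker>-10-K.json."""
    path = Path(directory) / f"{ticker}-10-K.json"
    with path.open("r", encoding="utf-8") as fh:
        return json.load(fh)
```

With this in place, `document = load_filing(user_id)` keeps the ticker, the file name, and the Mem0 user_id in sync.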
My 10-K parser collects all 21 sections as specified by the SEC. Having skimmed Anthropic’s Contextual Retrieval, I believe this per-section structure will become important when I implement that approach in the near future.
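Because the parser keeps the sections separate, one option is to walk through them in bounded pieces instead of handling the filing as one blob. The sketch below assumes the JSON is a flat dictionary mapping section names to text, which may not match the real parser output; `iter_section_chunks` and `max_chars` are hypothetical names.

```python
def iter_section_chunks(document, max_chars=4000):
    """Yield (section_name, chunk_text) pairs, each chunk at most max_chars long.

    Assumes `document` maps section names to their text (an assumption about
    the parser output, not something the filing format guarantees).
    """
    for name, text in document.items():
        text = str(text)
        for start in range(0, len(text), max_chars):
            yield name, text[start:start + max_chars]
```

Each pair could then be stored on its own, for example with `memory.add(chunk, user_id=ticker, metadata={"section": name})`, where the `section` metadata key is again my own choice.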
Then, as the next step, I wanted to store the embeddings in a vector database for easy storage and retrieval.
Qdrant - The long-term memory
The Mem0 team recommends Qdrant. I chose to sign up for their (up to 1GB) free online plan. I suppose for this project, it will be sufficient, but given how easy it was to implement LanceDB locally, I will likely not keep my data there going forward.
You instantiate the Qdrant client like this.
from qdrant_client import QdrantClient
qdrant_client = QdrantClient(
    url="https://<snip>.<snip>.gcp.cloud.qdrant.io:6333",
    api_key=qdrant_key,
)
Once you sign up to Qdrant, just grab the API key and paste it into the code snippet above as “qdrant_key”. Mem0 then automatically uploads your data to the server.
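If you would rather not paste the key into the notebook, a hedged alternative is to read both values from environment variables. The variable names `QDRANT_URL` and `QDRANT_API_KEY` are my own convention, not something Qdrant or Mem0 requires.

```python
import os


def qdrant_settings(env=None):
    """Collect the Qdrant connection settings from environment variables.

    QDRANT_URL and QDRANT_API_KEY are hypothetical names chosen for this sketch.
    A missing variable raises KeyError, which is the desired loud failure.
    """
    env = os.environ if env is None else env
    return {"url": env["QDRANT_URL"], "api_key": env["QDRANT_API_KEY"]}
```

The result can be unpacked straight into the client, as in `QdrantClient(**qdrant_settings())`, since `url` and `api_key` are the same keyword arguments used in the snippet above.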
Setting up Mem0
Since a lot of the heavy lifting around embeddings is outsourced to OpenAI, the initialization of the SEC analyst is actually quite easy.
I want to instantiate an object with a simple statement like this
sec_agent = SEC10KAnalyst()
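Of course, this requires a class declaration with an __init__ function that sets up all the required services. It looks like this.
class SEC10KAnalyst:
    def __init__(self):
        # Tell Mem0 to keep its vectors in the hosted Qdrant cluster
        config = {
            "vector_store": {
                "provider": "qdrant",
                "config": {
                    "url": "<snip>.<snip>.gcp.cloud.qdrant.io:6333",
                    "port": 6333,
                    "api_key": qdrant_key,
                }
            },
        }
        self.memory = Memory.from_config(config)
        # OpenAI() reads OPENAI_API_KEY from the environment
        self.client = OpenAI()
        # Tag stored on every memory so entries can be traced back to this app
        self.app_id = "sec-analyst"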
Once the SEC Analyst agent is created we can then simply prompt the agent with a one-liner statement.
sec_agent.handle_query(f"Based on this 10K filing, please tell me if the company has legal issues. 10K filing: {document}", user_id=user_id)
The function “handle_query” is implemented as you can see below. Notably, there is nothing out of the ordinary, except that we add the query to memory.
def handle_query(self, query, user_id=None):
    # Start a streaming chat completion request to the LLM
    stream = self.client.chat.completions.create(
        model=model_name,
        stream=True,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query}
        ]
    )
    # Store the query in memory
    self.memory.add(query, user_id=user_id, metadata={"app_id": self.app_id})
    # Print the response from the LLM as a stream
    for chunk in stream:
        if chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")
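The loop above only prints the answer, so nothing is left to store or inspect afterwards. A minimal sketch of a helper that collects the streamed pieces into one string; `collect_stream` is a hypothetical name, and the check for empty `choices` is a defensive assumption rather than something the code above needs in every setup.

```python
def collect_stream(stream):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        # Some chunks may carry no choices at all; skip them defensively.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta is not None:
            parts.append(delta)
    return "".join(parts)
```

Returning the text from handle_query would make it possible to add the answer to memory as well, not just the query.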
Best practice: I created a new key to differentiate the cost.
Now we can add the 10-K document (a file of roughly 475K) to the prompt and should be happy campers.
Wrong!
Don’t do that. I should have paid attention when OpenAI informed me that my api_key needed access to text-embedding-3-small and gpt-4o-mini.
In my childlike naïveté, I added these two models and let the system run by executing the handle_query statement. (Reminder: “document” is my 475K JSON file.)
Within a few seconds I was greeted by this result:
Just calculating the embeddings and a little analysis of this one 10-K document immediately cost me $3.25!
(╯°□°)╯︵ ┻━┻
So, what did I get back for my 3 dollars and 25 cents?
A standard markdown explanation of what the company does.
Nice. But not really what I asked for.
Then I remembered that I had stored all of this in memory! Great.
Maybe now is the time for Mem0 to shine?
Memory
Let’s first check what kind of memory Mem0 has in store for us. The agent can retrieve all stored memories with one simple statement. By providing the ticker (“user_id”) we receive an array of dictionaries, ahem memories, as a result.
memories = sec_agent.get_memories(user_id=user_id)
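Note that get_memories is a method on my agent class, not a Mem0 call. A minimal sketch of what such a wrapper could look like, assuming it delegates to Mem0's `get_all`; it is written as a standalone function here so it can be tried with any object exposing `get_all(user_id=...)`. Handling both a bare list and a dictionary with a `results` key is a precaution, since the return shape may differ between Mem0 versions.

```python
def get_memories(memory, user_id):
    """Return every memory stored for `user_id` as a plain list of dicts."""
    result = memory.get_all(user_id=user_id)
    # Be tolerant about the return shape: either a list or {"results": [...]}.
    if isinstance(result, dict):
        return result.get("results", [])
    return list(result)
```

Inside the class this would simply become a method that passes `self.memory` along.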
Each of the 16 (!?) entries has the following structure.
'created_at': '2024-09-29T06:11:54.479982-07:00',
'hash': '92ec192b5c2f6b71366395da6c32ce62',
'id': '0db266b5-4d97-4e07-b413-bfeefa0137b9',
'memory': 'The Company is a U.S. designer, developer, and operator of '
'next-generation digital infrastructure across North America.',
'metadata': {'app_id': 'sec-analyst'},
'updated_at': None,
Wait? What? Why 16 entries? What happened with the rest?
Given that I don’t want my whole document to be parsed again, I figured I should be able to query the memory alone for the interesting points, such as legal cases, that I should be aware of.
related_memories = sec_agent.memory.search("Could you expand on where you see the main risks and also what a good trade would look like", user_id=user_id)
Interestingly, this just returned the same 16 memories, just in another order.
{'created_at': '2024-09-29T06:11:55.920294-07:00',
'hash': '8b508fcaff646d0aef3f3e2afc30056d',
'id': '95dcd6d5-72b2-4470-9678-db5f3480ca82',
'memory': 'The Company is subject to a highly evolving regulatory landscape, '
'which may impact its business and operations.',
'metadata': {'app_id': 'sec-analyst'},
'score': 0.23226002,
'updated_at': None,
'user_id': 'APLD'}
Hmm. Not really what I wanted. I get the feeling that it’s still a little bit too early to use Mem0.
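For completeness, here is a rough sketch of how search results like the one above could be fed back into a prompt instead of resending the whole filing. The function name, the `min_score` threshold of 0.3, and the `limit` of 5 are arbitrary assumptions of mine, and I am assuming that a higher score means a closer match, which depends on the distance metric the vector store uses.

```python
def build_messages(system_prompt, query, hits, min_score=0.3, limit=5):
    """Build chat messages that include only the best-scoring memories.

    `hits` is a list of dicts with 'memory' and 'score' keys, like the search
    result shown above. Threshold and limit are illustrative defaults.
    """
    relevant = [h for h in hits if h.get("score", 0.0) >= min_score]
    relevant.sort(key=lambda h: h.get("score", 0.0), reverse=True)
    facts = "\n".join(f"- {h['memory']}" for h in relevant[:limit])
    context = facts if facts else "(no relevant memories found)"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Known facts:\n{context}\n\nQuestion: {query}"},
    ]
```

With the threshold above, the memory shown earlier (score 0.23) would be dropped, which at least makes it visible when the store has nothing useful to say about a question.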
Conclusion
Sure, it’s easy to get started with Mem0. However, the fact that I cannot easily plug in my own embeddings and a different LLM is a deal breaker. Similar to CrewAI, abstracting so far away that one has to use OpenAI at all costs is not helpful, and it can generate, as shown here, unnecessary costs. Scale that up: with 9,000 Nasdaq companies at roughly 3 dollars per filing every quarter, I would be paying about 27,000 dollars per quarter.
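The 27,000-dollar figure is simple multiplication, sketched here so the inputs can be swapped out. Both the company count and the per-filing cost are the rough numbers from this post, not measured prices.

```python
def quarterly_cost(companies, cost_per_filing):
    """Total cost of processing one filing per company in a quarter."""
    return companies * cost_per_filing


print(quarterly_cost(9_000, 3.00))  # rounded-down cost per filing
print(quarterly_cost(9_000, 3.25))  # the $3.25 I actually paid for one filing
```

With the rounded 3 dollars this gives 27,000 dollars; with the $3.25 from my one run it comes to 29,250 dollars.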
There is no explicit concept of ‘forgetting’. Instead, Mem0 automatically deprioritizes older memories when new, contradictory information is added, and it adjusts memory relevance based on changing contexts.
Of course, I would pay for storage, processing, and reasoning if it is transparent to me what I am paying for. But, in this case, while the abstraction layer is nice, it prevents me from adding it to the agents I already have in use.
The true killer app for memory would be a solution that can remember and synthesize both structured and unstructured information about the identifier in a way that's natural for a developer and the user.
Anyway, I will open up a chat with you and add the link to the code in chat for my paying users. If there are any questions, please let me know.
Don’t forget to like, share, and subscribe.