Encyclopedia Autonomica

Code Clinic | Building an AI Investing Agent: A Step-by-Step Guide Using Langchain and RAG
Because we all like losing money.

Jan Daniel Semrau (MFin, CAIO)
Oct 03, 2023

One thing all of us have dreamed of is an investment agent that is far better at trading than we are, always makes the right decisions, and makes us money while we sleep. Of course, it is unlikely that such an agent will ever exist. But while large hedge funds build out their arsenals of trading tools, we can do the same thing. And now that we have generative AI, it is actually quite easy to put components in place that help you better understand a particular investment idea and the news around it.

In this code clinic, we will be wiring together several components to build our retrieval-augmented generation (RAG) system.

Let’s dive in.


Components

  1. JSONLoader - The data I want to use is in a JSON file on my hard disk, so I simply want to load it from there. Langchain has powerful web loaders as well, but for this exercise I found it easier to work with a local file and add complexity later.

  2. LLM - We will be using OpenAI’s GPT-3.5-turbo. Ideally, for cost reasons, we would use a local LLM like Mistral or Llama, but for this exercise the OpenAI API will do.

  3. Vector store - We will store our embeddings here. I assume you already know what embeddings are, so I will only note that, again for simplicity, we will keep the vectors in a folder on the file system rather than in an optimized vector database like Pinecone.

  4. Embedder - For the vector search to work, we need embeddings. Again for simplicity, I will use OpenAI’s embedding model, although it costs money.

  5. Prompt - And finally, of course, we want our agent to solve a real-world problem. I will hand the agent a couple of posts, and it has to summarize what happened. A wiring sketch of all five components follows right after this list.
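
To make the list concrete, here is a minimal end-to-end sketch of how these five pieces could be wired together with Langchain. The file name posts.json, the jq_schema .[].content, the choice of Chroma as the on-disk store, and the final question are illustrative assumptions, not a fixed recipe.

```python
# Minimal RAG wiring sketch: loader -> splitter -> embeddings -> vector store -> LLM -> prompt.
# Assumes a local "posts.json" export and OPENAI_API_KEY set in the environment.
from langchain.document_loaders import JSONLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. JSONLoader: pull each post body out of the local JSON export.
#    The jq_schema ".[].content" assumes a list of objects with a "content" field.
loader = JSONLoader(file_path="posts.json", jq_schema=".[].content", text_content=True)
documents = loader.load()

# Split long posts into chunks that fit comfortably into the context window.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# 4. Embedder: OpenAI embeddings (paid, but simple to set up).
embeddings = OpenAIEmbeddings()

# 3. Vector store: persist the vectors to a plain folder on the file system.
vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory="./vectorstore")
vectorstore.persist()

# 2. LLM: GPT-3.5-turbo via the OpenAI API.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# 5. Prompt: ask the retrieval chain to summarize what happened in the retrieved posts.
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa.run("Summarize what happened in these posts about the stock."))
```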


Building the RAG

When building systems like these, it is always good practice to start with the data. In my case, I pulled the data from a MySQL database and exported it to JSON.
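
The exact schema does not matter much for the loader; for illustration, assume each record looks roughly like the placeholder rows below, written in the list-of-objects format the JSONLoader sketch above reads.

```python
# Hypothetical record layout for the MySQL-to-JSON export; the field names
# ("title", "published_at", "content") and sample rows are placeholders only.
import json

sample_posts = [
    {"title": "Fed holds rates steady", "published_at": "2023-09-20", "content": "..."},
    {"title": "Q3 earnings recap", "published_at": "2023-09-28", "content": "..."},
]

# Write the export to disk so jq_schema=".[].content" can pick out each post body.
with open("posts.json", "w") as f:
    json.dump(sample_posts, f, indent=2)
```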

This post is for paid subscribers
