So, I built Agentic DeepSearch for Investing in 2023 and Nobody Cared
A Journey From Data Architectures, via Semantic Search, to Prompt Engineering
Granted, it didn’t work as well as OpenAI’s or Google’s 2025 products, because GPT-3 was just not THAT good. But hey, the ideas were the same, and I guess that must count for something. In this post, I lay out the architecture considerations for aggregating data that SuperBill can work with during its reasoning process; in a follow-up post, I will approach the problem from the other side. And btw, this is not a post-mortem. Most parts of the data pipeline still enrich my daily work, but there are surely a few lessons to be learned about why the app is no longer publicly available. I mainly blame timing.
Just to be clear, I am fundamentally a data guy at heart; therefore, the way I attacked the problem of financial research and building my agent was to first build a well-structured dataset that allowed me to access research-relevant data points quickly and precisely. Of course, a dataset like this does not fall from the sky; it needs to be carefully curated. For reference, BloombergGPT (remember that?) was trained on 40 years of financial information.
So, let’s talk about data first.
Investment data comes in many shapes and forms, varying in frequency, significance, and impact. These data points help investors like you and me better understand market movements, company performance, and broader trends. However, whether you are able to make the right decisions based on this input is still entirely up to you. And even if you have all the answers, you can still make the wrong decisions. At least for now. I suppose that’s one of the reasons why I am building SuperBill.
Here are a couple of examples, structured by frequency, from once-off/irregular to annual:
Once-off/Irregular data
This category includes significant, irregular data that can have an immediate impact on a company or market.
A product launch (e.g., Apple unveiling a new iPhone; it’s not 2009 anymore). A new product can open up new markets or revenue streams.
A merger or acquisition (e.g., Microsoft acquiring Activision Blizzard). Consolidation usually takes out competitors and improves market position.
Leadership changes (e.g., Twitter appointing a new CEO). If company leaders like Parag are not qualified, it leads to underperformance.
A company receiving a regulatory fine or approval (esp. Healthcare). FDA approvals make healthcare stocks jump, but it is a gamble without deep insights into the research.
Social media-driven sentiment shifts (e.g., Reddit’s WallStreetBets). Maybe that was a market irregularity during COVID-19.
Analyst rating updates (e.g., Morgan Stanley upgrading a stock to "Buy"). Some analysts matter more than others; tracking their thoughts and reasoning helps funds manage their exposure.
Non-professional analysis reports on sites like Seeking Alpha or similar. Worth considering; sometimes there are good ideas and insights there.
Daily/Regular
At market close, it’s possible to calculate KPIs that track movements and sentiment.
Unusual trading volume spikes (e.g., a stock trading at 5x its normal volume). It’s one of my favorite lagging indicators: something clearly happened that moved the needle and might provide a setup for reverse momentum.
Large price swings (e.g., a stock dropping 10% on no apparent news).
Short interest changes (e.g., increased short-selling activity).
Scheduled analyst rating updates.
Weekly
These insights provide a broader view of market trends over this time horizon.
ETF and mutual fund flow data (e.g., tracking institutional money).
CFTC Commitment of Traders (COT) report (e.g., showing hedge fund positions).
Mortgage applications and real estate reports (e.g., tracking the housing market).
Sector rotation trends (e.g., funds shifting from tech stocks to energy stocks).
Investor sentiment surveys (e.g., AAII Bullish/Bearish Sentiment survey).
Monthly
Monthly data often reflects larger economic and market trends.
Consumer Price Index (CPI) and inflation reports.
Retail sales reports (e.g., gauging consumer spending habits).
Manufacturing and industrial production data (e.g., PMI reports). Indicates economic expansion or contraction in key sectors.
Corporate insider buying/selling trends. Insights into executive confidence in their own company’s future.
Economic data releases (e.g., U.S. jobless claims or consumer confidence index).
Senator/politician transaction reports. This can hint at potential policy shifts or regulatory changes (!). Just check Nancy Pelosi’s fund performance.
Quarterly
This type of data is crucial for assessing corporate performance.
Quarterly earnings reports (e.g., Amazon reporting its Q2 revenue and profit). These impact stock prices based on whether results exceed or miss expectations.
Earnings call transcripts and guidance. Offers insights into management's outlook and sometimes has sections for upcoming changes.
Share buybacks / dividend announcements and changes. Signals financial stability or concerns.
Regulatory filings (e.g., 10-Q reports in the U.S. with detailed financial statements). These provide transparency, and any surprises can lead to sharp price swings.
Annual
Annual data provides a long-term perspective on a company or industry.
Annual earnings reports (e.g., Berkshire Hathaway’s shareholder letter).
Annual general meetings (AGMs)
10-K filings and corporate disclosures (e.g., in-depth disclosures).
Industry reports and macroeconomic outlooks (e.g., global economic forecast).
Sustainability and ESG reports (e.g., Tesla’s environmental impact report).
In retrospect, I probably should have made a table and screenshotted it, but here we are. As you can see, the data you will receive is complex and structured in a wide variety of ways. In my opinion, if you throw all of this at an agent, it will not produce anything useful: it suffers from the same problem as we humans and will struggle to find the needle in the haystack.
On top of that, many inexperienced practitioners think LLMs are really good at finding patterns in unstructured data. They are not; this use case produces severe hallucinations in edge cases. In reality, it is easy, and even beneficial, to add structure to this dataset before letting the agent run rampant on the context.
Designing Architecture for Agentic Dataflows
This is what I did, as a simplified flow (read from left to right).
First, I set up an Apache Airflow server that connects to an API-based data source on a regular interval to extract this information. Some of you might be surprised to learn that Airflow ran Jupyter Notebooks through an SSH pipeline.
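The SSH trick is less exotic than it sounds: the Airflow task just runs `jupyter nbconvert --execute` on the remote box. A minimal sketch of how such a remote command could be composed; the notebook path and connection IDs here are made-up placeholders, not my actual setup:

```python
import shlex

def notebook_command(notebook_path: str, timeout_sec: int = 3600) -> str:
    """Build the shell command an Airflow SSHOperator could run remotely.

    Executes the notebook in place via nbconvert, so failures surface
    as non-zero exit codes that Airflow treats as task failures.
    """
    return (
        "jupyter nbconvert --to notebook --execute --inplace "
        f"--ExecutePreprocessor.timeout={timeout_sec} "
        f"{shlex.quote(notebook_path)}"
    )

# Inside the DAG this string would be handed to something like:
# SSHOperator(task_id="extract", ssh_conn_id="etl_worker",
#             command=notebook_command("/home/etl/extract_prices.ipynb"))
cmd = notebook_command("/home/etl/extract_prices.ipynb")
```

The nice side effect of this setup is that the exact notebook you debug interactively is the one the scheduler runs, so there is no "works in the notebook, breaks in production" drift.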
There are several APIs I like that are mostly free, but these are just examples; you are obviously free to connect to whatever API you prefer. Most APIs return a JSON response that can easily be pushed to and parsed by a Flask API.
The Flask API’s first task is to dump the raw data into an unstructured cache. In my case, I used a MySQL table where the data received from the API was dumped into a JSON column. Then a “flatten” script picks up the latest dataset and maps it into a structured MySQL table.
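The cache-then-flatten pattern is easy to sketch. Below, SQLite stands in for MySQL (same idea, zero setup), the Flask layer is omitted, and the API payload plus the field mapping are invented for illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Unstructured cache: the raw API response goes into a single JSON/text column.
conn.execute("CREATE TABLE raw_cache (id INTEGER PRIMARY KEY, payload TEXT)")
# Structured target table the flatten step maps into.
conn.execute("CREATE TABLE quotes (symbol TEXT, price REAL, observed_at TEXT)")

# Step 1: dump the API response untouched, so nothing is lost if the
# flatten logic has a bug and needs to be re-run later.
api_response = {"symbol": "AAPL", "last": 189.5, "ts": "2023-06-01T20:00:00Z"}
conn.execute("INSERT INTO raw_cache (payload) VALUES (?)",
             (json.dumps(api_response),))

# Step 2: "flatten" the latest cached row via a hand-written data dictionary
# (source field -> target column); the mapping here is hypothetical.
FIELD_MAP = {"symbol": "symbol", "last": "price", "ts": "observed_at"}
payload, = conn.execute(
    "SELECT payload FROM raw_cache ORDER BY id DESC LIMIT 1").fetchone()
raw = json.loads(payload)
row = {dst: raw[src] for src, dst in FIELD_MAP.items()}
conn.execute(
    "INSERT INTO quotes (symbol, price, observed_at) "
    "VALUES (:symbol, :price, :observed_at)", row)

flattened = conn.execute("SELECT symbol, price FROM quotes").fetchone()
```

Keeping the raw dump around is the whole point: when an API changes its schema, you fix the dictionary and replay the cache instead of losing data.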
Most of this work is brute-force: writing data dictionaries and making sure the formats are correct, especially for date fields. I decided to use UTC timestamps, since they are a default that works everywhere in the world with the least information encoded. Some APIs provide Unix timestamps, but I personally don’t like working with them.
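Normalizing the two formats you actually meet in the wild (ISO-8601 strings and Unix epochs) into UTC is a few lines of standard library. A sketch; the sample values are arbitrary:

```python
from datetime import datetime, timezone

def to_utc_iso(value) -> str:
    """Normalize mixed API date formats to a UTC ISO-8601 string."""
    if isinstance(value, (int, float)):      # Unix epoch seconds
        dt = datetime.fromtimestamp(value, tz=timezone.utc)
    else:                                    # ISO-8601, possibly with 'Z' suffix
        dt = datetime.fromisoformat(str(value).replace("Z", "+00:00"))
        if dt.tzinfo is None:                # assume naive timestamps are UTC
            dt = dt.replace(tzinfo=timezone.utc)
        dt = dt.astimezone(timezone.utc)
    return dt.isoformat()

a = to_utc_iso(1685649600)                   # epoch seconds
b = to_utc_iso("2023-06-01T20:00:00Z")       # Zulu-suffixed ISO string
c = to_utc_iso("2023-06-01T22:00:00+02:00")  # offset-aware ISO string
```

All three inputs above denote the same instant, so they normalize to the identical UTC string, which is exactly what makes downstream joins across APIs painless.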
Btw, if you want to have access to one script or the other, be sure to reach out.
Semantic Search and Embeddings
Searching through a structured database is straightforward: a simple SQL statement of the shape “SELECT X FROM Y WHERE Z”. However, as I have shown in my post “Building a Semantic Search Engine: Analyzing Over 200K Finance Posts with BM25 and LanceDB”, searching semantically is anything but trivial. The key preliminary tasks were to remove stopwords (“and”, “be”, etc.) from the body of text, extract keywords through term frequency-inverse document frequency (TF-IDF) vectorization, and then use a suitable vector embedding to make the database column searchable. From there, it’s easy to use an appropriate distance measure to find nearby database rows.
I won’t repeat the steps here as they are well outlined in this post already.
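That said, the scoring core of BM25 is compact enough to sketch inline. This is a plain-Python illustration with a toy stopword set and the standard default parameters (k1=1.5, b=0.75); the real pipeline used LanceDB and a proper stopword list:

```python
import math
from collections import Counter

STOPWORDS = {"and", "be", "the", "a", "of", "to", "in", "is"}  # toy sample

def tokenize(text: str) -> list[str]:
    """Lowercase, split on whitespace, drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]

def bm25_rank(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75):
    """Score documents against a query with Okapi BM25; best-first indices."""
    corpus = [tokenize(d) for d in docs]
    n = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for d in corpus if term in d)  # document frequency
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)

docs = [
    "apple launches new iphone at september event",
    "fed raises interest rates again",
    "apple earnings beat expectations on iphone sales",
]
ranking = bm25_rank("iphone apple", docs)
```

BM25 handles the lexical side; the embedding distance handles the semantic side, and in practice the two rankings complement each other.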
Agentic Prompt
Now for the final piece of the puzzle. In modern chat-like applications, the UI allows a conversation with the dataset. When I was building my app, a chat interface sat in my backlog, but I didn’t have the front-end expertise to build a nice one, and most users were not that comfortable with chat UIs anyway. So I never built it. Now, in 2025, there are many well-established options. Since the UI needed a fixed prompt architecture, I decided to pre-set a couple of prompts, which can be found here.
Just to give a quick glance at one such prompt: once a request comes in to perform research on a given company, the system collects all relevant news headlines from the database through BM25 search and then uses GPT-3 to summarize the headlines and bodies into a short narrative.
f"Summarize for this company {shortName} the following news headlines and compress them into the most important information: {headlines}"
For equities with recent news, that approach worked remarkably well; the further you strayed off the beaten path, the less well it worked. Still, since I used an extremely straightforward prompt-and-grounded-data paradigm, the algorithm never produced hallucinations and always provided sources where the information could be found. I really liked this approach because it gave a clear picture of recent news while avoiding repeated articles, a problem Google Search is still struggling with today.
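The grounding comes from the fact that the prompt only ever contains retrieved rows. A sketch of how such a prompt could be assembled; the hit schema (`headline`, `url`) and the cited-source instruction are illustrative, and the model call itself is left as a comment:

```python
def build_summary_prompt(short_name: str, hits: list[dict]) -> str:
    """Assemble a grounded summarization prompt from BM25 search hits.

    Every hit carries its source URL, so the model can only restate
    retrieved text and the answer stays attributable.
    """
    headlines = "\n".join(
        f"- {h['headline']} (source: {h['url']})" for h in hits
    )
    return (
        f"Summarize for this company {short_name} the following news "
        f"headlines and compress them into the most important "
        f"information. Cite the source next to each claim:\n{headlines}"
    )

prompt = build_summary_prompt(
    "Apple Inc.",
    [{"headline": "Apple unveils new iPhone", "url": "https://example.com/1"}],
)
# The prompt would then be sent to the (now legacy) GPT-3 completion endpoint.
```

Because the retrieval step deduplicates headlines before they reach the prompt, the summary never repeats the same story twice.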
Of course, there were many other tasks the agent could do as you can see in the screenshot below. But this I will explain at another time. This post has already gotten quite long.
Hope you liked the post, please like, share, subscribe.
Btw, I tried to sell the app back in the day. The guy who never bought it would have made a fortune.
Maybe I will relaunch it as DeepCQ? What do you think?