Executive Summary
LangChain launched in October 2022 as an open-source framework and quickly gained traction in the developer community. The company was incorporated in April 2023 with early investments from Benchmark (USD 10M) and Sequoia (USD 20M).
The company specializes in technologies that make it easy to build applications on top of Large Language Models (LLMs), such as AI agents and Retrieval Augmented Generation (RAG) systems.
As a reader of this publication, you probably know about AI Agents already.
Retrieval augmented generation is a technique that combines the power of large language models (LLMs) with the accuracy of information retrieval (IR). RAG works by first retrieving relevant documents from a knowledge base and then using the LLM to generate text that is consistent with the retrieved documents.
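The retrieve-then-generate flow can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the knowledge base is made up, retrieval is a naive word-overlap score rather than a vector search, and `fake_llm` is a hypothetical stand-in for a real model call. None of the names below come from LangChain.

```python
import re
from collections import Counter

# Made-up knowledge base, for illustration only.
KNOWLEDGE_BASE = [
    "LangChain is an open-source framework for building LLM applications.",
    "Retrieval augmented generation grounds model answers in retrieved documents.",
    "Hugging Face hosts a hub of pre-trained models.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, document):
    """Count the tokens that the query and the document have in common."""
    shared = Counter(tokenize(query)) & Counter(tokenize(document))
    return sum(shared.values())

def retrieve(query, documents, k=2):
    """Step 1: return the k documents with the highest overlap score."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    """Step 2: put the retrieved documents into the prompt as context."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

def fake_llm(prompt):
    """Hypothetical stand-in for a model call; it only reports the prompt size."""
    return f"[model output for a prompt of {len(prompt)} characters]"

def answer(query):
    """Retrieve first, then generate from the retrieved context."""
    docs = retrieve(query, KNOWLEDGE_BASE)
    return fake_llm(build_prompt(query, docs))

print(answer("What is retrieval augmented generation?"))
```

In a real system the overlap score would typically be replaced by embedding similarity against a vector store, and `fake_llm` by an actual model call.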
Other examples of LLM-enabled applications are Chatbots, Question-answering systems, Summarization systems, Translation systems, Data analysis systems, Creative writing tools, and AI Agents.
Problem
LLMs are offered by a variety of vendors with different focus areas (code generation, knowledge discovery, or text translation). The pain that most application developers experience when tasked with integrating LLMs is the accumulation of technical debt and the lack of documentation, as new features launch quickly in a rapidly changing industry. Integrating directly against a raw LLM API is complex, and the interface might change completely with a new version roll-out (GPT-3.5 to GPT-4, from an API perspective).
Complexity: LLMs are complex pieces of software that can be difficult to understand and use.
Portability: LLMs are often trained on specific datasets and hardware, which makes them difficult to use in different environments.
Lack of documentation: There is often a lack of documentation for LLMs, which makes it difficult to learn how to use them. While LangChain provides documentation, it too is often criticized for being confusing and poorly maintained.
Solution
LangChain is an open-source framework that makes it easier for developers to build applications powered by LLMs. It provides a set of tools and APIs that allow developers to chain together multiple commands to create more complex applications. One example of such an application is autonomous agents (my favorite topic). Because of solutions like Bard and ChatGPT, we commonly use the terms Chatbots and AI Agents interchangeably. I think that’s wrong.
Chatbots are primarily designed for conversational interactions with the user. Chatbots are not Agents but Agents can be Chatbots. AI Agents usually have a broader scope than chatbots as they usually offer a wider range of capabilities. Agents know how to use tools (Wikipedia, Search, Ask-a-human), have short-term and long-term memory, and have the capability to reason and criticize a thought transparently.
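To make the distinction concrete, here is a minimal sketch of an agent loop with tools, short-term memory, and a visible thought at every step. Everything in it is an assumption made for illustration: the two tools are stubs, and `scripted_policy` is a hypothetical stand-in for the LLM that would normally decide the next action. It does not use LangChain's agent API.

```python
def wikipedia_stub(query):
    # Hypothetical tool: a real agent would call an actual lookup service here.
    facts = {"LangChain": "LangChain is an open-source framework for LLM applications."}
    return facts.get(query, "No entry found.")

def calculator(expression):
    # Hypothetical tool: only understands "a + b" with integers.
    left, right = expression.split("+")
    return str(int(left) + int(right))

# Tool registry: the agent picks a tool by name.
TOOLS = {"wikipedia": wikipedia_stub, "calculator": calculator}

def scripted_policy(goal, memory):
    """Stand-in for the LLM: choose the next action from the goal and memory."""
    if not memory:
        return {"thought": "I should look this up first.",
                "tool": "wikipedia", "input": goal}
    return {"thought": "I have an observation, so I can answer.",
            "final": memory[-1]["observation"]}

def run_agent(goal, policy=scripted_policy, max_steps=5):
    """Loop: think, act with a tool, observe, remember, until a final answer."""
    memory = []  # short-term memory: the trace of thoughts, actions, observations
    for _ in range(max_steps):
        decision = policy(goal, memory)
        if "final" in decision:
            return decision["final"], memory
        observation = TOOLS[decision["tool"]](decision["input"])
        memory.append({**decision, "observation": observation})
    return "Stopped: step limit reached.", memory

result, trace = run_agent("LangChain")
print(result)
for step in trace:
    print(step["thought"], "->", step["tool"])
```

A chatbot would stop at producing a reply; the loop above is what lets an agent pick a tool, look at the result, and decide what to do next. Long-term memory would need a persistent store and is left out of this sketch.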
The benefits that LangChain provides to developers are (1) modular design, (2) a high-level API, (3) support for multiple language models, and (4) updated documentation.
Modular design: The framework is made up of a set of independent modules that can be combined to create custom applications. This makes it easy to reuse code and build applications that are tailored to specific needs.
High-level API: The framework provides a high-level API that makes it easy to interact with language models. This API abstracts away the complexity of the underlying language models, making it possible to build applications without specialized knowledge.
Support for multiple language models: The framework supports a variety of language models, including OpenAI's GPT-3 and Google's LaMDA. This makes it possible to choose the language model that is best suited for the specific application.
Documentation and tutorials: The framework is well-documented and there are a number of tutorials available that help developers get started. This makes it easy to learn how to use the framework.
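The first three points can be illustrated with a small sketch: independent modules (a prompt template, a model, a post-processing step) are combined into one application, and the model can be swapped without touching the rest. The class and function names are hypothetical and chosen for this example; this is not LangChain's actual API, and the two "models" are trivial stand-ins for different vendors.

```python
from typing import Callable, Protocol

class TextModel(Protocol):
    """Anything with a complete() method can be plugged in as a model."""
    def complete(self, prompt: str) -> str: ...

class UppercaseModel:
    """Hypothetical stand-in for one vendor's model."""
    def complete(self, prompt: str) -> str:
        return prompt.upper()

class ReverseModel:
    """Hypothetical stand-in for another vendor's model."""
    def complete(self, prompt: str) -> str:
        return prompt[::-1]

def build_app(
    model: TextModel,
    template: str,
    postprocess: Callable[[str], str] = str.strip,
) -> Callable[[str], str]:
    """Combine template, model, and post-processing into one callable app."""
    def app(user_input: str) -> str:
        prompt = template.format(input=user_input)  # module 1: prompt template
        raw = model.complete(prompt)                # module 2: swappable model
        return postprocess(raw)                     # module 3: output handling
    return app

# Same application code, two different models.
summarize_a = build_app(UppercaseModel(), "Summarize: {input}")
summarize_b = build_app(ReverseModel(), "Summarize: {input}")
print(summarize_a("hello"))
print(summarize_b("hello"))
```

The design choice worth noting is that the application only depends on the small `complete()` interface, so switching vendors is a one-line change where the app is built.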
As my recent exploration into AI Agents has shown, building Agents with LangChain is quite straightforward.
Market Validation
There is a growing demand for LLM-powered applications. A recent report by Gartner predicts that the market for LLM-powered applications will reach $15 billion by 2024. This growth is being driven by the increasing adoption of AI in a variety of industries, such as healthcare, finance, and customer service.
The LangChain repo on GitHub has >61K stars, 831 followers, and 8.4K forks.
As of the date of this writing, many developers consider LangChain to be the easiest-to-use and most popular software library for composing LLM-powered systems.
Market Size
The market for LLM-powered applications is still in its early stages, but it has the potential to be very large. The global market for generative artificial intelligence (gAI) is expected to reach $22B by 2025 and $118B by 2032, and LLMs are a key component of gAI.
This means that there is a large addressable market for LangChain and other LLM-powered frameworks.
source: Precedence Research
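To get a feel for what these two data points imply, the compound annual growth rate between them can be worked out directly. The sketch below assumes smooth compounding between the 2025 and 2032 figures quoted above, which is a simplification and not something the source states.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures quoted above, in billions of USD.
growth = cagr(22, 118, 2032 - 2025)
print(f"Implied annual growth: {growth:.1%}")
```

Under that assumption, the forecast works out to growth of roughly 27% a year.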
Gartner has also published research on generative AI. Some quotes from it:
We believe that by 2025, more than 30% of new drugs and materials will be systematically discovered using generative AI techniques, up from zero today. Generative AI looks promising for the pharmaceutical industry, given the opportunity to reduce costs and time in drug discovery.
and
We predict that by 2025, 30% of outbound marketing messages from large organizations will be synthetically generated, up from less than 2% in 2022. Text generators like GPT-3 can already be used to create marketing copy and personalized advertising.
Gartner believes generative AI will predominantly affect customer experience. I believe we have not seen the iPhone moment of that technology yet.
source: https://www.gartner.com/en/topics/generative-ai
McKinsey went even further in a recent study, calculating that generative AI could add the equivalent of $2.6 trillion to $4.4 trillion in value to the global economy annually across 63 use cases.
source: McKinsey
It remains to be seen how much of this LangChain can capture, but being the standardized, fast interface between models and developers is a strategically strong position.
Team
The LangChain team (~13 people as of Sep 2023) is composed of experienced engineers and entrepreneurs with a deep understanding of the AI and software development industries. The team has a proven track record of building great products, and although it is still early, you can already see the quality of their work.
I don’t want to dive into this further here, but let one of the founders explain LangChain himself
Traction
LangChain is rumored to be used by a number of companies, including Google, Microsoft, and Amazon. I believe that while the technology solves a real pain point, production deployments will still take some time.
Further, monetizing an open-source product is difficult, as copycat products (8.4K forks!) can easily be created. I believe that launches like LangChain Hub might develop into a defensible moat.
Competition
There are a few other frameworks that compete with LangChain, such as Hugging Face Transformers and the OpenAI API. However, LangChain is unique in its focus on making it easy for developers to build applications powered by LLMs. This focus has made LangChain a popular choice among developers, and it is likely to continue to gain market share in the years to come.
Hugging Face
Hugging Face (valued at roughly USD 4B) is a company that develops open-source software for natural language processing (NLP). Its flagship product is the Transformers library, which provides a unified API for a variety of NLP tasks, such as text classification, question answering, and machine translation. Hugging Face also hosts the Hugging Face Hub, a community-driven repository of pre-trained NLP models.
In August 2023, Hugging Face raised $235 million in a Series D funding round whose investors included Google and AMD. This brings the company's total funding to $395.2 million. The new funding will be used to accelerate Hugging Face's growth and expand its product offerings.
Hugging Face is one of the leading companies in the NLP space. Its products are used by researchers, developers, and businesses around the world. The company's recent funding round is a testament to the growing demand for NLP technology.
Hugging Face provides three types of agents: HfAgent uses inference endpoints for open-source models, LocalAgent uses a model of your choice locally, and OpenAiAgent uses OpenAI’s closed models.
AutoGPT Plugins
AutoGPT is an experimental open-source application that uses OpenAI's GPT-4 language model to perform autonomous tasks. It attracted attention earlier in 2023, when it became one of the fastest repositories to reach 100K stars on GitHub. Since then, the hype has died down considerably. Given a goal in natural language, AutoGPT attempts to achieve it by breaking it down into sub-tasks and using the internet and other tools in an automatic loop. After using AutoGPT for a short while, I couldn’t see the value in it. The whole notion of handing a system a natural-language task seems hard to make work, as most humans are really terrible at defining tasks.
LlamaIndex
LlamaIndex is a data framework that helps you connect custom data sources with LLMs. It provides a simple and flexible way to store and index data, and it integrates with downstream vector stores and database providers.
AgentGPT
AgentGPT is an open-source platform that allows users to create and deploy autonomous AI agents.
Most competitors, especially AgentGPT, are set on creating goal-driven agents that execute autonomously given a vaguely defined goal. I don’t think the approach that AutoGPT, BabyAGI, and AgentGPT are taking works well, mainly because most humans are not good at defining goals (‘I want more followers on Twitter’, ‘Find me a new job’), which leads to unsatisfactory results.
The competitive advantage that LangChain has is that it focuses on developer products and not B2C.
Conclusion
I believe that LangChain is a promising investment of your time with an interesting product, a large addressable market, and a strong team. I think that the product solves a real pain point; it surely solved mine, and as use cases for generative AI in real-world applications increase, more people might use LangChain to build their products.
Highlights
Large and growing market for LLM-powered applications
Well-designed and easy-to-use framework
Experienced team with a proven track record
Early traction with several major companies
Well-funded, with backing from Benchmark and Sequoia
Unique focus on making it easy for developers to build applications powered by LLMs