LangChain Announces 10M Seed Round (langchain.dev)
235 points by lysecret on April 4, 2023 | 167 comments



The place where I work was an early adopter of LLMs, having started working on them a year ago.

When I built stuff with GPT-3, especially in the earlier days, I got the strong impression that we were doing machine learning without NumPy and Pandas.

With LangChain, many of the systems I have built can be done in just one or two lines, making life much easier for the rest of us. I also believe that LangChain's Agent framework is underappreciated: it was pretty far ahead of its time until the official ChatGPT plugins were released. (I've contributed to LangChain a bit too.)

Unfortunately, the documentation is indeed lacking. While I understand the need to move quickly, it is not good that crucial concepts like custom LLMs have inadequate documentation. (Perhaps an LLM built on top of the repo would be more effective than documentation at this point.)


The docs are lacking but the underlying code is so simple that it's just a few clicks/rg searches away from figuring out what you need. It's mostly just ways to do string templating. IMO the ergonomics of LangChain need an overhaul; there are too many ways to do the same thing, and there's so much type erasure that it's hard to play to the strengths of a particular LLM. For example, it's still a pain to distinguish between using a chat-oriented LLM and a regular completion one.
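To make the chat-vs-completion point concrete, here's a rough sketch of the two request shapes (field names follow OpenAI's public API as of early 2023; treat the details as illustrative, not authoritative):

```python
# Rough sketch of why chat and completion models need different handling.
# Payload shapes mirror OpenAI's public API circa early 2023 -- illustrative only.

def completion_payload(prompt: str) -> dict:
    # Completion models take one flat string.
    return {"model": "text-davinci-003", "prompt": prompt, "max_tokens": 256}

def chat_payload(system: str, user: str) -> dict:
    # Chat models take a structured list of role-tagged messages; flattening
    # this into a single prompt string (type erasure) throws the roles away.
    return {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }
```

Flattening the role-tagged messages into one prompt string is exactly the kind of type erasure that loses a chat model's strengths.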

There also seems to be really poor observability in the code and performance seems to be an afterthought. I tell friends who ask about LangChain that it's great to experiment with but not something I'd put into production. Hopefully this funding helps them shore things up.


Are you saying you'd use something else in production?

> For example, it's still a pain to distinguish between using a Chat oriented LLM vs a regular completion one.

Totally agree. After using it for a few weeks, this is one of the most visible weaknesses in the design.


> Are you saying you'd use something else in production?

Absolutely. The Langchain interface is quite easy to build atop any old LLM, and if you're just using OpenAI, then OpenAI already distributes clients in a bunch of languages that you can just use. Even then, you're calling a few HTTP endpoints and stuffing some data in a payload. It's really basic stuff. I'd prototype in Langchain, grab the prompts and agents I ended up using, and reimplement them in $MY_PL on $MY_PLATFORM. That's what I find so fun about these LLMs: they're trivial to use for anyone who can ship text off to an endpoint (whether networked or local) and receive a text response.
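The "ship text to an endpoint" approach really is about this much code. A hedged, stdlib-only sketch against OpenAI's chat completions endpoint as of early 2023 (your provider's URL and payload shape may differ):

```python
# Minimal direct call to a chat-style LLM endpoint -- no framework, just
# stdlib HTTP + JSON. URL and payload shape mirror OpenAI's chat completions
# API circa early 2023; adjust for your provider.
import json
import urllib.request

def ask(api_key: str, prompt: str, model: str = "gpt-3.5-turbo") -> str:
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return extract_text(json.load(resp))

def extract_text(body: dict) -> str:
    # The reply text lives in the first choice's message.
    return body["choices"][0]["message"]["content"]
```

Everything else a framework adds (prompt templates, retries, parsing) is layered on top of this one round trip.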


This is what blows my mind. They raised a 10M seed on what is (no disrespect intended) a wafer thin abstraction that an experienced dev really could implement the core of in a few days. Obviously the raise was about the momentum and mindshare Langchain holds, but still. Wow.


Agreed. The text parsing for feeding prompts into an LLM and parsing a response is about as simple and straightforward as it gets in programming. It is nice to have some control over that process and know what your code is doing every step along the way. It doesn’t need to do much anyway, the LLM is doing the hard work mostly. It makes no sense to me and I’m trying to understand it, but I just can’t see the value in a black box library to interface with an LLM when it’s so easy to DIY.


I agree that their current implementation is what you said: something an experienced dev can do in a couple of days. But they have the potential to build a really robust library here. The thing is, there are a lot of small things (stop sequences, token estimation, API call backoff, generic structures) that are just a pain to build yourself.
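For example, API call backoff, one of those small things, is easy to get subtly wrong. A minimal sketch (the callable and retry policy here are placeholders, not LangChain's implementation):

```python
# Exponential backoff with jitter around a flaky API call.
# call_fn stands in for whatever client call you're wrapping.
import random
import time

def with_backoff(call_fn, max_retries: int = 5, base_delay: float = 1.0):
    for attempt in range(max_retries):
        try:
            return call_fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            # Sleep 1s, 2s, 4s, ... plus jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
```

Small, but multiply it by stop sequences, token counting, and response parsing and the yak shaving adds up.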

But you're right that their moat will probably be razor thin. A few senior devs could get together and probably hackathon a library that's just like Langchain but much more robust. Thanks for an idea on what I'm gonna do this weekend lol.


> But you're right that their moat will probably be razor thin. A few senior devs could get together and probably hackathon a library that's just like Langchain but much more robust. Thanks for an idea on what I'm gonna do this weekend lol.

Did you do it? I'm doing something similar but in Rust, for my product. It'll be open source soon enough.


I was about to rant about the documentation, but I just checked and it seems to have improved a lot.


I wonder, is it already possible for an AI to write documentation from scratch based on a code base?


I agree that the agents are underappreciated.

To make them more accessible I rewrote them in ~200 lines of code, so you can easily understand how they work.

They have access to a python console, Google search and hacker news search:

https://github.com/mpaepper/llm_agents
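For readers skimming the thread, the essence of such an agent loop can be compressed even further. This is a hypothetical simplification, not the linked repo's actual code; the prompt conventions and regexes are made-up assumptions:

```python
# Skeleton of a ReAct-style agent loop. The LLM is a plain callable
# (prompt -> reply) so this runs against anything, including a fake in a test.
import re

def run_agent(llm, tools: dict, question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        # A final answer ends the loop.
        m = re.search(r"Final Answer:\s*(.*)", reply)
        if m:
            return m.group(1).strip()
        # Otherwise expect "Action: <tool>: <input>", run the tool,
        # and feed the observation back into the transcript.
        m = re.search(r"Action:\s*(\w+):\s*(.*)", reply)
        if m:
            tool, arg = m.group(1), m.group(2).strip()
            transcript += f"Observation: {tools[tool](arg)}\n"
    return "(gave up)"
```

The rest of any real implementation is prompt text, tool wiring, and error handling around this loop.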


In case anyone misses it buried in the readme, your accompanying blog post looks like the solid introduction to the subject that I've been looking for: https://www.paepper.com/blog/posts/intelligent-agents-guided...


I was looking through Langchain's docs and code last weekend. I'm surprised how well it is documented, actually. I thought it was fairly feature rich vis-a-vis potential chaining opportunities, but with obvious room to grow. Quite impressive, all things considered.

Excited to see what happens going forward.


I think LangChain is already outdated and it (and its copycats) are going to cripple the entire field.


It is, but it'll still get copious funding and let a few sly engineers escape the Matrix - so what's the harm?


Could you expand on what you think is the state of the art and direction we should be heading in?


Well, has the community tried analyzing what could be done better? And iterating on the design?

No, the design from an old academic paper was used with a much newer model. And now everyone is just copying that.

It works, because new models are impressive. But the design is far from being elegant or particularly efficient. And now, because tons of data like that is going to be generated, we’ll be stuck with it.


I mean, it's based on a paper[0] from November, no? Or is that called "old" in the AI world?

[0]: https://ai.googleblog.com/2022/11/react-synergizing-reasonin...


Yes, relatively old. The issue is that this approach was designed to work with "classical" language models, trained with "The Pile"-style methods. This particular one was PaLM 540B.

So essentially you have an approach designed for models that are not really instruction-following models and that truly are stochastic parrots.

The models have changed substantially since. But the approach of chaining them in this particular way stuck, and is getting copied everywhere without much thought.


Your answer doesn't make any sense regarding langchain to be honest.


Sure. I’m just expressing my opinion that the design is suboptimal and that the level of design is literally “You are GPT-3 and you suck at math” [quote from the LangChain code base].

I don’t want to see further expansion of this. I’m not offering higher level design, because I’m not sure about safety of all of this.

But having a poor and potentially unstable design like LangChain also doesn’t contribute to it :-/


Sorry to bring up an older thread, but I was looking into LangChain recently, and I was thinking of making something similar but in other languages. Do you have any insight into what direction is better to move in for LangChain-like alternatives?


It seems pretty rare that communities get "stuck with" a framework. Frameworks are pretty fluid. Eg, Python didn't get "stuck on Django," and they didn't get "stuck on flask," and they're not "stuck on FastAPI" now - the ecosystem continued to evolve, and none of these projects even had to die for a different vision of how a framework should be organized to capture the zeitgeist. They've each got dedicated communities which are continuing to improve them.

Similarly, I expect creative hackers to pursue new approaches in the space of LLM frameworks and for some of those to catch on, and that they don't need to uproot langchain to do so.


The difference is, lots of data is being generated, and open models in particular are trained on it. So there is a certain level of stickiness.

A file format is a close analogy. Imagine someone invents something nasty and bulky, like a twisted form of XML. Simply because it is early, it catches on. It could be buggy and unstable and ugly, but it still wins, because it is early.

The call here is to examine LangChain's design a bit closer, and maybe consider that a start from scratch is a good idea.


Could you expand on what you think is the state of the art and direction we should be heading in?


Why? What do you think the better model is?


There are two Javascript alternatives, but how robust the development is remains to be seen:

https://github.com/hwchase17/langchainjs

https://github.com/cfortuner/promptable


Where do you see Langchain fitting into the ecosystem once Open AI rolls out plugins more widely?


It still works well with other LLMs (like LLaMA and more).

Various small, open-sourced, vertical LLMs vs. one large GPT model would be quite interesting.


Then we will connect plugins visually as Unreal's Blueprint Visual Scripting allows.


Is this supposition, or actually the direction Langchain is headed?


It's unfortunate that pretty much all shiny AI things have horrible documentation. I see a lot of misinformation in non-researcher AI circles and I feel like it sometimes stems from that.


it is ironic that documentation is lacking when it can be generated with an LLM, using LangChain itself


The landgrab going on in this space is ferocious. If you weren't convinced this is mania, and a 10M "seed" round for Langchain doesn't do it for you, nothing will. Well done Harrison on the cash grab (take as much money off the table every round as you can), it's a smart move. But I can't shake the feeling that this ever increasing mania will sweep up anything OSS with vague traction and this whole AI space that used to be religiously defaulted to open and sharing will fairly quickly end up in VC-funded fiefdoms with pay-to-play being all that's left modulo the "forever free" community hobbled versions. Hope I'm wrong.


Interestingly this is shortly after another huge “seed” round in this space from fixie.ai (17mm). A lot of money being thrown at making it easier to chain/give LLMs access to other applications and tools.


Yeah, wow, a 17M seed round and not a hint of irony. Ferocious. Capital is DESPERATE to throw money into anything that appears to be related to LLMs. "Value creation event of our lifetimes" etc. There's grift money to made here and I'll be damned if some decent proportion of the "API wrapper + template UI" startups masquerading as AI companies aren't cashing in. Not sure I blame them.


I agree with you, and people miss that December 2022 is not when GPT became commercially possible.

Look at the startups built on OpenAI 1-2 years back - they largely sank at comparable or worse rates than any other startup from 2020/2021.

The GPT-3 first-wave companies - translations, basic quizzes, summarization tools, language learning apps, and of course the notorious paraphrase tools (almost entirely obsolete since ChatGPT) - can't be found in that form anymore; they've all been forced to shut down or significantly change functionality. Early on, OpenAI limited output to 300 tokens max - and less for most use cases, often 50-150. Chatbots were not allowed, IIRC, for over a year. If ChatGPT hadn't come along, much of what langchain enables wouldn't have either, nor would so many big companies now be willing to take the risk.

I can't name one over 2 years old that hasn't since been made obsolete by raw ChatGPT access, or that isn't now default dead due to competition from an existing unicorn (e.g. Duolingo, Quizlet Q-Chat) that is crushing them.

It must be painful to have spent $xx,xxx on GPT-3 at $0.06 per 1K tokens to obtain users, and now have your market ripped from you by a $B+ company paying $0.006...

So I doubt the template-UI startups will sustain retention or stay around long term, unless they really find traditional startup ways to nestle into niches and use cases/vendor lock-in. This isn't an innovator's dilemma for most companies; it's just an obvious, sensible thing to try at this point, so startups don't have much to balance against on risk. That being the case, the market surely should seem less appealing than the open-ended "blue ocean new value" of 2021 GPT tech, but - I guess not.


The steamroller is real in any super hot space, but that hazard is well known and founders should pay homage regularly to those who have fallen under its mighty squashing.


The question is whether the 2nd AI Winter kills Python the way the first killed Lisp.


Python has huge uptake outside of the ML domain. I can't see an AI winter affecting the many people using Numpy for non-ML purposes (i.e., scientists in academia, most of whom still deal with normal numerical and data analytical modeling, not ML) much less Django.


I'm...aware. I think a lot of us that aren't using NumPy or ML stuff are rather looking for a good excuse to get away from what the ecosystem has become. (Yeah, I'm still bitter about Py3...Python user since the 1.5.2 days)


Bitterness isn't really a winning strategy.

I thought the whole Python 3 thing was a huge problem. Lately I've been doing JS/Typescript dev and breaking changes like this happen continually and no one blinks.


What is the python3 thing?


Python 3 had incompatible changes with Python 2. You had to update your code, and for some years it meant lots of projects stayed on Python 2.


The problem was never your code, really, it was the dependencies.

For quite a while it wasn't hard to find situations where dependency A was Py3-compatible and B was not (and the incompatibility went both ways; especially in early 3.x releases, you could NOT have one codebase that worked with both).

Sometimes A dropped Py2 support before B gained Py3.

Pain pain pain.

Then add the increasing level of insanity as the answer to "python packaging sucks" was repeatedly to add yet another layer.


Plenty of other options out there if you've fallen out of love with Python and don't need good numerical libraries. Give JS, Elixir, or Crystal a try if you want something more dynamic. Nim if you want something a bit off the beaten path. Go, Rust, Java, Kotlin, and Scala if you want something more static.


Sadly it's what pays the bills atm.


God I hope so.


Could it swap, so we can have lisp back?


It would be an absolute godsend if we could, so one measly error wouldn't require restarting an entire process just to rewrite some small part.


That's really more erlang's niche.


I meant more like dynamically re-evaluating specific functions without having to restart the process. I haven't done much Erlang, but my experience doing so is in Lisp, which definitely can do that.


No, that's really Lisp's thing.


I think you're absolutely correct.

But this is by 'design'. I have a very unreasonable suspicion that the whole VC 'world of entrepreneurs' is just the way the USA government does R&D on an industrial-corporate scale. The 'brilliance' behind this way of doing R&D is that they only pick up the winners after they've won, so they don't "waste" money on R&D dead ends or moonshots.

On the other hand, this is a good way to cheaply 'explode' the technological applications of already-developed scientific innovations. Meaning none of those VC-backed startups are doing innovative research; in fact they are developing commercial applications for corporate overlords who, having seen who won, step in to buy them out.

Even the music industry is shifting to that model: they now only sign bands/artists/influencers who have already built their audience.


It's far superior to the EU model, where politicians are tasked with creating requirements for "the next innovations" and creating tenders; businesses pop up (or get recycled) to win those tenders, do the minimum to satisfy criteria that nobody really cares about, and try to siphon out as much money as possible using trusted subcontractors, meanwhile paying a huge part of the money for the privilege of winning the tender.

At least that's what's going on in Hungary, the most corrupt government in the EU; I hope other parts are a bit better.


I'm also worried that courts will decide that model weights are copyrightable and the open source free-for-all will be over.


If models can’t prove they are fully free of copyrighted data I don’t think they’ll have a leg to stand on there.


This is clearly not a given. Search engines are well-settled case law pointing in the opposite direction.


Search engines aren't a replacement for the original data, they're a way to direct traffic towards it.


The business model doesn't really change the IP considerations though.

(Additionally, newer LLM products like Perplexity.AI's correctly cite content sources, which makes them even more similar to search engines.)


These models will readily generate near identical outputs to copyrighted data, at length. This is not comparable to search.


I'd invite you to read "Foundation Models and Fair Use"[1], a paper written as a collaboration between Stanford's law school and computer science department.

It talks at length about this specific problem and mitigation techniques for it:

Existing foundation models are trained on copyrighted material. Deploying these models can pose both legal and ethical risks when data creators fail to receive appropriate attribution or compensation. In the United States and several other countries, copyrighted content may be used to build foundation models without incurring liability due to the fair use doctrine. However, there is a caveat: If the model produces output that is similar to copyrighted data, particularly in scenarios that affect the market of that data, fair use may no longer apply to the output of the model. In this work, we emphasize that fair use is not guaranteed, and additional work may be necessary to keep model development and deployment squarely in the realm of fair use. First, we survey the potential risks of developing and deploying foundation models based on copyrighted content. We review relevant U.S. case law, drawing parallels to existing and potential applications for generating text, source code, and visual art. Experiments confirm that popular foundation models can generate content considerably similar to copyrighted material. Second, we discuss technical mitigations that can help foundation models stay in line with fair use. We argue that more research is needed to align mitigation strategies with the current state of the law.

[1] https://arxiv.org/abs/2303.15715


Sounds like you're agreeing they are legally in murky territory.

Further, new laws get made in reaction to new things whenever they push an existing doctrine beyond the original ruling, and these are certainly in that territory.


> Sounds like you're agreeing they are legally in murky territory.

Of course. As I said originally "This is clearly not a given". It's very unclear how this will be decided, but anyone who thinks that just because models contain copyrighted data they don't have a leg to stand on is very wrong. There are multiple good arguments and precedents to show that they do, depending on the circumstances.


Huh, I still interpret "this is clearly not a given" as a contradictory reply to me saying it's murky territory.

I think they contain massive amounts of copyrighted data and reproduce it exactly, and that's why they don't have a leg to stand on. It's a personal opinion, and I think it's backed by your citation. But thanks for the reference, and glad to chat.


If you are holding copyright to something, it will be on you to prove it's in there.


I am obviously clueless, but if it goes to court, can't one demand to see what the training data is?


Based on what case law? In what jurisdiction?


It's very easy to show the models generating near-identical data to copyrighted data, which is enough to get courts to force them to allow discovery.


That this has not happened all over the place yet is evidence against it being as easy as you make it sound.


The LangChain community and ecosystem is one of the most interesting places in AI at the moment, but after spending some time with the codebase I can't shake the feeling that the constructs and way they are factored are hopelessly wrong. Not sure what to think about that.


I agree - LangChain is totally awesome and unlocks amazing new capabilities of LLMs (easily!), but the code itself is pretty awkward. No logging framework, for instance, and the strong typing enforcement with pydantic seems un-pythonic and brittle to me (what if I want to use a different kind of Mappable instead of a dict for something?). Still, great work overall and congrats to the team!


Whatever time pydantic is supposed to have saved me from making dumb mistakes, I've wasted 100x more trying to shoehorn stuff into some poorly thought-out schema.


For FastAPI, Pydantic ends up being part of the API contract. It's less about saving time and more about validating requests.


> what if I want to use a different kind of Mappable instead of a dict for something?

You can use custom field types either by using a `validator` with `pre=True` or by defining a class with a `__get_validators__` classmethod.

But pydantic does have problems, imho it isn't strict enough & has given me the wrong types when using `Union`s a bunch of times. Defining custom encoding & decoding behavior is harder than it needs to be. The purely narrative documentation is easy to learn but difficult to reference.

I would agree that strict typing is unpythonic, but in this case I think that's an outdated view of Python, one it's been backpedaling on with type annotations &c. I think Python made perfectly reasonable decisions about this some 30 years ago, and they haven't aged well, which isn't even really a criticism as much as a consequence of its enormous success. Python has outlived a lot of the ideas and attitudes about language design that went into making it, and carries them as scar tissue.


Indeed, the team has done momentum, community and probably other things well, congrats to them!


As of right now, LangChain is the easiest-to-use and most popular software library for composing LLM-powered systems together.

The funding will help the team make it into a polished product.

In particular, I expect LangChain will start offering prebuilt hosted services that developers can use as components for building third-party apps.

Congrats to the LangChain team!


Money != polish, money means they now need to find a way of earning money from it. This is sometimes how FOSS goes to die.


This one is right in the middle of the action - the plugin market. It is the Android to OpenAI's iOS. Everyone needs a second option.


Another thing people seem to forget is that Langchain can use LLMs that aren't made by OpenAI.

If OpenAI goes under, or a great open-source model comes onto the scene, Langchain can still do its thing.


This is a seed round; the goal is to get growth, and after series C/D/E the expectation will be to reach profitability.


Meaning, bait-and-switch users by all means necessary (but after series C). I'm really looking forward to it.


but you have to at least have a monetization strategy right?


Consulting, paid plugins.


FYI: the signup link on your home page leads to the about page instead.


Sorry, which home page, I don't have any )


Doh, I thought you were from LangChain.


Maybe a managed solution, too.


This is the path forward that I see for Langchain commercialization as well. As we begin to see more and more products built atop LLMs, these hosted services could form the backbone for a lot of those applications.

Disclaimer: We are Milvus (https://milvus.io) and Zilliz (https://zilliz.com), and we have integrations as a vector store in Langchain.


Are there examples of a langchain agent that uses chat API w/ gpt 3.5 or 4 and works well? I keep trying the library every once in a while, and I like it in theory, but the more impressive-seeming parts just don’t seem to actually work well.

Eg

- the Agent tool to retrieve webpage content just immediately fills up the context window. You can say “only use it for small pages”, but the agent picks the page!

- “as an AI language model I can’t use any fancy tools”

- I couldn’t mix and match ConversationalAgent with keeping intermediate steps and memory - gave me an error about only allowing certain chain output keys. This one could’ve been fixed by now, but to me it points to how the library’s features don’t necessarily work together just because they’re in the library together

I do find the library to be useful as a survey of prompting techniques though.

For my own chatbot that can use tools, I found I had to lean in to how ChatGPT _wants_ to respond, by getting it to reply in "code mode" (intro text, bullet list of steps, followed by code blocks and outro). From there the trick is to get it to write a single line of Python (or another lang) where it just calls a function. Then the non-code reply is its "scratch pad", followed by a code block with the action invocation. So the only parsing I'm doing is pulling out the code block and (don't boo me, it's just for fun) eval-ing it. But the response format is pretty consistent, so I could make a little mini-parser for Python function calls.

Before that, I tried and failed to do what langchain does: prompt-engineer the system prompt to get ChatGPT to respond in the correct format (thought, action, observation, answer).
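A hedged sketch of that parsing step, with `ast` standing in for the eval (the reply format and tool table here are invented for illustration, not the parent's actual code):

```python
# Pull the first fenced code block out of a model reply, then parse the single
# function call with `ast` instead of eval, so only whitelisted tools run.
import ast
import re

TOOLS = {"search": lambda q: f"results for {q}"}  # hypothetical tool table

def run_reply(reply: str) -> str:
    # Grab the first fenced code block ("`{3}" matches a triple backtick).
    m = re.search(r"`{3}(?:python)?\n(.*?)`{3}", reply, re.DOTALL)
    if not m:
        return "(no code block)"
    call = ast.parse(m.group(1).strip(), mode="eval").body
    # Allow only a bare call to a known tool with literal positional args.
    assert isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
    args = [ast.literal_eval(a) for a in call.args]
    return TOOLS[call.func.id](*args)
```

Same "mini-parser" idea, just without handing the model an open eval.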


It's definitely strange to go from _using_ "raw" GPT-4/ChatGPT, and how amazing that is for productivity, to having to go back to the old ways of doing things to build a langchain app.

You would think the "prompt-plugin-engineer" API would be as easy to use, if not easier.


Every time I read the langchain docs, I get confused, and realize that implementing these things myself won’t take much time and offer me transparency… maybe I’m not the target audience?


I feel the same way. I’m confused as to the value it’s really offering. The LLMs themselves do all the heavy lifting and all you have to do is feed in text and parse output text. Very basic programming. It seems like a bunch of hype to me. I’d rather build my own and have transparency with what’s going on and how prompts etc are being handled.


Well, I was going to work on a couple of langchain document loaders and an agent this weekend, but now after this raise? nah.

After reading about countless other awesome open source projects getting tainted by VC cash, what makes it so different this time?


Check out my agent implementation here which is just for fun and to get a better understanding. Maybe you want to contribute?

https://github.com/mpaepper/llm_agents


What will you use instead? Fixie?


Realistically, just steal the prompts and implement my own agents. The framework around it isn't terribly complex and you need to "turn everything off" so to speak when you want it to do your own stuff anyway.


> isn’t terribly complex

It’s trivial. The upside from langchain is so minimal that I wonder what experience with software development the people who love this thing really have.


Will you do that right away, or wait for LangChain's fall from grace?


Fixie got a 17M seed so I guess not LOL


I see. So is the idea to avoid companies that are getting resources to improve their product and grow faster?


I think the idea is to avoid contributing to money-horny companies using FOSS as marketing plans. Growing faster is not end-all-be-all, sometimes it's best to go at a reasonable speed.


It just seems like an emotional reaction rather than practical advice.


I find spending time working for a company without getting paid is impractical, but that might just be me.


What do you suspect could go wrong? Not attacking, just curious.


Personally I try to stay away from FOSS projects that raise VC funding; it signals to me that they are using FOSS as a way to do marketing, not developing a community-based project.

Plenty of FOSS projects that tried to make it in a for-profit, VC-funded world eventually disappear when the VCs can't extract any money back and the project never becomes profitable, and then the creators/maintainers lose interest as there is no longer any money to be made.

Lastly, I generally don't work for for-profit companies for free. If they are non-profit, then it'd be fine. But you make money from my work? Hm, call me a socialist, but at least I should get a part of the profits then.


But that is where OSS licenses come in. If it's not something like AGPL or some self-invented thing, then you can just make a fork, right? You are only helping the community in that case.

I agree with you, by the way, but for MIT or Apache licenses, I think even if the company goes full commercial or shuts down at some point, they gave us enough in the first place to warrant some trust and help. For the 'we are open but clearly using that fact for marketing' licenses, I indeed refuse to use the software and definitely cannot contribute to it (though no one who doesn't work there will anyway), as you cannot make a meaningful fork in case it goes bad.


Can someone explain how LangChain compares to Llama Index [1]?

[1] https://github.com/jerryjliu/llama_index


Langchain lets an LLM initiate query and prompt workflows, observe the results in a sequential or conditional logic chain, and invoke agents, while GPT Index (now Llama Index) excels at connecting an LLM to external data sources and implementing creative indexing schemes for querying within given token context windows. Right now they both fill a need and complement each other nicely.


Do you have any good/interesting examples of these prompt workflows with langchain? I can see the utility in the abstract, but haven’t seen anything out in the world beyond clever toy projects.


This is a good cookbook style collection of notebooks that illustrate the Langchain functionality https://github.com/gkamradt/langchain-tutorials

Pinecone has some comparable langchain 'recipe' notebooks in their pinecone-io/examples GitHub repo.


Thank you! I’ll check these out!


Be careful out there when using Pinecone. The pricing seems to be confusing. https://twitter.com/Exploringfornow/status/16426070792852193...


Yikes! Thanks for the heads up.


Entirely notional as of now, but I plan on testing it for game development. The thought is, you could create an embedding of the lore, data, facts and state of your game world and then interrogate the LLM about this in various ways. I see this being very useful both during the authoring of the game, and at runtime.
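As a toy sketch of the retrieval half of that idea, with a bag-of-words stand-in for a real embedding model (purely illustrative):

```python
# Embed lore snippets and find the most relevant ones for a question by
# cosine similarity. embed() is a toy word-count stand-in for a real
# embedding model; swap in actual embeddings for real use.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_lore(lore: list[str], question: str, k: int = 2) -> list[str]:
    q = embed(question)
    return sorted(lore, key=lambda s: cosine(embed(s), q), reverse=True)[:k]
```

The retrieved snippets would then be stuffed into the LLM's context alongside the player's question, both at authoring time and at runtime.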


That’s awesome, wonderful idea.

“State Machine games” is incredibly hard to Google!


I'm realizing I probably should update my bio on here more than once every decade! Currently working full time in a related industry, so most of my AI experimentations are strictly on my own time these days. If anyone is interested in these same sorts of things I am on twitter @TrevorJ


I tried to use langchain but found some bits broken/suboptimal and quickly found myself hacking around it. It certainly has useful concepts/patterns but you aren't using most of them and it's generally easier to just implement what you need for your specific problem - at least at this stage. Maybe once the ecosystem grows more the value proposition changes.


Glad that others are having the same thoughts as I was and I’m not just crazy or missing something big here. I am trying to understand why I’d actually want to use it vs just interfacing with the LLM api directly. For most use cases it is very straightforward and it is good to understand every step of the interaction rather than a “black box”


Open source is the new marketing gimmick for raising funding.


And hijacking HN to do so.

Not sure what’s worse. That people do it, or that it works.


New?


Good luck to the team!

We went back and forth between different APIs here before coming back to langchain. The March release with async streaming, and the growing plugin work, are examples of why.

Balancing VC funding with OSS will be hard. I like having a neutral player here, separate from the model and DB co's, so huge value. But until they prove a scalable revenue model, huge risk of them fighting their community, and the biggest coders are smart enough nowadays to recognize and jump ship: who wants to give free work to someone else + risk donating their entire income stream to them?

So I'd love to see a lot more clarity on business model and clear lines in the sand for this not being a big rug pull at the cost of its users & developers. The next 2-3 weeks are pretty critical here for community comms, especially given the competition.

Again, good luck to the team!


kind of amazing that something that is haphazardly slapped together was able to pull a business plan together to get funding.


"haphazardly slapped together" describes ~99% of all startups that raise seed funding, and is often correlated with success.


LangChain is especially slapped together. Not something I would use in production (especially with the eval bug that was left unpatched for weeks despite an open PR)

It feels at some point being too slapped together isn’t correlated with success

Wishing the LangChain team well! But there are already alternatives cropping up that are far more polished (Microsoft’s Semantic Kernel being one)


I don't know if it's as much slapped together as it is just "simple". I feel super arrogant and very HN-esque saying it but I did a deep dive into the code and it isn't anything to write home about engineering wise.

What it does offer is standardization of composing LLM prompts - something that a team within a company can spend a week max to write an SOP doc/implementation for.
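To illustrate the point about how small that standardization really is, here is a rough in-house version of the "prompt template + chain" pattern. All names are made up; this is not LangChain's API, just a sketch of the core idea.

```python
class PromptTemplate:
    """A prompt string with named slots to fill in."""
    def __init__(self, template):
        self.template = template

    def format(self, **kwargs):
        return self.template.format(**kwargs)

class Chain:
    """Run templates in sequence, piping each completion into the
    next template's `text` slot."""
    def __init__(self, llm, *templates):
        self.llm = llm  # any callable: prompt string -> completion string
        self.templates = templates

    def run(self, **kwargs):
        out = None
        for t in self.templates:
            prompt = t.format(**kwargs) if out is None else t.format(text=out)
            out = self.llm(prompt)
        return out

# Stub LLM so the sketch runs without an API key; a real model call
# would go here instead.
echo_llm = lambda prompt: f"[completion of: {prompt}]"

chain = Chain(
    echo_llm,
    PromptTemplate("Summarize: {doc}"),
    PromptTemplate("Translate to French: {text}"),
)
print(chain.run(doc="LangChain raised a $10M seed round."))
```

That's roughly the week-of-effort SOP version; what a framework adds on top is mostly integrations, not depth.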


Yeah, it is not a deep layer of abstraction. I don't get what the hype is about.

The effort to develop LLM stuff is surprisingly low; putting a framework on top of it seems like overkill.


Well $10M “seed” rounds are still a thing I guess.


It is hardly a seed round. Langchain is the most mature tool (not saying much) in this new LLM revolution. Like a market leader doing a seed round.


Anything less than $10M would be wholly inappropriate for the moment. I’m shocked it wasn’t much more. Surely this will be followed by $100M within months. Rarely has there been a field of technology that has moved as rapidly as LLMs are in the last two months.


This was my first thought too - I'm starting to question it, because they do have a product. Whether they have fit, will be interesting to see. More than most of the hype seed rounds in 2021.


What was the most hyped seed in 2021?


I think LangChain will be the "backend" for many AI based SaaS companies in the future.


Interestingly, on Feb 13th it was already reported [0] that LangChain was in talks with Benchmark about a seed round.

https://www.theinformation.com/articles/benchmark-expected-t...


LangChain is bad at documentation. I hate it.


Yeah it’s kind of a disaster. Partly because they’ve been moving so fast, and their docs don’t keep up. But for actually building stuff on it, they have a ways to go.

Good for quick demos on Twitter though.


They should use ... you know ... LangChain to write documentation.


Submit some PRs!


they received funding, they should be submitting PRs now?


Unfortunately, depositing $10 million into a bank account does not also deposit the butts of 50 engineers into seats. But if you're looking for a job, now would be a good time to reach out to them and offer your services!


Or write a competitor and get some of that sweet, sweet merger money.


Congrats to LangChain. Wondering what the dev workflow looks like when doing prompt engineering with langchain/haystack.

- Is there a way to view the final log?

- Why isn't there a simple way to see how the abstractions will look in the final output?

- Can we change the output of those abstractions (tools, agents, etc.)?

Unfortunately, I see no plans about documentation whatsoever.
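One way to get the "final prompt" visibility asked about above without framework support: wrap the LLM callable so every prompt string is recorded before it is sent. Purely illustrative; `logged` and `fake_llm` are invented names, not a langchain/haystack feature.

```python
def logged(llm, sink):
    """Return an LLM wrapper that records each final prompt in `sink`."""
    def wrapper(prompt):
        sink.append(prompt)  # the exact string the model will see
        print("--- final prompt ---\n" + prompt)
        return llm(prompt)
    return wrapper

seen = []
# Stub model so the sketch runs without an API key.
fake_llm = logged(lambda p: "four, matey", seen)
fake_llm("Answer as a pirate: What is 2+2?")
```

Whatever templating or agent machinery sits above this, `seen` ends up holding the fully rendered prompts, which answers the "what did the abstraction actually produce" question.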


Congratulations Harrison, Ankush, and team.

I first heard about Langchain at the Scale hackathon when I think it was Colin who was like “half the people are using Langchain” and I was like whatchain?

Amazing trajectory!


Congrats to Harrison and the team. The pace of Langchain development is pretty stunning and they deserve the shot to make this happen.


For folks interested in this space who actually want to work on open source, now is a good time to fork LangChain.

The moment they raised VC funding the open source project is dead. All their incentives are now to 100x the investment they just raised. They would start putting core features behind an enterprise license.

Contributors of langchain please fork the project and make a better project! Stop sending free contributions to make the investors rich.


This isn't true; many open source companies make money in ways that don't diminish the core open source product: Kafka/Elasticsearch/MongoDB/Supabase/Vercel/Grafana/Red Hat/etc.


Most of them have the core features put behind an enterprise license.


Correct


This is so true. Sad to see a promising open source project being sold out. I now doubt that langchain can ever truly reach its potential of supporting additional models besides OpenAI's, or be able to support user-contributed tools or agents.


Why not provide the free contributions to an open source fork yourself instead of begging others to give up their own time?


The appeal is to existing contributors, or to people who are interested in open source and want to work in this space. I am neither of those, nor do I use the product.


If you're unwilling to do it yourself then don't demand others do it.


Why not? They have free will and can choose to do it if they want to.


Could use more competition


Langchain has been amazing for me to demonstrate what's possible as a product guy, but my engineers are highly reluctant to use it for production deployment. I half get their concerns, many of which are in this thread already. I would say they have 6 months to prove out their enterprise value; I hope they can do it.


This is great news! Things are moving so fast; I remember we started to help out with our team after we saw the first Weaviate pull request https://github.com/hwchase17/langchain/pull/261


Is it no longer possible to just have a nice open source project without a company existing around it?


People want to eat.


Wonder what the monetization route will be with so many alternative projects.


TL;DR & DIY: I asked GPT-4 this prompt just now: "Cluster the top 10 categories of complaints by the users, and describe each category with a few adjectives/nouns in order of importance."

Crisp or too critical?

1. Documentation: lacking, inadequate, outdated

2. Code quality: simple, awkward, suboptimal

3. Production readiness: experimental, unreliable, limited

4. Monetization: unclear, risky, potentially detrimental to open-source

5. Community support: misinformation, poor communication, fragmented

6. Ecosystem: competing alternatives, redundancy, unclear positioning

7. Business model: potential rug-pull, VC-funded, uncertain sustainability

8. Developer experience: poor ergonomics, type erasure, confusing

9. Performance: slow, afterthought, poor observability

10. Maintenance: unpatched bugs, slow response to issues, dependency on contributors


The investment is all about the vision LangChain proposes & I guess they have 20K stars and a decent amount of users.


Good for them! If you've been following their Twitter and Discord, their speed has been remarkable


Anyone know if they're hiring (Product Designer)? Their web presence is pretty sparse.


Things are moving like crazy!


$10m Seed Round?


wow



