Hacker News new | past | comments | ask | show | jobs | submit | jpulec's comments login

I've been using the JS version of langchain for a few months now, and despite there being a lot of valid criticism, (especially around the abstractions it provides) I'm still glad to be using it.

We get the benefits of a well used library, which means making certain changes is easy. For example, swapping our vector database was a one line change, as was swapping our cache provider. When OpenAI released GPT-4, we were able to change one parameter, and everything still just worked.

Sure, it's moving fast, and could use a lot better documentation. At this point, I've probably read the entire source code several times over. But when we start testing performance of different models, or decide that we need to add persistent replayability to chains of LLM calls, it should be pretty easy. These things matter to production applications.


> When OpenAI released GPT-4, we were able to change one parameter, and everything still just worked.

Wouldn't that be the same if you used the OAI js library directly? Basically swapping the model parameter?


It's not. The API is different, since GPT-4 is a chat based model, and davinci isn't. It's not a huge difference, but these little sort of things add up.


It is a very minor change (made the changes in minutes and didn't have to bring in a new framework for it).


Agreed. It's a trivial change, certainly not a justification for using langchain.


I see, thought you were using GPT-3.5 and moved to GPT-4.


Arguing that Tailwind is a leaky abstraction for CSS is like arguing that ORMs are leaky abstractions for SQL; hiding the underlying implementation isn't the point of these tools.

The biggest benefit is that you get a fairly well thought out API to work with. In the case of Tailwind, this a pretty flexible and good set of defaults that works for 95% of use cases. You can focus more on building classes for cases specific to your site, and not spend time rebuilding undifferentiated layout utilities.


I launched mergecaravan.com to deal with a problem I've encountered at a few jobs in the past. Mostly, I built it initially to scratch my own itch, and to go back to doing more django dev.

Basically, you can add a label to a PR in github and it will then queue it up to be merged once all the required checks pass, and it keeps queued PRs "up-to-date".

It's made a little money, but not much.

Github recently rolled out a feature at Github Universe that has overlap, so I'm guessing it won't get much more traction.

A few lessons I learned: - Especially when building on a platform, make sure you have the right niche. In this case, it probably has a wide enough audience that Github decided it was worth it to build as part of the platform. - Like any engineer, I spent too long building and let scope creep delay me from launching.

All in all, it's pretty cheap (read: basically free) to run, but I probably committed somewhere over 120 hours on it.


That sounds almost exactly like bors-ng: https://github.com/bors-ng/bors-ng#a-merge-bot-for-github-pu...

There was also homu, which predated bors and also did more or less the exact same thing.

Does your thing have any features that differentiate it from what bors does?


You don't have to host it yourself ;)

Beyond that, there are a couple of things that I don't believe bors has.

A big one that I built for myself was the idea of supporting "working hours", i.e. only merge code during this timeframe.

For example, one company I worked for had some pretty flaky tests. Unfortunately, what would happen is we would have several PRs get reviewed and then use a tool like bors to enqueue them. Inevitably, some queued PR would have a failed check for this flaky test.

Fast forward to the weekend and a completely different developer would merge a hotfix into master. Unfortunately, a side effect would be that a tool like bors would try to merge the head commit into the PR with the flaky test, and now it passes! So it gets deployed at a random unexpected time, which isn't what we wanted.


Gitlab also has merge trains. Atlassian open sourced Landkid which does the same thing, initially just for Bitbucket Cloud, but it’s pluggable.


As a very happy Metabase user, glad to see this. Hope the team uses the money to make the project better while still keeping its open source roots.


I've used StitchData at a startup with AWS Redshift. Pair it with something like dbt for transforming your data, and you have a great match. A little pricey, but totally worth it, IMO.


Chewse | Fullstack Developer | San Francisco | ONSITE | https://www.chewse.com | $115-162k

Chewse is weird little family who works with offices to run their meal programs.

We're looking for individuals who want to work as part of a small team, and have a lot of responsibility for what they produce. Humble confidence strictly required. Previous experience with Python and JS nice, but not required.

Process: Initial phone screen, technical phone screen and take home question, video chat, full day onsite

Come bring your heart to work!


Came here looking for a comment like this. If you're not discussing HATEOAS at all, you will be losing out on many of the benefits that REST provides.


It's the 'central dogma' of RESTful services.


Having a clear salary expectation up front just saves both parties time and effort. One of the big reasons salary ranges aren't always public, is because it benefits the company, by maintaining information asymmetry. By but doing that, you're at least in a small way, telling me that you don't want to play that game and are more likely to be transparent with me about other parts of the process.


I was the first engineer hire at a small startup, and have helped our CTO/Co-founder grow the company over the past 3 years.

All I can say, is that almost everything that was said here was the exact experience we had. Even down to the choice of Angular 1.X and rewriting all of the IIFEs in our codebase to use ES6 imports with babel.

I also need to acknowledge that PM is something that you do fine with 2 people, but your processes will fall apart, probably as soon as you even hit 4 or 5 people.


While I resonate with the sentiment, I just wish Python would add better syntax for functional programming. Having written a lot of JavaScript lately, I wish Python's built-in functional tools supported something cleaner, kinda like Underscore/Lodash.


Guido is against it, so I doubt it will ever happen.

https://news.ycombinator.com/item?id=9973301


Good good. One more reason to pry 2.7 out of Guido's hands and into 2.8.


Send to me something like Coconut which implements that in a way which accepts existing Python but adds cleaner functional syntax is one way of making the case for that in future Python.


Specifically?


Not really a fan of the lambda syntax in Python. Comparison:

JavaScript:

  let arr = [1, 2, 3]
  let sumOfSquares = arr.map(n => n * n).reduce((a, b) => a + b) // 14
Python:

  arr = [1, 2, 3]
  sum_of_squares = reduce(lambda a, b: a + b, map(lambda n: n * n, arr)) # 14


Is there a more Pythonic way to do it? Lambdas are cool but usually not the first place you go in Python. I would think something like (my best guess, not a Python pro)...

    sum_of_squares = sum([x*x for x in arr])
Which I think is easier to read than either example post above.

Of course you will point out that this is less powerful than full map and reduce.. but meh... pros and cons to both styles


I would write it like this, to avoid constructing the immediate list:

    sum_of_squares = sum(x*x for x in arr)
This makes use of https://www.python.org/dev/peps/pep-0289/


Thanks for the link, this is good to know.


Worth noting that map() can be parallelized whereas a list comprehension can't necessarily (since it is an explicit loop). The multiprocessing module allows trivial map parallelization, but can't work on list comprehensions.

It's more than just stylistic.


So I have coded everything from dumb web servers (tm), to high performance trading engines (tm). I have toyed with doing the list in parallel thing... and used it in a toy GUI tool or two I wrote... but never really found it that useful in the real world. If you actually want high performance, doing a parallel map is not going to be fast enough. If you are a dumb web server, it's a waste of overhead 99% of the time.

But hey, if you want to use map when you actually need to do a parallel map, cool. But seems very very uncommon. ~ 1 in 10,000 maps I write.


I don't think this is the case, list comprehensions can be expressed as syntax sugar over list functions, it's how they work in Scala for example

http://docs.scala-lang.org/tutorials/FAQ/yield.html


map() can only be parallelized if the function has no side effects. If there are no side effects, list comprehensions can be parallelized just as well


That example works only because the function sum is already defined in Python. If you wanted to do something less common than summing up elements you would have to either use reduce or implement a for loop.


In Python 3, reduce was intentionally moved into the functools library because it was argued that its two biggest use cases by far were sum and product, which were both added as builtins. In my experience, this has very much been the case. Reduce is still there if you need it, and isn't any more verbose. The only thing that is a little bit more gross about this example is the lambda syntax; I would argue that even that is a moot point, however, since Python supports first-class functions, so you can always just write your complicated reduce function out and then plug it in.


True, but I've used python a lot, and I've used reduce maybe...twice? (well, twice that I can find on my github at least)


I just counted the number of reduce I used in my current python project (6k lines). reduce comes up 32 times. And by comparison, map is used 159 times and filter 125 times - for some reason I tend to use list comprehensions less than I should.


I also thing you use map not a ton if you get the list comprehension syntax down... it is a map with less cruft mostly...


curious how often reduce is used with something else than operator.add?


One occurence was to calculate the GCD of a list of polynomials.

In fact I had "reduce" appearing in the names of some of my variables so I used it less than 32 times, about 20 times in that project.


I see nothing particularly inspiring in the examples posted on http://stackoverflow.com/questions/15995/useful-code-which-u...

Could you show your reduce calls?


They are very similar to this one from your link: http://stackoverflow.com/questions/15995/useful-code-which-u...


Well, or write a static method somewhere that you call. Sum is used a lot, so handy it is written somewhere (vs having to do a lambda x + x thing.


That seems like an argument against lambda functions in general - why use lambdas when you can define a static function for every case? Well, the answer in my opinion is because it makes code more readable if you can define a simple lambda function instead of having to name every single function in the code base.


Well if you are going to reuse the function, name it. If it is a 1 time thing, use a lambda.


Sounds great in theory. Problem is if you need a lambda that isn't a single expression, you then have to name it. Welcome to the conversation.


What's the advantage of list comprehension over lambdas (assuming the lambda syntax is decently lightweight)?

I feel like I come down hard on the side of lambdas, but I've never really spent enough time in a language with list comprehension, so there's a good chance I'm missing something.


how can you come down hard on the side of one when you've never experienced the other?

I'm from a non-list-comprehension background too, but recently started working a lot in a large python codebase, and have found the dict/list comprehensions to be beautiful. I'm a huge fan. It's a shame lambda syntax is not the best and it's generally crippled, but comprehensions are a great 80/20 compromise for handling most cases very cleanly.


I find it a lot easier to read, part of which is that I'm used to the Scala way of sequence dot map function. When I see the python one I can't remember if the function comes first or the array.


I'm not positive, but I think it saves the need to create a new execution frame for each lambda call, since the whole loop executes in single frame used by the comprehension.

In theory I suppose the VM could have a map() implementation which opportunistically extracts the code from a lambda and inlines them when possible; but doubt CPython does that. OTOH, I'd be surprised if PyPy doesn't do something like that.


Since Python 3, both generators and lists create a new stack frame. [1] (2nd to last paragraph)

[1] http://python-history.blogspot.com/2010/06/from-list-compreh...


I'm not meaning when the comprehension is invoked, but during each iteration of the loop within the comprehension.

When doing something like `map(lambda x: 2+x, range(100))`, there will be 101 frames created: the outer frame, and 100 for each invocation of the lambda.

Whereas `[2+x for x in range(100)]` will only create 2: one for the outer frame, and one for the comprehension.


I think it's just less to type really, and it's considered the more standard way to do it.


For simple mathematical operations you can import them as functions:

    from operator import mul, add
    arr = [1, 2, 3]
    sum_of_squares = reduce(add, map(mul, arr, arr))


It's even more concise in Clojure:

    (defn sum-of-squares [a] (reduce + (map #(* % %) a)))
    (sum-of-squares [1, 2, 3]) ; => 14


I think lambda syntax can be a bit cumbersome, but that aside what I really miss is a clean syntax for chaining functional operations. So often I find myself thinking about data in terms of 'pipelines'. i.e. in JS:

  _.chain(values)
   .map(() => {})
   .flatten()
   .compact()
   .uniq()
   .value()
vs Python where doing the same thing becomes either a nested mess of function calls or comprehensions or a for loop.


But that's a function of API, not the language itself. Django (and most ORMs, I believe) support that kind of behavior:

    MyTable.objects.
        filter(some_row__gt=5).
        exlude(other_row='q').
        order_by('other_row')
The Python iterable APIs have decided to use nesting rather than chaining, but you can still have an underscore-like API: https://github.com/serkanyersen/underscore.py

The bigger problem remains: lambda functions are hideous in Python. map() will forever be ugly if you try to use it in the same way it is used in most functional languages.


This sort of API is hard to implement in Python though, because there's no formal notion of interfaces, so you cannot extend all iterables generically. So you need to use free functions (which don't read well when chained) or a wrapper object (ick).


Elixir and F# have my favorite syntax for that:

    values |> map(&({})) |> flatten |> compact |> uniq
Although, the closure syntax is a little clunky before you get used to it.


I've been thinking that it might be nice to use chaining (though I didn't know it had a name) in ordinary mathematical notation too, writing "x f g" instead of "g(f(x))".


You don't like using pytoolz? Pseudocode:

    result = pytoolz.pipe(values, map, flatten, compact, uniq, value)
    # or
    func = pytoolz.compose(value, uniq, compact, flatten, map)
    results = func(values)


Looks interesting. How do you tell map what function to use?


You probably need `compose(foo, partial(map, fn), bar)`.


Yeah, you do this or use curried version:

    pytoolz.curried.map(fn)


I think a big thing for many is the lambda: syntax, as well as the lack of full anonymous functions.


The latter can't really happen given Python is a statements- and indentations-based language. You'd need some really weird meta-magical syntax which really isn't going to happen in Python. Although you can cheat by fucking around with decorators e.g.

    def postfix(fn, *args):
        return lambda arg: fn(arg, *args)

    @postfix(map, range(5))
    def result(i):
        return i * i

    print result
    # [0, 1, 4, 9, 16]
(`postfix` is necessary because `map` takes its argument positionally so it's not possible to pass in the sequence with functools.partial)


> The latter can't really happen given Python is a statements- and indentations-based language.

Yeah, though I suppose you could hack around that and get nearly-full functionality in lambdas if you built a library that either wrapped non-expression statements in functions or provided equivalent functions. There are obviously some statements that there aren't good solutions for in that direction.

OTOH, using named functions is in many cases more readable -- in the context of what is otherwise a normal Python codebase -- than the kind of lambdas that you can't easily write in Python. But I like the Coconut approach of but providing a more concise syntax for the kind of lambdas Python already supports.


I agree that typing out the word lambda is annoying, but you can use them as fully anonymous functions.


> you can use them as fully anonymous functions.

A lambda can only contain a single expression, by "full anonymous function" I'm guessing hexane360 means multiple statements. You can't put a for loop or a context manager in a lambda for instance.


You can nest lambdas to get the equivalent of several expressions. I once wrote a Runge-Kutta example on Rosetta Code showing this:

http://rosettacode.org/wiki/Runge-Kutta_method#using_lambda

It does not look as bad as one might expect, though the nesting of parenthesis makes things messy.


You can but you still can't get statements in there.


You can hack together multiple expressions chaining them with and http://sigusr2.net/one-line-echo-server-using-let-python.htm...


That still doesn't get you statements.

You can't get context managers or exception handling (although you can raise exceptions) into lambdas, I've tried.


Well you might be able to if you add a bunch of named function combinators wrapping these, but definitely not with only lambdas, unless you define your combinators using `ast`, which I think would let you define statements via expressions.


That's an awful lot of effort to go to to avoid naming a function.


Well sure, you could do something like

    def apply_ctx(ctx, func, *func_args, **func_kwargs):
        with ctx as __ctx:
            func(*func_args, **func_kwargs, ctx=__ctx)
but you can't do that itself as a lambda. And I consider modifying the ast cheating :P


> but you can't do that itself as a lambda. And I consider modifying the ast cheating :P

No disagreement, really depends whether you're a "rules" or "spirit" kind of person though.


Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: