nborwankar's comments | Hacker News

Not sure if this article provides enough definitive info - happened to see it yesterday.

https://www.newscientist.com/article/mg26335104-500-the-brai...


Seems ironic to be working for a company without integrity, drawing a salary, i.e. your livelihood, from such an organization, while believing somehow your personal integrity is not already impacted by doing so. The two seem incompatible to an outside observer.


Can you share what kinds of problems were conducive to hyperbolic embeddings in your experience? Also, separately, are you saying companies are using these in practice but don't talk about them because of the advantage they give? Or am I reading too much into your last sentence?


They are better at separating clusters, and they retain the property that distances under the correct metric also carry semantic information. The issue is that training takes longer, and you need at least 32-bit, ideally 64-bit, floats during training and inference.

And possibly.

The company I did the work for kept it very quiet. BERT-like models are small enough that you can train them on a workstation today, so there is a lot less prestige in them than 5 years ago, which is why for-profit companies don't write papers on them any more.
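For intuition, here's a minimal sketch of the distance computation in the Poincare ball model (one common formulation of hyperbolic embeddings - the function and sample points are just illustrative), including why the float width matters:

    import numpy as np

    def poincare_distance(u, v, eps=1e-15):
        """Geodesic distance between two points in the Poincare ball.

        d(u, v) = arccosh(1 + 2*||u-v||^2 / ((1-||u||^2) * (1-||v||^2)))
        """
        u = np.asarray(u, dtype=np.float64)
        v = np.asarray(v, dtype=np.float64)
        sq_diff = np.sum((u - v) ** 2)
        # (1 - ||x||^2) suffers catastrophic cancellation in float32 as
        # points approach the unit-ball boundary, which is why 64-bit
        # floats matter during training and inference.
        denom = max((1.0 - np.sum(u * u)) * (1.0 - np.sum(v * v)), eps)
        return np.arccosh(1.0 + 2.0 * sq_diff / denom)

    # Points pushed toward the boundary, where hyperbolic space "spreads
    # out" and clusters become easier to separate.
    a = np.array([0.99999, 0.0])
    b = np.array([0.0, 0.99999])
    print(poincare_distance(a, b))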


They have vacuum-cup-like pads on their feet?


+1 for Mathematica's documentation system. It is the best documentation of any software, OSS/free or closed/commercial. By far.


This is related to a classic relational database problem called the "bill of materials" problem - suggest you Google that. Also look for Joe Celko's book on graphs and networks in SQL, where he goes into modeling graphs in SQL tables. Naive formulations lead to deeply nested joins whose computational cost balloons.
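For concreteness, a minimal sketch of the recursive-CTE formulation that sidesteps the nested-join blowup, using SQLite from Python and a made-up parts table:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical bill-of-materials table: each row says that
        -- `parent` directly contains `child`.
        CREATE TABLE bom (parent TEXT, child TEXT);
        INSERT INTO bom VALUES
            ('bike', 'wheel'), ('bike', 'frame'),
            ('wheel', 'spoke'), ('wheel', 'rim'),
            ('frame', 'tube');
    """)

    # One recursive CTE replaces an unbounded stack of self-joins:
    # depth is handled by the recursion, not by the query's shape.
    rows = conn.execute("""
        WITH RECURSIVE parts(name, depth) AS (
            SELECT child, 1 FROM bom WHERE parent = 'bike'
            UNION ALL
            SELECT bom.child, parts.depth + 1
            FROM bom JOIN parts ON bom.parent = parts.name
        )
        SELECT name, depth FROM parts
    """).fetchall()
    print(rows)  # every part reachable from 'bike', with its depth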


Thanks for the rec. Yeah, those are the issues I'm running into. What makes it worse is that users can define their own nesting, so building the dynamic queries in application code is becoming really difficult. Add on the problem of needing to comply with a rigid external API for those consumers.

I've been setting up some test data in a graph DB and I feel like I can toss out the 1400 LOC it took to build the dynamic query and just have the client pass in the JSON of what they want.

The current new implementation is about 200 LOC.

Personally I don't mind whether SQL can do it or not. I care about being able to test easily, and to detect and fix bugs without needing to bring out a whiteboard. Saving 1200 LOC is already such a good feeling.


Gives a whole new meaning to "manufactured consent".


If it could support the pgvector extension it would be a super-fast vector database with all the power of Pg - the relational aspect brings the ability to add and query using rich domain-specific metadata usually contained in relational databases.
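For example, the kind of query this would enable - a sketch against a hypothetical documents table, using pgvector's <-> L2-distance operator via psycopg2 (the DSN, schema, and embedding dimension are made up):

    import psycopg2

    # Hypothetical schema:
    #   documents(id, title, published_at, tags text[], embedding vector(384))
    conn = psycopg2.connect("dbname=docs")  # placeholder DSN
    query_embedding = [0.1] * 384  # normally produced by your embedding model
    vec_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"

    with conn.cursor() as cur:
        # Relational filters and vector similarity in one statement:
        # metadata narrows the candidate set, <-> ranks by L2 distance.
        cur.execute(
            """
            SELECT id, title
            FROM documents
            WHERE published_at > now() - interval '1 year'
              AND 'postgres' = ANY(tags)
            ORDER BY embedding <-> %s::vector
            LIMIT 10
            """,
            (vec_literal,),
        )
        for doc_id, title in cur.fetchall():
            print(doc_id, title)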


I spent last week trying to do that with some of the other embedded Pg libs.

And then LanceDB released their embedded client for Rust, so I went toward that. But it's still lacking FTS, so I fell back to SQLite. I have some notes here: https://shelbyjenkins.github.io/blog/retrieval-is-all-you-ne...


+1 on fastmail


I didn’t take a year fully off, but last year was slow in my consulting work, so I did a deep dive on the emerging LLM area.

One recommendation - get a beefy Mac laptop so you can run LLMs locally. I got an M2 with 96 GB RAM. It makes a huge difference to your thinking about LLMs when you can run your own and integrate it with little tasks here and there for experimenting.

Otherwise, I find most people think only of centralized, closed LLMs when they think of what’s possible. Severely limiting.

/r/LocalLLaMA on Reddit is a great community. Other than that, check out llama.cpp and ggml and the whole ecosystem around them.
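If you go the llama.cpp route, the llama-cpp-python bindings make a local experiment a screenful of code - a sketch assuming you've installed that package and downloaded a GGUF model (the model path below is a placeholder):

    from llama_cpp import Llama

    # Placeholder path: point it at any GGUF model you've downloaded.
    llm = Llama(
        model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",
        n_ctx=4096,       # context window
        n_gpu_layers=-1,  # offload all layers to Metal on an M-series Mac
    )

    out = llm(
        "Summarize in one sentence why local LLMs are useful for experimentation:",
        max_tokens=128,
    )
    print(out["choices"][0]["text"])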

Cheers and good luck. Ping me on DM if you want more pointers.

