
Hyperbolic embeddings have been an interest of mine ever since the Max Nickel paper. Would love to connect directly to discuss this topic if you're open. Here's my email: https://photos.app.goo.gl/1khCwXBsVBuEP6xF7

I've had some success using hyperbolic embeddings for BERT-like models.

It's not something that the companies I've worked for advertised or wrote papers about.
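For anyone unfamiliar: the Nickel & Kiela paper embeds items in the Poincaré ball, where distances grow rapidly near the boundary, which is what lets tree-like hierarchies fit in few dimensions. A rough numpy sketch of that distance function (just my illustration, not anyone's production code):

    import numpy as np

    def poincare_distance(u, v, eps=1e-9):
        # u, v: points strictly inside the unit ball (Poincare ball model)
        diff = np.sum((u - v) ** 2)
        denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
        return np.arccosh(1.0 + 2.0 * diff / max(denom, eps))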


Since this doesn't have documentation yet I piped the code through Claude 3 Opus and asked it to write some.

https://gist.github.com/simonw/9ff9a0ab8ab64e8aa8d160c4294c0...

You don't have a license on the code yet (weirdly Claude hallucinated MIT).

Notes on how I generated this are in the comments on that Gist.


Under the hood it does a few things I'll shed some light on (at a high level!):

1. Running the model: it's built on the open-source (and amazing) llama.cpp project for running quantized (i.e. compressed) models like Llama 2 (launched yesterday) that will fit in memory even on a commodity Mac. Their "server" example was the starting point.

2. Downloading and storing models: models are distributed in a way that ensures their integrity and re-usability as much as possible (since they are large files!). For this we use an approach similar to Docker's registry (https://github.com/distribution/distribution)

3. Creating custom models: models can be extended with a new idea we're experimenting with: a Modelfile. It effectively adds "layers" to a model so you can distribute model data together and keep it self-contained. This builds on what I mentioned in 2 – our hope is this will make it easier to extend models like Llama 2 to your own use cases (e.g. a character).
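For a rough idea of what that looks like, a minimal Modelfile sketch (illustrative only; the exact directives may change as we experiment):

    FROM llama2
    PARAMETER temperature 0.8
    SYSTEM """
    You are Mario from Super Mario Bros. Answer every question in character.
    """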


If anything, the new hotness has been Lagers and Pilsners for a while.

As someone who likes hoppy + low ABV beers the death of the Session IPA has hit me hard.



To put it in plainer words, "spreading activation" describes activity spreading through a weighted graph, which could be a good old-fashioned semantic network (but with weights) or a neural net.

It's fundamental and foundational connectionist stuff, and for once I'm not using the term dismissively. It seems like spreading activation goes to the heart of connectionist ideas about storing knowledge in a graph, what used to be known as a Parallel Distributed Processing model, apparently.


If I'm interpreting correctly, this is a psychology paper taking examples from early algorithms in AI research and applying them to construct a psychological model of the human mind. Presumably to inform other psychological techniques?

Although that's structurally approximately correct, it understates the influence that SA had, and continues to have (IMHO, see below). Mathematically, Spreading Activation is essentially contextually driven probability integration, sort of an unruly version of Bayesian Networks -- iterative matrix multiplies of a weighted connectivity matrix given a set of initial "context" weights. My hypothesis is that what embeddings are doing for the GPTs can be understood as essentially spreading activation over distributed representations connected to each token. In classical (GOFAI) SA, per Collins and Loftus, you spread to/from the tokens, whereas in modern SA (by my hypothesis) you are spreading through a high-dimensional network represented by the embedding vectors, instead of via the tokens.
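To make the "iterative matrix multiplies" concrete, here's a toy numpy sketch of classical spreading activation (my own shorthand, not Collins and Loftus's exact formulation):

    import numpy as np

    def spread_activation(W, context, steps=5, decay=0.5):
        # W: weighted connectivity matrix (n x n); context: initial activation vector (n,)
        a = context.astype(float)
        for _ in range(steps):
            a = decay * a + W @ a               # propagate activation along weighted edges
            a = a / (np.linalg.norm(a) + 1e-9)  # renormalise so activation stays bounded
        return a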

I'm very impressed at the quality of Guanaco 33B, the model that accompanies this paper.

You can try it out here: https://huggingface.co/spaces/uwnlp/guanaco-playground-tgi

I tried "You are a sentient cheesecake that teaches people SQL, with cheesecake analogies to illustrate different points. Teach me to use count and group by" and got a good result from it: https://twitter.com/simonw/status/1661460336334241794/photo/...


The context as written by the author of this project: https://twitter.com/antirez/status/1632664864538738688

Psst ... why don't you spend 30 minutes of quality time with ChatGPT and get to the bottom of this? Get those personalised explanations and enjoy its unlimited patience.

I have felt the same in the past, about a completely different topic. I know how it feels: it's like people are not calling things what they are, just using weird words.

"weights" - synapses in the AI brain

"tokens" - word fragments

"model" - of course, the model is the AI brain

"context" - the model can only handle a piece of text, can't put whole books in, so this limited window is the context

"GPT" - predicts the next word, trained on everything; if you feed its last predicted word back in, it can write long texts

"LoRA" - a lightweight plug-in model for tweaking the big model

"loss" - a score telling how bad is the output

"training" - change the model until it fits the data

"quantisation" - making a low precision version of the model because it still works, but now is much faster and needs less compute

"embedding" - just a vector, it stands for the meaning of a word token or a piece of image; these embeddings are learned


I'm a high-tech electronics person but I still love steam engines (and that's partly historical as my father was a mechanical engineer who worked on building power station boilers and I'm old enough to have traveled on steam trains as a kid when they weren't a novelty).

That said, I'm concerned that the reawakening of interest in steam engines/trains especially in the UK may be the result of a hankering for the old days when Britain had an empire and its mechanical engineering output was the envy of the world. Of course, I hope I'm wrong and that interest actually runs deeper than just that.

What ought to be important about this reawakening is that we have the opportunity to awaken kids' interest in mechanics, engineering and science from a very young age, and in my opinion there's nothing much better than having them stand on a platform watching a hissing, noisy steam engine.

Steam engines are visually exciting and interesting, but they are also the combined embodiment of mechanics and thermodynamics, and understanding both of them is the key to understanding science.

Despite my electronics background, I consider a thorough understanding of thermodynamics crucial to understanding the high-tech world, irrespective of one's field. And I reckon there's likely no better way to start kids' education in thermodynamics than to have them engulfed in steam from a Puffing Billy.


HN tends to have a lot of insight into tech topics, and a high level of hubris when it comes to non-tech topics.

You want to buy a “monitor”, not a TV.

For instance there is this from LG [0] or this from Dell [1]. Just do a search for “large 4K monitor” and you’ll find more.

[0] - https://www.bestbuy.com/site/lg-43-ultrafine-4k-uhd-monitor-...

[1] - https://www.dell.com/en-us/work/shop/dell-55-4k-conference-r...

