
It seems to write in the generic "style" of GPT, instead of in the style I would recognise as a HN poster. Is that because of something baked into how the training process works? It lacks a sort of casualness or air of superiority ;)



There was no training process; this is just running GPT with relevant HN comments as part of the prompt.

If he wanted it to replicate that classic HN feel, he would either have to extend the prompt with additional examples or, better yet, use fine-tuning.
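For example, a few-shot style prompt along these lines (the example comments are made up, just to illustrate the idea):

    STYLE_PROMPT = """Reply in the voice of a Hacker News commenter.

    Comment: This could have been a shell script.
    Comment: Obligatory reminder that correlation is not causation.
    Comment: I built something similar in a weekend, minus the VC money.

    Question: {question}
    Comment:"""

    prompt = STYLE_PROMPT.format(question="Is Rust overhyped?")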

I guess he could also just randomly sprinkle in some terms like 'stochastic parrot' and find a way to shoehorn Tesla FSD into every conversation about AI.


> “AskHN” is a GPT-3 bot I trained on a corpus of over 6.5 million Hacker News comments to represent the collective wisdom of the HN community in a single bot.

First sentence of the first paragraph on OP's page

EDIT: it's a bit misleading; further down they describe what looks like a semantic-search approach


Scroll a bit further down and you will see

> 7. Put top matching content into a prompt and ask GPT-3 to summarize

> 8. Return summary along with direct links to comments back to Discord user
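i.e. retrieval plus summarization. Roughly something like this sketch (using the GPT-3-era openai Python client; the prompt wording and top-5 cutoff are my guesses, not the author's actual code):

    import numpy as np
    import openai

    def embed(text):
        resp = openai.Embedding.create(
            model="text-embedding-ada-002", input=text)
        return np.array(resp["data"][0]["embedding"])

    def ask_hn(question, corpus):
        # corpus: (embedding, comment_text) pairs, precomputed offline.
        # ada-002 embeddings are unit length, so a plain dot product
        # is already cosine similarity.
        q = embed(question)
        top = sorted(corpus, key=lambda c: -float(c[0] @ q))[:5]
        prompt = ("Summarize what these HN comments say about the question."
                  "\n\n" + "\n---\n".join(t for _, t in top)
                  + f"\n\nQuestion: {question}\nSummary:")
        resp = openai.Completion.create(
            model="text-davinci-003", prompt=prompt, max_tokens=300)
        return resp["choices"][0]["text"].strip()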


Ah, got it. Perhaps they should edit the intro then; it's misleading.


I agree, that language could be much improved. This is not a GPT-style LLM whose training corpus is HN comments, which I found to be an extremely interesting idea. Instead, it looks like it finds relevant HN threads and asks GPT-3 (the existing model) to summarize them.

To be clear, I think this is still very cool, just misleading.


Soon we will see language style transfer vectors, akin to the image style transfer at the peak of the ML craze 5-10 years ago, so you will be able to take an HN snark vector and apply it to regular text. You heard it here first ;)


Joking aside, that does seem like it would be very useful. Kind of reminds me of the analogies that were common in initial semantic vector research. The whole “king - man + woman = queen” thing. Presumably that sort of vector arithmetic is still valid on these new LLM embeddings? Although it still would only be finding the closest vector embedding in your dataset, it wouldn’t be generating text guided by the target embedding vector. I wonder if that would be possible somehow?
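The nearest-neighbour half is easy to check, at least. A toy sketch (the vectors here are made up; real ones would come from word2vec/GloVe or an embeddings API):

    import numpy as np

    # Made-up 3-d "word vectors"; real ones have hundreds of dimensions.
    vecs = {
        "king":  np.array([0.9, 0.8, 0.1]),
        "queen": np.array([0.9, 0.1, 0.8]),
        "man":   np.array([0.1, 0.9, 0.1]),
        "woman": np.array([0.1, 0.1, 0.9]),
        "apple": np.array([0.5, 0.5, 0.5]),
    }

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    target = vecs["king"] - vecs["man"] + vecs["woman"]
    candidates = [w for w in vecs if w not in ("king", "man", "woman")]
    print(max(candidates, key=lambda w: cosine(vecs[w], target)))
    # -> queen (with these toy numbers)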


Hmm. If you're willing to be stuck in time at 2016, there's https://zenodo.org/record/45901

Build a model off of that?


Last year (pre the ChatGPT bonanza) I was using GPT-3 to generate some content about attribution bias, and the responses got much spicier once the prompt started including typical HN poster lingo, like "10x developer":

https://sonnet.io/posts/emotive-conjugation/#:~:text=I%27m%2...

My conclusion was that you can use LLMs to automate and scale attribution bias.

We did it guys!


To truly capture the HN experience, the user should provide a parameter for the number of "well actually"s they want to receive. The initial response should demonstrate clear expertise and make a great, concise point in response to the question, and then the cascade of silly nitpicking can begin.


I think you'll find "I think you'll find" trumps "well actually".

;)


I wish the results were reversed, so I could "well actually" your comment, but 'site:news.ycombinator.com "well actually"' gives ca. 4k results in Google and 'site:news.ycombinator.com "I think you'll find"' gives close to 17k results, so you appear to be right.


Well, "it turns out that" beats both, with about 26k results ;)


site:news.ycombinator.com "in my experience" 120K results


IANAL: unfortunately only 10.6k results, thought I had a winner for a second.


I am mildly disappointed that none of the phrases pitched in this thread were themselves phrased using the pitched phrase.


> ii. Compute embeddings and similarity and choose top K comments closest to question

> iii. Put top matching comments into a prompt and ask GPT-3 to answer the question using the context

It depends on the prompt used to ask GPT the question. A prompt that instructs GPT to write like an HN poster should fix that.
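Something like this, say (the wording is purely illustrative, not the author's actual prompt):

    def build_prompt(question, top_comments):
        context = "\n---\n".join(top_comments)
        return (
            "You are a veteran Hacker News commenter. Answer tersely, "
            "with mild scepticism and the occasional 'well actually', "
            "using only the comments below as context.\n\n"
            f"Comments:\n{context}\n\n"
            f"Question: {question}\nAnswer:"
        )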


There also needs to be at least one question mark at the end of a statement?


Now that you've said it, it will train itself to do exactly that as it learns from your comments ;-)



