
It seems to write in the generic "style" of GPT, instead of in the style I would recognise as a HN poster. Is that because of something baked into how the training process works? It lacks a sort of casualness or air of superiority ;)



There was no training process; this is just running GPT with relevant HN comments as part of the prompt.

If he wanted it to replicate that classic HN feel, he would either have to extend the prompt with additional examples or, better yet, use fine-tuning.
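For example, a few-shot style prompt along these lines (the example comments are made up, just to illustrate the idea):

    STYLE_PROMPT = """Reply in the voice of a Hacker News commenter.

    Comment: This could have been a shell script.
    Comment: Obligatory reminder that correlation is not causation.
    Comment: I built something similar in a weekend, minus the VC money.

    Question: {question}
    Comment:"""

    prompt = STYLE_PROMPT.format(question="Is Rust overhyped?")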

I guess he could also just randomly sprinkle in some terms like 'stochastic parrot' and find a way to shoehorn Tesla FSD into every conversation about AI.


> “AskHN” is a GPT-3 bot I trained on a corpus of over 6.5 million Hacker News comments to represent the collective wisdom of the HN community in a single bot.

First sentence of the first paragraph on OP's page

EDIT: it's a bit misleading; further down they describe what looks like a semantic-search approach


Scroll a bit further down and you will see

> 7. Put top matching content into a prompt and ask GPT-3 to summarize

> 8. Return summary along with direct links to comments back to Discord user
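i.e. retrieval plus summarization. Roughly something like this sketch (using the GPT-3-era openai Python client; the prompt wording and top-5 cutoff are my guesses, not the author's actual code):

    import numpy as np
    import openai

    def embed(text):
        resp = openai.Embedding.create(
            model="text-embedding-ada-002", input=text)
        return np.array(resp["data"][0]["embedding"])

    def ask_hn(question, corpus):
        # corpus: (embedding, comment_text) pairs, precomputed offline.
        # ada-002 embeddings are unit length, so a plain dot product
        # is already cosine similarity.
        q = embed(question)
        top = sorted(corpus, key=lambda c: -float(c[0] @ q))[:5]
        prompt = ("Summarize what these HN comments say about the question."
                  "\n\n" + "\n---\n".join(t for _, t in top)
                  + f"\n\nQuestion: {question}\nSummary:")
        resp = openai.Completion.create(
            model="text-davinci-003", prompt=prompt, max_tokens=300)
        return resp["choices"][0]["text"].strip()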


Ah, got it. Perhaps they should edit the intro then; it's misleading.


I agree, that language could be much improved. This is not a GPT-style LLM whose training corpus is HN comments, which I found to be an extremely interesting idea. Instead, it looks like it finds relevant HN threads and asks GPT-3 (the existing model) to summarize them.

To be clear, I think this is still very cool, just misleading.


Soon we will see language style transfer vectors, akin to the image style transfer at the peak of the ML craze 5-10 years ago, so you will be able to take an HN snark vector and apply it to regular text. You heard it here first ;)


Joking aside, that does seem like it would be very useful. Kind of reminds me of the analogies that were common in initial semantic vector research. The whole “king - man + woman = queen” thing. Presumably that sort of vector arithmetic is still valid on these new LLM embeddings? Although it still would only be finding the closest vector embedding in your dataset, it wouldn’t be generating text guided by the target embedding vector. I wonder if that would be possible somehow?
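The nearest-neighbour half is easy to check, at least. A toy sketch (the vectors here are made up; real ones would come from word2vec/GloVe or an embeddings API):

    import numpy as np

    # Made-up 3-d "word vectors"; real ones have hundreds of dimensions.
    vecs = {
        "king":  np.array([0.9, 0.8, 0.1]),
        "queen": np.array([0.9, 0.1, 0.8]),
        "man":   np.array([0.1, 0.9, 0.1]),
        "woman": np.array([0.1, 0.1, 0.9]),
        "apple": np.array([0.5, 0.5, 0.5]),
    }

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    target = vecs["king"] - vecs["man"] + vecs["woman"]
    candidates = [w for w in vecs if w not in ("king", "man", "woman")]
    print(max(candidates, key=lambda w: cosine(vecs[w], target)))
    # -> queen (with these toy numbers)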


Hmm. If you're willing to be stuck in time at 2016, there's https://zenodo.org/record/45901

Build a model off of that?


Last year (pre the ChatGPT bonanza) I was using GPT-3 to generate some content about attribution bias, and the responses got much spicier once the prompt started including typical HN poster lingo, like "10x developer":

https://sonnet.io/posts/emotive-conjugation/#:~:text=I%27m%2...

My conclusion was that you can use LLMs to automate and scale attribution bias.

We did it guys!


To truly capture the HN experience, the user should provide a parameter for the number of "well actually"s they want to receive. The initial response should demonstrate clear expertise and make a great, concise point in response to the question, and then the cascade of silly nitpicking can begin.


I think you'll find "I think you'll find" trumps "well actually".

;)


I wish the results were reversed, so I could "well actually" your comment, but 'site:news.ycombinator.com "well actually"' gives ca. 4k results in Google and 'site:news.ycombinator.com "I think you'll find"' gives close to 17k results, so you appear to be right.


Well, "it turns out that" beats both, with about 26k results ;)


site:news.ycombinator.com "in my experience" 120K results


IANAL: unfortunately only 10.6k results, thought I had a winner for a second.


I am mildly disappointed that none of the phrases pitched in this thread were themselves phrased using the pitched phrase.


> ii. Compute embeddings and similarity and choose top K comments closest to question

> iii. Put top matching comments into a prompt and ask GPT-3 to answer the question using the context

It depends on the prompt used to ask GPT the question. A prompt that instructs GPT to write like an HN poster should fix that.
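Something like this, say (the wording is purely illustrative, not the author's actual prompt):

    def build_prompt(question, top_comments):
        context = "\n---\n".join(top_comments)
        return (
            "You are a veteran Hacker News commenter. Answer tersely, "
            "with mild scepticism and the occasional 'well actually', "
            "using only the comments below as context.\n\n"
            f"Comments:\n{context}\n\n"
            f"Question: {question}\nAnswer:"
        )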


There also needs to be at least one question mark at the end of a statement?


Now that you've said it, it will train itself to do exactly that as it learns from your comments ;-)



