I am one of the people who worked on Google's PaLM model.
Having skimmed the GitHub readme and the Medium article, I find this announcement very focused on the parameter count and on the engineering challenges of scaling the model, but it does not contain any details about the architecture, the training setup (learning rate schedules, etc.), or the data composition.
It is great that more models are getting released publicly, but I would not get excited before some evaluations have been published. Having a lot of parameters should not be a goal in and of itself. For all we know, this model is not well trained and worse than EleutherAI's 20B-parameter model (GPT-NeoX-20B), while also being inconveniently large.
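The bar here is low: even a sliding-window perplexity number on held-out text would be something. A minimal sketch with Hugging Face transformers, along the lines of their documented perplexity recipe; "gpt2" and "heldout.txt" are stand-ins (how the released checkpoint actually loads is exactly what the readme doesn't say), and the window/stride values are arbitrary:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder; swap in the released checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

enc = tokenizer(open("heldout.txt").read(), return_tensors="pt")
seq_len = enc.input_ids.size(1)
max_len, stride = 1024, 512  # context window and hop size

nlls, prev_end = [], 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_len, seq_len)
    input_ids = enc.input_ids[:, begin:end]
    target_ids = input_ids.clone()
    # Score only tokens not already scored by the previous window.
    target_ids[:, : -(end - prev_end)] = -100
    with torch.no_grad():
        nlls.append(model(input_ids, labels=target_ids).loss)
    prev_end = end
    if end == seq_len:
        break

print(f"perplexity: {torch.exp(torch.stack(nlls).mean()).item():.2f}")
```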
1. The OP did not criticize the headline; they criticized the content. If you read the article you linked, you will find that they do, in fact, evaluate the model's performance.
2. 540 billion parameters is a notable size in its own right, which is likely why they led with that figure in the headline.
The difference is that PaLM was extensively benchmarked, and it performed as well as it should have, which is to say, amazingly well. The irony here is that you should instead be invoking the other ~500B model, Nvidia and Microsoft's Megatron-Turing NLG 530B, which was undertrained, only cursorily evaluated (no interest in any new capabilities, or even in examining known ones like inner monologues), and promptly forgotten by everyone once the headlines about it being the largest dense model faded: https://arxiv.org/abs/2201.11990#microsoftnvidia
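For concreteness, the kind of capability check that never got run on Megatron-Turing NLG: zero-shot "inner monologue" prompting, i.e. appending "Let's think step by step" and seeing whether the model produces usable intermediate reasoning. A rough sketch with Hugging Face transformers; the model id is a stand-in, and whether any given checkpoint shows the behavior is exactly the open question:

```python
from transformers import pipeline

# Stand-in checkpoint; swap in the model you want to probe.
generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Let's think step by step."
)
out = generator(prompt, max_new_tokens=80, do_sample=False)
print(out[0]["generated_text"])
# A capable model lays out intermediate steps (5 + 2*3 = 11) before the
# answer; an undertrained one tends to ramble or repeat the prompt.
```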
It's in there; look for this sentence: "Training details and best practices on acceleration and stabilizations can be found on Medium (English)." And they did some top-tier stuff to pull that off.
Given that Yandex is a crucial part of Russia's propaganda arm, we should consider the whole range of possibilities, from good to bad:
* Good. These are great researchers helping the community by sharing great work. (This is what I'd like to assume until I have proof to the contrary.)
* Bad. This very expensive training run was approved by Ya leadership (which is under Western personal sanctions) because they have secretly built Russia's propaganda talking points into the model, such as "the war in Ukraine is not a war but a special operation", etc. (One crude way to probe for this is sketched below.)
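A hypothetical sketch of such a probe: sample completions for neutral prefixes on sensitive topics and inspect them for the talking points in question. The model id and the prompts are purely illustrative:

```python
from transformers import pipeline

# Stand-in checkpoint; swap in the released model to actually probe it.
generator = pipeline("text-generation", model="gpt2")

# Illustrative neutral prefixes on sensitive topics.
probes = [
    "The events in Ukraine in 2022 are best described as",
    "Western sanctions against Russia are",
]
for prefix in probes:
    samples = generator(prefix, max_new_tokens=40, do_sample=True,
                        temperature=0.8, num_return_sequences=3)
    for s in samples:
        print(prefix, "->", s["generated_text"][len(prefix):].strip())
```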
No, read my message again. As I said, we should assume good intentions until proven otherwise.
But we should have better tools to test for biases/toxicity. Perspective API is a great tool for toxicity detection, but I'm not aware of any "propaganda" detection tool.
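For reference, scoring model outputs with Perspective API looks roughly like this (a sketch: you need your own API key from Google Cloud, and the available attributes depend on your project's access):

```python
import requests

API_KEY = "YOUR_API_KEY"  # hypothetical; create one in Google Cloud
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

body = {
    "comment": {"text": "some model output to score"},
    "requestedAttributes": {"TOXICITY": {}},
}
resp = requests.post(URL, json=body).json()
score = resp["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
print(f"toxicity: {score:.3f}")  # 0.0 (benign) .. 1.0 (toxic)
# Note: there is no "PROPAGANDA" attribute; that's exactly the gap.
```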