More

x_may · 2025-07-06T16:50:43 1751820643

I think it’s also largely driven by the apparently cheapness of turning the CapEX of server buying to the OpEX of cloud renting. Less up front investment and auditing/access controls for SoC2 compliant are so much easier m.

x_may · 2025-05-27T12:41:24 1748349684

Unfortunate name collision on that one

neutronicus · 2025-05-27T14:38:35 1748356715

I do donate to NVDA indirectly via the S&P500

x_may · 2025-05-22T19:37:33 1747942653

Obviously its not at the scale of the top auto-regressive models yet but there are some OSS models https://github.com/dllm-reasoning/d1

x_may · 2025-04-18T11:41:51 1744976511

It may be that it was time for the hardware that was previously running Arxiv to be retired and this is just another Capex -> Opex decision being made by so many tech companies.

I'd like to know if GCP is covering part of the bill? Or will Cornell be paying all of it? The new architecture smells of "[GCP] will pay/credit all of these new services if you agree to let one of our architects work with you". If GCP is helping, stay tuned for a blog post from google some time around the completion of the migration with a title like "Reaffirming our commitment to science" or something similarly self affirming.

khuey · 2025-04-18T16:02:26 1744992146

> If GCP is helping, stay tuned for a blog post from google some time around the completion of the migration with a title like "Reaffirming our commitment to science" or something similarly self affirming.

"Google pays to run an enormous intellectual resource in exchange for a self-congratulatory blogpost" seems like a perfectly acceptable outcome for society here.

stonogo · 2025-04-18T21:14:07 1745010847

It wasn't when it happened to Usenet.

toomuchtodo · 2025-04-19T01:11:37 1745025097

Frequent backups to the Internet Archive for rehydration when needed. RIP Dejanews. Hopefully we’ve learned from past experience.

mistrial9 · 2025-04-19T02:35:45 1745030145

mirrors, please

yumraj · 2025-04-18T21:40:08 1745012408

> If GCP is helping, stay tuned for a blog post from google some time around the completion of the migration with a title like "Reaffirming our commitment to science" or something similarly self affirming.

This is an odd criticism. If a company is footing the bill, it can’t even talk about it to gain some publicity/good will?

nophunphil · 2025-04-19T03:29:43 1745033383

Footing the bill for how long?

helsinki · 2025-04-19T15:21:09 1745076069

How much is the bill for running Arxiv? $1000 - $3000/month? Yeah, I don't think Google deserves any recognition for footing that bill. Likely just another self-congratulatory bullshit move on behalf of big G.

flakiness · 2025-04-18T15:58:57 1744991937

https://info.arxiv.org/about/supporters.html

  Our Supporters
  ...
  Gold Sponsors
  Google, Inc (USA)

TZubiri · 2025-04-18T22:38:42 1745015922

"Reaffirming our commitment to science" or something similarly self affirming.

While I understand that something is more genuine if done in secret, it doesn't stop being a real commitment to science just because you make a pr post about it.

If company X contributes to Y open source foundation, that's real and they get to claim clout, nobody cares about a post anyways.

x_may · 2025-01-01T13:57:32 1735739852

I believe they are using scalable TTC. The o3 announcement released accuracy numbers for high and low compute usage, which I feel would be hard to do in the same model without TTC.

I also believe that the 200$ subscription they offer is just them allowing the TTC to go for longer before forcing it to answer.

If what you say is true, though, I agree that there is a huge headroom for TTC to improve results if the huggingface experiments on 1/3B models are anything to go off.

ankit219 · 2025-01-01T16:02:08 1735747328

The other comment posted YT videos where Open AI researchers are talking about TTC. So, I am wrong. That $200 subscription is just because the number of tokens generated are huge when CoT is involved. Usually inference output is capped at 2000-4000 tokens (max of ~8192) or so, but they cannot do it with o1 and all the thinking tokens involved. This is true with all the approaches - next token prediction, TTC with beam/lookahead search, or MCTS + TTC. If you specify the output token range as high and induce a model to think before it answers, you will get better results on smaller/local models too.

> huge headroom for TTC to improve results ...1B/3B models

Absolutely. How this is productized remains to be seen. I have high hopes with MCTS and Iterative Preference Learning, but it is harder to implement. Not sure if Open AI has done that. Though Deepmind's results are unbelievably good [1].

[1]:https://arxiv.org/pdf/2405.00451v2

whimsicalism · 2025-01-01T17:19:07 1735751947

ttc is an incredibly broad term and it is broadening as the hype spreads. people are now calling CoT “TTC” because they are spending compute on reasoning tokens before answering

HarHarVeryFunny · 2025-01-01T17:50:05 1735753805

Yes, and HuggingFace have published this outlining some of the potential ways to use TTC, including but not limited to tree search, showing TTC performance gains from LLama.

https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling...

x_may · 2024-12-31T18:48:22 1735670902

The LMSYS leaderboards are crowdsourced and would be hard to fake, it showing a pretty strong performance in terms of human preference.

paxys · 2024-12-31T21:04:50 1735679090

Crowdsourced data is the easiest to fake unless you can somehow ensure that you have a completely unbiased population (which is impossible). There's a reason why certain models do so well on upvote-based leaderboards but rank nowhere on objective tests.

CGamesPlay · 2025-01-01T00:15:06 1735690506

Which ones? I think fine-tunes are where I see most of this (I'll just call it) "model spam", but the base models don't seem to exhibit this behavior. I do see some models perform way below the curve compared to their benchmark performance, though (Phi family being the most famous).

x_may · 2024-12-31T15:46:42 1735660002

Captcha solvers as a service are already well developed. The end result is going full circle to in person applications only.

umvi · 2024-12-31T15:55:12 1735660512

Yeah, that's what I meant by "captcha like" - mechanisms that prevent automated applications such as in person only; doesn't have to literally be captcha. Anything that fulfills the same purpose will do.

x_may · 2024-12-31T15:46:00 1735659960

Tragedy of the commons at work once again

WD-42 · 2024-12-31T15:48:03 1735660083

More like play stupid games, win stupid prizes.

x_may · 2024-12-17T14:35:54 1734446154

There’s black sand! Volcanic sand from Iceland is perfectly black and would be a great way to distinguish them

michaelmior · 2024-12-17T22:20:32 1734474032

You can also by any color of "sand" you like at your local craft/hobby shop.

x_may · 2024-10-19T11:55:28 1729338928

I think right now they lose more money with each user. But maybe their value lies in training data