> DeepSeek is a tiny Chinese company that reportedly has under 200 employees. The story goes that they started out as a quant trading hedge fund similar to TwoSigma or RenTec, but after Xi Jinping cracked down on that space, they used their math and engineering chops to pivot into AI research.
I guess now we have the answer to the question that countless people have already asked: Where could we be if we figured out how to get most math and physics PhDs to work on things other than picking up pennies in front of steamrollers (a.k.a. HFT) again?
DeepSeek is a subsidiary of a relatively successful Chinese quant trading firm. It was the boss' weird passion project, after he made a few billion yuan from his other passion, trading. The whole thing was funded by quant trading profits, which kind of undermines your argument. Maybe we should just let extremely smart people work on the things that catch their interest?
The interests of extremely smart people are often strongly correlated with potential profits, and those are in turn correlated with policy, which, in the case of financial regulation, shapes market structures.
Another way of saying this: It's a well-known fact that complicated puzzles with a potentially huge reward attached to them attract the brightest people, so I'm arguing that we should be very conscious of the types of puzzles we implicitly come up with, and consider this an externality to be accounted for.
HFT is, to a large extent, a product of policy, in particular Reg NMS, based on the idea that we need to have many competing exchanges to make our markets more efficient. This has worked well in breaking down some inefficiencies, but has created a whole set of new ones, which are the basis of HFT being possible in the first place.
There are various ideas on whether different ways of investing might be more efficient, but these largely focus on benefits to investors (i.e. less money being "drained away" by HFT). What I'm arguing is that the "draining" might not even be the biggest problem, but rather that the people doing it could contribute to equally exciting, non-zero-sum games instead.
We definitely want to keep around the part of HFT that contributes to more efficient resource allocation (an inherently hard problem), but wouldn't it be great if we could avoid the part that only works around the kinks of a particular market structure emergent from a particular piece of regulation?
This is completely fake though. It was more like their founder decided to start a branch to do AI research. It was well planned: they bought significantly more GPUs than they could use for quant research, even before they started doing anything with AI.
There was a crackdown on algorithmic trading, but it didn't have much impact, and IMO someone higher up definitely does not want to kill these trading firms.
The optimal amount of algorithmic trading is definitely more than none (I appreciate liquidity and price quality as much as the next guy), but arguably there's a case here that we've overshot a bit.
The price data I (we?) get is 15 minute delayed. I would guess most of the profiteering is from consumers not knowing the last transaction prices? I.e. an artificially created edge by the broker who then sells the API to clean their hands of the scam.
Real-time price data is indeed not free, but widely available even in retail brokerages. I've never seen a 15 minute delay in any US based trade, and I think I can even access level 2 data a limited number of times on most exchanges (not that it does me much good as a retail investor).
> I would guess most of the profiteering is from consumers not knowing the last transaction prices?
No, not at all. And I wouldn't even necessarily call it profiteering. Ironically, as a retail investor you even benefit from hedge funds and HFTs being a counterparty to your trades: you get on average better (and worst case as good) execution from PFOF.
Institutional investors (which include pension funds, insurers, etc.) are a different story.
Interestingly, a lot of the math and physics people in the ML community are considered "grumpy researchers," a joke made apparent by this starter pack[0].
From my personal experience (undergrad physics, worked as engineer, came to CS & ML because I liked the math), there's a lot of pushback.
- I've been told that the math doesn't matter/you don't need math.
- I've heard very prominent researchers say "fuck theorists"
- I've seen papers routinely rejected for improving training techniques, with reviewers saying "just tune a large model"
- I've seen papers that show improvements when conditioning comparisons on compute constraints get rejected because of "not enough datasets" or "but does it scale" (those questions can always be asked but require exponentially more work to answer)
- I've been told I'm gatekeeping for saying "you don't need math to make good models, but you need it to know why your models are wrong" (yes, this is a reference)
- when pointing out math or statistical errors I'm told it doesn't matter
- and much more.
I've heard this from my advisor, dissertation committee, bosses[1], peers, and others (of course, HN). If my experience is anything but rare, I think it explains the grumpy group[2]. But I'm also not too surprised, given how common it is in CS for people to claim that everything is easy or that leet code is proof of competence (as opposed to evidence).
I think unfortunately the problem is a bit bigger, but it isn't unsolvable. Really, it is "easily" solvable since it just requires us to make different decisions. Meaning _each and every one of us_ has a direct impact on making this change. Maybe I'm grumpy because I want to see this better world. Maybe I'm grumpy because I know it is possible. Maybe I'm grumpy because it is my job to see problems and try to fix them lol
Arguably, the emergence of quant hedge funds and private AI research companies is at least as much a symptom of the dysfunctions of academia (and of how society compensates academics, in money and beyond) as it is of the ability of Wall Street and Silicon Valley to treat former scientists better than academia does.
Yes and no. Industry AI research is currently tightly coupled with academic research. Most of the big papers you see are either directly from the big labs or done in partnership with them. Not even universities like Stanford have sufficient compute to train GPT from scratch (maybe enough for DeepSeek). Here's Fei-Fei Li discussing the issue. Stanford has something like 300 GPUs[1]? And those have to be split across labs.
The thing is that there's always a pipeline. Academia does most of the low-level research, say TRL[2] 1-4, partnerships happen between 4-6, and industry takes over the rest (with some wiggle room on these numbers). Much of ML academic research right now is tuning large models made by big labs. This isn't low TRL. Additionally, a lot of research is rejected for not out-performing technologies that are already at TRL 5-7. See Mamba for a recent example. You could also point to KANs, which are probably around TRL 3.
> Arguably, the emergence of quant hedge funds and private AI research companies is at least as much a symptom of the dysfunctions of academia
Which is where I, again, both agree and disagree. It is not _just_ a symptom of the dysfunction of academia, but _also_ of industry. The reason I pointed out the grumpy researchers is that a lot of these people have been discussing the techniques DeepSeek used, long before they were used. DeepSeek looks like what happens when you set these people free, which is my argument: we should do that. Scale Maximalists (also called "Bitter Lesson Maximalists," though I dislike the term) have been dominating ML research, and DeepSeek shows that scale isn't enough. Hopefully this will give the mathy people more weight. But then again, isn't the common way monopolies fall that they become too arrogant and incestuous?
So mostly, I agree; I'm just pointing out that there is a bit more subtlety, and I think we need to recognize that to make progress. There are a lot of physicists and mathy people who like ML and have been doing research in the area but are often pushed out because of the thinking I listed. Part of the success of the quant industry is recognizing that the strong math and modeling skills of physicists generalize pretty well: you go after people who understand that an equation describing a spring isn't only useful for springs, but for anything that oscillates. Understanding math at that level is very powerful, and boy are there a lot of people who want the opportunity to demonstrate it in ML; they just never get similar GPU access.
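To make the spring point concrete, here's a minimal sketch (the function names, parameters, and integrator are my own illustration, not anything from the thread): the same second-order ODE, x'' + 2ζωx' + ω²x = 0, describes a damped mass-on-a-spring, the charge in an RLC circuit, or any other damped oscillation; only the interpretation of ω and ζ changes.

```python
def simulate(omega, zeta, x0=1.0, v0=0.0, dt=1e-4, t_end=10.0):
    """Integrate x'' + 2*zeta*omega*x' + omega**2 * x = 0
    with semi-implicit Euler, returning the final displacement.
    The symbols are generic: x could be position, charge, or
    any other oscillating quantity."""
    x, v = x0, v0
    for _ in range(int(t_end / dt)):
        v += (-2 * zeta * omega * v - omega**2 * x) * dt
        x += v * dt
    return x

# Same code, two physical readings of (omega, zeta):
spring = simulate(omega=2.0, zeta=0.1)  # mass on a damped spring
rlc = simulate(omega=2.0, zeta=0.1)     # charge in a lossy RLC circuit
```

The two calls are literally identical; that's the point. The generalization is in the modeler's head, which is exactly the skill quant firms hire physicists for.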