> DeepSeek is a tiny Chinese company that reportedly has under 200 employees. The story goes that they started out as a quant trading hedge fund similar to TwoSigma or RenTec, but after Xi Jinping cracked down on that space, they used their math and engineering chops to pivot into AI research.
I guess now we have the answer to the question that countless people have already asked: Where could we be if we figured out how to get most math and physics PhDs to work on things other than picking up pennies in front of steamrollers (a.k.a. HFT) again?
DeepSeek is a subsidiary of a relatively successful Chinese quant trading firm. It was the boss' weird passion project, after he made a few billion yuan from his other passion, trading. The whole thing was funded by quant trading profits, which kind of undermines your argument. Maybe we should just let extremely smart people work on the things that catch their interest?
The interests of extremely smart people are often strongly correlated with potential profits, and those are in turn correlated with policy, which, in the case of financial regulation, shapes market structures.
Another way of saying this: It's a well-known fact that complicated puzzles with a potentially huge reward attached to them attract the brightest people, so I'm arguing that we should be very conscious of the types of puzzles we implicitly come up with, and consider this an externality to be accounted for.
HFT is, to a large extent, a product of policy, in particular Reg NMS, based on the idea that we need to have many competing exchanges to make our markets more efficient. This has worked well in breaking down some inefficiencies, but has created a whole set of new ones, which are the basis of HFT being possible in the first place.
There are various ideas on whether different ways of investing might be more efficient, but these largely focus on benefits to investors (i.e. less money being "drained away" by HFT). What I'm arguing is that the "draining" might not even be the biggest problem, but rather that the people doing it could contribute to equally exciting, non-zero-sum games instead.
We definitely want to keep around the part of HFT that contributes to more efficient resource allocation (an inherently hard problem), but wouldn't it be great if we could avoid the part that only works around the kinks of a particular market structure emergent from a particular piece of regulation?
This is completely fake though. It was more like their founder decided to start a branch to do AI research. It was well planned: they bought significantly more GPUs than they could use for quant research, even before they started doing anything with AI.
There was a crackdown on algorithmic trading, but it didn't have much impact, and IMO someone higher up definitely does not want to kill these trading firms.
The optimal amount of algorithmic trading is definitely more than none (I appreciate liquidity and price quality as much as the next guy), but arguably there's a case here that we've overshot a bit.
The price data I (we?) get is 15 minute delayed. I would guess most of the profiteering is from consumers not knowing the last transaction prices? I.e. an artificially created edge by the broker who then sells the API to clean their hands of the scam.
Real-time price data is indeed not free, but widely available even in retail brokerages. I've never seen a 15 minute delay in any US based trade, and I think I can even access level 2 data a limited number of times on most exchanges (not that it does me much good as a retail investor).
> I would guess most of the profiteering is from consumers not knowing the last transaction prices?
No, not at all. And I wouldn't even necessarily call it profiteering. Ironically, as a retail investor you even benefit from hedge funds and HFTs being a counterparty to your trades: you get on average better (and worst case as good) execution from PFOF.
Institutional investors (which include pension funds, insurers, etc.) are a different story.
Interestingly, a lot of the math and physics people in the ML community are considered "grumpy researchers," a joke made apparent by this starter pack[0].
From my personal experience (undergrad physics, worked as engineer, came to CS & ML because I liked the math), there's a lot of pushback.
- I've been told that the math doesn't matter/you don't need math.
- I've heard very prominent researchers say "fuck theorists"
- I've seen papers routinely rejected for improving training techniques, with reviewers saying "just tune a large model"
- I've seen papers that show improvements when conditioning comparisons on compute constraints get rejected because of "not enough datasets" or "but does it scale" (those questions can always be asked but require exponentially more work to answer)
- I've been told I'm gatekeeping for saying "you don't need math to make good models, but you need it to know why your models are wrong" (yes, this is a reference)
- when pointing out math or statistical errors I'm told it doesn't matter
- and much more.
I've heard this from my advisor, dissertation committee, bosses[1], peers, and others (of course, HN). If my experience is anything but rare, I think it explains the grumpy group[2]. But I'm also not too surprised, given how common it is in CS for people to claim that everything is easy or that leet code is proof of competence (as opposed to evidence).
I think unfortunately the problem is a bit bigger, but it isn't unsolvable. Really, it is "easily" solvable since it just requires us to make different decisions. Meaning _each and every one of us_ has a direct impact on making this change. Maybe I'm grumpy because I want to see this better world. Maybe I'm grumpy because I know it is possible. Maybe I'm grumpy because it is my job to see problems and try to fix them lol
Arguably, the emergence of quant hedge funds and private AI research companies is at least as much a symptom of the dysfunctions of academia (and of how society compensates academics, in money and beyond) as it is of the ability of Wall Street and Silicon Valley to treat former scientists better than academia does.
Yes and no. Industry AI research is currently tightly coupled with academic research. Most of the big papers you see are either directly from the big labs or done in partnership with them. Not even universities like Stanford have sufficient compute to train GPT from scratch (maybe enough for DeepSeek). Here's Fei-Fei Li discussing the issue. Stanford has something like 300 GPUs[1]? And those have to be split across labs.
The thing is that there's always a pipeline. Academia does most of the low-level research, say TRL[2] 1-4, partnerships happen between 4-6, and industry takes over the rest (with some wiggle room on these numbers). Much of ML academic research right now is tuning large models made by big labs. This isn't low TRL. Additionally, a lot of research is rejected for not out-performing technologies that are already at TRL 5-7. See Mamba for a recent example. You could also point to KANs, which are probably around TRL 3.
> Arguably, the emergence of quant hedge funds and private AI research companies is at least as much a symptom of the dysfunctions of academia
Which is where I, again, both agree and disagree. It is not _just_ a symptom of the dysfunction of academia, but _also_ of industry. The reason I pointed out the grumpy researchers is that a lot of these people have been discussing the techniques DeepSeek used, long before they were used. DeepSeek looks like what happens when you set these people free, which is my argument: we should do that. Scale Maximalists (also called "Bitter Lesson Maximalists," though I dislike the term) have been dominating ML research, and DeepSeek shows that scale isn't enough. Hopefully this will give the mathy people more weight. But then again, isn't the common way monopolies fall that they become too arrogant and incestuous?
So mostly, I agree; I'm just pointing out that there is a bit more subtlety, and I think we need to recognize that to make progress. There are a lot of physicists and mathy people who like ML and have been doing research in the area but are often pushed out because of the thinking I listed. Part of the success of the quant industry is recognizing that the strong math and modeling skills of physicists generalize pretty well: you go after people who understand that an equation describing a spring isn't only useful for springs, but for anything that oscillates. Understanding math at that level is very powerful, and boy are there a lot of people who want the opportunity to demonstrate it in ML; they just never get similar GPU access.
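To make the spring point concrete, here's a minimal sketch (the function names, parameters, and integrator are my own illustration, not anything from the thread): the same second-order ODE, x'' + 2ζωx' + ω²x = 0, describes a damped mass-on-a-spring, the charge in an RLC circuit, or any other damped oscillation; only the interpretation of ω and ζ changes.

```python
def simulate(omega, zeta, x0=1.0, v0=0.0, dt=1e-4, t_end=10.0):
    """Integrate x'' + 2*zeta*omega*x' + omega**2 * x = 0
    with semi-implicit Euler, returning the final displacement.
    The symbols are generic: x could be position, charge, or
    any other oscillating quantity."""
    x, v = x0, v0
    for _ in range(int(t_end / dt)):
        v += (-2 * zeta * omega * v - omega**2 * x) * dt
        x += v * dt
    return x

# Same code, two physical readings of (omega, zeta):
spring = simulate(omega=2.0, zeta=0.1)  # mass on a damped spring
rlc = simulate(omega=2.0, zeta=0.1)     # charge in a lossy RLC circuit
```

The two calls are literally identical; that's the point. The generalization is in the modeler's head, which is exactly the skill quant firms hire physicists for.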