Figure 3 on p.40 of the paper seems to show that their LLM-based model does not statistically significantly outperform a three-layer neural network using 59 variables from 1989.
This figure compares the prediction performance of GPT and quantitative models based on machine learning. Stepwise Logistic follows Ou and Penman (1989)’s structure with their 59 financial predictors. ANN is a three-layer artificial neural network model using the same set of variables as in Ou and Penman (1989). GPT (with CoT) provides the model with financial statement information and detailed chain-of-thought prompts. We report average accuracy (the percentage of correct predictions out of total predictions) for each method (left) and F1 score (right). We obtain bootstrapped standard errors by randomly sampling 1,000 observations 1,000 times and include 95% confidence intervals.
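For anyone who wants to sanity-check that comparison, the bootstrap the caption describes is easy to reproduce. Here is a minimal sketch under my own assumptions (function name, toy labels, and details are mine, not the paper's code):

    import numpy as np

    def bootstrap_accuracy_ci(y_true, y_pred, n_draws=1000, sample_size=1000, seed=0):
        # Repeatedly resample `sample_size` observations with replacement and
        # recompute accuracy, roughly as the figure caption describes.
        rng = np.random.default_rng(seed)
        correct = (np.asarray(y_true) == np.asarray(y_pred)).astype(float)
        accs = np.array([correct[rng.integers(0, len(correct), size=sample_size)].mean()
                         for _ in range(n_draws)])
        low, high = np.percentile(accs, [2.5, 97.5])   # 95% confidence interval
        return accs.mean(), accs.std(ddof=1), (low, high)

    # Toy example with synthetic labels (~60% accurate predictor):
    rng = np.random.default_rng(1)
    y_true = rng.integers(0, 2, size=5000)
    y_pred = np.where(rng.random(5000) < 0.6, y_true, 1 - y_true)
    print(bootstrap_accuracy_ci(y_true, y_pred))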
I never traded consistently and successfully, but I did do a startup with a seasoned quant trader with the ambition of using bigger models to generate novel alpha. We mopped the floor with the academics who publish, but that is wiffle ball compared to a real prop outfit that lasts.
Not having made it big myself I obviously don’t know the meta these days, but last I had any inside baseball, the non-stationarity and friction just kill you on trying to get fancy as opposed to just nailing it on the fundamentals.
Extreme execution quality is a game: people make money in both traditional liquidity provision and agency execution by being fast as hell and managing risk well.
Composing individually somewhat mundane signals well via straightforward linear-ish regressions is a game: people get (ever decaying) alpha out of bright ideas (and rotate new signals in).
And I’m sure that LLMs have started playing a role, there’s a legitimate capability increase in spite of the dubious production-worthiness.
But as a blind wager, I bet prop trading is about what it was 5 years ago on better gear: elite execution (no pun intended) on known-good ways to generate alpha.
1. Your automated system should be as fast as possible.
2. Stick with known, basic fundamental strategies.
3. Try new ideas around how to give those same strategies more predictive power (signal).
#1 is straight technical execution.
#3 is constantly evolving.
Is how I understood this.
And as sort of an afterthought, I guess the better you are at #1 the less good you need to be at #3, and the worse you are at #1 the better you need to be at #3?
Speaking for myself and likely others with similar motivations, yes we can "figure it out" and publish something to show our work and expand the field of endeavor with our findings - OR - we can figure something profitable out on our own and use our own funds to trade our strategies with our own accounts.
Anyone who has figured out something relatively profitable isn't telling anyone how they did it.
> Anyone who has figured out something relatively profitable isn't telling anyone how they did it.
Corollary: someone who is selling you tools or strategies on how to make tons and tons of money, is probably not making tons and tons of money employing said tools and strategies, but instead making their money by having you buy their advice.
I think I could probably make more money selling a tool or strategy that consistently, reliably makes ~2% more than government bonds than I could make off it myself, with my current capital.
You can't do it because there are lots of fraudulent operators in the space. Think about it: someone comes up to you offering a way to give you risk-free return. All your Ponzi flags go up. It's a market for lemons. If you had this, the only way to make it is to raise money some other way and then redirect it (illegal, but you'll most likely get away with it), or to slowly work your way up the ranks proving yourself till you get to a PM and then have him work your strat for you.
The fact that you can't reveal how means you can't prove you're not Ponzi. If you reveal how, they don't need you.
It's been done again and again by funds. A new fund or fund company convinces someone with some name and reputation to come work for them, and they become the name and reputation that gives some credibility and sells the new fund. Clients start by putting in just a little money and more later. Nobody knows the hard details outside of the new firm. Sometimes it goes nowhere and nobody hears of the thing again; sometimes it works out and they leave nobody unaware.
> The fact that you can't reveal how means you can't prove you're not Ponzi. If you reveal how, they don't need you.
This is why I am wary of all those 10+ minute YT vids telling you how you can make significant amounts of money quickly and reliably with very limited capital.
Seems like the money here would be building a shiny, public-facing version of the tool behind a robust paywall and building a relationship with a few broker-dealer firms who can make this product available to the financial advisors in their network.
If you were running this yourself with $1M input capital, that'd be $20k/year per 1M of input - so $20K is a nice number to try and beat selling a product that promulgates a strategy.
But you're going to run into the question from people using the product: "Yeah - but HOW DOES IT WORK??!!!" and once you tell them does your ability to get paid disappear? Do they simply re-package your strategy as their own and cease to pay you (and worse start charging for your work)? Is your strategy so complicated that the value of the tool itself doing the heavy lifting makes it sticky?
Getting people to put their money into some Black Box kind of strategy would probably be challenging - but I've never tried it - it may be easier than giving away free beer for all I know. Sounds like a fun MVP effort really. Give it a try - who knows what might happen.
As far as I know, the more people use the strategy the worse it performs; the market is not static, it adapts. Other people react to the buys/sells of your strategy and try to exploit the new pattern.
The average return from index funds is the benchmark that all those others are trying to beat but all the competitors trying to beat the average have a tendency to push successful strategies towards the average.
That hypothetical person or organization already has an advisor in charge of their money at the smaller end, or an entire private RIA on the Family Office side of things. This approach is a fool's errand.
Well, see, I don't actually have a method for that. But if I did, I think my capital is low enough that I'd have more success selling it to other people than trying to exploit it myself, since the benefit would be pretty minimal if I did it with just my own savings, but could be pretty dramatic for, say, banks.
Strats tend to have limits. What works for you may fall apart with large amounts of capital. Don't discount compound interest. $10,000 compounding 30% over 20 years is about $2 million without any additional capital.
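The arithmetic behind that claim, for anyone who wants to check it:

    principal = 10_000
    rate = 0.30           # 30% per year
    years = 20
    final = principal * (1 + rate) ** years
    print(f"${final:,.0f}")   # ~$1,900,496, i.e. roughly $2 million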
Absolutely correct - and moreover, when you do sit someone down (in my case, someone with a "superior education" in finance compared to my CS degree) and explain things to them, they simply don't understand it at all and assume you're crazy because you're not doing what they were taught in Biz School.
Why not then publish the strategies once outmoded, or are they in fact published? Can I go see somewhere what strategies big funds used in the 90s to make bank, which presumably no longer offer a competitive advantage? The way I can go see what computer exploits/hacks used to work when they were still secret?
Maybe it's just what I know, but I can't help but think the "strategies" are a lot like security exploits--some cleverness, some technical facility, but mainly the result of staring at the system for a really long time and stumbling on things.
Why not? Because you won't know what of your strategies is outmoded by something new because that group is not publishing their strategy, which is like yours but on steroids, either.
And then everything regresses to the Dark Forest game theory.
Wouldn't publishing also influence the performance itself because it would also make an impact on the data? And if you'd calculate that in and the method is spreading, wouldn't that in turn have to be calculated in also, which would lead to a spiral?
At a SciPy meeting where someone in finance was presenting an intro on some tools, someone asked if they ever contribute code to those open source projects. Their answer was "Yes, but only after we've stopped making money with them."
No hedge fund registered before the last 2 weeks will use Llama3 for their "prod work" beyond "experiments".
Quant trading is about "going fast" or "being super right", so either you'd need to be sitting on some huge llama.cpp/transformer improvement (possible but unlikely) or it's more likely just some boring math applied faster than others.
Even if they are using an "LLM", they won't tell you or even hint at it - "efficient market" n all that.
Remember all quants need to be "the smartest in the world" or their whole industry falls apart; wait till you find out it's all "high school math" based on algos largely derived 30-40 years ago (okay, not as true for "quants", but most "trading" isn't as complex as they'd like you/us to believe).
Well I work in prop trading and have only ever worked for prop firms - our firm trades its own capital and distributes it to the owners and us under profit share agreements - so we have no incentive to sell ourselves as any smarter than the reality.
Saying it's all high school math is a bit of a loaded phrase. "High school math" incorporates basically all practical computer science and machine learning and statistics.
I suspect you could probably build a particle accelerator without using more math than a bit of calculus - that doesn't make it easy or simple to build one.
Very few people I've worked with have ever said they are doing cutting edge math - it's more like scientific research. The space of ideas is huge, and the ways to ruin yourself innumerable. It's more about people who have a scientific mindset who can make progress in a very high noise and adaptive environment.
It's probably more about avoiding blunders than it is having some genius paradigm shifting idea.
Would you ever go off on your own to trade solo or is that something that just does not work without a ton (like 9 figures) of capital and a pretty large team?
Going solo in trading is a very different beast compared to trading at a prop firm. Yes, capital is a significant factor. The more you have, the more you can diversify and absorb losses which are inevitable in trading. However, it's not just about the capital. The infrastructure, data access, and risk management systems at a prop firm are usually far superior to what you could afford or build on your own as an individual trader.
Moreover, the collaborative environment at a prop firm can't be understated. Ideas and strategies are continuously debated, tested, and refined. This collective brainpower often leads to more robust strategies than what you might come up with on your own.
That said, there are successful solo traders, but they often specialize in niche markets where they can leverage unique insights or strategies that aren't as capital intensive. It's definitely not for everyone and comes with its own set of challenges and risks.
It's like any other business, there are factors of production that various actors will have varying access to, at varying costs.
A car designer still needs a car factory of some sort, and there's a negotiation there about how the winnings are divided.
In the trading world there are a variety of strategies. Something very infra dependent is not going to be easy to move to a new shop. But there are shops that will do a deal with you depending on what knowledge you are bringing, what infra they have, what your funding needs are, what data you need, and so on.
> It's probably more about avoiding blunders than it is having some genius paradigm shifting idea.
I too believe this is key towards successful trading. Put in other words, even with an exceptionally successful algorithm, you still need a really good system for managing capital.
In this line of business, your capital is the raw material. You cannot operate without money. A highly leveraged setup can get completely wiped out during massive swings - triggering margin calls and automatic liquidation of positions at the worst possible price (maximizing your loss). Just ask ex-billionaire investor/trader Bill Hwang[1].
Here is a simple way to think about it. The markets follow a random walk and there is a 50/50 chance of being right or wrong. If you can make more when you are right, and lose less when you are wrong, you are on your way to being profitable.
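A tiny illustration of that asymmetric-payoff point (the numbers are made up):

    p_win = 0.5          # coin-flip odds of being right
    avg_gain = 1.5       # units made when right
    avg_loss = 1.0       # units lost when wrong
    ev = p_win * avg_gain - (1 - p_win) * avg_loss
    print(ev)            # 0.25 > 0: profitable in expectation despite 50/50 odds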
>Saying it's all high school math is a bit of a loaded phrase. "High school math" incorporates basically all practical computer science and machine learning and statistics.
I'm responding to the comment "do you use llama3", not "break down your strat".
> Very few people I've worked with have ever said they are doing cutting edge math - it's more like scientific research. The space of ideas is huge, and the ways to ruin yourself innumerable. It's more about people who have a scientific mindset who can make progress in a very high noise and adaptive environment.
This statement is largely true of any "edge research". As I watch the loss totals flow by on my 3rd monitor, I can think of 30 different avenues of exploration (none of which are related to finance).
Trading is largely high school math, on top of very complex code, infrastructure, and optimizations.
> but most "trading" isn't as complex as they'd like you/us to believe
I know nothing about this world, but with things like "doctor rediscovers integration" I can't help but wonder if it's not deception but ignorance - that they think that really is where the math complexity tops out.
"Doctor rediscovers integration" is about people stepping far outside their field of expertise.
It is neither deception nor ignorance.
It's the same reason some of the best physics students get PhD studentships where they are basically doing linear regression on some data.
Being very good at most disciplines is about having the fundamentals absolutely nailed.
In chess for example, you will probably need to get to a reasonably high level before you will be sure to see players not making obvious blunders.
Why do tech firms want developers who can write bubble sort backward in assembly when they'll never do anything that fundamental in their career? Because to get to that level you have to (usually) build solid mastery of the stuff you will use.
Trading is truly a complex endeavour - anybody who says it isn't has never tried to do it from scratch.
I'd say the industry average for somebody moving to a new firm and trying to replicate what they did at their old firm is about 5%.
I'm not sure what you'd call a problem where somebody has seen an existing solution, worked for years on it and in the general domain, and still would only have a 5% chance of reproducing that solution.
> Being very good at most disciplines is about having the fundamentals absolutely nailed.
> In chess for example, you will probably need to get to a reasonably high level before you will be sure to see players not making obvious blunders.
To extend the chess analogy, having the fundamentals absolutely nailed is critical at even a mid-level, because the payoff/effort ratio in avoiding blunders/mistakes is much higher than innovating or being creative.
The process of getting to a higher level involves rote learning of common tactics so you can instantly recognize opportunities, and then eventually learning deep into "opening theory" which is memorizing 10 starting moves + their replies because people much better than you have written lengthy books on the long-term ramifications of making certain moves. You're learning a vast repertoire of "existing solutions" so you can reproduce them on-demand, because those solutions are battle-tested to not have weaknesses.
Chess is a game where the amount you have to lose by being wrong is much higher than what you gain by being right. Fields where this is the case want to ensure to a greater extent that people focus on the fundamentals before they start coming up with new ideas.
Spell the assembly backwards out loud with no prior notes while juggling knives (shows boldness in the way you approach problems!) and standing on a gymnastics ball (shows flexibility and well-roundedness)...
> Id say the industry average for somebody moving to a new firm and trying to replicate what they did at their old firm is about 5%.
Because 95% of experienced candidates in trading were fired or are trying to scam their next employer.
“Oh, yeah, my <insert HFT pipeline or statarb model> can do sharpe <random int 1 to 10> for <random int 10 to 100> million pnl per year. Trust me bro”. Fucking annoying
Obviously not true. The deal for most of these setups is that team founders/PMs are paid mostly by profit share. So the only scam is scamming yourself into a low-salary position for a couple of years till they fire you.
Orders of magnitude more leave their jobs of their own choosing than are fired.
They hire people who know that maths doesn't "top out here", so they can point to them and say "look at the mathematicians/physicists/engineers/PhDs we employ - your $20Bn is safe here". Hedge funds aren't run by idiots, just a different kind of "smart" than an engineer.
The engineers are incredibly smart people, and so the bots are "incredibly smart", but "finance" is criticised by "true academics" because finance is where brains go to die.
To use a popular-science example: "the three body problem" is much harder than "arb trade $10M profitably for a nice life in NYC"; you just get paid less for solving the former.
It's like math vs engineering - you can come up with some beautiful PDE theory to describe how a column in a building will bend under dynamic load and use it to figure out the exact proportions.
But engineering is about figuring out "just make its ratio of width to height greater than x"
Because the goal is different - it's not about coming up with the most pleasing description or finding the most accurate model of something. It's about making stuff in the real world in a practical, reliable way.
The three body problem is also harder than running experiments in the LHC or analysing Hubble data or treating sick kids or building roads or running a business.
Anybody who says that finance is where brains go to die might do well to look in the mirror at their own brain. There are difficult challenges for smart people in basically every industry - anybody suggesting that people not working in academia are in some way stupider should probably reconsider the quality of their own brain.
There are many many reasons to dislike finance. That it is somehow pedestrian or for the less clever people is not true.
Nobody who espouses the points you've made has ever put their money where their mouth is. Why not start a firm, make a billion dollars a year because you're so smart, and fund fusion research with it? Because it's obviously way more difficult than they make out.
> The three body problem is also harder than running experiments in the LHC or analysing Hubble data or treating sick kids or building roads or running a business
Not that it's particularly relevant to this discussion, but the three-body problem is easy. You can solve it numerically on a laptop with insane precision (much more precisely than would be useful for anything) or also write down an analytic solution (which is ugly and useless because it converges extremely slowly, but still. See wikipedia.org/wiki/Three-body_problem).
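To make the numerical claim concrete, here is a minimal laptop-scale sketch (assuming NumPy/SciPy; units are arbitrary with G = 1, and the initial condition is the well-known figure-eight choreography - all of this is illustrative, not anyone's production code):

    import numpy as np
    from scipy.integrate import solve_ivp

    m = np.array([1.0, 1.0, 1.0])   # three equal masses, G = 1

    def deriv(t, y):
        # y = [x1, y1, x2, y2, x3, y3, vx1, vy1, vx2, vy2, vx3, vy3]
        pos, vel = y[:6].reshape(3, 2), y[6:].reshape(3, 2)
        acc = np.zeros_like(pos)
        for i in range(3):
            for j in range(3):
                if i != j:
                    r = pos[j] - pos[i]
                    acc[i] += m[j] * r / np.linalg.norm(r) ** 3
        return np.concatenate([vel.ravel(), acc.ravel()])

    # Figure-eight initial condition (positions for the three bodies, then velocities).
    y0 = np.array([-0.97000436,  0.24308753,
                    0.97000436, -0.24308753,
                    0.0,         0.0,
                    0.46620368,  0.43236573,
                    0.46620368,  0.43236573,
                   -0.93240737, -0.86473146])
    sol = solve_ivp(deriv, (0, 10), y0, rtol=1e-12, atol=1e-12)
    print(sol.y[:6, -1])   # body positions at t = 10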
> Unlike the two-body problem, the three-body problem has no general closed-form solution,[1] and it is impossible to write a standard equation that gives the exact movements of three bodies orbiting each other in space.
The crucial parts of that are "closed-form" and "standard". The analytic solution is "non-standard" because it involves the kind of power series that nobody knows or cares about (because they are only about 100 years old and have no real useful applications in engineering).
A similar claim is that roots of polynomials of degree 5 (and over) have no "general closed form solution" (with, as usual, the implicit qualification: "in terms of functions I'm currently comfortable with because I've seen them a lot"). That doesn't mean it's a difficult problem.
The two problems have in common that they are significantly harder than their smaller versions (two bodies, or degree 4). Historically, people spent a lot of time trying to find solutions for the larger problems in terms of the same functions that can be used to solve the smaller problems (conic sections, radicals). That turned out to not be possible. This is the historical origin of the meme "three body problem is unsolvable".
I'll probably go look this up, but do you mean functions of a higher type than normal powers, like e.g. tetration, or something more complicated (am I even on the right track?)
I mean functions defined by power series (just like sin(x) is defined in analysis courses). For the three-body problem, see http://oro.open.ac.uk/22440/2/Sundman_final.pdf (Warning, pdf!). This is what Wikipedia cites when talking about the solution to the three-body problem. The document gives a lot of historical context.
For polynomial roots, see wikipedia.org/wiki/Elliptic_function.
> ... suggesting that people not working in academia are in some way stupider ...
My interpretation of "finance is where brains go to die" is more along the lines of finance being less good for society at large compared to pure science. Like if someone invents something new and useful in a lab for their PhD, and then they go find a job in finance. The brain died because it was onto something and then abandoned it to become a cog in the machine.
I was specifically addressing the "being smart isn't necessary for trading".
The OP is making some implication across numerous posts that it's all basically a big con and it's all very simple.
It is like claiming you don't need to be rocket scientist to go to the moon because they just use metal and screws.
The individual parts might be simple in isolation. But it is the complexity of conducting large scale, large scope research in an environment that gives you limited feedback and will adapt to your own behaviour changes that is where the smarts are needed.
OP seems to not understand the inherent difficulty of doing any research.
Almost anybody could be taught to make a simple circuit and battery from some basic raw materials.
The fact that it is simple and easy now that we know the answer does not mean it was simple or easy to discover. Some of the greatest minds dedicated their entire lives to discovering things that now most 10-year-olds understand. That doesn't imply you only need the intellect of a 10-year-old to make fundamental breakthroughs in science.
Working in quant trading is almost pure research - and so it requires a certain level of intellect - probably at least the intellect required to pursue a quantitative PhD successfully (not that they need the PhD but they need the capacity to be able to do one).
You misunderstand the quote. It’s where brains go to die from a societal perspective. It might be stimulating and difficult for the individual but it’s useless to science.
It’s impressive how incorrect so much of this information is. High frequency trading is about going fast. There is a huge mid and low freq quant industry. Also most quant strategies are absolutely not about being “super right”…that would be the province of concentrated discretionary strategies. Quant is almost always about being slightly more right than wrong but at large scale.
What algos are you referring to derived 30 or 40 years ago? Do you understand the decay for a typical strategy? None of this makes any sense.
Quantitative trading is simply the act of trading on data, fast or slowly, but I'll grant you for the more sophisticated audience there is a nuance between "HFT" and "Quant" trading.
To be "super right" you just have to make money over a timeline, you set, according to your own models. If I choose a 5 year timeline for a portfolio, I just have to show my portfolio outperforming "your preferred index here" over that timeline - simple (kind of, I ignore other metrics than "make me money" here).
Depending on what your trading will depend on which algo's you will use, the way to calculate the price of an Option/Derivative hasn't changed in my understanding for 20/30 years - how fast you can calculate, forecast, and trade on that information has.
My statement wont hold true in a conversation with an "investing legend", but to the audiance who asks "do you use llama3" its clearly an appropriate response.
I don't really understand your viewpoint - I assume you don't actually work in trading?
Aside from the "theoretical" developments the other comment mentioned, your implication that there is some fixed truth is not reflected in my career.
Anybody who has even a passing familiarity with doing quant research would understand that Black-Scholes and its descendants are very basic results from basic assumptions. It says that if the price follows a certain type of random walk - crucially, one that is also a martingale and Markov - then there is a closed-form answer.
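(For reference, the "closed-form answer" being referred to is the textbook Black-Scholes formula; a minimal sketch of the European call price, purely illustrative and not anyone's production pricer:)

    from math import log, sqrt, exp
    from statistics import NormalDist

    def bs_call(S, K, T, r, sigma):
        # Textbook Black-Scholes European call: log-normal underlying,
        # constant volatility and rate, no dividends.
        d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
        d2 = d1 - sigma * sqrt(T)
        N = NormalDist().cdf
        return S * N(d1) - K * exp(-r * T) * N(d2)

    print(bs_call(S=100, K=100, T=1.0, r=0.05, sigma=0.20))   # ~10.45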
First and foremost, Black-Scholes is inconsistent with the market it tries to describe (vol smiles, anyone??), so anybody claiming it's how you should price options has never been anywhere near trading options in a way that doesn't shit money away.
In reality the assumptions don't hold - log returns aren't Gaussian, and the process is almost certainly neither Markov nor martingale.
The guys doing the very best option pricing are building empirical (so not theoretical) models that adjust for all sorts of stuff like temporary correlations that appear between assets, dynamics of how different instruments move together, autocorrelation in market behaviour, spikes and patterns of irregular events, and hundreds of other things.
I don't know of any firm anywhere that is trading profitably at scale and is using 20-year-old or even purely theoretical models.
The entire industry moved away from the theory-driven approach about 20 years ago for the simple reason that it is inferior in every way to the data-driven approach that now dominates.
Very few of the fancy models are actually used. Dupire's non-parametric model has been the industrial workhorse for a long time. Heston-like SVs and jump diffusions promised a lot and did not work in practice (calibration, stability issues). Some form of local-stochastic model gets used for certain products.
In general, it is safe to say that Black-Scholes and its deterministic extension local vol have held up well.
Not only that, but Dupire’s local vol, stochastic vol (Heston in rates, or on the equity side models that combine local vol with a stoch vol component to calibrate to implied vols perfectly) and jump diffusion were basically in production 15 years ago.
Since the GFC it’s not about crazy new products (on derivatives desks), but it’s about getting discounting/funding rates precisely right (depending on counterparty, collateral and netting agreements, onshore/offshore, etc), and about compliance and reporting.
> the way to calculate the price of an Option/Derivative hasn't changed in my understanding for 20/30 years
Not true. Most of the magic happens in estimating the volatility surface, BSM's magic variable. But I've also seen interesting work in expanding the rates components. All this before we get into the drift functions.
> all foundational derivatives models were basically in place back then
In vanilla equity options, sure. But that’s like saying we solved rockets in WWII. The foundational models were derived by then; everything that followed was refinement, extension and application.
Leveraging "hidden" risk/reward asymmetries is another avenue completely that applies to both quant/HFT, adding a dimension that turns this into a pretty complex spectrum with plenty of opportunities.
The old joke of two economists ignoring a possible $100 bill on the sidewalk is an ironic adage. There are hundreds of bills on the sidewalk, the real problem is prioritizing which bills to pick up before the 50mph steamroller blindsides those courageous enough to dare play.
Algo trading is certainly about speed too, though, but it's not HFT, which is literally only about speed and scalping spreads. It's about the speed of recognizing trends and reacting to them before everyone else realizes the same trend and thereby alters it.
It's a lot like quantum mechanics, or whatever it is that makes observing a photon change it. Except with the caveat that the first to recognize the trend can direct its change (for profit).
I agree this isn't earth shattering, but I think the benefit here is that it's a general solution instead of one trained on financial statements specifically.
That is not a benefit. If you use a tool like this to try to compete with sophisticated actors (e.g. all major firms in the capital markets space) you will lose every time.
We come up with all sorts of things that are initially a step backwards, but that lead to eventual improvement. The first cars were slower than horses.
That's not to suggest that Renaissance is going to start using ChatGPT tomorrow, but maybe in a few years they'll be using fine-tuned versions of LLMs in addition to whatever they're doing today.
Even if it's not going to compete with the state of the art models for something, a single model capable of many things is still useful, and demonstrating domains where they are applicable (if not state of the art) is still beneficial.
It seems to me that LLMs are the metaphorical horse and specialized algorithms are the metaphorical car in this situation. A horse is an extremely complex biological system that we barely understand and which has evolved many functions over countless iterations, one of which happens to be the ability to run quickly. We can selectively breed horses to try to get them to run faster, but we lack the capability to directly engineer a horse for optimal speed. On the other hand, cars have been engineered from the ground up for the specific purpose of moving quickly. We can study and understand all of the systems in a car perfectly, so it's easy to develop new technology specialized for making cars go faster.
Far too much in the way of "maybe in a few years" LLM prediction relies on the unspoken assumption that there will not be any gains in the state of the art in the existing, non-LLM tools.
"In a few years" you'd have the benefit of the current, bespoke tools, plus all the work you've put into improving them in the meantime.
And the LLM would still be behind, unless you believe that at some point in the future, a radically better solution will simply emerge from the model.
That is, the bet is that at some point, magic emerges from the machine that renders all domain-specialist tooling irrelevant, and one or two general AI companies can hoover up all sorts of areas of specialism. And in the meantime, they get all the investment money.
Why is it that we wouldn't trust a generalist over a specialist in any walk of life, but in AI we expect one day to be able to?
> That is, the bet is that at some point, magic emerges from the machine that renders all domain-specialist tooling irrelevant, and one or two general AI companies
I have a slightly more cynical take: Those LLMs are not actually general models, but niche specialists on correlated text-fragments.
This means human exuberance is riding on the (questionable) idea that a really good text-correlation specialist can effectively impersonate a general AI.
Even worse: Some people assume an exceptional text-specialist model will effectively meta-impersonate a generalist model impersonating a different kind of specialist!
> Even worse: Some people assume an exceptional text-specialist model will effectively meta-impersonate a generalist model impersonating a different kind of specialist!
Specialists exist because the human generalist can no longer possibly learn and perfect all there is to learn in the world, not because the specialist has magic powers the generalist doesn't.
If there were some super-generalist that could, then the specialist would have no power.
agreed. most people can't create a custom-tailored financial statement model. but many people can write the following sentence: "analyze this financial statement and suggest a market strategy." and if that sentence performs as well as an (albeit old) custom model, and is likely to see compound improvements in its performance over time with no changes to the instruction sentence...
But it can't come up with a particularly imaginative strategy; it can only come up with a mishmash of existing stuff it has seen, equivocate, or hallucinate a strategy that looks clever but might not be.
So it all needs checking. It's the classic LLM situation. If you're trained enough to spot the errors, the analysis wouldn't take you much time in the first place. And if you're not trained enough to spot the errors...
And let's say it does work. It's like automated exchange betting robots. As soon as everyone has access to a robot that can exploit some hidden pattern in the data for a tiny marginal gain, the price changes and the gain collapses.
So if everyone has the same access to the same banal, general analysis tools, you know what's going to happen: the advantage disappears.
All in all, why would there be any benefits from a generalised model?
> "buy and hold the S&P 500 until you're ready to retire"
That is bad advice.
VGT Vanguard Technology ETF has outperformed S&P 500 over the past 20 years.
All the people who say “VTSAX and chill” disappeared in the past 3-4 years because their cherished total passive index fund is no longer the best over long horizons. And no, the markets are not efficient.
Given the techie audience here, I want to caution that investing in the same industry as your job is a kind of anti-diversification.
A really severe example would be all the people who worked at Enron and invested everything in Enron stock.
Even if your employer/investments aren't quite so fraudulent, you don't want to be in a situation where you are long-term unemployed and are forced to "sell low" in order to meet immediate needs. If only one or the other is hit, you can ride things out more effectively.
So the history of this type of research as I know it was that we
- started to diff the executives' statements from one quarter to another. Like engineering projects, a lot of this is pretty standard, so the starting point is the last doc. Diffing allowed us to see what the executives added and thought was important, and also showed what they removed. This worked well and for some things still does (this is what a warrant canary does), but stopped generating much alpha around 2010ish.
- simple sentiment. We started to count positive and negative words to build a poor man's sentiment analysis that could be done very quickly upon doc release to trade on (a toy sketch of this appears after this list). Worked great up until around 2013ish, before it started to be gamed and even bankruptcy notices gave positive sentiment scores by this metric.
- sentiment models. Using proper models and not just positive and negative word counts, we built sentiment models to read what the executives were saying. This worked well until about 2015/2016ish in my world view, as by then executives carefully wrote out their remarks and had been coached to use only positive words. Worked until Twitter killed the firehose, and wasn't very reliable as reputable news accounts kept getting hacked. I remember, I think it was the AP News account that got hacked and reported a bombing at the White House, which screwed up a few funds.
You also had "Anne Hathaway news pushing up Berkshire Hathaway's share price" type issues in this time period.
- there was a period here where we kept the same technology but used it everywhere from the Twitter firehose to news articles to build a real-time sentiment model for companies and sectors. Not sure it generates much alpha due to garbage in, garbage out and data-cleaning issues.
- LLMs. With about GPT-2 we could build models to do the sentiment analysis for us, but they had to be built out of foundational models and trained in-house due to context limitations. Again this has been gamed by executives, so a lot of the research that I know of is now targeted at ingesting the financials of companies and being able to ask questions quickly without math and programming.
i.e. what are the top 5 firms in the consumer discretionary space that are growing their earnings the fastest while not yet raising their dividends and whose share price hasn't kept up with their sector's average growth.
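A toy sketch of the positive/negative word-count "simple sentiment" step mentioned in the list above (the word lists and scoring here are purely illustrative; real finance lexicons such as Loughran-McDonald are far larger):

    POSITIVE = {"growth", "record", "strong", "improved", "exceeded", "profitable"}
    NEGATIVE = {"decline", "loss", "impairment", "restructuring", "litigation", "weak"}

    def naive_sentiment(text: str) -> float:
        # Crude poor-man's sentiment: (#positive - #negative) / (#positive + #negative)
        words = [w.strip(".,;:()").lower() for w in text.split()]
        pos = sum(w in POSITIVE for w in words)
        neg = sum(w in NEGATIVE for w in words)
        return (pos - neg) / max(pos + neg, 1)

    print(naive_sentiment("Record revenue growth, offset by a one-time impairment loss."))  # 0.0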
I have no window into this world, but I am curious if you know anything about the techniques that investors used to short or just analyze Tesla stock during the production hell of 2017-2020? It was an interesting window into the ways firms measure as much of a company as they can from the outside. In fact, was there any other stock that was as heavily watched during that time?
Looking back at that era it seemed investors were too focused on the numbers and fundamentals, even setting up live feeds of the factories to count the number of cars coming out, and that's the same feeling I get from your post. It seems like dumb analysis, i.e. analysis without much context.
We now know from the recent Isaacson biography what was happening on the other side. The shorts failed to measure the clever, unorthodox ways that Musk and co would take to get the delivery numbers up. For example: the famous Tent. Musk used a loophole in CA laws to set up a giant tent in the parking lot, which allowed him to boost production by eliminating entire bottlenecks from the factory design. There is also just the religious-like fervor with which the employees wanted to beat the shorts. I don't think this can be measured, no? It helped to get them past the finish line.
Most companies aren't obsessed enough with shorts to try and hide poor results from analysis that will be exposed in 3 months anyway. There are always ways around them - number of cars registered, number of delivery trucks visiting, time for delivery on the website, how much overtime is being worked, etc.
Markets aren't sports teams, i.e. bimodal camps with us vs. them drama. Twitter discussion of markets, maybe, but not markets.
I've been on both sides of this trade, regularly.
Bear thesis back then was same as now. In retrospect, I give it a few more credits because Elon says they were getting close to bankrupt while he was posting "bankwupt" memes and selling short shorts.
Being a pessimist, and putting your money where your mouth is in markets, is difficult because you have to be right and have the right timing.
> In this section, we aim to understand the sources of GPT’s predictive ability.
Oh boy... I wonder how a neural net trained with unsupervised learning has a predictive ability. I wonder where that comes from... Unfortunately, the article doesn't seem to reach a conclusion.
> We implement the CoT prompt as follows. We instruct the model to take on the role of a financial analyst whose task is to perform financial statement analysis. The model is then instructed to (i) identify notable changes in certain financial statement items, and (ii) compute key financial ratios without explicitly limiting the set of ratios that need to be computed. When calculating the ratios, we prompt the model to state the formulae first, and then perform simple computations. The model is also instructed to (iii) provide economic interpretations of the computed ratios.
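Purely as an illustration, a prompt in the spirit of that description might look something like the sketch below (this is a guess at the shape, not the authors' actual wording):

    COT_PROMPT = """You are a financial analyst. You are given a firm's standardized,
    anonymized balance sheet and income statement for the current and prior year.

    1. Identify notable year-over-year changes in financial statement items.
    2. Compute key financial ratios. For each ratio, state the formula first,
       then perform the calculation.
    3. Provide an economic interpretation of each computed ratio.
    4. Conclude with a prediction of whether earnings will increase or decrease
       next period, with a magnitude (large/moderate/small) and a confidence level.
    """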
Who will tell them how an LLM works and that the neural net does not calculate anything? It only predicts the next token in the text of a calculation, and only if it's been loss-minimized for that specific calculation.
It looks like these authors are discovering large language models as if they are some alien animal, when they are mathematically describable and really not-so-mysterious prediction machines.
At least the article is fairly benign. It's about the type of article that would pass as research in my MBA school as well... It doesn't reach any groundbreaking conclusions except to demonstrate that the guys have "probed" the model. Which I think is good. It's uninformed but not very misleading.
I have heard of generalization vs memorization, but the article you shared is very high quality. Thank you.
I do not think that SOTA LLMs demonstrate grokking for most math problems. While I am a bit surprised to read how little training is necessary to achieve grokking in a toy setting (one specific math problem), the domain of all math problems is much larger. Also, the complexity of an applied mathematics problem is much higher than a simple mod problem. That seems to be what the author of the first article you quoted thinks as well.
Our public models fail in that large domain a lot. For example, with tasks like counting elements in a set (words in a paragraph). Not to mention that they fail in complex applied mathematics tasks. If they have been loss-minimized for that specific calculation to the point that they exhibit this phase change, then that would be an exception.
But in the financial statement analysis article, the author says explicitly that there isn't a limitation on the types of math problems they ask the model to perform. This is very, very irregular, and there are no guarantees that the model has generalized them. In fact, it is much more likely that it hasn't, in my opinion.
In any case, thank you again for the article. It's just such a massive contrast with the MBA article above.
Phase changes and grokking make me nervous... It seems once you reach a certain threshold of training, you can continually "phase-change" and generate these emergent capabilities. This does not bode well for alignment.
The area where I see this making the most transformational change is by enabling average citizens to ask meaningful questions about the finances of their local government. In Cook County, Illinois, there are hundreds of local municipalities and elected authorities, all of which are producing monthly financial statements. There is not enough citizen oversight and rarely any media attention except in the most egregious cases (e.g. the recent drama in Dolton, IL, where the mayor is stealing millions in plain view of the citizens).
The citizens ask LLMs (or more advanced future AIs) to identify whether government finances are being used efficiently, and whether there is evidence of corruption.
The corrupt government officers then start using the AIs to try to cover up the evidence of their crimes in the financial statements, with the AI possibly putting the skills of high-end and expensive human accountants (or better) into the hands of local governments.
Corrupt government officers are one thing. But there is a ton of completely well-meaning bureaucracy in the U.S. (and everywhere!) that could benefit from a huge, huge step change in "ability to comprehend".
Bad actors will always exist but I think there's a LOT of genuine good to be done here!
If we put the right checks and balances (powered by AI) in place now, we can front run the criminals, both the obvious and non-obvious crimes. We can shine light in more places and push the corruption further out of the system.
> The corrupt government officers then start using the AIs
You're making it way too complicated. The government will simply make AI illegal and claim it's for safety or something. They'll then use a bunch of scary words to demonize it, and their pals in the mainstream media will push it on low-information voters. California already has a bill in the works to do exactly this.
> The corrupt government officers then start using the AIs to try to cover up the evidence of their crimes in the financial statements.
There's a difference between an AI being able to answer questions and it helping cover up evidence, unless you mean "using the AIs for advice on how to cover up evidence"
that's what I did with my town's financial report. Asked ChatGPT to find irregularities.
The response was very concerning, with multiple expenses that looked truly suspicious (like planting a tree - $2,000).
I would have gone berserk at the town council meeting if I were an activist citizen.
I think this is in general one of the big wins with LLMs: Simple summarization. I first encountered it personally with medical lab reports. And as I noted in a past comment, GPT actually diagnosed an issue that the doctors and nurses missed in real-time as it was happening.
The ability to summarize and ask questions of arbitrarily complex texts is so far the best use case for LLMs -- and it's non-trivial. I'm ramping up a bunch of college intern devs and they're all using LLMs and the ramp up has been amazingly quick. The delta in ramp up speed between this and last summer is literally an order of magnitude difference and I think it is almost all LLM based.
> citizens to ask meaningful questions about the finances of their local government.
Is there a demand for this? I live in Cook County. I really don't want to ask these questions. Not sure what I get out of asking these questions other than anger and frustration.
> if all the citizens can ask these questions, I think it will make a difference.
Our mayor just appointed some pastor to a high-level position in the CTA (the local train system) as some sort of patronage.
That's the level things operate at in our govt here. I am skeptical that some sort of data enlightenment in the citizenry via LLMs is what is needed for change.
Then anything you plan is doomed from the start. If companies start slipping cyanide into their food it would take at least 20 years for people to stop buying it. Getting everyone to simply do your thing while they're busy with their own life is a fool's errand.
Most people won't care most of the time. But if the local government cuts the budget for something you like and says "We couldn't find the money," you may care that year.
Let's say LLMs work exactly as advertised in this case: you go into the LLM, say "find corruption in these financial reports", and it comes back with some info about the mayor spending millions on overpriced contracts with a company run by his brother. What then? You can post on Twitter, but unless you already have a following it's shouting into the void. You can go to your local newspapers, they'll probably ignore you; if they do pay attention, they'll write an article which gets a few hundred hits. If the mayor acknowledges it at all, they'll slam it as a political hit-piece, and that's the end of it. So your best chance is... hope really hard it goes viral, I guess?
This isn't meant to be overly negative, but exposing financial corruption is mostly about information control; I don't see how LLMs help much here. Even if/when you find slam-dunk evidence that corruption is occurring, it's generally very hard to provide evidence in a way that Joe Average can understand, and assuming you are a normal everyday citizen, it's extremely hard to get people to act.
As a prime example, this bit on the SF "alcohol rehab" program[0] went semi-viral earlier this week; there's no way to interpret $5 million/year spent on 55 clients as anything but "incompetence" at best and "grift and corruption" at worst. Yet there's no public outrage or people protesting on the streets of SF; it's already an afterthought in the minds of anyone who saw it. Is being able to query an LLM for this stuff going to make a difference?
Also, per the link, cheaper than emergency room visits and ambulance transports:
> But San Francisco public health officials found that the city saved $1.7 million over six months from the managed alcohol program in reduced calls to emergency services, including emergency room visits and other hospital stays. In the six months after clients entered the managed alcohol program, public health officials said visits to the city’s sobering center dropped 92%, emergency room visits dropped more than 70%, and EMS calls and hospital visits were both cut in half.
> Previously, the city reported that just five residents who struggled with alcohol use disorder had cost more than $4 million in ambulance transports over a five-year period, with as many as 2,000 ambulance transports over that time. [emphasis mine]
> The San Francisco Fire Department said in a statement that the managed alcohol program has “has proven to be an incredibly impactful intervention” at reducing emergency service use for a “small but highly vulnerable population.”
Beautifully stated. I can only speculate, but I'd say the reason it is this way is due to the collective apathy/cynicism toward government. We have collectively come to expect a certain level of corruption and influence peddling. We have a high tolerance for incompetence in carrying out government operations. Only the most egregious offenders are brought to the public's attention, and in an age of increasingly short attention spans, people have forgotten by the time elections roll around.
That is, if they vote in the first place - in that example I gave above of a corrupt mayor stealing millions (Tiffany Henyard of Dolton, IL), the voter turnout was only 15%.
Why would you report financial crimes to Twitter? If your LLM uncovers financial crimes you should contact regulators and prosecutors. They're both incentivized to do something about it.
Oh yeah. This. I live in a tiny community, yet our district school board has a $54 million budget right now, and all the audits are rubber stamps and a wink and a nudge from the State. When residents try to dig in and complain about waste and fraud we are shrugged off.
It’s your assumption that the lack of oversight is because of too much information. How will you validate that hypothesis before you invest in a solution?
if you want to see successful "machine learning based financial statement analysis", check out my paper & thesis. it's from 2019 and ranks #1 for the term on Google and GS because it is the first paper that applies a range of machine learning methods to all the quantitative data in them instead of just doing NLP on the text. happy to answer questions
To everyone thinking they can sell an LLM wrapper based on this - this is a very tough domain. You will soon run into data, distribution, and low-demand problems. Funds that would actually use this are already using it.
I recall seeing a LinkedIn post by Greg Diamos at Lamini; they shared an analysis of earnings calls. There are links on HuggingFace and GitHub, here they are:
It would've been interesting to compare models with larger context windows, e.g. Gemini with 1m+ tokens and Claude Opus. Otherwise, the title maybe should've been Financial Statement Analysis with GPT-4.
I don't have a horse in the race. I don't even know what financial statement analysis is. But it worries me that a novel reliance on these models for traditionally skilled labor jobs will turn into a dependence. These models use the built up experience of human practitioners to achieve similar results. But if a dependence grows, then there will be no more skilled human practitioners to further develop improved skills and knowledge for these jobs. Calcifying the skills in time.
That wouldn't be a problem if the models actually worked for that. We don't miss cobblers.
But these models only seem to perform these jobs on the surface. Enough that companies will try them and waste resources. This will just hurt the bottom-line-optimizer shops and boost the professionals doing quality work in the long run.
As someone alluded to, the narrative that management drives has been examined and studied many times over. What is management saying, what are they not saying, what are they saying but not loudly, what did they say before that they no longer speak about. There are insights to glean but nothing that is giving you an unknown edge. Sentiment analysis and the like go back well into the late 80s, early 90s.
Maybe, but it sounds hard if there are multiple LLMs out there that people might use to analyze such text. Tricking multiple LLMs with a certain poisonous combination of words and phrases sounds a lot like having a file that hashes to the same hash from different hashing techniques. Theoretically possible but actually practically impossible.
If this were to become widely used, I can imagine executives writing financial statements, running them through an LLM, and tweaking it until they get the highest predicted future outcome. This would make the measure almost immediately useless.
This is already how it works. Have you listened to an earnings call? Especially companies like Tesla? They are a dog and pony show to sell investors on the stock.
I am not saying executives aren't currently trying to game the system. I am saying that currently the best they can do is estimate how thousands of analysts will respond. If LLM analysts become widespread, then they would be able to run simulations in a feedback loop until their report is optimized.
Still, if you use GPT-4 it gives you 60% accuracy in predicting whether it's going to go up or down, which is considerably better than median human forecasters. Stop being so dismissive and start reading the numbers.
From a first principles approach: it does not really make sense to use an LLM to do fundamental analysis directly. Maybe you can use an LLM to write some python code to do fundamental analysis. But skipping that model building step and just applying a language model to numbers does not make intuitive sense to me.
I am surprised at the results in the paper. The biggest red flag is that the researchers are not sure why there is predictive ability in LLMs. Maybe they didn't control for some lookahead bias.
This research, and the comments mentioning the need to be fast as hell, connect with the question of the human capacity to use the information in time. Just a hypothetical point: at some point, only trading engines driven by AI could make timely use of this source.
What will happen if everyone starts using heavy statistical methods or LLMs to predict stock prices? And buys stock based on them? Will it make everything absolutely unpredictable?
Edit: assuming that they initially provide good predictions
Or rather: If LLMs could give those guys an edge, there's no way they'd share their edge-giving LLMs with anyone, least of all their competition and the plebs.
Isn't it already unpredictable? That is why nobody outperforms indices. And utterly irrational, which is again both expected and seen. This must be why Tesla continues to have a huge market value - Elon knows how to excite the LLMs. :)
Oh damn, I thought this was about real finances, turns out it’s just part of that weird “property” thing they do in NYC. I hope someone’s working on feeding their Ledger files into a structured language model…
More substantively, LLMs are for linguistic tasks. That's why I'm super, super bullish (heh) on LLMs for decoding EEG data, and incredibly bearish on their ability to accurately model a corporation's asset flow. I just don't see how the confounding variables / motivating forces would be at all linguistic. This is basically using LLMs for super-advanced arithmetic.
I guess this makes sense. Because while there should be some noise from the text translation into the internal representation of the financial data once ingested into the model, the authors purposefully re-formatted all the reports to be formatted consistently. That then should allow the model to essentially do less of the LLM magic and more plain linear regression of the financial stats. And often past performance does have an impact on future performance, up to a point.
I wonder what the results would have been with still-anonymized but non-fully standardized statements.
> A key research design choice that we make is to not provide any textual information (e.g., Management Discussion and Analysis) that typically accompanies financial statements. While textual information is easy to integrate, our primary interest lies in understanding the LLMs’ ability to analyze and synthesize purely financial numbers. We use this setup to examine several research questions
Seems like an odd test for a large language model. There are tabular models out there.
Great. Humans no longer need to cook the books and can claim plausible deniability. The only problem is the hallucination errors could go against you as well as for you.
They may claim, but there is no such plausible deniability. Not for lawyers using AI hallucinations, not for Tesla drivers crashing into things with FSD, not for tax fraud. People are ultimately held responsible and accountable for the way they use AI.
Top story on HN because we all secretly think we can be the next Jim Simons when in reality we're a few months away from posting loss porn to /r/WSB.
If standardized LLM models are used to analyze statements, expect the statements to be massaged in ways that produce more favorable results from the LLM.
I can’t wait until there’s warnings on stock market apps like cigarettes and lottery tickets. Well actually I guess there are no warnings on lotto tickets, probably for the exact same reason as why the government doesn’t protect people from being scammed by hedge funds with way more info than they have: the government needs that revenue.
But GPT-4 has been trained up to recent events, so you can't do rolling predictions using just historical data with it. If your LLM knows the future, obviously it can predict well, even if the company name is hidden.
I am disturbed to see so much enthusiasm here for "trading".
Markets matter, and some speculation is useful, but the purpose of markets is not speculation. Obviously.
If you want to make some money get trained and get a good salary. Save your money in safe assets
if you want to get super rich, be super creative, ensure going broke will only affect you (i.e. do not do this while supporting a family), and found a firm. You will likely fail, but there is a chance of super wealth and a bigger chance of a wild ride that will be good for you
Trading from the perspective of greed runs the risk of total destruction. Putting you in jail, maybe. Bankruptcy if not too unlucky. Many people out of work because of your misallocation, and if you do not care about that I'm not interested in you
The financial system is a zero sum game. (The economy in general is not) There is always someone cleverer and they likely do not care if they crush you. International finance is a snake pit
Friends, look after friends. Maximise happiness. Be honest, be ethical, be safe
>>> If you want to make some money get trained and get a good salary. Save your money in safe assets.
You mean like the good salary you get working for a trading firm?
I'm not sure what this comment you made is meant to be, but it reads like a blend of somebody who's high and a TikTok wellness influencer.
Trading is not a zero sum game in the sense you intend to suggest it is. It is 0 sum only if all participants have the same trading horizon.
The pool grows in the same way because it is linked to the economy. The markets are a variety of players with different requirements.
Most transactions occur between parties who have different horizons. Yes the hft makes money over 5s and the pension fund loses it. But the pension fund is looking at the return over the next year, so the small loss to the hft is just a cost of acquisition.
It's a long winded insult assuaged by a meme phrase at the end to not appear as one, I wouldn't treat it as anything more, what you've said is pretty apt though.
> I am disturbed to see so much enthusiasm here for "trading"
It is lots of fun. Very mathy. A nerd's dream.
Then the data. Oh the amount of data. 34 Gbit/s to get the full US options feed last I checked (someone posted that here I think). Much of the rest is kiddie stuff compared to dealing with that.
People can lament as much as they want that it drains the great minds: it is fun.
I didn't invent that game. Don't blame the players.
Arguably, the purpose of markets is part price discovery, part liquidity, and mostly to support economic growth and stability by channeling funds from savers to those who can invest them productively.
> If you want to make some money get trained and get a good salary. Save your money in safe assets
No, not all of finance is a zero-sum game. If you're connecting a buyer and seller that otherwise wouldn't have met, you provided value. Same for connecting them through time (in that you can e.g. help prevent somebody having to panic-sell their house from getting a suboptimal price).
Sure, there's speculation, nepotism, corruption; there are immoral and illegal market practices with no end, but you're making it sound like that's the entire purpose of finance, and not an undesirable byproduct.
Also, as if these only exist there, and not everywhere where there is power and money: Politics, business, even charity are not immune.
Starting a company is more ethical than trading – seriously? While there might be a general trend, can you think of no philanthropic traders and of no unethical founders (some of them in jail)?
> If you want to make some money get trained and get a good salary. Save your money in safe assets
100% agreement on the first part. But if everybody invests their money in "safe assets", there is no capital for people to start companies other than banks. Is that desirable? And who even determines what a safe asset is? What about people that manage and allocate risk? That's a function of finance again!
> Friends, look after friends. Maximise happiness. Be honest, be ethical, be safe
I agree, but this arguably has little to do with the remainder of your sweeping generalization.
The fact that the paper does not mention the word "hallucinations" in the full body text makes me think that the authors aren't fully familiar with the state of LLMs as of 2024.