This post is quite interesting, and I will have to re-read and ponder it some more, but there is one obvious flaw in the analysis. By analyzing the past returns of current S&P500 companies, the author introduces survivorship bias: companies which have done consistently well (in terms of market capitalization) over the analysis period are likely to be over-represented in the current index. To correct for this, the author should re-run the analysis using the S&P500 constituents from the beginning of the period instead of the end.
This is the same problem a study ran into some time ago when it appeared to demonstrate that the portfolio managers with the worst returns were the best investments: it analysed the returns of a number of managers over a period and found that the ones with the worst returns at the beginning had the best returns at the end. The problem is that all the consistently mediocre or bad managers were discarded, since they did not survive until the end of the period.
Toy model: A fund can do well or badly at the start of the period, and it can do well or badly at the end. Both happen completely at random. A fund that does badly at both ends is closed and never heard from again.
If you analyse funds in this situation, you will find that every fund that does badly at the start of the period does well at the end. (Because the ones that do badly at the end too are all gone.) You might be tempted to think up clever explanations about how fund managers with bad initial results make extra effort, or how stocks that do badly tend to rebound later as investors recognize their true value, or something -- but that would be a mistake, because in this situation the only thing leading to the relationship between early and late performance is the fact that the "bad at both ends" funds aren't represented in the analysis.
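For concreteness, here is a quick simulation of that toy model (my own sketch; the fund count and coin-flip probabilities are arbitrary). Funds that do badly at both ends are dropped, and among the survivors every "bad early" fund ends up looking "good late":

```python
import random

random.seed(0)
N = 100_000

# Each fund gets an independent early and late outcome (True = did well).
funds = [(random.random() < 0.5, random.random() < 0.5) for _ in range(N)]

# Survivorship filter: funds that did badly at both ends are closed and vanish.
survivors = [(early, late) for (early, late) in funds if early or late]

bad_early = [(early, late) for (early, late) in survivors if not early]
frac_good_late = sum(late for (_, late) in bad_early) / len(bad_early)

print(f"survivors: {len(survivors)} of {N}")
print(f"P(good late | bad early, survived) = {frac_good_late:.3f}")  # always 1.0
```

The conditional probability comes out as exactly 1.0, even though every outcome was a fair coin flip, purely because the "bad at both ends" funds are missing from the sample.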
I think he means that lucky streaks can last longer than the career span of many managers. You never see those managers losing because they don't stay in the game long enough for the devastating reversion to the mean.
It is quite a leap to go from 'not Gaussian' to 'not random', as is done here. All that has been falsified, as far as I can tell, is a very simple model of a random walk with normally distributed disturbances.
It would be interesting to see how much better the fit becomes if higher moments, in particular kurtosis ('fat tails'), are included.
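As a rough illustration of what fat tails mean for return data (my own sketch, not the article's code; the distributions and scales are just illustrative): Student-t returns scaled to the same standard deviation as Gaussian returns show strongly positive excess kurtosis, which is exactly what a test calibrated only to the Gaussian case can mistake for non-randomness.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 250_000

# Gaussian daily log-returns vs Student-t returns scaled to the same std dev.
gaussian = rng.normal(0.0, 0.01, size=n)
fat_tailed = rng.standard_t(df=5, size=n) * (0.01 / np.sqrt(5 / 3))

def excess_kurtosis(x):
    z = (x - x.mean()) / x.std()
    return (z ** 4).mean() - 3.0

print(f"Gaussian  excess kurtosis: {excess_kurtosis(gaussian):6.2f}")   # ~0
print(f"Student-t excess kurtosis: {excess_kurtosis(fat_tailed):6.2f}")  # ~6
```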
The article clearly states that the test extends to many forms of randomness beyond Gaussian:
"Nevertheless, the desired effect of stochastic volatility namely, fatter tailed distributions ..."
"... we want a test for the random walk hypothesis which passes (it concludes the market is random) even if the returns demonstrate heteroskedastic increments and large drifts. Why? Because both of these properties are widely observed in most historical asset price data (just ask Nassim Taleb) and neither invalidate the fundamental principle underpinning the random walk hypothesis, namely the Markov property (unforecastibility of future asset prices given past asset prices)"
His model is not heteroskedastic. The log-error terms are i.i.d., so they actually all have the same variance. The distribution they are sampled from is a normal variance mixture.
Interesting. I'm no statistician, but the blog post states that the authors of the original paper (Lo and MacKinlay) claim the test is heteroskedasticity-consistent. So what are you saying? Are you saying the original paper is wrong? That his example was wrong? (This seems more likely.) And if the example is wrong, is it wrong to say it is heteroskedastic AND wrong to say it is a stochastic volatility model? Or just that it is heteroskedastic? As far as I can tell, stochastic volatility simply means the variance itself is randomly distributed, which looks consistent with his example? Just trying to clarify. Thanks.
I haven't read the original paper, so I can't comment on that.
The model on the webpage, however, is not heteroskedastic (literally "unequal variance"), because all the log-increments are i.i.d. It could legitimately be considered a geometric Brownian motion with stochastic volatility, because the log-error is indeed normally distributed with its variance picked from some stochastic distribution, in this case a normal distribution. That term, however, is normally used for models in which the volatility has more structure (e.g. the ARCH or GARCH models mentioned on the page).
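To make the distinction concrete, here is a small sketch of the two cases (my own construction, not the article's code; all parameter values are purely illustrative). Both generators produce fat-tailed returns, but only the second one has conditional variance that depends on past shocks, i.e. genuine heteroskedasticity with volatility clustering:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# 1) i.i.d. normal variance mixture: each increment's variance is drawn
#    independently, so the increments are still i.i.d. and show no clustering.
sigma_mix = np.abs(rng.normal(0.01, 0.005, size=n))   # fresh sigma every step
returns_mix = rng.normal(0.0, 1.0, size=n) * sigma_mix

# 2) GARCH(1,1)-style volatility: today's variance depends on yesterday's
#    shock and variance, giving genuine conditional heteroskedasticity.
omega, alpha, beta = 1e-6, 0.1, 0.85
var = np.empty(n)
returns_garch = np.empty(n)
var[0] = omega / (1 - alpha - beta)                    # unconditional variance
for t in range(n):
    returns_garch[t] = rng.normal(0.0, np.sqrt(var[t]))
    if t + 1 < n:
        var[t + 1] = omega + alpha * returns_garch[t] ** 2 + beta * var[t]

# Volatility clustering shows up as autocorrelation in squared returns.
def lag1_autocorr(x):
    x = x - x.mean()
    return (x[:-1] * x[1:]).sum() / (x ** 2).sum()

print(f"lag-1 autocorr of squared returns, mixture: {lag1_autocorr(returns_mix ** 2):.3f}")
print(f"lag-1 autocorr of squared returns, GARCH:   {lag1_autocorr(returns_garch ** 2):.3f}")
```

The mixture's squared returns come out essentially uncorrelated, while the GARCH-style series shows clearly positive autocorrelation, which is the structure people usually have in mind when they call a model heteroskedastic.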
" weak form of the efficient market hypothesis which states that: future prices cannot be predicted by analyzing prices from the past ..."
No. That is the consequence of the efficient market hypothesis being true. The hypothesis itself is subtly different, though most people miss it:
All information from past prices is in current prices.
That is a better formulation.
I spent five years studying it, and I think it is robust if you remember that it takes time for information to be consumed. That is why HFT works: it acts before the information can be processed.
Computer scientists do not stand on the shoulders of giants; they slay them, grind them up, and then do all the work again.
> You want to make your way in the CS field? Simple. Calculate rough time of amnesia (hell, 10 years is plenty, probably 10 months is plenty), go to the dusty archives, dig out something fun, and go for it. It's worked for many people, and it can work for you. -- Ron Minnich