
> I laughed at the renaming of terms part because it is so true (classic example is RSS/SSR/ESS/SSE). Taking a course in the stats department alongside an econometrics course was bound to confuse any student.

Come on, this is silly. Calling things RSS vs. SSR might confuse an undergrad, but I've taken both classes (stats intro to linear regression + econ intro to linear regression) at the same time and saw barely a difference in the material. The stats class had a greater emphasis on finite-sample properties under normality and the econ class on asymptotics. Not a huge change, and the differences were complementary.

We both know that the "bizarre obsession with [generalized] method of moments" is because moments come out of models with rational agents, not distributions.




I've taken PhD Econometrics that touched zero data and was 100% theoretical over the 9 months, so I don't think we are talking about the same thing. I don't like spewing out economically technical nonsense on HN, so I wasn't going to go into the method of moments part of that comment as it applies to the current state of econometric theory.

In econometric theory, method of moments comes up in the largest way in the form of generalized method of moments (GMM). GMM is in competition with maximum likelihood: the point is that GMM doesn't force you to make arbitrary assumptions about the true probability distribution of the data purely for the implementation of the model. This is obviously attractive, because then the results of the model won't be jeopardized just because one of those assumptions was false. In other words, GMM provides a way to estimate the parameters of a model without making assumptions about the population.

Oh, and the topic of rational agents is not relevant here. This is a purely statistical/philosophical argument.

But there goes the economist in me again... I'm sorry.


I've taken just as many stats classes that don't touch data; I don't think either of us wants to argue pedagogy on HN (fuck, I didn't even want to spell it and it's likely I didn't), but I don't think there's a huge difference in how econ and stats departments teach the same material or use terminology. (There are big differences in the material selected, obviously.)

This statement can't be true: "GMM provides a way to estimate the parameters of a model without making assumptions about the population." You need assumptions, just different assumptions. Frequently those assumptions involve agent rationality, but not always (after all, MLE is a special case of GMM).


"MLE is a special case of GMM"

You have it backwards. MLE is actually a special case of GMM.

I'm done here.


You seem to have just repeated the quote? Presumably you mean "GMM is a special case of MLE"?


Mistake is on my side, I apologize. I read what the poster said backwards myself. :) The repeated quote is a true statement: MLE is a special case of GMM.


GMM requires a weights matrix, which implicitly does the same thing as distributional assumptions do in MLE (and you can replicate most MLE models in GMM by an appropriate choice of moment conditions and weights matrix).
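
A minimal sketch of the "replicate MLE inside GMM" point (my own toy example, not from the thread): take the score equations of the normal log-likelihood as the moment conditions, and the GMM estimator recovers the usual normal MLE. With as many moments as parameters, the weights matrix happens not to matter; it starts doing the work described above once you have extra conditions.

    import numpy as np
    from scipy.optimize import minimize

    x = np.random.default_rng(0).normal(loc=1.0, scale=2.0, size=5_000)

    def score_moments(theta):
        # Sample analogues of the normal score equations:
        #   E[x - mu] = 0  and  E[(x - mu)^2 - sigma^2] = 0
        mu, sigma2 = theta
        return np.array([np.mean(x - mu), np.mean((x - mu) ** 2 - sigma2)])

    def gmm_objective(theta, W=np.eye(2)):
        g = score_moments(theta)
        return g @ W @ g  # just-identified, so W is immaterial here

    theta_hat = minimize(gmm_objective, x0=np.array([0.0, 1.0])).x
    # theta_hat is (numerically) the normal MLE: (x.mean(), x.var(ddof=0))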


As a Bayesian, this sounds very intriguing. Could you expand upon GMM/provide a good reference?


Good reference: For most things in econometrics, the publicly available lecture notes by Jeff Wooldridge and Guido Imbens are excellent. I don't recall their GMM notes specifically, but that'd be a good place to start (just google for it).

Super-quick explanation: Write down a model, and a bunch of conditions that should be true at the correct parameter values.

As a trivial example, the conditions might be that the errors/residuals are orthogonal to the explanatory variables. Put mathematically,

E[X' * epsilon] = 0

where X is the matrix of explanatory data, and epsilon is your vector of errors. Since the errors are a function of the parameters (beta), you can find the optimal parameter values by solving your moment conditions.

You frequently have more conditions than parameters, so the conditions can't all be true at once. Then the computer tries to get the conditions as close to true as possible... where closeness is defined by how you weigh the importance of each individual condition.

Ideally the conditions arise transparently from the model. Executed poorly, it can seem ad-hoc.
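
A tiny numeric sketch of the above (my own code with made-up data; an illustration rather than a reference implementation):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    n = 1_000
    X = np.column_stack([np.ones(n), rng.normal(size=n)])  # explanatory data
    beta_true = np.array([1.0, 2.0])
    y = X @ beta_true + rng.standard_t(df=3, size=n)        # deliberately non-normal errors

    def moments(beta):
        # Sample analogue of the condition E[X' * epsilon] = 0
        eps = y - X @ beta
        return X.T @ eps / n

    def gmm_objective(beta, W):
        g = moments(beta)
        return g @ W @ g  # "closeness to zero" of the conditions, weighted by W

    W = np.eye(X.shape[1])  # with as many conditions as parameters, W barely matters
    beta_hat = minimize(gmm_objective, x0=np.zeros(2), args=(W,)).x
    # beta_hat matches OLS here; with more conditions than parameters (say, extra
    # instruments), the choice of W starts to matter, as described above.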

A quick Google search should give better explanations than my response in the comments :)


GMM's popularity among econometricians isn't motivated by anything related to rational agents. GMM is popular in econometrics because it is a natural expression of IV estimators.

IV regressions aren't popular among statisticians or machine learning people, so this isn't an issue for them.

Even in structural econometrics (e.g. BLP), they are using moments to deal with endogeneity... the rationality angle is a red herring.
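
For anyone who hasn't seen the IV connection, a minimal sketch (my own toy example): the just-identified IV estimator is exactly the GMM estimator built from the moment condition E[Z' * epsilon] = 0, where Z holds the instruments.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    z = rng.normal(size=n)                 # instrument
    u = rng.normal(size=n)                 # unobserved confounder
    x = 0.8 * z + u + rng.normal(size=n)   # endogenous regressor (correlated with u)
    y = 2.0 * x + u + rng.normal(size=n)   # structural equation, true slope = 2

    Z = np.column_stack([np.ones(n), z])
    X = np.column_stack([np.ones(n), x])

    # Solve the sample analogue of E[Z' * (y - X @ beta)] = 0
    beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
    # beta_iv[1] is close to 2; OLS on X alone is biased upward by the confounder u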


I'm mostly familiar with macro and finance applications; I'll take your word for it that it's not strictly coming from agent rationality in other subfields. Perhaps I should have said "unforecastability," which would have covered natural experiments and IV as well.


Oops. I wasn't thinking about macro when I disagreed with your earlier claim. Your explanation sounds consistent with my distant recollection of the field (though I'd also take your word for it.)

I shouldn't have disagreed so strongly in the first place.


Maybe the fact that moment conditions are often first order conditions for some agent's optimization problem is why rationality was mentioned, but I'm just speculating.


That may have been why it was mentioned, but within micro applications, that is getting the relationship flipped.

Micro models built from agents' optimization problems are more frequently estimated using MLE (John Rust's GMC bus paper being a canonical example, though this is still true in the current literature, e.g. Nevo, or Bajari and Hong).

Micro models estimated from aggregated data frequently lack agent-level optimization, and those are the models more frequently estimated with GMM (examples here would include the hundreds of papers based on Berry, Levinsohn and Pakes).

So, this explanation doesn't seem to hold in micro contexts.

Though, as the previous commenter pointed out, macro is quite a bit different, and your explanation is probably correct there.



