NIST's TREC workshop series still follows Cyril Cleverdon's methodology from the 1960s (the "Cranfield paradigm"), and more could surely be done on the evaluation front:
- systematically addressing sampling error;
- using more than 50 queries;
- collecting more (or all) relevance judgments (QRELs);
- carrying out full evaluation instead of system pooling;
- studying IR in languages other than English (this has been picked up by CLEF in Europe and NTCIR in Japan);
- devising metrics that take energy efficiency into account;
- ...
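
To illustrate the sampling-error point: the queries in a test collection are only a sample, so a difference in mean score between two systems carries uncertainty. Below is a minimal sketch, assuming per-query scores (e.g. average precision) have already been computed for both systems on the same queries. The function name `paired_bootstrap_ci`, its parameters, and the sample scores are made up for illustration and are not part of any TREC tooling. It uses a paired percentile bootstrap, which is one common choice among several.

```python
import random
from statistics import mean


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-query difference (A - B).

    Returns (observed_mean_difference, lower_bound, upper_bound).
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equal-length, non-empty score lists")
    rng = random.Random(seed)
    # Pair the systems query by query, then resample queries with replacement.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), lower, upper


if __name__ == "__main__":
    # Hypothetical per-query average precision for two systems (not real data).
    system_a = [0.42, 0.10, 0.77, 0.35, 0.58, 0.21, 0.64, 0.49]
    system_b = [0.40, 0.15, 0.70, 0.30, 0.55, 0.25, 0.60, 0.41]
    observed, lower, upper = paired_bootstrap_ci(system_a, system_b)
    print(f"mean difference {observed:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

If the interval comfortably includes zero, the observed difference may just reflect which queries happened to be sampled. With more queries the interval generally narrows, which is one argument for going beyond 50.
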
At the same time, we should be very grateful to NIST/TREC for running an open international benchmark every year, which has moved the field forward considerably over the last 25 years.