Ten Noteworthy AI Research Papers of 2023 (sebastianraschka.com)
128 points by danboarder 8 months ago | 19 comments



A couple of other interesting papers from 2023 worth adding, specifically on small language models.

TinyStories: https://arxiv.org/abs/2305.07759

Phi-1.5: https://arxiv.org/abs/2309.05463

Both of these papers showed that with good, concise training data, we can get better performance from smaller parameter counts.
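
To get a feel for how small these models are, here's a minimal sketch of sampling from a TinyStories checkpoint with Hugging Face transformers (the checkpoint name and tokenizer pairing are taken from the community model card, not from the paper itself; treat them as assumptions):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # ~33M-parameter TinyStories model; the model card pairs it with the GPT-Neo tokenizer.
    model = AutoModelForCausalLM.from_pretrained("roneneldan/TinyStories-33M")
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")

    inputs = tokenizer("Once upon a time there was", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_p=0.9)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Even at ~33M parameters, the continuations read as simple but grammatical children's stories, which is the paper's point.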


No, I don't think so. TinyStories has a different architecture and different assumptions for success... different enough that it doesn't compare to the others, IMHO.


Perhaps, but TinyStories was one of the first papers to show that we don't need a lot of parameters to build a coherent generative model.


FWIW, this link looks interesting, everyone:

https://github.com/neuml/txtai
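
If anyone wants to try it, here's a minimal sketch based on the txtai README (the embedding model name is the README's suggestion; treat the specifics as assumptions and check the repo docs):

    from txtai.embeddings import Embeddings

    # Build a semantic search index over a handful of sentences.
    embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
    data = [
        "US tops 5 million confirmed virus cases",
        "Maine man wins $1M from $25 lottery ticket",
    ]
    embeddings.index([(i, text, None) for i, text in enumerate(data)])

    # Returns a list of (id, score) tuples for the best match.
    print(embeddings.search("public health news", 1))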


What is “coherent”?


Generates plausible continuations of text.


Both GPT-2 and GPT-4 do that.


> we don't need a lot of parameters to build a coherent generative model.

GPT-4 likely has a trillion+ parameters across its MoE experts.


Fantastic commentary on these papers, and very good choices for AI engineering. Especially don't miss Sebastian's commentary on Pythia https://twitter.com/rasbt/status/1734920232173539796?s=12&t=... and BloombergGPT (which was a surprise to me because, to my knowledge, the model wasn't actually ever released? idk) https://x.com/rasbt/status/1738467874644128193?s=20

We covered some of these papers with their authors at NeurIPS (those we could get…); coverage here:

https://www.latent.space/p/neurips-2023-papers


I stumbled upon this site:

https://openreview.net/group?id=NeurIPS.cc/2023/Conference#t...

Not sure how this works, exactly.


People run conferences through OpenReview: you sign up as an organizer, people submit papers, and reviewers review them. Then a final decision is made about acceptance. Much of it is public and queryable via their API (see the sketch at the end of this comment).

If you're a non-expert, though, I think OpenReview is basically useless for you. The signal-to-noise ratio is terrible. It's just like arXiv.

Just because a paper got good reviews doesn't make it a good paper, and just because it got bad reviews doesn't mean it's a bad paper. Conference reviews are pretty random: a lot of good work is rejected, and a mountain of trash is accepted.

At the moment we're massively short on competent reviewers, so the level of commentary on OpenReview is... "oh look, my racist uncle commented on cnn.com again"-level abysmal.
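
For the curious, here's a minimal sketch of pulling NeurIPS 2023 notes with the openreview-py client (the venue id follows the pattern in the URL above; the exact calls and the v2 endpoint are my assumptions, so check the OpenReview docs):

    import openreview

    # Anonymous read-only client; NeurIPS 2023 appears to live on the newer v2 API.
    client = openreview.api.OpenReviewClient(baseurl="https://api2.openreview.net")

    # Fetch publicly visible submissions for the venue.
    notes = client.get_all_notes(content={"venueid": "NeurIPS.cc/2023/Conference"})
    print(len(notes), "visible submissions")
    for note in notes[:3]:
        # In the v2 API, content fields are wrapped as {"value": ...}.
        print(note.content["title"]["value"])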


+1 to this. I've been moving away from ML conferences because the review process is so bad; it's an incredible amount of wasted work responding to reviews and resubmitting after getting a bad crop of reviewers. Honestly, it also makes me despair for the quality of the conferences themselves.


Sadly, if you're an academic, it's a game you must play. My students need papers so that they can get jobs, and I need them to get grants.

But yeah. We all waste an incredible amount of time resubmitting to another conference for no reason at all.


For my group, there are journals and a whole ecosystem outside of the ML conference space related to the actual application area... We can go to a top-tier journal in the application domain and get better reviewers.


This is a popular site used by various conferences for peer review. It's useful for getting expert opinions on papers, but probably too noisy for someone not involved in research.


It’s also useful for getting hot takes from reviewers who didn’t read the paper carefully. Though even in those cases, the author responses can sometimes be pretty illuminating.


What happened with BloombergGPT? The paper was published in March, and as far as I know, they haven't launched anything.


AdaptLLM appeared to be just as capable.


Pretty sure it was never intended for external consumption.



