
I guess it's the difference between an ensemble and a mixture of experts, i.e. aggregating outputs from one or more models trained on the same data vs. models trained on different data (GPT-4). Though GPT-4 presumably doesn't aggregate outputs; it routes each input to an expert.
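To make the aggregate-vs-route distinction concrete, here's a toy sketch in plain Python. Everything in it is made up for illustration (`ensemble_predict`, `top1_route_predict`, `toy_gate`, and the two stand-in "models"); it is not how GPT-4 or any real system is implemented, and in actual mixture-of-experts networks the gate is learned and routing typically happens per token inside the layers.

```python
from typing import Callable, List, Sequence

# A "model" here is just a function from a number to a number (toy assumption).
Model = Callable[[float], float]


def ensemble_predict(models: Sequence[Model], x: float) -> float:
    """Ensemble: run every model on the input and average the outputs."""
    outputs = [m(x) for m in models]
    return sum(outputs) / len(outputs)


def top1_route_predict(
    experts: Sequence[Model],
    gate: Callable[[float], List[float]],
    x: float,
) -> float:
    """Routing: the gate scores the experts and only the best one is run."""
    scores = gate(x)
    best = max(range(len(experts)), key=lambda i: scores[i])
    return experts[best](x)


# Two stand-in models (hypothetical, chosen only so the outputs differ).
def double(x: float) -> float:
    return 2 * x


def square(x: float) -> float:
    return x * x


def toy_gate(x: float) -> List[float]:
    """Hand-written gate: prefer expert 0 for small inputs, expert 1 otherwise."""
    return [1.0, 0.0] if x < 1 else [0.0, 1.0]


if __name__ == "__main__":
    models = [double, square]
    print(ensemble_predict(models, 3.0))              # both models run: (6 + 9) / 2
    print(top1_route_predict(models, toy_gate, 3.0))  # only `square` runs
    print(top1_route_predict(models, toy_gate, 0.5))  # only `double` runs
```

The practical difference is cost: the ensemble pays for every model on every input, while the router pays for one expert plus the (cheap) gate.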


