

VHRanger 14 days ago | on: T5Gemma 2: The next generation of encoder-decoder ...

In general, encoder+decoder models are much more efficient at inference than decoder-only models because they run over the entire input all at once (which leverages parallel compute more effectively). The issue is that they're generally harder to train (they need input/output pairs as a training dataset) and don't naturally generalize as well.

GaggiX 14 days ago

> In general, encoder+decoder models are much more efficient at inference than decoder-only models because they run over the entire input all at once (which leverages parallel compute more effectively).

Decoder-only models also do this; the only difference is that they use masked (causal) attention, so each position attends only to itself and earlier positions.
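
A minimal sketch of that point, in plain Python with made-up toy vectors and no learned projections (the `attention` function and its `causal` flag are hypothetical names for illustration, not any library's API). Both variants process the whole prompt in one pass; the mask only changes which positions each token is allowed to see:

```python
import math

def attention(x, causal):
    """Toy single-head self-attention over a whole sequence in one pass.

    x is a list of token vectors; queries, keys and values are all x itself.
    The loop over i stands in for one batched matrix product: no position
    waits on another position's output, so all rows could run in parallel.
    """
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        # Encoder-style: position i sees every position.
        # Causal mask (decoder-only): position i sees only positions 0..i.
        visible = range(i + 1) if causal else range(n)
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            for j in visible
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * x[j][c] for w, j in zip(weights, visible)) / total
            for c in range(d)
        ])
    return out

prompt = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edited = prompt[:-1] + [[5.0, -2.0]]  # change only the last token

# With the causal mask, the first two outputs ignore the edited last token.
causal_same = attention(prompt, True)[:2] == attention(edited, True)[:2]
# Without the mask, every output depends on every token, so they change.
bidir_same = attention(prompt, False)[:2] == attention(edited, False)[:2]
```

Here `causal_same` comes out true and `bidir_same` false: the computation over the input is parallel either way, and only the visibility pattern differs.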
