
Well, it'll always depend on the length of the meeting being summarized. But they are using Mistral, which clocks in at a 32k context. With an average of 150 spoken words per minute and 1 token ~= 1 word (which is rather pessimistic), that's about 3h30m of meeting. So I guess that's okay?
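The back-of-envelope arithmetic above can be sketched directly; the 150 words/minute and 1 token/word figures are the parent's assumptions, not anything from Mistral's docs:

```python
# Assumed values from the comment above, not from any official source.
context_tokens = 32_000      # Mistral's advertised context length
words_per_minute = 150       # rough speaking rate
tokens_per_word = 1.0        # simplifying assumption; real tokenizers emit more

minutes = context_tokens / (words_per_minute * tokens_per_word)
hours, rem = divmod(minutes, 60)
print(f"{int(hours)}h{int(rem)}m")  # roughly 3h33m, i.e. ~3h30m of meeting
```

With a more realistic ~1.3 tokens per word, the budget drops to around 2h44m, which is why the 1:1 assumption is the optimistic bound on meeting length.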



  mistral which clocks at 32k context
I may be wrong, but my understanding was/is:

- Mistral can handle 32k context, but only using sliding window attention. So a given token can't directly attend to all 32k tokens at once; information from outside the window only propagates indirectly through stacked layers.

- Mixtral (note the 'x') 8x7B can handle 32k context without resorting to sliding window attention.
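The distinction in the two bullets above comes down to the attention mask. A minimal NumPy sketch of full causal attention versus a sliding-window causal mask (the window size here is illustrative, not Mistral's actual config):

```python
import numpy as np

def causal_mask(n: int) -> np.ndarray:
    # Full causal attention: token i may attend to every token 0..i.
    return np.tril(np.ones((n, n), dtype=bool))

def sliding_window_mask(n: int, window: int) -> np.ndarray:
    # Sliding-window causal attention: token i may attend only to the
    # last `window` tokens, i.e. positions i-window+1 .. i.
    band = np.triu(np.ones((n, n), dtype=bool), -(window - 1))
    return causal_mask(n) & band

m = sliding_window_mask(6, window=3)
# Row 5 is True only at positions 3, 4, 5: the final token sees just
# its local window in a single layer, unlike the full causal mask.
```

This is why "handles 32k context" can mean two different things: with the banded mask, long-range dependencies are only captured across layers, not within one attention step.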

I wonder whether Mistral would do a better job summarizing a long (32k token) doc all at once, or using recursive summarization.
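For reference, recursive summarization here means the map-reduce-style approach: summarize chunks, concatenate the summaries, and repeat until the text fits. A minimal sketch, where `summarize()` is a placeholder standing in for an actual model call and the chunk size is arbitrary:

```python
def summarize(text: str) -> str:
    # Placeholder: a real implementation would call the LLM here.
    return text[:200]

def recursive_summarize(text: str, chunk_chars: int = 8000) -> str:
    # Base case: the text already fits in one model call.
    if len(text) <= chunk_chars:
        return summarize(text)
    # Map: summarize each chunk independently.
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    partials = " ".join(summarize(c) for c in chunks)
    # Reduce: recurse on the concatenated partial summaries.
    return recursive_summarize(partials, chunk_chars)
```

The trade-off being asked about: recursion keeps each call short (where sliding-window models are strongest) but can lose cross-chunk context, while a single 32k-token pass preserves everything but leans on long-range attention actually working.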


Hmm. Interesting question. We had no issues using Mixtral 8x7B for this, which perhaps reinforces your point. We use fine-tuned Mistral-7B instances, but not for long-context work.

Maybe a neat eval to try.



