You also have fine-tuned models for specific tasks that may see very similar inputs producing a variety of outputs. Think of an LLM trained to pull out specific types of information, no matter where it's stored within the file. E.g. "find the date of the shipment for product# 5432" and then you pass in 10k JSON documents with a similar shape.
Yeah, but I was under the impression that for the same prompt, implementations already share the KV cache. This area is so new that these obvious ideas might not be implemented as widely as I thought.
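To make the idea concrete, here's a toy sketch of prefix-based KV-cache sharing. This is not a real inference engine (engines like vLLM do this over token blocks as "automatic prefix caching"); the `PrefixKVCache` class, its methods, and the simulated KV states are all hypothetical stand-ins just to show why a shared prompt prefix only needs prefilling once across many documents:

```python
# Toy sketch of prefix-based KV-cache sharing (hypothetical names;
# the "KV state" here is a fake list, not real attention tensors).
class PrefixKVCache:
    def __init__(self):
        self.cache = {}          # prefix tokens (tuple) -> simulated KV state
        self.compute_calls = 0   # count of expensive prefill passes

    def _prefill(self, tokens):
        # Stand-in for running the model's prefill pass over `tokens`.
        self.compute_calls += 1
        return [f"kv({t})" for t in tokens]

    def get_kv(self, prompt_tokens, doc_tokens):
        # Reuse the cached KV state for the shared prompt prefix,
        # then prefill only the per-document suffix.
        key = tuple(prompt_tokens)
        if key not in self.cache:
            self.cache[key] = self._prefill(prompt_tokens)
        return self.cache[key] + self._prefill(doc_tokens)

cache = PrefixKVCache()
prompt = ["find", "the", "shipment", "date", ":"]
for doc in (["doc1"], ["doc2"], ["doc3"]):
    kv = cache.get_kv(prompt, doc)

# Shared prompt prefilled once, plus one suffix pass per doc -> 4 passes
print(cache.compute_calls)  # → 4
```

In a real engine the cache key is the token prefix and the cached value is the attention key/value tensors, but the accounting is the same: the shared-prompt prefill cost is paid once, and each of the 10k documents only pays for its own suffix.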