
Interestingly the Gemma 3 docs say: https://ai.google.dev/gemma/docs/core/model_card_3#:~:text=T...

> Total output context up to 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size per request, subtracting the request input tokens

I don't know how to get it to output anything of that length, though.
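
One knob worth checking, in case it's just the default generation cap: ollama's generate/chat APIs accept a num_predict option (and num_ctx for the context window). A minimal sketch with the Python client — the model tag and values are guesses, not something I've verified gets anywhere near 128K:

    import ollama

    # num_predict caps generated tokens (-1 = no cap, -2 = fill the context);
    # num_ctx sets the context window. All values here are assumptions.
    response = ollama.generate(
        model="gemma3:12b",
        prompt="Write a detailed, 5,000-word report on ...",
        options={
            "num_ctx": 32768,    # room for the prompt plus a long output
            "num_predict": -1,   # don't stop at the default token cap
        },
    )
    print(response["response"])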



Thank you for the insights and useful links

Will keep experimenting, and will also try mistral3.1

edit: just tried mistral3.1 and the quality of the output is very good, at least compared to the other models I tried (llama2:7b-chat, llama2:latest, gemma3:12b, qwq and deepseek-r1:14b)

After doing some research, it seems that, because of their training sets, most models simply aren't trained to produce long outputs, so even if they technically could, they won't. Getting long outputs might require building my own training dataset and then doing some fine-tuning. Apparently both the models and ollama itself have some safeguards against rambling and repetition.


You can probably find some long-form tuned models on HF. I've had decent results with QwQ-32B (which I can run on my desktop) and Mistral Large (which I have to run on my server). Generating and refining an outline before writing the whole piece can help, and you can also split the piece up into multiple outputs (working a paragraph or two at a time, for instance). So far I've found it to be a tough process, with mixed results.
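
Roughly, that outline-then-expand loop looks like this with the ollama Python client (the model tag, the prompts, and the naive line-based outline split are all placeholders, not a tested recipe):

    import ollama

    MODEL = "qwq"  # placeholder; any long-form-capable tag

    def ask(prompt: str) -> str:
        # one chat turn, no streaming, for simplicity
        reply = ollama.chat(model=MODEL,
                            messages=[{"role": "user", "content": prompt}])
        return reply["message"]["content"]

    topic = "a history of transatlantic telegraph cables"

    # 1. Generate and refine an outline before writing anything.
    outline = ask(f"Write a section-by-section outline for a long article on {topic}.")
    outline = ask(f"Refine this outline; merge overlaps, fill gaps:\n\n{outline}")

    # 2. Expand one section per request so no single output has to be long.
    sections = [ln for ln in outline.splitlines() if ln.strip()]  # naive split
    article = []
    for section in sections:
        article.append(ask(
            f"Article topic: {topic}\nFull outline:\n{outline}\n\n"
            f"Write only the section '{section}' in a few paragraphs; "
            f"don't recap other sections."))

    print("\n\n".join(article))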


Thank you, will try out your suggestions

Have you used something like a director model to supervise the output? If so, could you comment on the effectiveness of it and potentially any tips?
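
(What I mean is roughly a loop like this, where a second model critiques each draft and the writer revises — completely untested on my end, model tags and prompts made up:)

    import ollama

    WRITER, DIRECTOR = "mistral3.1", "qwq"  # placeholder tags

    def ask(model: str, prompt: str) -> str:
        reply = ollama.chat(model=model,
                            messages=[{"role": "user", "content": prompt}])
        return reply["message"]["content"]

    draft = ask(WRITER, "Write the opening chapter of ...")
    for _ in range(3):  # a few supervision rounds
        notes = ask(DIRECTOR,
                    f"Critique this draft for pacing, repetition, and rambling. "
                    f"Say 'no issues' if it's fine.\n\n{draft}")
        if "no issues" in notes.lower():  # crude stop condition, sketch only
            break
        draft = ask(WRITER,
                    f"Revise the draft using these notes:\n\n{notes}\n\n"
                    f"Draft:\n\n{draft}")
    print(draft)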


Nope, sounds neat though. There's so much to keep up with in this space.



