This model is a LOT of fun. It's absolutely tiny - just a 241MB download - and screamingly fast, and hallucinates wildly about almost everything.
Here's one of dozens of results I got for "Generate an SVG of a pelican riding a bicycle". For this one it decided to write a poem:
+-----------------------+
| Pelican Riding Bike |
+-----------------------+
| This is the cat! |
| He's got big wings and a happy tail. |
| He loves to ride his bike! |
+-----------------------+
| Bike lights are shining bright. |
| He's got a shiny top, too! |
| He's ready for adventure! |
+-----------------------+
This reminds me of my recent interactions with ChatGPT, where I gave in to its repeated offer to draw me an electronics diagram. The result was absolute garbage. During the subsequent conversation it kept offering to include any new insights in the diagram, entirely oblivious to its own incompetence.
I see you are using Ollama's GGUFs. By default it will download the Q4_0 quantization. Try `gemma3:270m-it-bf16` instead, or you can also use the Unsloth GGUFs: `hf.co/unsloth/gemma-3-270m-it-GGUF:16`
Oh fantastic, it worked! I was actually trying to see if we can set these automatically within LM Studio (Ollama, for example, has params and a template). Not sure if you know how that can be done? :)
We uploaded gemma3:270m-it-q8_0 and gemma3:270m-it-fp16 late last night which have better results. The q4_0 is the QAT model, but we're still looking at it as there are some issues.
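For anyone who wants to compare the quantizations mentioned here side by side, this is a minimal sketch. It assumes a local Ollama server on its default port (11434), that the tags have already been pulled, and that the non-streaming `/api/generate` endpoint is used. The helper names (`build_request`, `generate`, `compare`) are mine, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt):
    """Build a non-streaming generate request for one model tag."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=120):
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def compare(tags, prompt):
    """Run the same prompt against several tags and collect the outputs."""
    return {tag: generate(tag, prompt) for tag in tags}
```

Calling `compare(["gemma3:270m-it-q8_0", "gemma3:270m-it-bf16"], "Generate an SVG of a pelican riding a bicycle")` should return one output per tag, which makes it easier to see how much of the weirdness comes from the quantization rather than the model itself.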
I don't really gender LLMs in my head in general. I guess Gemma is a female name. I only gendered it in the joke because I think it makes it funnier, especially since it's just "a little guy". I know they are giving gendered names to these models now, but I think it's a bit weird to gender them when interacting with them.
Fine-tuning for specific tasks. I'm hoping to see some good examples of that soon - the blog entry mentions things like structured text extraction, so maybe something like "turn this text about an event into an iCal document" might work?
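To make the iCal idea concrete, here is a rough sketch of how training pairs for such a fine-tune could be generated: render known event fields into a minimal iCalendar document and pair it with the source text. The event dictionary keys, the `PRODID` value, and the prompt/completion JSONL shape are all assumptions for illustration, not a format any particular trainer requires.

```python
import json
from datetime import datetime, timezone


def _escape(text):
    """Escape characters that are special in iCalendar TEXT values."""
    return (
        text.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def _stamp(dt):
    """Format a timezone-aware datetime as a UTC iCalendar timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def to_ical(event):
    """Render one event as a minimal VCALENDAR (no line folding, no timezones)."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//event-extraction-sketch//EN",  # hypothetical identifier
        "BEGIN:VEVENT",
        f"UID:{event['uid']}",
        f"DTSTAMP:{_stamp(event['created'])}",
        f"DTSTART:{_stamp(event['start'])}",
        f"DTEND:{_stamp(event['end'])}",
        f"SUMMARY:{_escape(event['summary'])}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    return "\r\n".join(lines) + "\r\n"  # iCalendar lines end with CRLF


def make_example(source_text, event):
    """Return one JSONL row pairing free text with its target iCal output."""
    return json.dumps({"prompt": source_text, "completion": to_ical(event)})
```

The same `to_ical` function can double as a checker at inference time: if the tuned model's output doesn't have the expected `BEGIN`/`END` structure, reject it instead of trusting it.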
Fine tuning messes with instruction following and RL'd behavior. I think this is mostly going to be useful for high volume pipelines doing some sort of mundane extraction or transformation.
I feel like the blog post and the GP comment do a good job of explaining how it's built to be a small model that is easily fine-tuned for narrow tasks, rather than used for general tasks out of the box. The latter is guaranteed to hallucinate heavily at this size, but that doesn't mean every specific task it's fine-tuned for would. Some examples given were fine-tuning it to efficiently and quickly route a query to the right place to actually be handled, or tuning it to do sentiment analysis of content.
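The routing use case could look something like the sketch below. It treats the tuned model as an opaque `classify` callable that returns text, accepts its answer only if it matches a known label, and otherwise falls back to a default handler. Everything here (the function name, the label set, the handlers) is hypothetical; the point is that a closed label set lets you catch the cases where the model wanders off.

```python
def route(query, classify, handlers, fallback):
    """Dispatch a query based on a model-produced label.

    classify: callable taking the query and returning raw model text
              (a stand-in for a fine-tuned tiny model).
    handlers: dict mapping lowercase label -> callable taking the query.
    fallback: callable used when the model output is not a known label.
    """
    raw = classify(query)
    # Normalize lightly: tiny models often add whitespace or trailing punctuation.
    label = raw.strip().strip(".!\"'").lower()
    handler = handlers.get(label, fallback)
    return handler(query)
```

With a label set this small, anything that isn't an exact match (a poem about a cat, for instance) just goes to the fallback, so hallucinated output costs a slower path rather than a wrong answer.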
An easily fine tunable tiny model might actually be one of the better uses of local LLMs I've seen yet. Rather than try to be a small model that's great at everything it's a tiny model you can quickly tune to do one specific thing decently, extremely fast, and locally on pretty much anything.
I was looking at the demo and reading the bedtime story it generated, and even there it confused the sprite with the cat. It switched subjects abruptly, which made for a confusing paragraph. What's the point of this model?
There are a bunch more attempts in this Gist, some of which do at least include an SVG tag, albeit one that doesn't render anything: https://gist.github.com/simonw/25e7b7afd6a63a2f15db48b3a51ec...
I'm looking forward to seeing people fine-tune this in a way that produces useful output for selected tasks, which should absolutely be feasible.