I've been trying both deepseek-r1:8b and deepseek-r1:32b using ollama on a local desktop machine.
Trying to get it to generate some pretty simple verilog code with extensive prompting.
It seems really bad?
Like, specify what the module interface should be in the prompt, and it ignores it and makes up something else. Utterly rubbish code beyond that. Specify a calculation to be performed, yet it calculates something very different.
What am I missing? Why is everyone so excited? It seems significantly worse to me than llama. Both o1-mini and Claude Haiku are imperfect, sure, but way ahead: both follow the same prompt and get the interface and calculation as specified. Am I doing it all wrong somehow (more than likely)?
After fixing up my open-webui install I tried "testing 1 2 3, testing. Respond with ok if you see this." Deepseek-r1:8b started trying to prove a number theory result.
Is there a chance this thing is heavily optimised for benchmarking not actual use?
Just to confirm, Ollama's naming is very confusing on this. Only the `deepseek-r1:671b` model on Ollama is actually deepseek-r1. The other smaller quants are a distilled version based on llama.
Which, according to the Ollama team, seems to be on purpose, to avoid people accidentally downloading the proper version. Verbatim quote from Ollama:
> Probably better for them to misunderstand and run 7b than run 671b. [...] if you don't like how things are done on Ollama, you can run your own object registry, like HF does.
It’s definitely on purpose - but if the purpose was to help users make good choices, they could actually give information - and explain what is what - instead of hiding it.
I think if you find Ollama useful, use it regardless of what others say. I did give it a try, but found it lands in a weird place of "Meant for developers, marketed to non-developers", where llama.cpp sits on one extreme and apps like LM Studio sit on the other, with Ollama landing somewhere in the middle.
I think the main point that turned me off was how they have their own custom way of storing weights/metadata on disk, which makes it too complicated to share models between applications. I much prefer being able to use the same weights across all the applications I use, since some of them end up being around 50GB.
I ended up using llama.cpp directly (since I am a developer) for prototyping and recommending LM Studio for people who want to run local models but aren't developers.
But again, if you find Ollama useful, I don't think there is any reason to drop it immediately.
Yeah, I made the same argument but they seem convinced it's better to just provide their own naming instead of separating the two. Maybe marketing gets a bit easier when people believe them to be the same?
Ollama has its own way of releasing models.
When you download r1 you get the 7b.
This is because not everyone is able to run the 671b.
If it's misleading, it's more likely due to users not reading.
I'm not super convinced by their argument to blame users for not reading, but after all it is their project so.
> It is very interesting how salty many in the LLM community are over Deep Seek
You think Ollama is purposefully using misleading naming because they're mad about DeepSeek? What benefit would there be for Ollama to be misleading in this way?
The quote would imply some crankiness. But yeah, it could just be general nerd crankiness too, of course. Maybe I should not imply or speculate too much about the reason in this specific case.
It's also not helping the confusion that the distills themselves were made and released by DeepSeek.
If you want the actual "lighter version" of the model the usual way, i.e. third-party quants, there's a bunch of "dynamic quants" of the bona fide (non-distilled) R1 here: https://unsloth.ai/blog/deepseekr1-dynamic. The smallest of them can just barely run on a beefy desktop, at less than 1 token per second.
The press and news are talking about R1 while what you've been testing is the "distilled" version.
Sadly, Ollama has a bit of confusing messaging about this, and it isn't super obvious that you're not actually testing the model that "comes close to GPT-4o" or whatever the tagline is, but instead basically completely different models. I think this explains the mismatch between expectation and reality here.
Like an 8-year-old factoring large numbers: it's not how well it is done that amazes, it's that it is done at all. Sure. Amazing, but not at all useful, and not something one would expect the kind of fuss we've seen over.
Seems the explanation is the deepseek-r1 models I was using are not, in fact, deepseek-r1. Thanks all for the heads up.
My take: the distills under 32B aren’t worth running. Quantization seems to impact quality much more than it does with other models. 32B and 70B unquantized are very good. 671B is SOTA.
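To see why quantizing aggressively can hurt, here's a toy sketch of plain round-to-nearest quantization (this is a deliberate simplification; real schemes like the GGUF k-quants or unsloth's dynamic quants allocate bits much more cleverly per layer):

```python
import numpy as np

def quantize_dequantize(weights, bits):
    # Symmetric round-to-nearest quantization to 2**bits levels,
    # then back to float, so we can measure the information lost.
    scale = np.abs(weights).max() / (2 ** (bits - 1) - 1)
    return np.round(weights / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)  # stand-in for a weight tensor

for bits in (8, 4, 2):
    err = np.abs(w - quantize_dequantize(w, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

The error grows quickly as the bit width shrinks, which is consistent with the observation that low-bit quants of these distills degrade noticeably.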
Ollama is "taking flak" for the confusion because it's entirely created by them. If they renamed/split what they provide into deepseek-r1 and deepseek-distilled-r1, far fewer people would be confused about this.
Do other models do well for the same use cases? I thought LLMs were only good for low-value adtech code and resources accessible on the public Internet, like tons of getters/setters and onEvent triggers without many CS elements or timing or multi-domain implications.
They're also hypersensitive to what I'd describe as geometric congruency between input and output: your input has to be decompressible into final form with basically zero IQ spent on it, as if the input were a zipped version of the not-yet-existing output that the LLM simply macro-expands.
R1 is just an improved LLM, nothing groundbreaking in those specific areas. Common limitations of LLMs still apply.
IMO, the layman's model of LLM should be more of predictive text than AI. It's a super fast keyboard that types faster than, not better than, your fingers.
> I thought LLMs are only good for low-value adtech codes and resources accessible on public Internet, like tons of getters/setters and onEvent triggers without much CS elements or time or multi domain implications.
I'm not sure where you get this generalization from. Most models you can run locally today on consumer hardware are kind of at that level, at least in my experience. But then you have things like o1 "pro mode", which pretty much allowed me to program things I couldn't before, things no other LLM until o1 could actually help me do.
They aren't DeepSeek at all but distill models.
In LLMs, distillation means the better model trains (fine-tunes) the smaller models with its knowledge (responses), so the smaller models also get better.
Deepseek 14b and 32b should be good enough; they are based on Alibaba's Qwen model, usually the best open-source models at 40b or less.
Deepseek 8b is based on Meta's Llama3.
I would say 70b isn't worth the cost compared to 32b, except for coding... but if you want a model for coding, you should try a model specifically trained for that, like Deepseek-Coder or Mistral's Codestral.
Deepseek-r1 is considered a general-use AI. It will do well enough on many topics but won't excel at everything.
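The distillation idea can be sketched with a toy example. A common formulation (not necessarily the exact recipe DeepSeek used, which fine-tunes on the teacher's generated responses) minimizes the KL divergence between the teacher's and student's softened output distributions:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among "wrong" answers (the "dark knowledge").
    z = logits / temperature
    z = z - z.max()                  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over the softened distributions.
    p = softmax(teacher_logits, temperature)   # target
    q = softmax(student_logits, temperature)   # prediction
    return float(np.sum(p * np.log(p / q)))

teacher = np.array([4.0, 1.0, 0.5])        # big model's logits for 3 tokens
good_student = np.array([3.8, 1.1, 0.4])   # mimics the teacher closely
bad_student = np.array([0.5, 4.0, 1.0])    # prefers a different token

print(distillation_loss(teacher, good_student))  # small
print(distillation_loss(teacher, bad_student))   # large
```

Training the small model to drive this loss down is how the Qwen- and Llama-based distills pick up some of R1's behavior without its 671B parameters.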
I tried that distill, plus the "original" at chat.deepseek.com and the Azure-hosted replica on a simple coding problem (https://taoofmac.com/space/blog/2025/01/29/0900), and all three were bad, but not that bad. I suspect the distill will freak out with very little context.
Everyone keeps mentioning that you’re using the distilled version, which is true. But the real question is, do you see acceptable results with any model, open or private?
Verilog is relatively niche as far as programming languages go, so I’m not surprised that you’d have trouble getting good output generally. You can only train the model on so much stuff, and there is probably limited high quality training data for verilog. It’s possible the model planners just decided not to prioritize this data in the training set. 8b sized models will especially struggle to have enough knowledge about niche topics to reason over it. Anything that small is really just a language tool for NLP tasks unless it’s trained specifically to do something.
All that said, your comment does illustrate a misunderstanding with the “thinking” models. They always output a long monologue on what to say, for anything, even “hello”. It’s a different skill to prompt and steer them in the right direction. Again, small models will be worse at everything, even being directed in the right direction.
TLDR: I think you need to find a new model, or at least try the “full” version through the web app or API first.
The mental model behind it is very different from that of "normal" programming languages, so there is less reuse of knowledge learned elsewhere ("knowledge" for lack of a better word).
And unbanned as soon as OpenAI made ChatGPT compatible with the data/privacy laws in the country. I'm sure DeepSeek will be able to make the adjustments too.
I imagine if, let's say, Hetzner were to offer hosted DeepSeek, they would be forced out of the US market and no longer even be allowed to buy NVIDIA. That is high risk, assuming EU companies would ever even choose to switch to Hetzner over Azure/etc., which they won't. The EU acts like, and is ruled by, people who believe the US isn't a hostile entity; that attitude is never going to change, and it's the biggest reason why AI in the EU is dead.
It's interesting how ChatGPT is securing major deals with governments and large corporations, while DeepSeek is gaining support from smaller startups. The contrast is also intriguing from the perspective of the power dynamics between the U.S. and China.
Side note: what are people’s opinions of the UK regarding AI now? Completely insignificant? I feel the UK has taken a nosedive over at least the past 30 years.
The UK is a financial (and legal) centre because of its regulations. They make the UK a trustworthy broker for international companies to deal with. (Also its deregulations, making it easy to squirrel money away to tax havens.)
Well, you can try doing a job search in those countries and look at the titles and job descriptions. Also look at the curriculums of universities in those countries, and don’t be fooled by the official sites. Look in forums at how good they are in AI topics.
I did the interviews for grad recruitment in tech at a big investment bank in London. The best maths and computer science graduates were frequently from Eastern European countries.
How about that the Danish pharma giant Novo Nordisk, makers of Ozempic, is opening their AI/ML lab in London next to DeepMind and other tech giants, instead of in its back yard?
The only ones close to the UK in Europe are probably Switzerland and France, but they too are mostly focused on research in universities rather than pushing out commercial products the way the US is excelling at. Everyone else is not even in the game.
Exactly. Nr. 2 is Switzerland, with ETH. Both, together with France, operate at a VERY academic level. Next, maybe far, far behind, are Italy and Germany in very different fields, but at levels comparable with (if not worse than) South America: Chile, Argentina, Brazil, and Uruguay.
The government wants to introduce a law to make it illegal to possess AI tools that are capable of CSAM output. As we know, this is impossible, so any AI company starting in the UK will likely fail compared to those in other countries if this law passes.
The scariest part of this, which I do not see people worried about, is the one sentence about requiring suspects to open their phones at the border for inspection.
I would agree with the (highly summarized) premise of the article if Europe actually had a technology strategy. I’ve seen the multiple announcements of “national pride” LLM initiatives that aim to create “a native national chatbot”, sometimes launched under extremely cringe circumstances (like the Portuguese one, announced at WebSummit and that is going to be trained at a “partially Portuguese” “supercomputer” in Spain, which most of us found hilarious).
Otherwise, Europe is doomed to doing pointless, prematurely dead-ended one-offs.
If French, German and British nations start competing (like in the old days) for AI supremacy, it will unleash a level of creativity that we long forgot we are capable of in Europe.
DeepMind - the company owned by Google, but behind things like AlphaFold - is based in London. And one of the fathers of modern AI, who led the resurgence of research into neural nets, Geoffrey Hinton, is British-Canadian.
He moved from Britain originally due to the difficulty in getting his research funded.
So the issue isn't one of intellectual capital - and while it's obviously the case that well place monetary capital is an issue - it's not clear to me what the real underlying issue is.
Perhaps Europe needs a tech/industrial revolution again - where the power shifts from the old guard to the new. Perhaps too many people in charge in Europe are from a certain class that studied history at university.
Sure, the ideas go way, way back - but most people had given up on neural nets after the initial excitement, and sure, others were also working on it. However, the difference for me is that he used it to solve real-world interesting problems - and by showing what was possible, that ignited the resurgence.
Now you could argue that the people in the 60's and 70's didn't have the compute available to make non-toy networks, and it was only applying the same techniques on bigger datasets with more compute that was the real difference.
Sure - but that happens all the time in science - every innovation is building on the shoulders of others and the assignment of the Nobel prize is as a result often rather arbitrary.
Also don't underestimate the value of reducing to practice - the difference between coming up with an idea and actually making it work in practice.
I wouldn't discount the issue of intellectual capital. There are just a handful of universities in the UK that produce world class level work in CS. While there are dozens of such universities in the US. Numbers like that make a big difference.
And if I remember correctly, PhDs in the UK are kind of weird compared to the US. Your thesis has to be research that you haven't published yet.
The most common degree for UK politicians is PPE - politics, philosophy and economics.
I'm not sure the problem is an understanding of economics, it's an understanding of how the real world works, and how to make it better - they are often too easily swayed by big lobby groups with vested interests.
We both know that's not gonna happen. Europe is way too entrenched in its ways by this point. The good ol' glory days that brought in Airbus and Concorde are gone and not transferable to the modern, dynamic, and very internationally competitive SW world, nor are its leaders strong and motivated enough to enact policy changes that favor disruption of the old-money guard at the expense of the status quo. Case in point: we have no SW giants, no Airbus equivalent of the SW world. All of Europe's giants are decades to centuries old. 20 years ago the EU's GDP was on par with the US's; now we're at only half the US's GDP. We're cooked.
Plus, we first have to prioritize solving more urgent and important topics like affordable housing (WHEN?!), the collapsing pension and welfare systems which is a ticking timebomb, cheap energy, collapsing demographic (see affordable housing), illegal immigration, Putin's war next door, the rise of the right wing (see illegal immigration) before jumping into another pissing contest with the US and China on something that's not gonna help fix the pressing issues we have right fucking now. I don't see how we can recover from this downward spiral when I look at the inactions of our politicians who are just kicking the can down the road and blaming the EU and other countries of the union for their own systemic failures.
Winning the AI race might sound cool, but it might also be similar to winning the race to the moon: a cool flex but not super useful to the general population if they can't afford a place to live or get healthcare in a timely manner. Until ChatGPT can wipe your retired old ass in a care home, I doubt many people will see AI investments as a top priority.
You just wrote my thoughts in a polite manner. I would add that German universities have no capability to do applied research. Everything “applied” was never worthy of them; “applied” was the level of Hochschule (higher school) type institutions. Even when good, these institutions didn’t have a good reputation. So the best and brightest went to universities far away from practical research applications. The system isn’t built for the great AI race. Add poor salaries and yearly contracts for research positions, and all the smart pupils are gone. Gone to work for Google or Facebook or even Huawei!
Imho the death spiral could be reversed by providing enough affordable housing. That would be a really long-term goal, but democracies do not have long-term goals - the time after an election is the time before an election.
It's not like the US doesn't have a problem with affordable housing, so I don't see how this plays any role in the divide.
Germany has plenty of applied research organizations, from universities (e.g. RWTH) to things like Fraunhofer. The funding schemes behind these organizations are horrible and I would argue that in many ways, they are machines to burn up potential. Even with all this, Germany has been doing okay on the publicly funded AI research front, but that is irrelevant. The US isn't leading because of publicly funded AI effort, but because of privately funded AI effort.
The problem with “build more affordable housing” in countries that are desperately importing the entire world in an attempt to keep welfare programs afloat is that the amount of housing required to be built EVERY YEAR is staggering.
When a new citizen is born, there’s 18+ years for the required housing supply for that person to be created. When a new citizen is imported, they need housing TODAY. It’s just not a sustainable model on a continuous basis, but no one wants to hear that.
Food isn't an appreciating asset whose value grows when you hoard it, and it is also easily transportable between free markets. I can't import cheaper land from abroad; I'm limited by what's around me. That scarcity gives it value. You can keep growing virtually infinite food on the same land. You can't do that with housing.
Government can't solve affordable housing. It is the problem causing unaffordable housing.
Housing affordability is 100% because we let people tell their neighbors they cannot build as much housing as they want, or that they must build it more slowly. There's no middle ground; there are no acceptable structures of land-use law if you want affordability. If you get to tell your neighbor how much housing they can produce, bureaucracy will form around that, and it will drive up the price of housing.
>Government can't solve affordable housing. It is the problem causing unaffordable housing.
They CAN solve housing because, like you said, they're the ones causing it. They just don't want to because the housing bubble is making a lot of people rich.
I think if you drill down you'll find there's no "they" who can. The majority of voters are homeowners who benefit from restricting housing supply. Any state actor who loosens land use regulation enough that it actually lowers housing prices would likely be voted out.
The changes you see - like allowing ADUs - are inherently very limited impact so that they look good to those advocating for more housing without actually lowering prices.
What about SAP? What innovations does SAP make? Nvidia alone is worth 10 SAPs, and the US has about 20 Nvidias. It's not even a competition. It's like Usain Bolt vs. Grandma.
I don't believe it does, while the bureaucrats in Brussels keep getting paid to stifle innovation. From a practical view, thousands of mid-level politically appointed people in the EU get paid to make life miserable for anyone who wants to innovate. That's their job.
Let's think about it for a minute. The EU was already behind in the race, and they were proud of actually creating even more barriers for their businesses and researchers trying to catch up in the AI race.
Europe will have a chance (in AI and other areas) if they get rid of most bureaucrats in Brussels. That's it. Otherwise, what awaits us is a long, slow decline into obscurantism and irrelevancy.
Said barriers to AI are
- not allowing the use of AI for social scoring, precrime and other nefarious profiling methods
- for high risk applications, requiring a quality system.
You know, like the kind of quality system you need in place if you make food for human consumption, produce light bulbs, or any of a myriad of other production processes. Somehow the people doing catering at my employer's canteen manage to comply with that, but it's too complicated for tech bros.
I think you fail to realize that's not the utilization end goal that's the problem with regulations, but the fact that you need to put a lot of checks and bureaucrats overseeing the development process to make sure "the rules are being followed".
Regulation for building a house in Europe are also totally valid, and what kind of person wouldn't follow them, right?
But then you need to send a pre-project for approval that takes 6 months (and pay for it); then during construction you need to get a local government worker to check the progress several times to see if the rules are being followed (and pay for it); and after you finish the house, you need to wait up to 12 months for a government official to come inspect it and declare that it follows all their rules (and pay for it), and you are finally allowed to live in it.
So no, let's not try and declare that these rules are obvious, and great, and we need them and what kind of people wouldn't want to follow them? When in fact, these rules mean that at every single step, you are going to wait for the government to bless what you tell them you want to do and then to make you wait again while they check if you did what they allowed you to do.
P.S. Do you even have any idea what kind of hurdles small and big companies in Europe have to go through every time they need to do something, just because of personal data protection rules?
The waiting time could be due to government agencies being overloaded due to shortages of personnel, but hiring more people costs money and people wouldn't like taxes to go up.
Also consider this: petrochem companies would have an easier time if they were allowed to dump waste into rivers instead of having to jump hurdles to process it properly. That doesn't mean that environmental regulation is bad and we should do away with it to let them innovate more.
> The waiting time could be due to government agencies being overloaded due to shortages of personnel, but hiring more people costs money and people wouldn't like taxes to go up.
"The bureaucracy is expanding to meet the needs of the expanding bureaucracy"
Not just Europe. Many companies in the US will benefit, too. As will companies in Asia, Africa and the Middle East. These are the first truly frontier-grade models released under a friendly license. The most potent non-reasoning model before this point was Mistral Large, and it has serious restrictions on allowed uses (research only).
> OpenAI charges $2.5 for 1 million input tokens, or units of data processed by the AI model, while DeepSeek is currently charging $0.014 for the same number of tokens.
This is somewhat misleading, because the OpenAI price is for uncached tokens and the DeepSeek price is for cached ones. DeepSeek's uncached price is $0.14.
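Putting the numbers from the two comments above side by side (rates are as quoted in the thread and will certainly change over time):

```python
# USD per 1 million input tokens, per the figures quoted in this thread.
OPENAI_PER_M = 2.50            # OpenAI, uncached
DEEPSEEK_PER_M = 0.14          # DeepSeek, uncached
DEEPSEEK_CACHED_PER_M = 0.014  # DeepSeek, cached (the figure in the article)

def cost(tokens, rate_per_million):
    """Cost in USD for processing `tokens` input tokens."""
    return tokens / 1_000_000 * rate_per_million

tokens = 10_000_000  # e.g. a month of moderate API usage
print(f"OpenAI:            ${cost(tokens, OPENAI_PER_M):.2f}")
print(f"DeepSeek uncached: ${cost(tokens, DEEPSEEK_PER_M):.2f}")
print(f"DeepSeek cached:   ${cost(tokens, DEEPSEEK_CACHED_PER_M):.3f}")
```

So the like-for-like (uncached vs. uncached) gap is roughly 18x, not the ~180x the article's framing implies.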
Besides Mistral, which AI companies are out there in Europe really competing against the US and Chinese tech giants? DeepMind doesn't count; it's been owned by Google for quite some time now.
Besides Mistral there are a bunch of smaller "research" models like https://huggingface.co/Almawave/Velvet-14B (Italian) and https://huggingface.co/projecte-aina/aguila-7b (Spanish/Catalan), but as far as I'm aware, nothing that really competes with OpenAI/DeepSeek so far (and tbh, I don't think Google does at this time either? None of their models I've tried even came close to GPT-4).
My humble opinion as an European citizen: let the US sink $500B in GPUs if they feel like it. I think they have bigger problems than marginally improving next token predictors, but who am I to judge. We'll just distill their models for a fraction of the cost, if need be. There is no moat in AI.
The AI race is not about innovation, it's about speculation. I do believe the tech holds promises, but as it stands now, the primary goal of AI is to attract capital. The benefits of AI are going straight up, siphoned by tech billionaires, and I don't see many improvements in the lives of my American friends.
Is there truly a moat there? Is there truly some very secret sauce that one must learn? Or can the results be replicated in reasonable time when they are developed.
Does it make sense to burn a lot of money now? Or is it better to wait for the technology and field to mature and then buy it at commodity prices? Think back to solar and wind power, for example.
China and the US, obviously, but first the EU needs to be shaken out of complacency. It’s brewing slowly, but like all risk, it’s slow until it isn’t. Shutting down nuclear, outsourcing all tech and manufacturing, overregulation, and allowing itself to be taken over by a culture of laziness (and lazy cultures) are all fixable, but it takes time, which is running out.
Huh? Can you mention any examples of this, specifically in tech?
AFAIA outsourcing is done largely by western countries, including the US, _to_ eastern countries, including parts of Europe, Russia, China, India, etc. This is an obvious cost cutting measure, since developers there are cheaper and the talent pool is large. This has been going on for decades. Hell, the H-1B visa is made to bring those developers in, while still undercutting their salaries compared to US employees.
Outsourcing is much less prevalent in the EU, let alone "all tech"...
> overregulation
As opposed to no regulation? The US and China are not role models for how tech companies should exist and operate in society. Whatever "innovations" are being stifled by regulations in the EU is for good reasons. Big Tech has way too much power and influence to the detriment of society. At least the EU is making an effort to draw some boundaries, and if you ask me, it's not nearly enough.
> allowing to be taken over by culture of lazy (and lazy cultures)
Yes, how dare those lazy europeans have a sensible work-life balance!
It is deeply ingrained in much of the workforce across many jobs. Many people just don't care about doing a good job, especially in state-owned institutions; on the other hand, being actually good at your job is not valued or recognized appropriately by bad middle management in tons of places, so people lose the motivation to be better than average. Many want to do the bare minimum, understandably, when recognition for more does not exist.
The capitalist class is probably the laziest class in the US; one of the loudest capitalist critics of such laziness even has a self-made 20-season documentary on how she does nothing all day.
> "If you have built your application using OpenAI, you can easily migrate to the other ones ... it took us minutes to switch," he said in an interview on the sidelines of the GoWest conference for venture capitalists in Gothenburg, Sweden.
I suppose this is the plus side of picking "unstructured human language" as your API. If everything is a chatbot, then the vendor lock-in is minimal.
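Because so many vendors expose an OpenAI-compatible API, "migrating" often amounts to changing a base URL and a model name. Here's a minimal sketch (the endpoint paths and model names are assumptions; check each vendor's docs for current values):

```python
# Both providers expose an OpenAI-compatible chat-completions endpoint,
# so the request body stays the same; only the URL and model name change.
PROVIDERS = {
    "openai":   {"base_url": "https://api.openai.com/v1",   "model": "gpt-4o-mini"},
    "deepseek": {"base_url": "https://api.deepseek.com/v1", "model": "deepseek-chat"},
}

def build_request(provider: str, prompt: str) -> dict:
    """Assemble an OpenAI-style chat-completions request for a provider."""
    cfg = PROVIDERS[provider]
    return {
        "url": cfg["base_url"] + "/chat/completions",
        "body": {
            "model": cfg["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }

a = build_request("openai", "testing 1 2 3")
b = build_request("deepseek", "testing 1 2 3")
# Everything except the URL and model name is identical.
assert a["body"]["messages"] == b["body"]["messages"]
assert a["url"] != b["url"]
```

In practice the official `openai` client library lets you do the same thing by passing a different `base_url` at construction time, which is why the quoted startup could switch "in minutes".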