Deepseek Misconceptions
Deepseek has definitely been making waves with their R1 model, and I love seeing this sort of competition come into the space to drive innovation and promote more critical thinking. That being said, I’ve heard a few statements about R1 that range from misunderstandings to being simply wrong. Let’s look at some of these misconceptions.
R1 is open-source
This is the one I hear most often, and it’s not strictly true.
The model weights are open, and they published their training strategies, but there are no open training datasets. In other words, the implementation is not open. It’s a bit like calling an already compiled piece of software open-source.
With an appropriate level of pedantry, the Open Source Initiative (OSI) defines open source in the context of AI as follows:
1. Use: Employ the system for any purpose without seeking additional permissions
2. Study: Examine how the system works and inspect its components to understand its functionality
3. Modify: Alter the system to suit specific needs, including altering its output
4. Share: Distribute the system to others, with or without modifications, for any purpose
5. Data information: Detailed descriptions of the data used to train the AI model, encompassing its source, selection and labelling process
6. Code: The complete source code used for data processing and training, made available under an OSI-approved licence
7. Params: The model’s parameters, such as weights and configuration settings, made accessible under terms approved by the OSI
According to this definition, none of Deepseek’s models are open source; R1 in particular fails points 5 (data information) and 6 (code).
Open weights simply means that they released the pretrained parameters, or weights, of the neural network itself. This normally allows others to use the model for inference and fine-tuning, but that’s about it.
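To make that concrete, here’s a minimal sketch of what open weights buy you in practice: downloading the published parameters and running inference with the Hugging Face transformers library. The checkpoint ID below is one of the small distilled R1 variants on the Hub and is an assumption on my part; swap in whichever checkpoint fits your hardware.

```python
# Minimal sketch of what "open weights" enables: download the published
# parameters and run inference. Assumes the transformers library and one
# of the small distilled R1 checkpoints on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed checkpoint ID
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto").to(device)

inputs = tokenizer("Why is the sky blue?", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note what’s missing: without the training data or the complete training code, you can run and fine-tune the model, but you can’t rebuild it from scratch or audit what went into it.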
However, thanks to the open weights and a clear paper on the training strategies, it is possible to reproduce the model, and the folks over at Hugging Face have already started developing “Open-R1”.
Check it out for yourself: https://github.com/huggingface/open-r1
Lastly, I want to add that Deli Chen, a Deep Learning Researcher at Deepseek, has expressed a desire to push forward with making open-source AGI a reality (I’m serious, he said that! https://x.com/victor207755822/status/1882757279436718454)
You can run R1 locally!
Okay, I’ll admit, this one is a half-truth. Some people can run R1 locally. All you need is a decent amount of compute and at least 80GB of combined RAM and VRAM (if you want a decent token rate). We’re not going to be running it on Raspberry Pis any time soon.
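For a rough sense of why, here’s a back-of-the-envelope sketch that estimates a model’s memory footprint from its parameter count and bits per weight. The parameter counts are the published sizes of R1 and two of its distilled variants; the quantisation levels and the ~20% overhead factor for KV cache and runtime buffers are assumptions.

```python
# Back-of-the-envelope estimate: parameters x bytes-per-weight, plus an
# assumed ~20% overhead for KV cache and runtime buffers.
def estimated_memory_gb(params_billion: float, bits_per_weight: float,
                        overhead: float = 1.2) -> float:
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9  # decimal GB

configs = {
    "R1 671B, FP8 (as released)": (671, 8),
    "R1 671B, 4-bit quantised": (671, 4),
    "R1-Distill-Llama-70B, 4-bit quantised": (70, 4),
    "R1-Distill-Qwen-7B, 4-bit quantised": (7, 4),
}

for name, (params, bits) in configs.items():
    print(f"{name}: ~{estimated_memory_gb(params, bits):.0f} GB")
```

The exact threshold depends heavily on how aggressively the model is quantised and how much can be offloaded, but either way we’re a long way from Raspberry Pi territory.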
We have seen other models, in particular on https://huggingface.co/, that can be run locally, but none of them are quite as potent as R1. That makes this impressive, especially when you consider how much it costs to use OpenAI’s GPT models.
Deepseek is the reason NVDA is dropping
I can understand what’s driving this thinking: the same week that Deepseek reached the top spot in the app store, NVDA took a nosedive on the stock market. But we also saw the start of “the economic war”, with all the tariffs, that same week. I think that’s the more likely culprit, given that so many companies saw a massive drop in share price.
It’s tempting to think that Deepseek, by showing that better efficiency is possible, has proven that the existing LLM players have wasted hundreds of billions of dollars on hardware to “brute-force” AI. But the reality is that those companies will simply look to implement those efficiencies and then utilise their mass of compute hardware to drive productivity even further.
Conclusion
Yeah, you can run Deepseek locally, without a purchased licence, on 64GB of RAM and an RTX 4080 Super. It’s still not going to compare to running it on the massive compute hardware that has already been procured across the ML space.
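If you’re curious what splitting a model between a 4080 Super’s 16GB of VRAM and system RAM looks like in practice, here’s a hedged sketch using transformers’ automatic offloading. The checkpoint ID and the memory caps are assumptions; tune them to your own machine.

```python
# Hedged sketch: split a distilled R1 checkpoint across a 16GB GPU and
# system RAM using accelerate's automatic device placement. The checkpoint
# ID and the memory caps below are assumptions, not recommendations.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"  # assumed checkpoint ID

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",                        # keep the checkpoint's dtype
    device_map="auto",                         # place layers on the GPU first
    max_memory={0: "15GiB", "cpu": "48GiB"},   # assumed caps for a 4080 Super + 64GB box
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Explain open weights vs open source.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Layers that don’t fit inside the VRAM cap stay in system RAM, which keeps the model runnable at the cost of slower generation.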
At the very least we can hope to see a decrease in prices going forward - but I have been accused of wishful thinking before.