It’s open source and people are literally self-hosting it for fun right now. Current consensus appears to be that it’s not as good as ChatGPT for many things. I haven’t personally tried it yet. But either way there’s little to be “suspicious” about, since it’s self-hostable and you don’t have to give it internet access at all, so it can’t call home.
https://www.reddit.com/r/selfhosted/comments/1ic8zil/yes_you_can_run_deepseekr1_locally_on_your_device/
Is there any way to verify the compute cost of training the model though? That’s the most shocking claim they’ve made, and I’m not sure how you could verify that.
If you take into account the optimizations described in the paper, then the cost they announce is in line with the rest of the world’s research into sparse models.
Of course, the training cost is not the whole picture, which the DeepSeek paper readily acknowledges. Before arriving at one successful model you have to train and throw away n unsuccessful attempts. That’s also true of any other LLM provider, though. The training cost figure is meant for comparing technical trade-offs that alter training efficiency, not business models.
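For anyone who wants to sanity-check the arithmetic themselves, here is a rough back-of-envelope sketch. It does not verify the claim, it only checks that the announced number is internally consistent. The figures are the ones I recall from the DeepSeek-V3 technical report (GPU-hours, assumed rental price, parameter and token counts), so treat them as assumptions and check them against the paper. The 6 × parameters × tokens rule is the usual rough estimate for training FLOPs, not the paper's own accounting, and the function names are just mine.

```python
# Back-of-envelope check of the announced training cost.
# All constants are assumptions recalled from the DeepSeek-V3 technical
# report; verify them against the paper before relying on them.
GPU_HOURS = 2.788e6         # reported total H800 GPU-hours for the final run
PRICE_PER_GPU_HOUR = 2.0    # USD per GPU-hour, the rental price the report assumes
ACTIVE_PARAMS = 37e9        # parameters activated per token (sparse MoE)
TOTAL_PARAMS = 671e9        # total parameters in the model
TRAIN_TOKENS = 14.8e12      # pre-training tokens


def rental_cost(gpu_hours: float, price: float) -> float:
    """Cost of the final run if the GPUs were rented at a flat hourly price."""
    return gpu_hours * price


def training_flops(params: float, tokens: float) -> float:
    """Common rough estimate: about 6 FLOPs per parameter per training token."""
    return 6.0 * params * tokens


def implied_throughput(flops: float, gpu_hours: float) -> float:
    """Average FLOP/s each GPU must sustain to finish within gpu_hours."""
    return flops / (gpu_hours * 3600.0)


cost = rental_cost(GPU_HOURS, PRICE_PER_GPU_HOUR)
sparse_flops = training_flops(ACTIVE_PARAMS, TRAIN_TOKENS)
dense_flops = training_flops(TOTAL_PARAMS, TRAIN_TOKENS)
per_gpu = implied_throughput(sparse_flops, GPU_HOURS)

print(f"rental cost:           ${cost:,.0f}")
print(f"sparse training FLOPs: {sparse_flops:.3e}")
print(f"dense equivalent:      {dense_flops:.3e} ({dense_flops / sparse_flops:.1f}x more)")
print(f"implied per-GPU rate:  {per_gpu / 1e12:.0f} TFLOP/s")
```

The point is the ratio: only the activated parameters count toward per-token compute, so a sparse model needs far fewer FLOPs than a dense model of the same total size. You can compare the implied per-GPU rate against the hardware's published peak to judge whether the GPU-hours are plausible. None of this covers the discarded runs mentioned above.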