Deepseek v3 0324: Finally, the Sonnet 3.5 at Home

This blog post explores the latest Deepseek v3 0324 and goes deeper into a meta-analysis of its capabilities and a comparison with other base models.

Deepseek v3 0324 is the first open-source model to match SOTA coding performance

Understands user intent better than before; I'd say it's better than Claude 3.7 Sonnet, both base and thinking. 3.5 is still better at this (perhaps the best).
Again, in raw code-generation quality, it is better than 3.7 and on par with 3.5, sometimes better.
Great at reasoning, much better than any and all non-reasoning models available right now.
Better at instruction following than 3.7 Sonnet but below 3.5 Sonnet.

5 comments

TIL Sonnet 3.7 is worse than 3.5. How come?
- It's not: https://llm-stats.com/models/compare/claude-3-7-sonnet-20250219-vs-claude-3-5-sonnet-20241022
It responded to me earlier with "LOL, based." I love it.
- What was your input? I'm really curious now
  
  Pretty nerdy. I was asking it how to get a good ending in a video game and when the answer annoyed me I told it "That all sounds like too much work. I think I'll just play the game however I want to and then watch the best ending on YouTube." That's when it dropped the "LOL, based. Totally valid!"