DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:
Hybrid thinking mode: One model supports both thinking mode and non-thinking mode by changing the chat template.
Smarter tool calling: Through post-training optimization, the model's performance in tool usage and agent tasks has significantly improved.
Higher thinking efficiency: DeepSeek-V3.1-Think achieves comparable answer quality to DeepSeek-R1-0528, while responding more quickly.
The tool calling improvements are very welcome
I wonder if we can extend the context length. It already fine-tuned with YaRN so we can't get free extend with that method.
Honestly, did the word "drop" change meaning in the past few months, or am I just crazy?
It picked up a lot of use by GenZ+ especially though the phrase "New {something} just dropped".
Depends where they dropped the thing. In your lap, or in a bin? Context, as always, is key.
Not really, it’s just more common!
Drop is a contronym, it means its own opposite, and its use as “disappear” or “appear” extends waaay back. Eg. Usage as in Drop a line or drop a letter go as far as the 1700s.
So just a different line in a long history of drops.
More like in the past 10 years or so, but yes. On a side note, "leaked" has also subtly changed meaning from the actual product (such as a game being released early through piracy etc) to just information about the product.
"leaked" has also subtly changed meaning from the actual product (such as a game being released early through piracy etc) to just information about the product.
Just no... Information getting out despite attempts to conceal it, is the secondary meaning of that verb since at least the 19th century.
It's not you
What's not what we expected?
Everybody been rumoring about R2. So releasing this thing kinda unexpected
The tool calling improvements are very welcome
I wonder if we can extend the context length. It already fine-tuned with YaRN so we can't get free extend with that method.