Has anyone applied tree-of-thought prompting to R1 yet?

Generate 5 thoughts, prune 3, branch, repeat. I think that’s what o1 pro and o3 do.
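Roughly what I mean, as a minimal sketch rather than a real implementation. `propose` and `score` are hypothetical placeholders for "ask the model for next thoughts" and "rate a chain of thoughts"; they are not any library's API. Pruning across all branches at once (beam-search style) is my assumption about how the prune step would work.

```python
from typing import Callable, List

Path = List[str]  # one chain of thoughts, root to leaf


def tree_of_thought(
    question: str,
    propose: Callable[[str, Path], List[str]],
    score: Callable[[str, Path], float],
    n_generate: int = 5,
    n_prune: int = 3,
    depth: int = 3,
) -> Path:
    """Generate, prune, branch, repeat; return the best surviving path."""
    keep = max(1, n_generate - n_prune)  # 5 generated, 3 pruned -> 2 kept
    frontier: List[Path] = [[]]
    for _ in range(depth):
        candidates: List[Path] = []
        for path in frontier:
            # Branch: ask for next thoughts given the path so far.
            for thought in propose(question, path)[:n_generate]:
                candidates.append(path + [thought])
        if not candidates:
            break
        # Prune: rank every candidate path and keep only the top few.
        candidates.sort(key=lambda p: score(question, p), reverse=True)
        frontier = candidates[:keep]
    return frontier[0]


# Toy stand-ins so the sketch runs without a model (hypothetical).
def toy_propose(question: str, path: Path) -> List[str]:
    return [str(i) for i in range(5)]


def toy_score(question: str, path: Path) -> float:
    return float(sum(int(t) for t in path))
```

To try it with R1 you would swap `toy_propose` for a function that prompts the model and `toy_score` for whatever evaluator you trust, e.g. `tree_of_thought("my question", my_propose, my_score)`.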

5 comments

I just ask it about Winnie the Pooh
- It literally stops thinking whenever it's asked about anything related to China
  
  "The user wants a response"
Doesn't seem too hard to me. I personally haven't tried it, though. And it's kind of hard to keep track of what has happened, with all the articles on DeepSeek.

I'd just take some prompt/agent framework like LangChain. It has had chain-of-thought prompting built in for quite some time already. Then connect it to R1. That should do it. Maybe the thinking blocks need to be handled differently, idk.
- Well, I think you actually need to train a "discriminator" model on rationality tests. Probably an encoder-only model like BERT, just to assign a score to each thought. Then you do Monte Carlo tree search.
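
For what it's worth, here is a rough sketch of how a thought scorer could plug into Monte Carlo tree search. Everything here is an assumption for illustration: `propose` stands in for the generator model, `score` stands in for the trained discriminator (assumed to return a value in [0, 1]), and the scorer is used in place of a random rollout. It uses standard UCT selection, nothing specific to R1 or BERT.

```python
import math
import random
from typing import Callable, List, Optional


class Node:
    def __init__(self, path: List[str], parent: Optional["Node"] = None):
        self.path = path
        self.parent = parent
        self.children: List["Node"] = []
        self.visits = 0
        self.value = 0.0  # sum of scores backed up through this node


def uct(child: Node, c: float) -> float:
    """Upper confidence bound; unvisited children are tried first."""
    if child.visits == 0:
        return math.inf
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(child.parent.visits) / child.visits)
    return exploit + explore


def mcts(
    propose: Callable[[List[str]], List[str]],
    score: Callable[[List[str]], float],
    iterations: int = 200,
    max_depth: int = 3,
    c: float = 1.4,
) -> List[str]:
    root = Node([])
    for _ in range(iterations):
        # Selection: walk down by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=lambda ch: uct(ch, c))
        # Expansion: add candidate next thoughts unless at max depth.
        if len(node.path) < max_depth:
            node.children = [Node(node.path + [t], node) for t in propose(node.path)]
            if node.children:
                node = random.choice(node.children)
        # Evaluation: the scorer replaces a random rollout here.
        reward = score(node.path)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the most-visited chain of thoughts.
    node = root
    while node.children:
        best = max(node.children, key=lambda ch: ch.visits)
        if best.visits == 0:
            break
        node = best
    return node.path


# Toy stand-ins so the sketch runs without any model (hypothetical).
def demo_propose(path: List[str]) -> List[str]:
    return ["a", "b", "c"]


def demo_score(path: List[str]) -> float:
    return path.count("b") / max(len(path), 1)
```

In a real setup `demo_score` would be replaced by the trained classifier's output for the chain of thoughts, and `demo_propose` by sampling a few continuations from the generator.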