If you can run 13B, I would recommend the following models. The links are for the GGML models. Keep in mind that if you can run a 30B or larger model, there are other LLMs that will work better.
This one is my current personal favorite. I think it's better than Chronos for short messages. For me, it usually sticks to asterisks for actions and quotes for speech, which is what I prefer.
This one feels more "clever" to me all around, and is currently very popular for roleplay. It usually produces longer results, so I feel like it's better suited for longer dialogue. It also feels like it understands scenarios better, and I usually get slightly more creative results from it.
It also really depends on VRAM IMO. I have a 4090, and these days I don't tend to touch anything under 30B (Wizard Uncensored is really good here). If I had dual 3090s, I would likely be running a 65B model.
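For a rough sense of why VRAM decides which size you can run, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names are made up, 4.5 bits per weight is roughly what a 4-bit GGML quantization works out to, the 2 GiB headroom for context and buffers is an arbitrary guess, and the parameter counts are the nominal 13B/30B/65B figures. Card sizes are treated as 24 GiB for a single 4090 or 3090 and 48 GiB for a dual-3090 setup.

```python
def estimate_weight_gib(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the quantized weights in GiB.

    bits_per_weight=4.5 is an assumed figure for a 4-bit quantization;
    other quantization levels will differ.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, vram_gib: float,
         bits_per_weight: float = 4.5, headroom_gib: float = 2.0) -> bool:
    """Return True if the weights plus an assumed headroom fit in vram_gib.

    headroom_gib is a placeholder for context and scratch buffers,
    not a measured number.
    """
    needed = estimate_weight_gib(params_billion, bits_per_weight) + headroom_gib
    return needed <= vram_gib


if __name__ == "__main__":
    # Nominal model sizes (billions of parameters) against one or two 24 GiB cards.
    for size in (13, 30, 65):
        weights = estimate_weight_gib(size)
        print(f"{size}B: ~{weights:.1f} GiB weights, "
              f"fits 24 GiB: {fits(size, 24)}, fits 48 GiB: {fits(size, 48)}")
```

Under these assumptions, a 30B model fits on a single 24 GiB card while a 65B model needs something like two of them, which lines up with the 4090 vs. dual 3090 comparison above. Real usage depends on the quantization, context length, and backend, so treat the numbers as ballpark only.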