I can vouch for whisper.cpp. It's not 100% perfect, but it's good enough to transcribe a half-hour podcast with numerous speakers, and the result needs pretty minimal fixing afterwards.
OP, this is the best speech-to-text solution, IMO. I've used Whisper on Windows (link to GitHub) successfully to transcribe graduate-level class recordings with very little manual fixing, mostly just certain last names.
Not FOSS, as it's under another license, but there's "FUTO Voice Input" if you're looking for a local alternative to Google's voice dictation on Android.
This is one of my most used apps at the moment. It runs entirely on your device and is great for filling in search terms, AI prompts, messages, etc. The only downside is that it seems to have a character limit, so it may not be what OP is looking for.
Maybe not exactly what you're looking for, but I found this a few weeks ago: https://github.com/k2-fsa/sherpa-onnx. I haven't really seen anyone talk about it.
I've been using the TTS on Android for navigation and it's way better than RHVoice and eSpeak.
I did try the STT on Android and it worked great, but I've never used STT before, so I don't know how it compares to other STT engines.
It’s still surreal to see OpenAI’s need for training data be so vast that they casually developed and open sourced a generational leap in transcription technology just so that they could scrape online videos better.