Text-to-speech YouTube voice-overs are driving me nuts

Remember the "I want a white one" video? That's the first video I clearly remember having a text-to-speech voice-over. It was really bad TTS, and it was awesome. Lately, though, I find myself wishing video hosting services like YouTube and (to a lesser degree) PeerTube had a filter that would let me exclude any videos with TTS voice-overs. Does this bother anyone else?

I'm a little torn about it. There are legitimate reasons for people to use them; I've seen commentary from posters about social anxiety that makes even recording audio difficult, and TTS must be fantastic for mute or non-verbal(?) folks. Non-native English speakers may be more comfortable with it. I'm sure the platform doesn't help... how many videos do you have to post where the peanut gallery mocks your verbal mistakes before you give up and just have an engine read your written text? I've also noticed that the use of TTS is far, far worse on YouTube -- I have yet to come across a single video on any PeerTube site that uses it, although some must exist.

Like a lot of technology, generated speech is getting abused, and since TTS has valid uses, I put it in the "enshittification" category. It's used on every bulk, low-effort "N greatest/funniest/random-adjective" video; I hear it increasingly in those suspiciously AI-smelling, ad-ish "reviews" that just read specs and make an odd comment about how cool the product is; and so much other low-quality, low-information content that feels AI-generated uses it -- or maybe it feels AI-generated because it uses it. It's almost always on just awful content.

TTS on video content is a perfect example of "this is why we can't have nice things." I am starting to hate it so much that I abort whatever I'm starting to watch as soon as I hear the absurd cadence and mispronunciations -- I'd rather hear an honest non-native speaker making mistakes than that terrible TTS crap.

Whatever the reason, the use of TTS is a trend I'm putting firmly in the "enshittification" category, but am I overreacting here? Do you have a way of identifying content that uses TTS in advance, so you can dodge it?
