A Dutch publisher has announced that it will use AI to translate some of its books – but those in the industry are worried about the consequences if this becomes the norm
When it comes to how people feel about AI translation, there is a definite distinction between utility and craft. Few object to using AI in the same way as a dictionary, to discern meaning. But translators, of course, do much more than that. As Dawson puts it: “These writers are artists in their own right.”
That's basically my experience.
LLMs are useful for translation in three situations:
declension/conjugation table - faster than checking a dictionary
listing potential translations for a word or expression
a second round of spelling and grammar proofing, just to catch issues that you missed
Past that, LLM-based translations are a sea of slop: they screw up the tone and style, add stuff not present in the original, repeat sentences, remove critical bits, pick unsuitable synonyms, and so on. All the bloody time.
And if you're handling dialogue, they will fuck it up even in shorter excerpts, by making all characters sound the same.
A colleague who does this has a fair point: it is not a 1:1 translation, but a translation as the natives would say it. Different words, but nearly identical or identical meaning. Of course it depends on how good the result is, but it is a valid use case.
I've used DeepL, and as a "quick solution/I'm fine with the occasional error" translation service it's definitely better than Google. As a commercial platform probably tracking more than I personally care for, trying to corner a market share, not so much.
But neither of the above are fit for translating books of any kind (except perhaps as a joke to emphasise just that). And I'm still doubtful of the "AI" models doing any better.
DeepL has always used machine learning, and they already switched to LLMs for some language pairs -- not rebranded ChatGPT, but their own stuff. They're also quite open about the model not being perfect, they're advertising with things like "blind tests show our results sound more natural than the competition", "our model output needs fewer edits than the competition", etc.
Even without machine translation, stuff like that has been the bane of translating software for ages, as the translations are almost always done with absolutely zero context whatsoever, just a list of words and strings.
As someone who speaks conversational Japanese (well, probably more since I do banking, doctor, etc. on my own, but my grammar is far from perfect), and fluent English, Google's AI can make some... questionable choices when translating at least. My wife (fluent Japanese speaker who knows a little English) and I decided to play with its translator function when I got a Pixel phone, and once again a bit later when trying to come up with some English practice for her.
Japanese is definitely a bit more difficult to work with since it's so context-dependent and has lots of homophones (one reason translating things into Japanese and back can be interesting, particularly in the older days of Google Translate). It's fine for short, concise, and non-complex sentences, but even certain formal grammar and honorifics can be bad with the AI translation services.
i see an issue with technical manuals as well. i am not a native english speaker and whenever some android app decides to machine translate itself to my native language, it is a fucking disaster. some words can be translated in multiple ways depending on context and guess what is missing when translating stuff like app menus? that's right.
So as a counterpoint to all the comments here, I absolutely see this working. I needed to translate a fairly long work of fiction, and an LLM made my work 10x as fast, since quite obviously my active vocabulary between the two languages differed.
It was much easier and faster to correct the LLM than to write the translation myself. Imagine this replacing workers not like 1 workplace becomes 1 LLM subscription, but more like 10 workplaces become 2 workplaces and an LLM subscription.
It was not professional work but a private request from a loved one.
It was actually their idea.
And I was very, very sceptical about the idea at first, and about the output all throughout the process.
I made extensive edits to the original LLM translation, as it got a lot of things wrong. To be honest, it got wrong a lot of what is unique to the book and made it special, both in wording and in intent, and I had to correct it. My workflow was literally putting the text in the prompt, taking the output, then putting the two texts next to each other and deciding, sentence by sentence, word by word:
Is the translation any good? (around 95% was generally good, sometimes it trailed off, and I needed to find the point at which it started bullshitting)
Does it use terms that are unique in the book consistently the right way (it almost never did, I literally had a dictionary of the most frequent mistakes)
Could I have done it better? Do I know a way to better convey the intent? (this happened quite rarely, as it has done a near word-for-word translation, the biggest problems were idioms that made sense in one language but didn't in another, or misgendered characters)
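For anyone wanting to automate the "dictionary of the most frequent mistakes" part of that checklist, here is a minimal sketch of how it could look. This is not the tooling I used, just an illustration: the function name `flag_known_mistakes` and the glossary entries are made up, and it only flags lines for a human to look at, it does not fix anything.

```python
import re

# Hypothetical glossary: a rendering the model tends to produce -> the term
# the book actually needs. Both entries are invented for illustration.
FREQUENT_MISTAKES = {
    "shadow walker": "Shadewalker",
    "the order": "the Covenant",
}


def flag_known_mistakes(translated_text, mistakes):
    """Return (line_number, wrong_term, preferred_term) for every hit.

    Matching is case-insensitive and on whole words only, so a glossary
    entry does not fire inside a longer, unrelated word.
    """
    hits = []
    for line_number, line in enumerate(translated_text.splitlines(), start=1):
        for wrong, preferred in mistakes.items():
            pattern = r"\b" + re.escape(wrong) + r"\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                hits.append((line_number, wrong, preferred))
    return hits


if __name__ == "__main__":
    draft = "The shadow walker left.\nNothing here.\nShe joined the Order."
    for line_number, wrong, preferred in flag_known_mistakes(draft, FREQUENT_MISTAKES):
        print(f"line {line_number}: found '{wrong}', expected '{preferred}'")
```

It only covers the second question on the list. Whether the translation is any good, and whether the intent comes through, still needs a person reading both texts side by side.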
All in all, I think the LLM did the heavy lifting in remembering all the odd words and grammar, and it gave me a very flawed first draft. It covered something like 80% of the time-consuming work, but only about 5% of the actual creative work that goes into a translation.
I spent 90% of my time outside the LLM, in my text editor.