Posted by: patenttranslator | November 18, 2018

How Many Translators Does It Take to Fix a Machine Translation?

The title of my silly post today is of course a variation on the famous light bulb jokes, as in “How many cops” (liberals, psychiatrists, feminists, Christians, etc., even dogs) “does it take to change a light bulb?”

Most of these jokes are very funny and some are very revealing, even though some people might find them a little mean, usually because they happen to belong to the particular group being made fun of.

Some of these jokes are really clever: “How many cops does it take to screw in a light bulb? Two: one standing on a chair holding the light bulb, and another turning the chair around with the first cop on it.” Or, “How many Christians does it take to change a light bulb?” “One to change the light bulb and three committees to approve the change and decide who brings the potato salad.” This one is funny without really being mean, which is kind of a rare occurrence in the universe of jokes.

The variations using dogs are a clever and funny way to describe how different breeds of dog deal with a particular task.

So anyway, enough of light bulbs and dogs. Let us instead concentrate on the job at hand, which is determining how many translators it takes to fix a machine translation.

I think we have to start with the question “Why should it be necessary to fix machine translations?”

It’s pretty clear why burnt-out light bulbs need to be changed: unlike cats and dogs, people don’t see very well in the dark. And it’s also clear why machine translations need to be checked and fixed … because unlike a human translation, a machine translation is just a translation tool, not really a translation. Most humans have discovered that despite enormous technological progress over the course of the last half century or so, some of the text translated by an algorithm is usually completely wrong and that only a well-functioning human brain can figure out where the problems are and how to fix them.

A machine translation may look like a real translation, i.e. like a translation done using a human brain, but it is in fact a completely different kind of animal. Moreover, thanks to the progress achieved in machine translation, machine translations now look almost like real translations, which was not the case ten or twenty years ago. This has made it much more difficult to find out where the problems might be hidden in them.

Machine translation is now an incomparably better tool than it used to be. It is so much better in many ways except for one: a machine translation is just as unreliable as it was two decades ago. A machine-translated text that looks like it makes perfect sense, reads well and appears to be a really good translation created by a human brain may in fact be saying the opposite of what the original text says because, unlike human translators, algorithms do not have brains and therefore do not understand the meaning of what the original text is saying.

That is why the so-called translation industry is putting so much hope in post-editing of machine translation by human translators as a way to get rid of this pesky problem, a problem that is always encountered with any machine translation system.

But can the post-processing strategy work?

If a human translator were really to fix all of the potential mistranslations in machine-translated texts, he or she would have to spend as much time, and often more time, reading the original text, comparing its meaning to that of the machine translation, and creating his or her own translation, just as if the translation were done by a human translator from scratch.

That would of course be so time-consuming and expensive that it would in fact defeat the purpose of the whole post-processing scheme. So the “translation industry” found the perfect solution for this problem, comprising two key ingenious elements.

  1. Instead of actual translators who know what they are doing, and who would be likely to demand the same pay for “post-processing” as for translating, the industry is using (or wants to use) mere “bilinguals” for the post-processing task, whatever that means. “Bilinguals” are much cheaper than established, experienced translators, most of whom would not be interested in mindless post-processing slavery even if it paid well (which it most definitely does not).

As a result, what such “bilinguals”, who are expected to charge no more than a penny or two per word for their “post-processing”, are likely to produce is a text from which the most glaring mistakes may theoretically have been removed, but not all of the mistakes. Mistakes that are crucial to the meaning of entire sentences, such as using “is” instead of “is not” and vice versa, occur often with machine translation programs and are especially unlikely to be noticed by rushed and underpaid “post-processors.”

  2. Every now and then, the “translation industry” creates a few new, clever propagandistic buzzwords to make it seem as if a major problem in its quest for “perfect or almost perfect” machine translations has been solved. The latest highly creative buzzwords are “neural machine translation systems” and “deep neural machine translation systems.” As Google puts it, Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems.

Whether this potential is real or not, the fact remains that no matter how deep or how neural a machine translation system is, it will still be unable to solve the main problem, namely how to create algorithms that would in fact understand the actual meaning of a given text, simply because that is an impossibility.

On the other hand, the terminology using words such as “neural” and “deep” is pure genius because it creates the impression that the machine translation system is in fact based on understanding of the meaning of the text. Since neural means “of or relating to a nervous system”, and deep in this context is likely to be associated with “deep thinking”, from a purely propagandistic viewpoint, the combination of such terms is very effective.

The industry is thus continuing to forge ahead with its plans regardless of whether the post-processing method makes sense from a practical viewpoint, i.e. regardless of whether the result of post-processing is a much better, or even slightly better, product. Whether or not the mistakes unavoidable with machine translation have been eliminated during “post-processing”, the method is effective from the industry’s viewpoint as long as the customers can be persuaded that they are buying good value for their money and receiving a good product for what they are paying … despite the fact that most machine translation, even very good machine translation, is available on the internet for free.

So what is the answer to the question in the title of my silly post today? I’m afraid nobody really knows, and nobody really cares, as long as the human “post-processors” of raw machine translations can be found at a rate that guarantees healthy profit margins for the industry.

The answer to the question in the title of my post today in fact makes about as much sense as the many answers to the question: “How many dogs does it take to change a light bulb?”

Border Collie: Just one. Then I’ll replace any wiring that’s not up to code.

Rottweiler: Make me!

Lab: Oh, me, me! Pleeease let me change the light bulb! Can I? Huh? Huh?

Dachshund: You know I can’t reach that stupid lamp!

Malamute: Let the Border Collie do it. You can feed me while he’s busy.

Jack Russell Terrier: I’ll just pop it in while I’m bouncing off the walls.

Greyhound: It isn’t moving. Who cares?

Cocker Spaniel: Why change it? I can still pee on the carpet in the dark.

Mastiff: Screw it yourself! I’m not afraid of the dark…

Doberman: While it’s out, I’ll just take a nap on the couch.

Boxer: Who needs light? I can still play with my squeaky toys in the dark.

Pointer: I see it, there it is, there it is, right there!

Chihuahua: Yo quiero Taco Bulb?

Australian Shepherd: First, I’ll put all the light bulbs in a little circle…

Old English Sheepdog: Light bulb? That thing I just ate was a light bulb?

Basset Hound: Zzzzzzzzzzzzzz…


Responses

  1. Steve, you are certainly right about the propaganda value of terms like “deep” and “neural” with regard to machine pseudo-translation, but regardless of what you call it, in the end it remains shit. In fact, when you consider that the “good” parts are in fact largely parsed from human translations, and the differences are patched from terminology lists and other resources of varying relevance, the intellectual achievement of this technology is rather pitiful. At its best, in controlled language situations, it returns a flat, barely serviceable text like one might expect from a translator with little writing skill in the target language. The result might seem better where there is great overlap with the human input to train the machine, but any broadening of the text scope will soon lead to a linguistic morass, the depth of which it is really not worth knowing.

    I think the PEMpT model may be on its way out, but I don’t think there is much more potential in any of the MpT variants offered lately. Predictive typing features in CAT tools are useful – sort of in the way that typing suggestions on a smartphone can serve as helpful shortcuts – but integrating MpT with this will in all likelihood have the same bad effects on the language skills of the user as Bevan reported ages ago (see https://pdfs.semanticscholar.org/8c89/fe5657d63e3494e792ad0552efc43e66c681.pdf). These damaging effects on writing skills have also been confirmed by a heavy user of machine translation in a US company, with whom I spoke at a conference in Budapest in 2015 and later in an Asia Online webinar. Quite a number of others have reported similar experiences from their personal use of MpT in various projects. In fact, one of the earliest hints of trouble came from Jost Zetzsche, who commented in his newsletter years ago that he noticed his writing got worse after working with machine pseudo-translation input for a while.

    One sees claims of speed improvement with MpT use (which I might believe in the case of those who are linguistically weak in the source and/or target language), but I have yet to see anyone claim that their quality is improved by using MpT. The stunting, distorting and damaging effects of this technology have had and will continue to have severe consequences, the entirety of which we can hardly estimate.

    Time and again in discussions of MpT and its potential value, people ask me if I wouldn’t like “suggestions” during my work, implying that it couldn’t hurt. Well, these suggestions would be about as useful as mine to a master airplane mechanic making critical engine repairs before a flight. At best, my advice will be background noise s/he can ignore, but I might just as well annoy or distract the mechanic enough that a small thing might be overlooked, leading to a big plunge into the ocean on a trans-Atlantic flight. Not all input is truly helpful.

  2. This Saturday, I PEed some 2k5 words of language converted text. It took me the entire day. Automated language conversion of content can work well on content specifically written for that purpose. The shady mass of blood sucking sociopathic and narcissistic consultants are however out there to make promises about advanced technology that they pretend to master and comprehend. Yet, they don’t.

  3. Some more light bulb jokes.

    Q. How many gorillas does it take to change a light bulb?
    A. Just one, but you need at least 70 spare light bulbs.

    Q. How many militants does it take to change a light bulb?
    A. 100. One to change the light bulb, and 99 to debate whether changing the light bulb contributes to the struggle against imperialism or is merely a petty-bourgeois deviation.
