Posted by: patenttranslator | January 8, 2016

Efforts at Turning Mud into Gold by “The Translation Industry” Could Use a Better Analysis in the Media

Articles in mainstream media that are ostensibly analyzing the problem of machine translation almost always try to confuse the reader by pretending that the real problem is not with the frequent occurrences of mistranslations that may be very difficult to detect, but with the style of machine translations. I see this as a misdirection that is used to turn the audience’s attention away from the possibility that the fatal problem could be the basic premise of how machine translation works.

According to a New York Times article titled “Is Translation an Art or a Math Problem?” that was linked by a commenter in the comment section of my last blog post, “Warren Weaver [described in the article as a founder of the discipline] conceded: ‘No reasonable person thinks that a machine translation can ever achieve elegance and style. Pushkin need not shudder.’”

“The whole enterprise introduces itself in such tones of lab-coat modesty”, continues the article.

Pushkin need not shudder? How original (it’s usually Shakespeare who is put down in this manner in these types of articles, at least in English). What a pretentious thing to say! It’s like saying that you can’t defeat a mighty army equipped with tanks, bombs and machine guns with ice cream and chocolate chip cookies. Gee, who would have guessed?

The problem with machine translation is not style. Unlike in the days of Jane Austen, style means very little in modern society. The article in the New York Times then goes on to say, among other things, this in a paragraph describing visionary machine translation developers:

“But human translators, today, have virtually nothing to do with the work being done in machine translation. A majority of the leading figures in machine translation have little to no background in linguistics, much less in foreign languages or literatures.”

Unfortunately, the author of the article doesn’t seem to know anything about translation either. When I Googled the author’s name, I discovered that “he lived in San Francisco, Berlin and Shanghai”. But based on how the problem with machine translation is presented in the article, I would bet a farm (if I had one) that he does not speak German or Chinese. You may be able to form your own opinion about foreign languages when you live abroad for a while. But foreign languages cannot be learned by osmosis. You have to work on them, often for decades, and that would be just too much to ask from people who write articles about translation for The New York Times.

Why is it that it’s always non-translators who are asked to ponder the mysteries of machine and human translation in our mass media? How could they possibly know anything about something they don’t understand? Why is it that you don’t need to know anything about languages, let alone translation, to write treatises on human and machine translation that apparently deserve to be taken seriously?

A 14-year-old virgin, writing an advice column for married couples on sex for The New York Times, or pretending to be a sexologist worthy of being listened to by grownups, would probably not be taken seriously. Because if you’ve never had sex, you obviously cannot possibly know anything about it. The 14-year-old virgin’s musings might be of some value to psychologists studying issues related to mental health of children, but probably not a whole lot.

But when you see an article about machine translation published in lame stream media such as The New York Times or The New Yorker, it’s always written by people who can’t translate and generally don’t know anything about foreign languages.

And if an article about machine translation is published by a translators’ association, it is invariably written by somebody who is obviously trying to ride the gravy train of “the translation industry” by selling machine pseudo-translation as a viable alternative to real translation, which is to say human translation. It’s an unfortunate fact that the ATA Chronicle, which bills itself as “The Voice of Translators and Interpreters”, has now become “The Voice of the Translation Industry”. The truth is, while the Chronicle has been publishing articles selling machine translation for years, it has never published an article that attempts to critically analyze the faulty premise that is at the root of the obvious problems with machine pseudo-translation.

But let’s get back to the article in the New York Times. I would like to suggest to its author another reason why translators generally don’t join the ranks of machine translation developers, unless they see it as an easy way to make mucho dinero by promising their clients that they can turn mud into gold.

The reason why human translators don’t want to have anything to do with machine translation is that unlike non-translators, they understand that MT is being sold to an unsuspecting public in a quest for an easy way to turn mud into gold, or a quest for perpetuum mobile, by charlatans who are trying to sell MT to gullible clients as a new business model while falsely promising to save a lot of money with machine translation.

True, you can save a lot of money if you buy the concept of machine pseudo-translation as a realistic alternative to human translation. The question is, how much money will you ultimately lose if your advertising, strategic decisions, and other information are based on undetected mistranslations?

You can turn mud into bricks and build beautiful houses from mud that has been only slightly post-processed by heat. But you cannot turn mud into gold, although charlatans called alchemists were able to make a very good living at various courts in medieval Europe for quite a few decades by claiming that they would soon be able to do that. And there is no perpetuum mobile beyond the Sun, wind, and oceans, and never will be, although some people are still looking for it.

What machine translation developers are trying to do is to convert with fast computers and beautiful algorithms “linguistic corpora” into something that will mean the same thing in the translation as it did in the original language. But it cannot be done because in order to do that, you need a tricky, ephemeral ingredient called “meaning”. Unfortunately, this tricky ingredient has its origin in the human brain, which is the only place where meaning can be created, and then recreated in another language.

Translators understand this simple fact. Computer programmers and greedy salespeople of machine pseudo-translations stubbornly refuse to even acknowledge it.

Machines need not apply for a job that would require them to understand and evaluate what is being said or written. Regardless of their processing speed, their systems of algorithms, and the quasi-infinite linguistic corpora that may be stored in the cloud and made available to these machines, they will never be able to do that.

Machine translation sometimes makes very good sense, and under some predefined circumstances it may even make sense frequently. But it can only make sense if the input (the text fed to the machines) is controlled by humans, the same humans who will then still be very busy massaging the ingredient called meaning into the output end of the machine translation perpetuum mobile.

If machine translation is to make sense in another language and mean the same thing as the original, the input must be strictly and restrictively controlled, and the output (the resulting translation) must also be controlled by humans, namely translators who understand the original text, the same translators whom some people in “the translation industry” would love to turn into post-processors of machine translation.

In other words, the same information must be retranslated by humans if what you need is a real translation rather than just machine translation.

Translation is both “a math problem” and “an art problem”. This is not an “either/or” problem as the article seems to suggest. The problem is much bigger than that, as just about every translator will understand, immediately and intuitively. The reason why computer programmers and mathematicians working on machine translation solutions keep firing human translators is very simple: human translators keep telling these computer programmers and mathematicians that the problem cannot be solved with technology and algorithms, and the programmers don’t want to hear it because technology and algorithms are the only things they know and understand.

It may take a few more decades before most humans realize this. Just as it made very good sense to stop looking for ways to turn mud into gold (because something like that isn’t possible) and to look instead for ways to create better bricks from mere mud, it also makes much more sense to stop pretending that machine translation that is just as good as human translation (except maybe for the style) is just around the corner. We must look instead for machine translation that will simply be a better and more accurate tool for humans, translators and non-translators alike.

Let’s stop pretending that machine translation will soon “be just as good as human translation”, except maybe for a slightly deficient style. We can start by realizing that the false modesty of an MT founder, who is graciously willing to let Pushkin sleep his eternal sleep for the time being when he says that the Russian literary giant need not shudder in his grave at depressing thoughts about the linguistic, contextual and stylistic superiority of machines over the human brain, is in fact the haughty arrogance of a pretty foolish person.


Responses

  1. It occurs to me that another way to phrase this is: Man in search of meaning would do very well to avoid machine translation.

    Like

  2. Or: You can save a lot of money with machine translation as long as you don’t mind not knowing for sure what’s in the original text.

    Liked by 3 people

  3. I think that we will indeed see competent (not stylish, but intelligible and correct) machine translation at some point – to say that we won’t is like Malthus predicting global famine or Michelson saying that all physical laws have been discovered. “Hasn’t been done, so can’t be done” has rarely been right.
    It will come first for science, because the grammar (and general vocabulary) is more limited than in general use, and specific vocabulary is a matter of dictionaries, which are easily memorized by computers. And it should come first for those language pairs in which the basic sentence structure is similar, so that fewer contextual references are necessary (does “ikimasu” mean “I will go” or “I am going”? – tough to tell without context, but if the languages have well-established tenses, like English or French, context is not necessary, the words themselves establish the tense) – on the other hand, those are also the language pairs (e.g. the pairs of European languages) in which there are numbers of competent human translators.
    But I don’t see it happening soon, as it doesn’t matter enough to the world at large. Perhaps the EU, with its demand for translation of an ever-increasing number of documents into an ever-increasing number of languages, will invest in having someone develop a *real* MT system, but I don’t see anyone other than an international organization (or perhaps IBM, to give Watson something to do) doing so.

    Liked by 2 people

    • The day we won’t need editors to edit same-language texts is the day we won’t need human translators to convey the meaning from one language to another.

      Like

    • We already have intelligible MT. The problem is, it’s impossible to tell to what extent it is correct and to what extent it is incorrect unless you happen to be a translator yourself.

      Liked by 2 people

      • I said “correct” as well as “intelligible”, Steve.
        I’d bet against Systran ever doing it, I’d bet that it won’t happen any time soon; but betting that it will *never* happen is a sucker’s bet.
        What makes you (collectively) think that the human brain possesses some magical capacity for translation that is incapable of being instantiated in a computer at some point in the future? And if you don’t think that, then you may reasonably expect that if the task of developing real MT is sufficiently worthwhile to someone with the resources to do it, it will eventually be done.

        Like

      • The thing that makes me think that the human brain possesses the magical (in other words: you don’t understand how it happens) capacity for communicating with other human brains. That’s what translation is: human brains, communicating with other human brains.

        Computers, which I love dearly and rely on daily, are constructs of algorithm, and even the most advanced, self-correcting, self-learning algorithm, is an algorithm – it is constrained to specific ways of processing input.

        Humans are not.

        As long as humans can and do depart from algorithmic rules in our communications, human translators will be needed to determine what the first set meant to say when they took that grammatical departure.

        My intuition is that this is a corollary to Kurt Gödel’s Incompleteness Theorem (which says that a sufficiently expressive formal system of logic can be EITHER complete OR consistent, but not both). Computer translation, similarly, can EITHER handle natural language (a cognate of completeness) OR reliably provide an accurate translation (in a system constrained to the point of near-uselessness).
        I am far too busy translating to do the work required to prove this, but I don’t really have to, because back in the 1960s Professor Yehoshua Bar-Hillel did quite a bit of work as a logician that ended up in the same territory. The rules of logic have not changed significantly in the intervening years, and I believe his work has been extravagantly proven correct by the results documented at furious length by our gracious host.

        Liked by 1 person

    • I think that the human brain is much too complicated for real language and real translation to be simulated in a computer. But of course I don’t know what is going to happen fifty or a hundred years from now. Nobody does. It’s like arguing whether there is intelligent life in other galaxies. Nobody knows that. Regardless of what we may think about it, we don’t have enough data, so arguing about it seems like a waste of time to me.

      Liked by 2 people

  4. Around the year 20599.

    Liked by 1 person

  5. Why not take the lawyers and judges out of the criminal justice system and let computers decide guilt/innocence of the perpetrator; after all, practically all offences have been committed before, many times. Imagine the money that can be saved! Even pleading could be automated; tick the box of your choice!

    See: http://www.truthdig.com/report/item/two_years_behind_bars_or_20_one_day_a_computer_formula_may_have_a_say_20160

    It’s a brave new world we live in; problem is, I don’t see much improvement in the human condition other than for a selected few who have the power/money.

    Liked by 3 people

  6. “Why not take the lawyers and judges out of the criminal justice system and let computers decide guilt/innocence of the perpetrator; after all, practically all offences have been committed before, many times.”

    Easier done than machine translation that is as good as human translation IMHO.

    Like

  7. Another thought: legislation, carefully framed and written by the best in the business, often requires a bunch of judges in a constitutional court listening to the arguments of teams of highly educated lawyers to interpret its meaning (and they often differ in their opinions)……….. So even a ‘perfect’ translation thereof will also be subject to differing interpretations, probably more so, since it is generally read by persons from a different cultural environment.

    Translation is not an exact science suitable for digitisation; it requires a combination of talent, imagination, cultural knowledge, education, training, experience, good looks and modesty.

    My computer falls well short of that in some areas, particularly in the last two categories 🙂

    Liked by 1 person

  8. Thanks for your compliment, Rennie.

    I guess I have my moments.

    So it was impossible to turn mud (or lead) into gold in the Middle Ages because we are dealing with two different atomic elements, which was something alchemists of old did not know.

    It is possible to do that now with a particle accelerator, but it would be extremely expensive to do so, which would defeat the entire purpose of the exercise.

    What modern machine translation alchemists don’t seem to understand, or pretend not to understand when they are hawking their product, is that just as lead and gold are very different atomic elements, machine pseudo-translation and real human translation are also very different products, because the former by definition excludes the element called “meaning” (since this is something that is created in the human brain, i.e. an element that machines don’t understand and probably never will), while the latter is built on the element of the actual meaning.

    When a machine translates something correctly, which happens more or less frequently, it is a coincidence, as the machine can’t tell the difference between translation and mistranslation.

    When a human mistranslates something, which also happens, more frequently with some humans than others, it is not a coincidence, but the result of lack of knowledge (“lack of linguistic and other databases and experiences that need to be stored in the human brain”) and/or other factors, such as fatigue.

    It is possible to convert the product of machine translation into a real translation with post-processing performed in the human brain of a human translator, the equivalent of a particle accelerator in this comparison. But then, the process is more expensive and much more prone to mistranslation than if you only use a human translator, who can consult a machine translation if she wants to do so.

    I think that the argument of higher cost should be stressed more by translators because it is the only argument the business world is likely to understand. The higher cost negates the entire purpose of the exercise, which is why MT charlatans must insist that the higher cost of machine translations processed by humans is a myth, and so is the tendency towards mistranslations.

    Otherwise they would not be able to sell processing of machine translation as a viable business model (because it is not a viable model).

    Liked by 1 person

  9. “The thing that makes me think that the human brain possesses the magical (in other words: you don’t understand how it happens) capacity for communicating with other human brains.”

    The magical, as you call it, is called God by some people, namely those who believe in God. I don’t know whether there is a God, I am an agnostic – I believe that this is a concept that is not accessible to humans (and I also detest how all organized religion is used to manipulate people). But I do believe that there is something magical in human communication and that this magic, which I call “meaning”, will never be reproduced by machines.

    Whether there is a God or not, when MT programmers are trying to impart meaning to machine translation, they are trying to be just like God. That’s why it does not work – they don’t realize that they are going down the wrong road, because just like medieval alchemists, they or the people who are financing them are blinded by their greed. There would be a pot of gold for them at the end of the road if they could figure out how to make machines understand what it is that they want from them. But that is not going to happen.

    They should read an article about machine translation that Yehoshua Bar-Hillel, who was a philosopher, mathematician, AND A LINGUIST, wrote a very long time ago. The article is very relevant to the problem MT programmers are trying to solve. Instead, they simply fire all the linguists while trying to find better algorithms.

    http://www.mt-archive.info/Bar-Hillel-1960-App3.pdf

    Like

    • Oh dear, I may have mis-threaded my response and added confusion rather than sense to this thread.

      I am in full agreement with you and Bar-Hillel. The comment about “magical (in other words: you don’t understand how it happens)” was to Derek Freyberg, and alludes to Clarke’s Third Law: “Any sufficiently advanced technology is indistinguishable from magic.”

      Perhaps by coincidence, I am also very much in agreement with you about that certain added something which puts the sense in sentience.

      I apologize if I have caused offense through my comment.

      Like

      • No offense on my part. Sometimes there is no button for me to click on a particular comment. I think that is what happened here, and it often creates confusion. I am not sure why WordPress is doing it.

        As to “Any sufficiently advanced technology is indistinguishable from magic”: it is a cute saying, and I wish I had come up with it. But this is true only for a limited time – until you understand how it works. When I was watching a Netflix movie on my iPad and then switched to the TV because I wanted a bigger screen, I found out that I could start watching the movie from the exact same spot where I had stopped watching it on the iPad, and it seemed like magic to me. But once you realize that the iPad, the TV, and the PC all connect to the same information stored in the cloud, the magic is gone.

        Liked by 1 person

    • Hi Steve,

      I think the ‘magic’ we are grasping at here is something called ‘embodied cognition’ by the linguist George Lakoff and others.

      “By using the term embodied we mean to highlight two points: first that cognition depends upon the kinds of experience that come from having a body with various sensorimotor capacities, and second, that these individual sensorimotor capacities are themselves embedded in a more encompassing biological, psychological and cultural context.”

      — Eleanor Rosch, et al. The Embodied Mind: Cognitive Science and Human Experience

      I don’t think we have to start worrying until we can share Gauloises and a cup of coffee with a ‘machine translator’ at Les Deux Magots and he (it?) begins to complain about all the tourists.

      Frank

      Liked by 1 person

  10. “I don’t think we have to start worrying until we can share Gauloises and a cup of coffee with a ‘machine translator’ at Les Deux Magots and he (it?) begins to complain about all the tourists.”

    Very good. I love it.

    The way I put it was that we don’t have to worry about MT being as good as human translation until MT starts writing romance novels that women actually want to read.

    You would need embodied cognition for that too. Both examples are just a variation of what Bar-Hillel said in 1960.

    Liked by 2 people

  11. I would bet that farm, too (if I had one), that he doesn’t speak German. Interesting post, thanks.

    Like

  12. Hello. I have been commissioned to do a job described as ‘editing of a dictionary translation’, which, as I go along, has clearly turned out to be a translation memory of some sort, not a dictionary. There are no definitions, just loose terms that sometimes make sense and sometimes do not, even with some context, and that have been not-so-well translated by somebody else. I accepted the job before seeing it, and as the agency is paying my rates (well, let’s see) and we have a good relationship, I am fine with it. But I guess that the end client is really trying to save some bucks here – instead of feeding a TM as jobs are commissioned, they want to have everything set for future translations. I am not against MT in context – it is very useful for certain tasks, useless for others – but this is really going to cost the client much more time and money in the future… There you see – they have probably been sold the idea of how good and cost-saving MT can be. Time will tell. Still within the topic – I guess articles about MT in the papers can sometimes be paid for, a bit like advertorials. Papers are desperately trying to make money every way they can in order to survive in the digital era, and paid-for articles are not unheard of.

    Liked by 2 people

