Posted by: patenttranslator | September 15, 2012

Post-Editing of Machine Translations Is a Good Definition of the Term “A Fool’s Errand”

Fool’s errand:

 1. A foolish undertaking, especially one that is purposeless, fruitless, nonsensical or certain to fail.

 2. Such an undertaking assigned as a prank.

 (Source: Wiktionary, the open-content dictionary).

I happen to know something about machine translation (MT) as I have been using it in my line of work, translation of patents from Japanese and other languages, for about the last 15 years.

And yet, although more often than not these days I do have a machine-translated version of a patent application in Japanese, German, French or another language available to me, I have never post-edited the MT product. I only use MT more or less as a pretty good, context-based dictionary. Plus it’s free. You can’t beat that price.

Would I save time if I could simply edit the MT product instead of having to retranslate everything from scratch? I am sure that I would. Then why don’t I do it?

Because I am an irrational, intransigent, conceited, self-absorbed, mad patent translator who can’t be reasoned with?

It’s not really up to me to say; I may well be all of the above (although I would take exception to the word “irrational”), but that’s not why I don’t post-edit MT. I don’t do it for the following two reasons:

1. I want my clients to continue sending work to me, and most people are smart enough to be able to tell post-edited MT from real translation. My clients pay me for translating, not post-editing.

2. If I tried to massage MT with my editing to lick it into a shape that would resemble human translation, it would take me at least as long as retranslating because … MT post-editing that can add real value is in fact retranslating. Even when the MT product has most of the right words and some correctly translated sentences, you still have to rearrange everything and change a lot of things, which is very time-consuming.

If you start pulling out bricks from a house that was poorly built, it will fall down. The same is true about post-editing of machine translations. It is much better and also usually faster to build the house of translation based on a blueprint that can be designed only in a fallible translator’s head, rather than building it based on algorithms of infallible and quite arrogant mathematical geniuses who don’t seem to have any understanding of how languages work. If they did, they would be working on something that should be easier to achieve, perpetuum mobile for instance, or how to turn water into gold.

It does not really matter how large the “corpus”, “database”, or “content tsunami” (the vocabulary available to the hardware) is and how fast the microprocessors are these days.

The problem is, instead of working with the meaning, which is how the human mind works, machines work with algorithms.

Algorithms are and always will be an insufficient substitute for that precious blueprint formed by billions of neurons and synapses firing seemingly at random in the human brain. This is a problem that can be solved only if somebody can design an algorithm that will replace the role that meaning plays in the human mind and in human languages.

The way I see it, it should be easier to bring the dead back to life, which is something that happens only in scary movies like Flatliners, and in a few books like the Bible or Mary Shelley’s Frankenstein.

Post-editing of machine translations could be used to improve texts that are relatively unimportant. But I don’t see a bright future for post-editing of machine translations of texts that nobody would pay any money to have written or translated in the first place, such as this, hopefully somewhat entertaining, but otherwise unimportant post. Why pay any money at all to translate something that is of no consequence?

Nor do I see a promising future for human post-editing of patents that were originally translated by machines.

So far I have been asked to perform this task twice: once about 12 years ago when MT was still kind of new, by a patent lawyer who spoke with authority and had a deep voice. He must have been a partner. I mentioned it in this article that I wrote back then if you want to read it (but it’s pretty long).

The second time was a couple of months ago. Some guy, a private individual who found my website, asked me for a quote for translating a long Japanese patent. When I sent him my quote for over two thousand dollars, he replied that he already had in his possession a machine translation of the document and that he would be willing to pay me 400 dollars for post-editing.

I declined both job offers, the first one politely, and the second one rather rudely. I don’t like people who try to play the bait-and-switch con game with me.

The thing is, the patents that I translate are kind of important. A lot of money is often at stake, which is why the translation must be as accurate as possible. Post-edited machine translations simply won’t do.

Patent lawyers use my translations to file new patent applications for their clients, or to argue about fine (and sometimes nonexistent) technical differences and details for months or years before one or the other corporation finally loses a lawsuit and has to pay royalties.

Most often, though, lawsuits involving patents with complicated technology have the potential of going on for such a long time and being so expensive that both parties find it easier and cheaper to settle out of court.

I am trying now to think of good examples of promising applications for post-editing of machine translations. Something that is not important enough to pay humans to translate it, but important enough to pay humans to try to edit mechanical attempts at translation.

Perhaps there are, perhaps there must be things like that out there.

But I can’t think of any.

Maybe you can.


Responses

  3. Hm… MT as flatlined translation? It’s a thought. I really do wonder about the pipe dreamers who waste so much time waiting and hoping for machines to add even more to the noise level of superfluous data splashing around us. Content tsunami? Anyone who thinks they can or should surf a tsunami like that is welcome to, but I find that shutting out the noise and being very selective about content gives me more real information.

    Bear in mind, Steve, that most of the huckster pushing MT are just trying to make a buck, like all those old COBOL programmers did with the Y2K scam before reality arrived at the Day of Doom and the world and its computers just kept on running. Now you’ve got people like D.W. making silly public statements about how one must “get on the MT boat or drown”. I’d soon drown in crappy metaphors, and once again, if there’s a tsunami, the last place I want to be is on a boat when it sweeps my way.

    Even if MT post-editing were viewed as a way of paying the bills, I would rather sell buns in the local bakery. I think any good writer would understand that what you read affects your mind and what you produce. Expose yourself to machine-generated spew for weeks on end and see where that puts the quality of your other work. If you are left with any.

    Life’s too short. I’ll take what we have today in the quality I need. That’s human quality. I suppose some of the TAUS gurus, the CSA crowd and others have days when they are so in love with the machines they would prefer a romantic evening with a vacuum cleaner to its human alternative. Not this guy, thank you.

  4. Now that last message from me with all its typos could definitely use some post-editing, but at least the basic rant is clear enough. We have so much we can improve with human factors. Why waste time on the rest?

  5. …. “most of the hucksters pushing MT are just trying to make a buck, like all those old COBOL programmers did with the Y2K scam before reality arrived at the Day of Doom and the world and its computers just kept on running. Now you’ve got people like D.W.” …

    Exactly, except that the only abbreviation that I am familiar with is MT. What are D.W., TAUS and CSA? (I’m afraid I am not very well informed about the latest trends in commercial translatology, mechanic-linguistic hucksterism and related issues.)

    “Even if MT post-editing were viewed as a way of paying the bills, I would rather sell buns in the local bakery.”

    Making a living as a post-editor of MT detritus … what a horrible way to die!

  6. I think that if we, as humans, ever teach computers how to understand language the way we do, the decline of the translating profession will be the least of our worries. 🙂

  7. I agree.

    See Karel Čapek’s play R.U.R. from 1920.

    http://en.wikipedia.org/wiki/R.U.R.

  8. Post-editors’ epigram at Tarragona, Spain:

    “Go tell the Translators, stranger passing by
    that here, against their law, we lie and die.”

    Click to access 2012_competence_pym.pdf

  9. Thanks for the link, Wenjer.

    “the translator’s function can be expected to shift to linguistic postediting, without requirements for extensive area knowledge and possibly with a reduced emphasis on foreign-language expertise”

    I started to read the paper but after about 4 pages I gave up.

    It’s too long and nowhere in the first few pages does the author even mention the word “meaning” and how MT technology is going to get around this minor problem.

    He is saying that a patent translator such as myself does not really need to know much about the languages and the subjects that he translates, thanks to the combined miracle of MT and TM.

    So for argument’s sake, since knowledge of the Chinese or Japanese language can be quantified, among other things, by how many characters the translator knows, would it be sufficient if this “post-editor” of patents and related technical documents that would presumably be translated by MT knew only:

    100 characters?

    300 characters?

    900 characters?

    Obviously not. He would need to know about 3,000 characters, have a solid understanding of the Japanese and Chinese grammar and of the meaning of the technical terms within a given context both in Japanese or Chinese and in English, and he would also need to have the range of skill sets that only experienced translators possess to be able to keep his clients happy.

    The same principle is applicable to less difficult languages such as German, French or Russian.

    A skilled post-editor of MT wielding awesome TM tools who does not have the skills identified by me above would be dumped by his clients very quickly because he would be identified very quickly as an incompetent impostor who is making too many mistakes.

    I don’t know what Anthony Pym, whoever he is, is smoking, but as far as I can tell, he has no idea what he’s talking about.

    • Steve, Anthony Pym is quite a guru in Translation Studies, a kind of translation theorist.

      To ease your reading pain, here comes a link to John Bunch’s comment on Pym’s essay, concluding encouragingly: “I also think that even if Pym’s model were to take over the entire market space, there still would be a role for the quality translator, as a project manager / post-editing coordinator / writer / subject matter expert. In short, translators, it is not the end of the world for us.”

      http://www.bunch-translate.com/2012/01/translator-vs-post-editor-vs-technical.html

      BTW, the post-editors’ epigram is a parody of Simonides’s elegiac couplet commemorating the 300 Spartan warriors who fell resisting the Persian invasion in the Battle of Thermopylae. I was thinking of the horrible way to die as a post-editor.

      • 1. I thought he must be some kind of Great Guru, like Reverend Jim Jones selling kool-aid to his People’s Temple followers.

        2. To think that I would not get the reference to Spartan warriors is insulting, but I will let it slide this time. 🙂

        3. I read John Bunch’s post. He often has interesting ideas, but he tends to get easily excited over nothing.

        He will probably outgrow this phase. Most young people usually do.

        4. If you carefully select which source texts will be translated by MT, presumably texts that are easily translatable and between language pairs that are quite similar such as Spanish and English, a good MT program, be it Google Translate or Microsoft Translator, can indeed do miracles (almost), and a post-editor indeed does not really need to be a translator, or at least somebody who can understand the subject in two languages.

        Teaching a course at a university in Spain is one thing, and the real world is something else altogether.

        After I quickly scanned the paper (since I could not find many things that I thought would be relevant to my work), my conclusion is that this particular course would be completely useless to people like me who have to deal with a very different environment in the real world of technical translation of patents and related documentation (office actions, legal briefs, rejection letters) from foreign languages to English.

        The skill set that real translators need to translate in the real world is not really that different from the skill set that St. Jerome was using sixteen hundred years ago when he was translating the Bible into Latin.

        The skill set relating to TM and MT tools is optional. Some translators use these tools, and some don’t. Whether they use them or not says absolutely nothing about the quality of their work.

      • “The skill set relating to TM and MT tools is optional. Some translators use these tools, and some don’t. Whether they use them or not says absolutely nothing about the quality of their work.”

        Yes, your conclusion makes sense. However, there is a variety of translations. Some need human quality while others do not. John Bunch posted his thoughts on translation quality a few hours ago (http://www.bunch-translate.com/2012/09/the-red-line-test-and-translation.html). I think he is realistic in this regard. The choice is after all ours. As Kevin says, “I would rather sell buns in the local bakery.”

  11. This is a fabulous little blog post, thank you! It summed up some things that have been brewing in my head for a while. I’ve written something inspired by it here:
    http://www.proz.com/forum/machine_translation_mt/232996-software_misnomers_and_how_to_use_mt.html

  12. You made very good points in your post.

    I wonder, how many Chinese characters does one need to know to be a competent translator?

    In Japanese it is between 2 and 3 thousand.

  13. Nice to read Phil’s comment.

    The Japanese dictionary 大漢和辞典 contains over 50,000 character entries and around 530,000 compound words. Yet you need to know only about 2,000 to 3,000 characters to be able to read the daily news and most of the books published in Japanese during the last 50 years.

    The characters contained in 大漢和辞典 are mostly from the Chinese dictionary 康熙字典, which was first compiled in 1716 and supplemented in 1827. This dictionary contains 47,035 character entries, plus 1,995 graphic variants, giving a total of 49,030 different characters. However, statistics on commonly used characters taken from publications between 1928 and 1988 show that only 6,763 were actually used. Of these, about 3,000 characters account for 99.9% of usage, and the rest have an accumulated frequency of less than 0.1%. Mao Zedong’s Anthology uses only 2,891 different characters in a text totaling 660,000 characters.

    So, you would need fewer than 3,000 Chinese characters to read and understand anything written in modern Chinese, provided you are familiar with Chinese syntax and semantics, plus some cultural knowledge.

  14. Hi Steve, thanks for another highly interesting post. You hit the nail on the head with “MT post-editing that can add real value is in fact retranslating.” Precisely.

    I’ll take a stab at Kevin’s abbreviations for you since those are still unexplained – CSA stands for Common Sense Advisory, TAUS is a self-proclaimed “innovation think tank and a platform for shared services, resources and research for the global translation industry” with a huge database for MT, and I’m guessing that D.W. will be Dion Wiggins of Asia Online.

  15. @ Wenjer

    Thank you very much. So it’s about the same number of characters as in Japanese.

    Which language would you say is likely to be harder for a native speaker of a European language to learn – Chinese or Japanese?

    @ Susan Starling

    Thank you very much for your comment and your valiant effort to try to expand my limited horizons when it comes to this kind of lingo.

  16. “Which language would you say is likely to be harder for a native speaker of a European language to learn – Chinese or Japanese?”

    Theoretically, Japanese. Since Japanese words and phrases (combinations of Kanji and Kana) in sentences are syntactically “tagged” (with endings of letters/Hiragana てにをは and so on to signal declensions or conjugations), it is easier to recognise their functions in sentences (patterns/structures).

    In contrast, Chinese words and phrases (pure combinations of characters) in sentences are not syntactically “tagged” (they have no endings to signal declensions and conjugations). Chinese sentence structures are rather semantic-driven constructs. It is, therefore, more difficult to recognise the functions of words and phrases in Chinese sentences. You have to get used to the semantic devices of meaning/pointing in order to recognise patterns. But once you are used to the semantic devices in Chinese (someone like Phil), the growth of Chinese language competence accelerates. In fact, Japanese sentence structures are partly semantic-driven, too (idiomatic expressions/phrases that act like elliptical sentences, for instance). I am sure you pick them up chunk-wise, because you are already familiar with such devices in Japanese and stop analysing when you meet a new idiomatic expression.

    Kids exposed to the environments of different languages acquire them without great difficulties. Unlike adults, kids don’t analyse patterns. They assimilate them (almost equally) in (greater) chunks. Because adults tend to break down patterns to the smallest elements and rules that govern the formation of expression patterns, it is theoretically easier for adult native speakers of European languages to learn Japanese.

    Listen to the following talk and you might come to some ideas for theorizing about how a human being acquires a language and grows in language competence, which definitely differs from how a machine (based on a statistical / stochastic model) does it, although we analyse intensively when we theorize.

    • Thank you.

      • Sorry, Steve. Your question was about “which one is harder” and my answer was for “which one is easier.” I hope nobody got confused.

      • OK, so I picked the easier one 37 years ago.

      • Perfectly right choice. I would have picked the same 41 years ago.

        It is a matter of course for an analytical mind to choose a grammatically better ordered language, then go over to a less ordered one.

        Since you have the basis for Japanese, it won’t be too difficult for you to go over to Chinese. You would recognise quite a few patterns in Chinese similar to Japanese. The one thing to watch for is false friends disguised in Kanji.

      • I’m afraid it is too late for me to start learning Chinese. What they say about old dogs and new tricks is mostly true.

        But I appreciate your vote of confidence in my quasi limitless capabilities.

      • Well, I just forget sometimes that we are getting old and there is an end, the limit, awaiting.

        “It’s never too late.” This must be a lie. 😀

  17. Being a PM in the “Translation Industry”, I find your opinions on the difficulties of post-editing MT to be fairly consistent with those of other translators. Experience would agree that MT is not a one-size-fits-all method for cutting costs, and we encourage our clients not to approach it that way. If it is to be used, proper consideration must be given to the technical nature of the material and the language pairs, among other factors. As you mentioned in your response to Wenjer Leuschel, using MT to translate from a European language to another European language is likely to be less daunting than going from European based to an Asian, Arabic, or Russian based language. Still, the methods employed have little to do with the end result of a quality translation; only the irreplaceable judgment of the [human] translator does. In an ideal world, MT is meant to be a timesaving tool… But again, if MT is not set up and used in the proper way and under the proper circumstances, it can be (as you point out) more of a hindrance than a help to the translator.

  19. To Robert Hunt: what’s a “Russian based language” ?

    • @G. Kamniskas

      Don’t be so hard on Robert.

      He’s just a “PM in translation industry”.

      Everybody knows that these people don’t know anything about foreign languages.

      • I’d still like to know what a “Russian based language” is. If it’s shorthand, then shorthand for what?

