Posted by: patenttranslator | April 1, 2016

A Brief Comparison of Machine Translation and Human Translation

Instead of pontificating about the dangers of relying on machine translation and arguing that post-processing of machine translation is a really idiotic idea and probably not the way to go, I am posting below three texts: the Japanese text of a leaflet dropped by the US Army on Japanese islands in August of 1945, a machine translation of the same Japanese text (lightning-fast with GoogleTranslate), and my own quick-and-dirty translation (as a sample of human translation, which did take a little bit longer to finish).

Otherwise I have no comment for the moment, although I am hoping that the readers of my blog might have a few.

1. US Army Leaflet in Japanese Urging the People of Japan to Surrender

日本國民に告ぐ!!

〝卽刻都市より退避せよ〟

このビラに書いてあることは最も大切なことでありますから良く注意して読んで下さい。

日本國民諸君は今や重大なる秋に直面してしまつたのである。

軍部首脳部の連中が三國共同宣言の十三ヶ條よりなる寛大なる條項を以て此の無益な戦争を止めるべく機會を與へられたのであるが軍部は是を無視した。

そのためにソ聯は日本に對して宣戦を布告したのである。

亦米國は今や何人もなし得なかつた恐しい原子爆弾を發明し之を使用するに至つた。之原子爆弾はたゞ一箇だけであの巨大なB-29二千機が一囬に投下 する爆弾に匹敵する。この恐るべき事實は諸君が廣島に唯一箇だけ投下された際、如何なる状態を惹起したかはそれを見れば判るはずである。

此の無益な戦争を長引かせてゐる軍事上の凡てをこの恐るべき原子爆弾を以て破壊する。米國はこの原子爆弾が多く使用されないうち諸君が此の戦争を止 めるよう天皇陛下に請願される事を望むものである。米國大統領は曩に諸君に對して述べた十三ヶ條よりなる寛大なる條項を速やかに承諾し、より良い平和を愛 好する新日本の建設をなすよう米國は慫慂するものである。

随つて日本國民諸君は直ちに武力抵抗を中止すべきである。

然らざれば米國は断乎この原子爆弾並に、其他凡ゆる優秀なる武器を使用しこの戦争を迅速且強制的に終結せしむるであらう。

〝即(ソツ)刻(コク)都(ト)市(シ)より退(タイ)避(ヒ)

2. Machine Translation (with GoogleTranslate)

!! Which it says to Japan Kokumin

“卽刻 case urban retreat than”

Carefully read well because it is that the most important thing is that you have written in this villa.

Japan Kokumin gentlemen is the was settlement in the face of now Naru serious fall.

Although guys of military leadership is Mikuni of Womotte generous Joko consisting of ten SankeJo of the joint declaration was spatula gave the Ki會 in order to stop the futile war of 此 military ignored the Shi.

Soviet Unicom for that is the declared war and unto Japan.

Also Beikuni is ItaruTsuta the fear Shii atomic bomb who has failed now obtained without any person to use this to 發明. This atomic bomb was Isuzu Ichi箇 only in that huge B-29 two thousand aircraft comparable to the bomb to drop on one 囬. This formidable KotoMinoru is when the gentlemen was dropped only only 箇 in Hiroshima, or elicited any state should be seen if you look at the it.

The all of the military on the Iru prolongs a futile war of 此 destroy Womotte this terrible atom bomb. Beikuni is wishing that the gentlemen is petition the 此 of war to stop Mel so the Emperor of this atomic bomb is not used a lot. Beikuni President accepted promptly generous jaw section consisting of thirteen KeJo described in unto the gentlemen in 曩, is Beikuni so as to form the Shin Nihon & Co. of construction to love good a better peace intended to encourage is there.

Sui Tsute Japan Kokumin gentlemen should be stopped armed resistance immediately.

If scolded Zare Beikuni is Danko in this atomic bomb parallel, using the excellent Naru weapons loose Yahoo Legend will Ara a Schiml to end this war quickly 且強 system basis.

“Soku (graduate) time (full-bodied) capital (g) City (city) than withdrawal (Thailand) avoid (human)

3. Human Translation of the Same Leaflet

People of Japan: Attention – Evacuate Your Cities Immediately –

The text written in this bill is very important: please read it carefully!

All of the people of Japan are now facing a Fateful Autumn.

Your military leaders were given the opportunity to stop this futile war based on the generous 13 conditions of a Joint Declaration of the Military Leadership of the Three Countries at War, but the military ignored them.

That is why the Soviet Union declared that it is at war with Japan.

Moreover, the United States has invented a frightful atomic bomb that has already been used. A single such atomic bomb has a destructive force equivalent to the bombs that 2,000 of the giant B-29 bombers can drop in one mission. You were witnesses to the destruction that a single bomb dropped on Hiroshima has caused.

All the military forces prolonging this futile war will be destroyed by these terrible atomic bombs. The United States is hoping that you will all petition your Emperor to end the war before many more such bombs are used. The president of the United States encourages you to approve immediately the generous thirteen conditions proposed previously in order to start building a new, peace-loving Japan.

Consequently, the Japanese people must stop armed resistance immediately.

Otherwise, the United States is determined to use these formidable weapons as well as other excellent weapons to force this war to a rapid conclusion.

 – EVACUATE YOUR CITIES IMMEDIATELY – 

 


Responses

  1. Though I don’t understand Japanese, it sounds like you just corrected the wrong English target text provided by machine translation. So, I think you played the role of an editor here. Is that right?

    Incidentally, my translation blog link (at start) is http://engpt.wordpress.com

  2. Thank you for your comment, Lionel.

    The point of my post was actually that it would take a human translator ten times as long to “correct” the machine translation as it would to translate it from scratch. That is why I used this “machine-translated” text to demonstrate that “post-editing” of machine translations is in fact retranslation.

    I will now check out your blog.

  3. Q.E.D.

    • To the young’uns: Q.E.D. is Latin for Quod Erat Demonstrandum.
      GoogleTranslate told me that it means “which can be shown”, but because I am a human translator, I happen to know that it means “Which was to be demonstrated.”

      • 🙂

  4. To my surprise, today I was able to make use of free MT in a very basic computer translation. However, I attribute my success today to the “Machine Translation” engine being trained with a large corpus of similar text data in the first place. “MT” in my opinion is nothing but glorified Translation Memory (TM), with a few tricks up its sleeve compared with Babelfish about 15 years ago. I believe the publicly available MT we see today is better called “fragment assembly”, as it doesn’t possess a shred of intelligence! Also, the fundamental mistakes committed by the MT engine for my main language combination, English > Norwegian, remain the same today as they were several years ago.

    And precisely because so-called MT is really just stupid fragment assembly cleverly marketed as machine “translation”, there is no way human translators will be replaced anytime soon.

    While Norwegian MT looks like fragment assembly, in the case of Japanese, the result is just a regurgitation of random dictionary lookups so atrocious as to make me weep. I would never accept an offer for Japanese > English post-editing – exactly like you say, it would be 10 times harder to work with than just retranslating the original. Instead of using MT, one might as well remove half the words from the original and require the translator to guess what the missing words are … actually that might be easier, or at least less soul-destroying.

    So, since MT is just glorified TM, a recent trend in the industry, in a desperate attempt to salvage hope, is to train MT engines to deal with very specific subject matters. It is no surprise, nor does it indicate human-like intelligence, that a computer can be fed thousands of pages of carefully aligned translation data, and can then to some limited degree perhaps replace human translators within a very limited domain in an amateurish manner (the kind of work that would probably be outsourced to $0.03-per-word bathroom-iPhone-translators anyway). But since, as you have written previously, Steve, the computer is not endowed with intelligence, I think leaving sensitive translations to an MT engine could very well bankrupt a business.

    I would say the primary application of TM disguised as “MT” today is as a kind of expanded dictionary lookup and for exploring the words and sentences recorded in the ‘engine’.

  5. Excellent analysis, IMHO, and I love your metaphors.

    But I’m afraid you forgot one crucial point.

    Since “the translation industry” keeps repeating how much money the market for machine translation is worth (it is more like a religious chant, you can almost hear the heavenly choir: $45,000,000,000 – GLORIA, GLORIA, IN EXCELSIS DEO!!!), although it is mostly just the post-processed machine translation, i.e. MT regurgitated by human translators so that it would make some sense, that is worth 45 billion dollars (or is it more? I am not sure now), this obsession will be around for a long, long time, until so many companies are bankrupted once the angel investors’ money is gone that nobody will be willing to invest in it anymore.

    And when that happens, machine translation will be free once again, because it does have its uses when it is free, although it is not worth 45 billion dollars.

  6. A wonderful example and argument for “hands off empty and brainless MT”. However, the sample was a bit creepy.

    • “However, the sample was a bit creepy”.

      That was the idea. I was looking for important communication that would end up not only being mistranslated, but butchered beyond recognition by a machine to emphasize the huge gap between machine pseudo-translation and human translation. The creepy, shocking value was a bonus. I’m glad it worked.

  7. A good exercise, but don’t those promoting PEMT claim that they’re not using GT but some über-sophisticated, cutting-edge, finely-tuned 22nd century version of machine translation?

    • Ha ha, but even if they do, the results are not likely to be much better. I should know, I actually own one of those “über-sophisticated, cutting-edge, finely-tuned 22nd century versions of machine translation”: Slate Desktop (http://slate.rocks/). Haven’t had much time to play around with it yet though (as I am always translating).

    • Some use GT, some use Microsoft Translator, and some use a specially trained MT doggie and claim that the results are much, much better. For the most part, MT is MT and what they say about it is mostly a lot of BS because you can’t grow it a brain no matter how hard you try.

      I have been contacted three times already (the last time yesterday) by an agency that uses Microsoft Translator and then has people who are signed up with it post-process the MT-translated garbage for ONE CENT A WORD. This agency absolutely loves the ATA. They say that they “have” seven thousand translators, and when I asked them how something like that is even possible, it turned out they just use the ATA database of translators and consider them “their translators” (that’s how they found me).

      The thing is, none of these approaches can work without having a human being who possesses a human brain and thus can reconstruct the entire translation. Parts of the MT pseudo-translation may be almost flawless, but if you have just one word wrong (for example, when the MT machine gives a paragraph a positive meaning because it overlooks “not” somewhere in the sentence), the whole thing will be nonsense again. And some of the time the pseudo-translation will be only slightly better than the example in my post.

      If you have a unique text, such as this government leaflet, or my silly posts, the result will always be a comical mistranslation because it is impossible to train your obedient little MT doggie to imitate a human brain.

      MT is really just a haphazard assembly of fragments as Eirik put it so eloquently, and MT post-processing is really just a clever but immoral fraud, aimed at wage theft by reclassifying human translators as “post-processors” and paying them a fraction of what they used to be paid.

      But this fraud can only work if human translators cooperate with the MT fraudsters.

      Will translators ultimately cooperate? I don’t know. Some will and some won’t. My hope is that most will be able to see through this fraud and avoid it altogether. That is why I am writing my silly posts about machine pseudo-translation.

  8. Hmm. Interesting, but you need to keep in mind that Google Translate is much, much better at certain language pairs, and certain subjects. Throw some Dutch>English at it, for example, and you will see a very different picture.

    I don’t do post-editing, and never will, but Google Translate + Microsoft Translator are two very useful tools in my toolkit (I use them via the plugins inside memoQ 2015), and it can sometimes be faster to edit their output than translate a segment/sentence from scratch. The operative word being “sometimes”. If it is, I edit the MT output, if it isn’t, I translate the segment from scratch.

    My take on post-editing is that translators should never do it for a translation agency (at a greatly reduced rate), but instead do it as part of their own workflow, if appropriate for the relevant sentence/segment (at their full rate). After all, what value is the translation agency adding, other than running your source text through Google Translate (or their own usually pretty terrible MT engine)?

    Michael

  9. Of course, GoogleTranslate works much better with Dutch because unlike Japanese or Czech, Dutch is very similar to English. It also works almost flawlessly when it can match an existing translation done by a human translator to another translation. Each of these engines has advantages and disadvantages, but none of them works the way translation agencies who see a gold mine in them say they do work.

    Machine translation engines, whether the free ones like GoogleTranslate or Microsoft Translator, or the customized, trainable ones that cost on the order of a few hundred dollars, are very, very useful. Even the comical translation of GT in my example in this post is very useful … but mostly just to a translator.

    If an imaginary English-only speaker living in Japan in 1945 were trying to make sense of the Japanese leaflet by using machine translation, he would have basically no idea what’s in it.

    So my opinion is that by all means, translators should be using machine translation. I am often using it myself. But they should use it for their own purposes and they should definitely not cooperate with translation agencies who are trying to turn them into post-processors.

    Translators who do so are digging their own grave.

  10. My opinion: machine translation (MT) is DANGEROUS as you can ONLY end up omitting to correct big mistakes from time to time. MT is an ABSOLUTE GUARANTEE that you will make HUGE MISTAKES SOONER OR LATER.

    As to MT plugins in CAT tools, they might seem to accelerate your work (but, as I said, introducing mistakes which you might not see), however it does not even compensate for the huge amount of time you lose by encoding your terminology in software programmes like SDL MultiTerm.

    I recommend that all translators translate using only MS Word coupled with its AutoCorrect macros (in the Tools menu), as a terminology database and typing accelerator – which is entirely FREE!

    At least, AutoCorrect cannot be used as a pretext by non-translating intermediaries (thus totally incompetent in matters concerning translation, although they tend to present themselves otherwise to OUR customers) to EXTORT huge amounts from university-trained linguists (for the most part) who have never been reputed for being extremely rich in the first place (never heard of a millionaire translator…).

    As to SDL’s supposed addition of AutoCorrect in Studio 2015:

    – it cannot save an infinite number of characters in the right-hand column, contrary to MS Word’s AutoCorrect (when you format the entry in the target language);

    – you need to click four (4) times to reach it because those non-translators have not understood that you need it several times per sentence, and not twice a month.

    In short, those CAT tools are only labyrinthine systems (“usine à gaz” in French, cf. https://fr.wiktionary.org/wiki/usine_%C3%A0_gaz), huge complicated systems (with manuals over 500 pages for SDL Trados Studio!) which actually SLOW DOWN a translator’s work, compared (!) with using MS WORD + AUTOCORRECT, which is free (when you get MS Office, of course) and does not decrease your already modest revenues by about 15%.

    All end-customers, translators, translation students and even intermediaries must be aware of this and STOP thinking that TECHNOLOGY can (almost) do it all, “speed up” things (MultiTerm does the opposite compared to MS Word’s AutoCorrect), while (!?!) “increasing quality” (MT plugins do the opposite!!) (it’s nonsense to pretend that speed will increase quality, in the first place: those non-translating intermediaries DO NOT HAVE THE FAINTEST IDEA OF WHAT THEY ARE TALKING ABOUT!).

    And if CAT tools increase quality, as they are supposed to do, they should entail SURCHARGES, not rebates!

    The PERVERT PSYCHOPATHS at the head of some of those large “LSPs”, as they inappropriately dare to call themselves (which makes us, translators, what: non-language service providers?…), have nothing to do in the translation sector!

    By competing on prices, large intermediaries are killing their own business, which proves they are CRAZY.

    A translation intermediary should only compete on QUALITY, which is what customers EXPECT.

    Those pervert psychopaths obviously take both translators AND customers for IDIOTS (which they sometimes are, but they are far from the majority, at least I hope so).

    Stop accepting to be INSULTED by the INSULTING BEHAVIOUR OF NON-TRANSLATING INTERMEDIARIES.

    Last but not least: non-translating intermediaries who dare to impose prices and tools upon translators can be requalified as EMPLOYERS and thus have to pay EMPLOYEES INDEMNITIES in some cases.

    So, whenever one of those intermediaries (translating or not) dares to impose on you the usage of a CAT tool, or dares to impose HIS price on you, just tell him that he is acting as an employer and can be sued for indemnities: IT WILL COOL OFF THEIR ARROGANCE IMMEDIATELY and WAKE THEM UP from DAYDREAMING that they are employers. 🙂

    (Sorry for the capital letters, but it is the only way to enhance anything in this format)

  11. Perhaps we should take a leaf out of the book of LSPs and start promoting the use of nomenclature that we (members of the profession) are comfortable with.

    I’ll open the bidding with: translation services procurement companies or LSPCs. Just to add a little clarity to LSP.
    Alternatively: translation intermediaries/brokers.

  12. That would be Language Services Procurement Companies then.

    It works, but for some reason I don’t like the word procurement, probably because I have seen it used mostly just in contracts.

    • I just want the juices to get flowing in the hope that someone with more imagination than I has a suggestion.

      • The name “translation intermediaries” is neutral AND puts them back to their place: they are intermediaries.

        The name “translation brokers” is mostly for large companies who do not have the faintest idea of what translation is all about.

        They have only understood that computer science coupled with the Web offer them opportunities to make money on the back of faceless Web slaves who are not paid per hour since they work from home (=> translators are paid per word, with computer-scientific pretexts to extort huge rebates from them, with no consideration at all for translators’ TIME, which previous rates were meant to cover, but “translation brokers” have not understood that either), translators who are faced with increased international competition, including from Third World countries, since it all happens on the Net nowadays.

        But translation brokers do not know what translation is all about.

        Just like some small agencies, for example agencies run by former lawyers with no training in translation & very little experience, who think they can translate (legal stuff, but also other things not law-related, preferably also into their foreign languages in order to increase business, because they do not understand the importance of translating into one’s mother tongue because they are not translation professionals) and who think they can also function as translation intermediaries, i.e. dump to others the difficult work (e.g. very specialised legal stuff: medical contracts, etc) while requesting “your best rate”, an expression they have seen on online job ads and which they think is normal to use (whereas it is rather a “Chindian” expression) and requesting rebates for repetitions (which is never done for legal translations because they are too touchy, too important for some segments to be left unread).

        So lawyers should beware of (mostly failed) lawyers who turn to translating without any training and, worse, also improvise themselves as “translation intermediaries”, or rather “translation brokers”.

        So there are all sorts of improvised intermediaries, also among small and even tiny entities with no experience and/or training in translation.

        The prospect of having huge websites translated into multiple languages by online slaves has produced a new breed: “translation brokers”, and their corollaries: producers of so-called “translation software”, which translators have never needed, as all the tools they need already existed before…

        Sometimes “translation brokers” and “translation software” producers infest the translation profession from within one and the same group of companies, like SDL.

  13. Says it all. People are still cleverer than computers – in McDonald’s yesterday with my grandchildren, there was a computer to tell people when their order was ready, which was completely out of sync with reality. They had a bright older lady looking at the system and telling people when their order was ready as well! Much more accurate and efficient…

  14. […] https://www.youtube.com/watch?v=qSx2HIi4dFg Instead of pontificating about the dangers of relying on machine translation and arguing that post-processing of machine translation is a really idiotic idea and probably not the way to go, I am posting …  […]

  15. The delights of machine translation, Japanese-English:

    “The police squad has been scaffold demos of the imago about the estraterrestrial body. An amusing utilization has emerged with FOVE equally a serviceable snip of computer technology to help ancients with personal limitations, whether for single-spacing, robots or piping instrumental music. A mortification inward component is incontestable with a nonreader transposition the clavier with simply his eyes.”

    http://hiteh.us/fove-eye-tracking-vr-headset-looks-to-marketplace-reality/

    highly enjoyable read.

  16. […] And every translator knows that machine translations are full of mistranslations. In other words, th… […]

  17. If you need a translation full of errors, then you can opt for machine translation. Culture can only be well handled by a human mind.

  18. […] The results of machine translation will vary, of course. The term “machine translation” is in fact a misnomer because what we are really getting from these and other machine translation programs are “machine pseudo-translations”, a jumble of words assembled and agglomerated based on algorithms, which means that the results are usually full of mistranslations. The result may in some cases even be an absurd and unusable text that seems to have been written by a madman, as I have described in many posts on my blog; for example in the post titled A Brief Comparison of Machine Translation and Human Translation. […]

  19. Thank you

  20. […] I tried to put machine translation to test a few months ago on a translation of a Japanese document which you can see it for yourself right here in this post to counter the constant drumbeat of machine translation propagandists, the result was pretty […]


