Machine translations of Japanese patent applications have been available on the website of the Japan Patent Office (JPO) for about 10 years now. Recently, the World Intellectual Property Organization (WIPO) website added a machine translation (MT) function by incorporating Google Translate in the search function. Entire texts of applications can now be almost instantaneously translated between English, Spanish, Vietnamese, Hebrew, Portuguese, French, German, Japanese, Russian, Korean, Chinese and other languages, such as Czech. Ten years ago, I wrote an article about the MT function on the JPO website for Translation Journal. A post that I wrote a few months ago about the apparent threat of machine translation to human translators generated a lively discussion on this blog. I also wrote a post explaining how to use the machine translation tool available on the Japan Patent Office website, as well as a rather long post for this blog about Internet Resources for patent translators (such as the EPO, JPO, WIPO and DepatisNet websites), which is based on a chapter that I wrote for The Patent Translator’s Handbook published by the American Translators Association. I decided to test the new Google Translate function on a Patent Cooperation Treaty (PCT) patent application published in Japanese and write a post about it for my blog.
I started by searching the WIPO website for a common Japanese term, 記録制御手段 (kiroku seigyo shudan = recording control means), which I selected at random from a patent translation that I had finished the day before. Out of 26 patent applications displayed, I selected the first one: WO/2009/084116 (Recording Device, Portable Device, Recording Program, and Recording Method, filed by Fujitsu, Ltd.). I then translated paragraph 6 from Japanese to English using the Google Translate function, because this was the first paragraph with a somewhat long, meaningful description. Item A below is the original text in Japanese, Item B is my translation, and Item C is the machine translation obtained with the Google Translate function.
Item A – Original Text in Japanese
携帯装置に搭載されているＴＶ受信機能及びその記録機能を用いて放送番組を受信し、記録する場合には、移動中に放送受信や記録ができる利点があるものの、場所や時間によっては、電界状態や受信感度の影響を受け、記録画像の不鮮明や、録音劣化等の不都合がある。画像品質が悪い場合、その状況を表すメッセージを画面上に表示（特許文献１）しても、ユーザが記録を切望している場合には斯かるメッセージは無意味であるし、電界強度が低下した場合に記録を停止することは（特許文献２、特許文献３）、この場合もユーザが記録を切望している場合には、ユーザの期待を裏切ることになる。画像品質の良否、記録の要否は番組内容やユーザによって異なるものである。このため、記録停止や画像劣化を表すメッセージを自動表示したり、それを記録することは、記録の有効利用を損う等、ユーザの要請に沿っているとは言えない。
Item B – My Translation
Portable devices equipped with a TV reception function use this function and a recording function to record broadcast programs. Although the advantage of these devices is that programs can be received and recorded while the devices are being moved, the status of the electromagnetic field and the reception signal sensitivity can be influenced by the location or by the time, and the disadvantage is that the recorded image is in some cases not clear, or that the sound quality can deteriorate, etc. If the user really wants to record something when the image quality is poor even though messages indicating this status are displayed on the screen (Patent Reference 1), the messages may be meaningless. However, when recording is stopped if the intensity of the electric field is decreased (Patent Reference 2, Patent Reference 3), the user will be greatly disappointed if there is no recording. When the image quality is poor, whether the recording will or will not be required will depend on the content and on the user. Therefore, when recording is performed while messages are automatically displayed prompting to stop recording or indicating a poor image, the recording may no longer be useful since the recording is not necessarily performed according to what the user really wants.
Item C – MT Text Obtained with Google Translate
Is the machine for Mobile TV to receive broadcast program for recording function and its receive function, if recorded, despite the advantage of recording and broadcast reception during the move, the place and time, the field affected states and receive sensitivity, and smearing the image recording, the recording deterioration disadvantages. If image quality is poor, a message on the screen represent the situation (Patent Document 1) but then if you are keen to record the user message is meaningless Magical biggest cotton textile, electric field strength decreases If you stop the recording (Patent Document 2, Patent Document 3), if you are keen to record the user in this case, users will be disappointed. Quality of image quality, the necessity of recorded programs which vary depending on the content and user. Therefore, you can view messages automatically stops recording and representing the image degradation, to record it, and harm the effective use of records, along with the request that the user can not say.
If you read my translation first and then read the Google translation, you can more or less understand and follow the meaning of the Google MT product. Even if you don’t read my translation first, you would probably be able to understand most of the translation produced by the Google software. The MT function is very useful considering that the alternative for people who don’t know Japanese (which means most people on this planet, except those who happen to be Japanese or foreigners who have spent decades trying to learn the language) would be no information about the Japanese text at all. But I must say that the quality of the MT product is not very different from the result of the MT software that I used for a similar test on the JPO website 10 years ago (see my article for Translation Journal from July 2000). Although Google Translate uses a radically different, statistical approach to machine translation, the result is in my opinion not very different from that of other types of MT software, and in some cases it may be even worse than what one would expect from Systran-based machine translation tools such as Yahoo Babel Fish (see the next section).
The Magical Biggest Cotton Textile Mystery
I have no idea how this “magical biggest cotton textile” ended up in the Google translation. There is nothing even remotely similar to this wording in the Japanese text. I sometimes look at machine translations from the JPO website, for example if I want to make sure that I did not skip anything, which is a mistake that human translators will often make. I sometimes see hilarious bloopers in the MT product on the JPO website, but if I carefully read the Japanese text, I can always trace the origin of the nonsensical English formulation back to an unfortunate (or fortunate, if you appreciate the entertainment value) sly combination of Japanese characters. But not in this case. Is this “magical biggest cotton textile” a contamination that is specific to the statistical model? Can somebody enlighten me as to what might have happened here? I would really appreciate it.
Disclaimer – There Is No Such Thing As A Perfect Translation
My translation is only my interpretation of the original Japanese text. Other translators could translate the same text somewhat differently, and I could have translated it differently under different circumstances. For example, this morning I have had two large cups of coffee so far (French Roast purchased from my friendly local Food Lion supermarket, which is in fact a Belgian company, although I have yet to meet a Virginian or North Carolinian who actually knows that). With 3 cups, or with a different brand of coffee, the resulting translation could be a little different. But unlike Google’s MT product, celebrated frequently in newspapers as “the new tool that will eliminate the language barrier”, I do believe that my translation expresses what the author of the patent application wanted to say in Japanese.
Just about every article about machine translation ends with words of caution along the lines of “this product still needs improvement, some tweaking, more work”, etc. The companies selling MT can obviously never admit that machine translation will never break the ultimate barrier: the barrier of meaning. Unless you understand the meaning of words, you are merely replacing words with other words in another language according to some algorithm, not translating. I think that the statistical approach to machine translation, popularized by Google, is just another dead end. It may work very well for some applications, but as my simple test seems to indicate, it is not likely to put human translators out of business. In addition to machine translation, Google is also working on other new applications for artificial intelligence that humans have been dreaming about for a long time, such as a self-driving car. Based on this New York Times article, they have been quite successful in this area, although truth be told, I’ll believe it when I see it. I imagine that taxi drivers reading the article linked above experience feelings similar to those experienced by professional translators when they read enthusiastic descriptions of breakthroughs in machine translation.
I would love to be driven by my car instead of having to drive it. And I think it is likely to happen some day, perhaps even soon. But unlike self-driving cars, machine translation that is just as good as what a good human translator can do will, I think, not be available to us. And I don’t mean any time soon. I mean ever. Or at least not until somebody figures out how to teach computers the meaning of meaning, or what in fancy MT speak is sometimes referred to as disambiguation. If that ever happens and computers start understanding that they are merely our slaves, the computers just might decide at that point to get rid of humans. After all, who needs humans when computers understand the meaning of everything just fine?