I have been reading, on Facebook and in other equally trustworthy venues, numerous press releases about the incredible progress that has been achieved with language automation tools. The press releases are full of cool, impressive-sounding terms celebrating new integration capabilities (with source code), mobile applications for an improved customer interface, integration capabilities enabling customers to promptly deliver product experiences to global users, highly extensible platforms completely automating the translation process, automatic content detection, etc.
All of this new, highly innovative technology is then, in the final stage, integrated with cloud technology, which to me means that invisible beings residing in clouds are in charge of what seems to be the least important part of the process, namely the translating bit.
These invisible beings are probably not translators, and maybe not even real people, since only angels can reside in clouds, at least according to the teachings of the Catholic religion. How many translators can be instantly translating new content in the cloud is a question that should prove no less interesting to consider and analyze than the question of how many angels can dance on the tip of a pin.
Mad Patent Translator also proudly uses various language and technology tools that greatly facilitate the translating process. But this technology has nothing to do with the impressive terms thrown around with abandon in the modern version of the “Translation Industry”. At this point, I wonder whether it would be more appropriate to call it something else and to refer to a certain segment of the highly propagandized and highly automated “Translation Industry” as, for instance, the “Language Conversion Industry”. The word “translation” is not really mentioned much in the press releases, and translators are never mentioned in them either, at least not as real human beings.
The bombastic propaganda may be working: a few days ago I received a Price Quote Request through a link on my website in which a paralegal from a patent law firm wondered how much it would cost “to convert a Japanese patent application to English”.
So let’s consider how this translator is applying cool language tools and translation technology to his own work.
This week I am translating, among other things, 5 Japanese patent applications ranging in length from 2 to 19 pages. The term “page”, however, can be somewhat misleading, because older Japanese patent applications have 4 small pages on one page when printed on paper, for a total of between about 800 and about 1,200 words in English translation.
Here is my first step in the application of cutting-edge technology: as an intrepid early pioneer of innovation in the field of translation technology, I discovered more than 20 years ago that the entire translating process is greatly facilitated, and its quality enhanced, when the tiny Japanese characters, which somehow must be squished into 4 minuscule pages of the Japanese A4 page format, are enlarged on a copy machine, preferably at a ratio of 1:1.5.
The enlarged source text then fits perfectly on a second translation technology tool that I have been using for a long time, called a document holder or paper stand: an inexpensive, trusty tool that further improves my translating experience. I have been using this tool with great success for more than three decades.
The third, more recent tool of advanced technology that I now use frequently when I translate patents is machine translation. But one must use it with caution.
Some translators are now beginning to refer to machine translation as machine pseudo-translation, and with good reason. Machine translation can be very useful for translation (or pseudo-translation) of patents in languages such as German, Russian, or French, provided that the sentences are not too long and that the software can, for example, correctly match the right verb, hiding at the end of a long sentence in German, with the right object or subject. It does not work nearly as well with Japanese.
I will now attempt to demonstrate my claim with something that I was translating today. Here is a very simple sentence from a Japanese patent application claim:
“以上の工程を 含む半 導 体装 置の 型 造 方法であり 、 酸化等の 然処理によるゲート 電 極 配 線の表質の問 題がなくなり安定し た半 導体装 置を 提 供 で き る”.
which says something like this:
“A method for manufacturing a semiconductor device including the stages mentioned above, which makes it possible to provide a stable semiconductor device, free of electrode gate wiring problems due to thermal processing with oxidizing, etc”.
but which was translated by GoogleTranslate as follows:
“Ri Oh semi-conductive KaradaSo location of the type production method, including the above steps, a deer semi-conductor equipment table quality problems of the gate electrode wiring that I have such clauses were Na Ri stability in processes such as oxidation ∎ You can in the provision”
GoogleTranslate and other machine translation programs will generally do a much better job than what we see above, especially with European languages. (Except when they don’t, of course.)
But one big problem with translating patents is that many older patent applications exist only as a scanned PDF file that must first be converted to machine-readable text. The conversion in itself is not a problem, and many software packages can be used for this purpose. But because the software will invariably misread some of the characters in Japanese or other languages when older documents are involved, erroneous characters are introduced into the converted file, and the machine translation software can then no longer interpret the file in a way that makes sense on any level.
Even when the conversion from PDF to a digital file is perfect, as is the case in the two lines of Japanese text above, a closer look at these two lines will show that the spacing between the characters is not perfectly uniform. This is not a problem for human eyes, but it is a huge, perhaps insoluble problem for a scanner. A small irregularity (a lack of perfectly uniform spacing), combined with the fact that there are no spaces between Japanese words (Japanesetextiswrittenlikethis), will thus result in a completely useless machine translation, such as the pseudo-translation above.
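The spacing irregularity, at least, is mechanical enough to sketch in code. What follows is a minimal Python sketch, assuming that the stray gaps are ordinary spaces, tabs, or full-width spaces; the function name `strip_cjk_gaps` and the chosen character ranges are my own illustrative assumptions and do not come from any particular conversion package. It removes whitespace only where it sits between two Japanese characters, and it does nothing at all about characters that the software has misread.

```python
import re

# Assumed character ranges: CJK punctuation, hiragana, katakana,
# common kanji, and full-width forms. Rare kanji outside these
# ranges are not covered by this sketch.
_CJK = r"\u3001-\u303f\u3040-\u30ff\u4e00-\u9fff\uff00-\uffef"

# Whitespace that has a Japanese character on both sides.
_GAP = re.compile(rf"(?<=[{_CJK}])[ \t\u3000]+(?=[{_CJK}])")


def strip_cjk_gaps(text: str) -> str:
    """Remove stray whitespace between Japanese characters.

    Spaces next to Latin letters or digits are left alone, so
    embedded English terms keep their word boundaries.
    """
    return _GAP.sub("", text)


if __name__ == "__main__":
    # A fragment with irregular gaps, similar to converted patent text.
    print(strip_cjk_gaps("半 導 体装 置"))
```

Run on a fragment such as “半 導 体装 置”, the function returns “半導体装置”. Whether cleaned-up spacing alone would rescue the machine translation is a separate question, since the misread characters remain in the text.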
The translation agencies that describe, in almost adulatory language, the nifty language technology tools they are trying to sell to new customers live in a universe that seems to have nothing in common with the real world, in which translators must translate real documents in such a way that the goal of the translation is met, or at least so that the documents make sense in another language.
They have created a special world for a new kind of “translation industry”, or language conversion industry, a world in which “enterprise-grade translation management platform is integrated with the version control systems developers use to manage their product strings, including Git, Mercurial, Subversion and CVS, optimizes product internationalization and accelerates product release cycles, allowing companies to increase user engagement and satisfaction by providing a localized web, desktop or mobile app experience”.
This fabulous new world has almost nothing to do with translation, or maybe a little bit, since in the end the cloud workers (also referred to as clown workers), whoever they are and wherever they may be hiding, must ultimately be unleashed to “translate the corpus” from one language to another, or probably to many other languages (to the extent permitted by the new technology).
Personally, if I were running an innovative language conversion enterprise, I would make sure to specialize only in translation into languages that my customers do not understand.