Posted by: patenttranslator | May 9, 2020

Human Translation in the Age of Machine Translation

I have been writing for a very long time about machine translation and how MT is likely or unlikely to influence human translation. It is one of the subjects that I have discussed endlessly with many commenters on my silly blog for more than ten years. But even before I published my first post on my blog, I had been writing about machine translation, among other subjects, for translators’ publications online and on paper since the early nineties, for example online in the Translation Journal, and on paper in the ATA Chronicle, the Translorial (publication of the Northern California Translators Association), the Gotham Translator (publication of the New York Circle of Translators), and other publications.

Some of my predictions turned out to be not exactly accurate, to put it mildly and gently. I definitely underestimated how quickly MT would improve after its pitiful beginnings, once it became widely available for commercial purposes some three decades ago.

MT is of course much better now than it was two or three decades ago. It is so good now that it has removed from my desk and the desks of other human translators a fairly large portion of the material that we used to translate in the BMT era (the era Before Machine Translation, which incidentally lasted several thousand years). Because of that, I don’t know whether I would even be able to pay my bills now from translation alone. Possibly not.

Fortunately, I am retired now. I was able to downsize, and my two pensions are more than sufficient to pay the bills, which are fewer and smaller now. But I still work, mostly just because I like to work, although also because I like the money, of course. So, what kind of work does this human translator do now, and why has all of this work not yet been swallowed completely by MT?

I still translate mostly patents, and there is still a considerable amount of patent translation work for which MT is and will be mostly useless for a very long time, definitely for a longer time than what is still left for me on this planet, I think. So why are some of my clients, mostly patent law firms, spending even now thousands of dollars for human translations of patent documents for which very good machine-translated versions are and have been available for free for decades? I don’t ask them, of course, I’m just glad that they still keep me busy.

There are several main reasons for the need for human translations of patent documents that I can think of; one group of them is related to the form in which the patents were published a relatively long time ago, by which I mean mostly legibility problems rendering MT unusable, while the other one is related mostly to the purpose for which a translation is to be used.

Legibility Problems

I sometimes receive very poorly legible Japanese patents or utility models for translation that are 30, 40, or 50 years old or even older. Back in the sixties and seventies, for example, Japanese utility models in particular were printed out by an applicant using a noisy dot matrix printer or later a fuzzy thermal printer and then faxed to the Japan Patent Office (JPO) to be filed. The legibility of the documents received at the JPO was good enough for the eyes of the Japanese employees, so they accepted the documents and later published them on the JPO website “sono mama” (as they were).

But because even the best MT package developed more than half a century later is completely useless when it is unable to read the fuzzy characters, these kinds of old documents sometimes still end up on my desk. Not even the best algorithm can figure out what an illegible blob in a series of Japanese characters is supposed to mean. I can’t really see the illegible character either, but after 33 years, this human translator simply knows, or thinks he knows, what it has to mean for the whole thing to make sense. There is still a big difference between a human brain and a machine’s algorithm and that will never change.

Problems with Unreliability of Machine Translations

The other kind of patent documents this human patent translator receives relatively often are recent or brand-new patent applications that every machine translation package would have no problem processing, but that still need to be processed by a human brain because of their purpose.

The clients sometimes even include with the document for translation a machine translation available for free on the websites of the JPO, EPO (European Patent Office), or WIPO (World Intellectual Property Organization), or on the official patent office websites of the respective countries.

I am not sure whether the clients send me the prior art documents for translation because they simply don’t trust MT, or whether legally they cannot hold their discussions of the minute but extremely important differences between the designs described in American, Japanese, or German patents on the basis of pseudo-documents created by machines. It is probably a mixture of both.

Although in these cases a “pretty good” machine translation is available to me as it is to my clients, it actually takes me significantly longer to translate these patents because I have to try to maintain consistency with the machine-translated text as much as possible. When I translate without an MT backup, I follow in my mind only two trains of thought: the original text and the text that I am creating in my head. When I need to compare these two trains of thought to an MT pseudo-document while trying to catch every mistake in it, it naturally slows me down. But it is interesting work anyway, although as I said, the translation usually takes a long time. Just because MT-generated text looks very, very good, it does not mean that the “translation” is actually accurate. Unless and until the text is “validated” by being processed through the brain of an experienced human translator, it cannot really be called a translation, which is why I call such “documents” pseudo-translations.

A special subcategory of patent documents that should never be translated with MT alone consists of patent applications originally filed in another language whose translation is used to file the application in English.

I never get these kinds of translations of Japanese patent applications, called translations for filing, as opposed to translations of patents for information or prior art research. I understand they are being done mostly in Japan. But some years I receive, in addition to translations of patent applications for prior art research, quite a few requests for translations for filing of patent applications from German.

Just after I had filed for retirement during a slow period two and a half years ago, a new client found out about my services and I was suddenly swamped with translations of German patents for filing that I was receiving from a law firm for close to a year in a field that I particularly enjoy. Had I known that this would happen, I would have waited a little bit longer to further increase my retirement income. But unfortunately, I had no idea.

It would be very dangerous to use MT for translations that are used for filing, foolish even, because mistakes generated by a machine in conjunction with an algorithm could eventually prove very costly to the owner of the patent rights. I don’t think many patent law firms would dare to use MT for filing the text of a patent in English in the United States or in Europe, but how do I know what is happening in the mad universe of machine and human translation these days?

I am just a lowly peon who has been translating patents for profit and for fun for over 33 years, and I consider myself very fortunate that nowadays the for-fun part is even more important to me than the profit.

The Water Keeps on Flowing, a folk song from Slovakia about two former lovers.

Responses

  1. One strange development is that there are some agencies now not only demanding that translators don’t use machine translation, but promising the clients that they won’t, and even a new one on me, which is demanding you use a proprietary interface specifically designed to prevent you from using MT. The reason for this is obviously that the customers feel cheated if they find the translator has used Google, but this is not at all logical, as the main purpose of paying a human translator nowadays is so that the client gets a guaranteed translation. In many, many cases there is no advantage whatsoever in having a human translation. There is, however, an advantage in having someone who knows both languages reading both texts and checking them. So agencies have tried all kinds of ways to find someone other than a translator to do that. One well-known agency now offers ‘light post-editing’ at a reduced rate. I asked them if this meant they wanted mistakes put in the translations on purpose, but I don’t think they understood the question. I suppose these are business decisions made by people who are not themselves experienced translators.


  2. I am so glad I don’t work for agencies. Most of them really don’t understand the first thing about translation, other than you need to buy it low and sell it high.

    I can hardly afford to ignore machine translations that the clients use themselves and send to me, because I know that they will be discussing the patents with their opposite numbers based on these machine translations.

    “Light post-editing” means no doubt that you do it quickly and therefore it is cheap. I agree that the agency does not or does not want to understand that this way the mistakes will not be corrected.

    But who cares, the main thing is it’s cheap!


  3. One (former) agency client of mine came up with a new version of the editing process, not necessarily MT post-editing: the one-hour quick edit of a translation of any length.

    Needless to say, the agency was entirely dismissive of the multiple objections I had to this new approach: an editor hardly ever has the time to get his or her teeth into the terminological issues of a translation the way the original translator does, and might needlessly introduce inconsistencies into a good translation while being unable to rescue a really bad one. And spending a strict limit of one hour on any translation of any length and quality is really equivalent to sending a poor guy to check out a new restaurant for half an hour and having him write a review after just having glanced at the menu, possibly without even having ordered an hors d’oeuvre.

    Or just judging a book by its cover, really.

    But to them, the whole idea was… well… just fine and dandy.

    Which really just goes to show that the only thing they are interested in is cutting costs any way they can while keeping clients under the impression that they actually do some sort of QA, however nonsensical and counterproductive this might actually be.


  4. As long as the product is cheaper than an actual translation, they will be able to sell it, I think.

    “There’s a sucker born every minute.”


  5. I can’t comment on Japanese utility models in particular, because my experience is mostly with patents, but – the JPO would not accept dot-matrix (or thermal, because it’s also dot-matrix) printed documents though they would accept hand-written documents; they required formed characters because the resolution of the dot-matrix printing technology was considered too low. So, visiting Japanese patent law firms in the early 1980s, I would see them using computer-based accounting and docketing, and Roman alphabet (English-language) word processing, but there were Japanese typewriters being used for typing documents for filing with the JPO. Laser printers, with their much higher resolution, overcame that problem, and formed characters were no longer required; and now of course the filing systems are purely electronic – unlike the USPTO, which still relies on image (pdf) filing.
    As for translation for filing, there are technically two types, though one is mostly obsolete. The first, older type is the generation of a, let’s say, English text from a Japanese priority application, for filing in the USPTO. There is not a requirement that the text be a literal translation of the priority document; what’s required is that the content of the English text be traceable back to the Japanese original, to get the benefit of the priority claim. This is the kind of translation that has been filed ever since the Paris Convention came into force in the late 1800s – and, as I say, it’s dying out. Most filers now file a Patent Cooperation Treaty (PCT) application to start their foreign filing process, if for no other reason than it defers a lot of the costs for an extra 18 months, allowing you also to cut your losses inexpensively if the application appears unpatentable or the described invention irrelevant. So our Japanese filer files a PCT application in Japanese, and then nationalizes it. That’s when the second type of translation comes in – the PCT application must be translated into the languages of the various patent offices in which it will be prosecuted, English for the USPTO. And that translation must be essentially literal; elegant variation is improper. The same is true in the rare instance in which a Japanese-language application is filed in the USPTO, which is OK as long as you supply a translation later, but again the translation must be literal. In practice there’s not a big difference between them; and, as I say, the first kind is dying out with increasing use of PCT.


    • I see. I thought that the illegible blobs were generated also by dot matrix printers because they were still widely used in the eighties, but once (or twice, or three times) the document has been faxed, who can tell?

      The fact is that the legibility of the old documents, and utility models in particular, still available on the JPO website is simply horrible. Oh, yes, and I remember some handwritten Japanese patents I had the pleasure to translate too.

      Those were the days.

      And thanks for the explanation of the PCT system. Those are my favorite patents to translate because I can find in them references to other documents whenever I am looking for something.


      • I think the problems you’re talking of are mostly fax issues – I remember seeing and working on Japanese documents that had been faxed a couple of times and they were awful. [An aside, I was working on a Japanese divisional filing in the early 1980s where I had a definite view of how I wanted the claims to read in Japanese, and so I had to handwrite the claims in large letters to fax them, to make sure they didn’t turn blobby: there was no way at the time to send Japanese text by telex. Now this is all trivial.]
        I suspect there were more handwritten utility models than patents because UMs (called “petty patents” in some countries) were used for more transient, lower level innovations; so less effort put into the applications. You could file handwritten documents in the USPTO when I started practice, which dates me; I can’t remember when the rules changed to require typed/printed text.


  6. Great post as usual. I will share it on my LinkedIn group: Translation agencies – good, bad and cheap


  7. Very insightful. Can I translate it into my language and share it on a social media platform (with a copyright statement for you, absolutely)? @patenttranslator If you don’t reply to me, I’ll know it is not allowed, and I will not translate and repost it but just share the blog site instead. 🙂


  8. Yes, you have my permission to both translate and share, and please send me a link to the translation.


    • Thank you! I’ll get back to you next week.


    • Hi Patenttranslator,
            This is the link to the translation. https://mp.weixin.qq.com/s/42H6zIabPntoT3Suu83bVA

            If it can’t work, see the picture below. It looks like this.

           Thank you!

      Best wishes


  9. That was a lot of work! I appreciate it, thank you!


