Posted by: patenttranslator | June 20, 2014

A Few Amazing Similarities Between Lapis Philosophorum, Rumpelstiltskin and Magical Tools of “Translation Technology”


People have been hoping to find a quick and simple solution to all of their problems in technology for many centuries. Rudolf II, The Holy Roman Emperor (1552 – 1612), also called “The Mad Alchemist”, had his own alchemist’s lab and was a patron of a number of alchemists at his court in Prague some four centuries ago. He was convinced that it had to be possible to create a magical substance called “Philosopher’s Stone” (“Lapis Philosophorum”), a substance that would “transmute” base metals (such as lead) into noble metals (gold) and that it was also possible to develop a magical “Elixir of Life” which would confer on him and those who could afford it almost eternal youth and longevity.

Personally, I think that spending a lot of money on similarly insane undertakings in the realm of science makes much more sense than spending trillions on bombing and invading foreign countries, because something good may eventually come out of these crazy projects since alchemy eventually gave birth to what is now called chemistry.

But that’s just me. My government obviously disagrees with me.

Tourists from several continents still flock to a tiny street behind St. Vitus Cathedral (chrám svatého Víta) in Prague called “The Golden Lane” (“Zlatá ulička”) to take pictures, with their tiny smart phones and huge tablets, of the tiny houses that were built for the intrepid alchemists (people were much smaller four centuries ago).

The almost religious faith in the magic of technology is evident also in modern “translation technology”.

People have been trying to discover a way to translate human thoughts expressed in words with machine translation (MT) since about 1945. Although MT has become a very powerful tool, especially compared to the situation 70, 50, or even 10 years ago, MT still only translates words, while it has no clue (and never will) about the human thoughts behind these words.

Google Translate is the only MT tool that kind of makes sense, some of the time, sometimes even most of the time, but there is a reason why it seems to be better than other approaches to machine translation. Google Translate is not based on trying to analyze with machines and software human thoughts in one language and then “transmute” them into another language based on rules expressed with algorithms. Instead, it attempts to instantaneously locate a translation (originally created by a human translator) of the same or similar human thought into the desired language. It is a clever approach, but even this approach can only work, to a limited extent, for texts that have already been translated.
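The retrieval idea described in this paragraph can be sketched in a few lines. This is a toy illustration only, not how Google Translate is actually implemented: the tiny `memory` of sentence pairs, the function name `lookup_translation`, and the 0.8 similarity threshold are all assumptions made up for the example.

```python
from difflib import SequenceMatcher

# Hypothetical store of sentences already translated by humans (German -> English).
memory = {
    "Die Erfindung betrifft ein Verfahren.": "The invention relates to a method.",
    "Die Erfindung betrifft eine Vorrichtung.": "The invention relates to a device.",
}

def lookup_translation(source, memory, threshold=0.8):
    """Return (translation, similarity) for the closest stored sentence.

    If no stored sentence is similar enough, return (None, best_similarity).
    The threshold is an arbitrary, illustrative value.
    """
    best_src, best_score = None, 0.0
    for stored in memory:
        # Character-level similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, source, stored).ratio()
        if score > best_score:
            best_src, best_score = stored, score
    if best_src is not None and best_score >= threshold:
        return memory[best_src], best_score
    return None, best_score

# A sentence close to one that was already translated finds its human translation.
print(lookup_translation("Die Erfindung betrifft ein Verfahren", memory))
# A sentence nobody has translated before gets nothing back.
print(lookup_translation("Heute regnet es in Prag.", memory))
```

In this sketch, a sentence that no human has translated before falls below the threshold and returns nothing, which is the limitation the paragraph points to.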

Several other aspects of “translation technology” are also hailed as magical tools that are going to revolutionize human translation.

Although speech recognition has been around for several decades now, relatively few translators are in fact using it. I thought of using it myself, but since I translate mostly Japanese and German patents, I realized that it would probably not work for me.

Every translator has an idiosyncratic method and every translator’s method is different from what somebody else is using. When I translate a Japanese patent, I need to visually keep jumping from the beginning to the end of long sentences to locate a verb that is likely to be hiding there, and then I need to jump back to the beginning and insert the verb where it would belong in English, while making sure that this is the correct placement based on the meaning of the text. It is obviously very easy to misplace the verb when a long sentence contains several verbs, which would result in a mistranslation. I also have to keep in mind that the subject is often not expressed at all in a Japanese sentence and that it can often be found several sentences prior to the present one, sometimes on a previous page. For these reasons, I don’t think that speech recognition software such as “Dragon NaturallySpeaking” would work for my purposes. And since it costs 800 dollars, I never gave it a try (and never will, unless they lower the cost to 80 dollars).

There are also other quasi magical and more recent tools, including computer assisted translation tools (CATs), crowdsourcing of translations, which means throwing texts into clouds where thousands of invisible humans (who are usually not really translators) will be eagerly translating texts from a foreign language (for peanuts or for free), or human-machine localization of translations, which is a concept in which humans (who again are usually not really translators), “edit” machine translations so that they would finally start making sense.

I think that CATs will probably be around for a while, maybe for a long time, because unlike the other tools mentioned above, these tools are in fact useful to translators, albeit only to some translators and only for certain types of translations. I don’t use them, and probably never will, for reasons that I have tried to explain in many posts on my silly blog.

But the more I am trying to explain why I find CATs useless for patent translation, and in particular for the way I translate patents, the more I am being attacked as a luddite and ignoramus by CAT true believers who simply refuse to conceive of the possibility that some translators may choose not to partake of their beloved toolbox.

I have noticed that all of these new and wonderful tools have one thing in common and that they are sold to translators in the same way: If you, dear translator, start using our Dragon, our CATs, or finally learn how to quickly “edit” MT for us, this will make it possible to increase the efficiency of your primitive human translations, which means that you will make much more money because instead of translating two or three thousand words per day, you will be able to easily translate TEN THOUSAND WORDS A DAY!

The problem is, recent history has shown that instead of making much more money, many translators are making less money if they are using these tools, especially if they work for translation agencies who started requesting obligatory discounts for “repeated words” and paying full rate only for “new words”, based on clearly insane concepts called “full matches” and “fuzzy matches”.
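As a rough illustration of the arithmetic behind such discounts, here is a minimal sketch of how a discount grid might be applied to a job. The match categories follow the terms above, but the pay percentages, the per-word rate, and the function names `payable_words` and `job_fee` are assumptions invented for the example, not any agency's actual terms.

```python
# Hypothetical discount grid: fraction of the full per-word rate paid for
# each match category. These percentages are illustrative assumptions.
DISCOUNT_GRID = {
    "new": 1.00,    # "new words": no match in the translation memory
    "fuzzy": 0.50,  # "fuzzy matches": partial match
    "full": 0.25,   # "full matches": exact match or repetition
}

def payable_words(word_counts, grid=DISCOUNT_GRID):
    """Return the weighted word count the translator is actually paid for."""
    return sum(count * grid[category] for category, count in word_counts.items())

def job_fee(word_counts, rate_per_word, grid=DISCOUNT_GRID):
    """Return the fee for a job at the given full per-word rate."""
    return payable_words(word_counts, grid) * rate_per_word

# A 10,000-word job as broken down by a (hypothetical) CAT tool analysis.
job = {"new": 4000, "fuzzy": 3000, "full": 3000}

print("words translated:", sum(job.values()))
print("words paid for:  ", payable_words(job))
print("fee at 0.10/word:", round(job_fee(job, 0.10), 2))
```

Under these assumed percentages, the 10,000 words delivered are paid as 6,250 words, so a large part of the promised productivity gain goes to whoever sets the grid.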

You can’t transmute lead into gold, no matter what kind of “Philosopher’s Stone” technology you are trying to use, and if you triple your daily output of translated words with your magical Dragon or your magical CAT, the chances are that some of the words will be incorrect, since the human brain is not really designed to catch in a single day all of the potential mistakes that may be hidden in a huge chunk of text containing ten thousand words.

Old-fashioned work that is based on a high level of concentration, combined with human knowledge that is based on education and many years of experience is still and always will be the best approach to translation, and that is what I intend to keep offering to my clients.

Which is not to say that it is not possible to turn straw into gold with all of the new and terminally cool “translation technology” tools.

All of these tools are now surrounded by so much hype and so many lies that we are entering Rumpelstiltskin territory.

The translations created with the help of these tools will not be exactly pure gold, and if wealth is created by spinning straw into gold with these totally cool technologies, based on recent history, it is clear that translators will be paid with straw, and translation agencies who can force translators to use their “preferred tools” will get to keep the real gold.




  1. You should try Dragon, Steve. It’s $175 USD according to their website, but I believe there is a free trial available. You don’t need the professional version; the premium version works great.

    I also translate Japanese to English – not patents, but often contracts or government documents with loooong sentences, the kind where you just get the feeling that the Japanese author got lost about halfway through the sentence. You seem to think that Dragon is not good for these types of sentences, but actually these are the BEST types of text for Dragon in my experience. Dragon will wait for you if you stop to think or look up a word in the middle of a sentence, so you don’t have to speak as fluidly as you might imagine.

    Actually, where Dragon doesn’t shine so much is stuff with a lot of broken-up text or very short phrases like charts, single words, etc.

    There are of course pluses and minuses to the software just like everything else. For example, if you don’t pay close attention to what Dragon is writing down you might let it mistakenly write “an” instead of “and” or something. But one nice thing is that commonly misspelled words or tough words to spell are no problem for it, assuming it correctly recognized what you said in the first place.

    One really nice thing I’ve noticed with Dragon (aside from staving off RSI and back pain from posture-related issues) is that my sentences actually end up sounding a little more natural some of the time. I think it’s because I am actually speaking the translation out loud, and in doing so it allows me to catch little nuances that don’t jump out at me just from the written page. It’s similar to how reading a finished translation aloud can help you catch awkward or confusing passages.

    I fully agree with everything you say about MT, though. It’s 100% worthless for any serious J->E translating (even gisting is almost impossible), and I highly doubt that situation will change anytime soon.


  2. Thanks for your comment, Orrin.

    But if Dragon is so good (and I know another J to E translator who has been using it too) and relatively cheap (when I searched for it, I saw that the full price was 800 dollars), why is it that relatively few translators are using it, given that it has been available for so many years?


    • Good question. For one thing, you do need a quiet room. I imagine some people are translating side by side with their kids, dogs, babies, etc. That doesn’t work so great with Dragon.

      Another factor is cost. $175 isn’t terribly expensive, but it’s not free, either. You also need a headset w/mic to really use it effectively. My headset is some no-name brand that cost 20 bucks, though, and it works spectacularly.

      You know, maybe it’s just one of those “old habits die hard” things. Sometimes we get a little set in our ways and it takes some force to break us out of that inertia. For me, that needed push came in the form of arm and wrist pain related to RSI. I couldn’t type, but I had to work, so I gave Dragon a whirl. I was extremely skeptical the first time I tried it. I thought, “There’s no way that it can catch what I’m saying with any decent accuracy and speed.” But it does, and it’s really quite impressive how well it works.

      Of course, things like proper names, invented words, chemical names, place names etc. have to be trained manually (there’s an option to type in a word and record your pronunciation of it).

      If we are being perfectly honest then I have to say that of all the software on my computer, Dragon Naturally Speaking probably increases my translation productivity the most, byte-for-byte. Probably more than Trados 2011, even. But Dragon can even be used with most CAT tools, anyway, so they aren’t mutually exclusive.

      But don’t take my word for it, try it out if you get a chance. I think that you will be pleasantly surprised at how well it works. Of course, whether you can happily integrate it with your established workflow is something that only you can decide, but it’s definitely worth trying out if you can trial it for free.


  3. “You know, maybe it’s just one of those “old habits die hard” things.”

    That’s probably it.

    As far as I can tell, inertia is the most powerful force in the universe.


  4. “It is a clever approach, but even this approach can only work, to a limited extent, for texts that have already been translated.”

    This is not correct. You might want to read up on how Google Translate works.

    Computers keep getting more powerful every year and Google representatives attend Japanese conferences to buy bilingual papers from researchers, thereby increasing statistical accuracy.


    • Not really, and the processing power of the computer is a minor issue.
      SMT will always be limited by the quality of data put into it. The attempts to “align” the internet fall short on one major aspect: Not everything on the internet is accurate and/or of high quality even in a single language.

      The added data extrapolation layer that Google adds to “guess” what is an appropriate sentence structure for a sentence it doesn’t already have in the huge databases used by the system is an interesting approach, but it also has its limitations.

      I can go on and on about it, but for sake of brevity here are two major points, and two minor ones:
      1) Machine Translation (or as some are now starting to call it – after driving the reputation of the MT moniker into the ground – Computer Generated Translation) has in fact nothing to do with translation. It doesn’t have anything to do with the process of translation. It is a language transformation algorithm, and language transformation and translation are completely different things.

      2) Translation is not a big data problem. Translation is all about language and communication. Treating it as a big data problem is again completely missing the essence of the translation process, the skills and expertise required to perform it, and its bigger picture function in human communication.

      3) Google Translate and the like are generic raw MT engines. There are other MT engines that can be built and customized for a specific client in a specific domain – provided there is already enough material available and that it doesn’t change much – that can yield better results. They are best used for recycling repetitive content and/or low priority, short shelf-life content that needs to be translated very quickly and that no one really cares about too much.

      4) MT is a tool, just a tool, like every other technology. A tool that one can choose to use (for discovery purposes for example) or not. The few honest proponents of this technology have no illusions about the role of MT and they are very forthcoming with its benefits, shortcomings, and proper use cases.
      It is the opportunistic charlatans who claim that the technology is superior to the human expertise and experience that are misleading everyone who is foolish enough to buy into their claims.

  5. The statement that “MT has, in fact, nothing to do with translation” is like saying “Watson’s computation has, in fact, nothing to do with getting Jeopardy questions correct.”

    In 2011, Steve put up part of an email from a German to English translator where the guy said that a company that he had translated for no longer needed him because they had shifted to MT and editing. Why did that happen in 2011 and not 2001? Increasing computer power and software. More powerful computers are needed to go through the many recursions in a timely manner. It also looks like GT is going the rules based/statistical route soon, which also requires increasingly powerful computers.

    I don’t know which charlatans you are referring to. In the case of the German company that no longer hired translators to translate documents, it apparently concluded that in 2011 using MT and editors made more sense than using translators.


    • Computers can be built and programmed to perform certain tasks that mimic AI. No argument here. Yet, this is a proof-of-concept and PR in essence, nothing to do with real artificial intelligence.
      Machine learning is just a pompous name for data and process optimization. It can be useful as an aid in very specific scenarios, but thinking that it applies globally can be harmful. I’ve been around technology long enough to see both.

      There are many reasons that prompted the move to the cloud. A lot of them are commercially-driven, but there is one big technical obstacle as well. We are approaching the end of Moore’s Law. You can shrink transistors only so much, and currently a commercially viable alternative has not been developed yet. So if you can’t shrink them, create big computing-clusters to do all the heavy-lifting and use the user’s device (particularly handheld devices) to pre- and post-process the data to be sent and received from those computing-clusters. This, in conjunction with the increased bandwidth of wired and wireless networks that facilitate data exchange, as well as the general acceptance of the cloud by the average user is what brought forward the increased availability of generic MT (and other cloud services). Availability is probably what made that agency switch to MT in 2011 and not 2001, but availability and viability are different things.

      Furthermore, the fact that a certain agency (or agencies) switched to MT doesn’t prove anything about the technology or the human translator. Many agencies have for years now been hiring amateur and incompetent translators to produce low-quality translations because it allows them to sell high and buy low. So, I guess that we can also conclude that because incompetent people can get themselves hired in our world (not only in the translation market), the days of expertise, skills, knowledge, judgement and ethics are over.

      Technology is a tool, nothing more. When used properly it can have significant benefits to professionals. When misused and abused it could lead to disastrous results. Anyone who spent enough time with or around technology knows that.

      The charlatans I spoke of (you can also call them snake-oil salesmen) are those who put the technology in front of the human expertise, skills, and judgement and portray it as some kind of superior magic and/or alchemy.


  6. Shai, I just want you to know that you are wasting your time with Jeb.

    He is not going to consider seriously any of your obviously valid arguments. He will just keep repeating the same thing over and over again. MT will replace human translators, and that’s that.

    That is why I stopped talking to him 3 years ago.

    He came back recently under another name but I immediately recognized his substance-free style.

    You can respond to his missives if you want to, I am not going to censor him, but I repeat, it is a waste of time.

    There is a reason why he has to hide behind a pseudonym.


  7. Watson trounced two of the best Jeopardy players, and if all that audiences wanted to see was the top performer, then humans would no longer earn money playing. That level of computing power will be available for $1,000 around 2018 and will be a great assistance to lower end translators, thereby pushing down rates from today’s levels.

    Moore’s Law will end in 2021 or 2022 but new stacking designs will easily push the doubling of computer power to at least 2030.

    The reason that MT wasn’t used much in 2001 was simply because computers weren’t powerful enough to make it useful.

    I disagree that _many_ companies were using low end translators for years. If that is the case, then why does Steve freak out more and more about the translation business every year?

    Any company that gets burned by a lower level translator (and some of these produce high quality work) will stop using a given agency. So far this seems to be working well unless there has been a rash of complaints that I haven’t heard about.


    • Wow, you do seem to be what they like to call a Troll in online fora.
      Some of your unfounded claims are even funny, I’ll give you that.
      It was nice “exchanging” opinions with you, but I think that I will take Steve’s advice and stop it here.


  8. @Shai

    MT is an alternative to human translation in some cases, such as when the cost is prohibitive.

    I recently gave an old client a cost estimate for translating a whole bunch of very long Japanese patents: 23,000 dollars.

    I waited with bated breath as they were negotiating with their client, a week, two weeks, and then I gave up on the project.

    After more than a month, they came back and asked me to translate only selected paragraphs from some of those patents. So I assume that since their client said no to the price, they had everything “translated” with MT, and now the patent lawyer who is working on this is groping in the dark, trying to determine from the MT gibberish what is really in those patents.

    So far I charged them about 700 dollars for the selected portions.

    This is MT in action. It is possible that all of the translations of all of the patents were not really needed, in which case the law firm’s client will save a lot of money.

    But it is also possible that key issues that should have been covered in the patent lawyer’s work will be missing due to lack of information, i.e. lack of human translation, in which case the law firm’s client may be saving a significant amount of money in the short term, while it may be losing much more money in the long term.


    • MT is good for discovery purposes, but even then there are quite a lot of parameters that affect the suitability of a certain MT engine to fit for purpose.

      In the B2B world, budget limitations – aka “It’s too expensive” – almost never mean “I don’t have enough money to pay for this if I wanted to”; they usually mean “I don’t want to pay this much for this”. The buyers simply don’t appreciate or understand the importance of the service/product enough to pay a certain amount for it. Yet, they’ll happily spend twice as much on other things. Sometimes it even means “we don’t know what we are doing and improvise as we go. Now we are facing a problem, it turns out that we also need X, but we have already burnt through our budget. We can’t (i.e. don’t want to) go back and ask for more because it will expose our incompetence, so please take it upon yourself to subsidize our mistakes and oversight”.

      Nothing one can do about it, different people have different needs and approaches.
      There is always a cheaper alternative to almost everything. One can take a cheap pair of pliers and pull one’s tooth instead of going to the dentist, for example. It all comes down to the perception of value, risk, and pain.

      We here all know the value of work and service. Others, especially the buyers, less so. For some, cheaper alternatives will work, for others not. I think that the recent availability and use of raw and badly implemented MT for commercial purposes is in the embryonic stage. The damage and costs will be discovered in the long term.

      Even the law firm understood that MT is not a replacement for your skill and expertise, Steve. But they fail to understand that even content discovery heavily depends on the accuracy and readability of the text you work with.
      A much better approach would be to hire you or someone else to do the discovery for them, while mentioning what they are looking for, but every day someone, somewhere, thinks that they have found a new and innovative way to do something. The joy and feeling of triumph that accompany the short-term savings are an addiction in our world of consumerism.


  9. “Even the law firm understood that MT is not a replacement for your skill and expertise, Steve. But they fail to understand that even content discovery heavily depends on the accuracy and readability of the text you work with.”

    The law firm understands that MT is not translation. But if the client won’t pay (at least not at this stage), their hands are tied.

    “There is always a cheaper alternative to almost everything. One can take a cheap pair of pliers and pull one’s tooth instead of going to the dentist, for example. It all comes down to the perception of value, risk, and pain.”

    LOL …. But as the French proverb says, “le bon marché coûte toujours plus cher”.


  10. So I’m somehow a troll, who provides unfounded claims, some of which are even funny, yet you don’t feel compelled to point out the error of my ways? OK…

    I stated that the chip industry expects the extension of Moore’s Law to easily go to at least 2030 and that the power of Watson will be available for $1,000 by around 2018.

    Many companies have been using MT/editing for a while so when should we expect that in the “long term” they will understand that they messed up? At the same time translators who charge low rates will have an increasingly more powerful tool to assist them, thereby lowering all rates.

    Don’t you guys reading this blog wonder why, if everything is so peachy in translation land, Steve rants about “nanolators” and “zombie translators” two or three times a month?

    I just think it is important for translators to keep in mind that 1) within 5 years every translator, charging 5 cents a word or 20 cents a word, will have the power of Watson on their desk 2) By 2024, computers will be a thousand times more powerful than today. 3) The number of Chinese and Indian translators will keep increasing.

    I’ve posted a couple of times a year. See y’all in 2015…


  11. Jeb,
    These are your predictions, but you choose to present them as absolute, undeniable facts. I would be wary of those who act as if they hold the one universal truth and who are certain without even a shade of a doubt that their way is the only way.

    About three years ago I had a debate with someone from the MT/Technology lobby who made very similar claims to yours about the power of technology, what the future holds, and the gloomy future that awaits just around the corner all of those foolish human translators who don’t recognize it and refuse to jump on the MT bandwagon. They claimed that by 2015 not a single word in this world would be translated by humans anymore: Everything would be MTed, and the more “demanding” stuff (including literature – i.e. fiction) would be PEMTed, effectively marking the end of human translations.

    May I ask if you are a translator yourself or a proponent of MT?

    I’ve been around technology long enough to experience several “revolutions” and their ensuing rise and (usually) fall. The reality is that there isn’t magic, just best practices, expertise, knowledge, judgement, and good tools. Good tools are built to support and enhance the human expertise and sentient workflow, and are usually developed for the needs of the profession and professionals who will use them, and usually based on their input as well. There are certain ways of doing things, and technology that works against them doesn’t change anything just because it exists.

    Steve really doesn’t need me to speak for him, but I will do that anyway. He doesn’t rant; he addresses important subjects, some of which concern the fallacies and unethical practices in the translation market.
    Not everything is peachy in the translation market, not because of new magical technology, but due to “pitchy” people who decided to declare war on the profession so that they could tap into the demand, while inflating an artificial problem so that they could sell their magical services and/or products (i.e. solution) to maintain or increase their margins; so that they could shift the perception of what skills are required (i.e. what people will be willing to pay for) to translate from those of “traditional translation” to those of the elusive concept that is “IT”, that people don’t understand, but know that it has something to do with technology so they should be willing to pay for it (unlike translation, of course).

    Your last paragraph is actually quite telling about your motives for writing what you write.
    I will conclude that a technology (or intermediate service) that needs to resort to spreading fears and disinformation to promote sales is in a less secure position than what it would like to lead others to believe.

    I don’t plan to continue arguing with you here. It was just important for me to write the above.
    See you in 2015, I guess.


  12. Re: “Old-fashioned work that is based on a high level of concentration, combined with human knowledge that is based on education and many years of experience is still and always will be the best approach to translation, and that is what I intend to keep offering to my clients.”

    I can only say “Hear, hear!!” 🙂

    (The rest of the article was well worth the read too, of course.)


  13. @Sias

    It is quite deductive of you, Watson (not the Jeopardy champ), to be able to tell what my motives are from a short paragraph. But of course, once again, you are wrong. Those are not predictions but a simple extrapolation of the past. Anyone with an elementary knowledge of computers knows that Watson will be available for $1,000 by 2019 (within 5 years) and that a laptop/PC will be 1,000 times as powerful in 2024, just as a new model is 1,000 times as powerful as a 2004 model.
    And there is something controversial about more Indians and Chinese translating as MT becomes a more powerful tool?

    There is no such thing as an MT/Technology lobby just as there are no translator unions. I’ve never been a “proponent of MT” as Steve has been because it is similar to a translator saying he is a proponent of power tools. The reality is that they exist and are going to get better and better, thereby continually lowering the barrier of entry into what was once a sort of carpentry with only traditional tools — a dictionary.

    One could take this to extremes and say 1) only native English speakers, which Steve is not, should be given permission to translate into English 2) a translator must use a typewriter, not a word processor and 3) the translator must only use paper dictionaries as he or she 4) uses only the phone or mail — not email or fax — to communicate with other translators or deliver a translation. Of course, 6 months in jail for any translator caught violating one or more of the above. That would certainly keep those nanolators at bay!

    However, we have a concept called “freedom” that applies to both translators to enter and exit the business and to companies who use translation services.

    For the past year and especially over the past six months, Steve has repeatedly written how MT output is a joke and how translators who charge less than him are not worthy of kissing his feet, yet he rails against what he considers to be a dysfunctional business.

    So what’s going on? Why can’t Steve and the rest of you just go out and find direct clients for 25 yen a word? Why be so bothered by translators charging under 10 cents a word? Why be opposed to a translator’s freedom to charge whatever he wants or a client’s freedom to pay what she wants for a product?

    What is especially ironic is that Steve left his own country decades ago because it wasn’t so free, oui?


  14. oops, I meant @Shai. I should have let a computer type my message…


  15. I find Dragon useful because of problems with RSI. It’s also quicker than my typing. It doesn’t mis-spell words but it does mishear them. I find it less tiring but I do type for short periods sometimes just for fun (?). No magic but nevertheless impressive.


  16. I too have to invert verb and subject, whole clauses, etc. (Italian to English legal texts), so the ‘units’ Wordfast breaks text up into are no use sometimes (but they can be ‘forced’ to expand or reduce), but I do appreciate some features: just to mention a couple: the translation memory which suggests the translation for a term (or whole paragraphs) I may have translated once, pages and days before (or years before – think ongoing disputes with tax authorities), so I don’t have to search the document from the beginning – and breaking the text into parallel units means I don’t overlook text and leave something out. These two very basic features do indeed speed the process up.


  17. Whenever the topic of MT replacing humans comes up, I like to point out this video:




