Posted by: patenttranslator | December 26, 2017

Some Things Are Impossible to Prove Or Disprove

Some things are impossible to prove or disprove.

The existence of God, for example, is one of them.

Or so I am told.

On the other hand, some things can be easily proven or disproven with a simple experiment based on a small dose of logical thinking. One of them is that human editing of machine translation is in fact more time-consuming than actual translation from scratch.

It is easy to design an experiment on the basis of sound and logical criteria to prove that human editing of machine translation, even of a relatively good machine translation, assuming that there is such a thing, takes longer than an actual translation done by a qualified and experienced translator.

Take as an example a translation that you have just finished; in my case it could be, for instance, a translation of a German or Japanese patent about three thousand words long.

Given that I have been translating patents for a living for more than 30 years, I will boldly assume for the purposes of this post that my translation would be very good and very accurate, ready for publishing on an official website of a patent granting authority, without even the minor problems that often occur in machine translations but are relatively easily fixed.

By relatively minor problems I mean, for example, cases in which the software program does not know how to translate a long German compound noun and simply keeps the long German word in the machine-translated text, which is otherwise in English. If the software does not have the answer, it simply keeps the words in the original language. I see this all the time in machine translations of patents from many languages.

Although this is a flaw that is frequently encountered in machine translations of patents, this kind of problem, which may be impossible to solve by machine translation software, is easily fixed within seconds by a human translator who understands the German term and knows the equivalent in another language, without even taking a look at the original text.

But since we are starting with a translation that was obtained from an experienced human translator, the translation would not contain the problems that often crop up in machine translations.

Now let’s assume that in order to simulate one of the problems that could be introduced by machine translation, we would use the search-and-replace function of our word processing program and replace five correct terms in a flawless translation obtained from a human translator with incorrect but perfectly plausible terms.

Problems like this are much more difficult to correct because they can be verified by a human translator only in the context of the original text.

For example, let’s say that for the purposes of the experiment, a translator would replace “tall” with “small”, “acceleration” with “deceleration”, “wet” with “dry”, “organic” with “inorganic”, and “is not” with “is”. The word “organic” can be easily mistranslated by machine translation if the software misreads the first letter of the word, and the word “not” can be easily overlooked by a machine translation program in a sentence, especially since it is in different positions in the sentence in different languages. And so on and so forth.

If we don’t know where the problem might be hidden, the only way to fix the mistakes that have been introduced into an otherwise perfect translation by non-thinking software is to proofread the translation in the context of the original text, which is to say to painstakingly compare the translation to the original text if not word by word, then at least sentence by sentence.
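The doctoring step of the experiment described above could be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the function name `inject_errors`, the substitution table, and the sample sentence are hypothetical, and only the first occurrence of each term is replaced so that the planted mistakes stay few and hard to spot.

```python
import re

# Hypothetical substitution table based on the five examples in this post:
# correct term -> incorrect but perfectly plausible term.
SUBSTITUTIONS = {
    "tall": "small",
    "acceleration": "deceleration",
    "wet": "dry",
    "organic": "inorganic",
    "is not": "is",
}


def inject_errors(text, substitutions=SUBSTITUTIONS):
    """Replace the first whole-word occurrence of each correct term.

    Returns the doctored text and the list of (correct, wrong) pairs
    that were actually planted, so the experimenter can later check
    how many of them the proofreader found.
    """
    planted = []
    for correct, wrong in substitutions.items():
        # \b keeps "organic" from matching inside a longer word.
        pattern = re.compile(r"\b" + re.escape(correct) + r"\b")
        text, count = pattern.subn(wrong, text, count=1)
        if count:
            planted.append((correct, wrong))
    return text, planted


if __name__ == "__main__":
    sample = "The tall column is not wet. The organic layer reduces acceleration."
    doctored, planted = inject_errors(sample)
    print(doctored)
    print(len(planted), "errors planted")
```

The doctored text would then be handed to a proofreader who is not told where the changes are, and the time needed to find all of them would be compared with the time the original translation took.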

And as every translator knows, something like that is very time consuming.

Such a comparison, when we know that the translation is likely to contain problems, but don’t know where and what the problems are, would take much longer than if we proofread a translation that was done by a competent human translator, or by ourselves, and we are looking only for minor problems such as omissions and typos.

The “translation industry” likes to pretend that editing of machine translations is a logical next step and a straightforward process that can be easily used to fill in the gaps left in the approach that uses machine translation to lower the cost, in conjunction with humans who are supposed to quickly and inexpensively fix and clean up the machine translation output.

That is why the “translation industry” generally pays very low rates for proofreading, and the result is that with some exceptions, mostly just “newbies” are willing to do proofreading, even when it comes to translations that were done by human translators.

But the thing is, proofreading is a relatively painless and straightforward procedure only if the mistakes in the human-translated or machine-translated texts were so obvious that we would not need to compare the machine translation to the original, or if such a comparison could be made quickly and relatively infrequently.

Most “post-processors” of machine translations are probably working very quickly and without comparing the machine translation to the original text much for one simple reason: they get paid so little for their mind-numbing drudgery that they can’t really afford to do much more.

The fact is that even if a machine-translated text is post-processed by a human “almost translator” and even if the post-processed result looks a little bit better than a machine translation, the actual mistakes in the machine translation can be discovered only if the human post-processor compares the machine-translated text to the original text sentence by sentence, if not word by word.

As I have said in the introduction, I don’t know whether God exists, because that is something that can be neither proven nor disproven. Generally speaking, I don’t think God exists because it makes no sense to me. There are days when so many good things happen that I kind of have to wonder whether everything has been planned in advance by a benevolent higher power. And there are days when it is obvious to me that everything is controlled by the Devil … which in a way would also be a sort of proof that God does exist, I suppose.

Call me an agnostic rather than a non-believer.

But one thing that I do know for sure is that post-processed machine translations that are “almost as good as human translations”, as the “translation industry” likes to put it, do not exist, because that is something that can be proven very easily with the simple test that I have proposed in my post today.

Post-processed machine translations are much more likely to be a poison rather than a cure for the problems that are unavoidable with machine translations.

 

Responses

  1. I fully agree with your critique of machine translation post-editing, although my conclusions on the existence of God are rather different from yours. In both areas, I feel that “the proof of the pudding is in the eating” – i.e. actual experience takes us further than mere abstract reasoning. (Or as you put it: “as every translator knows …”).
    There are of course MT advocates who would discount your experience and mine because we do not couch everything in numbers, averages, standard deviations and percentiles. I suppose this is the perennial divide in human endeavour – on the one hand there are those that “do”, on the other hand there are the “experts” who tell us why that is all wrong and we should do everything the way it looks on their drawing board (or calculator).
    Oh well, back to my complex scientific text …

  2. Hi Victor,

    Thanks for your comment.

    The problem with the “experts” who would like to use calculations to prove something about translation is that they are not translators and therefore have no idea what the term “translation” even means.

    For example, if a machine translation of an expert opinion for a banker who has to decide whether to approve a big mortgage loan says “It is our expert opinion that the property is definitely worth 10 million dollars” while the actual expert opinion in fact says “It is our expert opinion that the property is definitely not worth 10 million dollars” (because the MT software missed the word “not”), the statistical approach would tell us that the machine translation is 95% correct.

    And 95% accuracy is pretty good, right?
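As a rough illustration of that point, here is a minimal Python sketch of one possible word-level score. The function name `word_accuracy` and the scoring rule (share of reference words that the machine output reproduces in order) are assumptions made for this example and are not how any particular MT evaluation metric is defined. For the two sentences about the property, 14 of the 15 reference words match, which gives about 93%, in the same neighborhood as the figure above, even though the meaning is reversed.

```python
from difflib import SequenceMatcher


def word_accuracy(reference, candidate):
    """Share of reference words reproduced, in order, by the candidate.

    A deliberately naive score used only to illustrate the point:
    it counts matching words and knows nothing about meaning.
    """
    ref_words = reference.split()
    cand_words = candidate.split()
    matcher = SequenceMatcher(a=ref_words, b=cand_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


if __name__ == "__main__":
    original = ("It is our expert opinion that the property is "
                "definitely not worth 10 million dollars")
    machine = ("It is our expert opinion that the property is "
               "definitely worth 10 million dollars")
    print(f"{word_accuracy(original, machine):.1%}")
```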

    • Steve, your post demonstrates, once again, that the so-called “translation industry” is just a gang of crooks who raided a profession and an area that they are TOTALLY CLUELESS about.

      That includes ALL of nowadays’ “translation software” developers, INCLUDING Kilgray.

      Those people are ALL CROOKS.

      The SOLE AND ONLY PROFESSIONALS OF THE TRANSLATION MARKET ARE TRAINED AND PREFERABLY EXPERIENCED TRANSLATORS, and I would add: who are still translating on a daily basis –

      not former translators who started an agency long ago, have forgotten what translating was all about – especially with nowadays’ “translation tools” – and usually use brainless (i.e. cheap) secretaries for daily operations – making them belong to the “LSP” gang of crooks, sorry…

      So the SOLE AND ONLY TRANSLATION MIDDLEMEN that END-CUSTOMERS should be using are HYBRID TRANSLATORS who, like Steve Vitek, derive about 66% of their revenues from translating themselves and about 33% of their revenues from contracting out to trusted, trained and experienced colleagues whatever they cannot do themselves (volumes above something like 2,000 source words per day in their competence areas; language pairs and specialisation areas in which they have less competence, etc.).

      Those colleagues are treated in a respectful way, not only in all communications, but also as to offered fees (they understand that TIME is MONEY, contrary to clueless LSPs) and given production deadlines (again, because they understand the translation process and know that TIME is needed for CHECKING anything that the translator is unsure of, leading to QUALITY).

      The clueless view “translators” as mere bilinguals who should “type as fast as possible” – thus for peanuts, which in turn leads to short production times: the very recipe for GARBAGE.

      Worse, those other crooks of the translation market, i.e. untrained bilinguals pretending to “translate” are POLLUTING MACHINE TRANSLATION CORPORA… leading to OTHER incorrect translations, based on that GARBAGE.

      So 3 types of crooks are polluting the translation market nowadays (and must be kicked OUT):

      – untrained bilinguals;

      – middlemen who are not trained and experienced translators and/or who are not translating on a daily basis, thus are incompetent (especially regarding the new “tools” that the market is being polluted with);

      – “translation software” developers, who have never translated a single line of text in their entire lives, yet pollute the translation market.

      Governments are accomplices, since most judicial systems rely on cheap, untrained bilinguals. They don’t try to understand the present translation market and seem to earnestly believe SDL’s (and other crooks’) sales pitches: that some obscure software programs are supposedly making translators “translate faster, better (?) and (thus?) cheaper”…

      All end-customers are completely DISINFORMED about the reality of the translation market.

      So it is up to us, freelance translators, to INFORM them.

      I repeat: if CAT tools give translators the IMPRESSION of having a translation environment, in reality it CONSIDERABLY SLOWS THEM DOWN compared to just using MS Word with Autocorrect (which most of them don’t know about…).

      The SOLE AND ONLY AIM of the so-called “CAT” tools is to EXTORT a substantial part of freelance translators’ remuneration.

      Never ADVERTISE (in your CV, online) that you own such or such CAT tool.

      Let those fucking customers ASK for it: THEN THEY ACT ILLEGALLY since ONLY EMPLOYERS have the right to impose such or such working tool. Not fucking customers.

      You have to EDUCATE those fucking customers.

      Nowadays they are only just “fucking customers” because, obviously, they tend to BELIEVE THE LIES of the above-mentioned fucking crooks.

      Of course, the 2008-? economic crisis has played a role, but it’s almost finished now, so there is no reason for customers to keep on believing such LIES from LIARS.

      • Hi Isabelle:

        I would like to correct your comment on one point: The income that I generate as a translation agency has been, since about the year 1992, in the range of 10–20% of my total income, not 33%, although 33% would be a good yardstick for determining who is still a translator and who became an agency.

        I personally am trying to remain a translator rather than trying to become an agency. Although I enjoy being an agency too and think that more translators who are inclined to work as an agency should give it a try, I am primarily a translator.

        But the thing is, if you want to work for direct clients, you have no choice but to be an agency as well because you should not be saying “no” to your direct clients if they ask for something that you can provide for them, and do it much better than your typical clueless agency, even if you can’t translate it yourself.

        This year the income from my own translations will probably be slightly more than 90% because, thanks to my website, a new customer (a small patent law firm) found me and kept me very busy throughout the year, with the exception of one month.

        So about 10% of my income last year was from my work as an agency (and that is work too!), 20% from my Social Security income (yup, I’m an old fart now), and 70% from my own translation, of which more than 90% were translations of patents.

        It was a very successful year for me in terms of income; only one year in the last thirty has been better for me so far, namely 2008.

        Thanks for your comments in the last year and years before, Happy New Year 2018 and keep those comments coming!

  3. Yup, 95% is mighty fine – much better than some of the fake news that currently adorns the digisphere. But the 99% accurate machine translations actually worry me more – they look so cursedly plausible, and they are phrased in such eloquent prose, that hardly any post editors will spot the 1% of potentially fatal or ruinous misinformation.

    • Indeed, Victor. And what most do not realize is that the plausible parts of the text are generally drawn from a database of human work; it is not created dynamically by the software algorithms. And when those kick in, well, good luck. One advantage of human work is that if a monkey does it, this is usually apparent throughout the text; one is on guard and it is soon apparent whether the text can be fixed or should be binned. If a more or less standard contract is pushed through machine pseudo-translation using, for example, my 17-year archive of such work as a basis, the result may sound very professional but be full of linguistic bear traps that only a real expert can discern.

      • “And what most do not realize is that the plausible parts of the text are generally drawn from a database of human work; it is not created dynamically by the software algorithms.”

        Exactly. But since even the plausible looking text is only the closest match found in a database of translations that were done at some point by human translators, the actual meaning of a new text to be translated may be the exact opposite of what the machine translation says based on a previous human translation.

        This happens for example even with extremely repetitive and fairly simple patent translations because every patent has a different context.

        Unless every single word in the machine translation is validated and mistakes are fixed by a human translator, which defeats the main purpose of machine translation, the machine translation can be used only for general information, which may be accurate or inaccurate.

        Machine translation is only a tool. It is a very good tool for some purposes, but it is not translation.

  4. This is EXACTLY the issue with machine translation, as I have also explained in this article: https://www.loekalization.com/machinetranslation.html

  5. Seven thousand Euros for a laptop?

    Man, you must be raking it in!

  6. In the case of patents, in particular, I cannot see how MT could possibly handle the very complex explanations required, since most languages use descriptive text in very different ways. Then take the French expression “pan coupé”, which means a right-angled object with the right angle cut off. There is no equivalent in English; it would have to be explained, and that is something that CAT tools and MT are completely incapable of doing.
