Rage Against the Machine Translation

© Copyright.

Machine translation technologies are developing by leaps and bounds. They are penetrating further and further into the lives of people who have nothing to do with the translation business, not to mention professional translators.

These technologies enable tourists in foreign countries to communicate with the locals with ease in a language they would otherwise have to spend a long time studying. To read an advert written, for example, in the Khmer language, you simply need to point a smartphone camera at it and a second later you can read its translation. Skype can recognise your voice and automatically translate your speech into another language in real time. We are no longer on the threshold of the future — it is with us here and now.

Looking at all these technological triumphs, one wonders: what do translators do and what do they get paid for if machines have already learned to instantly translate voice and text from and to almost any language?

A cruel syllogism

Let’s make two assertions.

Firstly: translation is the art of conveying meaning.

Secondly: machines do not know how to convey meaning.

No one usually argues with the first assertion, while the second, on the contrary, gives rise to heated debate. Examining it in detail can leave us mired in philosophical arguments about the definition of meaning, what it means to think, and so on. However, it is hard to dispute that modern computers, at least at the current stage of their development, do not extract meaning from texts in the human sense of the word.

The following unexpected conclusion inevitably follows from these two assertions: what an engine produces is a priori not a translation.

Not only tourists, but also many linguists are often dumbfounded by this conclusion. Why can’t this machine perform a translation? After all, my smartphone has already read me ads written in another language, and I have already managed to translate texts from a completely unfamiliar language using Google Translate. It really works!

Let’s take a look at some terms.

Misleading terms

In science, technology and other areas of human activity there are many misleading, but, unfortunately, well-established terms. They are tolerated and continue to be used, not because they accurately reflect a particular meaning, but because everyone is familiar with them and they have always been used that way.

For example, ‘supernova’ is a misleading term. A supernova is not the birth of a new star, as many people think, but rather the death throes of an old, dying star. We humans, thousands of light years away, believe that a new star has lit up in the sky, but in fact an incredibly remote star that had been shining brightly for millions of years has died. By the way, ‘light year’ is also a misleading term. Hearing the word ‘year’, uninformed people think that it is a unit of time, while in fact light years measure vast astronomical distances.

Machine translation is another example of an extremely misleading term. Machine translation is an oxymoron. A translation cannot be done by a machine. If it was performed by a machine, it is not a translation, but if it was performed by a person, it is not a machine translation.

It is now difficult to establish who exactly coined the term machine translation. Apparently, this person was not a linguist or simply did not take the time to coin a more suitable term. And so he or she created a terrible terminological confusion: the thoughtless substitution of words in one language for words in another began to be referred to as actual translation (which conveys meaning) on the simple basis that outwardly the results of these two processes look similar.

So machine translation began to be perceived as translation in its own right. The difference is huge, but the layperson is not aware of this due to the presence of the word translation in both terms.

Translation is a real French wine, while machine translation is an attempt to reproduce its formula in a chemical laboratory. This is a sophisticated fake. It looks the same, it tastes the same, but it is not the same. You would not refer to this synthetic product as French wine. But for some reason we refer to the synthetic product created by machines as a translation.

If the process of replacing one word with another had been called something else, and not translation, this terminological confusion and many related misunderstandings would not have arisen. You could call it, for example, transposition or auto-conversion, or whatever you like, as long as the word translation is avoided.

But the misleading term machine translation has unfortunately taken root.

Machines and the Turing test

Here you may object: How can this be? After all, machine translation is almost indistinguishable from human translation. Of course, there are flaws, but it is understandable, and people are flawed too!

Yes, that’s right: the similarity is striking. After analyzing millions of sentences translated by humans (this is important), the machine itself works out complex and occasionally opaque relations between sentences in different languages. It subsequently uses these relations to produce texts that could pass for either machine or human translations. This means it has successfully passed the Turing test.

However, this does not change the facts of the matter: a machine translation has no meaning in the sense that the machine that creates it does not reflect on the message it is conveying. The machine can find frequently occurring words, establish how they are syntactically related and determine the structure of sentences. But unlike a person, it is unable to extract meaning from this process.

Machines are able to manipulate data and identify corresponding relations. They do it much better and faster than humans, and this is what they were created for. The amount of data can be huge, and then it is called big data. This data can sometimes fit together in an altogether unexpected way. And the more data there is, the more complex relations are identified and the closer a machine translation gets to a human translation. In such cases people say: “the machine translation engine is first-rate”. However, even a first-rate machine translation engine does not convey meaning.

For a machine, a text is nothing more than a sequence of characters. For a machine, Shakespeare’s Hamlet, and the delirium of a patient with schizophasia, and lorem ipsum, and a random set of letters typed by your cat while stepping on the keyboard are equally meaningless.

A machine will “translate” both Hamlet and random gibberish with an equal degree of enthusiasm. Since it makes no sense of either of them, for the machine Hamlet and random gibberish make for an equally meaningful (or rather, equally meaningless) sequence of words. And the machine will attach an equal amount of meaning to the translation of the words, and that is precisely zero. For the machine, both the source texts and the texts it generates are meaningless, no matter how meaningful they may be to a human.
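The "sequence of characters" point can be illustrated with a minimal Python sketch. The helper name `as_code_points` and the sample strings are assumptions made up for this illustration; it is not how any particular translation engine works, only a picture of what a program stores when it is handed a text.

```python
def as_code_points(text: str) -> list[int]:
    """Return the text as the list of integers a program actually stores."""
    return [ord(ch) for ch in text]


# A line from Hamlet and a made-up string of gibberish of the same length.
hamlet = "To be, or not to be"
gibberish = "xq zt, kj wpv rm zz"

hamlet_codes = as_code_points(hamlet)
gibberish_codes = as_code_points(gibberish)

# Both texts become plain lists of numbers; nothing in the data itself
# marks one list as meaningful and the other as noise.
print(hamlet_codes[:5])
print(gibberish_codes[:5])
print(len(hamlet_codes) == len(gibberish_codes))
```

To the program, both inputs have the same type and the same length. Any difference in meaning exists only for the human reader.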

Machines and the duck test

The so-called duck test implies that a person can identify an unknown subject by observing that subject’s habitual characteristics. This is its usual expression:

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

At first glance, machine translation successfully passes the duck test along with the Turing test. After all, it looks like a translation, reads like a translation, and sounds like a translation, so it would seem that there is every reason to believe that it is a translation.

But the devil is in the details. The duck test works best in ideal conditions: in daylight, when you are close to the object and have a clear view. What happens if the duck swims away to the other side of the river? Or if you are short-sighted? Or if it’s already twilight or it’s raining? Or if you are driving and only get a quick glance at it?

Take a closer look, and it turns out that this is not a duck at all, but in fact a goose, or an otter, or a reflection of a strange shape in the water. And that’s it — the duck test has failed. No wonder the person who coined the phrase prudently left a loophole and included ’probably’ in his or her wording.

It is the same with machine translation. At first glance, everything seems fine: the words are syntactically related and the text is readable. But take a closer look and the house of cards collapses. The same term is translated differently in different places. In one place the machine incorrectly deciphered an abbreviation, in another it didn’t get a joke, later on it ‘forgot’ the gender of the protagonist, and one sentence is just a complete mess.

For each language there are certain features that make machine translations stand out. These features can be difficult to describe, but experienced translators can easily tell if a text has been translated by a human or by a machine. You can’t fool them.

Let’s give machines their due: they have learned to imitate human translations quite well, and sometimes the imitation is indistinguishable from the real thing. This is great for tourists and bad for translators. But these are not translations; they are imitations. You only thought they were translations.

Machines in search of meaning

As you can see, the dividing line between human and machine translations is the ability to convey meaning. So the question arises: what do machines need in order to convey meaning? What do they need to reach the level of human translators?

Consider, for example, the above-mentioned Hamlet. This is an unusual text, but we can use it to clearly show the difference between humans and machines.

Even humans sometimes find it hard to understand this play. It is considered a classic and children study it at school. But it is not a favourite among teenagers.

Only readers well-versed in the works of Shakespeare will understand the overall message of Hamlet. They need to know already what love, friendship, betrayal, irony and revenge are. To understand why the characters act as they do, we must have experienced the same feelings at least once ourselves. We need to, as they say, “live in this world”. As a result, Hamlet is usually only enjoyed by adults who have gained some life experience.

The difficult job of translating such a complex work into another language is further complicated by the fact that it is set in the Middle Ages and the text is very poetic. Not surprisingly, there are a huge number of different translations of Hamlet. There are more than 30 Russian translations of Hamlet alone.

Conclusion: in order to understand the author’s message, you need to understand human nature. Generally speaking, for this you need to have an awareness of how the world works. That’s what translators are paid for — to convey the message. And this doesn’t come cheap.

Machines lack the human abilities to fully convey meaning. This is the reality for machines. But they don’t care, and they don’t even understand the meaning of the word “care”.

Unexpected conclusions

In the final part of each article, we usually summarise the “expected” conclusions we have come to. This article differs from the others: it will contain “unexpected” conclusions. We will now list what does not follow from the above.

  • “A real translator will never use machine translation.” This is too categorical a statement. Using machines to help perform a translation is not such a bad thing. For example, a good machine translation engine significantly speeds up the translation of standard technical texts. But when using such an engine, it is important to be clearly aware that the author of the text you are working on did not reflect on its meaning.
  • “Machine translation is useless.” This is not entirely true. It is useless if used to translate texts for which it is not designed. It has no chance of conveying the deep themes or messages of Hamlet, a play so complex that humans have translated it repeatedly and there is still no single translation that is acceptable to all. However, there are texts that machines translate well (the standard technical texts mentioned above), and this is where machines have value.
  • “Machines will never learn to convey meaning.” We do not know this. In any case, this does not follow from the above. It is possible that one fine day your smart refrigerator will begin to write poems or discuss with you the meaning of its existence. It is also possible that the development of machines will follow a different path and our future portends a Matrix-esque techno-apocalypse.

One thing is certain: when machines learn to convey meaning, translators will truly no longer be needed. But when this will happen and whether it will happen at all is impossible to predict.
