Rage against the machine translation

19 November 2019

Machine translation technologies are developing by leaps and bounds. They are penetrating further and further into the lives of people who have nothing to do with the translation business, not to mention professional translators.

These technologies enable tourists in foreign countries to communicate easily with the locals in a language they would otherwise have to spend a long time studying. To read an advert written, for example, in Khmer, you simply point a smartphone camera at it, and a second later you can read its translation. Skype can recognize your voice and automatically translate your speech into another language in real time. We are no longer on the threshold of the future: it is with us here and now.

Looking at all these technological triumphs, one wonders: what do translators do and what do they get paid for if machines have already learned to instantly translate voice and text from and to almost any language?

Cruel syllogism

Let’s make two assertions.

First: translation is the art of conveying meaning.

Second: machines do not know how to convey meaning.

No one usually argues with the first assertion, while the second gives rise to heated debate. Examining it in detail can leave us mired in philosophical arguments about the definition of meaning, what it means to think, and so on. However, it is difficult to dispute that modern computers, at least at the current stage of their development, do not extract meaning from texts in the human sense of the word.

The following unexpected conclusion inevitably follows from these two assertions: what an engine produces is not a translation.

Not only tourists but also many linguists are often dumbfounded by this conclusion. Why can’t this machine perform a translation? After all, my smartphone has already read me ads written in another language, and I have already managed to translate texts from a completely unfamiliar language using Google Translate. It works!

Let’s take a look at some terms.

Misleading terms

In science, technology, and other areas of human activity there are many misleading, but, unfortunately, well-established terms. They are tolerated and continue to be used, not because they reflect a particular meaning, but because everyone is familiar with them, and historically that has been the case.

For example, ‘supernova’ is a misleading term. A supernova is not the birth of a new star, as many people think, but rather the death throes of an old one. Watching from thousands of light years away, we humans believe that a new star has lit up in the sky, when in fact an incredibly remote star that had been shining for millions of years has died. By the way, ‘light year’ is also a misleading term. Hearing the word ‘year’, uninformed people assume it is a unit of time, whereas a light year is a unit of distance used for vast astronomical scales.

Machine translation is another example of an extremely misleading term. Machine translation is an oxymoron. A translation cannot be done by a machine. If it was performed by a machine, it is not a translation, but if it was performed by a person, it is not a machine translation.

Now it is difficult to establish who exactly coined the term machine translation. This person either was not a linguist or simply did not take the time to find a more suitable term. And so he or she created a terrible terminological confusion: the thoughtless substitution of words in one language with words from another came to be called translation, the same word used for the real thing (which conveys meaning), on the simple basis that the results of the two processes look similar from the outside.

So machine translation began to be perceived as translation in its own right. The difference is huge, but the layperson is not aware of this due to the presence of the word translation in both terms.

Translation is a real French wine, while machine translation is an attempt to reproduce its formula in a chemical laboratory. This is a sophisticated fake. It looks the same, it tastes the same, but it is not the same. You would not refer to this synthetic product as French wine. But for some reason, we refer to the synthetic product created by machines as a translation.

If the process of replacing one word with another had been called something other than translation, this terminological confusion and the many related misunderstandings would not have arisen. You could call it, for example, transposition or auto-conversion, or whatever you like, as long as the word translation is avoided.

But the misleading term machine translation has unfortunately taken root.

Machines and the Turing test

Here you may object: How can this be? After all, machine translation is almost indistinguishable from human translation. Of course, it has flaws, but the result is understandable, and people are flawed too!

Yes, that’s right: the similarity is striking. After analyzing millions of sentences translated by humans (this is important), the machine itself works out complex and occasionally obscure relations between sentences in different languages. It subsequently uses these relations to produce texts good enough that it is hard to tell whether a machine or a human translated them. In that sense, it has successfully passed the Turing test.

However, this does not change the facts of the matter: a machine translation has no meaning in the sense that the machine that creates it does not reflect on the message it is conveying. The machine can find frequently occurring words, establish how they are syntactically related, and determine the structure of sentences. But unlike a person, it is unable to extract meaning from this process.

Machines can manipulate data and identify relations within it. They do this much better and faster than humans, and this is what they were created for. The amount of data can be huge, in which case it is called big data. This data can sometimes fit together in altogether unexpected ways. And the more data there is, the more complex relations are identified and the closer a machine translation gets to a human translation. In such cases, people say: “the machine translation engine is first-rate”. However, a first-rate machine translation is not one that conveys meaning.

For a machine, text is nothing more than a sequence of characters. For a machine, Shakespeare’s Hamlet, the delirium of a patient with schizophasia, lorem ipsum, and a random set of letters typed by your cat while stepping on the keyboard are equally meaningless.

A machine will “translate” both Hamlet and random gibberish with an equal degree of enthusiasm. Since it understands neither of them, Hamlet and random gibberish are, for the machine, equally meaningful (or rather, equally meaningless) sequences of words. The machine attaches an equal amount of meaning to the translation of each, and that amount is precisely zero. For the machine, both the source texts and the texts it generates are meaningless, no matter how meaningful they may be to a human.

Machines and the Duck test

The so-called duck test implies that a person can identify an unknown subject by observing that subject’s habitual characteristics. This is its usual expression:

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

At first glance, the machine translation successfully passes the Duck test along with the Turing test. After all, it looks like a translation, reads like a translation, and sounds like a translation — it would seem that there is every reason to believe that it is a translation.

But the devil is in the details. The duck test works best in ideal conditions: when it is daytime, you are close to the object, and you have a clear view. But what happens if this duck swims away to the other side of the river? What if you are short-sighted? What if it’s already twilight, or it’s raining? What if you are driving and only catch a glimpse of it?

Take a closer look, and it turns out that this is not a duck at all, but a goose, or an otter, or a reflection of a strange shape in the water. And that’s it — the duck test has failed. No wonder the person who coined the phrase prudently left a loophole and included ’probably’ in his or her wording.

It is the same with machine translation. At first glance, everything seems fine: the words are syntactically related and the text is readable. But take a closer look and the house of cards collapses. The same term is translated differently in different places; in one place the machine expanded an abbreviation incorrectly; in another it didn’t get a joke; later on it ‘forgot’ the gender of the protagonist; and one sentence is simply a complete mess.

For each language, certain features make machine translations stand out. These features can be difficult to describe, but experienced translators can easily tell if a text has been translated by a human or by a machine. You can’t fool them.

Let’s give machines their due: they have learned to imitate human translations quite well, and sometimes the imitation is indistinguishable from the real thing. This is great for tourists and bad for translators. But these are not translations, they are imitations. They only looked like translations to you.

Machines in search of meaning

As you can see, the dividing line between human and machine translations is the ability to convey meaning. So the question arises: what do machines need to convey meaning? What do they need to reach the level of human translators?

Consider, for example, the above-mentioned Hamlet. This is an unusual text, but we can use it to clearly show the difference between humans and machines.

Even humans sometimes find it hard to understand this play. It is considered a classic and children study it at school. But it is not a favorite among teenagers.

Only readers well-versed in the works of Shakespeare will understand the overall message of Hamlet. They need to know beforehand what love, friendship, betrayal, irony, and revenge are. To understand why the characters act as they do, we must have experienced, at least once, the feelings that they experience. We need to, as they say, “live in this world”. As a result, Hamlet is usually only enjoyed by adults who have gained some life experience.

The difficult job of translating such a complex work into another language is further complicated by the fact that it is set in the Middle Ages and the text is very poetic. Not surprisingly, there are a huge number of different translations of Hamlet.

Conclusion: to understand the author’s message, you need to understand human nature. Generally speaking, for this you need to have an awareness of how the world works. That’s what translators are paid for — to convey the message. And this doesn’t come cheap.

Machines lack the human abilities to fully convey meaning. This is the reality for machines. But they don’t care, and they don’t even understand the meaning of the word “care”.

Unexpected conclusions

In the final part of each article, we usually summarise the “expected” conclusions we have come to. This article differs from the others: it will contain “unexpected” conclusions. We will now list what does not follow from the above.

“A real translator will never use machine translation.” This is too categorical a statement. Using machines to help with a translation is no bad thing. For example, a good machine translation engine significantly speeds up the translation of standard technical texts. But when using such an engine, it is important to remember that the author of the draft you are working on did not reflect on its meaning.
“Machine translation is useless.” This is not entirely true. It is useless when applied to texts for which it was not designed. It has no chance of conveying the deep themes or messages of Hamlet, a play so complex that humans have translated it repeatedly and there is still no single translation that is acceptable to all. However, there are texts that machine translation engines handle well, such as the standard technical texts mentioned above, and this is where machines have value.
“Machines will never learn to convey meaning.” We do not know this. In any case, it does not follow from the above. One fine day your smart refrigerator may begin to write poems or discuss with you the meaning of its existence. It is also possible that the development of machines will follow a different path and that our future holds a Matrix-esque techno-apocalypse.

One thing is certain: when machines learn to convey meaning, translators will truly no longer be needed. But when this will happen and whether it will happen at all is impossible to predict.
