Our initial pitch proposed a universal, “perfectly” accurate translator, one that could capture and transmit cultural and even emotional nuances. Translation technology has already become a very powerful tool, but expanding its breadth to account for such nuances could impact the lives of many. Of course, there are other resources to supplement machine translation, such as professional translators and formal education in the target language. However, these are luxuries that not everyone has access to, and many people consequently struggle in various settings, whether medical, legal, or educational. While we have chosen not to pursue the notion of perfect translation per se, we would still like to focus on how existing technology could be improved to facilitate more seamless communication between different languages and cultures.
The majority of the feedback we received suggested that we narrow the scope of our research. Though perfect translation would be a remarkable resource, cultures differ so vastly that it would be nearly impossible to capture every element of a particular language. Some suggested that we identify a specific target group or context where accurate translation is necessary and crucial, for example, a medical or legal setting. We took this feedback seriously as we developed our final moonshot question.
Some of our peers also suggested focusing on improving translation between groups of similar languages, such as those that fall within the same family or emerged from a common ancestor. However, we felt that since these languages were already so similar, there was less impetus to develop a better translation device.
In addition, others noted that translation apps, though indeed flawed, are still quite useful and will continue to evolve as engineers develop more comprehensive algorithms. Thus, we were encouraged to think bigger in terms of our ultimate goal. While it is true that algorithms will continue to advance, we wanted to stress how difficult it is to capture non-computational aspects of language, such as emotion, in an algorithm; being able to do so would certainly qualify as a moonshot. However, we are choosing not to explore this more technical avenue, as our team lacks the experience with machine learning and artificial intelligence needed to begin developing such an algorithm.
We felt that shifting our focus onto specific contexts where this problem is particularly prevalent, such as the proposed medical and legal arenas, was the best decision because these contexts rely heavily on accurate communication. An example of how inaccurate translation can harm people in these settings: in 1980 in Florida, an 18-year-old, Willie Ramirez, was admitted to a hospital in a comatose state. His friends and family, who spoke only Spanish, tried to explain to the medical personnel what had happened. The family believed he had been poisoned, so they described him as “intoxicado”, which in Spanish suggests poisoning; a bilingual staff member, however, interpreted it as the English “intoxicated”, which more commonly describes drug or alcohol use. As a result, doctors treated Willie for an apparent drug overdose rather than a poisoning. In reality, he was suffering from an intracerebral hemorrhage, and the delay in proper treatment left him quadriplegic.
The fact is that language barriers can significantly affect access to and use of health care, patient-physician communication, and the quality, safety, and satisfaction of services provided. Medicine is just one context, and we invite you to imagine how this issue presents itself in others, such as legal and mental health settings.
It is worth noting that individuals with limited English proficiency are actually guaranteed language services under Title VI of the Civil Rights Act of 1964, which prohibits discrimination on the basis of race, color, or national origin by any organization receiving federal funding. However, the law came with no associated funding, which means these organizations are not funded to provide language services. One consideration we might look into, then, is how to make such services more affordable and accessible.
Our general moonshot question could be essentially refined to: How can we affordably integrate knowledge of emotions and context into technology to provide a more comprehensive translating experience? We focus on technology in particular because it is integrated into most of our daily lives and communities, and is far more accessible than professional translators and formal education.
Current translation technologies and their issues:
Our research into this area revealed several issues with current translation technologies.
A paper by Koehn & Knowles (2017) highlighted several issues facing Neural Machine Translation (NMT) today:
Firstly, NMT easily runs into issues with unpopular or “low-resource” languages that have fewer texts on which models can be trained. As a result, phrases from religious texts such as the Bible and the Koran, which exist in many languages, can appear as the output for garbage inputs. Religious texts contain many rare words that occur almost nowhere else, so rare, uncommon, or nonsensical input words can trigger these texts as output, especially when such texts make up a large proportion of the model’s training corpus.
[Image: a real-life example we tried in Google Translate]
Secondly, NMT does poorly on out-of-domain data, and generally performs badly when translating specialized domains such as legal or financial text. The paper’s chart shows the quality of systems trained on one domain (row) and tested on another (column); compared with statistical systems, the NMT systems (green bars) show more degraded performance out of domain.
In addition, NMT proves equally poor at informal text, such as messages or blog posts, where much of the language used may be slang or colloquial.
It is also difficult to control the quality of NMT, since words often have multiple translations. Typical machine translation systems score over a lattice of possible translations for a source sentence. This search is usually handled by a heuristic technique called beam search, which explores a subset of the space of possible translations. When predicting the next output word, the system does not commit only to the highest-scoring word; it also keeps the next-best-scoring words in a list of partial translations. Since the number of partial translations explodes exponentially with each new output word, the list is pruned down to a beam of the highest-scoring partial translations. A larger beam size lets the system explore more of the space of possible translations and, in principle, find better output; in practice, however, increasing beam size does not consistently improve translation quality.
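To make the pruning step concrete, here is a minimal beam-search sketch in Python. The next-word table and its log-probability scores are invented stand-ins for a real NMT decoder network; only the search procedure itself reflects the technique described above.

```python
# Hypothetical "model": maps a partial translation (tuple of words) to
# candidate next words with log-probabilities. A real NMT decoder network
# would play this role; this toy table is invented for illustration.
MODEL = {
    ():                       {"yo": -0.5, "estoy": -0.9},
    ("yo",):                  {"soy": -0.4, "estoy": -1.2},
    ("estoy",):               {"feliz": -0.3, "alegre": -1.5},
    ("yo", "soy"):            {"feliz": -0.2, "dichoso": -2.0},
    ("yo", "estoy"):          {"feliz": -0.3},
    ("estoy", "feliz"):       {"<eos>": -0.1},
    ("estoy", "alegre"):      {"<eos>": -0.1},
    ("yo", "soy", "feliz"):   {"<eos>": -0.1},
    ("yo", "soy", "dichoso"): {"<eos>": -0.1},
    ("yo", "estoy", "feliz"): {"<eos>": -0.1},
}

def beam_search(beam_size, max_len=4):
    # Each hypothesis is (log_prob, words); start from the empty prefix.
    beam = [(0.0, ())]
    finished = []
    for _ in range(max_len + 1):
        candidates = []
        for score, words in beam:
            for word, logp in MODEL.get(words, {}).items():
                if word == "<eos>":
                    finished.append((score + logp, words))
                else:
                    candidates.append((score + logp, words + (word,)))
        # Prune: keep only the beam_size highest-scoring partial translations.
        beam = sorted(candidates, reverse=True)[:beam_size]
        if not beam:
            break
    return [" ".join(w) for s, w in sorted(finished, reverse=True)]

print(beam_search(beam_size=1))  # greedy: commits to the locally best word only
print(beam_search(beam_size=3))  # wider beam keeps more hypotheses alive
```

With beam size 1 the search is purely greedy; with beam size 3 it also recovers alternatives like “estoy feliz” that a greedy search never visits, which illustrates why beam width changes (but does not reliably improve) the output.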
Another problem with translation in general is culturally specific words. Some languages are loaded with cultural terms and expressions: words denoting concrete objects or abstract concepts tied to religious beliefs, social habits, customs and traditions, moral values, or a lifestyle specific to the culture in question. These are usually difficult to translate because the cultural context is too vague or unrelatable for outsiders, reflecting as it does a society’s world view, beliefs, emotions, and values. Even cultural concepts that seem universal may not be interpreted the same way: each language carries its own interpretation according to its people’s way of thinking, living, and even geographic position, e.g. the understanding of ‘conscience’ in Russian vs. English vs. Arabic.
Some hospitals have already implemented language interpretation programs; however, many doctors, to save time or money, elect to use their own language skills or an ad hoc interpreter. Even bilingual staff may not be up to the task, since someone with only high school or college language training would struggle to translate specialized medical terminology, such as descriptions of cancer treatment options. A research paper by Flores (2012) found that ad hoc interpreters produced errors of potential consequence at roughly the same rate as having no interpreter at all (around 20%); only professional interpreters translated with a significantly lower likelihood of such errors (12%).
Other options include Remote Simultaneous Medical Interpreting, where the clinician and patient each use a headset connected to an interpreter at a remote location. This approach, modeled after the UN interpreting system, allows for fast, reliable communication in a variety of languages. But phone interpreters are limited in that they cannot see non-verbal cues, so some care providers have begun to incorporate video conferencing with interpreters via tablets, laptops, and smartphones, although these services can be expensive. Moreover, these remote interpreters are sometimes not certified medical interpreters (and thus not necessarily conversant in medical terminology), and talking to an interpreter through a phone can be confusing for an elderly patient or one with dementia.
Other companies have engineered smartphone translation and interpretation applications specialized in common health care phrases and nomenclature. But such technologies are not perfect, and many physicians remain skeptical, advocating that they not be used for “safety-critical tasks”.
As a result, there is clearly still a gap in translating technologies that must be addressed.
Several possible ideas and how they work:
- Direct Brain Stimulation to Translation
Identifying which parts of the brain register semantic and even emotional information about words, and directly stimulating those parts in conjunction with the word, so as to associate the feeling/context with the word itself.
Current research has found that the insula, an area of the brain deep inside the cerebral cortex, takes in the information people tell us and then searches our memory for a relevant experience that can give the information context. Subconsciously, we find similar experiences and add them to what is happening in the moment, and the whole package of information is then sent to the limbic regions, the part of the brain just below the cerebrum. Mapping out the insula might therefore help us.
- Reading physiological and emotional states while inputting translations
Incorporating physiological and emotional state detectors, e.g. something similar to lie detector technology, to give the AI/technology context about the person’s emotional state. If the AI can “read” this physiological and emotional input and integrate it into the translation, it might produce more accurate output, especially for ambiguous or double-meaning words.
- A device that picks up your emotional state and uses that to better rank translation outcomes
Using brain studies to understand the different emotional contexts in which a particular word or phrase is used, and integrating that information into the translation to increase the probability of offering a more contextualized option.
Currently, Google displays the most common or likely translation of an input regardless of context; here, the device would adjust the output based on the person’s emotional state and the contexts associated with it.
- Manually inputting the context of the situation before the AI starts working
When inputting the text, there would be a separate text box where you can write, e.g., “Place: At Hospital” or “Condition: Critical/Emergency”, and the device would use that to make a best-guess translation.
Limitations: not timely - translation must often be done rapidly. Also, how can we account for enough context - what is "enough"?
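As a rough sketch of this idea, the snippet below uses manually entered context fields to choose among candidate senses of an ambiguous word. The sense table and the matching rule are invented for illustration, not drawn from any real translation system.

```python
# Hypothetical sense table: candidate English renderings of a Spanish word,
# keyed by the kind of context in which each sense is most plausible.
SENSES = {
    "intoxicado": {
        "default": "intoxicated",
        "hospital": "poisoned",
        "party": "drunk",
    }
}

def translate_with_context(word, context_fields):
    """Pick a sense using user-supplied fields like {"Place": "At Hospital"}."""
    senses = SENSES.get(word, {})
    for value in context_fields.values():
        text = value.lower()
        # A field value mentioning a known context keyword selects that sense.
        for sense_key, translation in senses.items():
            if sense_key != "default" and sense_key in text:
                return translation
    # No context matched: fall back to the most common sense.
    return senses.get("default", word)

print(translate_with_context("intoxicado", {"Place": "At Hospital",
                                            "Condition": "Emergency"}))
# → poisoned
print(translate_with_context("intoxicado", {}))
# → intoxicated
```

Even this toy version shows the limitation noted above: the output is only as good as the context fields someone had time to type in.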
- Including context in AI technology
Involving the hermeneutic circle: in some settings, such as medicine, entire databases about a person already exist (typically covering previous illnesses, family history, and current situation), and giving the translation AI access to this information might help it integrate an individual’s situation and culture, and what each word means to that person. This would help provide a more personalized translation.
Limitations: collecting that information would require massive infrastructure and may not be cost-effective. There is also the question of privacy.
- Improving AI’s training set
Employing human translators to provide the training repertoire for the AI. This would include emotional text as well as professional legal/medical texts, and could use [closed captions] to capture emotional context.
- Broadening output produced by existing translators
Google Translate, for example, is good at offering several interpretations of a single word, sorted by frequency: inputting ‘happy’ to be translated into Spanish returns ‘feliz’ (frequency level 3), ‘alegre’ (level 2), and ‘dichoso’ (level 1). However, slightly complicating the input to form a sentence, e.g. “I am happy”, returns only one result, “Yo soy feliz”. Perhaps it would be useful to rank different interpretations of a sentence as is done for single words.
However, while this may be feasible for sentences, doing so for entire passages becomes much messier and more complicated algorithmically.
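A minimal sketch of what frequency-ranked sentence output could look like, mirroring the 1–3 frequency levels Google Translate shows for single words. The candidate list and its scores are invented for illustration:

```python
# Hypothetical candidate translations with (invented) corpus-frequency scores.
CANDIDATES = {
    "I am happy": [
        ("Estoy feliz", 0.55),
        ("Soy feliz", 0.30),
        ("Estoy alegre", 0.10),
        ("Me siento dichoso", 0.05),
    ]
}

def ranked_translations(sentence, k=3):
    """Return the top-k candidates with a 1-3 frequency level, like word output."""
    options = sorted(CANDIDATES.get(sentence, []),
                     key=lambda pair: pair[1], reverse=True)

    def level(score):
        # Map a raw score onto the coarse 3/2/1 frequency display.
        return 3 if score >= 0.4 else 2 if score >= 0.1 else 1

    return [(text, level(score)) for text, score in options[:k]]

for text, lvl in ranked_translations("I am happy"):
    print(f"{text} (level {lvl})")
```

The point of the sketch is only the interface change: returning a short ranked list instead of a single string, so the reader (not the algorithm) makes the final choice.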
- Broadening language database to include particular dialects
In Spanish, for instance, the use of certain terminology varies significantly across regions: though Argentinians, Mexicans, and Panamanians all speak the same language, they have very different approaches to it. The same pattern appears in many other languages. Translation would be improved if the scope could be narrowed to a particular dialect rather than generalizing across all of them.
Unfortunately, this solution would be limited to more popular and widely documented languages, and it would not ease much of the tension associated with medical and legal jargon.
- Ranking translation results by context rather than frequency
In the case of “intoxicado”, perhaps the translated result would be along the lines of:
Social: intoxicated, drunk
Medical: poisoned etc.
Ranking and producing output by frequency certainly yields a faster response. However, in these contexts the concern is not what most people generally say, but what this particular patient is trying to say. The practitioner would still have to make the final judgment, which may not be the right one.
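A toy sketch of what this context-grouped output could look like for the “intoxicado” case. The lexicon entries are invented for illustration; the idea is only that the translator returns candidates grouped under context labels, leaving the final judgment to the reader:

```python
# Hypothetical context-labeled lexicon for one ambiguous word.
LEXICON = {
    "intoxicado": {
        "Social": ["intoxicated", "drunk"],
        "Medical": ["poisoned"],
    }
}

def grouped_translations(word):
    """Return one display line per context group instead of a single best guess."""
    groups = LEXICON.get(word)
    if not groups:
        return [f"{word}: (no entry)"]
    return [f"{label}: {', '.join(options)}" for label, options in groups.items()]

for line in grouped_translations("intoxicado"):
    print(line)
# Social: intoxicated, drunk
# Medical: poisoned
```

Compared with frequency ranking, this trades speed and simplicity for surfacing the medically relevant sense that a frequency-only ranking would bury.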
- Integrating real time online community forums (where people ask for translation help) into machine learning algorithms
Oftentimes, Google Translate gives you one translation, but the responses in an online forum may differ. These are not formal resources like literary texts, but they feature communities of people each offering their own perspective or interpretation of a word or phrase, and posters usually indicate where they are from and what the word means to them. If it were possible to crowdsource translation from all over the world, at all times, this could help provide real-time, person-centric translations.
Limitations: there needs to be a way to incentivise/monetise this enterprise. Also, who would claim responsibility for mistranslations? Also, what happens to rare languages with few speakers, or areas where there is little online connectivity?
That's what we have so far, thank you for reading up to here! Look forward to your thoughts! (: