Friday, May 30, 2014

Monolithic MT or 50 Shades of Grey?

In the many discussions about machine translation across the professional translation world, we see a great deal of conflation and confusion, because most people assume that all MT is equivalent and that any MT under discussion is largely identical in every respect. Here is a slightly modified description of conflation from Wikipedia.
Conflation occurs when the identities of two or more implementations, concepts, or products, sharing some characteristics of one another, seem to be a single identity, and the differences appear to become lost. In logic, it is the practice of treating two distinct MT variants as if they were one, which produces errors or misunderstandings, as a fusion of distinct subjects tends to obscure analysis of relationships which are emphasized by contrasts.
However, there are many reasons to question this “all MT is the same” assumption. There are in fact many variants of MT, and it is useful to have a general understanding of the core characteristics of each so that a more meaningful and productive dialogue can be had about how the technology can be used. This is particularly true in discussions with translators, where the general understanding is that all the variants are essentially the same. This can be seen clearly in the comments on the last post about improving the dialogue with translators. Misunderstandings are common when people use the same words to mean very different things.

There may be some who view my characterizations as opinionated and biased, and perhaps they are, but I do feel that in general they are fair and reasonable, and that most people who have been examining the possibilities of this technology for a while will likely agree with some, if not all, of them.

The broadest characterization that can be made about MT concerns the methodology used to develop the systems: Rule-based MT (RbMT), Statistical MT (SMT), or some kind of hybrid, since today users of both methodologies claim to have a hybrid approach. If you know what you are doing, both can work for you, but for the most part the world has definitely moved away from RbMT and towards statistically based approaches, and the greatest amount of commercial and research activity is around evolving SMT technology. I have written about this previously, but we continue to see misleading information about it, even from alleged experts. For practitioners, the technology you use has a definite impact on the kind and degree of control you have over the MT output during system development, so one should care what technology is used. Skills and expertise that are valuable in SMT may not be as useful with RbMT, and vice versa, and both are complex enough that real expertise only comes from continuing focus, deep exposure and long-term experience.

The next level of MT categorization that I think is useful is the following:
  • Free Online MT (Google, Bing Translate etc.)
  • Open Source MT Toolkits (Moses & Apertium)
  • Expert Proprietary MT Systems

Free Online MT (Google, Bing Translate etc.)

The toughest challenge in machine translation is the one that online MT providers like Google and Bing Translate attempt to address. They want to translate anything that anybody wants to translate, instantly, across thousands of language pairs. Historically, Systran and some other RbMT systems also addressed this challenge on a smaller scale, but the SMT-based solutions have easily surpassed the output quality of these older RbMT systems in a few short years. The quality of these MT systems varies by language, with the best output produced in Romance languages (FR, IT, ES, PT) and the worst in languages like Korean, Turkish and Hungarian and, of course, most African, Indic and less common Asian languages. Thus the Spanish experience with “MT” is significantly different from the Korean one or the Hindi one.

This is the most visible “MT” and the most widely used translation technology across the globe; in fact, it is where most of the translation done on the planet today is done. Users number in the hundreds of millions per month. It is also what most translators mean when they complain about “poor MT quality”, and what most people in the professional translation world are referring to when they mention “MT”.

For a professional translator, there are very limited customization and tuning capabilities, though Microsoft does allow some level of customization depending on user data availability. Even so, the generic system output can be very useful to translators working with Romance languages and can save typing time if nothing else.

Open Source MT Toolkits (Moses & Apertium)

I will confine the bulk of my comments to Moses, mostly because I know next to nothing about Apertium other than that it is an open source RbMT tool. Moses is an open source SMT toolkit that allows anybody with a little bit of translation memory data to experiment and develop a personal MT system. Such a system can only be as good as the data and the expertise of the people using the tools, and I think it is quite fair to say that the bulk of Moses systems produce worse output than the major online generic MT systems. This does not mean that Moses users/developers cannot develop superior domain-focused systems, but the data, skills and ancillary tools needed to do so are not easily acquired and are, I believe, definitely missing in any instant DIY MT scenario. There is a growing suite of instant Moses-based MT solutions that make it easy to produce an engine of some kind, but do not necessarily make it easy to produce MT systems that meet professional use standards. For successful professional use, the output quality requirements are generally higher than what is acceptable for the average user of Google or Bing Translate.

While many know how to upload data into a web portal to build an MT engine of some sort, very few know what to do if the system underperforms (as many initially do). Getting to the source of the problem requires diagnostic, corpus analysis and problem identification skills, and then knowledge of what to fix and how to fix it, as not everything can be fixed. It is after all machine translation, and more akin to a data transformation than a real human translation process. Unfortunately, many translators have been subjected to “fixing” the output from these low quality MT systems, hence the outcry within the translator community about the horrors of “MT”. Most professional translation agencies that attempt to use these instant MT toolkits underestimate the complexity and skills needed to produce good quality systems, and thus we have a situation today where much of the “MT” experience is either generic online MT or low quality do-it-yourself (DIY) implementations. DIY only makes sense if you really do know what you are doing and why you are doing it; otherwise it is just a gamble, or a rough reference on what is possible with “MT”, with no skill required beyond getting data into an uploadable format.

Expert Proprietary MT Systems
Given the complexity, the suite of support tools and the very deep skills required to get MT output to quality levels that provide real business leverage in professional situations, I think it is safe to say that this kind of “MT” is the exception rather than the rule. Here is a link to a detailed overview of how an expert MT development process would differ from a typical DIY scenario. I have seen a few expert MT development scenarios from the inside, and here are some characteristics of the Asia Online MT development environment:
  • The ability to actively steer and enhance the quality of translation output produced by the MT system to critical business requirements and needs.
  • The degree of control over final translation output, using the core engine together with linguist-managed pre-processing and post-processing rules in highly efficient translation production pipelines.
  • Improved terminological consistency with many tools and controls and feedback mechanisms to ensure this.
  • Guidance from experts who have built thousands of MT systems and who have learned and overcome the hundreds of different errors that developers can make that undermine output quality.
  • Improved predictability and consistency in the MT output, thus much more control over the kinds of errors and corrective strategies employed in professional use settings.
  • The ability to continuously improve the output produced by an MT system with small amounts of strategic corrective feedback.
  • Automatic identification and resolution of many fundamental problems that plague any MT development effort.
  • The ability to produce useful MT systems even in scarce data situations by leveraging proprietary data resources and strategically manufacturing the optimal kind of data to improve the post-editing experience.
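
To make the idea of pre-processing and post-processing rules wrapped around a core engine a little more concrete, here is a minimal Python sketch. It is not how Asia Online or any other vendor implements this; every function name, the placeholder format and the glossary entry are assumptions made up for illustration, and `toy_engine` is a hypothetical stand-in for a real MT engine. The sketch masks glossary terms before translation and restores the approved target terms afterwards, which is one simple way to get more consistent terminology.

```python
import re
from typing import Callable, Dict, List, Tuple


def mask_terms(source: str, glossary: Dict[str, str]) -> Tuple[str, List[str]]:
    """Pre-processing rule: swap each glossary source term for a placeholder."""
    targets: List[str] = []
    masked = source
    # Longest terms first so a longer term is not split by a shorter one.
    for term in sorted(glossary, key=len, reverse=True):
        pattern = re.compile(re.escape(term))
        while pattern.search(masked):
            masked = pattern.sub(f"__TERM{len(targets)}__", masked, count=1)
            targets.append(glossary[term])
    return masked, targets


def unmask_terms(translated: str, targets: List[str]) -> str:
    """Post-processing rule: put the approved target term back in place."""
    for index, target in enumerate(targets):
        translated = translated.replace(f"__TERM{index}__", target)
    return translated


def translate(source: str, engine: Callable[[str], str], glossary: Dict[str, str]) -> str:
    """Run pre-processing, the core engine, then post-processing."""
    normalized = " ".join(source.split())  # collapse stray whitespace
    masked, targets = mask_terms(normalized, glossary)
    raw_output = engine(masked)
    return unmask_terms(raw_output, targets)


def toy_engine(text: str) -> str:
    """Hypothetical stand-in for a real engine: a tiny word-for-word lookup."""
    lookup = {"click": "clique em"}
    return " ".join(lookup.get(token, token) for token in text.split())


# Hypothetical glossary entry: a UI label with an approved target form.
glossary = {"View Schedule": "Exibir agenda"}
result = translate("click   View Schedule", toy_engine, glossary)
```

In a real pipeline the rules would be maintained by linguists and the engine call would go to an actual MT system; the point of the sketch is only that the engine never sees the protected term, so it cannot mistranslate it or change its casing.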
So while we observe many discussions about “MT” on the social and professional web, they most often refer to the translator experience with generic MT, as this is the easiest MT to access. In translator forums and blogs the reference can also often be a failed DIY attempt. The best expert MT systems are only used in very specific, client-constrained situations and thus rarely get any visibility, except in some kind of raw form like support knowledge base content, where the production goal is always understandability over linguistic excellence. The very best MT systems, which are very domain focused and used by post-editors who are going through projects at 10,000+ words/day, are usually very client specific and for private use only, and are rarely seen by anybody not involved in these large production projects.

It is important to understand that if any (LSP) competitor can reproduce your MT capabilities by simply throwing some TM data into an instant MT solution, then the business leverage and value of that MT solution is very limited. Having the best MT system in a domain can mean a long-term production cost and quality advantage, and this can provide both business leverage and definite barriers to competition.

When "MT" is used in a professional context, the critical element for success is demonstrated and repeatable skill and a real understanding of how the technology works. The technology can only be as good as the skill, competence and expertise of the developers building these systems. In the right hands many of the MT variants can work, but the technology is complex and sophisticated enough that uninformed use and ignorant development strategies (e.g. upload and pray) can only lead to problems and a very negative experience for those who come down the line to clean up the mess. Usually the cleaners are translators or post-editors, and before they engage in PEMT projects they need to learn to insist that they are working with competent developers who can assimilate and respond to their feedback. I hope that in future they will exercise this power more frequently.

So the next time you read about “MT”, think about what is actually being referred to. Maybe I should start saying Language Studio MT or Google MT or Bing MT or Expert Moses or Instant Moses or Dumb Moses rather than just "MT".

Addendum: added on June 20

This is a post that I just saw, and I think it provides a similar perspective on the MT variants from a vendor-independent point of view. Perhaps we are now getting to a point where more people realize that competence with MT requires more than dumping data into the DIY hopper and expecting it to produce useful results.

Machine translation: separating fact from fiction


  1. Kirti, I think this is a really good idea. I like it a lot!

    The difference in quality of MT output is huge, and I'm sure translators would appreciate knowing which MT engine has been used. They'll be less blind, understand the risk in the job a bit better, and as such it will help them to accept MT jobs.

    It would be a smart move for LSPs to tell their translators what MT engine they used to pre-translate the jobs. I'll include this in my MT-coaching material for LSPs.

    Thanks for this splendid idea!


  2. Kirti, thanks for a very well written article. I will take it as a point of departure for future references to MT. All your points are valid and well received. Thank you for a very clear framework and reference tool.

    By Claudia Brauer

  3. Thanks Kirti for an in-depth description of the different kinds of MT. There is indeed a world of difference between editing the output of expert customised systems and the others.
    By Gillian Searl

  4. I especially like the suggestion that translators say which MT engine produced the translation on which they are commenting.

    By Walter Keutgen

  5. Thanks for this Kirti, I think your categorization of MT is absolutely to the point. Knowing exactly what you are doing, and why you are doing it, this is the best piece of advice when looking at using or implementing a transforming technology (and this can relate to any kind of business technology, not only MT!)

    By Sylvie Guerin

  6. Excellent article. Best comparison of available systems I've read. Thank you.

    By Richard Lankenau

  7. 50 or 5, doesn't matter

    True, the fact is that all MT is not the same - depending on who conceived the system, with what (mis)understanding of the fundamentals of the translation process, with what means available etc.
    Same as not all protein-based translation devices are to be bundled together.
    Having said that, MT is not going to go away just because some people don't like it.
    Nowadays, MT is all too often as useful as a steam-powered Papin car on a modern motorway, but that will keep changing, in only one direction. And maybe faster than anticipated.

  8. There is almost nothing in life that is black or white. Talking about things, especially complicated things, in a very simplified and general manner is more often than not just populism.

    With that in mind, most of what you wrote in this comparison, Kirti, could be mirrored to represent how the MT advocates talk about translators and the translation profession, as well as the application and purpose of MT in a commercial environment.

    As far as conflation goes, many MT advocates/proprietors quite naturally expect translators to do PEMT. I won't go into the whole PEMT debate right now, but those who think that way completely fail to realize that translator and editor are different professions that require different skill sets, and although superficially there is some overlap because both deal with language, one is not interchangeable with the other.

    Then there is the common claim that MT is universally applicable (and some even go so far as claiming it to be a human translator killer) and that everyone should jump aboard or drown, a fallacy that 1) treats all translators as having equal experience, expertise, skill sets and goals; and 2) ignores the segmentation of the market and the demand for high quality, specialized human translation.

    And I could go on, but I won't, for the sake of brevity and to avoid repeating myself, as I've commented about this on a few of your previous posts.

    So yes, not all translators understand the different shades of MT, but there is just as much misconception and ignorance about the market and translation profession from the side of MT advocates.

    And in response to the common claim of many irresponsible MT promoters out there, I will sign off by saying: Yes, MT is not going anywhere (but some of the investors are), and neither is the human translator.

    1. I agree that translation is different from post-editing and that some translators will very likely never have an interest in doing any PEMT. For many new translators, however, PEMT with expert and responsive MT is not so different from working with TM fuzzy matches, even though the errors are different in character. The key is that it should only be used if it actually improves the productivity of the individual editor or translator. It is a tool to aid translators, not to replace them or any kind of premium translation service.

      Hopefully we all get smarter about quickly understanding where it works, who knows how to make it work best, and how to use it with humans to keep it a win-win scenario.

      And while we will see more MT (with many low quality variants) there will ALWAYS be a shortage of good translators with subject matter expertise and this shortage is likely to get worse in future.

  9. Much better? How good are they really?
    The article you linked above talks about different kinds of MT without giving any examples of how good or bad they really are, so where is the evidence?! And the article belittles all translators who think all MT is the same in the sense of its inadequacy. Those translators obviously don't know what they're talking about. Yeah, right.

    There is a good reason why many translators oppose MT completely - but if you can show us some examples of how good a machine translation is - I would gladly see it.
    A good MT to me means a machine that can translate relatively complex sentences CORRECTLY.

    Maybe the article gets the definition of MT wrong - yes, there are many tools that help us translate texts (online dictionaries, CAT tools and TMs, speech recognition programs are just a few), but they are not what is commonly referred to as machine translation.
    A machine translation to me is a translation of simple and complex sentences carried out by a machine, not a human.

    If the translation were good, MT could be a great tool or it could eventually replace human translators, and only programmers of MT machines would be needed.

    Also, "if nothing else, at least MT helps reduce time spent on typing" is not an argument I can support. You need to compare what the machine translated with the original text, and then you have to rearrange words and replace words and phrases, which can be quite a headache and can end up being much more work than a "clean" human translation. Wrong sentence structure, word endings, and mistakes in tense and mood can also be a constant annoyance.

    So show me the result of one of those super MTs,


  10. Not sure I would characterize these as samples of super MT -- but they are indeed customized and domain focused

    Travel Domain EN to ES MT system

    All studios are equipped with a small kitchen, fridge and separate bathroom.
    The hotels facilities include an outdoor swimming pool and a beauty parlour.
    Enjoy typical French cuisine in the traditional restaurant Aux Trois Cochons.
    The hotel is in the heart of the historical centre, near to all major attractions.
    The hotel is located at the heart of the Huangpu District, close to Nanpu Bridge.

    MT of above source sentences:

    Todos los estudios están equipados con una pequeña cocina, nevera y un baño independiente.
    Las instalaciones del hotel incluyen una piscina exterior y un salón de belleza.
    Disfrute de la típica cocina francesa en el restaurante tradicional Aux Trois Cochons.
    El hotel está en el corazón del centro histórico, cerca de todas las atracciones principales.
    El hotel está situado en el corazón del distrito de Huangpu, cerca de puente Nanpu.

    Another set of source sentences from an English to Portuguese IT engine

    Next to the lab that you want to view, click View Schedule.
    Select the content that you want to share in the main training session.
    Specifies that My Meetings is not available in the user's My WebEx area.
    It can take a considerable amount of time to batch import or export.
    Specifies that you want to create a presentation using the On-Demand Module.
    Click Yes to confirm that you want to delete the account.
    Specifies the password to view or download the recording.
    To save the changes you make, click Save.

    MT of these sentences:

    Ao lado do laboratório que você deseja exibir, clique em exibir agenda.
    Selecione o conteúdo que deseja compartilhar na sessão de treinamento principal.
    Especifica que minhas reuniões não está disponível na área meu Webex do usuário.
    Pode levar um tempo considerável para importar ou exportar em lote.
    Especifica se você deseja criar uma apresentação usando o módulo sob demanda.
    Clique em sim para confirmar que você deseja excluir a conta.
    Especifica a senha para exibir ou baixar a gravação.
    Para salvar as alterações feitas, clique em salvar.

    And a few samples from an EN to DE Info Tech MT engine, which is a much more difficult direction for MT engines

    An alternate host can start the meeting and act as the host.
    Individual reports will be kept for 3 months on the server.
    Select the application you want to share from the list of available application.
    Sign in to the site you want to use.
    What is the difference between inserting files and using a URL?
    Additional bandwidth might be required based on your usage.
    They are listed as additional contacts, to help the members.
    Please delete one or more contacts and try again.
    Then invite attendees to use the reserved computers during the scheduled time.
    Click this icon to change any information about the recording that appears in this list.

    Unedited MT below:

    Alternative Gastgeber können das Meeting starten und als Gastgeber handeln.
    Individuelle Berichte für 3 Monate gehalten wird, auf dem Server.
    Wählen Sie die Applikation aus, die sie teilen möchten, wählen Sie aus der Liste der verfügbaren Applikationen.
    Melden Sie sich bei der Site, die Sie verwenden möchten.
    Welcher Unterschied besteht zwischen dem Einfügen von Dateien und dem Verwenden eines URL?
    Zusätzliche Bandbreite erforderlich sein könnten, basierend auf ihre Nutzung.
    Sie werden als zusätzliche Kontakte aufgeführt, um den Mitgliedern zu helfen.
    Bitte löschen Sie einen oder mehrere Kontakte, und versuchen Sie es erneut.
    Dann laden Sie Teilnehmer ein, die reservierten Computer verwenden während der angesetzten Zeit.
    Klicken Sie auf dieses Symbol, um die Informationen zur Aufzeichnung zu ändern, die in dieser Liste angezeigt.

    In the hands of experts these systems will get better as error feedback is given or simply by looking at corrections. The objective is not to replace the human, but rather to raise his/her productivity.