Friday, July 8, 2016

Overview of Expert MT Systems -- tauyou

This is a guest post by Diego Bartolome, the CEO of tauyou, whom I regard as an expert MT developer with verifiable competence and a track record of success with MT. It is one in a series of upcoming posts introducing readers of this blog to competent MT technology alternatives available in the market today. I am aware of several successful MT implementations tauyou has had, from presentations made by their customers at industry conferences. I have also had several conversations with Diego in the past and felt it would be good to highlight his company and its capabilities, in his own words, as I think he offers solutions and services that are especially well suited for LSPs who realize that MT engine development is best left to experts.

tauyou <language technology> was created 10 years ago and initially had a completely different objective, which was to put machine translation in your pocket, in your mobile, everywhere. We won many prizes and were in the media continuously thanks to our innovation, but the truth is that we lacked the most important thing in a company: recurring revenue. It took us too long to realize this; it was not until late 2008 that we pivoted to machine translation solutions for the language industry in general, and Language Service Providers in particular. The pivoting period was tough, because we were running out of money even though our burn rate was extremely low, but we managed to survive, reached a state of sustainability, and have grown continuously since then.
In 2009, selling MT to LSPs was tough. On many calls, the responses ranged from a simple NO without any further explanation to the typical objections, which at that time were somewhat true: MT is the enemy, we will never use MT, it doesn't work for my use case, MT quality is too bad, etc. This has changed over the years, and since the beginning of 2012 we have been getting active requests from LSPs that want to integrate MT into their workflow and from companies that need a specialized custom MT solution. We are experts in customizing MT and putting it to work in the least possible time. Also, in the past three years, we have seen an explosion in the use of MT for post-editing, and also in the use of raw MT, and we currently offer baseline engines that companies can use if little or no data is available. Another key aspect of providing MT technology is integration into existing production systems. Having APIs that allow clients to connect to our engines for real-time translation in an easy way has been a great asset in succeeding in the MT landscape. No matter what CAT tool you use, either the tauyou product is already integrated into it, or it can be integrated in a short period of time.
Our process is extremely customized, and we adapt everything to the unique client use case. We have deep knowledge of the technology and its various support components, so we can customize our technology building blocks according to the very specific application that we are dealing with. The first engines we built took ages to become productive, but today some engines are ready for production use in a matter of hours thanks to our continuously improving technology platform! The hardest part for our clients is still recruiting post-editors in some language combinations and verticals. Once the engines are built, they continuously improve thanks to the corrective linguistic feedback of translators that we rapidly incorporate back into the system, the automatic post-editing rules we extract, and frequent incremental retraining. I would say that integrating editor feedback is key, and we help translation companies engage post-editors, either theirs or the more than 1,500 post-editors we have in our database.
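
To make the idea of extracting automatic post-editing rules from translator feedback more concrete, here is a minimal sketch in Python. It is only an illustration of one plausible approach, not a description of how tauyou's platform actually works: the function name `extract_candidate_rules`, the `min_count` threshold, and the sample sentences are all hypothetical. It aligns each raw MT output with its post-edited version using the standard `difflib` module and counts the token-level replacements that recur.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_candidate_rules(pairs, min_count=2):
    """Count recurring token replacements between MT output and post-edits.

    `pairs` is an iterable of (mt_output, post_edited) strings. Returns a
    list of (old, new, count) tuples for replacements seen at least
    `min_count` times, most frequent first. Hypothetical helper for
    illustration only.
    """
    counts = Counter()
    for mt_output, post_edited in pairs:
        mt_tokens = mt_output.split()
        pe_tokens = post_edited.split()
        matcher = SequenceMatcher(a=mt_tokens, b=pe_tokens, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Only substitutions are kept; pure insertions and deletions
            # are ignored in this simplified sketch.
            if tag == "replace":
                old = " ".join(mt_tokens[i1:i2])
                new = " ".join(pe_tokens[j1:j2])
                counts[(old, new)] += 1
    return [(old, new, n) for (old, new), n in counts.most_common() if n >= min_count]


# Made-up feedback data: raw MT output paired with the translator's correction.
feedback = [
    ("click the button save", "click the button Save"),
    ("press save now", "press Save now"),
    ("open the file", "open the document"),
]
rules = extract_candidate_rules(feedback)
```

In this toy data the correction "save" to "Save" occurs twice and so passes the threshold, while the one-off "file" to "document" does not. A real system would need context-sensitive rules and human review before applying anything automatically.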

Engaging translators is key to the success of the MT initiative, and we regularly have calls together with our clients to improve communication and really make it work for all parties. The key element for many translators might be fair compensation for MT-related work, but there are also many who see the possibility of learning a new skill and of providing guidance that helps the MT engine become better. If the MT is better, it is also better for the translator! We have evolved and are now often considered the translator's best friend.

Thanks to our team of NLP engineers, we have developed many support modules to enhance the effectiveness of our MT systems. These include:
  • Named Entity Recognition,
  • Statistical analysis of the source content,
  • Glossaries,
  • Forbidden word lists,
  • Automatic post-editing rules,
  • Extraction of unknown words,
  • Perfect tag positioning,
  • Estimation of the MT Output Quality,
  • Summarization technology,
  • Classification of the source content to use the best MT,
  • Detection of events,
  • Spelling and Grammar checking, etc.
We can integrate any open-source tool that is available from the NLP (Natural Language Processing) and computational linguistics community at large, or develop custom applications for our customers, as we have done in the past. What is more important now is our thorough knowledge of the process and our extensive experience, which enrich the customer workflow. We also try to make it easy for clients to embrace change and achieve a significant return on their MT technology investments.
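
As an illustration of how a few of the support modules listed above might fit together, the following is a minimal Python sketch of a glossary check, a forbidden word list, and automatic post-editing rules applied to MT output. Everything in it is an assumption made for the example: the word lists, the rules, and the function names are hypothetical and do not reflect tauyou's actual implementation.

```python
import re

# Hypothetical example data; none of it comes from tauyou.
FORBIDDEN_WORDS = {"cheap", "guys"}
GLOSSARY = {"ordenador": "computer"}  # source term -> required target term
POST_EDIT_RULES = [
    (re.compile(r"\s+([,.;:!?])"), r"\1"),                 # no space before punctuation
    (re.compile(r"\be-mail\b", re.IGNORECASE), "email"),   # house spelling
]


def apply_post_edit_rules(text):
    """Apply each regex rule in order to the MT output."""
    for pattern, replacement in POST_EDIT_RULES:
        text = pattern.sub(replacement, text)
    return text


def find_forbidden_words(text):
    """Return the forbidden words present in the text, sorted alphabetically."""
    tokens = re.findall(r"[\w'-]+", text.lower())
    return sorted(set(tokens) & FORBIDDEN_WORDS)


def find_glossary_violations(source, target):
    """Return (source_term, expected_target_term) pairs that were not respected."""
    src, tgt = source.lower(), target.lower()
    return [(s, t) for s, t in GLOSSARY.items() if s in src and t not in tgt]


cleaned = apply_post_edit_rules("Send an E-mail , please .")
flagged = find_forbidden_words("These guys sell cheap laptops.")
violations = find_glossary_violations("Enciende el ordenador.", "Turn on the machine.")
```

Here `cleaned` becomes "Send an email, please.", `flagged` lists both forbidden words, and `violations` reports that "ordenador" was not rendered as "computer". Simple substring matching like this ignores inflection and word boundaries, so a production module would need proper tokenization and morphology.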

The best engines we have produced are related to the region of the world where we are based, i.e. language pairs including Spanish as source or target, such as bidirectional Spanish - Catalan, French, Portuguese, Italian, etc. Some of the engines that we are most proud of include generic bidirectional English - Danish, Swedish, and Norwegian engines that were developed for a major client with large TMs in the Nordic languages, on which we applied many NLP techniques and some innovative algorithms developed just for them to reach an impressive outcome. Other language pairs with English that we can be proud of include Japanese, Korean, Hebrew, and Chinese. However, some clients also call us to develop engines that do not have English as source or target, where competitors do not perform so well, e.g. Danish into French, German into Swedish, or French into German, to name a few. If our clients have good data, the process becomes easier in any language.

Several slide decks from presentations we have made in the past are available to view here. This one may be the most interesting for somebody who is not familiar with us: The discreet charm of machine translation. Here is another link with several past presentations.

Recent applications of these MT technologies include chat translation for companies with internal employees that need to know a topic really well, and also real-time social media translation. We can plug our MT into any need! In these cases, we use our stock/baseline engines, and they frequently outperform Google or Microsoft depending on the vertical and language pair. The advantage of our technology is that it can be installed within the client data center, which keeps the data confidential and gives the client full control. Thanks to our NLP knowledge and expertise, we now have a predictive typing technology to assist translators and content writers, a tool for Project Managers to automatically select the best translator for a given project based on their previous work history, and an automatic content generation tool, as well as others in development.

Our pricing is fairly simple: it's just a flat rate per month depending on the number of engines, which includes some development and expert consulting work. The translation volume is unlimited, so clients can translate as many words as they want! Prices start as low as $1100 per month, with no set-up fee and no upfront or hidden costs. Also, the model can be adjusted monthly based on the plan, without minimum commitment if the solution is installed online in a SaaS mode. It's better to try it and learn directly whether it will work for your application than to assume it won't work and possibly lose revenue and profit for your company! Either you succeed or you learn.

Recently, we have started to look into Neural Machine Translation (NMT) together with our partner Prompsit Language Engineering. Although the results are promising, we think that the technology still needs time to evolve before it is practical in a realistic business case for MT post-editing. However, there might be cases and languages, such as Japanese or German, where Neural MT will outperform our hybrid machine translation in the near future. In any case, just as rule-based MT has not been replaced by Statistical Machine Translation (SMT), SMT still has some years of life left in it. Any company interested in being a leader in the MT space has to invest in R&D, and NMT is definitely the technology to research.
The translation future is MT-based :-) 

You can contact tauyou at the address below for more information. Please use the hashtag #emptypages in your communications to receive a special promotion. 


phone: +34 93 711 29 96
address: C/ Les Planes 39, 1o 2a
08201 Sabadell - Spain
