Dec
10
Filed Under (Semantic Intelligence) by M.Varone on 10-12-2008

Have you tried out the machine translation tools I was talking about last Wednesday? Were you surprised by the results? I believe you were.

 

If you translated short and specific texts (I chose articles of about ten lines in Russian, a language of which I don’t know a single word), you may have obtained reasonably good results: in my case, the translation performed by Promt (a system based on shallow linguistic technology) was quite fair in general, as were the ones performed by the Google tools (statistical technology).

 

If you tried to translate full articles, you have surely realised that the results are nearly always disappointing (and sometimes hilarious): no matter which system is used, the errors are plentiful, sentences have no logical structure, and in most cases the text makes no sense.

 

So, considering the results, why should we use these systems?

 

Because in some cases we may just need to get a general idea of the content: a general idea is better than nothing in a world where very few people know more than two or three foreign languages (and often just one). But in most cases we actually need to understand what we are reading, so we can’t go far with this approach.

 

If we want machine translation to become truly useful, we need to consider a new perspective and a different approach: semantics as the base technology, the foundation on which to work in order to find the missing piece of the puzzle and to truly understand the general meaning of the text.

 

 

 

Dec
03

Is it possible to develop a machine translation system able to overcome the many limitations of the existing systems, limitations that in fact prevent any practical application?

 

I believe so, but we need to abandon the idea of an easy and quick solution obtained through some magic statistical formula. Instead we need semantic comprehension of the content, a large quantity of conceptual information, and a great deal of work for each language to be managed.

 

To realize how disappointing the state of the art is, we just need to test the two main systems representing the current approaches:

  • statistical technology used by Google language tools;
  • linguistic technology (without semantics), for example the translation software made available by PROMT.

 

Have you ever tried these systems? I will talk about them very soon, so you may want to test them a little beforehand and compare your impressions.