I am trying to do sentiment analysis for non-English languages such as Japanese, Chinese and German. Is there a machine translator available for translating documents in these languages into English? I am working in Java, so I need to be able to call the API or tool from Java code. I have already used the Google Translate API, so please suggest something other than that.

  • I sincerely doubt you'll be able to perform anything close to [sentiment analysis](http://en.wikipedia.org/wiki/Sentiment_analysis) on Japanese text using only an English translator. If you already have something implemented for English, chances are that it will need to be at least partially reimplemented in order to work with most other cultures. – Frédéric Hamidi Dec 19 '10 at 11:49
  • Knowing some Japanese: you can understand every word said and still not understand what is meant. Japanese speakers imply much more than we do in English, which is far more direct. E.g. Japanese speakers don't like to say no, but have many ways of sounding positive while really meaning no. It is a bit like in English when you say thank you in a strained, overly polite way: you are being sarcastic and actually mean the opposite of what you say. Japanese has many more subtleties like this than English. – Peter Lawrey Dec 19 '10 at 12:01
  • An example: a friend studied for a master's in Japanese literature, lived in Japan and had a Japanese boyfriend for five years. Yet when he proposed marriage, she had no idea he had done so. :P There is a traditional poem about nature and the cycle of life which implies he wants her to spend the rest of their lives together; she accepts by saying the second half of the poem. – Peter Lawrey Dec 19 '10 at 12:05
  • It depends on what you plan to do with it. If you are just looking for broad market sentiment, you might just need to calibrate your processes for different languages/cultures. (You might expect to do that anyway.) E.g. I would expect the use of Chinese in Taiwan, Hong Kong and Shanghai to be rather different. – Peter Lawrey Dec 19 '10 at 12:15
  • Peter, I agree with you on the accuracy, but I am planning to use it for testing purposes first. I will run my tests on a language-rich corpus, so please suggest a good machine translator. – Jagdeep Dec 19 '10 at 12:30
  • What's wrong with Google translate? This is going to be far and away the best natural language translator available - certainly for free. – Richard H Dec 19 '10 at 14:43
  • @Jagdeep. Your SA will be highly dependent on the subject area. I cannot comment on discourse or informal writing; I would expect these to be very difficult in any language. Even in the formality of science (where I work) sentiment depends critically on phrasing. For example, "It was possible to make X" and "it was not impossible to make X" have greatly different sentiments in English, but I would expect that translators may remove the nuances. From my own experience I would want to build a Tree Bank but, again, I am not an expert. – peter.murray.rust Dec 19 '10 at 16:58

2 Answers

Sentiment analysis is highly dependent on both the culture and the domain of practice (see http://en.wikipedia.org/wiki/Sentiment_analysis). We are working on SA for scientific texts, and it is undoubtedly still a research area. So I don't think you will find anything off-the-shelf, either for the human language or for the SA.

peter.murray.rust
  • Yes, I agree with you, but in many cases this approach has been a great help. So please suggest a machine translator; then I will analyze the results of my sentiment analysis. – Jagdeep Dec 19 '10 at 12:31
  • @Jagdeep, unless you understand the original language and what the underlying sentiment is, how will you test it? What will you compare your result with? – Peter Lawrey Dec 19 '10 at 13:03
There are plenty of Machine Translation APIs: Google, Microsoft, Yandex, IBM, PROMT, Systran, Baidu, etc.

You may want to look at our recent evaluation study (November 2017): https://www.slideshare.net/KonstantinSavenkov/state-of-the-machine-translation-by-intento-november-2017-81574321

However, it's not clear how MT quality scores correlate with the quality of sentiment analysis run on the translated output. That's something we are going to explore soon.
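Until such numbers exist, one way to check this yourself is to keep the translator behind a small interface, run your existing English sentiment analyzer on the translated text, and compare the predictions against labels assigned by someone who reads the source language. The Java sketch below is only an illustration under those assumptions: `Translator`, `SentimentClassifier`, `TableTranslator`, `KeywordClassifier` and `TranslatedSentimentEval` are hypothetical names, the translator is a fixed lookup table rather than a call to any real provider, and the German sentences and their gold labels are made up for the example.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical abstraction: wrap whichever MT provider you choose behind this. */
interface Translator {
    String toEnglish(String text, String sourceLang);
}

/** Hypothetical abstraction over your existing English sentiment analyzer. */
interface SentimentClassifier {
    String classify(String englishText); // "positive", "negative" or "neutral"
}

/** Stand-in translator backed by a fixed table; a real one would call a provider's API. */
final class TableTranslator implements Translator {
    private final Map<String, String> table;

    TableTranslator(Map<String, String> table) {
        this.table = table;
    }

    @Override
    public String toEnglish(String text, String sourceLang) {
        // Unknown input is passed through unchanged.
        return table.getOrDefault(text, text);
    }
}

/** Toy keyword classifier, present only to make the sketch runnable. */
final class KeywordClassifier implements SentimentClassifier {
    private static final List<String> POSITIVE = List.of("good", "great", "excellent");
    private static final List<String> NEGATIVE = List.of("bad", "poor", "terrible");

    @Override
    public String classify(String englishText) {
        int score = 0;
        for (String token : englishText.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (POSITIVE.contains(token)) score++;
            if (NEGATIVE.contains(token)) score--;
        }
        return score > 0 ? "positive" : score < 0 ? "negative" : "neutral";
    }
}

final class TranslatedSentimentEval {
    /** Fraction of documents whose label on the translation matches the gold label. */
    static double accuracy(Map<String, String> goldBySource, String lang,
                           Translator mt, SentimentClassifier sa) {
        if (goldBySource.isEmpty()) return 0.0;
        int hits = 0;
        for (Map.Entry<String, String> e : goldBySource.entrySet()) {
            String predicted = sa.classify(mt.toEnglish(e.getKey(), lang));
            if (predicted.equals(e.getValue())) hits++;
        }
        return (double) hits / goldBySource.size();
    }

    public static void main(String[] args) {
        // Made-up German sentences with hand-written translations (assumption).
        Translator mt = new TableTranslator(Map.of(
                "Das Essen war ausgezeichnet.", "The food was excellent.",
                "Der Service war schlecht.", "The service was bad.",
                "Das war nicht schlecht.", "That was not bad."));
        // Gold labels as a reader of the source language might assign them (assumption).
        Map<String, String> gold = Map.of(
                "Das Essen war ausgezeichnet.", "positive",
                "Der Service war schlecht.", "negative",
                "Das war nicht schlecht.", "positive");
        double acc = TranslatedSentimentEval.accuracy(gold, "de", mt, new KeywordClassifier());
        System.out.println("Accuracy on translated text: " + acc);
    }
}
```

With this toy data the keyword classifier gets two of the three sentences right; it misreads "not bad" as negative, which is the kind of nuance loss raised in the comments on the question. Swapping in a different MT provider only requires another `Translator` implementation, so the same gold-labelled set can be used to compare providers.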

savenkov