1

I want to translate a string using Google Translator.

My sample string is "this is my string".

I want to use HTML Agility Pack to parse HTML documents.

I tried this:

using HtmlAgilityPack; 

........

var webGet = new HtmlWeb();
var document = webGet.Load(
    "http://translate.google.com/#en/bn/this%20is%20my%20string");

var node = document.DocumentNode.SelectNodes(
    "//span[@class='short_text' and @id='result_box']");

if (node != null)
{
    foreach (var xx in node)
    {
        string x = xx.InnerText;
        MessageBox.Show(x);
    }
}

But I get no results.

My aim is to translate a complete string using Google Translate and to show the translated string in a label in Windows Forms.

How can I do this?

Paolo Moretti
  • 5
    Why aren't you using the [Google Translate API](https://developers.google.com/translate/)? Trying to skirt billing? Tsk, tsk. – casperOne Jan 15 '13 at 13:53
  • 5
    Rather than relying on screen-scraping, you should consider looking into using the API that google makes available for the translate service. Some documentation can be found [here](https://developers.google.com/translate/v2/getting_started) – Henrik Aasted Sørensen Jan 15 '13 at 13:51
  • 1
    "You cannot take this road, Sir because the other one has a toll booth." Yeah... no I don't see it. – Wim Ombelets Jan 15 '13 at 13:58
  • I want to translate from English to Bengali, but the Google Translate API does not list Bengali as an available language. –  Jan 15 '13 at 13:59

3 Answers

6

This is a bad idea. As commenters have pointed out, Google offers a programmatic interface as a paid service. Google surely has security features in place to try to block exactly what you are doing, and that is why it isn't working. Perhaps you could get it working somehow, but even then you would always be in danger of Google improving its security and your script being blocked again. In addition, you are almost certainly breaking the Google terms of use.

2017 Update: Microsoft Translator API now supports Bengali, and is free for up to two million characters per month.

  • I want to translate from English to Bengali, but the Google Translate API does not list Bengali as an available language. –  Jan 15 '13 at 14:01
  • @Bayazid, everything I have seen indicates that the API works with the same languages as their translate page. Are you sure that is the case? Perhaps you are just looking at some outdated documentation. –  Jan 15 '13 at 14:10
  • 1
    Can I use the API for free, or do I have to pay to use it in my WinForms dictionary app? –  Jan 15 '13 at 14:14
  • @Bayazid, yes you have to pay. –  Jan 15 '13 at 14:18
  • Is there any other way to do what I want? –  Jan 15 '13 at 14:20
  • 1
    @Bayazid Find another translation API? Or write your own? – JDB Jan 15 '13 at 14:26
  • 1
    Bing seems to be the consensus choice for a free translation API now (http://www.microsoft.com/web/post/using-the-free-bing-translation-apis). However, they don't support Bengali yet. MyMemory has an API that is free for up to 2500 requests per day, and they do support Bengali: http://mymemory.translated.net/doc/spec.php –  Jan 15 '13 at 14:30
  • Proxy servers FTW. – Darth Continent Mar 10 '17 at 00:58
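
The MyMemory suggestion in the comments above could be sketched roughly as follows. This is only an illustration, not a verified client: the endpoint `https://api.mymemory.translated.net/get`, its `q` and `langpair` parameters, and the `responseData.translatedText` field are assumptions that should be checked against the MyMemory spec linked in that comment. The class and method names are hypothetical, and `System.Text.Json` may need to be added as a NuGet package on older .NET Framework projects.

```csharp
using System;
using System.Text.Json;

static class MyMemorySketch
{
    // Assumed endpoint; confirm it against the MyMemory spec before relying on it.
    private const string Endpoint = "https://api.mymemory.translated.net/get";

    // Builds the GET URL for a text and a language pair such as "en" -> "bn".
    public static string BuildRequestUrl(string text, string source, string target)
    {
        string langPair = source + "|" + target;
        // EscapeDataString percent-encodes spaces, '|', '&' and other reserved characters.
        return Endpoint
            + "?q=" + Uri.EscapeDataString(text)
            + "&langpair=" + Uri.EscapeDataString(langPair);
    }

    // Pulls the translation out of a JSON response body.
    // Assumes the shape {"responseData":{"translatedText":"..."}}.
    public static string ExtractTranslation(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement
                .GetProperty("responseData")
                .GetProperty("translatedText")
                .GetString();
        }
    }
}
```

In a Windows Forms app, the URL from `BuildRequestUrl("this is my string", "en", "bn")` would be downloaded with any HTTP client, and the result of `ExtractTranslation` assigned to the label's `Text` property.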
2

Basic example using HTML Agility Pack

using System;
using HtmlAgilityPack;

class Translator
    {
        private string url;
        private HtmlWeb web;
        private HtmlDocument htmlDoc;

        // langPair = "SL|TL" (source language | target language), e.g. "en|pt"
        public Translator(string langPair)
        {
            this.url = "http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair=" + langPair;
            this.web = new HtmlWeb();
            this.htmlDoc = new HtmlDocument();
        }

        public string Translate(string input)
        {
            // EscapeDataString also escapes '&', '#' and '?', which would otherwise break the query string.
            this.htmlDoc = web.Load(String.Format(this.url, Uri.EscapeDataString(input)));
            HtmlNode htmlNode = htmlDoc.DocumentNode.SelectSingleNode("//*[@id=\"result_box\"]");
            // SelectSingleNode returns null when the page has no element with id "result_box".
            return htmlNode != null ? htmlNode.InnerText : null;
        }
    }

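What's wrong in your example is just the URL you used. Everything after the # is a URL fragment, which is never sent to the server, so the page you download contains no translation. Try inspecting the document.Text property to see the HTML that webGet received: you will see that span#result_box is empty.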
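
One way to check the point about the URL is to look at how `System.Uri` splits the address from the question. The sketch below is only an illustration (the class and method names are hypothetical): it shows that the part sent in the HTTP request is just the root path, while the language pair and text sit in the fragment, which stays on the client.

```csharp
using System;

static class FragmentDemo
{
    // The path and query are what the client puts in the HTTP request.
    public static string SentToServer(string url)
    {
        return new Uri(url).PathAndQuery;
    }

    // The fragment (including the leading '#') is kept by the client
    // and is not part of the HTTP request.
    public static string KeptOnClient(string url)
    {
        return new Uri(url).Fragment;
    }
}
```

Calling `FragmentDemo.SentToServer("http://translate.google.com/#en/bn/this%20is%20my%20string")` returns only `/`, so the server never sees the text to translate.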

Overnow
0

Rather than relying on screen-scraping, you should consider looking into using the API that google makes available for the translate service.

Some documentation can be found [here](https://developers.google.com/translate/v2/getting_started).

Update:

I believe the problem with your screen-scraping approach may be that the translate application uses Ajax to call the server and retrieve the translation. The page you get when downloading with HtmlWeb is merely the JS application; it doesn't actually contain the translation. That is not filled in until after the page has made a call to the server.

Henrik Aasted Sørensen