What is the best way to detect if a web page is a programming tutorial?
By that I mean a web page that has programming code in its content, e.g. http://www.djangorocks.com/tutorials/how-to-create-a-basic-blog-in-django/starting-your-application.html
If a web page uses proper semantic HTML, you can parse it for the `code` element. However, this doesn't ensure that you will find every web page that presents code as readable content.
More information: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/code
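As a rough illustration of the parsing idea, here is a minimal sketch using Python's standard-library `html.parser`. The class name `CodeElementCounter`, the function `looks_like_tutorial`, and the thresholds `min_blocks` and `min_chars` are all made up for this example and would need tuning against real pages; it is a heuristic, not a reliable classifier.

```python
from html.parser import HTMLParser


class CodeElementCounter(HTMLParser):
    """Count <code> elements and the non-whitespace text inside them."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # current nesting depth inside <code>
        self.blocks = 0      # number of outermost <code> elements seen
        self.code_chars = 0  # non-whitespace characters found inside <code>

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            if self.depth == 0:
                self.blocks += 1
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "code" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            # Ignore whitespace so indentation does not inflate the count.
            self.code_chars += len("".join(data.split()))


def looks_like_tutorial(html, min_blocks=2, min_chars=40):
    """Hypothetical heuristic: enough <code> elements with enough text.

    The default thresholds are arbitrary assumptions, not measured values.
    """
    parser = CodeElementCounter()
    parser.feed(html)
    parser.close()
    return parser.blocks >= min_blocks and parser.code_chars >= min_chars
```

You would fetch the page yourself and pass its HTML string to `looks_like_tutorial`. Expect false negatives on pages that show code without the `code` element, and false positives on pages that use it for something else.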
[...] `<code>` just for its formatting properties. It predates any semantic markup initiatives by a large margin. Also, even many of the semantically motivated uses are for types of code other than program code (configuration data, markup examples, etc.).
– tripleee
Dec 13 '15 at 19:39