Possible Duplicate:
How to extract textual contents from a web page?
I have searched a lot but have not been able to find what I'm looking for. I want to extract data from a web page (only the main content, such as the article on a news page). While googling I found many open-source tools such as boilerpipe and JTidy, but I want to write my own code for this. I have programming experience in Java and hope to implement it in Java. Is there any way to do this without using open-source libraries?
Can you point me to a good tutorial for this?
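To make the goal concrete, here is a rough sketch of the kind of library-free approach I have in mind, using only the JDK. The class name `NaiveArticleExtractor`, the helper names, and the heuristic itself (keep `<p>` blocks whose text is at least `minChars` long) are assumptions made up for illustration, not an established algorithm. It uses regular expressions rather than a real HTML parser, so it will likely fail on malformed markup, and it does not decode entities such as `&amp;`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveArticleExtractor {
    // Matches whole <script>...</script> and <style>...</style> blocks.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)[^>]*>.*?</\\1>");
    // Matches <p ...>...</p>; group 1 is the paragraph's inner HTML.
    private static final Pattern PARAGRAPH =
            Pattern.compile("(?is)<p(?:\\s[^>]*)?>(.*?)</p>");
    // Matches any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]+>");

    // Hypothetical helper: drops tags and collapses runs of whitespace.
    static String stripTags(String fragment) {
        String noTags = TAG.matcher(fragment).replaceAll(" ");
        return noTags.replaceAll("\\s+", " ").trim();
    }

    // Assumed heuristic: short paragraphs are usually menus or captions,
    // so keep only paragraphs with at least minChars characters of text.
    static String extractMainText(String html, int minChars) {
        String cleaned = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        Matcher m = PARAGRAPH.matcher(cleaned);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String text = stripTags(m.group(1));
            if (text.length() >= minChars) {
                if (out.length() > 0) {
                    out.append('\n');
                }
                out.append(text);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{color:red}</style></head><body>"
                + "<p>Menu</p>"
                + "<p>This is the <b>first</b> long paragraph of the article body.</p>"
                + "<script>var x = 1;</script>"
                + "<p>Second paragraph with enough words to pass the length filter.</p>"
                + "</body></html>";
        System.out.println(extractMainText(html, 20));
    }
}
```

Is something along these lines a reasonable starting point, or is there a better-known technique for locating the main content block?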