I'm currently using the boilerpipe API to extract text from news articles, but it doesn't fully work. For example, see this link. Although boilerpipe captures all of the main text, it also picks up unimportant text such as "Chat with us on Facebook Messenger." Are there any viable alternatives to boilerpipe, or is there a way to configure it so that it isolates the main article text more reliably?
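As a stopgap, the stray lines can be dropped after extraction. The following is only a minimal sketch, assuming the extracted text is a plain string with one block per line. The phrase list and the `strip_boilerplate` name are hypothetical and are not part of boilerpipe:

```python
# Hypothetical post-filter applied to already-extracted text.
# Nothing here is part of boilerpipe's API; the phrase list is maintained by hand.
BOILERPLATE_PHRASES = (
    "chat with us on facebook messenger",
)


def strip_boilerplate(text, phrases=BOILERPLATE_PHRASES):
    """Drop every line that contains a known boilerplate phrase (case-insensitive)."""
    kept = []
    for line in text.splitlines():
        lowered = line.lower()
        if any(phrase in lowered for phrase in phrases):
            continue  # skip lines matching a known boilerplate phrase
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "Main article text.\nChat with us on Facebook Messenger.\nMore article text."
    print(strip_boilerplate(sample))
```

This only removes phrases that are already known, so it does not generalize to new sites; it complements a better extractor rather than replacing one.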
Did you have any success with this? – Pritam Banerjee Oct 14 '16 at 09:08
Have you tried the [newspaper](http://newspaper.readthedocs.io/en/latest/) library in Python? – Om Prakash Jan 03 '18 at 13:12