I'm trying to extract the text from an HTML document in Go, using the goquery library. My code is below:
document, err := goquery.NewDocumentFromReader(r)
if err != nil {
	log.Fatalln(err)
}
document.Find("script").Remove()
document.Find("style").Remove()
text := document.Find("body").Text()
The result still contains HTML tags. How can I remove the tags and keep only the text?
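As a stopgap, one option is to post-process the extracted string and drop anything that still looks like a tag. The sketch below uses only the standard library; `stripTags` is a hypothetical helper, not a goquery function, and it assumes the leftover markup consists of simple `<...>` pairs. A literal `<` in ordinary text (for example `a < b`) would swallow everything up to the next `>`, so this is a crude workaround and not a real fix.

```go
package main

import (
	"fmt"
	"strings"
)

// stripTags is a hypothetical helper (not part of goquery).
// It drops every character from a '<' up to and including the next '>'
// and keeps everything else unchanged.
func stripTags(s string) string {
	var b strings.Builder
	inTag := false
	for _, r := range s {
		switch {
		case r == '<':
			// Start of something that looks like a tag.
			inTag = true
		case r == '>' && inTag:
			// End of the tag; resume copying text.
			inTag = false
		case !inTag:
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	// In the code above, this would be applied to the value of `text`.
	fmt.Println(stripTags("Hello <b>world</b>")) // prints "Hello world"
}
```

In the snippet from the question, this would be called as `text = stripTags(text)` after the `Text()` call.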