
Lately, I've been trying to store the source code of some pages so I can scrape what I need from them later, without having to worry about internet access or possible anti-scraping measures. My first approach was to save the output of bs.prettify() for each link into a column of the same DataFrame. After a while, I realized I can't navigate the parse tree on those values (for example, accessing bs.h1), because prettify() returns a plain string rather than a BeautifulSoup object. So, is there a way to turn the string from bs.prettify() back into a navigable BeautifulSoup object? Or is there a better way to keep the source code for later use than storing it in a DataFrame?
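To make the situation concrete, here is a minimal sketch of what I mean, assuming the stored value is just an HTML string. The sample markup, the URL, the column names and the `to_soup` helper are made up for illustration; the only library calls involved are the `BeautifulSoup(markup, "html.parser")` constructor and `prettify()`. It suggests that passing the stored string back through the constructor gives a navigable object again:

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page (in practice, e.g. the response body).
raw_html = (
    "<html><head><title>Demo</title></head>"
    "<body><h1>Hello</h1><p class='intro'>Some text</p></body></html>"
)

# Store the page source as a plain string in a DataFrame column.
df = pd.DataFrame({"url": ["https://example.com/demo"], "html": [raw_html]})


def to_soup(html):
    """Re-parse a stored HTML string into a navigable BeautifulSoup object."""
    return BeautifulSoup(html, "html.parser")


# Re-parsing the stored string restores tree navigation such as soup.h1.
soup = to_soup(df.loc[0, "html"])
heading = soup.h1.get_text()

# prettify() returns a str, so its output can be re-parsed the same way.
# It adds newlines and indentation, hence strip=True when reading text back.
pretty = soup.prettify()
heading_from_pretty = to_soup(pretty).h1.get_text(strip=True)
```

Since prettify() changes the whitespace inside text nodes, storing the untouched page source (rather than the prettified string) looks like the safer choice if exact text matters later.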

Juan C

0 Answers