I am trying to print out the first and last 1000 lines of a document using "prettify" from BeautifulSoup. Due to captcha issues with the Gutenberg site, I saved a copy of Kafka's The Metamorphosis to my hard drive, and I've successfully created a BeautifulSoup object from it:

from bs4 import BeautifulSoup
page = open('meta.htm', 'r').read()
soup = BeautifulSoup(page, "lxml")

How do I use soup.prettify() to print out the first and last 1000 lines of the document?

alecxe
James

1 Answer

Split the prettified output into a list of lines, then slice the list:

result = soup.prettify().splitlines()
print('\n'.join(result[:1000] + result[-1000:]))
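One caveat with the slice above: if the prettified document has fewer than 2000 lines, the two slices overlap and some lines get printed twice. Here is a minimal sketch of a guard, assuming a hypothetical helper named `head_and_tail` (not part of BeautifulSoup) that takes the string returned by `soup.prettify()`:

```python
def head_and_tail(text, n=1000):
    """Return the first and last n lines of text, without repeating lines."""
    lines = text.splitlines()
    if len(lines) <= 2 * n:
        # The head and tail slices would overlap, so return everything once.
        return '\n'.join(lines)
    return '\n'.join(lines[:n] + lines[-n:])


# Small stand-in for soup.prettify(); with a real document you would call
# print(head_and_tail(soup.prettify()))
sample = '\n'.join('line %d' % i for i in range(10))
print(head_and_tail(sample, n=2))
```

With the ten-line stand-in and `n=2`, this prints lines 0, 1, 8 and 9.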
alecxe
  • That worked great. Thanks! How would I need to change this code if I just wanted to print out the first 1000 characters instead of the first 1000 lines of the document? – James Sep 05 '16 at 13:54
  • @James sure, just do `print('\n'.join(result[:1000]))` for the first 1000 lines and `print('\n'.join(result[-1000:]))` for the last 1000 lines. – alecxe Sep 05 '16 at 14:50
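
For the first 1000 characters rather than lines, the string returned by `soup.prettify()` can be sliced directly, since Python strings accept the same slice syntax as lists. A minimal sketch, using a short stand-in string in place of the real document (the variable names are only illustrative):

```python
# Stand-in for soup.prettify(), which returns a str
pretty = '<html>\n <body>\n  <p>\n   One morning...\n  </p>\n </body>\n</html>'

first_chars = pretty[:10]   # use [:1000] for the first 1000 characters
last_chars = pretty[-7:]    # use [-1000:] for the last 1000 characters
print(first_chars)
print(last_chars)
```

Note that the character count includes the newlines and indentation that `prettify()` adds.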