I'm trying to convert a bz2 dump to text with Wikipedia Extractor (https://github.com/attardi/wikiextractor). I downloaded the Wikipedia dump with the .bz2 extension and then ran this on the command line:

python Wikiextractor.py -b 85M -o extracted D:\wikiextractor-master\wikiextractor\zhwiki-latest-pages-articles.xml.bz2

After it finished preprocessing the pages, it failed with an error (shown in a screenshot that is not reproduced here).

How can I fix this?

stacy1212

1 Answer

I ran into the same problem. It is likely caused by the StringIO issue on Windows. I re-ran the script on Windows Subsystem for Linux (WSL) and it worked.
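If you go the WSL route, the dump path from the question has to be written in its Linux form. Below is a minimal sketch of that conversion, assuming the default WSL setup where Windows drives are mounted under /mnt/<drive letter>. The helper name `windows_to_wsl_path` is made up for illustration and is not part of WikiExtractor.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl_path(win_path: str) -> str:
    """Map a drive-letter Windows path to its WSL equivalent.

    Assumes the default WSL mount layout (/mnt/<lowercase drive letter>).
    Hypothetical helper, not part of WikiExtractor.
    """
    p = PureWindowsPath(win_path)
    drive = p.drive  # e.g. "D:" for an absolute drive-letter path
    if len(drive) != 2 or drive[1] != ":" or not p.is_absolute():
        raise ValueError(f"expected an absolute drive-letter path, got {win_path!r}")
    # p.parts[0] is the anchor ("D:\\"); the rest are the path components.
    return str(PurePosixPath("/mnt", drive[0].lower(), *p.parts[1:]))


if __name__ == "__main__":
    dump = r"D:\wikiextractor-master\wikiextractor\zhwiki-latest-pages-articles.xml.bz2"
    print(windows_to_wsl_path(dump))
```

The printed path can then be passed to the same command as in the question, run from a WSL shell, keeping the `-b 85M -o extracted` options unchanged.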