>>> import bz2
>>> bz2.compress('hi')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ryan/anaconda/lib/python3.4/bz2.py", line 498, in compress
    return comp.compress(data) + comp.flush()
TypeError: 'str' does not support the buffer interface

I've seen examples that use strings as input, but it won't work for me.

Serjik
Ryan Halabi

1 Answer


Compression algorithms compress bytes, not text. In Python 3, `'hi'` is a `str` (Unicode text), so pass a bytes literal or encode the string first.

>>> bz2.compress(b'hi')
b'BZh91AY&SY\x9a\x89\xb4"\x00\x00\x00\x01\x00\x00` \x00!\x00\x82\xb1w$S\x85\t\t\xa8\x9bB '
>>> bz2.compress('hi'.encode('utf-8'))
b'BZh91AY&SY\x9a\x89\xb4"\x00\x00\x00\x01\x00\x00` \x00!\x00\x82\xb1w$S\x85\t\t\xa8\x9bB '
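
For completeness, a minimal sketch of the full round trip in Python 3, assuming the text is meant to be stored as UTF-8 (the variable names are illustrative only): encode before compressing, and decode after decompressing.

```python
import bz2

text = 'hi'

# Python 3 str is Unicode text, so encode it to bytes before compressing.
compressed = bz2.compress(text.encode('utf-8'))

# bz2.decompress returns bytes; decode with the same codec to get a str back.
restored = bz2.decompress(compressed).decode('utf-8')

print(restored == text)
```

The same codec has to be used on both sides; UTF-8 is only an assumption here.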
Ignacio Vazquez-Abrams
  • Thanks! Do you know why in the examples shown here they use strings then? https://pymotw.com/2/bz2/ – Ryan Halabi Dec 23 '15 at 05:34
  • @RyanHalabi: `str` in Python 2.x is a bytestring. – Ignacio Vazquez-Abrams Dec 23 '15 at 05:35
  • Why doesn't `a == c` in the following? `a = 'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'`, `b = a.encode('utf-8')`, `c = b'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'` (sorry for the poor formatting, it won't let me space it out) – Ryan Halabi Dec 23 '15 at 05:57
  • @RyanHalabi: [Because UTF-8 encodes characters above U+007F as multiple bytes.](http://www.joelonsoftware.com/articles/Unicode.html) – Ignacio Vazquez-Abrams Dec 23 '15 at 06:21
  • @RyanHalabi: Whenever you view pymotw.com, make sure the URL starts with pymotw.com/3 so that you are reading the version for Python 3. Here you should use https://pymotw.com/3/bz2/ – Raymond Sep 09 '19 at 04:18
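
The UTF-8 point made in the comments can be checked with a short sketch. It uses a shortened three-character sample rather than the full string from the comment, and the variable names are illustrative. In a Python 3 `str` literal, `'\xaf'` is the code point U+00AF, which UTF-8 encodes as two bytes, whereas `b'\xaf'` is the single byte 0xAF.

```python
a = 'A\xaf\x82'    # str: three code points U+0041, U+00AF, U+0082
c = b'A\xaf\x82'   # bytes: three bytes 0x41, 0xAF, 0x82

utf8 = a.encode('utf-8')      # code points above U+007F become two bytes each
latin1 = a.encode('latin-1')  # latin-1 maps U+0000..U+00FF to one byte each

print(utf8)         # b'A\xc2\xaf\xc2\x82'
print(utf8 == c)    # False
print(latin1 == c)  # True
```

The latin-1 step is only there to show where the extra bytes come from. Compressed output is safer kept as `bytes` from the start than pasted into a `str` literal and re-encoded.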