I am fetching data from Twitter in JSON format and storing it in a file.

consumer_key = 'Consumer KEY'
consumer_secret = 'Secret'
access_token = 'Token'
access_secret = 'Access Secret'

auth = OAuthHandler(consumer_key, consumer_secret)

auth.set_access_token(access_token, access_secret)

api = tweepy.API(auth)

os.chdir('Path')
file = open('TwData.json','wb')

for status in tweepy.Cursor(api.home_timeline).items(15):
    simplejson.dump(status._json,file,sort_keys = True)
file.close

But I am getting the following error:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Users/abc/anaconda/lib/python3.6/json/__init__.py", line 180, in dump
    fp.write(chunk)
TypeError: a bytes-like object is required, not 'str'
cs95
Ritesh

2 Answers


From the json.dump() documentation:

The json module always produces str objects, not bytes objects. Therefore, fp.write() must support str input.

You opened the file in binary mode, so its write() method only accepts bytes. Don't do that; remove the b from the file mode:

file = open('TwData.json','w')
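
The difference between the two modes can be reproduced without calling Twitter at all. The following is a minimal sketch that uses in-memory streams from the standard io module as stand-ins for the file; the sample dictionary is made up for illustration and is not real Twitter data:

```python
import io
import json

sample = {"id": 1, "text": "hello"}  # made-up stand-in for status._json

# A binary stream behaves like a file opened with 'wb': write() accepts only bytes.
binary_stream = io.BytesIO()
try:
    json.dump(sample, binary_stream)
    binary_error = None
except TypeError as exc:
    binary_error = exc  # same kind of error as in the question's traceback

# A text stream behaves like a file opened with 'w': write() accepts str.
text_stream = io.StringIO()
json.dump(sample, text_stream, sort_keys=True)
text_output = text_stream.getvalue()
```

Here binary_error ends up holding a TypeError, while text_output holds the serialised JSON string.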

It's better to use an absolute path than to change the working directory. If you use the file as a context manager (with the with statement), it is closed automatically when the block ends, which avoids errors such as forgetting to actually call the file.close() method.

If you are going to write multiple JSON documents to the file, at least put a newline after each document, making it a JSON Lines file; that is much easier to parse again later on:

with open('Path/TwData.json', 'w') as file:
    for status in tweepy.Cursor(api.home_timeline).items(15):
        json.dump(status._json, file, sort_keys=True)
        file.write('\n')
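
To illustrate why the newline-delimited layout is easy to parse later, here is a rough sketch of a round trip. The records and the temporary file location are assumptions made up for this example; in the real script the records would be the status._json dictionaries:

```python
import json
import os
import tempfile

# Made-up records standing in for the status._json dictionaries.
records = [{"id": 1, "text": "first"}, {"id": 2, "text": "second"}]

# Hypothetical location; a temporary directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), "TwData.json")

# Write one JSON document per line.
with open(path, "w") as file:
    for record in records:
        json.dump(record, file, sort_keys=True)
        file.write("\n")

# Read it back: every non-empty line is a complete JSON document.
with open(path) as file:
    loaded = [json.loads(line) for line in file if line.strip()]
```

Because each line parses on its own, the file can also be processed one record at a time instead of being loaded whole.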

Alternatively, put everything into a single top-level object, such as a mapping or a list, and write that one object to the file to create a valid JSON document.

Martijn Pieters
  • What if we actually *want* to store in a binary file? – user32882 Jun 06 '21 at 03:01
  • @user32882 then use `json.dumps()` to produce a string, and then encode that string to bytes. Or wrap your binary file object in a [`TextIOWrapper` instance](https://docs.python.org/3/library/io.html#io.TextIOWrapper), and pass that wrapper to `json.dump()`. You probably want to set `write_through=True` and don’t forget to call [`.detach()`](https://docs.python.org/3/library/io.html#io.TextIOBase.detach) once done. – Martijn Pieters Jun 06 '21 at 23:37
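
The two options from the comment above might look roughly like this. It is only a sketch: io.BytesIO stands in for a file opened with 'wb', the sample dictionary is made up, and UTF-8 is an assumed encoding choice:

```python
import io
import json

sample = {"id": 1, "text": "hello"}  # made-up stand-in for status._json

# Option 1: serialise to str with json.dumps(), then encode to bytes yourself.
binary_file = io.BytesIO()  # stand-in for open(..., 'wb')
binary_file.write(json.dumps(sample, sort_keys=True).encode("utf-8"))
option1 = binary_file.getvalue()

# Option 2: wrap the binary file so json.dump() can write str to it.
binary_file2 = io.BytesIO()
wrapper = io.TextIOWrapper(binary_file2, encoding="utf-8", write_through=True)
json.dump(sample, wrapper, sort_keys=True)
wrapper.detach()  # flush and release the binary file without closing it
option2 = binary_file2.getvalue()
```

Both options should leave the same bytes in the underlying binary stream; after detach() the wrapper can no longer be used, but the binary file stays open.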

Don't store separate JSON objects in the file. Append each one to a list, and then dump the whole list at once.

with open('TwData.json','w') as file:
    data = []
    for status in tweepy.Cursor(api.home_timeline).items(15):
        data.append(status._json)

    simplejson.dump(data, file, sort_keys=True)

Note also that you shouldn't open the file in binary mode if you want to write text to it.
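
To see why separate objects written back to back are a problem, consider this small sketch. The two dictionaries are made up, and the standard json module is used here in place of simplejson:

```python
import json

# Two made-up documents written back to back, as repeated dump() calls would produce.
concatenated = json.dumps({"id": 1}) + json.dumps({"id": 2})

try:
    json.loads(concatenated)
    parse_error = None
except json.JSONDecodeError as exc:
    parse_error = exc  # the parser rejects the data after the first object

# The same records wrapped in one list form a single valid document.
as_list = json.dumps([{"id": 1}, {"id": 2}])
round_trip = json.loads(as_list)
```

The concatenated text fails to parse as one document, while the list version loads back into the original records.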

cs95
  • Very useful, I appreciate it. – Ritesh Sep 11 '17 at 11:06
  • My bad, I wasn't aware of that. I wanted to accept both answers, but since I had already accepted Martijn's, it won't let me accept yours. Both are equally useful to me. – Ritesh Sep 11 '17 at 11:08