
I'm trying to write only the following JSON fields to a file, but for some reason the entire response gets written to the .html file. What am I doing wrong? It should only output the fields referenced, e.g. title, audio source URL, medium-sized image, etc.

r    = urllib.urlopen('https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=10000&page=1')
data = json.loads(r.read().decode('utf-8'))
for post in data['posts']:
#    data.append([post['title'], post['audioSource'], post['image']['medium'], post['excerpt']['long']])
    ([post['title'], post['audioSource'], post['image']['medium'], post['excerpt']['long']])
with io.open('criminal-json.html', 'w', encoding='utf-8') as r:
  r.write(json.dumps(data, ensure_ascii=False))
leopheard
  • Your code doesn't run (missing imports, and a `for` loop whose body is only a do-nothing expression). `data` is still the whole JSON object, and that's what's being written to the file. Create a minimal, complete, and verifiable example that works and reproduces the issue. – Mark Tolonen Jul 14 '19 at 04:13
  • You have to create a new list, append to it, and write that list. – furas Jul 14 '19 at 04:23

2 Answers


You need to keep your input data separate from your output data. In your for loop, the list you build from each post is never stored anywhere, so the variable data that holds the input is also the one you write out, unchanged. Add the selected fields from the input to a separate list that holds the output, and write that list.

Don't re-use the same variable names. Here is what you want:

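import urllib  # Python 2; on Python 3 use urllib.request.urlopen instead
import json
import io

# Fetch and parse the full JSON response (the input).
response = urllib.urlopen('https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=10000&page=1')
data = json.loads(response.read().decode('utf-8'))

# Collect only the wanted fields in a separate list (the output).
posts = []
for post in data['posts']:
    posts.append([post['title'], post['audioSource'], post['image']['medium'], post['excerpt']['long']])

# Write the filtered list, not the original data.
with io.open('criminal-json.html', 'w', encoding='utf-8') as f:
    f.write(json.dumps(posts, ensure_ascii=False))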
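
If you are on Python 3, a variant of the same idea might look roughly like this. It is only a sketch: the function names `select_fields`, `fetch` and `save` are made up for illustration, and it stores each post as a dict (with key names chosen here) rather than as a list, so the output stays self-describing.

```python
import io
import json
from urllib.request import urlopen  # Python 3 home of urlopen


def select_fields(data):
    """Return only the wanted fields for each post (hypothetical helper)."""
    return [
        {
            "title": post["title"],
            "audioSource": post["audioSource"],
            "image": post["image"]["medium"],
            "excerpt": post["excerpt"]["long"],
        }
        for post in data["posts"]
    ]


def fetch(url):
    """Download the URL and parse the body as JSON (hypothetical helper)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def save(posts, path):
    """Write the filtered posts to path as UTF-8 JSON (hypothetical helper)."""
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(json.dumps(posts, ensure_ascii=False))
```

Usage would be something like `save(select_fields(fetch(url)), 'criminal-json.html')`, with `url` set to the episodes endpoint from the question.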
Calder White
Mark Tolonen

You are loading the whole JSON response into the variable data and then dumping it without changing it. That is why the entire response ends up in the file. What you need to do is put whatever you want into a new variable and then dump that.

Look at this line:

([post['title'], post['audioSource'], post['image']['medium'], post['excerpt']['long']])

It builds a list and immediately discards it, so it does nothing and data remains unchanged. Do what Mark Tolonen suggested and it'll be fine.
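
To see this for yourself, here is a small offline sketch. The sample dict is invented for illustration and is not the real API response; it only mimics the `posts` key from the question.

```python
import json

# Made-up stand-in for the parsed response (assumption, not real API data).
data = {"count": 2, "posts": [{"title": "A"}, {"title": "B"}]}

before = json.dumps(data, sort_keys=True)
for post in data["posts"]:
    [post["title"]]  # bare expression: the list is built, then thrown away
after = json.dumps(data, sort_keys=True)

unchanged = before == after  # the loop above had no effect on data

# Storing the result in a separate list is what actually keeps it.
titles = []
for post in data["posts"]:
    titles.append(post["title"])
```

After this runs, `unchanged` is true and `titles` holds only the selected values, which is the list you would pass to `json.dumps`.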

Fuad Rafid