0

I have a standard Android Device Status JSON file that I am trying to read as a Pandas DF and then exporting it to an Excel file.

I am pasting the first two lines of my file below :

{"ageCorrectionFactor":{"d":"Age Correction Factor","i":"1252"},"backCamera":{"d":"Working Fine. No issues","i":"79"},"battery":{"d":"Working Fine. No issues","i":"86"},"bill":{"d":"No"},"bluetooth":{"d":"Working Fine. No issues"},"box":{"d":"No"},"boxHidden":{"d":"Box hidden","i":"467"},"cameraHidden":{"d":"Camera hidden ","i":"494"},"charger":{"d":"No","i":"87"},"chargerHidden":{"d":"Charger hidden","i":"476"},"chargingDefect":{"d":"Working Fine. No issues"},"chargingPortHidden":{"d":"Charging Port Hidden","i":"764"},"earphone":{"d":"No","i":"88"},"frontCamera":{"d":"Front Camera"},"hiddenBattery":{"d":"Hidden battery","i":"777"},"mobileAge":{"d":"Above 11 months","i":"97"},"physicalCondition":{"d":"Physical Condition","i":"800"},"powerButton":{"d":"Working Fine. No issues"},"screen":{"d":"Working Fine. No issues"},"screenHidden":{"d":"NA","i":"220"},"screenIssue":{"d":"Screen Touch Issue"},"speakers":{"d":"Working Fine. No issues"},"tmsPrice":{"d":"TMS Price Improvement"},"volumeButton":{"d":"Working Fine. No issues"},"wifiGpsBluetooth":{"d":"Working Fine. No issues"},"workingNonworking":{"d":"Yes","i":"76"}},
{"ageCorrectionFactor":{"d":"Age Correction Factor","i":"1252"},"backCamera":{"d":"Working Fine. No issues","i":"79"},"battery":{"d":"Working Fine. No issues"},"bill":{"d":"No","i":"90"},"bluetooth":{"d":"Working Fine. No issues"},"box":{"d":"No","i":"89"},"boxHidden":{"d":"Box hidden","i":"467"},"cameraHidden":{"d":"Camera hidden ","i":"496"},"charger":{"d":"No","i":"87"},"chargerHidden":{"d":"Charger hidden","i":"477"},"chargingDefect":{"d":"Working Fine. No issues"},"chargingPortHidden":{"d":"Charging Port Hidden","i":"764"},"earphone":{"d":"No","i":"88"},"frontCamera":{"d":"Front Camera"},"hiddenBattery":{"d":"Hidden battery","i":"779"},"mobileAge":{"d":"Above 11 months","i":"96"},"physicalCondition":{"d":"Physical Condition","i":"91"},"powerButton":{"d":"Working Fine. No issues"},"screen":{"d":"Working Fine. No issues"},"screenHidden":{"d":"NA","i":"219"},"screenIssue":{"d":"Screen Touch Issue"},"speakers":{"d":"Working Fine. No issues"},"tmsPrice":{"d":"TMS Price Improvement"},"volumeButton":{"d":"Working Fine. No issues"},"wifiGpsBluetooth":{"d":"Working Fine. No issues","i":"81"},"workingNonworking":{"d":"Yes"}},

Now I use the read_json() function to convert the file to a Pandas data frame object and I get the following result (a part of the entire first row I am pasting here) :

>>> df.head(1)


ageCorrectionFactor  \
0  {u'i': u'1252', u'd': u'Age Correction Factor'}   

                                        backCamera  \
0  {u'i': u'79', u'd': u'Working Fine. No issues'}   

                                           battery           bill  \
0  {u'i': u'86', u'd': u'Working Fine. No issues'}  {u'd': u'No'}   

                            bluetooth            box  \
0  {u'd': u'Working Fine. No issues'}  {u'd': u'No'}   

Apparently, the issue is I am not able to break down the internal 'key':'value' pair and so I am getting an improper output.

Additionally, I used regex to remove unwanted pairs but my motive is to try not to change any of the original data.

Is there any way that I can have a proper output using either pandas or combination of regex and python's native JSON parsing functions?


The same operation in R yielded a slightly convincing result,

json_file <- fromJSON("E:/pathto/file.json")
json_file <- lapply(json_file, function(x) {
  x[sapply(x, is.null)] <- NA
  unlist(x)
})    
JSON_DF <- as.data.frame(do.call("rbind", json_file))

enter image description here

sunitprasad1
  • 768
  • 2
  • 12
  • 28

2 Answers2

2

Try using Vaishali Garg's method, but first load the file with the json module.

import json
import pandas as pd

with open('E:/pathto/file.json') as f:
    data = json.load(f)

df = pd.io.json.json_normalize(data)
rtk22
  • 351
  • 3
  • 8
1

Try this:

df = pd.io.json.json_normalize(f) #f is the json filename

Its returning a dataframe with 40 columns

Vaishali
  • 37,545
  • 5
  • 58
  • 86
  • You sure? I am getting an `AttributeError` `Traceback (most recent call last): File "E:/Python_Un/DataImporting/JSON.py", line 103, in DF2 = Po.io.json.json_normalize(Dir+AndroidDevFile) File "C:\Python27\lib\site-packages\pandas\io\json.py", line 793, in json_normalize if any([isinstance(x, dict) for x in compat.itervalues(data[0])]): File "C:\Python27\lib\site-packages\pandas\compat\__init__.py", line 171, in itervalues return obj.itervalues(**kw) AttributeError: 'str' object has no attribute 'itervalues'` – sunitprasad1 Jan 31 '17 at 19:37