43

I have this short version of ADSB json data and would like to convert it into dataFrame columns as Icao, Alt, Lat, Long, Spd, Cou.....

After Alperen told me to do this

df = pd.read_json('2016-06-20-2359Z.json', lines=True), 

I can load it into a DataFrame. However, df.acList is

[{'Id': 10537990, 'Rcvr': 1, 'HasSig': False, ...
Name: acList, dtype: object

How can I get the Icao, Alt, Lat, Long, Spd, Cou data?

 "src":1,
   "feeds":[  
      {  
         "id":1,
         "name":"ADSBexchange.com",
         "polarPlot":false
      }
   ],
   "srcFeed":1,
   "showSil":true,
   "showFlg":true,
   "showPic":true,
   "flgH":20,
   "flgW":85,
   "acList":[  
      {  
         "Id":11281748,
         "Rcvr":1,
         "HasSig":false,
         "Icao":"AC2554",
         "Bad":false,
         "Reg":"N882AS",
         "FSeen":"\/Date(1466467166951)\/",
         "TSecs":3,
         "CMsgs":1,
         "AltT":0,
         "Tisb":false,
         "TrkH":false,
         "Type":"CRJ2",
         "Mdl":"2001 
BOMBARDIER INC 
 CL-600-2B19",
         "Man":"Bombardier",
         "CNum":"7503",
         "Op":"EXPRESSJET AIRLINES INC - ATLANTA, GA",
         "OpIcao":"ASQ",
         "Sqk":"",
         "VsiT":0,
         "WTC":2,
         "Species":1,
         "Engines":"2",
         "EngType":3,
         "EngMount":1,
         "Mil":false,
         "Cou":"United States",
         "HasPic":false,
         "Interested":false,
         "FlightsCount":0,
         "Gnd":false,
         "SpdTyp":0,
         "CallSus":false,
         "TT":"a",
         "Trt":1,
         "Year":"2001"
      },
      {  
         "Id":11402205,
         "Rcvr":1,
         "HasSig":true,
         "Sig":110,
         "Icao":"ADFBDD",
         "Bad":false,
         "FSeen":"\/Date(1466391940977)\/",
         "TSecs":75229,
         "CMsgs":35445,
         "Alt":8025,
         "GAlt":8025,
         "AltT":0,
         "Call":"TEST1234",
         "Tisb":false,
         "TrkH":false,
         "Sqk":"0262",
         "Help":false,
         "VsiT":0,
         "WTC":0,
         "Species":0,
         "EngType":0,
         "EngMount":0,
         "Mil":true,
         "Cou":"United States",
         "HasPic":false,
         "Interested":false,
         "FlightsCount":0,
         "Gnd":true,
         "SpdTyp":0,
         "CallSus":false,
         "TT":"a",
         "Trt":1
      }
   ],
   "totalAc":4231,
   "lastDv":"636019887431643594",
   "shtTrlSec":61,
   "stm":1466467170029
}
</pre>
Sergey Bushmanov
  • 23,310
  • 7
  • 53
  • 72
user8149657
  • 513
  • 1
  • 6
  • 11

5 Answers5

44

If you already have your data in acList column in a pandas DataFrame, simply do:

import pandas as pd
pd.io.json.json_normalize(df.acList[0])

Alt AltT    Bad CMsgs   CNum    Call    CallSus Cou EngMount    EngType ... Sqk TSecs   TT  Tisb    TrkH    Trt Type    VsiT    WTC Year
0   NaN 0   False   1   7503    NaN False   United States   1   3   ...     3   a   False   False   1   CRJ2    0   2   2001
1   8025.0  0   False   35445   NaN TEST1234    False   United States   0   0   ... 0262    75229   a   False   False   1   NaN 0   0   NaN

Since pandas 1.0 the imports should be:

import pandas as pd
pd.json_normalize(df.acList[0])
Sergey Bushmanov
  • 23,310
  • 7
  • 53
  • 72
  • This is on Python 2.7, right? I am not able to do it on Python 3. Would you please help me with that? – Hamid Feb 14 '19 at 14:50
  • @Hamid This is Python 3. If you're having problems with your code post it to SO, tag with `Pandas`, `json` or whatsoever and we will try to help you. Cheers! – Sergey Bushmanov Feb 14 '19 at 15:03
  • What if one or more than one rows has `None` , meaning no data available for some records? – Regressor Mar 25 '19 at 03:39
  • @Regressor you can solve by replacing `None` by `{}`, can check [my answer](https://stackoverflow.com/a/62862747/7385033) – Felipe Augusto Jul 12 '20 at 15:09
22

@Sergey's answer solved the issue for me but I was running into issues because the json in my data frame column was kept as a string and not as an object. I had to add the additional step of mapping the column:

import json
import pandas as pd
pd.io.json.json_normalize(df.acList.apply(json.loads))
ThinkBonobo
  • 15,487
  • 9
  • 65
  • 80
  • 3
    Thank you worked for my case, where I had JSON data in the 1st column, and I wanted to transpose it to other columns – Mantej Singh Apr 23 '19 at 15:03
  • This works, but what if I have this column that I am converting into multiple, in addition to other columns that I want to keep as-is (they are regular columns). How can I achieve this? – Aly Sep 29 '21 at 04:56
  • not sure and I don't really have a pandas env setup anymore to check but maybe you can search for a 'flatMap' function? – ThinkBonobo Sep 29 '21 at 17:52
7

Since pandas 1.0, json_normalize is available in the top-level namespace. Therefore use:

import pandas as pd
pd.json_normalize(df.acList[0])
Jonne Kleijer
  • 592
  • 5
  • 14
2

I can't comment yet on ThinkBonobo's answer but in case the JSON in the column isn't exactly a dictionary you can keep doing .apply until it is. So in my case

import json
import pandas as pd

json_normalize(
    df
    .theColumnWithJson
    .apply(json.loads)
    .apply(lambda x: x[0]) # the inner JSON is list with the dictionary as the only item
)
blockw
  • 29
  • 1
  • Useful contribution. On my data in Panda 0.24.2 the last line is not required. This may be specific to your data. I.e. mine looks like {'key':value} Perhaps yours looks like [{'key':value}] – jabberwocky Aug 19 '20 at 12:09
0

In my case I had some missing values (None) then I created a more specific code that also drops the original column after creating the new ones:

for prefix in ['column1', 'column2']:
    df_temp = df[prefix].apply(lambda x: {} if pd.isna(x) else x)
    df_temp = pd.io.json.json_normalize(df_temp)
    df_temp = df_temp.add_prefix(prefix + '_')
    df.drop([prefix], axis=1, inplace=True)
    df = pd.concat([df, df_temp], axis = 1, sort=False)
Felipe Augusto
  • 7,733
  • 10
  • 39
  • 73