
Hey, guys! I have a simple R script that parses a JSON file:

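# readLines() returns one string per line, so join them before parsing.
json <- rjson::fromJSON(
  paste(
    readLines('http://data.rada.gov.ua/ogd/zpr/skl8/bills-skl8.json', warn = FALSE),
    collapse = ""
  )
)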
bills <-  data.frame(
  id = numeric(), 
  title = character(),
  type = character(), 
  subject = character(), 
  rubric = character(),
  executive = character(),
  sesion = character(),
  result = character() 
)
for (row in json) 
{
  bill <- data.frame(
    id = row$id, 
    title = row$title, 
    type = row$type,
    subject = row$subject, 
    rubric = row$rubric,
    executive = row$mainExecutives$executive$department,
    sesion = row$registrationSession,
    result = row$currentPhase$title
  )
  bills <- rbind(bills, bill)
}

But I get this error:

Error in data.frame(id = row$id, title = row$title, type = row$type, subject = row$subject, : arguments imply differing number of rows: 1, 0

It turns out my JSON file has a NULL value on line 277. Can I skip this error, or replace NULL values inside my loop? Thanks!

user2554330

2 Answers

To answer your immediate question, I'd wrap that with a small function that returns a string if the executive department is missing.

protect_against_null <- function( x ) {
  if( is.null(x) )
    return( "" ) # Replace with whatever string you'd like.
  else 
    return( x )
}

for (row in json) {
  bill <- data.frame(
    id = row$id, 
    title = row$title, 
    type = row$type,
    subject = row$subject, 
    rubric = row$rubric,
    executive = protect_against_null(row$mainExecutives$executive$department),
    sesion = row$registrationSession,
    result = row$currentPhase$title
  )
  bills <- rbind(bills, bill)
}
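
To see the effect in isolation, here is a minimal sketch on a single made-up record. The record (`row` with `id = 277` and a placeholder title) is hypothetical and only stands in for an entry with no `mainExecutives` field; the helper is repeated so the snippet runs on its own.

```r
# Same helper as above, repeated so this snippet is self-contained.
protect_against_null <- function(x) {
  if (is.null(x)) "" else x
}

# Hypothetical record with no mainExecutives entry (illustration only).
row <- list(id = 277, title = "Example bill")

# Without the guard, the lookup yields NULL, which data.frame() treats as a
# zero-row column, so construction fails. Capture the message instead of stopping.
unguarded <- tryCatch(
  data.frame(
    id = row$id,
    executive = row$mainExecutives$executive$department
  ),
  error = function(e) conditionMessage(e)
)

# With the guard, the missing value becomes "" and a one-row frame is built.
guarded <- data.frame(
  id = row$id,
  executive = protect_against_null(row$mainExecutives$executive$department)
)

print(unguarded)
print(guarded)
```

If you would rather keep missing departments distinguishable from empty strings, returning NA_character_ from the helper instead of "" should work the same way.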

Long-term advice: since this dataset has 11,000 nested records, I'd shy away from loops. Check out the purrr package for mapping the nested JSON/list into a rectangular data frame, especially purrr::map_dfr().
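
A rough sketch of what that could look like, using a tiny hand-made list in place of the real download. The sample records and the name `json_sample` are assumptions for illustration; `purrr::pluck()` with `.default` is used here as one possible way to handle the missing department, and `map_dfr()` needs dplyr installed for the row binding.

```r
library(purrr)

# Hypothetical stand-in for the parsed JSON: a list of nested records.
# The second record has no mainExecutives, like the problematic one.
json_sample <- list(
  list(
    id = 1, title = "Bill A", type = "Law",
    mainExecutives = list(executive = list(department = "Committee X"))
  ),
  list(id = 2, title = "Bill B", type = "Resolution")
)

# Build one single-row data frame per record and bind them by row.
bills <- map_dfr(json_sample, function(row) {
  data.frame(
    id = row$id,
    title = row$title,
    type = row$type,
    # pluck() walks the nested path and falls back to NA if any level is missing.
    executive = pluck(row, "mainExecutives", "executive", "department",
                      .default = NA_character_),
    stringsAsFactors = FALSE
  )
})

print(bills)
```

This avoids growing `bills` with rbind() on every iteration, and the NULL handling lives in one place per field.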

wibeasley

For this purpose fromJSON (jsonlite package) could be handy.

library(jsonlite)

url <- 'http://data.rada.gov.ua/ogd/zpr/skl8/bills-skl8.json'
df <- jsonlite::fromJSON(url)   

df1 <- data.frame(
  id = df$id, 
  title = df$title, 
  type = df$type,
  subject = df$subject, 
  rubric = df$rubric,
  executive = df$mainExecutives$executive$department,
  sesion = df$registrationSession,
  result = df$currentPhase$title
)
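
As a small illustration of why this can sidestep the NULL problem: when jsonlite simplifies an array of objects into a data frame, fields missing from a record should come through as NA rather than NULL. The inline JSON below is made up for the example and is not taken from the real feed, whose structure may differ.

```r
library(jsonlite)

# Hypothetical two-record array; the second record has no mainExecutives.
txt <- '[
  {"id": 1, "title": "Bill A",
   "mainExecutives": {"executive": {"department": "Committee X"}}},
  {"id": 2, "title": "Bill B"}
]'

df <- jsonlite::fromJSON(txt)

# Nested objects become nested data frames, so the same $ path works on columns.
executive <- df$mainExecutives$executive$department

print(executive)
```

Because every column then has one entry per record, data.frame() no longer sees a zero-length argument.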
Prem