I have several JSON files that I want to read and merge in R. Each file contains data for 51 observations. However, when I read a JSON file into R, the information I need is nested in the column "mentions". I need the timestamp contained in "mentions" to create new variables that count the number of mentions in each month t of 2017.

The outcome should be a data frame that contains the ID, Amentions in t1, t2 ... t12, and Bmentions in t1, t2 ... t12; that is, a data frame with 51 rows and 25 columns per JSON file.

I used the jsonlite package and wrote the following code:

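library(jsonlite)
library(dplyr)  # provides %>% and as_tibble()

# Read newline-delimited JSON, flatten nested data frames, convert to a tibble
jsondata1 <- stream_in(file("1595450.txt")) %>%
  jsonlite::flatten() %>%
  as_tibble()  # as_data_frame() is deprecated
head(jsondata1)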
ID                  mentions
12345               list(Amentions = list(license = "xxx", author =    
                    list(name = "Max M", url = 
                    "http://mentionexample.com/MaxM/", m_id = "123456", 
                    posted_on = "2017-03-20T21:35:57+00:00"))
12346               list()
12347              list(Bmentions = list(license = "xxx", title =   
                    "A new star is born", url = "http://...", author =    
                    list(url = "http://www...", c_ids = list(123455), 
                    posted_on = "2017-05-17T23:57:41+00:00"), Amentions 
                    =  list(license = "xxx", author = list(name = "Max 
                    M", url = "http://mentionexample.com/MaxM/", m_id = 
                    "123456", posted_on = "2017-03-20T21:35:57+00:00")
123489             list()

At the moment the JSON files are not read the way I need: the data in the column "mentions" is still nested. The first column, ID, is correct, but the second is not.

pa_01

1 Answer

Try adding %>% unlist() at the end of your first pipe, before converting the data to a data frame. It might help.
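
A rough sketch of how unlist() could be applied per record rather than to the whole object, in case that is closer to the monthly counts asked for. It assumes each record is available as a plain nested list (for example from something like lapply(readLines("1595450.txt"), jsonlite::fromJSON, simplifyVector = FALSE), which I have not run on the real file), and that every mention carries a "posted_on" string somewhere inside "Amentions" or "Bmentions". The sample records, the helper count_by_month(), and the column names Amentions_t1 ... Bmentions_t12 are made up for illustration.

```r
# Hypothetical sample mimicking the structure shown in the question
records <- list(
  list(ID = "12345", mentions = list(
    Amentions = list(list(license = "xxx",
                          author = list(name = "Max M"),
                          posted_on = "2017-03-20T21:35:57+00:00")))),
  list(ID = "12346", mentions = list()),
  list(ID = "12347", mentions = list(
    Bmentions = list(list(license = "xxx",
                          posted_on = "2017-05-17T23:57:41+00:00")),
    Amentions = list(list(license = "xxx",
                          posted_on = "2017-03-20T21:35:57+00:00"))))
)

# Count the mentions of one type per calendar month of a given year.
# The month is taken from the timestamp text as written (no time zone conversion).
count_by_month <- function(mentions, type, year = "2017") {
  stamps <- unlist(mentions[[type]])                    # NULL if the type is absent
  stamps <- stamps[grepl("posted_on$", names(stamps))]  # keep only the timestamps
  stamps <- stamps[substr(stamps, 1, 4) == year]        # restrict to the chosen year
  tabulate(as.integer(substr(stamps, 6, 7)), nbins = 12)
}

# One row per record: ID plus 12 Amentions and 12 Bmentions counts
rows <- lapply(records, function(rec) {
  a <- count_by_month(rec$mentions, "Amentions")
  b <- count_by_month(rec$mentions, "Bmentions")
  names(a) <- paste0("Amentions_t", 1:12)
  names(b) <- paste0("Bmentions_t", 1:12)
  data.frame(ID = rec$ID, as.list(a), as.list(b), stringsAsFactors = FALSE)
})
result <- do.call(rbind, rows)
```

With 51 records per file this should give the 51 x 25 shape described in the question, and the per-file results could then be stacked with another do.call(rbind, ...). If the real "mentions" column has a different shape after stream_in(), the name pattern in grepl() is the part most likely to need adjusting.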