
I have already read other questions, but I still don't know how to parse Facebook Graph Search results in R. My main goal is to convert the results into something like a data frame, so I can analyze some of the columns.

library(RCurl)
library(RJSONIO)
library(rjson)

data <- getURL("https://graph.facebook.com/search?q=flamengo&type=post&limit=1000", cainfo="cacert.perm")
#if you don't have the "cacert.perm" file, do as follows:
#download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.perm")

UPDATE: Thanks @user1609452

Now what if I want to include "count", nested in "likes"? Let me show:

names(fbData$data[[1]])
[1] "id"           "from"         "message"      "actions"      "privacy"     
[6] "type"         "created_time" "updated_time" "shares"       "likes"   
names(fbData$data[[1]]$likes)
[1] "data"  "count"

In this case, how should I set the match.fun argument?

likes <- lapply(fbData$data[[1]]$likes,name='count')
Error in match.fun(FUN) : no "FUN" argument, no pattern

likes <- lapply(fbData$data[[1]]$likes,'[[',name='count')
Error in FUN(X[[2L]], ...) : index out of bounds

Can someone help me, please?



2 Answers


Use either RJSONIO or rjson; there is no need to load both. Once you have fetched the JSON data, you need to convert it to a list.

library(RCurl)
library(RJSONIO)

data <- getURL("https://graph.facebook.com/search?q=flamengo&type=post&limit=1000")

fbData <- fromJSON(data)

The posts are contained in fbData$data.

#> length(fbData$data)
#[1] 500

The first post has various attributes:

#> names(fbData$data[[1]])
#[1] "id"           "from"         "message"      "privacy"      "type"        
#[6] "application"  "created_time" "updated_time"

To convert this data to a data frame, you will need to decide what you want to include and how to structure it. For example, to get all the message bodies you could use:

lapply(fbData$data,'[[',name='message')

UPDATE:

To get the number of likes for a post you can use:

lapply(fbData$data,function(x){x$likes$count})
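Putting the two extractions together, here is a minimal sketch of building a data frame from the parsed list. The data below is made up to stand in for fbData (the field names come from the names() output above, but the values are invented), and real posts may omit "likes" entirely, so the sketch guards against NULL:

```r
# Mock data shaped like fbData$data; values are made up for illustration.
fbData <- list(data = list(
  list(id = "1", message = "Vai Flamengo!", likes = list(count = 10)),
  list(id = "2", message = "Mengo!",        likes = list(count = 3))
))

# One vector per field; posts without a "likes" entry get 0
messages <- sapply(fbData$data, `[[`, "message")
likes    <- sapply(fbData$data,
                   function(x) if (is.null(x$likes)) 0 else x$likes$count)

posts <- data.frame(message = messages, likes = likes,
                    stringsAsFactors = FALSE)
```

sapply() simplifies the result to plain vectors, which is what data.frame() expects; lapply() would give you lists instead.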
user1609452
  • Thanks @user1609452. After fbData <- fromJSON(data) I got 'Error in fromJSON(content, handler, default.size, depth, allowComments, : invalid JSON input' – Luiz Felipe Freitas Apr 02 '13 at 12:33
  • with rjson package, when I hit fbData <- fromJSON(data) returns "Error in fromJSON(data) : unexpected escaped character '\o' at pos 130". Any idea of how can I replace "\" in the "data" character object, right after getURL? – Luiz Felipe Freitas Apr 02 '13 at 12:49

This is actually an answer to the question you asked in a comment. I apologize for not answering in a comment, but I don't see the option to do so.

If you want to replace a backslash ("\") you can use

install.packages("stringr", dep=TRUE)
library("stringr")
library("RCurl")
library("RJSONIO")
data <- getURL("https://graph.facebook.com/search?q=flamengo&type=post&limit=1000")
# fixed() treats the pattern literally; note that "\\" in R source
# code is a single backslash character
clean <- str_replace_all(data, fixed("\\"), "whatever")
fbData <- fromJSON(clean)

where "whatever" is what you're replacing it with. By the way, if you can use rjson instead of RJSONIO then that might be slightly preferable, but they are basically the same anyway. rjson just runs slightly faster and more reliably whereas RJSONIO has more functionality.
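If you'd rather not pull in stringr, a base-R sketch with gsub() does the same thing; fixed = TRUE treats the pattern literally, so no regex escaping is needed. The input string here is a made-up stand-in for the getURL() result:

```r
# "\\" in R source code is ONE literal backslash character
raw   <- "unexpected \\o escape"   # stands in for the string from getURL()
clean <- gsub("\\", "", raw, fixed = TRUE)
```

Replacing with "" simply deletes the backslash, which is usually what you want before handing the string to fromJSON().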

Oh, and btw you can validate your JSON data at jsonlint.com

This sounds like an interesting app you've got going here, what is it? Some sort of FB stalker?

user2225772
  • Thanks @user2225772 for your help and your comments. You're right, my idea is to build a sort of stalker. Next step is to set a timeout or a batch to stream this data from FB, like streamR package does with Twitter - I set like 12 hours and it keeps capturing tweets with terms I'm tracking during this period. But I'm still a newbie in programming languages =/ – Luiz Felipe Freitas Apr 03 '13 at 04:58