
I'm trying to query an API and store the data it returns in a data frame.

The following code should be fully reproducible.

library(httr)
library(jsonlite)
library(tidyverse)

vouches2 <- data.frame()

reproducible_list <- c("0x00d18ca9782be1caef611017c2fbc1a39779a57c", "0x105645ffea02c7c8feaa1a32c100f1a30766d6a9")

for(i in reproducible_list){
  theURL <- paste0("HTTPS://api.poh.dev/profiles/", i, "/vouches")
  r <- GET(theURL)
  message("Getting ", theURL)
  s <- content(r, as = "text", encoding = "UTF-8")
  message("DEBUG content(...) success")
  df <- as.data.frame(fromJSON(s,flatten = TRUE, simplifyDataFrame=FALSE))
  message(names(df))
  message("as.data.frame success")
  # 
  # df_filtered <- df %>%
  #   select(given.eth_address,given.status,given.display_name) %>%
  #   mutate(voucher = i) %>%
  #   mutate(voucher_name = data_filtered$display_name[data_filtered$id == i]) %>%
  #   filter(!is.na(voucher_name)) # removes those that are not in the list of frequent challengers
  message("DEBUG bind_rows")
  vouches2 <- bind_rows(vouches2, df) 
  message("DEBUG bind_rows DONE")
  Sys.sleep(0.5)
  
}

The first item in the list (0x00d18ca9782be1caef611017c2fbc1a39779a57c) is processed without problems. The second item (0x105645ffea02c7c8feaa1a32c100f1a30766d6a9) fails with this error:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 0, 1

I suspect this happens because the second response has an empty "given" array in the JSON, so "given" contributes zero rows while "received" contributes one.
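The failure can be reproduced offline with a small inline JSON string. This is a minimal sketch, assuming a made-up address ("0xabc") and a trimmed-down record rather than a real API response; it only shows that an empty array next to a one-element array triggers the row-count mismatch.

```r
library(jsonlite)

# Made-up stand-in for the second response: "given" is empty, "received" has one record.
json_empty <- '{"given": [], "received": [{"eth_address": "0xabc", "status": "REGISTERED"}]}'

# Same parsing mode as in the loop: arrays of objects stay as plain lists.
parsed <- fromJSON(json_empty, simplifyDataFrame = FALSE)

# "given" has length 0, "received" has length 1.
lengths(parsed)

# Capture the error message instead of stopping the script.
err <- tryCatch(
  as.data.frame(parsed),
  error = function(e) conditionMessage(e)
)
print(err)
```

Because as.data.frame() tries to place both parts side by side as columns, the part with zero rows cannot be recycled to match the part with one row.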

First value gives

{
  "given": [
    {
      "eth_address": "0x9021346151cab1467982766e417377eaf8323aae",
      "status": "REGISTERED",
      "vanity_id": 4175,
      "display_name": "Katy Daza",
      "first_name": "Katy",
      "last_name": "Daza",
      "registered": true,
      "photo": "https://ipfs.kleros.io/ipfs/QmbRDPVhXdi1PeQ9wAbWEQDAvUxr9quDRiHhsKTd6nmkG2/whatsapp-image-2021-05-04-at-4.44.10-pm.jpeg",
      "video": "https://ipfs.kleros.io/ipfs/QmNbKYXYhahPrHP6rcqbjf9fgjCqTVSpcb3RffiH6Hs7Jj/katy2.mp4",
      "bio": "Environmental Lawyer",
      "profile": "https://app.proofofhumanity.id/profile/0x9021346151cab1467982766e417377eaf8323aae",
      "registered_time": "2021-05-15T14:58:30.000Z",
      "creation_time": "2021-05-04T22:02:03.000Z"
    },
    {
      "eth_address": "0x6beca7fb81c1f7b3f91b212e6830d15fe7bf1012",
      "status": "REGISTERED",
      "vanity_id": 2647,
      "display_name": "CamiloTD",
      "first_name": "Juan Camilo",
      "last_name": "Torres Cepeda",
      "registered": true,
      "photo": "https://ipfs.kleros.io/ipfs/QmQTPz6Z5jjCvPUY1KifEdy6PaXP2zDN4adm6GdH2bXk8C/1598481112063-1-.jfif",
      "video": "https://ipfs.kleros.io/ipfs/QmcejEjb1JSfpR3znNjd55SgLvLZByy8icZi19nsXqK1rM/whatsapp-video-2021-04-26-at-11.35.46-1-.mp4",
      "bio": "Blockchain developer & passionate researcher",
      "profile": "https://app.proofofhumanity.id/profile/0x6beca7fb81c1f7b3f91b212e6830d15fe7bf1012",
      "registered_time": "2021-04-30T08:05:10.000Z",
      "creation_time": "2021-04-21T18:16:51.000Z"
    },
    {
      "eth_address": "0xcc24fde84f1a18cb857f112eeea4a35192063663",
      "status": "REGISTERED",
      "vanity_id": 1548,
      "display_name": "Lauren",
      "first_name": "Lauren",
      "last_name": "Bajin",
      "registered": true,
      "photo": "https://ipfs.kleros.io/ipfs/QmbjLEdaHK1AixCpzA1JMwCH83hMGVTMzRRcsKQLxng1a1/20210112-190216.jpg",
      "video": "https://ipfs.kleros.io/ipfs/Qma8iHKhAsdgQhhbqtLWN9xnBADha6diskR8gnmF2Hfdto/video-2021-04-09-15-09-44.mp4",
      "bio": "Blockchain dAbbler and movement enthusiast",
      "profile": "https://app.proofofhumanity.id/profile/0xcc24fde84f1a18cb857f112eeea4a35192063663",
      "registered_time": "2021-04-22T18:53:20.000Z",
      "creation_time": "2021-04-09T21:37:13.000Z"
    },
    {
      "eth_address": "0x317bbc1927be411cd05615d2ffdf8d320c6c4052",
      "status": "REGISTERED",
      "vanity_id": 2023,
      "display_name": "Carlos Quintero",
      "first_name": "Carlos",
      "last_name": "Quintero",
      "registered": true,
      "photo": "https://ipfs.kleros.io/ipfs/QmeGoecmiJni67AEuNQFzSEHKP1cJngQdHqg3faC6TGWoP/proofofhumanityphoto.jpg",
      "video": "https://ipfs.kleros.io/ipfs/QmcyhkfTLtosQyjX79mH1b1duZojMjywajN6WEp41AnbNC/proofofhumanityvideo.mp4",
      "bio": "I am Software Engineer with great interest in the blockchain",
      "profile": "https://app.proofofhumanity.id/profile/0x317bbc1927be411cd05615d2ffdf8d320c6c4052",
      "registered_time": "2021-04-26T14:13:15.000Z",
      "creation_time": "2021-04-12T19:59:26.000Z"
    },
    {
      "eth_address": "0x7d547666209755fb833f9b37eebea38ebf513abb",
      "status": "REGISTERED",
      "vanity_id": 749,
      "display_name": "Juankbell",
      "first_name": "Juan Carlos",
      "last_name": "Bell Llinas",
      "registered": true,
      "photo": "https://ipfs.kleros.io/ipfs/QmXWsMjBsAPRcm8zFLXHWg9WEcpGTW9KVnRGrHdTytNGSi/img-1207-2.jpg",
      "video": "https://ipfs.kleros.io/ipfs/QmPx1AaChXYB4ef44BeXKb76tXwLrtBfqKN6V2ynxyidDW/poh-juan-bell.m4v",
      "bio": "Political scientist, Mag. in Conflict Management. Ethereum Colombia.",
      "profile": "https://app.proofofhumanity.id/profile/0x7d547666209755fb833f9b37eebea38ebf513abb",
      "registered_time": "2021-04-14T17:50:46.000Z",
      "creation_time": "2021-04-05T21:49:07.000Z"
    }
  ],
  "received": [
    {
      "eth_address": "0xb20a327c9b4da091f454b1ce0e2e4dc5c128b5b4",
      "status": "REGISTERED",
      "vanity_id": 11,
      "display_name": "Merlin Egalite",
      "first_name": "Merlin",
      "last_name": "Egalite",
      "registered": true,
      "photo": "https://ipfs.kleros.io/ipfs/QmcsDzTCPyrDwBAbVVWLxqjmLhsHvuGu7xvc1oiM36cQBs/merlin.JPG",
      "video": "https://ipfs.kleros.io/ipfs/QmbjNPuD85SMfMW3ocUtwbgd1Zk5KExcXPDjj81VDaFwKv/merlin-egalite.mp4",
      "bio": "Smart Contract Hacker",
      "profile": "https://app.proofofhumanity.id/profile/0xb20a327c9b4da091f454b1ce0e2e4dc5c128b5b4",
      "registered_time": "2021-03-11T18:53:58.000Z",
      "creation_time": "2021-03-11T18:53:58.000Z"
    }
  ]
}

Second value gives:

{
  "given": [],
  "received": [
    {
      "eth_address": "0xc81d370e13a248e55208b52e4a9db9fbd5e01b6b",
      "status": "REGISTERED",
      "vanity_id": 4743,
      "display_name": "Ale",
      "first_name": "Mirian",
      "last_name": "Alejandra",
      "registered": true,
      "photo": "https://ipfs.kleros.io/ipfs/QmeD8TCcFZ8idYhiesX8EuHdsW1CaXEMTLFQgoJjzCz3mT/20210507-173114.jpg-2.jpg",
      "video": "https://ipfs.kleros.io/ipfs/QmThfU8LShbx5PAseE46mD7f3AuyX8Wcn6ztdAdTEjMVGJ/20210507-172812.mp4",
      "bio": "Love my kids",
      "profile": "https://app.proofofhumanity.id/profile/0xc81d370e13a248e55208b52e4a9db9fbd5e01b6b",
      "registered_time": "2021-05-21T01:09:43.000Z",
      "creation_time": "2021-05-17T12:18:07.000Z"
    }
  ]
}

I'd like some general guidance on how to face this issue. I believe it has something to do with the handling of empty rows, but I'm not entirely sure.

Thanks!

luisgonzalez
  • Is there a specific reason to avoid *jsonlite*'s simplification to data.frame, to then put it back into a wide data.frame? You could stack it together like `do.call(rbind, fromJSON(first, flatten = TRUE, simplifyDataFrame=TRUE))` as a long dataset and just be done with it. That would work for the cases where you have missing parts or not. – thelatemail Jun 02 '21 at 02:54
  • This is above my understanding, but thanks for the help anyway. There is no particular reason. From what I gathered on other forums, a long dataset would not work in my case: I need the ids of given and received in the same row. – luisgonzalez Jun 03 '21 at 02:45

1 Answer

You can try a tidyverse approach. It can handle missing columns/rows most of the time.

library(dplyr)
library(purrr)
library(httr)
library(jsonlite)  # provides fromJSON()

theURL <- paste0("HTTPS://api.poh.dev/profiles/", reproducible_list, "/vouches")

map_df(theURL, ~{
  r <- GET(.x)
  s <- content(r, as = "text", encoding = "UTF-8")
  bind_rows(fromJSON(s), .id = 'id')
}, .id = 'index')
Ronak Shah
  • Haha we have almost similar code. No need to post – Onyambu Jun 02 '21 at 02:48
  • Ah... I see this is giving a long dataset now as I suggested above. +1 – thelatemail Jun 02 '21 at 02:57
  • This is neat, but it did not produce the desired outcome. It creates an id for given and received, whereas I want them in a "wide form" where given.id and received.id are in the same row. – luisgonzalez Jun 02 '21 at 18:24
  • The important thing is to first get all the information you need into a data frame. You can reshape it into the desired format later. For example, here you can use `pivot_wider` to get the data in wide format if you wish. You have not shown your exact expected output, so it is difficult to know what you want the final output to look like. – Ronak Shah Jun 02 '21 at 23:19
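The long-format idea from the comments can also be tried without network access. The following is a minimal sketch, not the answerer's code: `stack_vouches` is a hypothetical helper, and the JSON strings use made-up addresses and only two fields. It assumes that an empty array parses as an empty list while a non-empty array of objects parses as a data frame, and it simply skips the empty parts.

```r
library(jsonlite)

# Hypothetical helper: stack the non-empty parts of one response into long form.
stack_vouches <- function(json_text) {
  parsed <- fromJSON(json_text)            # arrays of objects become data frames
  parts  <- Filter(is.data.frame, parsed)  # an empty array parses as list(), so drop it
  if (length(parts) == 0) return(NULL)
  # Label each part with the key it came from ("given" or "received").
  labelled <- Map(function(df, side) cbind(side = side, df), parts, names(parts))
  do.call(rbind, unname(labelled))
}

# Made-up inputs: one response with an empty "given", one with both parts filled.
json_empty <- '{"given": [], "received": [{"eth_address": "0xabc", "status": "REGISTERED"}]}'
json_full  <- '{"given": [{"eth_address": "0xdef", "status": "REGISTERED"}], "received": [{"eth_address": "0xabc", "status": "REGISTERED"}]}'

long_empty <- stack_vouches(json_empty)
long_full  <- stack_vouches(json_full)
print(long_full)
```

Once every response is in this long shape, reshaping to a wide layout is a separate step whose details depend on the exact output wanted.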