2

A low cost air quality sensor (AQE) sends its data to the opensensors.io server. Every x seconds it sends a string of information (timestamp, pollutant concentration, etc). The data can be retrieved structured as a json file. Opensensors terminology uses devices, topics, organizations, and payloads. I have figured out how to set up a curl handle and use the curl package to download a csv file. Here's the code

curl_download(url = myURL2, destfile = "curlDownloadTest.csv", mode = "w", handle = myCurlHandle)

An example of the downloaded data is at https://github.com/GeraldCNelson/AQEAnalysis/commit/c6ee29545d07835c5a920bf2b37625adb78462aa

I use fromJSON in the jsonlite package to transform this

temp <- fromJSON("curlDownloadTest.csv", simplifyDataFrame = FALSE)

The output (temp) is a large list with 2 elements - messages and next. messages contains all the data; next is a link to use to get the next set of data (it's not all downloaded at once).

The messages list consists of multiples sets of lists (one for each set of uploaded data); each set has five elements - device, owner, topic, date, and payload. Payload is a list of 3 - encoding (always chr utf-8), content-type (always chr "application/json"), and text. The text list looks like its in json format (here's a string fragment - "{\"serial-number\":\"egg00802aaa019b0111\",\"converted-value\":69.52,\"converted-units\":\"degF\")

I want to restructure this data into a data frame (or data table) that has the date information as a column and the test information from payload as the remaining columns (serial-number, converted value, etc...

I can't figure out how to convert the text list in the payload list from its current (json?) structure to something I can rbind to a data frame.

JerryN
  • 2,356
  • 1
  • 15
  • 49
  • Yodit from OpenSensors here, we are currently writing an R-Stats lib as we need it for our own needs. Will push and give you a heads up over the next couple of weeks so should make your life easier. – yods Oct 22 '16 at 19:05
  • Hi @JerryN - I have started to work on the R package here - https://github.com/lgatto/rosio. The vignette gives an overview. – Laurent Nov 19 '16 at 20:17

1 Answers1

2

Thankfully, everything is pretty uniform:

library(jsonlite)
library(dplyr)

df <- fromJSON("curlDownloadTest.csv")

bind_cols(
  select(df$messages, device, owner, topic, date),
  stream_in(textConnection(df$messages$payload$text), flatten=TRUE)
) -> df

glimpse(df)
## Observations: 742
## Variables: 14
## $ device             <chr> "egg00802aaa019b0111", "egg00802aaa019b0111", "egg00802aaa019b0111", "...
## $ owner              <chr> "wickeddevice", "wickeddevice", "wickeddevice", "wickeddevice", "wicke...
## $ topic              <chr> "/orgs/wd/aqe/temperature/egg00802aaa019b0111", "/orgs/wd/aqe/humidity...
## $ date               <chr> "2016-10-10T17:02:09.507Z", "2016-10-10T17:02:09.811Z", "2016-10-10T17...
## $ serial-number      <chr> "egg00802aaa019b0111", "egg00802aaa019b0111", "egg00802aaa019b0111", "...
## $ converted-value    <dbl> 63.20, 43.31, 0.52, -25.20, 63.70, 42.85, 0.53, -13.32, 64.01, 42.58, ...
## $ converted-units    <chr> "degF", "percent", "ppb", "ppb", "degF", "percent", "ppb", "ppb", "deg...
## $ raw-value          <dbl> 63.200000, 43.310000, 0.221252, -0.827832, 63.700000, 42.850000, 0.221...
## $ raw-instant-value  <dbl> 63.48000, 43.07000, 0.22149, -0.82785, 63.91000, 42.66000, 0.22073, -0...
## $ raw-units          <chr> "degF", "percent", "volt", "volt", "degF", "percent", "volt", "volt", ...
## $ sensor-part-number <chr> "SHT25", "SHT25", "NO2-B4-ISB", "3SP-O3-20-PCB", "SHT25", "SHT25", "NO...
## $ raw-value2         <dbl> NA, NA, 0.222732, NA, NA, NA, 0.222797, NA, NA, NA, 0.222460, NA, NA, ...
## $ raw-instant-value2 <dbl> NA, NA, 0.22330, NA, NA, NA, 0.22273, NA, NA, NA, 0.22341, NA, NA, NA,...
## $ compensated-value  <dbl> NA, NA, 0.62, -25.25, NA, NA, 0.63, -13.37, NA, NA, 0.02, -18.08, NA, ...
hrbrmstr
  • 77,368
  • 11
  • 139
  • 205