I have managed to retrieve some data from Statistics Sweden using the website's API. The answers to this question solved most of my problems.
But I still have two problems.
If my JSON query contains characters such as “Å”, “Ä” or “Ö”, I get a “404” response from the server.
I’m trying to download data from this table:
Population 16+ years (RAMS) by region, employment, age and sex. Year 2004 - 2015
(You can get the API query on the website by clicking “Continue” and then “api for this table”, but you have to change the response format from "px" to "json".)
This code works:
library(jsonlite)
library(httr)
bodytxt <- '{
"query": [
{
"code": "Region",
"selection": {
"filter": "vs:RegionKommun07",
"values": [
"0114",
"1280"
]
}
},
{
"code": "Alder",
"selection": {
"filter": "item",
"values": [
"16-19"
]
}
},
{
"code": "Tid",
"selection": {
"filter": "item",
"values": [
"2015"
]
}
}
],
"response": {
"format": "json"
}
}'
req <- POST("http://api.scb.se/OV0104/v1/doris/en/ssd/START/AM/AM0207/AM0207H/BefSyssAldKonK",
            body = bodytxt, encode = "json")
stop_for_status(req)
json <- content(req, "text")
# JSON starts with an invalid character:
validate(json)
json <- substring(json, 2)
validate(json)
# Now we can parse
object <- fromJSON(json)
print(object)
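A note on the `substring(json, 2)` step: my guess is that the invalid first character is a byte order mark (BOM), but that is an assumption I have not confirmed. If so, a slightly safer sketch would remove it only when it is actually present, instead of always dropping the first character. `strip_bom` is a helper name I made up for this illustration:

```r
# Hypothetical helper: remove a leading byte order mark (U+FEFF) if there is one.
# Assumes the stray first character in the response is a BOM.
strip_bom <- function(txt) {
  sub("^\uFEFF", "", txt)
}

with_bom    <- "\uFEFF{\"a\": 1}"
without_bom <- "{\"a\": 1}"

# Both calls should leave the same plain JSON text.
print(strip_bom(with_bom))
print(strip_bom(without_bom))
```

With this, `json <- strip_bom(content(req, "text"))` would replace the unconditional `substring()` call.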
But if I change the query so that it includes an “Ö”, the same request fails. Example:
bodytxt <- '{
"query": [
{
"code": "Region",
"selection": {
"filter": "vs:RegionKommun07",
"values": [
"0114",
"1280"
]
}
},
{
"code": "Sysselsattning",
"selection": {
"filter": "item",
"values": [
"FÖRV"
]
}
},
{
"code": "Alder",
"selection": {
"filter": "item",
"values": [
"16-19"
]
}
},
{
"code": "Tid",
"selection": {
"filter": "item",
"values": [
"2015"
]
}
}
],
"response": {
"format": "json"
}
}'
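To narrow this down, one thing that can be checked locally is which bytes R actually holds for the “Ö”. The sketch below is only a diagnostic, under my assumption that the server expects UTF-8. `value` and `latin1_value` are names made up for the illustration, and the string is written with a `\u` escape so the check does not depend on how the script file was read:

```r
# "FÖRV" written with a Unicode escape, so the file encoding does not matter.
value <- "F\u00D6RV"

# How R has marked the string.
print(Encoding(value))

# UTF-8 stores "Ö" as two bytes: 46 c3 96 52 56
utf8_bytes <- charToRaw(enc2utf8(value))
print(utf8_bytes)

# Latin-1 stores "Ö" as a single byte: 46 d6 52 56
latin1_value <- iconv(value, from = "UTF-8", to = "latin1")
latin1_bytes <- charToRaw(latin1_value)
print(latin1_bytes)
```

If `charToRaw(bodytxt)` shows the single byte `d6` where the “Ö” is, the body is probably being sent as Latin-1, and wrapping it in `enc2utf8()` before the `POST()` call might be worth trying.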
My other problem: as far as I understand, it should be possible to convert the JSON query to a list and pass that list in the call to the server, but I get a “404” error. Example:
body_list <- fromJSON(bodytxt)
req <- POST("http://api.scb.se/OV0104/v1/doris/en/ssd/START/AM/AM0207/AM0207H/BefSyssAldKonK",
            body = body_list, encode = "json")
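To see what the list version would send, the round trip can be compared locally without contacting the server. This is a sketch under two assumptions of mine: that httr serialises a list body roughly like `jsonlite::toJSON(x, auto_unbox = TRUE)`, and that the server needs `values` to stay a JSON array. `small_body` is a cut-down, hypothetical version of the query above:

```r
library(jsonlite)

# Cut-down query with a single dimension, for illustration only.
small_body <- '{"query": [{"code": "Tid", "selection": {"filter": "item", "values": ["2015"]}}], "response": {"format": "json"}}'

# Default parsing simplifies arrays into vectors and data frames.
simplified <- fromJSON(small_body)

# simplifyVector = FALSE keeps every JSON array as an R list.
unsimplified <- fromJSON(small_body, simplifyVector = FALSE)

# Serialise both again and compare them with the original text.
json_simplified   <- toJSON(simplified, auto_unbox = TRUE)
json_unsimplified <- toJSON(unsimplified, auto_unbox = TRUE)

print(json_simplified)
print(json_unsimplified)
```

If the first printed string no longer has the same structure as `small_body`, the simplification done by `fromJSON()` may be what breaks the request, and `fromJSON(bodytxt, simplifyVector = FALSE)` could be worth trying for `body_list`.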
What am I doing wrong?
PS: I know that there is an excellent package on CRAN named pxweb that makes it very easy to download data from Statistics Sweden. But I want to learn the API, and pxweb doesn’t let me skip dimensions in the query.
System: Windows 7, R script saved in UTF-8 encoding.