The SEC asks you to declare a user agent in the request headers: https://www.sec.gov/os/accessing-edgar-data
Apparently the sample user agent given on that page is also accepted, though you really should provide your own contact details there.
With httr2 (it still uses jsonlite for parsing JSON responses):
library(httr2)
resp <- request("https://data.sec.gov/submissions/CIK0000320193.json") |>
  req_user_agent("Sample Company Name AdminContact@<sample company domain>.com") |>
  # set verbosity level for debugging, 1: show headers
  req_perform(verbosity = 1)
#> -> GET /submissions/CIK0000320193.json HTTP/1.1
#> -> Host: data.sec.gov
#> -> User-Agent: Sample Company Name AdminContact@<sample company domain>.com
#> -> Accept: */*
#> -> Accept-Encoding: deflate, gzip
#> ->
#> <- HTTP/1.1 200 OK
#> <- Content-Type: application/json
#> <- x-amzn-RequestId: c634dcbe-68aa-4777-9f18-4edfae752eb4
#> <- Access-Control-Allow-Origin: *
#> <- x-amz-apigw-id: IvJu4HiHIAMFidw=
#> <- X-Amzn-Trace-Id: Root=1-64c2bcc5-5db9315369e664da512cb6b5
#> <- Vary: Accept-Encoding
#> <- Content-Encoding: gzip
#> <- Expires: Thu, 27 Jul 2023 18:51:49 GMT
#> <- Cache-Control: max-age=0, no-cache, no-store
#> <- Pragma: no-cache
#> <- Date: Thu, 27 Jul 2023 18:51:49 GMT
#> <- Content-Length: 28594
#> <- Connection: keep-alive
#> <- Strict-Transport-Security: max-age=31536000 ; preload
#> <- Set-Cookie: ak_bmsc=E9...
resp
#> <httr2_response>
#> GET https://data.sec.gov/submissions/CIK0000320193.json
#> Status: 200 OK
#> Content-Type: application/json
#> Body: In memory (157568 bytes)
# first few keys / values from JSON:
resp_body_json(resp, simplifyVector = TRUE, flatten = TRUE) |>
  head(n = 10) |>
  str()
#> List of 10
#> $ cik : chr "320193"
#> $ entityType : chr "operating"
#> $ sic : chr "3571"
#> $ sicDescription : chr "Electronic Computers"
#> $ insiderTransactionForOwnerExists : int 0
#> $ insiderTransactionForIssuerExists: int 1
#> $ name : chr "Apple Inc."
#> $ tickers : chr "AAPL"
#> $ exchanges : chr "Nasdaq"
#> $ ein : chr "942404110"
Created on 2023-07-27 with reprex v2.0.2
I'm in the EU and can open that JSON URL in the browser without any issues, but the default jsonlite and httr2 user agents are blocked. Using my browser's user agent with httr2 works only when I also set accept-language. When a request does not come from a browser, they seem to check the user agent for some odd pattern: for example, "foo_bar" is rejected (NOK), while "foo.bar" is accepted (OK).
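For completeness, the browser-agent variant could look roughly like the sketch below. It only builds the request and does not send it. `build_sec_request()` is a hypothetical helper name, the user agent string is a placeholder to be replaced with your own browser's value, and the `Accept-Language` value is just an assumed example, not something the SEC documents.

```r
library(httr2)

# Hypothetical helper (not part of httr2): builds, but does not send,
# a request with a caller-supplied user agent and an Accept-Language header.
build_sec_request <- function(url, user_agent, accept_language = "en-US,en;q=0.9") {
  request(url) |>
    req_user_agent(user_agent) |>
    req_headers(`Accept-Language` = accept_language)
}

req <- build_sec_request(
  "https://data.sec.gov/submissions/CIK0000320193.json",
  # placeholder: substitute the user agent string of your own browser
  user_agent = "Mozilla/5.0 (placeholder browser user agent)"
)

# print the request object to check what was configured
req

# to actually send it, uncomment:
# resp <- req_perform(req, verbosity = 1)
```

Declaring a user agent with your own contact details, as in the reprex above, is still the approach the SEC asks for; this variant only illustrates the behaviour I observed.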