I am trying to scrape information from an online database using the httr package in R. Every time I try to connect to it to fetch the HTML content, I get the following error:

url <- "https://metalpdb.cerm.unifi.it"

query <- httr::GET(url)
Error in curl::curl_fetch_memory(url, handle = handle) : 
  schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed).

What could be the problem here? It seems to be specific to this website. I am by no means an expert in this area and have never tried web scraping before, but I have never encountered an error like this when using httr::GET.

Help would be highly appreciated!

> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_Switzerland.1252  LC_CTYPE=English_Switzerland.1252   
[3] LC_MONETARY=English_Switzerland.1252 LC_NUMERIC=C                        
[5] LC_TIME=English_Switzerland.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] httr_1.4.2

loaded via a namespace (and not attached):
[1] compiler_4.1.3 R6_2.5.1       tools_4.1.3    curl_4.3.2 
jpquast

1 Answer

When I run your code it actually works:

library(httr)
url <- "https://metalpdb.cerm.unifi.it"

query <- httr::GET(url)
query

Output:

Response [https://metalpdb.cerm.unifi.it/]
  Date: 2022-04-04 11:58
  Status: 200
  Content-Type: text/html; charset=UTF-8
  Size: 22.5 kB


<!DOCTYPE html>
<html>
    <head>
        <title>Welcome to MetalPDB</title>
        <script src="/assets/javascripts/jquery-1.11.1.min.js" type="text/javascript"></script>
        <script src="/assets/javascripts/jquery-ui.min.js" type="text/javascript"></script>
        <script src="/assets/javascripts/collapse.js" type="text/javascript"></script>
        <script src="/assets/javascripts/transition.js" type="text/javascript"></script>
...

You could try restarting R and then reinstalling both packages with install.packages(c("curl", "httr")).

Session info:

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 12.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] nl_NL.UTF-8/nl_NL.UTF-8/nl_NL.UTF-8/C/nl_NL.UTF-8/nl_NL.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] httr_1.4.2

loaded via a namespace (and not attached):
[1] compiler_4.1.0 R6_2.5.1       tools_4.1.0    curl_4.3.2 
Quinten
  • Thanks for your reply! This is weird. I actually tried this on multiple computers. I also made sure to use the newest R version and newest package version. It still does not work. Now the error is however: `Error in curl::curl_fetch_memory(url, handle = handle) : schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed).` Could you maybe include your session info? – jpquast Apr 04 '22 at 13:19
  • @jpquast, Added session info to answer! – Quinten Apr 04 '22 at 13:24
  • I still don't quite understand why it works for you but not for me. I have tried disabling SSL certificate verification with `set_config(config(ssl_verifypeer = 0L))`, as described in other issues, but it still doesn't work. I even tried it in Python, where it initially didn't work either, but it did work once I disabled verification. – jpquast Apr 04 '22 at 14:22
  • I had a similar issue, and the answer to this post solved it for me: https://stackoverflow.com/questions/64147821/error-running-weathercan-package-fatal-ssl-tls-alert-e-g-handshake-failed – zoowalk Jul 29 '22 at 07:26
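
The error message names schannel, the Windows-native TLS backend, while the session in the answer runs on macOS, where a different TLS library is used. That difference may explain why the same code behaves differently on the two machines. One thing that may be worth trying is switching the curl package to its OpenSSL backend through the CURL_SSL_BACKEND environment variable. The following is only a minimal sketch: it assumes a Windows build of the curl package that ships both backends, it is not confirmed to fix this particular site, and it has to be run in a fresh R session because the backend is chosen when curl is first loaded.

```r
# Assumption: Windows build of the curl package with both Schannel and OpenSSL.
# Set the variable BEFORE curl/httr are loaded (i.e. in a fresh R session).
Sys.setenv(CURL_SSL_BACKEND = "openssl")

library(httr)

# Report which TLS backend curl is using; on Windows builds the inactive
# backend is typically shown in parentheses.
print(curl::curl_version()$ssl_version)

url <- "https://metalpdb.cerm.unifi.it"

# Wrap the request so that a failed handshake is reported instead of
# aborting the script.
query <- tryCatch(
  httr::GET(url),
  error = function(e) {
    message("Request failed: ", conditionMessage(e))
    NULL
  }
)

# Print the HTTP status code only if the request went through.
if (!is.null(query)) {
  print(httr::status_code(query))
}
```

If this turns out to help, the setting can be made permanent by adding the line CURL_SSL_BACKEND=openssl to ~/.Renviron, so that it is in place before any package loads curl.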