When I try to download a page from LinkedIn with the following command:
curl -I https://www.linkedin.com/company/google
I get a 999 status code (I am going through an HTTP proxy, so the first 200 line below is just the CONNECT tunnel being established; the actual response from LinkedIn is the 999):
HTTP/1.1 200 Connection established
HTTP/1.1 999 Request denied
Date: Tue, 30 Aug 2016 08:19:35 GMT
X-Li-Pop: prod-tln1-hybla
Content-Length: 1629
Content-Type: text/html
Since users browsing with a regular browser can access LinkedIn pages, LinkedIn must be able to tell robots and real users apart.
Otherwise, users would not be allowed to access LinkedIn pages either, given the following lines at the end of robots.txt:
User-agent: *
Disallow: /
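(For reference, the full file can be checked directly, since robots.txt is always served at the site root:
curl https://www.linkedin.com/robots.txt)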
So LinkedIn can tell the difference between requests coming from browsers and requests coming from other clients. How do they do that?
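My first guess is the User-Agent header, since curl identifies itself as curl/x.y.z by default. A browser-like value can be sent for testing with curl's -A option; the string below is just an example of what a Chrome build of that era sends, not a value LinkedIn is known to check:
curl -I -A "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36" https://www.linkedin.com/company/google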