Can we completely prevent robots from accessing our web application?

Question

As far as I know, if we want to prevent robots from accessing our web site, we have to parse the 'User-Agent' header of each HTTP request and check whether the request comes from a robot or a browser.

I don't think we can block robots completely, because anyone can write a program that uses an HTTP client to send requests with a FAKE browser user-agent. In that case we cannot tell whether the user-agent really comes from a browser or was set by a robot program.

My question: is there any way to completely prevent robots from accessing our web site?

Your question is answered here: http://stackoverflow.com/questions/7045705/stop-abusive-bots-from-crawling — Zeeba, Dec 12 '13 at 20:40
OK, got it. I didn't search for this on this site before. Thanks. — LHA, Dec 12 '13 at 20:41
Look up [Turing test](http://en.wikipedia.org/wiki/Turing_test). — Hot Licks, Dec 12 '13 at 21:32

Lenny · Answer 1 · 2013-12-12T20:54:14.013

You cannot eliminate the bots, but you can greatly reduce them.

The obvious option, which you're already using, is user-agent detection.
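To make the user-agent check concrete, here is a rough sketch. The function name `looksLikeBot` and the token list are my own assumptions for illustration, not a complete list, and as the question points out, any client can send a fake browser user-agent and get past this.

```typescript
// Illustrative substrings that often show up in the User-Agent of automated clients.
// This list is an assumption for the sketch, not a complete or authoritative one.
const BOT_PATTERN = /bot|crawler|spider|curl|wget|python-requests/i;

// Returns true when the User-Agent header is missing, empty, or contains a bot token.
function looksLikeBot(userAgent: string | undefined): boolean {
  if (!userAgent || userAgent.trim() === "") {
    return true; // treat a missing header as suspicious (a choice made for this sketch)
  }
  return BOT_PATTERN.test(userAgent);
}
```

On the server you would pass in whatever your framework exposes as the request's `User-Agent` header and refuse or throttle the request when the function returns true.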

You could also load your page content through Ajax using JavaScript, which would eliminate any bot that cannot execute JavaScript. Just have an empty div with id="content" and, on page ready, make an Ajax call to insert the content. This means that if anyone uses curl or similar to scrape your page, they won't get the content. If the bot is built for your site specifically, this is easy to work around, but most random bots probably wouldn't get through it.
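A minimal sketch of the empty-div idea, in case it helps. The names `loadContent` and `ContentTarget` and the URL `/fragments/home.html` are made up for the example; the fetch function is passed in as a parameter only so the sketch does not depend on a browser to run.

```typescript
// Minimal shape of the element we write into; a real HTMLElement satisfies it.
interface ContentTarget {
  innerHTML: string;
}

// Fetches the page body from `url` and inserts it into the (initially empty) target.
async function loadContent(
  target: ContentTarget,
  url: string,
  fetchText: (url: string) => Promise<string>,
): Promise<void> {
  target.innerHTML = await fetchText(url);
}

// In a browser this could be wired up roughly like so (hypothetical URL):
//
// document.addEventListener("DOMContentLoaded", () => {
//   const div = document.getElementById("content");
//   if (div) {
//     loadContent(div, "/fragments/home.html", (u) => fetch(u).then((r) => r.text()));
//   }
// });
```

A client that only downloads the HTML sees the empty div and nothing else. A bot that drives a real browser, or that requests the fragment URL directly, still gets the content.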

You could also obfuscate the target URL in JS, and/or make it automatic by using location.href to tell the Ajax call to look for a content file of the same name in a different folder.
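The "same name, different folder" mapping could look something like this. The function name `contentPathFor` and the `/content` folder are assumptions for the sketch, and it works on the path part of the address rather than the full location.href.

```typescript
// Maps a page path such as "/articles/intro.html" to "/content/intro.html".
// The "/content" folder name is a made-up convention for this sketch.
function contentPathFor(pagePath: string, contentFolder: string = "/content"): string {
  // Drop empty segments so leading or trailing slashes do not matter.
  const segments = pagePath.split("/").filter((s) => s.length > 0);
  // Fall back to a default file name when the path has no file part (e.g. "/").
  const fileName = segments.length > 0 ? segments[segments.length - 1] : "index.html";
  return `${contentFolder}/${fileName}`;
}
```

In the browser you would call it as `contentPathFor(location.pathname)` and hand the result to your Ajax call. Keep in mind this only hides the URL from casual scrapers; anyone reading the script can work it out.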

You could of course put a captcha in front of the site before a user (or bot) can enter, but that's annoying to users.

If it's less about accessing the page and more about form submission, then a captcha is a great choice. Or you could use a honeypot: add a form field that is hidden by CSS. A robot will fill out that field, but a human won't (because it's hidden), and you can detect that.
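The server-side half of the honeypot can be as small as the sketch below. The field name `website` and the function name `isLikelyBotSubmission` are placeholders I picked; use whatever hidden field your form actually contains.

```typescript
// Submitted form values keyed by field name.
type FormFields = Record<string, string | undefined>;

// Returns true when the hidden honeypot field came back with a value.
// A human never sees the field (it is hidden with CSS), so it should be absent or empty.
function isLikelyBotSubmission(fields: FormFields, honeypotName: string = "website"): boolean {
  const value = fields[honeypotName];
  return value !== undefined && value.trim() !== "";
}
```

For example, `isLikelyBotSubmission({ email: "a@example.com", website: "" })` is false, while a submission with `website: "http://spam.example"` is flagged. A bot written specifically for your form can leave the field empty, so this filters generic form-fillers rather than determined ones.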

Thanks for the good tip: using an empty div + Ajax call at body load. — LHA, Dec 12 '13 at 20:50

score 0 · Answer 2 · answered Dec 12 '13 at 20:40

Other than placing your pages behind some kind of authentication method, the answer is no.

Obviously, the authentication would also apply to humans.

driis

score 0 · Answer 3 · answered Dec 12 '13 at 20:42

I think that authentication with a captcha is the easiest and most widely used way. Another option would be to ask the user simple questions (simple for humans but not for bots). However, all these methods are annoying for human users.

HAL9000
