
Here's what I'm trying to do:

I want to distribute my vCard (.vcf) file by hosting it on my personal website (this part is a rigid requirement). People will access it from a QR code on my business card; however, no links to the file will exist on my webpages.

I want to make the file publicly accessible, while ensuring that it doesn't get scraped by a bot. It will be contained in a folder disallowed from "normal" bots via robots.txt, and I will disable directory listings in Apache.

I do NOT want to introduce additional steps such as captchas or authentication.

My thought is something like how Google Drive does public sharing: a 44-character random string that represents the file. So...

http://mywebsite.com/private/34599771831821330576336168849178778047996955.vcf
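As a rough sketch of how such a name could be generated, assuming Python is available and using the standard-library `secrets` module (the function name `random_vcf_name` and the 32-byte default are hypothetical choices, not part of the setup described above):

```python
import secrets


def random_vcf_name(n_bytes: int = 32) -> str:
    """Return an unguessable file name of the form '<token>.vcf'.

    The token comes from the OS cryptographic random source and uses
    only URL-safe characters, so it can go straight into a URL or QR code.
    """
    token = secrets.token_urlsafe(n_bytes)  # 32 random bytes -> 43 characters
    return f"{token}.vcf"


name = random_vcf_name()
# mywebsite.com/private/ is the example location from the question.
print(f"http://mywebsite.com/private/{name}")
```

The point of `secrets` over `random` is that the latter is not designed for security-sensitive values; a predictable generator would undermine the whole idea.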

My questions are:

1) How safe is this? Presumably, as long as I disable directory listing on Apache, the only way a bot can stumble on the file without a direct link is via random guessing. Do bots really bother trying such a thing?

2) If it's safe, presumably string length is key. Just how long does the string need to be to make it "safe"?

3) Is there a better way to do this than filename obscurity?

Christian
  • I've seen so many obscure URLs getting picked up by Google over the years (with no idea how they did it), I don't trust this method any more. Even a simple password is better than just a URL if it's really sensitive data. – Pekka Jan 13 '16 at 21:22
  • It's not terribly sensitive, so I'd rather err on the side of open (on the assumption that many people already struggle with QR codes + vCards and will give up if I add another step of authentication). The containing folder will certainly be disallowed from "legal" bots... I'm just worried about "illegal" ones. – Christian Jan 13 '16 at 21:24

1 Answer


Yes, there is a better way: reCAPTCHA. The idea is to present the user with the CAPTCHA and, if they solve it correctly, proceed to the download.

https://www.google.com/recaptcha/intro/index.html

BCartolo
  • I updated my post, as I should have stated that more clearly: I do not want to introduce additional steps (burden) on the end user. QR codes are already a pain for people, and I'm afraid they'll give up if they have to jump through hoops. – Christian Jan 13 '16 at 21:27