
I want to create a site that offers themes. People will be able to see the themes and use a demo, but I want to prevent cheapskates from using wget or any other download manager to download the demo sites in an automated fashion without paying the original authors of the themes.

I was wondering whether NGINX has the ability to block download managers like wget.

I know you can block user agents like this:

if ($http_user_agent ~ "(agent1|agent2|Foo|Wget|Catall Spider|AcoiRobot)") {
    return 403;
}

but that seems like something that can easily be circumvented by supplying a user agent that passes this filter.

Note: as mentioned below, a login system would help, but the demo would run in a frame and consist of HTML pages only. Does that stop download managers installed as browser extensions? They use the same session as the logged-in browser, which would render the login useless.

Why is this being downvoted? If you downvote, please explain why first.

I know browsers can download everything by saving just the page, but the main concern here is to stop people from leeching all themes in an automated fashion using a download manager.

It seems the consensus is that there is no reliable way to do this, but I wonder whether Nginx can block requests that appear to be automated.

Stofke
  • User agents are trivial to spoof. If you want this level of restriction, you need to put resources behind a login. – EEAA Dec 29 '12 at 17:07
  • Yes, there will be a login, but there are also download managers that come as browser extensions, and once logged in they can do the job, I think. – Stofke Dec 29 '12 at 17:10
  • If you are talking about web themes, you don't even need something like `wget`, as browsers like Firefox are able to download all resources of a page in one go, including pictures and CSS files. – Sven Dec 29 '12 at 17:11
  • Yes, that's what I was thinking; they just need to open the direct link to the frame content and download the webpage. Is there no way to prevent this? – Stofke Dec 29 '12 at 17:13
  • The only safe way to prevent this is to only use pre-rendered images as a preview. – Sven Dec 29 '12 at 17:16
  • Yes, but then you have no interaction, like the ability to see menus, for example. – Stofke Dec 29 '12 at 17:17
  • There is no way to do this. I can simply save the page in Firefox since I have it locally. – Mike Dec 29 '12 at 17:19
  • The only semi-reliable way would be to try writing/using an HTML/CSS obfuscator that makes the sources quasi-impossible to use, and to offer the clean version only after a purchase. I don't know if something like this exists, though. – Sven Dec 29 '12 at 17:23
  • OK, but that means they have to go through hundreds of themes one by one if they want to download everything. If I can't prevent that, so be it, but at least I can try to prevent automated downloads that target all content. – Stofke Dec 29 '12 at 17:24
  • Yes, but using Inspect Element in Chrome renders obfuscators useless, as it shows the code. It would make it more difficult, but it's also a hassle to obfuscate all the HTML. That said, I would already be happy if I could stop bots from downloading all the content in an automated fashion. – Stofke Dec 29 '12 at 17:25
  • An obfuscator as I envision it would make the HTML and CSS so difficult to read that it's unfeasible to actually use it in a project. But let's end this here: there is *no* reliable way to prevent robots from downloading your stuff if a browser can see it. – Sven Dec 29 '12 at 17:29
  • One last comment regarding download blockers: people have tried everything to block robots, but robot authors have countered everything. Advanced robots can counter UA detection and rate limits, and they can even go distributed to evade IP-based limiting (see the sketch after these comments). It's a game you can't win. – Sven Dec 29 '12 at 17:59
  • Make the preview something like an image or partial content. This is not a problem you can solve at the system level. – Grumpy Dec 29 '12 at 20:36
  • Yes, I'm thinking the only solution is to make a partial demo that doesn't contain all the files. – Stofke Dec 29 '12 at 22:14
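For reference, the "rate limits" mentioned in the comments correspond to Nginx's limit_req module. Below is a minimal sketch; the zone name demo_rl, the rate of 5 requests per second, and the /demos/ location are illustrative assumptions, not recommendations. As noted above, a careful or distributed scraper can stay under any threshold, so this only slows down naive bulk downloads.

http {
    # Shared 10 MB zone keyed on the client IP, allowing 5 requests per second (illustrative values)
    limit_req_zone $binary_remote_addr zone=demo_rl:10m rate=5r/s;

    server {
        location /demos/ {
            # Allow short bursts of up to 10 extra requests; further requests are rejected with 503
            limit_req zone=demo_rl burst=10 nodelay;
        }
    }
}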

1 Answer


I found this .htaccess file that blocks download bots for Apache. It can be adapted for Nginx; a rough translation follows the list below.

RewriteEngine On 
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]
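
For Nginx, one possible translation of the same idea is a map on $http_user_agent. The sketch below is rough and not a drop-in config: the variable name $bad_bot is arbitrary, the agent list is a shortened subset of the Apache list above, and matching is made case-insensitive for simplicity. The map block belongs in the http context; the if/return goes inside a server or location block.

map $http_user_agent $bad_bot {
    default 0;
    # Agents the Apache rules match at the start of the User-Agent string (subset of the list above)
    "~*^(BlackWidow|ChinaClaw|FlashGet|GetRight|Go!Zilla|GrabNet|LeechFTP|NetAnts|Offline Explorer|SiteSnagger|SuperBot|Teleport Pro|WebCopier|WebReaper|WebStripper|WebZIP|Wget|WWWOFFLE)" 1;
    # Agents the Apache rules match anywhere in the string
    "~*(HTTrack|Indy Library)" 1;
}

server {
    # ...
    if ($bad_bot) {
        return 403;
    }
}

As the comments point out, this only catches clients that identify themselves honestly; any of these tools can send a browser User-Agent string and pass straight through.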
Stofke