
I've recently set up my first Ubuntu server and installed scrapy and scrapyd. I've written a few spiders, and I've figured out how to run them through the API on port 6800. I've also noticed there's a web interface on that port, and that pretty much anyone could use it the same way I do, especially since it lists all the project and spider names. Is there a way to protect this so that only I can manage it?

Thanks, Chad

  • Are you asking about how to set up a firewall on your Ubuntu system? If so, this page may be of help: [Ubuntu Firewall Guide](https://help.ubuntu.com/12.04/serverguide/firewall.html#firewall-ufw) – John Hascall Sep 05 '14 at 20:00
  • @JohnHascall, I'm not really sure. I briefly Googled "password-protect scrapyd port 6800", and most of the answers dealt with .htaccess files or Nginx configs. I'm new to this, so I don't really know how to implement those suggestions. Plus, I'm running Apache, not Nginx. I'm also kind of surprised that scrapyd would leave the API wide open. Can I firewall the port and then still access it via password? – Chad Casey Sep 05 '14 at 21:35
  • A firewall blocks access via ports and/or IP addresses. If you want to password-protect it, then .htaccess is the way to go (Apache uses .htaccess too). There are lots of examples of doing this, but this video seems like a pretty good intro to [Apache Access Control](https://www.youtube.com/watch?v=b9j8KaBBrxE); a rough sketch of that kind of setup follows these comments. – John Hascall Sep 05 '14 at 23:17
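For anyone who wants to try the Apache route mentioned in the comments, here is a minimal sketch: Apache terminates HTTP Basic authentication and proxies requests through to scrapyd, which is then only reached locally. None of this comes from the original thread; the site filename `/etc/apache2/sites-available/scrapyd.conf`, the password-file path, and the username `chad` are all assumptions, so adjust them to your own layout.

```bash
# Enable the Apache modules needed for proxying and Basic auth
# (auth_basic is usually on by default; enabling it again is harmless).
sudo a2enmod proxy proxy_http auth_basic

# Create a password file; "chad" is a placeholder username.
sudo htpasswd -c /etc/apache2/.scrapyd-htpasswd chad

# Write a site config (assumed filename) that proxies "/" to scrapyd
# on port 6800 and requires a valid login for every request.
sudo tee /etc/apache2/sites-available/scrapyd.conf > /dev/null <<'EOF'
<VirtualHost *:80>
    ProxyPass        / http://127.0.0.1:6800/
    ProxyPassReverse / http://127.0.0.1:6800/

    <Location />
        AuthType Basic
        AuthName "Scrapyd"
        AuthUserFile /etc/apache2/.scrapyd-htpasswd
        Require valid-user
    </Location>
</VirtualHost>
EOF

sudo a2ensite scrapyd
sudo service apache2 reload
```

This only helps if port 6800 itself stops being reachable from outside, for example by setting `bind_address = 127.0.0.1` in scrapyd's config or by firewalling the port as in the answer below; otherwise anyone can still skip the proxy and talk to scrapyd directly.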

1 Answer


I figured it out. I couldn't get .htaccess to do what I wanted, which was to password-protect a port where a service was listening for requests, so I used iptables instead, since traffic to that port will only ever come from a few known IPs.
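A sketch of that iptables approach, in case it helps the next reader: the addresses below are placeholders from the documentation range, not anything from the original answer, so substitute the IPs you actually connect from.

```bash
# Allow the handful of machines that should reach scrapyd on port 6800
# (203.0.113.10 and 203.0.113.20 are placeholder addresses).
sudo iptables -A INPUT -p tcp --dport 6800 -s 203.0.113.10 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 6800 -s 203.0.113.20 -j ACCEPT

# Drop everything else aimed at that port. Order matters: the ACCEPT
# rules above must be added before this DROP rule.
sudo iptables -A INPUT -p tcp --dport 6800 -j DROP

# Persist the rules across reboots (the iptables-persistent package
# loads /etc/iptables/rules.v4 at boot).
sudo apt-get install iptables-persistent
sudo sh -c 'iptables-save > /etc/iptables/rules.v4'
```

On Ubuntu the same restriction can also be expressed with ufw (see the firewall guide linked in the comments), e.g. `sudo ufw allow from 203.0.113.10 to any port 6800 proto tcp` followed by `sudo ufw deny 6800/tcp`.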
