I'm going to try deploying my first web app soon, so my experience is lacking. I remember reading somewhere that bots start port-scanning a machine within minutes of it being exposed to the internet (or maybe that was how long it takes for a Windows 95 system to get compromised; it's been a while since I read the article). This particular server is running Ubuntu 9.10 server edition on amd64.
The web app itself should be served entirely over HTTPS, as per this question I just asked here. In addition, the website has a file upload section, currently done through HTTP POST, and each uploaded file is eventually farmed out over a separate (wireless, unfortunately) interface to another computer that handles the actual processing.
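To make the "entirely over HTTPS" part concrete, here is a minimal sketch of how I imagine enforcing it at the application layer, assuming the app is a Rack app (the `ForceHttps` class name is just something I made up, and the same redirect could, and probably should, also be done in the web server's own config):

```ruby
require 'rack'

# Hypothetical Rack middleware: pass HTTPS requests through,
# 301-redirect anything that arrives over plain HTTP.
class ForceHttps
  def initialize(app)
    @app = app
  end

  def call(env)
    req = Rack::Request.new(env)
    if req.scheme == 'https'
      @app.call(env)
    else
      # Send the client to the same path on the HTTPS side.
      [301,
       { 'Location' => "https://#{req.host}#{req.fullpath}",
         'Content-Type' => 'text/html' },
       []]
    end
  end
end
```

In a `config.ru` this would just be `use ForceHttps` above the `run` line, so the upload POSTs would also end up going over HTTPS.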
So, on the network interface exposed to the world, I'm thinking that ports 80 and 443 should be open and nothing else. As I said above, 80 should just redirect to 443. Is that sane? Is there something else I don't know about, some other port I should have open? The files are moved to the processing system using Ruby DRb over ports 9000 and 9001, so those need to be open as well, but only on the second interface (see the sketch below for how I plan to keep DRb off the public interface).
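Roughly, this is what I have in mind on the DRb side: bind the service to the second interface's address rather than to all interfaces, so the firewall only has to allow 9000/9001 there. The IP address and the `FileProcessor` class below are hypothetical placeholders, not my real setup:

```ruby
require 'drb/drb'

# Hypothetical front object exposed over DRb: receives an uploaded
# file's bytes and drops them somewhere for the processing box.
class FileProcessor
  def process(filename, data)
    path = File.join('/var/spool/uploads', File.basename(filename))
    File.open(path, 'wb') { |f| f.write(data) }
  end
end

# Binding to the second interface's IP (assumed 192.168.1.10 here)
# instead of 0.0.0.0 means port 9000 never listens on the public
# interface; 9001 would be handled the same way on the other end.
DRb.start_service('druby://192.168.1.10:9000', FileProcessor.new)
DRb.thread.join
```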
Also, what firewall program should I be using to handle two network interfaces like this? There are a few listed here, but I'm not sure which is appropriate for serving web pages, or even if this is a special case at all.