My department does data migrations when a client switches from another software vendor to us. Often we need to get a copy of their old data (whatever that may have been) sent to us.
The big challenge we face is that some systems have hundreds of thousands of files (mainly document/image repositories), and the whole collection can be tens of gigabytes in size. We grab a copy of their data at the start of the conversion process, then we grab a second set right before the install, which could be months later.
We are looking for a better way to upload that second set of data. Right now the main method is creating one large zip of the whole directory and FTPing it (via a write-only account) to our server. That of course has a large overhead, because a large portion of the files have likely not changed since the initial data grab.
Tools like rsync seem like the perfect solution, but from what I have researched there is no easy way to set up a "write only" account like we did with the FTP. Preventing unauthorized downloading of another client's data is a big concern for the higher-ups.
In summary, what kind of tools should I be using with these kinds of requirements:
- Does not allow downloading of other clients' data.
- Minimal setup work to be performed client side. Usually instructions are given over the phone on how to upload the data and we don't have anyone on site. Also the person on the other end of the phone is often VERY unskilled in computer use.
- Windows compatibility of the client. 95% of our clients are Windows users; the other 5% are on Macs, but Mac compatibility is not a major concern (though it would be a +).
- Allows us to not send redundant files that have not changed.
- Reliability on the client side. We have attempted to use BITS in the past to upload, but we found that a fairly large number of XP-era machines just could not get it to work correctly. Any client we use needs to work 99% of the time on any Windows machine running XP SP2 or newer.
- Minimal setup work server side per client. We do not want to have to create a separate user for every single client who uploads, but if we had to, it would not rule out a tool; it would only count as a -.
- The server-side program runs on Windows. We are mainly a Windows/C# shop, so having to set up and manage a Linux box is not preferred. However, if the tool in question meets all the other requirements well, it would not be ruled out for not running on Windows.
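To make the "do not send redundant files" requirement concrete, here is a minimal sketch of the comparison I have in mind. It is written in Python purely for illustration; the function names `build_manifest` and `files_to_upload` are hypothetical, and it assumes a manifest of content hashes was saved at the time of the first data grab. It is not how rsync or any specific tool works internally.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file path (relative to root, '/'-separated) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root).replace(os.sep, "/")
            digest = hashlib.sha256()
            with open(full_path, "rb") as handle:
                # Read in 1 MiB chunks so large image files are never fully in memory.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[rel_path] = digest.hexdigest()
    return manifest


def files_to_upload(old_manifest, new_manifest):
    """Return the sorted relative paths that are new or whose content has changed."""
    return sorted(
        path
        for path, digest in new_manifest.items()
        if old_manifest.get(path) != digest
    )
```

With something like this, only the paths returned by `files_to_upload` would need to go into the second upload, instead of the whole directory.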
Currently the frontrunner is rsync, combined with writing some sort of user manager that would create a separate user account on the rsync server per client, but I am sure there are other options I do not know about that could be better suited.