Is it possible to download the contents of a website (a set of HTML pages) straight to memory without writing anything to disk?
I have a cluster of machines with 24 GB of memory installed each, but I’m limited by a disk quota to a few hundred MB. I was thinking of redirecting the output of wget
to some kind of in-memory structure without storing the contents on disk. The other option is to write my own version of wget,
but maybe there is a simple way to do it with pipes?
Also, what would be the best way to run this download in parallel (the cluster has more than 20 nodes)? I can’t use the file system in this case.
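
To make the idea concrete, here is a rough sketch in Python of the kind of thing I have in mind. It is only an illustration under assumptions, not a tested solution: the names `fetch`, `shard` and `fetch_all` are made up for this example, it assumes every node already knows its own index and the total node count, and it assumes the pages handled by one node fit in that node's RAM.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=30):
    """Download one URL and return its body as bytes, held only in memory."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def shard(urls, node_index, node_count):
    """Pick this node's share of the URL list (round-robin by position).

    node_index and node_count are hypothetical inputs; how a node learns
    them depends on the cluster setup.
    """
    return urls[node_index::node_count]


def fetch_all(urls, fetcher=fetch, workers=8):
    """Fetch the given URLs concurrently and return a dict of url -> bytes.

    Nothing is written to disk; the result lives in this process's memory.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetcher, urls)))
```

Each node would then call something like `fetch_all(shard(all_urls, node_index, node_count))` and keep the resulting dictionary in RAM. For the pipe approach, `wget -q -O - URL` writes the downloaded document to standard output instead of a file, so it can be piped into another process.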