
I have a variety of Linux hosts on my office LAN. I run apt-cacher-ng on one box to cache package downloads for all of the Debian and Ubuntu machines on the network. We have a few Gentoo users, and I would like to cache their distfiles downloads as well.

I am already running an rsync mirror for Gentoo, and that has proven to be an easy setup and reliable.

What I would like is something like http-replicator, but that is actually maintained and has a Debian Squeeze package available. I've looked at Squid and it was just too much; I would like something simpler. I also looked at Polipo, which seemed to be on the right track but suffered from a fatal flaw:

All of the distfiles on the Gentoo mirrors are the same, but if you download the same file from a different source mirror, Polipo treats it as a different file, resulting in a cache miss. http-replicator didn't suffer from this issue. Since I don't administer all of the Gentoo boxes, I can't guarantee that everyone picks the same mirror; most people just choose theirs with mirrorselect anyway.
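The mismatch can be illustrated with a small sketch. It only contrasts a cache keyed on the full URL with one keyed on the file name; it is not how Polipo or http-replicator are actually implemented. The mirror hosts, the file name, and the helper names `url_key` and `distfile_key` are all placeholders made up for illustration.

```shell
#!/bin/sh
# Sketch only: full-URL cache key versus file-name cache key.
# Hosts and file name are placeholders, not real mirrors.

# Key a generic proxy might use: the whole URL, host included.
url_key() { printf '%s\n' "$1"; }

# Key a mirror-agnostic cache might use: only the distfile name.
distfile_key() { printf '%s\n' "${1##*/}"; }

a="http://mirror.example.org/gentoo/distfiles/foo-1.0.tar.gz"
b="http://mirror.example.net/pub/gentoo/distfiles/foo-1.0.tar.gz"

# Different hosts give different URL keys, so the second fetch misses.
[ "$(url_key "$a")" = "$(url_key "$b")" ] || echo "URL keys differ: cache miss"

# The file name is identical on both mirrors, so this key matches.
[ "$(distfile_key "$a")" = "$(distfile_key "$b")" ] && echo "distfile keys match: cache hit"
```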

So I'm looking for something that:

  1. Is pretty easy to set up and doesn't require too much fiddling or complicated cache-expiry configuration
  2. Can act as a transparent HTTP proxy
  3. Will deliver the same local file, even if it is being "downloaded" from a different server
  4. Doesn't require mirroring the entire collection of Gentoo distfiles

Is this too much to ask?

Martin
Sean O'Leary
  • Can you explain why you thought Squid was too much? It's a solution for what you're asking. – Shyam Sundar C S Mar 15 '12 at 19:28
  • I agree. Squid in its default configuration already does a good job of caching (Debian apt-get) downloads, so you don't even have to tweak it. – aseq Mar 15 '12 at 20:11
  • `cat /etc/squid/squid.conf | wc -l` `4948` I really just want to be a good member of these distros' communities by not putting undue load on their servers, and I don't want to spend a lot of resources (my time or server disk space) to do it. http-replicator was very easy to set up: it took about 15 minutes and required editing 5 lines of a 46-line config file. I guess I'll go with Squid if there's no choice in the middle, but I've spent an hour or so looking into how to configure it, it's not working correctly, and that's already more time than I wanted to spend on this. – Sean O'Leary Mar 16 '12 at 15:15

1 Answer


You can use apt-cacher-ng for this easily. Add a Remap line to its configuration (acng.conf):

Remap-gentoo: file:gentoo_mirrors http://distfiles.gentoo.org/ /gentoo ; file:backends_gentoo # Gentoo Archives

  • In the file gentoo_mirrors, put all of the mirrors you want to capture.
  • In the file backends_gentoo, put the backup mirror you want to use for fetching.
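As a rough sketch of what those two files might contain, the snippet below writes example versions into a scratch directory. The directory name `acng-example` and the `example.org`/`example.net` hosts are placeholders, not real mirrors; on a real install the files would presumably live alongside acng.conf (on Debian, under /etc/apt-cacher-ng), which is an assumption to verify against your setup.

```shell
#!/bin/sh
# Sketch: write example list files for the Remap-gentoo line.
# CONF_DIR and the example.* hosts are placeholders (assumptions).
CONF_DIR="${CONF_DIR:-./acng-example}"
mkdir -p "$CONF_DIR"

# Mirrors whose URLs should all resolve to the same cached files.
cat > "$CONF_DIR/gentoo_mirrors" <<'EOF'
http://distfiles.gentoo.org/
http://mirror.example.org/gentoo/
http://mirror.example.net/pub/gentoo/
EOF

# The upstream that apt-cacher-ng fetches from on a cache miss.
cat > "$CONF_DIR/backends_gentoo" <<'EOF'
http://distfiles.gentoo.org/
EOF
```

On the Gentoo side, one option should be to point Portage at the cache by setting something like http_proxy="http://cachehost:3142" in make.conf, where cachehost is a placeholder for the caching box and 3142 is apt-cacher-ng's default port.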

Here's a script to create gentoo_mirrors:

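#!/bin/sh
# This fetches the live Gentoo mirrors list
# robbat2@gentoo.org - 2013/Dec/03
OUTFILE=gentoo_mirrors
URL=http://www.gentoo.org/main/en/mirrors3.xml
# --save-headers keeps the HTTP response headers in the output; the first sed
# expression turns them into comments. The next two comment out the mirror
# group and mirror name lines, and the last keeps only the http URIs.
# Note: \+ and \n in these expressions require GNU sed.
wget --save-headers -q "$URL" -O - \
| sed -n \
-e '/^[A-Z]/{s,^,#,g;p}' \
-e '/<mirrorgroup/{s,^,\n#,g;p}' \
-e '/<name/{s,^,#,g;p}' \
-e '/<uri/{/protocol="http"/{s/.*<uri[^>]\+>//g;s/<\/uri>//g;p}}' \
>"$OUTFILE"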
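To see what the sed expressions keep and drop without touching the network, the same filter can be run on a small hand-written sample. The input below is a made-up fragment in roughly the shape the script expects, not real mirror data, and `extract_mirrors` is a hypothetical wrapper name; GNU sed is assumed.

```shell
#!/bin/sh
# Same sed program as the script above, wrapped so it reads stdin.
# extract_mirrors is a hypothetical helper name; requires GNU sed.
extract_mirrors() {
  sed -n \
    -e '/^[A-Z]/{s,^,#,g;p}' \
    -e '/<mirrorgroup/{s,^,\n#,g;p}' \
    -e '/<name/{s,^,#,g;p}' \
    -e '/<uri/{/protocol="http"/{s/.*<uri[^>]\+>//g;s/<\/uri>//g;p}}'
}

# Made-up sample: one header line, one mirror with an http and an ftp URI.
sample='HTTP/1.1 200 OK
<mirrorgroup region="Example" country="XX">
<name>Example Mirror</name>
<uri protocol="http">http://mirror.example.org/gentoo/</uri>
<uri protocol="ftp">ftp://mirror.example.org/gentoo/</uri>
</mirrorgroup>'

printf '%s\n' "$sample" | extract_mirrors
```

The header, group, and name lines come out prefixed with `#`, the http URI comes out as a bare URL, and the ftp URI is dropped.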

Source: I'm a senior Gentoo developer, and run the Gentoo infrastructure. I have submitted a variant on the above to the upstream apt-cacher-ng author.

robbat2