April 28, 2012

Your APT caching proxy is not that efficient

So you read about an "APT caching proxy" and that it can "save time and network bandwidth."
Sounds like something you should have, right? After all, the trade-off is just some additional disk space. APT already has its own cache at /var/cache/apt/archives/, so part of what the caching proxy would store on disk is already there.

The truth is that for a single client a caching proxy might not provide any benefit and might even cause a slowdown. With at least one of the available proxies it is even worse: every request that goes through the proxy and requires a download from a mirror opens a new connection.

It is 2012 and a piece of software that aims to "save time and network bandwidth" can't even use keep-alive connections?
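
To make the cost concrete, here is a minimal sketch that counts how many TCP connections a server sees when a client reuses one keep-alive connection versus opening a new one per request. Everything in it is an assumption made for illustration: the local test server, the `CountingHandler` class, the `connections_used` helper and the fake `/pkgN.deb` paths are hypothetical and are not taken from any actual APT proxy.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a mirror; counts accepted connections."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"fake package data"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def connections_used(reuse, requests=5):
    """Fetch `requests` fake packages and return the connections opened."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    conn = None
    try:
        for i in range(requests):
            if conn is None or not reuse:
                if conn is not None:
                    conn.close()
                conn = http.client.HTTPConnection(host, port)
            conn.request("GET", f"/pkg{i}.deb")
            conn.getresponse().read()  # drain the body before the next request
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    print("with keep-alive:   ", connections_used(reuse=True))
    print("without keep-alive:", connections_used(reuse=False))
```

On a loopback interface the difference is only a count, but against a real mirror each extra connection adds at least one more TCP handshake, which is the overhead a proxy without keep-alive pays on every cache miss.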

8 comments:

  1. For a single client, a slowdown would be unavoidable. A cache is only efficient if the object a client needs has a high probability of already being in the cache. The probability for a single client to hit anything in the cache would be 0, since every new request would inherently get a cache miss: no other clients would have asked for the same object and therefore warmed the cache. If there were 2 clients, the probability of getting a cache hit is 1/2, and so on.

    And connection keep-alive is annoying. I ran a public mirror and did not appreciate long-lived connections on my already taxed webserver.

    1. Keep-alive might affect the server but you have two options: a) reduce the max allowed keep-alive time on the server, and b) switch httpd. Or both.

      Creating a new connection for every request is simply suboptimal.

  2. May I ask what your preferred caching/proxying solution is for multiple clients, mixed Debian and Ubuntu?

    1. I used to use apt-cacher. Not sure if it ever gained support for multiple repositories.

      Nowadays, apt-cacher-ng looks promising, but I haven't actually used it. I know for a fact that it supports multiple repositories.

    2. It does. I'm running three different instances in different setups, one of them at my university. It has clearly saved us a lot of time and bandwidth.

    3. Does apt-cacher-ng still suffer from only being able to handle one download from the origin server at a time? http://awaseconfigurations.wordpress.com/2011/11/13/problems-with-apt-cacher-ng-and-parallel-fabric-execution-or-the-crash-of-the-cacher/

  3. I too use apt-cacher. I have to run one instance for Debian and one instance for Ubuntu because it only uses the filename as its cache path. I was wondering what to upgrade to, so I was also considering apt-cacher-ng.

  4. I agree that there's no sense in using an APT caching proxy if you are the only user, but if you are the administrator of a local network with quite a few Debian users, then it can be more useful.
