Chapter 4: Advanced Server Configuration

4.1 Vocabulary

This chapter introduces terminology that is needed to understand how apt-cacher-ng works; it is recommended to become familiar with it before continuing with the advanced configuration.

4.2 Configuration file types

By default, the /etc/apt-cacher-ng directory (or the one specified with program options) contains all config files, HTML page templates, the stylesheet and other text-based support files used by apt-cacher-ng. The contents may vary depending on how apt-cacher-ng was installed; for Linux distribution packages, refer to the package documentation.

apt-cacher-ng distinguishes the following file types:

  1. Main configuration files:

    *.conf files are assumed to contain configuration directives in the form of "key: value" pairs. The package comes with a commented example configuration file. apt-cacher-ng reads all files matching *.conf in alphabetical order and merges their contents. For documentation of the options, see the commented example file shipped with apt-cacher-ng (conf/ directory in the original source).

  2. URL lists and remote repository list files. The file names are arbitrary; no special suffix is expected. They are included during processing of configuration files and can contain data in one of the following formats:
  3. Various support files used for the configuration web interface, named like *.css and *.html.
  4. *.default files are used in some rare cases as a replacement for list files that have the same name without the .default suffix.
  5. *.hooks files specify custom actions which can be executed upon connection/disconnection (see section 4.3.2 for details).
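
As an illustration of the "key: value" form used in *.conf files, a minimal sketch might look like the lines below. The option names are believed to match those in the commented example file, but the values are assumptions for a typical installation and should be checked against that file:

CacheDir: /var/cache/apt-cacher-ng
LogDir: /var/log/apt-cacher-ng
Port: 3142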

4.3 Repositories and URL mapping

In the simplest configuration, apt-cacher-ng acts almost like an ordinary HTTP proxy with improved caching behaviour. When files are requested, they are downloaded from the remote location specified in the client's request and stored in the cache at a unique location.

However, for some use cases it can be beneficial to specify additional rules to achieve further improvements, e.g. to detect and prevent avoidable downloads, to reduce the space requirements of the cache directory, or simply to hide the real download locations from the APT clients.

These modifications are generally achieved by two strategies, Merging and Redirection, which are configured in the context of a specified cache Repository. The configuration for them is created using one or more Remap-... configuration directives (see below).

Merging:

"Merging" of incoming requests is possible when certain subdirectories on different remote servers are considered equivalent, i.e. the same trailing part of the remote file path leads to the same file content on each of them. When merging is configured, both the cached content and any live download stream are shared between such requests. The configuration work consists of setting up an "equality list" containing a set of URLs representing the base directories (like http://ftp.debian.org/debian and http://ftp.uni-kl.de/pub/linux/debian).

Redirection:

With redirection, client requests cause a download from a remote location that differs from the one the clients requested and believe they are receiving data from. Redirection is an optional feature; if used, it is configured with one or more URLs pointing to target servers. These URLs must include a directory spec which matches the directory level of the URLs in the Merging list, for example all ending with /ubuntu/ for the usual Ubuntu mirror URLs. If redirection is not used (i.e. the target URL list is empty), the original URL from the client's request is used to get the data.

Repository:

A (cache) repository is the internal identifier which declares the scope in which Merging/Redirection specs are applied. It also represents the name of an internal cache subdirectory.

4.3.1 Writing Remap-... configuration

When use cases for merging/redirection are identified and a repository name is chosen, these components are written into configuration directives starting with Remap- which follow the simple syntax:

Remap-RepositoryName: MergingURLs ; TargetURLs

The repository name is a symbolic name which should be chosen carefully and should not be changed afterwards; otherwise the data might become inaccessible to clients until the files are extracted and reimported semi-manually. Internally, this string shares its namespace with the host names and/or top directory names of other URLs. Name collisions can cause nasty side effects and should be avoided. Recommended names are made up of alphanumeric or URL-friendly characters. Also, a repository name should not correspond to a real hostname. Examples of good names: archlinux-repo, debianlocal; examples of bad names: fedora.example.com, _very&weird.

The TargetURLs part is optional (see Redirection description above). If multiple targets are specified, the order of servers here defines their order of preference (see also the NetworkTimeout option and additional notes below).

Both URL lists simply contain URLs separated by spaces. The strings must be properly URL-encoded. Since all URLs are assumed to use the http:// protocol and to point to a remote directory, the http:// prefix and trailing slashes are optional. There is no hard limit on the number of URLs. However, for readability it is recommended to put them into separate list files (see section 4.2) and to reference those list files with tags like file:urlsDebian.list instead of writing all URLs on a single line. Raw URLs and file:... lists can be mixed.
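
As a sketch of these rules, the two lines below should be equivalent ways of writing the same Merging list, since the http:// prefix and trailing slashes are optional. The repository name debrep and the mirrors are only illustrative, and only one of the two lines would be used in practice:

Remap-debrep: http://ftp.de.debian.org/debian/ http://ftp.at.debian.org/debian/
Remap-debrep: ftp.de.debian.org/debian ftp.at.debian.org/debian

Raw URLs and list files can be combined on one line in the same way. Here urlsDebian.list is the list file name mentioned above, and the additional mirror is again only an illustration:

Remap-debrep: file:urlsDebian.list ftp.at.debian.org/debian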

Fully configured Remap lines can look like:

Example I:

Remap-debrep: ftp.de.debian.org/debian http://ftp.at.debian.org/debian

for the use case: a small home network where clients have de... or at... servers in their sources.list files and use acng as HTTP proxy. The files are still downloaded from the at... or de... mirror depending on the user request, but data that is already cached is served to both at... and de... users.

Example II:

Remap-ubuntu: file:ubumir.lst ; 192.168.17.23/pu ca.archive.ubuntu.com/ubuntu

for the use case: a small home network where clients have various Ubuntu mirrors (which are listed in ubumir.lst) in their sources.list files and use acng as HTTP proxy. All requests are redirected to a mirror in the /pu directory of some local machine. When that machine is down, the Canadian public server is used instead.
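
Both examples assume that the clients use acng as HTTP proxy. A minimal sketch of the corresponding APT setting on a client is shown below. The host name acngserver, the port 3142 and the file name are assumptions and must be adapted to the local installation.

In /etc/apt/apt.conf.d/01proxy:

Acquire::http::Proxy "http://acngserver:3142";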

4.3.2 Special tricks and additional notes

There are some implementation details (partially explained above) and some configuration options related to repository settings which should be mentioned explicitly.

The internal cache directory tree follows the URL requests from the clients unless modified by Remapping rules. With a proxy-style configuration on the user side, the top-level directory is always the hostname of the requested URL. But if clients access the apt-cacher-ng server like a regular mirror (not using APT's proxy config), the leading part of the path is simply passed on as a regular directory name. At this point, it is possible to use Remapping constructs to access arbitrary remote locations while the client believes it is downloading from a subdirectory of apt-cacher-ng (acting as HTTP server). This is configured by simply using /some/directory/string/ instead of URLs in the Merging list, which lets your clients download from http://acngserver/some/directory/string/... paths.
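
A sketch of such a setup, assuming a hypothetical repository name localdeb and reusing the Debian mirror URL mentioned in section 4.3:

Remap-localdeb: /some/directory/string ; http://ftp.debian.org/debian

A client would then reference the apt-cacher-ng server like a regular mirror in its sources.list, without any proxy setting. The suite and component names are illustrative, and a port number may need to be added to the host name depending on the server setup:

deb http://acngserver/some/directory/string stable main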

If multiple Remap- lines for the same Repository are specified, the contents of both URL lists are merged.
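
For instance, under this rule the following two lines should have the same effect as the single Remap-debrep line of Example I. This is only a sketch; whether to split a repository across several lines is a matter of readability:

Remap-debrep: ftp.de.debian.org/debian
Remap-debrep: http://ftp.at.debian.org/debian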

On some restricted networks, it may be necessary to enforce the use of predefined mirrors. If the ForceManaged option is set, only requests to URLs matched by some Remap-... configuration are allowed.
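
A minimal sketch of such a restriction, combining the option with the Remap line from Example I. The value 1 for enabling the option is an assumption and should be checked against the commented example configuration file:

ForceManaged: 1
Remap-debrep: ftp.de.debian.org/debian ftp.at.debian.org/debian

With this configuration, requests for URLs outside the two listed Debian mirrors would be expected to be rejected.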

Sometimes, it may be necessary to execute a system command before a connection to certain machines is established. This is possible by associating commands with a repository declaration, i.e. by storing a file named like repositoryname.hooks in the main configuration directory. It can contain PreUp, Down and DownTimeout settings. PreUp/Down are executed by the system shell, and it is up to the administrator to make sure that no malicious code is contained there and that the execution of these commands does not cause significant delays for other apt-cacher-ng users.
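
As a sketch, a file named debrep.hooks (matching the repository name from Example I) might contain the lines below. The two commands are purely hypothetical placeholders, the timeout value is illustrative, and the "key: value" notation is assumed to follow the form described in section 4.2; the exact syntax and the meaning of DownTimeout should be checked against the package documentation:

PreUp: /usr/local/bin/link-up
Down: /usr/local/bin/link-down
DownTimeout: 30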

If the Redirection part contains multiple URLs, the server prefers to use them in the order of appearance. On success, the first target is used all the time, and so this should be the preferred mirror. "Success" means getting a started download or a non-critical failure in this context. A "404 File not found" status is not a critical failure since APT (client) can expect it while checking for existence of remote files and change its behaviour accordingly.


Comments to blade@debian.org
[Eduard Bloch, Sat, 12 Feb 2011 23:33:20 +0100]