Skip to content

Technical details of CernVM-FS

CernVM-FS is implemented as a POSIX read-only filesystem in user space (FUSE) with repositories of files that are served via outgoing HTTP connections only, thus avoiding problems with firewalls.

Files in a CernVM-FS repository are automatically downloaded on-demand to a client system as they are accessed, from web servers that support the CernVM-FS repository being used.

Internally, CernVM-FS uses content-adressable storage (CAS) and Merkle trees (like Git also does) to store file data and metadata.

Caching

CernVM-FS uses a caching mechanism with a least-recently used (LRU) cache replacement policy, in which configurable local client cache is populated via either a forward proxy server (like Squid), or from a Stratum-1 replica server.

Both the proxy and the replica server could be within the same local network as the client, or not.

To help reduce performance problems regarding network latency and bandwidth, clients can leverage the Geo API supported by CernVM-FS Stratum-1 replica servers to automatically sort them geographically, in order to prioritize connecting to the closest ones.

Furthermore, additional caches can be made available to CernVM-FS, such as an alien cache on a shared cluster filesystem like GPFS or Lustre that is not managed by CernVM-FS, and a Content Delivery Network (CDN) can be used to help limit the time required to download files that are not cached yet.


(next: Flagship CernVM-FS repositories)