Mounting

An S3QL file system is mounted with the mount.s3ql command. It has the following syntax:

mount.s3ql [options] <storage url> <mountpoint>

Note

S3QL is not a network file system like NFS or CIFS. It can only be mounted on one computer at a time.

This command accepts the following options:

--log <target> Write logging info into this file. File will be rotated when it reaches 1 MiB, and at most 5 old log files will be kept. Specify none to disable logging. Default: ~/.s3ql/mount.log
--cachedir <path>
 Store cached data in this directory (default: ~/.s3ql)
--authfile <path>
 Read authentication credentials from this file (default: ~/.s3ql/authinfo2)
--debug <module>
 Activate debugging output from <module>. Use all to get debug messages from all modules. This option can be specified multiple times.
--quiet Be really quiet.
--no-ssl Do not use secure (ssl) connections when connecting to remote servers.
--ssl-ca-path <path>
 File or directory containing the trusted CA certificates. If not specified, the defaults compiled into the system’s OpenSSL library are used.
--version Print the program version and exit.
--cachesize <size>
 Cache size in KiB (default: autodetect).
--max-cache-entries <num>
 Maximum number of entries in cache (default: autodetect). Each cache entry requires one file descriptor, so if you increase this number you have to make sure that your process file descriptor limit (as set with ulimit -n) is high enough (at least the number of cache entries + 100).
--allow-other Normally, only the user who called mount.s3ql can access the mount point. This user then also has full access to it, independent of individual file permissions. If the --allow-other option is specified, other users can access the mount point as well and individual file permissions are taken into account for all users.
--allow-root Like --allow-other, but restrict access to the mounting user and the root user.
--fg Do not daemonize; stay in the foreground.
--single Run in single threaded mode. If you don’t understand this, then you don’t need it.
--upstart Stay in foreground and raise SIGSTOP once mountpoint is up.
--profile Create profiling information. If you don’t understand this, then you don’t need it.
--compress <algorithm-lvl>
 Compression algorithm and compression level to use when storing new data. algorithm may be any of lzma, bzip2, zlib, or none. lvl may be any integer from 0 (fastest) to 9 (slowest). Default: lzma-6
--metadata-upload-interval <seconds>
 Interval in seconds between complete metadata uploads. Set to 0 to disable. Default: 24h.
--threads <no> Number of parallel upload threads to use (default: auto).
--nfs Enable some optimizations for exporting the file system over NFS. (default: False)
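
As an illustration of how these options combine on the command line, the following Python sketch assembles an argument list for mount.s3ql. The helper name build_mount_command is hypothetical, and the storage URL, mount point, and option values are placeholders; only the option names are taken from the list above.

```python
import shlex


def build_mount_command(storage_url, mountpoint, *, compress=None,
                        cachesize_kib=None, allow_other=False):
    """Assemble an argument list for mount.s3ql (hypothetical helper)."""
    cmd = ["mount.s3ql"]
    if compress is not None:
        cmd += ["--compress", compress]             # e.g. "zlib-6"
    if cachesize_kib is not None:
        cmd += ["--cachesize", str(cachesize_kib)]  # size is given in KiB
    if allow_other:
        cmd.append("--allow-other")
    # Options come first, then the storage URL and the mount point.
    cmd += [storage_url, mountpoint]
    return cmd


# Placeholder values, not a real bucket or path.
command = build_mount_command("s3://my-backup-bla", "/mnt/backup",
                              compress="zlib-6", cachesize_kib=1048576)
print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list could be passed to subprocess.run; it is only printed here so that the sketch has no side effects.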

Compression Algorithms

S3QL supports three compression algorithms: LZMA, Bzip2, and zlib, with LZMA being the default. The compression algorithm can be specified freely whenever the file system is mounted, since it affects only the compression of new data blocks.

Roughly speaking, LZMA is slower but achieves better compression ratios than Bzip2, while Bzip2 in turn is slower but achieves better compression ratios than zlib.
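
This trade-off can be observed with the lzma, bz2, and zlib modules from Python's standard library. The following is only a sketch: the sample data is synthetic, the helper name measure is hypothetical, and the numbers depend on the data and the machine. It is not a substitute for S3QL's own benchmark.py.

```python
import bz2
import lzma
import time
import zlib

# Standard library compressors, keyed by the algorithm names that
# --compress accepts. bz2 only supports levels 1 to 9.
COMPRESSORS = {
    "lzma": lambda data, lvl: lzma.compress(data, preset=lvl),
    "bzip2": lambda data, lvl: bz2.compress(data, compresslevel=max(lvl, 1)),
    "zlib": lambda data, lvl: zlib.compress(data, lvl),
}


def measure(data, level=6):
    """Return {name: (compressed_size, seconds)} for each algorithm."""
    results = {}
    for name, compress in COMPRESSORS.items():
        start = time.perf_counter()
        compressed = compress(data, level)
        results[name] = (len(compressed), time.perf_counter() - start)
    return results


# Synthetic, highly repetitive sample; real data blocks will behave differently.
sample = b"S3QL stores file system data in blocks. " * 20000
for name, (size, seconds) in measure(sample).items():
    print(f"{name}: {len(sample)} -> {size} bytes in {seconds:.3f} s")
```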

For maximum file system performance, the best algorithm therefore depends on your network connection speed: the compression algorithm should be fast enough to saturate your network connection.
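
One way to make this reasoning concrete is a simple throughput model: data moves through the file system either at the compression speed or at the rate at which the network can carry the compressed output, whichever is lower. The sketch below picks the algorithm with the highest modeled throughput. The model, the function names, and all figures are illustrative assumptions, not the logic of any S3QL tool.

```python
def effective_throughput(compress_speed, ratio, upload_speed):
    """Modeled throughput in uncompressed bytes per second.

    compress_speed: uncompressed bytes the compressor consumes per second
    ratio:          compressed size divided by uncompressed size
    upload_speed:   bytes per second the network connection can send
    """
    # The network carries compressed data, so it moves upload_speed / ratio
    # bytes of original data per second.
    return min(compress_speed, upload_speed / ratio)


def pick_algorithm(candidates, upload_speed):
    """Return the name of the candidate with the highest modeled throughput."""
    return max(
        candidates,
        key=lambda name: effective_throughput(*candidates[name], upload_speed),
    )


# Made-up figures: name -> (compress_speed in bytes/s, ratio)
candidates = {
    "lzma": (2e6, 0.30),
    "bzip2": (6e6, 0.35),
    "zlib": (20e6, 0.45),
}
print(pick_algorithm(candidates, upload_speed=0.5e6))  # slow connection
print(pick_algorithm(candidates, upload_speed=50e6))   # fast connection
```

With these invented figures, a slow connection favors the strongest compressor, because the network is the bottleneck, while a fast connection favors the fastest one.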

To find the optimal algorithm and number of parallel compression threads for your system, S3QL ships with a program called benchmark.py in the contrib directory. Run this program on a file whose size is roughly equal to the block size of your file system and whose contents resemble the data you intend to store. It then measures the compression speed of each algorithm and the upload speed of the specified backend, and recommends the best algorithm that is fast enough to saturate your network connection.

Make sure that there is little other system load when you run benchmark.py (i.e., don’t compile software or encode videos at the same time).

Notes about Caching

S3QL maintains a local cache of the file system data to speed up access. The cache is block based, so it is possible that only parts of a file are in the cache.
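
Because the cache works per block, a read only needs the blocks that overlap the requested byte range. The following sketch shows that mapping. The function name and the 10 MiB block size are assumptions chosen for illustration, not values taken from S3QL.

```python
def blocks_for_range(offset, length, block_size):
    """Return the block numbers that overlap [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)


BLOCK_SIZE = 10 * 1024 * 1024  # assumed block size, for illustration only

# A 1 MiB read starting 5 MiB into the file stays inside block 0 ...
print(list(blocks_for_range(5 * 1024 * 1024, 1024 * 1024, BLOCK_SIZE)))
# ... while a 2-byte read across the first block boundary needs blocks 0 and 1.
print(list(blocks_for_range(BLOCK_SIZE - 1, 2, BLOCK_SIZE)))
```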

Maximum Number of Cache Entries

The maximum size of the cache can be configured with the --cachesize option. In addition to that, the maximum number of objects in the cache is limited by the --max-cache-entries option, so it is possible that the cache does not grow up to the maximum cache size because the maximum number of cache elements has been reached. The reason for this limit is that each cache entry requires one open file descriptor, and Linux distributions usually limit the total number of file descriptors per process to about a thousand.

If you specify a value for --max-cache-entries, you should therefore make sure to also configure your system to increase the maximum number of open file handles. This can be done temporarily with the ulimit -n command. The method to permanently change this limit system-wide depends on your distribution.
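
Whether the current limit is high enough can be checked before mounting. The sketch below uses Python's resource module (Unix only) and the rule of thumb from the option list: the limit should be at least the number of cache entries plus 100. The function names and the value 2000 are hypothetical.

```python
import resource


def required_fd_limit(max_cache_entries):
    """Minimum descriptor limit: the number of cache entries plus 100."""
    return max_cache_entries + 100


def limit_is_sufficient(max_cache_entries, soft_limit):
    """Check a soft limit (as reported by ulimit -n) against the requirement."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required_fd_limit(max_cache_entries)


# Query the limits of the current process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
wanted = 2000  # hypothetical value for --max-cache-entries
print(f"soft limit {soft}, need at least {required_fd_limit(wanted)}:",
      "ok" if limit_is_sufficient(wanted, soft) else "too low")
```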

Cache Flushing and Expiration

S3QL flushes changed blocks in the cache to the backend whenever a block has not been accessed for at least 10 seconds. Note that when a block is flushed, it still remains in the cache.

Cache expiration (i.e., removal of blocks from the cache) is only done when the maximum cache size is reached. S3QL always expires the least recently used blocks first.
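
The interplay of flushing and expiration can be modeled in a few lines of Python. This is a toy model under stated assumptions, not S3QL's implementation: the class and method names are hypothetical, time is passed in explicitly, and uploading a changed block before it is expired is an assumption made so that the model never loses data.

```python
from collections import OrderedDict

FLUSH_AFTER = 10  # seconds without access before a changed block is flushed


class BlockCache:
    """Toy model: blocks are (size, dirty, last_access) records in LRU order."""

    def __init__(self, max_size, upload):
        self.max_size = max_size
        self.upload = upload         # callable standing in for the backend
        self.blocks = OrderedDict()  # least recently used block comes first

    def access(self, key, size, now, write=False):
        _, dirty, _ = self.blocks.pop(key, (size, False, now))
        self.blocks[key] = (size, dirty or write, now)  # now most recent
        self.expire()

    def flush(self, now):
        """Upload changed blocks idle for FLUSH_AFTER seconds; keep them cached."""
        for key, (size, dirty, last) in list(self.blocks.items()):
            if dirty and now - last >= FLUSH_AFTER:
                self.upload(key)
                self.blocks[key] = (size, False, last)

    def expire(self):
        """Drop least recently used blocks while the cache exceeds max_size."""
        while sum(size for size, _, _ in self.blocks.values()) > self.max_size:
            key, (_, dirty, _) = self.blocks.popitem(last=False)
            if dirty:
                self.upload(key)  # assumption: never discard unsaved changes


uploaded = []
cache = BlockCache(max_size=2, upload=uploaded.append)
cache.access("a", 1, now=0, write=True)
cache.access("b", 1, now=1)
cache.flush(now=5)            # "a" was touched 5 s ago: nothing is flushed
cache.flush(now=12)           # "a" idle for 12 s: uploaded, but still cached
cache.access("c", 1, now=13)  # cache full: least recently used block "a" is removed
print(uploaded, list(cache.blocks))
```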

Automatic Mounting

If you want to mount and unmount an S3QL file system automatically at system startup and shutdown, you should do so with one dedicated S3QL init script for each S3QL file system.

If your system is using upstart, an appropriate job can be defined as follows (and should be placed in /etc/init/):

description	"S3QL Backup File System"
author		"Nikolaus Rath <Nikolaus@rath.org>"

# This assumes that eth0 provides your internet connection
start on (filesystem and net-device-up IFACE=eth0)

# We can't use "stop on runlevel [016]" because from that point on we
# have only 10 seconds until the system shuts down completely.
stop on starting rc RUNLEVEL=[016]

# Time to wait before sending SIGKILL to the daemon and
# pre-stop script
kill timeout 300

env STORAGE_URL="s3://my-backup-bla"
env MOUNTPOINT="/mnt/backup"

env USER="myusername"
env AUTHFILE="/path/to/authinfo2"

expect stop

script
    # Redirect stdout and stderr into the system log
    DIR=$(mktemp -d)
    mkfifo "$DIR/LOG_FIFO"
    logger -t s3ql -p local0.info < "$DIR/LOG_FIFO" &
    exec > "$DIR/LOG_FIFO"
    exec 2>&1
    rm -rf "$DIR"

    # Check and mount file system
    su -s /bin/sh -c 'exec "$0" "$@"' "$USER" -- \
        fsck.s3ql --batch --authfile "$AUTHFILE" "$STORAGE_URL"
    exec su -s /bin/sh -c 'exec "$0" "$@"' "$USER" -- \
        mount.s3ql --upstart --authfile "$AUTHFILE" "$STORAGE_URL" "$MOUNTPOINT"
end script

pre-stop script
    su -s /bin/sh -c 'exec "$0" "$@"' "$USER" -- umount.s3ql "$MOUNTPOINT"
end script

Note

In principle, it is also possible to automatically mount an S3QL file system with an appropriate entry in /etc/fstab. However, this is not recommended for several reasons:

  • File systems listed in /etc/fstab are unmounted with the umount command, so your system will not wait until all data has been uploaded but will shut down (or restart) immediately (this is a FUSE limitation, see issue #1).
  • There is no way to tell the system that mounting S3QL requires a Python interpreter to be available, so it may attempt to run mount.s3ql before it has mounted the volume containing the Python interpreter.
  • There is no standard way to tell the system that the internet connection has to be up before the S3QL file system can be mounted.