Checking for Errors

It is recommended to periodically run the fsck.s3ql and s3ql_verify commands (in this order) to ensure that the file system is consistent, and that there has been no data corruption or data loss in the storage backend.

fsck.s3ql is intended to detect and correct problems with the internal file system structure caused by, e.g., a file system crash or a bug in S3QL. It assumes that the storage backend can be fully trusted, i.e., if the backend reports that a specific storage object exists, fsck.s3ql takes that as proof that the data is present and intact.

In contrast to that, the s3ql_verify command is intended to check the consistency of the storage backend. It assumes that the internal file system data is correct, and verifies that all data can actually be retrieved from the backend. Running s3ql_verify may therefore take much longer than running fsck.s3ql.
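For example, assuming a file system stored under the hypothetical storage URL s3://mybucket/myfs (substitute your own), such a periodic check could be run as:

fsck.s3ql s3://mybucket/myfs
s3ql_verify s3://mybucket/myfs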

Checking and repairing internal file system errors

fsck.s3ql checks that the internal file system structure is consistent and attempts to correct any problems it finds. If an S3QL file system has not been unmounted correctly for any reason, you need to run fsck.s3ql before you can mount the file system again.

The fsck.s3ql command has the following syntax:

fsck.s3ql [options] <storage url>

This command accepts the following options (an example invocation follows the list):

--log <target>

Destination for log messages. Specify none for standard output or syslog for the system logging daemon. Anything else will be interpreted as a file name. Log files will be rotated when they reach 1 MiB, and at most 5 old log files will be kept. Default: ~/.s3ql/fsck.log

--cachedir <path>

Store cached data in this directory (default: ~/.s3ql)

--debug-modules <modules>

Activate debugging output from specified modules (use commas to separate multiple modules, ‘all’ for everything). Debug messages will be written to the target specified by the --log option.

--debug

Activate debugging output from all S3QL modules. Debug messages will be written to the target specified by the --log option.

--quiet

Be really quiet.

--backend-options <options>

Backend-specific options (separated by commas). See the backend documentation for available options.

--version

Just print program version and exit.

--authfile <path>

Read authentication credentials from this file (default: ~/.s3ql/authinfo2)

--compress <algorithm-lvl>

Compression algorithm and compression level to use when storing new data. algorithm may be any of lzma, bzip2, zlib, or none. lvl may be any integer from 0 (fastest) to 9 (slowest). Default: lzma-6

--keep-cache

Do not purge locally cached files on exit.

--batch

If user input is required, exit without prompting.

--force

Force checking even if file system is marked clean.

--force-remote

Force use of remote metadata even when this would likely result in data loss.
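As an illustration of combining these options (the storage URL and option values are placeholders, not recommendations), a non-interactive check that logs to syslog and runs even on a file system marked clean might look like:

fsck.s3ql --batch --force --log syslog s3://mybucket/myfs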

Detecting and handling backend data corruption

The s3ql_verify command verifies all data in the file system. In contrast to fsck.s3ql, s3ql_verify does not trust the object listing returned by the backend, but actually attempts to retrieve every object. By default, s3ql_verify will attempt to retrieve just the metadata for every object (e.g., for the S3-compatible or Google Storage backends this corresponds to a HEAD request for each object), which is generally sufficient to determine if the object still exists. When specifying the --data option, s3ql_verify will instead read every object entirely. To determine how much data will be transmitted in total when using --data, look at the "After compression" row in the s3qlstat output.
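For example, with the file system mounted at a hypothetical mount point /mnt/s3ql, that figure can be read from:

s3qlstat /mnt/s3ql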

s3ql_verify is not able to correct any data corruption that it finds. Instead, it writes a list of the corrupted and/or missing objects to a file and leaves the decision about the proper course of action to the user. If you have administrative access to the backend server, you may want to investigate the cause of the corruption or check whether the missing/corrupted objects can be restored from backups.

If you believe that the missing/corrupted objects are indeed lost irrevocably, you can use the remove_objects.py script (from the contrib directory of the S3QL distribution) to explicitly delete the objects from the storage backend. After that, you should run fsck.s3ql. Since the (now explicitly deleted) objects will no longer be included in the object index reported by the backend, fsck.s3ql will identify them as missing, update the internal file system structures accordingly, and move the affected files into the lost+found directory.
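A possible sequence for this recovery procedure is sketched below (the file names and storage URL are placeholders; check the --help output of remove_objects.py for its exact invocation):

s3ql_verify --missing-file missing.txt --corrupted-file corrupted.txt s3://mybucket/myfs
# ...investigate; if the listed objects are irrevocably lost,
# delete them with contrib/remove_objects.py, then:
fsck.s3ql s3://mybucket/myfs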

The s3ql_verify command has the following syntax:

s3ql_verify [options] <storage url>

This command accepts the following options (an example invocation follows the list):

--log <target>

Destination for log messages. Specify none for standard output or syslog for the system logging daemon. Anything else will be interpreted as a file name. Log files will be rotated when they reach 1 MiB, and at most 5 old log files will be kept. Default: None

--debug-modules <modules>

Activate debugging output from specified modules (use commas to separate multiple modules, ‘all’ for everything). Debug messages will be written to the target specified by the --log option.

--debug

Activate debugging output from all S3QL modules. Debug messages will be written to the target specified by the --log option.

--quiet

Be really quiet.

--version

Just print program version and exit.

--cachedir <path>

Store cached data in this directory (default: ~/.s3ql)

--backend-options <options>

Backend-specific options (separated by commas). See the backend documentation for available options.

--authfile <path>

Read authentication credentials from this file (default: ~/.s3ql/authinfo2)

--missing-file <name>

File to store keys of missing objects.

--corrupted-file <name>

File to store keys of corrupted objects.

--data

Read every object completely, instead of checking just the metadata.

--parallel <n>

Number of connections to use in parallel.

--start-with <n>

Skip over the first <n> objects and start verifying with object <n>+1.
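For instance (again with placeholder values), a full data verification using ten parallel connections could be run as:

s3ql_verify --data --parallel 10 s3://mybucket/myfs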