OCFS2 online file check
-----------------------

This document describes the OCFS2 online file check feature.

Introduction
============
OCFS2 is often used in high-availability systems. However, OCFS2 usually
converts the filesystem to read-only when it encounters an error. This may not
be necessary, since turning the filesystem read-only affects other running
processes as well, decreasing availability.
To address this, a mount option (errors=continue) was introduced. With it, the
-EIO errno is returned to the calling process and further processing is
terminated so that the filesystem is not corrupted further. The filesystem is
not converted to read-only, and the problematic file's inode number is reported
in the kernel log. The user can then try to check/fix this file via the online
filecheck feature.
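As a rough illustration of the mount option described above, the sketch below
checks whether a given mount point is an OCFS2 mount using errors=continue.
The helper name ocfs2_errors_continue is hypothetical, and the sketch assumes
that the option is echoed back in the option field of the mount table; verify
this on your kernel before relying on it.

```shell
# Typical mount invocation (placeholders, not real names):
#   mount -t ocfs2 -o errors=continue <device> <mountpoint>

# Succeed if <mountpoint> is listed as an ocfs2 mount with errors=continue.
# $1 = mount point, $2 = mount table to read (defaults to /proc/mounts).
# Assumption: the option appears in the fourth (options) field of the table.
ocfs2_errors_continue() {
    awk -v mp="$1" '
        $2 == mp && $3 == "ocfs2" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "errors=continue")
                    ok = 1
        }
        END { exit(ok ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}
```

The mount table is passed as an optional second argument so that the helper
can be tried against a saved copy of the table instead of the live system.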

Scope
=====
This effort is to check/fix small issues which may hinder day-to-day operations
of a cluster filesystem by turning the filesystem read-only. The scope of
checking/fixing is at the file level, initially for regular files and eventually
for all files (including system files) of the filesystem.

If the directory-to-file links are incorrect, the directory inode is
reported as erroneous.

This feature is not suited for extensive checks which depend on other
components of the filesystem, such as, but not limited to, checking whether the
bits for file blocks in the allocation have been set. For such errors,
the offline fsck is recommended.

Finally, such an operation/feature should not be automated, lest the filesystem
end up with more damage than before the repair attempt. So, this has to
be performed with user interaction and consent.

User interface
==============
When there are errors in the OCFS2 filesystem, they are usually accompanied
by the inode number which caused the error. This inode number is the
input to check/fix the file.

There is a sysfs directory for each mounted OCFS2 filesystem:

  /sys/fs/ocfs2/<devname>/filecheck

Here, <devname> indicates the name of the OCFS2 volume device which has already
been mounted. The files in the directory above are used to communicate with
kernel space, telling it which file (inode number) is to be checked or fixed.
Currently, three operations are supported: checking an inode, fixing an inode,
and setting the size of the result record history.

1. If you want to know exactly what error happened to <inode> before fixing, do

  # echo "<inode>" > /sys/fs/ocfs2/<devname>/filecheck/check
  # cat /sys/fs/ocfs2/<devname>/filecheck/check

The output is like this:
  INO     DONE    ERROR
  39502   1       GENERATION

<INO> lists the inode numbers.
<DONE> indicates whether the operation has been finished.
<ERROR> says what kind of error was found. For the detailed error numbers,
please refer to the file linux/fs/ocfs2/filecheck.h.

2. If you decide to fix this inode, do

  # echo "<inode>" > /sys/fs/ocfs2/<devname>/filecheck/fix
  # cat /sys/fs/ocfs2/<devname>/filecheck/fix

The output is like this:
  INO     DONE    ERROR
  39502   1       SUCCESS

This time, the <ERROR> column indicates whether the fix was successful or not.

3. The record cache is used to store the history of check/fix results. Its
default size is 10, and it can be adjusted within the range of 10 to 100. You
can adjust the size like this:

  # echo "<size>" > /sys/fs/ocfs2/<devname>/filecheck/set
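
The operations above can be wrapped in small shell helpers. The following is
only a sketch: the function names filecheck_error_for and filecheck_set_size
are hypothetical, the parser assumes the three-column INO/DONE/ERROR layout
shown above, and the size check simply mirrors the documented 10 to 100 range
on the user side.

```shell
# Print the ERROR column of the last row matching inode $1, reading
# "INO DONE ERROR" rows on stdin. Fails if no row matches.
filecheck_error_for() {
    awk -v ino="$1" '
        $1 == ino { err = $3; found = 1 }
        END { if (found) print err; exit(found ? 0 : 1) }
    '
}

# Validate <size> against the documented range before writing it to "set".
# $1 = filecheck directory (e.g. /sys/fs/ocfs2/<devname>/filecheck), $2 = size.
filecheck_set_size() {
    dir=$1
    size=$2
    case $size in
        ''|*[!0-9]*)
            echo "size must be a positive integer" >&2
            return 2 ;;
    esac
    if [ "$size" -lt 10 ] || [ "$size" -gt 100 ]; then
        echo "size must be between 10 and 100" >&2
        return 2
    fi
    printf '%s\n' "$size" > "$dir/set"
}
```

A possible use, with <devname> left as a placeholder:

  # filecheck_error_for 39502 < /sys/fs/ocfs2/<devname>/filecheck/check
  # filecheck_set_size /sys/fs/ocfs2/<devname>/filecheck 50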

Fixing stuff
============
On receiving the inode, the filesystem reads the inode and the
file metadata. In case of errors, the filesystem fixes the errors
and reports the problems it fixed in the kernel log. As a precautionary measure,
the inode must first be checked for errors before performing a final fix.
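
The check-before-fix order, together with the requirement for user consent,
could be scripted roughly as follows. This is a sketch, not part of OCFS2: the
function name filecheck_check_then_fix and its return codes are hypothetical,
and it only sequences the check and fix writes documented above.

```shell
# $1 = filecheck directory (e.g. /sys/fs/ocfs2/<devname>/filecheck)
# $2 = inode number; the operator's answer is read from stdin.
filecheck_check_then_fix() {
    dir=$1
    ino=$2

    # Step 1: request a check and show the result table to the operator.
    printf '%s\n' "$ino" > "$dir/check" || return 1
    cat "$dir/check"

    # Step 2: proceed to the fix only with explicit consent.
    printf 'Fix inode %s? [y/N] ' "$ino" >&2
    read -r answer || answer=
    case $answer in
        y|Y) ;;
        *)
            echo "not fixing inode $ino" >&2
            return 3 ;;
    esac

    # Step 3: request the fix and show its result table.
    printf '%s\n' "$ino" > "$dir/fix" || return 1
    cat "$dir/fix"
}
```

Anything other than "y" or "Y" leaves the fix file untouched, so the default
is to do nothing.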

The inode and the result history are maintained temporarily in a
small linked list buffer which contains the last (N) inodes
fixed/checked. The detailed errors which were fixed/checked are printed in the
kernel log.
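
The idea of keeping only the last (N) records can be illustrated in user space
with a bounded history file. This is an analogy only, not the kernel's linked
list: the function name history_push is hypothetical, and dropping the oldest
record first is an assumption that this document does not state.

```shell
# Append record $3 to history file $1, then keep only the newest $2 lines.
# Assumption: the oldest records are the ones discarded when the limit is hit.
history_push() {
    file=$1
    max=$2
    printf '%s\n' "$3" >> "$file"
    tail -n "$max" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

With a limit of 10, pushing twelve records leaves the third through the
twelfth in the file.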