Miklos Szeredi, 407e6a7, 2005-03-25 12:19:43 +0000

The following diagram shows how a filesystem operation (in this
example unlink) is performed in FUSE.

NOTE: everything in this description is greatly simplified

 |  "rm /mnt/fuse/file"               |  FUSE filesystem daemon
 |                                    |
 |                                    |  >sys_read()
 |                                    |    >fuse_dev_read()
 |                                    |      >request_wait()
 |                                    |        [sleep on fc->waitq]
 |                                    |
 |  >sys_unlink()                     |
 |    >fuse_unlink()                  |
 |      [get request from             |
 |       fc->unused_list]             |
 |      >request_send()               |
 |        [queue req on fc->pending]  |
 |        [wake up fc->waitq]         |        [woken up]
 |        >request_wait_answer()      |
 |          [sleep on req->waitq]     |
 |                                    |      <request_wait()
 |                                    |      [remove req from fc->pending]
 |                                    |      [copy req to read buffer]
 |                                    |      [add req to fc->processing]
 |                                    |    <fuse_dev_read()
 |                                    |  <sys_read()
 |                                    |
 |                                    |  [perform unlink]
 |                                    |
 |                                    |  >sys_write()
 |                                    |    >fuse_dev_write()
 |                                    |      [look up req in fc->processing]
 |                                    |      [remove from fc->processing]
 |                                    |      [copy write buffer to req]
 |          [woken up]                |      [wake up req->waitq]
 |                                    |    <fuse_dev_write()
 |                                    |  <sys_write()
 |        <request_wait_answer()      |
 |      <request_send()               |
 |      [add request to               |
 |       fc->unused_list]             |
 |    <fuse_unlink()                  |
 |  <sys_unlink()                     |

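The life cycle of a request in the diagram above can be sketched as a
small single-threaded C model. This is only an illustration under
stated assumptions: every name in it (sketch_conn, sketch_request_send
and so on) is hypothetical and not a kernel interface, the unused,
pending and processing lists are collapsed into one state field per
request, and sleeping and waking up are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one state field stands in for list membership. */
enum sketch_state { REQ_UNUSED, REQ_PENDING, REQ_PROCESSING, REQ_ANSWERED };

struct sketch_req {
    unsigned id;              /* lets the daemon's reply name its request */
    enum sketch_state state;
    int error;                /* result copied from the daemon's reply */
};

#define SKETCH_NREQ 4

struct sketch_conn {
    struct sketch_req reqs[SKETCH_NREQ];
};

static void sketch_conn_init(struct sketch_conn *fc)
{
    for (unsigned i = 0; i < SKETCH_NREQ; i++) {
        fc->reqs[i].id = i + 1;   /* id 0 is reserved for "no request" */
        fc->reqs[i].state = REQ_UNUSED;
        fc->reqs[i].error = 0;
    }
}

/* Return the first request in the given state, or NULL if there is none. */
static struct sketch_req *sketch_find(struct sketch_conn *fc,
                                      enum sketch_state state)
{
    for (unsigned i = 0; i < SKETCH_NREQ; i++)
        if (fc->reqs[i].state == state)
            return &fc->reqs[i];
    return NULL;
}

/* Caller side: take an unused request and queue it as pending. */
static struct sketch_req *sketch_request_send(struct sketch_conn *fc)
{
    struct sketch_req *req = sketch_find(fc, REQ_UNUSED);
    if (req)
        req->state = REQ_PENDING;
    return req;
}

/* Daemon side, read(): move a pending request to processing.
 * Returns the request id, or 0 when nothing is pending. */
static unsigned sketch_dev_read(struct sketch_conn *fc)
{
    struct sketch_req *req = sketch_find(fc, REQ_PENDING);
    if (!req)
        return 0;
    req->state = REQ_PROCESSING;
    return req->id;
}

/* Daemon side, write(): look the request up by id among those being
 * processed, store the result and mark it answered.
 * Returns 0 on success, -1 if no such request is being processed. */
static int sketch_dev_write(struct sketch_conn *fc, unsigned id, int error)
{
    for (unsigned i = 0; i < SKETCH_NREQ; i++) {
        struct sketch_req *req = &fc->reqs[i];
        if (req->state == REQ_PROCESSING && req->id == id) {
            req->error = error;
            req->state = REQ_ANSWERED;
            return 0;
        }
    }
    return -1;
}

/* Caller side: collect the result and put the request back as unused. */
static int sketch_request_finish(struct sketch_req *req)
{
    assert(req->state == REQ_ANSWERED);
    int error = req->error;
    req->state = REQ_UNUSED;
    return error;
}
```

Calling sketch_request_send(), sketch_dev_read(), sketch_dev_write()
and sketch_request_finish() in that order walks one request through
the same four stages as the diagram. The model scans a fixed array, so
unlike a real queue it makes no promise about the order in which
pending requests are handed out.
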
There are a couple of ways to deadlock a FUSE filesystem. Since the
filesystem daemon is an unprivileged userspace program, something
must be done about these.

Scenario 1 - Simple deadlock
-----------------------------

 |  "rm /mnt/fuse/file"               |  FUSE filesystem daemon
 |                                    |
 |  >sys_unlink("/mnt/fuse/file")     |
 |    [acquire inode semaphore        |
 |     for "file"]                    |
 |    >fuse_unlink()                  |
 |      [sleep on req->waitq]         |
 |                                    |  <sys_read()
 |                                    |  >sys_unlink("/mnt/fuse/file")
 |                                    |    [acquire inode semaphore
 |                                    |     for "file"]
 |                                    |    *DEADLOCK*

The solution for this is to allow requests to be interrupted while
they are in userspace:

 |      [interrupted by signal]       |
 |    <fuse_unlink()                  |
 |    [release semaphore]             |    [semaphore acquired]
 |  <sys_unlink()                     |
 |                                    |    >fuse_unlink()
 |                                    |      [queue req on fc->pending]
 |                                    |      [wake up fc->waitq]
 |                                    |      [sleep on req->waitq]

If the filesystem daemon is single-threaded, it stops here, since
there is no other thread to dequeue and execute the request. In this
case the solution is to kill the FUSE daemon as well. If there are
multiple serving threads, you just have to keep killing them until
none remain.

Moral: a filesystem which deadlocks can soon find itself dead.

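The effect of interruption in Scenario 1 can be sketched with a small
C model. It is a sketch under stated assumptions, not the kernel's
code: the names (sketch_inode, sketch_unlink_begin, sketch_unlink_wait)
are hypothetical, the inode semaphore is reduced to a flag, and
blocking is represented by return values (-EBUSY where a real caller
would block on the semaphore, -EAGAIN where it would stay asleep on
req->waitq).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the inode semaphore for "file". */
struct sketch_inode {
    bool sem_held;
};

/* Hypothetical model of what the sleeping caller can observe. */
struct sketch_wait {
    bool answered;        /* the daemon has written a reply */
    bool signal_pending;  /* a signal was sent to the caller */
};

/* Enter unlink: take the semaphore.
 * Returns -EBUSY where the real code would block. */
static int sketch_unlink_begin(struct sketch_inode *inode)
{
    if (inode->sem_held)
        return -EBUSY;
    inode->sem_held = true;
    return 0;
}

/* One check of the wait condition by a caller that holds the semaphore.
 * Returns -EAGAIN while the caller would remain asleep. An answer or a
 * signal ends the wait, and the semaphore is released on the way out. */
static int sketch_unlink_wait(struct sketch_inode *inode,
                              const struct sketch_wait *w)
{
    assert(inode->sem_held);
    if (!w->answered && !w->signal_pending)
        return -EAGAIN;
    inode->sem_held = false;
    return w->answered ? 0 : -EINTR;
}
```

While the first caller is still waiting, a second sketch_unlink_begin()
on the same inode returns -EBUSY, which is the deadlock. Once
signal_pending is set, sketch_unlink_wait() returns -EINTR and drops
the semaphore, so the second caller can proceed.
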
Scenario 2 - Tricky deadlock
----------------------------

This one needs a carefully crafted filesystem. It is a variation on
the above, except that the call back into the filesystem is not
explicit but is caused by a page fault.

 |  Kamikaze filesystem thread 1      |  Kamikaze filesystem thread 2
 |                                    |
 |  [fd = open("/mnt/fuse/file")]     |  [request served normally]
 |  [mmap fd to 'addr']               |
 |  [close fd]                        |  [FLUSH triggers 'magic' flag]
 |  [read a byte from addr]           |
 |    >do_page_fault()                |
 |      [find or create page]         |
 |      [lock page]                   |
 |      >fuse_readpage()              |
 |        [queue READ request]        |
 |        [sleep on req->waitq]       |
 |                                    |  [read request to buffer]
 |                                    |  [create reply header before addr]
 |                                    |  >sys_write(addr - headerlength)
 |                                    |    >fuse_dev_write()
 |                                    |      [look up req in fc->processing]
 |                                    |      [remove from fc->processing]
 |                                    |      [copy write buffer to req]
 |                                    |        >do_page_fault()
 |                                    |          [find or create page]
 |                                    |          [lock page]
 |                                    |        * DEADLOCK *

The solution is again to let the request be interrupted (not
elaborated further).

An additional problem is that while the write buffer is being
copied to the request, the request must not be interrupted. This
is because the destination address of the copy may not be valid
after the request is interrupted.

This is solved by doing the copy atomically, and allowing
interruption only while the page(s) belonging to the write buffer
are being faulted in with get_user_pages(). The 'req->locked' flag
indicates when the copy is taking place, and interruption is delayed
until this flag is unset.

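The role of the 'locked' flag can be sketched as follows. This is a
minimal model under stated assumptions, not the kernel's
implementation: the struct and the three function names are
hypothetical, and the two flags are plain booleans with no locking
around them, which real concurrent code would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two flags involved. */
struct sketch_copy_req {
    bool locked;       /* set while the buffer copy is in progress */
    bool interrupted;  /* set once an interruption has been requested */
};

/* Called just before the copy starts.
 * Returns false if the request was already interrupted, in which case
 * the copy must not start at all. */
static bool sketch_lock_request(struct sketch_copy_req *req)
{
    if (req->interrupted)
        return false;
    req->locked = true;
    return true;
}

/* Called right after the copy ends.
 * Returns true if an interruption arrived during the copy and now
 * has to be acted upon. */
static bool sketch_unlock_request(struct sketch_copy_req *req)
{
    assert(req->locked);
    req->locked = false;
    return req->interrupted;
}

/* Called when the waiting caller is interrupted.
 * Returns true if the request may be given up immediately, false if
 * that has to wait until the copy has finished. */
static bool sketch_interrupt_request(struct sketch_copy_req *req)
{
    req->interrupted = true;
    return !req->locked;
}
```

An interruption that arrives between sketch_lock_request() and
sketch_unlock_request() is only recorded. It takes effect when
sketch_unlock_request() reports it, so the copy never runs against a
destination that has already been given up.
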