=========
Livepatch
=========

This document outlines basic information about kernel livepatching.

Table of Contents:

1. Motivation
2. Kprobes, Ftrace, Livepatching
3. Consistency model
4. Livepatch module
   4.1. New functions
   4.2. Metadata
   4.3. Livepatch module handling
5. Livepatch life-cycle
   5.1. Registration
   5.2. Enabling
   5.3. Disabling
   5.4. Unregistration
6. Sysfs
7. Limitations


1. Motivation
=============

There are many situations where users are reluctant to reboot a system. It may
be because their system is performing complex scientific computations or is
under heavy load during peak usage. In addition to keeping systems up and
running, users also want a stable and secure system. Livepatching gives users
both by allowing function calls to be redirected, fixing critical functions
without a system reboot.



2. Kprobes, Ftrace, Livepatching
================================

There are multiple mechanisms in the Linux kernel that are directly related
to redirection of code execution; namely: kernel probes, function tracing,
and livepatching:

  + Kernel probes are the most generic. The code can be redirected by
    placing a breakpoint instruction in place of almost any instruction.

  + The function tracer calls the code from a predefined location that is
    close to the function entry point. This location is generated by the
    compiler using the '-pg' gcc option.

  + Livepatching typically needs to redirect the code at the very beginning
    of the function entry, before the function parameters or the stack
    are modified in any way.

All three approaches need to modify the existing code at runtime. Therefore
they need to be aware of each other and must not step on each other's toes.
Most of these problems are solved by using the dynamic ftrace framework as
a base. A kprobe is registered as a ftrace handler when the function entry
is probed, see CONFIG_KPROBES_ON_FTRACE. Also, an alternative function from
a live patch is called with the help of a custom ftrace handler. But there are
some limitations, see below.


3. Consistency model
====================

Functions are there for a reason. They take some input parameters, get or
release locks, read, process, and even write some data in a defined way,
and have return values. In other words, each function has defined semantics.

Many fixes do not change the semantics of the modified functions. For
example, they add a NULL pointer or a boundary check, fix a race by adding
a missing memory barrier, or add some locking around a critical section.
Most of these changes are self contained and the function presents itself
the same way to the rest of the system. In this case, the functions might
be updated independently one by one.

But there are more complex fixes. For example, a patch might change
ordering of locking in multiple functions at the same time. Or a patch
might exchange the meaning of some temporary structures and update
all the relevant functions. In this case, the affected unit
(thread, whole kernel) needs to start using all new versions of
the functions at the same time. Also the switch must happen only
when it is safe to do so, e.g. when the affected locks are released
or no data are stored in the modified structures at the moment.

The theory of how to apply functions in a safe way is rather complex.
The aim is to define a so-called consistency model. It attempts to define
conditions when the new implementation could be used so that the system
stays consistent.

Livepatch has a consistency model which is a hybrid of kGraft and
kpatch: it uses kGraft's per-task consistency and syscall barrier
switching combined with kpatch's stack trace switching. There are also
a number of fallback options which make it quite flexible.

Patches are applied on a per-task basis, when the task is deemed safe to
switch over. When a patch is enabled, livepatch enters into a
transition state where tasks are converging to the patched state.
Usually this transition state can complete in a few seconds. The same
sequence occurs when a patch is disabled, except the tasks converge from
the patched state to the unpatched state.

An interrupt handler inherits the patched state of the task it
interrupts. The same is true for forked tasks: the child inherits the
patched state of the parent.

Livepatch uses several complementary approaches to determine when it's
safe to patch tasks:

1. The first and most effective approach is stack checking of sleeping
   tasks. If no affected functions are on the stack of a given task,
   the task is patched. In most cases this will patch most or all of
   the tasks on the first try. Otherwise it'll keep trying
   periodically. This option is only available if the architecture has
   reliable stacks (HAVE_RELIABLE_STACKTRACE).

2. The second approach, if needed, is kernel exit switching. A
   task is switched when it returns to user space from a system call, a
   user space IRQ, or a signal. It's useful in the following cases:

   a) Patching I/O-bound user tasks which are sleeping on an affected
      function. In this case you have to send SIGSTOP and SIGCONT to
      force it to exit the kernel and be patched.
   b) Patching CPU-bound user tasks. If the task is highly CPU-bound
      then it will get patched the next time it gets interrupted by an
      IRQ.

3. For idle "swapper" tasks, since they don't ever exit the kernel, they
   instead have a klp_update_patch_state() call in the idle loop which
   allows them to be patched before the CPU enters the idle state.

   (Note there's not yet such an approach for kthreads.)

Architectures which don't have HAVE_RELIABLE_STACKTRACE rely solely on
the second approach. It's highly likely that some tasks may still be
running with an old version of a function, until that function
returns. In this case you would have to signal the tasks. This
especially applies to kthreads. They may not be woken up and would need
to be forced. See below for more information.

Unless we can come up with another way to patch kthreads, architectures
without HAVE_RELIABLE_STACKTRACE are not considered fully supported by
kernel livepatching.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition. Only a single patch (the topmost patch on the stack)
can be in transition at a given time. A patch can remain in transition
indefinitely, if any of the tasks are stuck in the initial patch state.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress. Then all the tasks will attempt to
converge back to the original patch state.

There's also a /proc/<pid>/patch_state file which can be used to
determine which tasks are blocking completion of a patching operation.
If a patch is in transition, this file shows 0 to indicate the task is
unpatched and 1 to indicate it's patched. Otherwise, if no patch is in
transition, it shows -1. Any tasks which are blocking the transition
can be signaled with SIGSTOP and SIGCONT to force them to change their
patched state. This may be harmful to the system though. The
/sys/kernel/livepatch/<patch>/signal attribute provides a better
alternative. Writing 1 to the attribute sends a fake signal to all remaining
blocking tasks. No proper signal is actually delivered (there is no data in
the signal pending structures). Tasks are interrupted or woken up, and forced
to change their patched state.
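
As an illustration, the files above can be inspected from a shell on a
running kernel. This is only a sketch: 'livepatch_sample' is a placeholder
patch name, and the commands make sense only while a patch is in transition:

  # Check whether the patch is still in transition ('1' while in progress)
  cat /sys/kernel/livepatch/livepatch_sample/transition

  # List tasks that are still unpatched (patch_state shows 0)
  grep -l '^0$' /proc/[0-9]*/patch_state 2>/dev/null

  # Nudge the remaining blocking tasks with a fake signal
  echo 1 > /sys/kernel/livepatch/livepatch_sample/signal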

The administrator can also affect a transition through the
/sys/kernel/livepatch/<patch>/force attribute. Writing 1 there clears the
TIF_PATCH_PENDING flag of all tasks and thus forces them to the patched
state. Important note! The force attribute is intended for cases when the
transition gets stuck for a long time because of blocking tasks. The
administrator is expected to collect all necessary data (namely stack traces
of such blocking tasks) and request clearance from the patch distributor to
force the transition. Unauthorized usage may cause harm to the system. It
depends on the nature of the patch, which functions are (un)patched, and
which functions the blocking tasks are sleeping in (/proc/<pid>/stack may
help here). Removal (rmmod) of patch modules is permanently disabled when
the force feature is used, because it cannot be guaranteed that no task is
still sleeping in such a module. This implies an unbounded reference count
if a patch module is disabled and enabled in a loop.

Moreover, the usage of force may also affect future applications of live
patches and cause even more harm to the system. The administrator should
first consider simply cancelling the transition (see above). If force is
used, a reboot should be planned and no more live patches applied.

3.1 Adding consistency model support to new architectures
---------------------------------------------------------

For adding consistency model support to new architectures, there are a
few options:

1) Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and
   for non-DWARF unwinders, also making sure there's a way for the stack
   tracing code to detect interrupts on the stack.

2) Alternatively, ensure that every kthread has a call to
   klp_update_patch_state() in a safe location. Kthreads are typically
   in an infinite loop which does some action repeatedly. The safe
   location to switch the kthread's patch state would be at a designated
   point in the loop where there are no locks taken and all data
   structures are in a well-defined state.

   The location is clear when using workqueues or the kthread worker
   API. These kthreads process independent actions in a generic loop.

   It's much more complicated with kthreads which have a custom loop.
   There the safe location must be carefully selected on a case-by-case
   basis.

   In that case, arches without HAVE_RELIABLE_STACKTRACE would still be
   able to use the non-stack-checking parts of the consistency model:

   a) patching user tasks when they cross the kernel/user space
      boundary; and

   b) patching kthreads and idle tasks at their designated patch points.

   This option isn't as good as option 1 because it requires signaling
   user tasks and waking kthreads to patch them. But it could still be
   a good backup option for those architectures which don't have
   reliable stack traces yet.


4. Livepatch module
===================

Livepatches are distributed using kernel modules, see
samples/livepatch/livepatch-sample.c.

The module includes a new implementation of the functions that we want
to replace. In addition, it defines some structures describing the
relation between the original and the new implementation. Then there
is code that makes the kernel start using the new code when the livepatch
module is loaded. Also there is code that cleans up before the
livepatch module is removed. All this is explained in more detail in
the next sections.


4.1. New functions
------------------

New versions of functions are typically just copied from the original
sources. A good practice is to add a prefix to their names so that they
can be distinguished from the original ones, e.g. in a backtrace. Also
they can be declared as static because they are not called directly
and do not need global visibility.

The patch contains only functions that are really modified. But they
might want to access functions or data from the original source file
that may only be locally accessible. This can be solved by a special
relocation section in the generated livepatch module, see
Documentation/livepatch/module-elf-format.txt for more details.


4.2. Metadata
-------------

The patch is described by several structures that split the information
into three levels:

  + struct klp_func is defined for each patched function. It describes
    the relation between the original and the new implementation of a
    particular function.

    The structure includes the name, as a string, of the original function.
    The function address is found via kallsyms at runtime.

    Then it includes the address of the new function. It is defined
    directly by assigning the function pointer. Note that the new
    function is typically defined in the same source file.

    As an optional parameter, the symbol position in the kallsyms database
    can be used to disambiguate functions of the same name. This is not the
    absolute position in the database, but rather the order in which the
    symbol was found within a particular object (vmlinux or a kernel
    module). Note that kallsyms allows for searching symbols according to
    the object name.

  + struct klp_object defines an array of patched functions (struct
    klp_func) in the same object, where the object is either vmlinux
    (NULL) or a module name.

    The structure helps to group and handle functions for each object
    together. Note that patched modules might be loaded later than
    the patch itself and the relevant functions might be patched
    only when they become available.

  + struct klp_patch defines an array of patched objects (struct
    klp_object).

    This structure handles all patched functions consistently and
    eventually, synchronously. The whole patch is applied only when all
    patched symbols are found. The only exception is symbols from objects
    (kernel modules) that have not been loaded yet.

    For more details on how the patch is applied on a per-task basis,
    see the "Consistency model" section.
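
Put together, the three levels look like the following sketch, modeled on
the samples/livepatch/livepatch-sample.c module cited above. The patched
function (cmdline_proc_show) and its replacement are illustrative, not a
recommendation of what to patch:

  #include <linux/kernel.h>
  #include <linux/module.h>
  #include <linux/seq_file.h>
  #include <linux/livepatch.h>

  /* New implementation: static, with a prefix for backtraces */
  static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
  {
  	seq_printf(m, "%s\n", "this has been live patched");
  	return 0;
  }

  static struct klp_func funcs[] = {
  	{
  		.old_name = "cmdline_proc_show",  /* resolved via kallsyms */
  		.new_func = livepatch_cmdline_proc_show,
  	}, { }
  };

  static struct klp_object objs[] = {
  	{
  		/* .name left NULL means the object is vmlinux */
  		.funcs = funcs,
  	}, { }
  };

  static struct klp_patch patch = {
  	.mod = THIS_MODULE,
  	.objs = objs,
  };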


4.3. Livepatch module handling
------------------------------

The usual behavior is that the new functions will get used when
the livepatch module is loaded. For this, the module init() function
has to register the patch (struct klp_patch) and enable it. See the
section "Livepatch life-cycle" below for more details about these
two operations.
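
A minimal init/exit pair, again following the sample module and assuming a
struct klp_patch named 'patch' as described in the previous section (a
sketch, not an authoritative implementation):

  static int livepatch_init(void)
  {
  	int ret;

  	/* Make the patch known to the livepatch core */
  	ret = klp_register_patch(&patch);
  	if (ret)
  		return ret;

  	/* Start redirecting the patched functions */
  	ret = klp_enable_patch(&patch);
  	if (ret) {
  		WARN_ON(klp_unregister_patch(&patch));
  		return ret;
  	}
  	return 0;
  }

  static void livepatch_exit(void)
  {
  	/* Disabling is done via sysfs; only unregister here */
  	WARN_ON(klp_unregister_patch(&patch));
  }

  module_init(livepatch_init);
  module_exit(livepatch_exit);
  MODULE_LICENSE("GPL");
  MODULE_INFO(livepatch, "Y");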

Module removal is only safe when there are no users of the underlying
functions. This is the reason why the force feature permanently disables
removal. The forced tasks entered the functions, but we cannot say
whether they have returned. Therefore it cannot be decided when the
livepatch module can be safely removed. When the system is successfully
transitioned to a new patch state (patched/unpatched) without being
forced, it is guaranteed that no task sleeps or runs in the old code.


5. Livepatch life-cycle
=======================

Livepatching defines four basic operations that define the life cycle of each
live patch: registration, enabling, disabling and unregistration. There are
several reasons why it is done this way.

First, the patch is applied only when all patched symbols for already
loaded objects are found. The error handling is much easier if this
check is done before particular functions get redirected.

Second, with the hybrid consistency model being used, it might take some
time until the entire system is migrated. The patch revert might block
the livepatch module removal for too long. Therefore it is useful to
revert the patch using a separate operation that might be called
explicitly. But it does not make sense to remove all information until
the livepatch module is really removed.


5.1. Registration
-----------------

Each patch first has to be registered using klp_register_patch(). This makes
the patch known to the livepatch framework. It also performs some preliminary
computations and checks.

In particular, the patch is added into the list of known patches. The
addresses of the patched functions are found according to their names.
The special relocations, mentioned in the section "New functions", are
applied. The relevant entries are created under
/sys/kernel/livepatch/<name>. The patch is rejected when any operation
fails.


5.2. Enabling
-------------

Registered patches might be enabled either by calling klp_enable_patch() or
by writing '1' to /sys/kernel/livepatch/<name>/enabled. The system will
start using the new implementation of the patched functions at this stage.

When a patch is enabled, livepatch enters into a transition state where
tasks are converging to the patched state. This is indicated by a value
of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks have
been patched, the 'transition' value changes to '0'. For more
information about this process, see the "Consistency model" section.

If an original function is patched for the first time, a function-specific
struct klp_ops is created and a universal ftrace handler is
registered.

Functions might be patched multiple times. The ftrace handler is registered
only once for the given function. Further patches just add an entry to the
list (see field `func_stack`) of the struct klp_ops. The last added
entry is chosen by the ftrace handler and becomes the active function
replacement.

Note that the patches might be enabled in a different order than they were
registered.


5.3. Disabling
--------------

Enabled patches might get disabled either by calling klp_disable_patch() or
by writing '0' to /sys/kernel/livepatch/<name>/enabled. At this stage
either the code from the previously enabled patch or even the original
code gets used.

When a patch is disabled, livepatch enters into a transition state where
tasks are converging to the unpatched state. This is indicated by a
value of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks
have been unpatched, the 'transition' value changes to '0'. For more
information about this process, see the "Consistency model" section.

Here all the functions (struct klp_func) associated with the to-be-disabled
patch are removed from the corresponding struct klp_ops. The ftrace handler
is unregistered and the struct klp_ops is freed when the func_stack list
becomes empty.

Patches must be disabled in exactly the reverse order in which they were
enabled. This makes both the problem and the implementation much easier.


5.4. Unregistration
-------------------

Disabled patches might be unregistered by calling klp_unregister_patch().
This can be done only when the patch is disabled and the code is no longer
used. It must be called before the livepatch module gets unloaded.

At this stage, all the relevant sysfs entries are removed and the patch
is removed from the list of known patches.


6. Sysfs
========

Information about the registered patches can be found under
/sys/kernel/livepatch. The patches can be enabled and disabled
by writing there.

The /sys/kernel/livepatch/<patch>/signal and /sys/kernel/livepatch/<patch>/force
attributes allow the administrator to affect a patching operation.

See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.


7. Limitations
==============

The current Livepatch implementation has several limitations:

  + The patch must not change the semantics of the patched functions.

    The current implementation guarantees only that either the old
    or the new function is called. The functions are patched one
    by one. This means that the patch must _not_ change the semantics
    of the function.

  + Data structures cannot be patched.

    There is no support for versioning data structures or migrating
    one structure into another in any way. Also the simple consistency
    model does not allow switching more functions atomically.

    Once there is a more complex consistency model, it will be possible
    to use some workarounds. For example, it will be possible to use a
    hole for a new member because the data structure is aligned. Or it
    will be possible to use an existing member for something else.

    There are no plans to add more generic support for modified structures
    at the moment.

  + Only functions that can be traced can be patched.

    Livepatch is based on the dynamic ftrace. In particular, functions
    implementing ftrace itself or the livepatch ftrace handler cannot be
    patched. Otherwise, the code would end up in an infinite loop. A
    potential mistake is prevented by marking the problematic functions
    with "notrace".


  + Livepatch works reliably only when the dynamic ftrace is located at
    the very beginning of the function.

    The function needs to be redirected before the stack or the function
    parameters are modified in any way. For example, livepatch requires
    using the -mfentry gcc option on x86_64.

    One exception is the PPC port. It uses relative addressing and TOC.
    Each function has to handle TOC and save LR before it can call
    the ftrace handler. This operation has to be reverted on return.
    Fortunately, the generic ftrace code has the same problem and all
    this is handled on the ftrace level.


  + Kretprobes using the ftrace framework conflict with the patched
    functions.

    Both kretprobes and livepatches use a ftrace handler that modifies
    the return address. The first user wins. Either the probe or the patch
    is rejected when the handler is already in use by the other.

  + Kprobes in the original function are ignored when the code is
    redirected to the new implementation.

    There is work in progress to add warnings about this situation.