Blame - Documentation/powerpc/phyp-assisted-dump.txt - kernel/msm-4.9

Manish Ahuja  d28a793  2008-03-22 10:33:10 +1100

Hypervisor-Assisted Dump
------------------------
November 2007

The goal of hypervisor-assisted dump is to enable the dump of
a crashed system, and to do so from a fully-reset system, and
to minimize the total elapsed time until the system is back
in production use.

As compared to kdump or other strategies, hypervisor-assisted
dump offers several strong, practical advantages:

-- Unlike kdump, the system has been reset, and loaded
   with a fresh copy of the kernel. In particular,
   PCI and I/O devices have been reinitialized and are
   in a clean, consistent state.
-- As the dump is performed, the dumped memory becomes
   immediately available to the system for normal use.
-- After the dump is completed, no further reboots are
   required; the system will be fully usable, and running
   in its normal, production mode on its normal kernel.

The above can only be accomplished by coordination with,
and assistance from, the hypervisor. The procedure is
as follows:

-- When a system crashes, the hypervisor will save
   the low 256MB of RAM to a previously registered
   save region. It will also save system state, system
   registers, and hardware PTEs.

-- After the low 256MB area has been saved, the
   hypervisor will reset PCI and other hardware state.
   It will not clear RAM. It will then launch the
   bootloader, as normal.

-- The freshly booted kernel will notice that there
   is a new node (ibm,dump-kernel) in the device tree,
   indicating that there is crash data available from
   a previous boot. It will boot into only 256MB of RAM,
   reserving the rest of system memory.

-- Userspace tools will parse /sys/kernel/release_region
   and read /proc/vmcore to obtain the contents of memory,
   which holds the previously crashed kernel. The userspace
   tools may copy this info to disk, or network, NAS, SAN,
   iSCSI, etc. as desired.

   For example, the values in /sys/kernel/release_region
   would look something like this (address-range pairs):
   CPU:0x177fee000-0x10000: HPTE:0x177ffe020-0x1000: /
   DUMP:0x177fff020-0x10000000, 0x10000000-0x16F1D370A

-- As the userspace tools complete saving a portion of
   the dump, they echo an offset and size to
   /sys/kernel/release_region to release the reserved
   memory back to general use.

   An example of this is:
     "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
   which will release 256MB at the 1GB boundary.
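
The two examples in the list above can be made concrete with a small
sketch. It assumes, based only on the single sample shown, that each
label (CPU, HPTE, DUMP) is followed by one or more hex pairs, and that
the trailing "/" is just a line continuation. Whether the second value
of a pair is a size or an end address is not stated in this document,
so the parser returns the raw pair. parse_release_region and
release_command are hypothetical helper names, not part of any
existing tool.

```python
import re

# Sample taken from the text above, with the "/" continuation joined.
SAMPLE = ("CPU:0x177fee000-0x10000: HPTE:0x177ffe020-0x1000: "
          "DUMP:0x177fff020-0x10000000, 0x10000000-0x16F1D370A")

# Either a section label such as "CPU:" or a hex pair such as "0x10-0x20".
TOKEN = re.compile(r"([A-Za-z]+):|(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)")


def parse_release_region(text):
    """Return {label: [(first, second), ...]} with values as integers."""
    regions = {}
    label = None
    for match in TOKEN.finditer(text):
        if match.group(1):
            # A new section label starts a new list of pairs.
            label = match.group(1)
            regions.setdefault(label, [])
        elif label is not None:
            pair = (int(match.group(2), 16), int(match.group(3), 16))
            regions[label].append(pair)
    return regions


def release_command(offset, size):
    """Build the shell command that hands one range back to the kernel."""
    return "echo {:#x} {:#x} > /sys/kernel/release_region".format(offset, size)


if __name__ == "__main__":
    for name, pairs in parse_release_region(SAMPLE).items():
        print(name, [(hex(a), hex(b)) for a, b in pairs])
    # 1GB offset and 256MB size, matching the echo example above.
    print(release_command(1 << 30, 256 << 20))
```

The last call spells out the arithmetic behind the echo example:
0x40000000 is 1GB and 0x10000000 is 256MB.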

Please note that the hypervisor-assisted dump feature
is only available on Power6-based systems with recent
firmware versions.

Implementation details:
-----------------------

During boot, a check is made to see if firmware supports
this feature on this particular machine. If it does, then
we check to see if an active dump is waiting for us. If so,
then everything but 256MB of RAM is reserved during early
boot, the /sys/kernel/release_region file is created, and
the reserved memory is held. This area is released once
userland scripts have collected the dump.

If there is no waiting dump data, then only the highest
256MB of the RAM is reserved as a scratch area. This area
is not released: this region will be kept permanently
reserved, so that it can act as a receptacle for a copy
of the low 256MB in case a crash does occur. See,
however, "Open issues/ToDo" below, as to whether
such a reserved region is really needed.
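
The two reservation cases described above can be sketched as simple
address arithmetic. This is an illustration only: it assumes RAM is a
single contiguous range starting at address 0 and that the 256MB the
kernel boots into is the low 256MB, neither of which is stated
explicitly here. reserved_range is a hypothetical name, not a kernel
function.

```python
MB = 1 << 20
LOW_AREA = 256 * MB  # the 256MB size used throughout this document


def reserved_range(total_ram, dump_waiting):
    """Return (start, size) of the region held back during early boot."""
    if total_ram <= LOW_AREA:
        raise ValueError("need more than 256MB of RAM")
    if dump_waiting:
        # Boot into the low 256MB only; everything above it is reserved
        # until userspace releases it.
        return LOW_AREA, total_ram - LOW_AREA
    # No dump waiting: keep only the top 256MB as a scratch area.
    return total_ram - LOW_AREA, LOW_AREA


if __name__ == "__main__":
    total = 4 << 30  # a 4GB machine, chosen only for illustration
    for waiting in (True, False):
        start, size = reserved_range(total, waiting)
        print(waiting, hex(start), hex(size))
```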

Currently the dump will be copied from /proc/vmcore to
a new file upon user intervention. The starting address
to be read and the range for each data point are provided
in /sys/kernel/release_region.
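
Since the copy is currently driven by hand, a rough sketch of the
release side may help. It splits one region into fixed-size pieces and
emits one "offset size" line per piece, in the form used by the echo
example earlier. The piece size, the helper names, and the idea of
writing several lines to one sink are all assumptions; how addresses
map to offsets inside /proc/vmcore is not specified here, so the copy
step itself is left as a comment.

```python
import io

RELEASE_PATH = "/sys/kernel/release_region"  # path named in this document


def split_region(start, size, chunk):
    """Yield (offset, length) pieces covering [start, start + size)."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    offset, end = start, start + size
    while offset < end:
        length = min(chunk, end - offset)
        yield offset, length
        offset += length


def release_pieces(start, size, chunk, sink):
    """Write one "offset size" line to sink per piece."""
    for offset, length in split_region(start, size, chunk):
        # A real tool would save this piece of the dump before releasing it.
        sink.write("{:#x} {:#x}\n".format(offset, length))


if __name__ == "__main__":
    out = io.StringIO()  # stand-in for open(RELEASE_PATH, "w")
    release_pieces(0x40000000, 0x28000000, 0x10000000, out)
    print(out.getvalue(), end="")
```

Passing a file-like sink keeps the sketch runnable on any machine; on
a real system the sink would be the release_region file itself.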

The tools to examine the dump will be the same as the ones
used for kdump.

General notes:
--------------
Security: please note that there are potential security issues
with any sort of dump mechanism. In particular, plaintext
(unencrypted) data, and possibly passwords, may be present in
the dump data. Userspace tools must take adequate precautions to
preserve security.
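
One possible precaution, shown here only as a sketch, is to create the
output file so that it is accessible to its owner alone and to refuse
to overwrite an existing file. The helper name and the file name are
hypothetical, and a POSIX system is assumed; a real tool may need
further measures such as encryption.

```python
import os
import stat
import tempfile


def create_private_file(path):
    """Create path for writing, accessible by the owner only."""
    # O_EXCL makes the call fail if the file already exists.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o600)
    return os.fdopen(fd, "wb")


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "vmcore.copy")
    with create_private_file(target) as out:
        out.write(b"")  # dump data would be written here
    print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```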

Open issues/ToDo:
-----------------
 o The various code paths that tell the hypervisor that a crash
   occurred, vs. it simply being a normal reboot, should be
   reviewed, and possibly clarified/fixed.

 o Instead of using /sys/kernel, should there be a /sys/dump?
   There is a dump_subsys being created by the s390 code;
   perhaps the pseries code should use a similar layout as well.

 o Is reserving a 256MB region really required? The goal of
   reserving a 256MB scratch area is to make sure that no
   important crash data is clobbered when the hypervisor
   saves low memory to the scratch area. But, if one could assure
   that nothing important is located in some 256MB area, then
   it would not need to be reserved. This is something that can
   be improved in subsequent versions.

 o Still working with the kdump team to integrate this with kdump;
   some work remains, but this would not affect the current
   patches.

 o Still need to write a shell script to copy the dump away.
   Currently I am parsing it manually.