.. _module-pw_snapshot:

===========
pw_snapshot
===========

.. warning::

  This module is unstable and under active development. The snapshot proto
  format may see breaking changes as it stabilizes.

``pw_snapshot`` provides a storage format and associated tooling for capturing a
device's system state at a given point in time for later analysis. This is
particularly useful for capturing information at crash time to provide context
on the cause of the crash. Outside of crash reporting, snapshots can be used to
debug anomalies that don't result in crashes by treating snapshotting as a
heavyweight alternative to tracing, logging-based dump commands, or other
on-demand system state capturing.


.. toctree::
  :maxdepth: 1

  setup
  module_usage
  proto_format
  design_discussion

------------------
Life of a Snapshot
------------------
A "snapshot" is just a `proto message
<https://cs.opensource.google/pigweed/pigweed/+/HEAD:pw_snapshot/pw_snapshot_protos/snapshot.proto>`_
with many optional fields that describe a device's state at the time the
snapshot was captured. The serialized proto can then be stored and transferred
like a file so it can be analyzed at a later time.

#. **Snapshot capture triggered** - The device encounters a condition that
   indicates a snapshot should be captured. This could be through a crash
   handler, or through other developer-specified entry points.
#. **Device "pauses"** - In order to capture system state, the device must
   temporarily disable the thread scheduler and regular servicing of interrupts
   to prevent the system state from churning while it is captured.
#. **Snapshot captured** - The device collects information throughout the
   system through a project-provided snapshot collection logic flow. This data
   is stored as a serialized Snapshot proto message for later retrieval.
#. **Device resumes** - After a snapshot is stored, the device resumes normal
   execution. In a crash handler, the device will usually reboot instead of
   returning to normal execution.
#. **Snapshot retrieved from device** - During normal device operation, stored
   snapshots are retrieved from a device by a client that is interested in
   analyzing the snapshot, or forwarding it elsewhere to be analyzed.
#. **Snapshot analyzed** - Finally, analysis tooling is run on the captured
   snapshot proto to produce human-readable dumps (akin to a crash report).
   Alternatively, the data can be ingested by a server to act as a cloud crash
   reporting endpoint. The structured form of a snapshot enables common
   cloud-based crash reporting needs like version filtering, crash signatures,
   de-duplication, and binary-matched symbolization.

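The on-device half of this lifecycle (steps 1 through 4) can be sketched as a
single handler. This is only an illustration of the control flow under stated
assumptions: ``SnapshotHooks``, ``Trigger``, ``Bytes``, and
``HandleSnapshotTrigger`` are hypothetical names invented for this sketch and
are not ``pw_snapshot`` APIs. The project-specific pieces (pausing the system,
collecting and serializing the snapshot, storing it) are modeled as callbacks
that a project would supply.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// All names below are hypothetical; none of them are pw_snapshot APIs.
using Bytes = std::vector<std::uint8_t>;

// Step 1: what caused the snapshot to be requested.
enum class Trigger { kCrash, kOnDemand };

// Project-provided glue, modeled here as callbacks.
struct SnapshotHooks {
  std::function<void()> pause_system;       // Step 2: stop scheduler/interrupts.
  std::function<Bytes()> collect;           // Step 3: build serialized snapshot.
  std::function<bool(const Bytes&)> store;  // Step 3: persist for retrieval.
  std::function<void()> resume_system;      // Step 4: non-crash path.
  std::function<void()> reboot;             // Step 4: crash path.
};

// Runs steps 2-4 once a trigger (step 1) has fired.
// Returns true if the snapshot was stored.
inline bool HandleSnapshotTrigger(Trigger trigger, const SnapshotHooks& hooks) {
  // Freeze system state so it does not churn during capture.
  hooks.pause_system();

  // Collect and serialize, then store for later retrieval.
  const Bytes snapshot = hooks.collect();
  const bool stored = hooks.store(snapshot);

  // A crash handler usually reboots; other triggers resume execution.
  if (trigger == Trigger::kCrash) {
    hooks.reboot();
  } else {
    hooks.resume_system();
  }
  return stored;
}
```

The sketch uses ``std::function`` and ``std::vector`` for brevity. A real crash
handler would more likely avoid dynamic allocation and write into a statically
reserved buffer instead.
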
While Pigweed provides libraries for each part of a snapshot's lifecycle, the
glue that puts all these pieces together is project-specific. Please see the
section on :ref:`Setting up a Snapshot Pipeline<module-pw_snapshot-setup>` for
more information on how to bring up snapshot support for your project.