trappy: Speed up trappy by caching trace parsing
Pandas is extremely fast at converting between CSV and data frames: it takes
< 1s to serialize/deserialize 100MB worth of traces with 430000 events to/from
CSV. We leverage this and write the data frames out to CSV files when they are
created for the first time. The next time the trace is parsed, we read the
data frames back from those files if they exist. To make sure the cache isn't
stale, we take the md5sum of the trace file and also ensure all CSVs exist
before reading from the cache. Parsing a 100MB trace goes from 16s to 1s
with the cache in place.
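
For reference, the staleness check described above can be sketched roughly as
follows. This is a minimal illustration under stated assumptions, not the code
in this patch: the helper names (file_md5, record_md5, cache_is_valid) and the
"md5sum" file name inside the cache directory are hypothetical.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def record_md5(trace_path, cache_dir):
    """Store the trace checksum next to the cached CSVs.

    The "md5sum" file name is an assumption made for this sketch.
    """
    os.makedirs(cache_dir, exist_ok=True)
    with open(os.path.join(cache_dir, "md5sum"), "w") as f:
        f.write(file_md5(trace_path))


def cache_is_valid(trace_path, cache_dir, csv_names):
    """Return True only if the checksum matches and every CSV exists."""
    try:
        with open(os.path.join(cache_dir, "md5sum")) as f:
            cached_md5 = f.read().strip()
    except OSError:
        # No recorded checksum means there is no usable cache.
        return False
    if cached_md5 != file_md5(trace_path):
        # The trace changed since the cache was written.
        return False
    return all(os.path.exists(os.path.join(cache_dir, name))
               for name in csv_names)
```

A caller would only read the CSVs back when cache_is_valid() returns True, and
would otherwise parse the trace and rewrite the cache.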
Co-developed-by: Brendan Jackman <brendan.jackman@arm.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: KP Singh <kpsingh@google.com>
diff --git a/tests/utils_tests.py b/tests/utils_tests.py
index 617cfa3..e13b868 100644
--- a/tests/utils_tests.py
+++ b/tests/utils_tests.py
@@ -19,6 +19,8 @@
import shutil
import subprocess
import tempfile
+import trappy
+from trappy.ftrace import GenericFTrace
TESTS_DIRECTORY = os.path.dirname(os.path.realpath(__file__))
@@ -36,6 +38,7 @@
     def __init__(self, files_to_copy, *args, **kwargs):
         self.files_to_copy = files_to_copy
         super(SetupDirectory, self).__init__(*args, **kwargs)
+        GenericFTrace.disable_cache = True
 
     def setUp(self):
         self.previous_dir = os.getcwd()