Prashanth B | 2d8047e | 2014-04-27 18:54:47 -0700 | [diff] [blame] | 1 | # Copyright (c) 2014 The Chromium OS Authors. All rights reserved. |
| 2 | # Use of this source code is governed by a BSD-style license that can be |
| 3 | # found in the LICENSE file. |
| 4 | |
| 5 | |
| 6 | """Cache module for rdb requests/host objects. |
| 7 | |
| 8 | This module supplies the following api: |
| 9 | 1. A cache backend. |
| 10 | 2. A cache manager for the backend. |
| 11 | 3. A memoize decorator to encapsulate caching logic. |
| 12 | |
| 13 | This cache manager functions as a lookaside buffer for host requests. |
| 14 | Its correctness is contingent on the following conditions: |
| 15 | 1. The concurrency of the rdb remains 0. |
| 16 | 2. Clients of the cache don't trust the leased bit on the cached object. |
| 17 | 3. The cache is created at the start of a single batched request, |
| 18 | populated during the request, and completely discarded at the end. |
| 19 | |
| 20 | Rather than caching individual hosts, the cache manager maintains |
| 21 | 'cache lines'. A cache line is defined as a key: value pair, where |
| 22 | the key is as returned by get_key, and the value is a list of RDBHosts |
| 23 | that match the key. The following limitations are placed on cache lines: |
| 24 | 1. A new line can only contain unleased hosts. |
| 25 | 2. A key can only be set once, with a single line, before removal. |
| 26 | 3. Every 'get' deletes the entire line. |
| 27 | |
| 28 | Consider the following examples: |
| 29 | Normal case: 3 grouped requests, all with the same deps/acls, but different |
| 30 | priorities/parent_job_ids. The requests (X+Y+Z) > matching hosts (K): |
| 31 | (request1, count=X)- hits the database, takes X hosts, caches (K-X) |
| 32 | (request2, count=Y) - hits the cache and is fully satisfied, caches (K-(X+Y)) |
(request3, count=Z) - hits the cache, needs to acquire (X+Y+Z)-K next tick:
| 34 | |
| 35 | Host Count | RDB | Cache |
| 36 | ------------------------------------------------------------------ |
| 37 | X: | request1 | {} |
| 38 | K: | find_hosts(deps, acls) | |
| 39 | X: | leased_hosts | |
| 40 | K-X: | ---------------------------> | {key: [K-X hosts]} |
| 41 | Y<K-X: | request2 <---[K-X hosts]---- | {} |
| 42 | Y: | leased_hosts | |
| 43 | K-(X+Y): | ---------------------------> | {key: [K-(X+Y) hosts]} |
| 44 | Z>K-(X+Y): | request3 <-[K-(X+Y) hosts]-- | {} |
| 45 | Z-(K-(X+Y)):| leased_hosts | |
| 46 | |
| 47 | Since hosts are only released by the scheduler there is no way the |
| 48 | third request could have been satisfied completely even if we had checked |
| 49 | the database real-time. |
| 50 | |
| 51 | Stale cache entries: 3 grouped requests that don't have the same deps/acls. |
| 52 | P(1,2,3) are priorities, with P3 being the highest: |
| 53 | (request1(deps=[a,b], P3), Count=X) - Caches hosts |
| 54 | (request2(deps=[a], P2), Count=Y) - hits the database |
(request3(deps=[a,b], P1), Count=Z) - Tries to use cached hosts but fails
| 56 | |
| 57 | Host Count | RDB | Cache |
| 58 | ------------------------------------------------------------------ |
| 59 | X: | request1(deps=[a,b]) | {} |
| 60 | K: | find_hosts(deps=[a,b]) | |
| 61 | X: | leased_hosts | |
| 62 | K-X: | ---------------------------> | {deps=[a,b]: [(K-X) hosts]} |
| 63 | Y<K-X: | request2(deps=[a]) | {} |
| 64 | K-X: | find_hosts(deps=[a]) | |
| 65 | Y: | leased_hosts | |
| 66 | K-(X+Y): | ---------------------------> | {deps=[a]: [(K-(X+Y)) hosts], |
| 67 | | | | overlap | |
| 68 | | | deps=[a, b], [(K-X) hosts]} |
| 69 | Z: | request3(deps=[a,b])<-[K-X]--| {deps=[a]: [K-(X+Y) hosts]} |
Z-(K-(X+Y)):| leased_hosts                 |   {deps=[a]: [K-(X+Y) hosts]}
| 71 | |
| 72 | Note that in the last case, even though the cache returns hosts that |
| 73 | have already been assigned to request2, request3 cannot use them. This is |
| 74 | acceptable because the number of hosts we lease per tick is << the number |
| 75 | of requests, so it's faster to check leased bits real time than query for hosts. |
| 76 | """ |
| 77 | |
| 78 | |
import abc
import collections
import functools
import logging

import common
from autotest_lib.client.common_lib.cros.graphite import stats
from autotest_lib.client.common_lib.global_config import global_config
from autotest_lib.scheduler import rdb_utils
Prashanth B | 2d8047e | 2014-04-27 18:54:47 -0700 | [diff] [blame] | 87 | |
# Reserved kwarg name used to hand cached hosts to the memoized function.
MEMOIZE_KEY = 'memoized_hosts'

def memoize_hosts(func):
    """Decorator used to memoize through the cache manager.

    Before calling the wrapped function we check the cache for values
    matching its request argument, and anything returned by the function
    is cached under the same request.

    @param func: The function/method to decorate. It must accept a
        request and a count, and return a list of hosts.
    """
    @functools.wraps(func)
    def cache(self, request, count, **kwargs):
        """Caching function for the memoize decorator.

        @param request: The rdb request, as defined in rdb_requests.
        @param count: The count of elements needed to satisfy the request.
        @param kwargs:
            Named args for the memoized function. This map should not contain
            the key MEMOIZE_KEY, as this is reserved for the passing of
            the cached/memoized hosts to the function itself.
        """
        cache_key = self.cache.get_key(request.deps, request.acls)
        try:
            kwargs[MEMOIZE_KEY] = self.cache.get_line(cache_key)
        except rdb_utils.CacheMiss:
            # A miss just means the function has to hit the database.
            pass
        hosts = func(self, request, count, **kwargs)
        self.cache.set_line(cache_key, hosts)
        return hosts
    return cache
| 117 | |
| 118 | |
class CacheBackend(object):
    """Interface for a cache backend.

    Concrete backends override these methods; the defaults are no-ops,
    so a subclass only needs to implement the operations it uses.
    """
    __metaclass__ = abc.ABCMeta

    def set(self, key, value):
        """Store a value in the cache.

        @param key: The key to set.
        @param value: The value to cache.
        """

    def get(self, key):
        """Retrieve the value stored under a key.

        @param key: The key to retrieve the value for.
        @return: The value stored under the key.
        @raises KeyError: If the key isn't present in the cache.
        """

    def delete(self, key):
        """Remove the key, value pair from the cache.

        @param key: The key used to find the key, value pair to delete.
        @raises KeyError: If the key isn't already in the cache.
        """

    def has_key(self, key):
        """Check whether a key exists in the cache.

        @param key: The key to check.
        @return: True if the key is in the cache.
        """
        return False
| 158 | |
| 159 | |
class DummyCacheBackend(CacheBackend):
    """A dummy cache backend.

    This cache will claim to have no keys. Every get is a cache miss.
    """

    def get(self, key):
        """Always miss.

        @param key: The key to look up.
        @raises KeyError: Unconditionally; the dummy cache holds nothing.
        """
        # Include the key in the exception so callers that log the miss
        # can see which lookup failed (a bare `raise KeyError` carried
        # no diagnostic information).
        raise KeyError(key)
| 168 | |
| 169 | |
class InMemoryCacheBackend(CacheBackend):
    """In memory cache backend.

    Backed by a plain dictionary mapping keys to cached values.
    """

    def __init__(self):
        # All state lives in this dict for the lifetime of the backend.
        self._cache = {}

    def get(self, key):
        """Return the value under key; raises KeyError on a miss."""
        return self._cache[key]

    def set(self, key, value):
        """Store value under key, replacing any previous value."""
        self._cache[key] = value

    def delete(self, key):
        """Remove key from the cache; raises KeyError if absent."""
        del self._cache[key]

    def has_key(self, key):
        """Return True if key is currently cached."""
        return key in self._cache
| 189 | |
# TODO: Implement a MemcacheBackend, invalidate when unleasing a host, refactor
# the AcquireHostRequest to contain a core of (deps, acls) that we can use as
# the key for population and invalidation. The caching manager is still valid,
# regardless of the backend.
| 194 | |
class RDBHostCacheManager(object):
    """RDB Cache manager.

    Manages single-use cache lines keyed on (deps, acls) through a
    pluggable backend, and records hit/miss/staleness statistics.
    """

    # A cache line key: the frozen deps/acls of a host request.
    key = collections.namedtuple('key', ['deps', 'acls'])
    use_cache = global_config.get_config_value(
            'RDB', 'use_cache', type=bool, default=True)

    def __init__(self):
        # When caching is disabled, fall back to a backend that always misses.
        self._cache_backend = (InMemoryCacheBackend()
                               if self.use_cache else DummyCacheBackend())
        self.hits = 0
        self.misses = 0
        # One entry per retrieved line: the stale host count for that line.
        self.stale_entries = []


    def mean_staleness(self):
        """Compute the average stale entries per line.

        @return: A floating point representing the mean staleness.
        """
        # float() guards against python 2 integer division. The previous
        # reduce() based implementation skipped the float-coercing lambda
        # entirely when stale_entries held a single element, so it could
        # return an int despite the documented float contract.
        return (float(sum(self.stale_entries)) / len(self.stale_entries)
                if self.stale_entries else 0)


    def hit_ratio(self):
        """Compute the hit ratio of this cache.

        @return: A floating point percentage of the hit ratio.
        """
        if not self.hits and not self.misses:
            return 0
        requests = float(self.hits + self.misses)
        return (self.hits/requests) * 100


    def record_stats(self):
        """Record stats about the cache managed by this instance."""
        hit_ratio = self.hit_ratio()
        staleness = self.mean_staleness()
        logging.debug('Cache stats: hit ratio: %.2f%%, '
                      'avg staleness per line: %.2f%%.', hit_ratio, staleness)
        stats.Gauge(rdb_utils.RDB_STATS_KEY).send('cache.hit_ratio', hit_ratio)
        stats.Gauge(rdb_utils.RDB_STATS_KEY).send('cache.stale_entries',
                                                  staleness)


    @classmethod
    def get_key(cls, deps, acls):
        """Return a key for the given deps, acls.

        @param deps: A list of deps, as taken by the AcquireHostRequest.
        @param acls: A list of acls, as taken by the AcquireHostRequest.
        @return: A cache key for the given deps/acls.
        """
        # All requests with the same deps, acls should hit the same cache line.
        # TODO: Do something smarter with acls, only one needs to match.
        return cls.key(deps=frozenset(deps), acls=frozenset(acls))


    def get_line(self, key):
        """Clear and return the cache line matching the key.

        @param key: The key the desired cache_line is stored under.
        @return: A list of rdb hosts matching the key, or None.

        @raises rdb_utils.CacheMiss: If the key isn't in the cache.
        """
        try:
            cache_line = self._cache_backend.get(key)
        except KeyError:
            self.misses += 1
            raise rdb_utils.CacheMiss('Key %s not in cache' % (key,))
        self.hits += 1
        # Lines are single-use: every successful get evicts the whole line.
        self._cache_backend.delete(key)
        return list(cache_line)


    def _check_line(self, line, key):
        """Sanity check a cache line.

        This method assumes that a cache line is made up of RDBHost objects,
        and checks to see if they all match each other/the key passed in.
        Checking is done in terms of host labels and acls, note that the hosts
        in the line can have different deps/acls, as long as they all have the
        deps required by the key, and at least one matching acl of the key.

        @param line: The cache line value.
        @param key: The key the line will be stored under.
        @raises rdb_utils.RDBException:
            If one of the hosts in the cache line is already leased.
            The cache already has a different line under the given key.
            The given key doesn't match the hosts in the line.
        """
        # Note that this doesn't mean that all hosts in the cache are unleased.
        if any(host.leased for host in line):
            raise rdb_utils.RDBException('Cannot cache leased hosts %s' % line)

        # Confirm that the given line can be used to service the key by
        # checking that all hosts have the deps mentioned in the key, and at
        # least one matching acl.
        h_keys = set([self.get_key(host.labels, host.acls) for host in line])
        for h_key in h_keys:
            if (not h_key.deps.issuperset(key.deps) or
                    not key.acls.intersection(h_key.acls)):
                raise rdb_utils.RDBException('Given key: %s does not match key '
                        'computed from hosts in line: %s' % (key, h_keys))
        if self._cache_backend.has_key(key):
            raise rdb_utils.RDBException('Cannot override a cache line. It '
                    'must be cleared before setting. Key: %s, hosts %s' %
                    (key, line))


    def set_line(self, key, hosts):
        """Cache a list of similar hosts.

        set_line will no-op if:
            The hosts aren't all unleased.
            The hosts don't have deps/acls matching the key.
            A cache line under the same key already exists.
        The first 2 cases will lead to a cache miss in the corresponding get.

        @param key: The key to store the line under, as returned by get_key.
        @param hosts: A list of unleased hosts with the same deps/acls.
        @raises RDBException: If hosts is None, since None is reserved for
            key expiration.
        """
        if hosts is None:
            raise rdb_utils.RDBException('Cannot set None in the cache.')

        # An empty list means no hosts matching the request are available.
        # This can happen if a previous request leased all matching hosts.
        if not hosts or not self.use_cache:
            self._cache_backend.set(key, [])
            return
        try:
            self._check_line(hosts, key)
        except rdb_utils.RDBException as e:
            # A failed sanity check only skips caching; it must not abort
            # the request that produced these hosts.
            logging.error(e)
        else:
            self._cache_backend.set(key, set(hosts))