Page Speed Optimization Libraries  1.13.35.1
property_cache.h File Reference
#include <map>
#include <vector>
#include "pagespeed/kernel/base/basictypes.h"
#include "pagespeed/kernel/base/ref_counted_ptr.h"
#include "pagespeed/kernel/base/scoped_ptr.h"
#include "pagespeed/kernel/base/string.h"
#include "pagespeed/kernel/base/string_util.h"
#include "pagespeed/kernel/cache/cache_interface.h"
#include "pagespeed/opt/http/request_context.h"


Classes

class  net_instaweb::PropertyValue
 Holds the value & stability-metadata for a property.
 
class  net_instaweb::PropertyCache
 Adds property-semantics to a raw cache API.
 
class  net_instaweb::PropertyCache::Cohort
 
class  net_instaweb::AbstractPropertyPage
 Abstract interface for implementing a PropertyPage.
 
class  net_instaweb::PropertyPage
 

Namespaces

 net_instaweb
 Unit-test framework for wget fetcher.
 

Typedefs

typedef std::vector< PropertyPage * > net_instaweb::PropertyPageStarVector
 

Detailed Description

Implements a cache that can be used to store multiple properties on a key. This can be useful if the origin data associated with the key is not cacheable itself, but we think some properties of it might be reasonably stable. The cache can optionally track how frequently the properties change, so that when a property is read, the reader can gauge how stable it is. It will also manage time-based expirations of property-cache data (not yet implemented).

It supports properties with widely varying update frequencies, though the programmer must specify these by grouping objects of similar frequency in a Cohort.

Terminology: PropertyCache – adds property semantics & grouping to the raw name/value Cache Interface.

PropertyValue – a single name/value pair with stability metadata, so that users of the PropertyValue can find out whether the property being measured appears to be stable.
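To make "stability metadata" concrete, here is a minimal sketch of a value that counts how often its writes actually changed it. ToyPropertyValue and its members are hypothetical names invented for illustration; the real PropertyValue may record stability quite differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical toy type for illustration; not the library's PropertyValue.
class ToyPropertyValue {
 public:
  // Records a write and notes whether the value differs from the previous one.
  void Update(const std::string& value) {
    if (num_writes_ > 0 && value != value_) {
      ++num_changes_;
    }
    value_ = value;
    ++num_writes_;
  }

  const std::string& value() const { return value_; }
  int num_writes() const { return num_writes_; }

  // True if the value has been written at least once and at most
  // max_changes of those writes altered it.
  bool IsStable(int max_changes) const {
    assert(max_changes >= 0);
    return num_writes_ > 0 && num_changes_ <= max_changes;
  }

 private:
  std::string value_;
  int num_writes_ = 0;
  int num_changes_ = 0;
};
```

A reader could call IsStable(0) before trusting a cached value, accepting it only if every recorded write agreed with the previous one.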

PropertyCache::Cohort – labels a group of PropertyValues that are expected to have similar write-frequency. Properties are grouped together to minimize the number of cache lookups and puts. However, we do not put all values into a single Cohort, because fast-changing properties would then stomp on slow-changing properties that share the same cache entry. Thus we initiate lookups for all Cohorts immediately on receiving a URL, but we write back each Cohort independently, under programmer control.

The concurrent read of all Cohorts can be implemented on top of a batched cache lookup if the platform supports it, to reduce RPCs.

Note that the Cohort* is simply a label, and doesn't hold the properties or the data.

PropertyPage – this tracks all the PropertyValues in all the Cohorts for a key (e.g., an HTML page URL). Generally a PropertyPage must be read prior to being written, so that unmodified PropertyValues in a Cohort are not erased by updating a single Cohort property. The page executes a Read/Modify/Write sequence, but there is no locking. Multiple processes & threads are potentially writing entries to the cache simultaneously, so there can be races which might stomp on writes for individual properties in a Cohort.

The value of aggregating multiple properties into a Cohort is to reduce the query-traffic on caches.
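The Read/Modify/Write sequence can be sketched against a toy in-memory cache. ToyCache, ToyCohortValues and UpdateProperty are hypothetical stand-ins, not part of the PropertyCache API; the sketch only shows why reading the whole Cohort entry first preserves the properties that are not being updated.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration only.
using ToyCohortValues = std::map<std::string, std::string>;  // property -> value
using ToyCache = std::map<std::string, ToyCohortValues>;     // cache key -> Cohort entry

// Reads the whole Cohort entry, changes one property, and writes the entry back.
void UpdateProperty(ToyCache* cache, const std::string& cohort_key,
                    const std::string& property, const std::string& value) {
  assert(cache != nullptr);
  ToyCohortValues values;
  auto it = cache->find(cohort_key);  // Read: fetch the existing entry, if any.
  if (it != cache->end()) {
    values = it->second;
  }
  values[property] = value;           // Modify: touch only one property.
  (*cache)[cohort_key] = values;      // Write: store the whole entry again.
}
```

Because nothing locks the entry between the read and the write, two writers that read the same entry can each write back a copy that lacks the other's change, which is the race described above.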

Let's study an example for URL "http://..." with two Cohorts, "dom_metrics" and "render_data", where we expect dom_metrics to be updated very frequently. In dom_metrics we have (not that this is useful) "num_divs" and "num_a_tags". In "render_data" we have "critical_image_list" and "referenced_resources". When we get a request for "http://example.com/index.html" we'll make a batched lookup for 2 keys:

"prop/http://example.com/index.html@dom_metrics"
"prop/http://example.com/index.html@render_data"
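As a rough sketch, the keys for such a batched lookup could be assembled as follows. CohortKeys is a hypothetical helper, and the "prop/<url>@<cohort>" layout is inferred only from the example keys shown here; the actual key format used by the implementation may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds one cache key per Cohort, following the
// "prop/<url>@<cohort>" layout of the example keys.
std::vector<std::string> CohortKeys(const std::string& url,
                                    const std::vector<std::string>& cohorts) {
  assert(!url.empty());
  std::vector<std::string> keys;
  keys.reserve(cohorts.size());
  for (const std::string& cohort : cohorts) {
    keys.push_back("prop/" + url + "@" + cohort);
  }
  return keys;
}
```

Calling CohortKeys("http://example.com/index.html", {"dom_metrics", "render_data"}) yields the two keys of the example, which could then be handed to a single batched cache read.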

Within the value for "prop/http://example.com/index.html@dom_metrics" we'll have a 2-element array of PropertyValues for "num_divs" and "num_a_tags". We'll write to that cache entry, possibly every time http://example.com/index.html is rewritten, so that we can track how stable the number of divs and a_tags is. Rewriters that wish to exploit advance knowledge of how many tags will be in the document can then determine how reliable that information is.

In the future we might track real time and limit the frequency of updates for a given entry.