Page Speed Optimization Libraries  1.13.35.1
net_instaweb::DomainLawyer Class Reference

Public Member Functions

DomainLawyer & operator= (const DomainLawyer &src)
 
 DomainLawyer (const DomainLawyer &src)
 
bool MapRequestToDomain (const GoogleUrl &original_request, const StringPiece &resource_url, GoogleString *mapped_domain_name, GoogleUrl *resolved_request, MessageHandler *handler) const
 
bool IsDomainAuthorized (const GoogleUrl &original_request, const GoogleUrl &domain_to_check) const
 
bool IsOriginKnown (const GoogleUrl &domain_to_check) const
 
bool MapOrigin (const StringPiece &in, GoogleString *out, GoogleString *host_header, bool *is_proxy) const
 
bool MapOriginUrl (const GoogleUrl &gurl, GoogleString *out, GoogleString *host_header, bool *is_proxy) const
 
bool AddDomain (const StringPiece &domain_name, MessageHandler *handler)
 
bool AddKnownDomain (const StringPiece &domain_name, MessageHandler *handler)
 
bool AddRewriteDomainMapping (const StringPiece &to_domain, const StringPiece &comma_separated_from_domains, MessageHandler *handler)
 
bool AddTwoProtocolRewriteDomainMapping (const StringPiece &to_domain_name, const StringPiece &from_domain_name, MessageHandler *handler)
 
bool AddOriginDomainMapping (const StringPiece &to_domain, const StringPiece &comma_separated_from_domains, const StringPiece &host_header, MessageHandler *handler)
 
bool AddProxyDomainMapping (const StringPiece &proxy_domain, const StringPiece &origin_domain, const StringPiece &to_domain_name, MessageHandler *handler)
 
bool AddTwoProtocolOriginDomainMapping (const StringPiece &to_domain_name, const StringPiece &from_domain_name, const StringPiece &host_header, MessageHandler *handler)
 
bool AddShard (const StringPiece &to_domain, const StringPiece &comma_separated_shards, MessageHandler *handler)
 
bool ShardDomain (const StringPiece &domain_name, uint32 hash, GoogleString *sharded_domain) const
 
void Merge (const DomainLawyer &src)
 
void Clear ()
 
bool empty () const
 
bool WillDomainChange (const GoogleUrl &url) const
 
bool IsProxyMapped (const GoogleUrl &url) const
 Determines whether a URL's domain was proxy-mapped from a different origin.
 
bool can_rewrite_domains () const
 
int num_wildcarded_domains () const
 Visible for testing.
 
bool DoDomainsServeSameContent (const StringPiece &domain1, const StringPiece &domain2) const
 
void FindDomainsRewrittenTo (const GoogleUrl &domain_name, ConstStringStarVector *from_domains) const
 
void set_proxy_suffix (const GoogleString &suffix)
 
const GoogleString & proxy_suffix () const
 
bool StripProxySuffix (const GoogleUrl &gurl, GoogleString *url, GoogleString *host) const
 
bool AddProxySuffix (const GoogleUrl &base_url, GoogleString *href) const
 
GoogleString Signature () const
 
GoogleString ToString (StringPiece line_prefix) const
 
GoogleString ToString () const
 Version that's easier to call from debugger.
 

Friends

class DomainLawyerTest
 

Member Function Documentation

bool net_instaweb::DomainLawyer::AddDomain ( const StringPiece &  domain_name,
MessageHandler *  handler 
)

The methods below this comment are intended to be run only at configuration time. Adds a simple domain to the set that can be rewritten. No mapping or sharding will be performed. Returns false if the domain syntax was not acceptable. Wildcards (*, ?) may be used in the domain_name. Careless use of wildcards can expose the user to XSS attacks.
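The wildcard caveat can be illustrated with a minimal sketch of glob-style matching. It assumes that `*` matches any run of characters (including none) and `?` matches exactly one character. `WildcardMatch` is a hypothetical helper written for this illustration; it is not the matcher used by the library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of DomainLawyer): glob-style match where
// '*' matches any run of characters and '?' matches exactly one character.
bool WildcardMatch(const std::string& pattern, const std::string& text) {
  std::size_t p = 0, t = 0;
  std::size_t star = std::string::npos;  // position of the last '*' seen
  std::size_t mark = 0;                  // text position when it was seen
  while (t < text.size()) {
    if (p < pattern.size() && pattern[p] == '*') {
      star = p++;  // remember the star and first try matching zero chars
      mark = t;
    } else if (p < pattern.size() &&
               (pattern[p] == '?' || pattern[p] == text[t])) {
      ++p;
      ++t;
    } else if (star != std::string::npos) {
      p = star + 1;  // backtrack: let the last star absorb one more char
      t = ++mark;
    } else {
      return false;
    }
  }
  while (p < pattern.size() && pattern[p] == '*') ++p;  // trailing stars
  return p == pattern.size();
}
```

Under this sketch, the pattern `*example.com` accepts `evil-example.com` as well as `www.example.com`, whereas `*.example.com` rejects it. An over-broad pattern of the first kind is the sort of careless use the warning above refers to.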

bool net_instaweb::DomainLawyer::AddKnownDomain ( const StringPiece &  domain_name,
MessageHandler *  handler 
)

Adds a simple domain to the set that is known but not authorized for rewriting. Observes all other constraints mentioned for AddDomain.

bool net_instaweb::DomainLawyer::AddOriginDomainMapping ( const StringPiece &  to_domain,
const StringPiece &  comma_separated_from_domains,
const StringPiece &  host_header,
MessageHandler *  handler 
)

Adds a domain mapping, to assist with fetching resources from locally significant names/IP addresses. host_header may be empty ("") in which case the corresponding from_domain will be used.

Wildcards may not be used in the to_domain, but they can be used in the from_domains. Various tests depend on being able to add a port on to_domain (reference domain), though this functionality should not be relied on in production.

This routine can be called multiple times for the same to_domain. If the 'from' domains overlap due to wildcards, this will not be detected.

It is invalid to use the same origin_domain in AddProxyDomainMapping and as the to_domain of AddOriginDomainMapping. The latter requires a Host: request-header on fetches, whereas the former will not get one.

If host_header is empty, then MapOrigin will return a host_header matching the passed-in URL. If host_header is non-empty, it will be returned from MapOrigin as specified.

bool net_instaweb::DomainLawyer::AddProxyDomainMapping ( const StringPiece &  proxy_domain,
const StringPiece &  origin_domain,
const StringPiece &  to_domain_name,
MessageHandler *  handler 
)

Adds a mapping to enable proxying & optimizing resources hosted on a domain we do not control, going back to the origin to fetch them.

Wildcards may not be used in the proxy_domain or origin_domain.

Subdirectories should normally be used in the proxy_domain, the origin_domain, and to_domain. This is not a strict requirement: if you fully control the entire origin domain and are dedicating a proxy domain for the sole use of that origin domain, then subdirectories are not needed.

The proxy_domain must be running mod_pagespeed and configured consistently. The resources will be referenced from this domain in CSS and HTML files.

The origin_domain does not need to run mod_pagespeed; it is used to fetch the resources.

If to_domain is provided then resources are rewritten to to_domain instead of proxy_domain. This is useful for rewriting to a CDN.

It is invalid to use the same origin_domain in AddProxyDomainMapping and as the to_domain of AddOriginDomainMapping. The latter requires overriding the Host: request-header on fetches.

bool net_instaweb::DomainLawyer::AddProxySuffix ( const GoogleUrl &  base_url,
GoogleString *  href 
) const

Adds a proxy suffix to the Host in *href if it matches the base URL. Returns true if the href was modified, false if it wasn't.

bool net_instaweb::DomainLawyer::AddRewriteDomainMapping ( const StringPiece &  to_domain,
const StringPiece &  comma_separated_from_domains,
MessageHandler *  handler 
)

Adds a domain mapping, to assist with serving resources from cookieless domains or CDNs. This implicitly calls AddDomain(to_domain) and AddDomain(from_domain) if necessary. If either 'to' or 'from' has invalid syntax then this function returns false and has no effect.

Wildcards may not be used in the to_domain, but they can be used in the from_domains.

This routine can be called multiple times for the same to_domain. If the 'from' domains overlap due to wildcards, this will not be detected.
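The "returns false and has no effect" contract can be sketched as validate-then-commit over the comma-separated list. This is a toy model under stated assumptions: `ToyAddRewriteMapping`, `SplitCommaList`, and `LooksValid` are hypothetical names, and "invalid syntax" is approximated here as an empty name or one containing a space, which is not necessarily what the library checks.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: splits "a,b,c" into {"a","b","c"}, skipping empties.
std::vector<std::string> SplitCommaList(const std::string& list) {
  std::vector<std::string> out;
  std::string cur;
  for (char c : list) {
    if (c == ',') {
      if (!cur.empty()) out.push_back(cur);
      cur.clear();
    } else {
      cur.push_back(c);
    }
  }
  if (!cur.empty()) out.push_back(cur);
  return out;
}

// Assumption for this sketch only: a name is valid if it is non-empty
// and contains no spaces.
bool LooksValid(const std::string& name) {
  return !name.empty() && name.find(' ') == std::string::npos;
}

// Toy model: records from -> to mappings only if every input validates.
bool ToyAddRewriteMapping(const std::string& to_domain,
                          const std::string& comma_separated_from_domains,
                          std::map<std::string, std::string>* from_to) {
  // Wildcards are not allowed in the 'to' domain.
  if (!LooksValid(to_domain) ||
      to_domain.find_first_of("*?") != std::string::npos) {
    return false;
  }
  const std::vector<std::string> froms =
      SplitCommaList(comma_separated_from_domains);
  if (froms.empty()) return false;
  for (const std::string& from : froms) {
    if (!LooksValid(from)) return false;  // nothing has been committed yet
  }
  for (const std::string& from : froms) {
    (*from_to)[from] = to_domain;  // wildcards are allowed on this side
  }
  return true;
}
```

Validating the whole list before touching the map is what keeps a partially bad list from leaving half of its entries behind.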

bool net_instaweb::DomainLawyer::AddShard ( const StringPiece &  to_domain,
const StringPiece &  comma_separated_shards,
MessageHandler *  handler 
)

Specifies domain-sharding. This implicitly calls AddDomain(to_domain).

Wildcards may not be used in the to_domain or the from_domain.

bool net_instaweb::DomainLawyer::AddTwoProtocolOriginDomainMapping ( const StringPiece &  to_domain_name,
const StringPiece &  from_domain_name,
const StringPiece &  host_header,
MessageHandler *  handler 
)

Adds domain mappings that handle fetches on both http and https for the given from_domain. No wildcards may be used in either domain, and both must be protocol-free and should not have port numbers. host_header behaves the same as when passed into AddOriginDomainMapping.

This routine may be called multiple times for the same to_domain.

bool net_instaweb::DomainLawyer::AddTwoProtocolRewriteDomainMapping ( const StringPiece &  to_domain_name,
const StringPiece &  from_domain_name,
MessageHandler *  handler 
)

Adds domain mappings that handle both http and https urls for the given from_domain_name. No wildcards may be used in either domain, and both must be protocol-free and should not have port numbers.

This routine can be called multiple times for the same to_domain.
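As a rough illustration of what a two-protocol mapping amounts to, the sketch below expands one protocol-free pair into an http mapping and an https mapping. `ExpandTwoProtocolMapping` and `IsPlainDomainName` are hypothetical helpers, and the assumption that each scheme maps to the same scheme on the target domain is made for this example only.

```cpp
#include <string>
#include <utility>
#include <vector>

using DomainMapping = std::pair<std::string, std::string>;  // (from, to)

// Hypothetical check: non-empty, protocol-free, and wildcard-free.
bool IsPlainDomainName(const std::string& name) {
  return !name.empty() && name.find("://") == std::string::npos &&
         name.find_first_of("*?") == std::string::npos;
}

// Toy expansion of one protocol-free pair into two scheme-specific
// mappings. Assumes http maps to http and https maps to https.
bool ExpandTwoProtocolMapping(const std::string& to_domain_name,
                              const std::string& from_domain_name,
                              std::vector<DomainMapping>* out) {
  if (!IsPlainDomainName(to_domain_name) ||
      !IsPlainDomainName(from_domain_name)) {
    return false;
  }
  out->push_back(DomainMapping("http://" + from_domain_name,
                               "http://" + to_domain_name));
  out->push_back(DomainMapping("https://" + from_domain_name,
                               "https://" + to_domain_name));
  return true;
}
```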

bool net_instaweb::DomainLawyer::can_rewrite_domains ( ) const
inline

Determines whether any resources might be domain-mapped, either via sharding, rewriting, or due to a proxy_suffix.

bool net_instaweb::DomainLawyer::DoDomainsServeSameContent ( const StringPiece &  domain1,
const StringPiece &  domain2 
) const

Determines whether two domains have been declared as serving the same content by the user, via Rewrite or Shard mapping.

void net_instaweb::DomainLawyer::FindDomainsRewrittenTo ( const GoogleUrl &  domain_name,
ConstStringStarVector *  from_domains 
) const

Finds domains rewritten to this domain. Includes only non-wildcarded domains. from_domains is empty if no mapping is found.

bool net_instaweb::DomainLawyer::IsDomainAuthorized ( const GoogleUrl &  original_request,
const GoogleUrl &  domain_to_check 
) const

Given the context of an HTTP request to 'original_request', checks whether 'domain_to_check' is authorized for rewriting.

For example, if we are rewriting http://www.myhost.com/index.html, then all resources from www.myhost.com are implicitly authorized for rewriting. Additionally, any domains specified via AddDomain() are also authorized.

bool net_instaweb::DomainLawyer::IsOriginKnown ( const GoogleUrl &  domain_to_check) const

Returns true if the given origin (domain:port) is one that we were explicitly told about in any form — e.g. as a rewrite domain, origin domain, simple domain, or a shard.

Note that this method returning true does not mean that resources from the given domain should be rewritten.

The intent of this method is to identify external hostnames fetchers should connect to. IMPORTANT: users of this method MUST NOT trust the Host: header for authorizing external connections, since doing that would make it trivial to bypass the check.

bool net_instaweb::DomainLawyer::MapOrigin ( const StringPiece &  in,
GoogleString *  out,
GoogleString *  host_header,
bool *  is_proxy 
) const

Maps an origin resource just prior to fetching it. This fails if the input URL is not valid. It succeeds even if no mapping is done. You must compare 'in' to 'out' to determine whether mapping was done.

*host_header is set to the Host header to use when fetching the resource from *out.

*is_proxy is set to true if the origin-domain was established via AddProxyDomainMapping.
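The calling convention described above (success without mapping, comparing 'in' to 'out', and the Host header fallback) can be sketched with a toy model. `OriginRule` and `ToyMapOrigin` are hypothetical names invented for this illustration, the URL parsing is deliberately naive, and the is_proxy output is omitted. It is not the library's implementation.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical rule: where to fetch from, and which Host header to send.
struct OriginRule {
  std::string to_origin;    // e.g. "http://localhost:8080", no trailing slash
  std::string host_header;  // empty means "use the host of the input URL"
};

// Toy stand-in for origin mapping. Keys of 'rules' are origins such as
// "http://www.example.com". Returns false only if 'in' is not a valid URL.
bool ToyMapOrigin(const std::map<std::string, OriginRule>& rules,
                  const std::string& in, std::string* out,
                  std::string* host_header) {
  const std::size_t scheme_end = in.find("://");
  if (scheme_end == std::string::npos || scheme_end == 0) return false;
  const std::size_t host_start = scheme_end + 3;
  const std::size_t path_start = in.find('/', host_start);
  const std::string host =
      in.substr(host_start, path_start == std::string::npos
                                ? std::string::npos
                                : path_start - host_start);
  if (host.empty()) return false;
  const std::string path = path_start == std::string::npos
                               ? std::string("/")
                               : in.substr(path_start);
  const std::string origin = in.substr(0, host_start) + host;

  const auto it = rules.find(origin);
  if (it == rules.end()) {
    *out = in;  // success without mapping: out equals in
    *host_header = host;
    return true;
  }
  *out = it->second.to_origin + path;
  *host_header =
      it->second.host_header.empty() ? host : it->second.host_header;
  return true;
}
```

A caller of this sketch would test `out != in` after a successful call to learn whether a mapping was applied, mirroring the comparison the documentation asks for.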

bool net_instaweb::DomainLawyer::MapRequestToDomain ( const GoogleUrl &  original_request,
const StringPiece &  resource_url,
GoogleString *  mapped_domain_name,
GoogleUrl *  resolved_request,
MessageHandler *  handler 
) const

Determines whether a resource can be rewritten, and returns the domain that it should be written to. The domain and the path of the resolved request are both considered: first just the domain, then the domain plus the root of the path, and so on down the path until a match is found or the path is exhausted. This is done because a mapping can target a domain plus a path, and the previous behavior of 'working' when a mapped-to domain was provided needs to be retained. If the resource_url is relative (has no domain) then the resource can always be written, and will share the domain of the original request.

The resource_url is considered relative to original_request. Resources in the same domain as the original request can generally always be rewritten.

Note: The mapped domain name will not incorporate any sharding. This is handled by ShardDomain().

The returned mapped_domain_name will always end with a slash on success. The returned resolved_request incorporates rewrite-domain mapping and the original URL.

Returns false on failure.

This is used both for domain authorization and domain rewriting, but not domain sharding.

See also IsDomainAuthorized, which can be used to determine domain authorization without performing a mapping.
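The "domain first, then progressively longer paths" search described above can be sketched as follows. `CandidatePrefixes` and `FindFirstMatch` are hypothetical helpers, and the set of known prefixes stands in for whatever domain table the class keeps internally; the origin and path are passed separately to avoid URL parsing in the example.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Lists lookup candidates from shortest to longest: the bare origin,
// then the origin plus each successive directory of the path.
// 'origin' has no trailing slash; 'path' starts with '/'.
std::vector<std::string> CandidatePrefixes(const std::string& origin,
                                           const std::string& path) {
  std::vector<std::string> candidates;
  for (std::size_t i = 0; i < path.size(); ++i) {
    if (path[i] == '/') {
      candidates.push_back(origin + path.substr(0, i + 1));
    }
  }
  return candidates;
}

// Returns the first candidate present in 'known', stopping at the first
// match. Every candidate, and so every match, ends with a slash.
bool FindFirstMatch(const std::set<std::string>& known,
                    const std::string& origin, const std::string& path,
                    std::string* matched) {
  for (const std::string& candidate : CandidatePrefixes(origin, path)) {
    if (known.count(candidate) != 0) {
      *matched = candidate;
      return true;
    }
  }
  return false;
}
```

For the origin `http://example.com` and the path `/a/b/c.css`, the candidates in this sketch are `http://example.com/`, `http://example.com/a/`, and `http://example.com/a/b/`, tried in that order.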

void net_instaweb::DomainLawyer::Merge ( const DomainLawyer &  src)

Merges the domains declared in src into this. There are no exclusions, so this is really just aggregating the mappings and authorizations declared in both objects. When the same domain is mapped in both 'this' and 'src', 'src' wins.
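The "'src' wins" rule can be pictured with a toy merge over a plain map of from-domain to to-domain. `MergeMappings` is a hypothetical helper; the real class stores richer per-domain state than a single string.

```cpp
#include <map>
#include <string>

// Toy merge: every entry of 'src' is copied into 'dst', and an entry
// already present in 'dst' under the same key is overwritten.
void MergeMappings(const std::map<std::string, std::string>& src,
                   std::map<std::string, std::string>* dst) {
  for (const auto& entry : src) {
    (*dst)[entry.first] = entry.second;
  }
}
```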

void net_instaweb::DomainLawyer::set_proxy_suffix ( const GoogleString &  suffix)
inline

A proxy suffix provides a mechanism to implement a reverse proxy of sorts. With a suffix ".suffix.net", a site foo.com can be served by foo.com.suffix.net, and the system, when set up as a proxy, will know how to strip the ".suffix.net" when fetching from origin. It will also know how to re-insert the suffix when rewriting hyperlinks to try to keep users in the proxied domain as they navigate within the site.

As of Oct 1, 2014, resource-mapping is not supported by proxy_suffix, but it doesn't need to be. For example, given a reference on example.com to 'example.com/styles.css', such a reference would not be remapped when serving HTML from example.com.suffix.net. Relative references to 'styles.css' would be absolutified by the browser to example.com.suffix.net/styles.css, and served by the proxy, which would strip the '.suffix.net' and fetch the origin content from example.com/styles.css.
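The host-level effect of a proxy suffix can be sketched with two small string helpers. `StripHostSuffix` and `AddHostSuffix` are hypothetical names for this illustration and operate on bare host names only; the class's own StripProxySuffix and AddProxySuffix work on full URLs.

```cpp
#include <string>

// If 'host' ends with 'suffix' and something precedes it, writes the
// origin host into *origin_host and returns true; otherwise returns false.
bool StripHostSuffix(const std::string& host, const std::string& suffix,
                     std::string* origin_host) {
  if (suffix.empty() || host.size() <= suffix.size()) return false;
  const std::size_t cut = host.size() - suffix.size();
  if (host.compare(cut, suffix.size(), suffix) != 0) return false;
  *origin_host = host.substr(0, cut);
  return true;
}

// Re-inserts the suffix, as would be needed when rewriting hyperlinks.
std::string AddHostSuffix(const std::string& origin_host,
                          const std::string& suffix) {
  return origin_host + suffix;
}
```

With the suffix `.suffix.net`, the proxied host `foo.com.suffix.net` strips to `foo.com` in this sketch, and adding the suffix back to `foo.com` restores the proxied host.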

Todo:
TODO(jmarantz): In the future we will likely want to map absolutely referenced resources from the origin domain to .suffix.net so we can optimize them. This can be implemented by integrating the proxy_suffix into MapRewriteDomain and MapOriginDomain, as a variation on MapProxyDomain.

bool net_instaweb::DomainLawyer::ShardDomain ( const StringPiece &  domain_name,
uint32  hash,
GoogleString *  sharded_domain 
) const

Computes a domain shard based on a passed-in hash, returning true if the domain was sharded. Output argument 'sharded_domain' is only updated when the return value is true.

The hash is an explicit uint32 so that we get the same shard for a resource, whether the server is 32-bit or 64-bit. If we have 5 shards and used size_t for hashes, then we'd wind up with different shards on 32-bit and 64-bit machines and that would reduce cacheability of the sharded resources.
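The 32-bit versus 64-bit concern can be made concrete with a small sketch. It assumes the shard is chosen as the hash modulo the number of shards, which is an assumption of this example rather than something stated above; `ShardIndex32` and `ShardIndex64` are hypothetical helpers.

```cpp
#include <cstdint>

// Shard index from a hash that is explicitly 32 bits wide.
// 'num_shards' must be non-zero.
std::uint32_t ShardIndex32(std::uint32_t hash, std::uint32_t num_shards) {
  return hash % num_shards;
}

// The same computation on a full 64-bit hash, as a 64-bit size_t would
// give. 'num_shards' must be non-zero.
std::uint32_t ShardIndex64(std::uint64_t hash, std::uint32_t num_shards) {
  return static_cast<std::uint32_t>(hash % num_shards);
}
```

With the 64-bit value 0x100000003 and 5 shards, the full-width computation selects shard 4, while the same value truncated to 32 bits (3) selects shard 3. Fixing the hash width at 32 bits removes that discrepancy between machines.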

GoogleString net_instaweb::DomainLawyer::Signature ( ) const

Computes a signature for the DomainLawyer object, including contained classes (Domain).

bool net_instaweb::DomainLawyer::StripProxySuffix ( const GoogleUrl &  gurl,
GoogleString *  url,
GoogleString *  host 
) const

Writes *url after stripping the proxy suffix from gurl, returning false if the gurl does not have a Host with the expected suffix.

Writes the origin host into *host.

GoogleString net_instaweb::DomainLawyer::ToString ( StringPiece  line_prefix) const

Computes a string representation meant for debugging purposes only. (The format might change in unpredictable ways and is not meant for machine consumption). Each domain will appear on a separate line, and each line will be prefixed with 'line_prefix'.

bool net_instaweb::DomainLawyer::WillDomainChange ( const GoogleUrl &  url) const

Determines whether a resource is going to change domains due to RewriteDomain mapping or domain sharding. Note that this does not account for the actual domain shard selected.

The entire URL should be passed in, not just the domain name.

Note that this is currently oblivious to proxy_suffix, whereas can_rewrite_domains() takes proxy_suffix into account.

