oauth_dropins.webutil¶
util¶
Misc utilities.
-
class
oauth_dropins.webutil.util.
CacheDict
[source]¶ Bases:
dict
A dict that also implements memcache’s get_multi() and set_multi() methods.
Useful as a simple in-memory replacement for App Engine’s memcache API, e.g. for get_activities_response() in snarfed/activitystreams-unofficial.
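A minimal sketch of the idea, assuming memcache-style semantics where get_multi() returns only the keys that are present and set_multi() stores every item of a mapping. The name `CacheDictSketch` is hypothetical; this is not the library's code.

```python
class CacheDictSketch(dict):
    """Hypothetical stand-in for CacheDict; not the library's implementation."""

    def get_multi(self, keys):
        # Return only the requested keys that are actually cached.
        return {k: self[k] for k in keys if k in self}

    def set_multi(self, mapping):
        # Store every key/value pair from the mapping.
        self.update(mapping)


cache = CacheDictSketch()
cache.set_multi({'a': 1, 'b': 2})
hits = cache.get_multi(['a', 'c'])  # 'c' was never stored
```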
-
class
oauth_dropins.webutil.util.
FileLimiter
(file_obj, read_limit)[source]¶ Bases:
object
A file object wrapper that reads up to a limit and then reports EOF.
From http://stackoverflow.com/a/29838711/186123 . Thanks SO!
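A rough sketch of such a wrapper, assuming only a read() method is needed and that the limit is counted in bytes. `FileLimiterSketch` is a hypothetical name and may differ from the real class in detail.

```python
import io


class FileLimiterSketch:
    """Hypothetical wrapper: reads at most read_limit bytes, then reports EOF."""

    def __init__(self, file_obj, read_limit):
        self.file_obj = file_obj
        self.remaining = read_limit

    def read(self, size=-1):
        if self.remaining <= 0:
            return b''  # limit reached: behave as if at EOF
        if size is None or size < 0 or size > self.remaining:
            size = self.remaining
        data = self.file_obj.read(size)
        self.remaining -= len(data)
        return data


limited = FileLimiterSketch(io.BytesIO(b'0123456789'), read_limit=4)
first = limited.read()
second = limited.read()
```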
-
class
oauth_dropins.webutil.util.
SimpleTzinfo
[source]¶ Bases:
datetime.tzinfo
A simple, DST-unaware tzinfo subclass.
-
offset
= datetime.timedelta(0)¶
-
-
class
oauth_dropins.webutil.util.
Struct
(**kwargs)[source]¶ Bases:
object
A generic class that initializes its attributes from constructor kwargs.
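For illustration, a class of this kind can be as small as the following sketch. `StructSketch` is a hypothetical name, not the library's class.

```python
class StructSketch:
    """Hypothetical equivalent of Struct."""

    def __init__(self, **kwargs):
        # Each keyword argument becomes an instance attribute.
        vars(self).update(kwargs)


point = StructSketch(x=1, y=2)
```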
-
class
oauth_dropins.webutil.util.
UrlCanonicalizer
(scheme='https', domain=None, subdomain=None, approve=None, reject=None, query=False, fragment=False, trailing_slash=False, redirects=True, headers=None)[source]¶ Bases:
object
Converts URLs to their canonical form.
If an input URL matches approve or reject, it’s automatically approved or rejected, respectively, as is, without following redirects.
If we HEAD the URL to follow redirects and it returns 4xx or 5xx, we return None.
-
oauth_dropins.webutil.util.
add_query_params
(url, params)[source]¶ Adds new query parameters to a URL. Encodes as UTF-8 and URL-safe.
Parameters: - url – string URL or urllib2.Request. May already have query parameters.
- params – dict or list of (string key, string value) tuples. Keys may repeat.
Returns: string URL
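The behavior described above can be approximated with the Python 3 standard library. This sketch handles string URLs only (not request objects), and `add_query_params_sketch` is a hypothetical name rather than the library function.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_query_params_sketch(url, params):
    """Appends params to url's query string, keeping existing parameters."""
    if isinstance(params, dict):
        params = list(params.items())
    parsed = urlparse(url)
    # keep_blank_values so that existing 'x=' style parameters survive.
    query = parse_qsl(parsed.query, keep_blank_values=True) + list(params)
    # urlencode percent-encodes keys and values as UTF-8 by default.
    return urlunparse(parsed._replace(query=urlencode(query)))


result = add_query_params_sketch(
    'http://example.com/?a=1', [('b', 'x y'), ('a', '2')])
```

Passing a list of tuples rather than a dict is what allows a key such as `a` to repeat.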
-
oauth_dropins.webutil.util.
as_utc
(input)[source]¶ Converts a timezone-aware datetime to a naive UTC datetime.
If input is timezone-naive, it’s returned as is.
Doesn’t support DST!
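One way to express this conversion, as a sketch: subtract the UTC offset and drop the tzinfo. `as_utc_sketch` is a hypothetical name, and the fixed-offset timezone in the example is an assumption chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def as_utc_sketch(dt):
    """Returns a naive UTC datetime; naive inputs are returned unchanged."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        return dt
    # Shift to UTC, then discard the timezone information.
    return (dt - dt.utcoffset()).replace(tzinfo=None)


aware = datetime(2012, 7, 23, 5, 54, 49, tzinfo=timezone(timedelta(hours=-7)))
naive_utc = as_utc_sketch(aware)
```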
-
oauth_dropins.webutil.util.
base_url
(url)[source]¶ Returns the base of a given URL.
For example, returns ‘http://site/posts/’ for ‘http://site/posts/123’.
Parameters: url – string
-
oauth_dropins.webutil.util.
clean_url
(url)[source]¶ Removes transient query params (e.g. utm_*) from a URL.
The utm_* (Urchin Tracking Module) params come from Google Analytics. https://support.google.com/analytics/answer/1033867
The source=rss-… params are on all links in Medium’s RSS feeds.
Parameters: url – string Returns: string, the cleaned url, or None if it can’t be parsed
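A sketch of this kind of cleanup, assuming that "transient" means exactly the two cases named above (utm_* keys and source=rss-… values). `clean_url_sketch` is a hypothetical name; note that re-encoding the query may alter the escaping of the remaining parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def clean_url_sketch(url):
    """Drops utm_* and source=rss-... parameters from url's query string."""
    try:
        parsed = urlparse(url)
    except ValueError:
        return None  # mirrors the documented "None if it can't be parsed"
    kept = [(k, v) for k, v in parse_qsl(parsed.query, keep_blank_values=True)
            if not k.startswith('utm_')
            and not (k == 'source' and v.startswith('rss-'))]
    return urlunparse(parsed._replace(query=urlencode(kept)))


cleaned = clean_url_sketch(
    'http://example.com/post?id=7&utm_source=feed&source=rss-abc')
```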
-
oauth_dropins.webutil.util.
dedupe_urls
(urls)[source]¶ Normalizes and de-dupes http(s) URLs.
Converts domain to lower case, adds trailing slash when path is empty, and ignores scheme (http vs https), preferring https. Preserves order.
Domains are case insensitive, even modern domains with Unicode/punycode characters:
http://unicode.org/faq/idn.html#6 https://tools.ietf.org/html/rfc4343#section-5
As examples, http://foo/ and https://FOO are considered duplicates, but http://foo/bar and http://foo/bar/ aren’t.
Background: https://en.wikipedia.org/wiki/URL_normalization
TODO: port to https://pypi.python.org/pypi/urlnorm
Parameters: urls – sequence of string URLs Returns: sequence of string URLs
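The normalization rules above can be sketched as follows. This is an illustration under the stated rules only, not the library's code; `dedupe_urls_sketch` is a hypothetical name.

```python
from urllib.parse import urlparse, urlunparse


def dedupe_urls_sketch(urls):
    """Normalizes http(s) URLs and removes duplicates, preferring https."""
    result = []   # normalized URLs, in first-seen order
    index = {}    # scheme-less key -> position in result
    for url in urls:
        parsed = urlparse(url)
        # Lowercase the domain and give an empty path a trailing slash.
        parsed = parsed._replace(netloc=parsed.netloc.lower(),
                                 path=parsed.path or '/')
        key = parsed._replace(scheme='')
        normalized = urlunparse(parsed)
        if key not in index:
            index[key] = len(result)
            result.append(normalized)
        elif parsed.scheme == 'https':
            # Same URL seen earlier; upgrade the stored copy to https.
            result[index[key]] = normalized
    return result


deduped = dedupe_urls_sketch(
    ['http://foo/', 'https://FOO', 'http://foo/bar', 'http://foo/bar/'])
```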
-
oauth_dropins.webutil.util.
domain_from_link
(url)[source]¶ Extracts and returns the meaningful domain from a URL.
Strips www., mobile., and m. from the beginning of the domain.
Parameters: url – string Returns: string
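As a sketch, the prefix stripping might look like this, assuming the input always has a scheme so that the standard parser can find the host. `domain_from_link_sketch` is a hypothetical name.

```python
from urllib.parse import urlparse


def domain_from_link_sketch(url):
    """Returns the host of url without a leading www., mobile., or m."""
    domain = urlparse(url).netloc.lower()
    for prefix in ('www.', 'mobile.', 'm.'):
        if domain.startswith(prefix):
            return domain[len(prefix):]
    return domain
```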
-
oauth_dropins.webutil.util.
domain_or_parent_in
(input, domains)[source]¶ Returns True if an input domain or its parent is in a set of domains.
Examples:
foo, [] => False
foo, [foo] => True
foo.bar.com, [bar.com] => True
foo.bar.com, [.bar.com] => True
foo.bar.com, [fux.bar.com] => False
bar.com, [fux.bar.com] => False
Parameters: - input – string domain
- domains – sequence of string domains
Returns: boolean
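The examples above are consistent with a suffix walk over the domain's labels. A sketch, assuming a leading dot in an entry is insignificant; `domain_or_parent_in_sketch` is a hypothetical name.

```python
def domain_or_parent_in_sketch(domain, domains):
    """True if domain, or any parent domain of it, appears in domains."""
    # Treat '.bar.com' the same as 'bar.com'.
    allowed = {d.lstrip('.') for d in domains}
    parts = domain.split('.')
    # Check 'foo.bar.com', then 'bar.com', then 'com'.
    return any('.'.join(parts[i:]) in allowed for i in range(len(parts)))
```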
-
oauth_dropins.webutil.util.
ellipsize
(str, words=14, chars=140)[source]¶ Truncates and ellipsizes str if it’s longer than words or chars.
Words are simply tokenized on whitespace, nothing smart.
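A sketch of whitespace-based truncation along these lines. The choice of a three-dot '...' suffix and the exact cut point are assumptions, and `ellipsize_sketch` is a hypothetical name.

```python
def ellipsize_sketch(text, words=14, chars=140):
    """Truncates text to at most `words` words or `chars` characters."""
    tokens = text.split()  # plain whitespace tokenization
    truncated = len(tokens) > words
    if truncated:
        text = ' '.join(tokens[:words])
    if len(text) > chars:
        # Leave room for the three-character ellipsis.
        text = text[:chars - 3]
        truncated = True
    return text + '...' if truncated else text
```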
-
oauth_dropins.webutil.util.
extract_links
(text)[source]¶ Returns a list of unique string URLs in the given text.
URLs in the returned list are in the order they first appear in the text.
-
oauth_dropins.webutil.util.
follow_redirects
(url, cache=None, fail_cache_time_secs=86400, **kwargs)[source]¶ Fetches a URL with HEAD, repeating if necessary to follow redirects.
Does not raise an exception if any of the HTTP requests fail, just returns the failed response. If you care, be sure to check the returned response’s status code!
Parameters: - url – string
- cache – optional, a cache object to read and write resolved URLs to. Must have get(key) and set(key, value, time=…) methods. Stores ‘R [original URL]’ in key, final URL in value.
- kwargs – passed to requests.head()
Returns: the requests.Response for the final request. The url attribute has the final URL.
-
oauth_dropins.webutil.util.
fragmentless
(url)[source]¶ Strips the fragment (e.g. ‘#foo’) from a URL.
Parameters: url – string Returns: string URL
-
oauth_dropins.webutil.util.
generate_secret
()[source]¶ Generates a URL-safe random secret string.
Uses App Engine’s os.urandom(), which is designed to be cryptographically secure: http://code.google.com/p/googleappengine/issues/detail?id=1055
Parameters: bytes – integer, length of string to generate Returns: random string
-
oauth_dropins.webutil.util.
get_first
(dict, key, default=None)[source]¶ Returns the first element of a dict value.
If the value is a list or tuple, returns the first value. If it’s something else, returns the value itself. If the key doesn’t exist, returns default.
-
oauth_dropins.webutil.util.
get_list
(dict, key)[source]¶ Returns a value from a dict as a list.
If the value is a list or tuple, it’s converted to a list. If it’s something else, it’s returned as a single-element list. If the key doesn’t exist, returns [].
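get_first() and get_list() are closely related, so a single sketch covers both. The names with a `_sketch` suffix are hypothetical, and returning the default for an empty list is an assumption.

```python
def get_list_sketch(mapping, key):
    """Returns mapping[key] as a list, or [] if the key is missing."""
    if key not in mapping:
        return []
    value = mapping[key]
    return list(value) if isinstance(value, (list, tuple)) else [value]


def get_first_sketch(mapping, key, default=None):
    """Returns the first element of mapping[key], or default if missing."""
    values = get_list_sketch(mapping, key)
    return values[0] if values else default
```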
-
oauth_dropins.webutil.util.
if_changed
(cache, updates, key, value)[source]¶ Returns a value if it’s different from the cached value, otherwise None.
Values that evaluate to False are considered equivalent to None, in order to save cache space.
If the values differ, updates[key] is set to value. You can use this to collect changes that should be made to the cache in batch. None values in updates mean that the corresponding key should be deleted.
Parameters: - cache – any object with a get(key) method
- updates – mapping (e.g. dict)
- key – anything supported by cache
- value – anything supported by cache
Returns: value or None
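The contract above, sketched with a plain dict standing in for the cache. `if_changed_sketch` is a hypothetical name, not the library function.

```python
def if_changed_sketch(cache, updates, key, value):
    """Returns value if it differs from cache's copy, else None."""
    value = value or None            # falsy values are treated as None
    cached = cache.get(key) or None
    if value == cached:
        return None
    updates[key] = value             # None here means "delete this key"
    return value


cache = {'a': 1, 'b': 2}
updates = {}
same = if_changed_sketch(cache, updates, 'a', 1)      # unchanged
changed = if_changed_sketch(cache, updates, 'b', 3)   # new value
cleared = if_changed_sketch(cache, updates, 'a', 0)   # falsy: marks deletion
```

After these calls, `updates` holds the batch of changes to apply to the cache in one go.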
-
oauth_dropins.webutil.util.
interpret_http_exception
(exception)[source]¶ Extracts the status code and response from different HTTP exception types.
Parameters: - exception – one of:
- apiclient.errors.HttpError (*) –
- exc.WSGIHTTPException (*) –
- gdata.client.RequestError (*) –
- oauth2client.client.AccessTokenRefreshError (*) –
- requests.HTTPError (*) –
- urllib2.HTTPError (*) –
- urllib2.URLError (*) –
Returns: (string status code or None, string response body or None)
-
oauth_dropins.webutil.util.
is_base64
(arg)[source]¶ Returns True if arg is a base64 encoded string, False otherwise.
-
oauth_dropins.webutil.util.
is_connection_failure
(exception)[source]¶ Returns True if the given exception is a network connection failure.
…False otherwise.
-
oauth_dropins.webutil.util.
is_float
(arg)[source]¶ Returns True if arg can be converted to a float, False otherwise.
-
oauth_dropins.webutil.util.
is_int
(arg)[source]¶ Returns True if arg can be converted to an integer, False otherwise.
-
oauth_dropins.webutil.util.
linkify
(text, pretty=False, skip_bare_cc_tlds=False, **kwargs)[source]¶ Adds HTML links to URLs in the given plain text.
For example:
linkify('Hello http://tornadoweb.org!')
would return 'Hello <a href="http://tornadoweb.org">http://tornadoweb.org</a>!'
Ignores URLs that are inside HTML links, i.e. anchor tags that look like <a href="…">.
Parameters: - text – string, input
- pretty – if True, uses pretty_link() for link text
Returns: string, linkified input
-
oauth_dropins.webutil.util.
load_file_lines
(file)[source]¶ Reads lines from a file and returns them as a set.
Leading and trailing whitespace is trimmed. Blank lines and lines beginning with # (ie comments) are ignored.
Parameters: file – a file object or other iterable that returns lines Returns: set of strings
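A sketch of the filtering described above, accepting any iterable of lines. `load_file_lines_sketch` is a hypothetical name.

```python
def load_file_lines_sketch(lines):
    """Returns the set of non-blank, non-comment lines, stripped."""
    stripped = (line.strip() for line in lines)
    return {line for line in stripped if line and not line.startswith('#')}


loaded = load_file_lines_sketch(
    ['  foo \n', '\n', '# comment\n', 'bar\n', 'foo\n'])
```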
-
oauth_dropins.webutil.util.
maybe_iso8601_to_rfc3339
(input)[source]¶ Tries to convert an ISO 8601 date/time string to RFC 3339.
The formats are similar, but not identical, e.g. RFC 3339 requires a colon in the timezone offset at the end (+00:00 instead of +0000), but ISO 8601 doesn’t.
If the input can’t be parsed as ISO 8601, it’s silently returned, unchanged!
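To illustrate the offset difference, here is a sketch that accepts a single layout (date, 'T', time, numeric offset) and falls back to returning the input unchanged. It relies on Python 3.7+ strptime, whose %z accepts offsets with or without a colon; `maybe_iso8601_to_rfc3339_sketch` is a hypothetical name and covers far fewer formats than a real ISO 8601 parser.

```python
from datetime import datetime


def maybe_iso8601_to_rfc3339_sketch(value):
    """Rewrites e.g. a +0000 offset as +00:00; returns other input as is."""
    try:
        parsed = datetime.strptime(value, '%Y-%m-%dT%H:%M:%S%z')
    except (TypeError, ValueError):
        return value  # not parseable: hand the input back unchanged
    # isoformat() writes the offset with a colon, as RFC 3339 expects.
    return parsed.isoformat()
```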
-
oauth_dropins.webutil.util.
maybe_timestamp_to_rfc3339
(input)[source]¶ Tries to convert a string or int UNIX timestamp to RFC 3339.
-
oauth_dropins.webutil.util.
parse_acct_uri
(uri, hosts=None)[source]¶ Parses acct: URIs of the form acct:user@example.com .
Background: http://hueniverse.com/2009/08/making-the-case-for-a-new-acct-uri-scheme/
Parameters: - uri – string
- hosts – sequence of allowed hosts (usually domains). None means allow all.
Returns: (username, host) tuple
Raises: ValueError if the uri is invalid or the host isn’t allowed.
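A sketch of the parsing and host check described above. `parse_acct_uri_sketch` is a hypothetical name, and the error messages are illustrative.

```python
def parse_acct_uri_sketch(uri, hosts=None):
    """Returns (username, host) for an acct:user@host URI."""
    prefix = 'acct:'
    if not uri.startswith(prefix):
        raise ValueError('not an acct: URI: %r' % uri)
    username, sep, host = uri[len(prefix):].partition('@')
    if not (username and sep and host):
        raise ValueError('expected acct:user@host, got %r' % uri)
    # hosts=None means every host is allowed.
    if hosts is not None and host not in hosts:
        raise ValueError('host %r is not allowed' % host)
    return username, host
```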
-
oauth_dropins.webutil.util.
parse_iso8601
(str)[source]¶ Parses an ISO 8601 or RFC 3339 date/time string and returns a datetime.
Time zone designator is optional. If present, the returned datetime will be time zone aware.
Parameters: str – string ISO 8601 or RFC 3339, e.g. ‘2012-07-23T05:54:49+00:00’ Returns: datetime
-
oauth_dropins.webutil.util.
parse_tag_uri
(uri)[source]¶ Returns the domain and name in a tag URI string.
Inverse of tag_uri().
Returns: (string domain, string name) tuple, or None if the tag URI couldn’t be parsed
-
oauth_dropins.webutil.util.
pretty_link
(url, text=None, keep_host=True, glyphicon=None, attrs=None, new_tab=False, max_length=None)[source]¶ Renders a pretty, short HTML link to a URL.
If text is not provided, the link text is the URL without the leading http(s)://[www.], ellipsized at the end if necessary. URL escape characters and UTF-8 are decoded.
The default maximum length follows Twitter’s rules: full domain plus 15 characters of path (including leading slash).
* https://dev.twitter.com/docs/tco-link-wrapper/faq
* https://dev.twitter.com/docs/counting-characters
Parameters: - url – string
- text – string, optional
- keep_host – if False, remove the host from the link text
- glyphicon – string glyphicon to render after the link text, if provided. Details: http://glyphicons.com/
- attrs – dict of attributes => values to include in the a tag. optional
- new_tab – boolean, include target=”_blank” if True
- max_length – int, max link text length in characters. ellipsized beyond this.
Returns: unicode string HTML snippet with <a> tag
-
oauth_dropins.webutil.util.
requests_fn
(fn)[source]¶ Wraps requests.* and logs the HTTP method and URL.
-
oauth_dropins.webutil.util.
requests_get
(url, *args, **kwargs)¶
-
oauth_dropins.webutil.util.
requests_head
(url, *args, **kwargs)¶
-
oauth_dropins.webutil.util.
requests_post
(url, *args, **kwargs)¶
-
oauth_dropins.webutil.util.
schemeless
(url, slashes=True)[source]¶ Strips the scheme (e.g. ‘https:’) from a URL.
Parameters: - url – string
- slashes – if False, also strips leading slashes and trailing slash, e.g. ‘http://example.com/’ becomes ‘example.com’
Returns: string URL
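A sketch of the two modes using the standard library, where the `slashes` argument mirrors the signature above. `schemeless_sketch` is a hypothetical name and the real function may treat edge cases differently.

```python
from urllib.parse import urlparse, urlunparse


def schemeless_sketch(url, slashes=True):
    """Strips the scheme, and optionally the surrounding slashes too."""
    # With an empty scheme, urlunparse emits '//host/path'.
    url = urlunparse(urlparse(url)._replace(scheme=''))
    if not slashes:
        url = url.strip('/')
    return url
```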
-
oauth_dropins.webutil.util.
tag_uri
(domain, name, year=None)[source]¶ Returns a tag URI string for the given domain and name.
Example return value: ‘tag:twitter.com,2012:snarfed_org/172417043893731329’
Background on tag URIs: http://taguri.org/
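Based on the example return value, building a tag URI and parsing it back can be sketched like this. Both function names are hypothetical, and omitting the ',year' part when year is None is an assumption.

```python
import re


def tag_uri_sketch(domain, name, year=None):
    """Builds 'tag:domain,year:name', leaving out ',year' if year is None."""
    year_part = ',%s' % year if year is not None else ''
    return 'tag:%s%s:%s' % (domain, year_part, name)


def parse_tag_uri_sketch(uri):
    """Returns (domain, name), or None if uri isn't a tag URI."""
    match = re.match(r'tag:([^,:]+)(?:,\d+)?:(.+)$', uri)
    return match.groups() if match else None


uri = tag_uri_sketch('twitter.com', 'snarfed_org/172417043893731329', 2012)
```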
-
oauth_dropins.webutil.util.
to_utc_timestamp
(input)[source]¶ Converts a datetime to a float POSIX timestamp (seconds since epoch).
-
oauth_dropins.webutil.util.
to_xml
(value)[source]¶ Renders a dict (usually from JSON) as an XML snippet.
-
oauth_dropins.webutil.util.
tokenize_links
(text, skip_bare_cc_tlds=False)[source]¶ Splits text into link and non-link text.
Parameters: - text – string to linkify
- skip_bare_cc_tlds – boolean, whether to skip links of the form [domain].[2-letter TLD] with no scheme and no path
Returns: a tuple containing two lists of strings, a list of links and list of non-link text. Roughly equivalent to the output of re.findall and re.split, with some post-processing.
-
oauth_dropins.webutil.util.
trim_nulls
(value)[source]¶ Recursively removes dict and list elements with None or empty values.
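A recursive sketch, assuming "empty" means None, an empty string, or an empty dict or list, and that values such as 0 and False are kept. `trim_nulls_sketch` is a hypothetical name.

```python
def trim_nulls_sketch(value):
    """Recursively drops None and empty elements from dicts and lists."""
    empty = (None, {}, [], '')
    if isinstance(value, dict):
        trimmed = {k: trim_nulls_sketch(v) for k, v in value.items()}
        return {k: v for k, v in trimmed.items() if v not in empty}
    if isinstance(value, list):
        trimmed = [trim_nulls_sketch(v) for v in value]
        return [v for v in trimmed if v not in empty]
    return value


trimmed = trim_nulls_sketch(
    {'a': None, 'b': {'c': []}, 'd': [1, None, {}], 'e': 0})
```

Children are trimmed before their parent is checked, so a dict that only held empty values is removed as well.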
-
oauth_dropins.webutil.util.
uniquify
(input)[source]¶ Returns a list with duplicate items removed.
Like list(set(…)), but preserves order.
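An order-preserving de-duplication can be sketched as below, assuming the items are hashable. `uniquify_sketch` is a hypothetical name.

```python
def uniquify_sketch(items):
    """Returns items without duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```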
handlers¶
Request handler utility classes.
Includes classes for serving templates with common variables and XRD[S] and JRD files like host-meta and friends.
-
class
oauth_dropins.webutil.handlers.
HostMetaHandler
(*args, **kwargs)[source]¶ Bases:
oauth_dropins.webutil.handlers.XrdOrJrdHandler
Renders and serves the /.well-known/host-meta file.
-
class
oauth_dropins.webutil.handlers.
HostMetaXrdsHandler
(*args, **kwargs)[source]¶ Bases:
oauth_dropins.webutil.handlers.TemplateHandler
Renders and serves the /.well-known/host-meta.xrds XRDS-Simple file.
-
class
oauth_dropins.webutil.handlers.
ModernHandler
(*args, **kwargs)[source]¶ Bases:
webapp2.RequestHandler
Base handler that adds modern open/secure headers like CORS, HSTS, etc.
-
class
oauth_dropins.webutil.handlers.
TemplateHandler
(*args, **kwargs)[source]¶ Bases:
oauth_dropins.webutil.handlers.ModernHandler
Renders and serves a template based on class attributes.
Subclasses must override template_file() and may also override template_vars() and content_type().
-
USE_APPENGINE_WEBAPP
= False¶
-
-
class
oauth_dropins.webutil.handlers.
XrdOrJrdHandler
(*args, **kwargs)[source]¶ Bases:
oauth_dropins.webutil.handlers.TemplateHandler
Renders and serves an XRD or JRD file.
JRD is served if the request path ends in .json, or the query parameters include ‘format=json’, or the request headers include ‘Accept: application/json’.
Subclasses must override template_prefix().
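The three JRD conditions above amount to a simple predicate. This sketch takes the path, query parameters, and headers as plain values instead of a webapp2 request; `wants_jrd_sketch` and its arguments are hypothetical, and real header lookups would be case-insensitive.

```python
def wants_jrd_sketch(path, query_params, headers):
    """True if the request asks for JRD (JSON) rather than XRD."""
    return (path.endswith('.json')
            or query_params.get('format') == 'json'
            or 'application/json' in headers.get('Accept', ''))
```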
-
oauth_dropins.webutil.handlers.
handle_exception
(self, e, debug)[source]¶ A webapp2 exception handler that propagates HTTP exceptions into the response.
Use this as a webapp2.RequestHandler handle_exception() method by adding this line to your handler class definition:
handle_exception = handlers.handle_exception
I originally tried to put this in a RequestHandler subclass, but it gave me this exception:
File ".../webapp2-2.5.1/webapp2_extras/local.py", line 136, in _get_current_object
  raise RuntimeError('no object bound to %s' % self.__name__)
RuntimeError: no object bound to app
These are probably related:
* http://eemyop.blogspot.com/2013/05/digging-around-in-webapp2-finding-out.html
* http://code.google.com/p/webapp-improved/source/detail?r=d962ac4625ce3c43a3e59fd7fc07daf8d7b7c46a
models¶
App Engine datastore model base classes and utilities.
-
class
oauth_dropins.webutil.models.
KeyNameModel
(*args, **kwargs)[source]¶ Bases:
google.appengine.ext.db.Model
A db model class that requires a key name.
-
class
oauth_dropins.webutil.models.
SingleEGModel
(*args, **kwargs)[source]¶ Bases:
google.appengine.ext.db.Model
A model class that stores all entities in a single entity group.
All entities use the same parent key (below), and all() automatically adds it as an ancestor. That allows, among other things, fetching all entities of this kind with strong consistency.
-
classmethod
all
()[source]¶ Returns a query over all instances of this model from the datastore.
Returns: Query that will retrieve all instances from entity collection.
-
enforce_parent
()[source]¶ Sets the parent keyword arg. If it’s already set, checks that it’s correct.
-
classmethod
get_by_id
(*args, **kwargs)[source]¶ Get instance of Model class by id.
Parameters: - key_names – A single id or a list of ids.
- parent – Parent of instances to get. Can be a model or key.
- config – datastore_rpc.Configuration to use for this request.
-
classmethod
get_by_key_name
(*args, **kwargs)[source]¶ Get instance of Model class by its key’s name.
Parameters: - key_names – A single key-name or a list of key-names.
- parent – Parent of instances to get. Can be a model or key.
- config – datastore_rpc.Configuration to use for this request.
-
classmethod
get_or_insert
(*args, **kwargs)[source]¶ Transactionally retrieve or create an instance of Model class.
This acts much like the Python dictionary setdefault() method, where we first try to retrieve a Model instance with the given key name and parent. If it’s not present, then we create a new instance (using the **kwds supplied) and insert that with the supplied key name.
Subsequent calls to this method with the same key_name and parent will always yield the same entity (though not the same actual object instance), regardless of the **kwds supplied. If the specified entity has somehow been deleted separately, then the next call will create a new entity and return it.
If the ‘parent’ keyword argument is supplied, it must be a Model instance. It will be used as the parent of the new instance of this Model class if one is created.
This method is especially useful for having just one unique entity for a specific identifier. Insertion/retrieval is done transactionally, which guarantees uniqueness.
Example usage:
from google.appengine.ext import db

class WikiTopic(db.Model):
  creation_date = db.DateTimeProperty(auto_now_add=True)
  body = db.TextProperty(required=True)

# The first time through we'll create the new topic.
wiki_word = 'CommonIdioms'
topic = WikiTopic.get_or_insert(wiki_word,
                                body='This topic is totally new!')
assert topic.key().name() == 'CommonIdioms'
assert topic.body == 'This topic is totally new!'

# The second time through will just retrieve the entity.
overwrite_topic = WikiTopic.get_or_insert(wiki_word,
                                          body='A totally different message!')
assert overwrite_topic.key().name() == 'CommonIdioms'
assert overwrite_topic.body == 'This topic is totally new!'
Parameters: - key_name – Key name to retrieve or create.
- **kwds – Keyword arguments to pass to the constructor of the model class if an instance for the specified key name does not already exist. If an instance with the supplied key_name and parent already exists, the rest of these arguments will be discarded.
Returns: Existing instance of Model class with the specified key_name and parent or a new one that has just been created.
Raises: TransactionFailedError if the specified Model instance could not be retrieved or created transactionally (due to high contention, etc).
Returns the shared parent key for this class.
It’s not actually an entity, just a placeholder key.
-
classmethod
testutil¶
Unit test utilities.
-
class
oauth_dropins.webutil.testutil.
HandlerTest
(methodName='runTest')[source]¶ Bases:
oauth_dropins.webutil.testutil.TestCase
Base test class for webapp2 request handlers.
Uses App Engine’s testbed to set up API stubs: http://code.google.com/appengine/docs/python/tools/localunittesting.html
-
application
¶
-
handler
¶
-
-
class
oauth_dropins.webutil.testutil.
TestCase
(methodName='runTest')[source]¶ Bases:
mox.MoxTestBase
Test case class with lots of extra helpers.
-
assert_entities_equal
(a, b, ignore=frozenset([]), keys_only=False, in_order=False)[source]¶ Asserts that a and b are equivalent entities or lists of entities.
…specifically, that they have the same property values, and if they both have populated keys, that their keys are equal too.
Parameters: - a, b – db.Model or ndb.Model instances or lists of instances
- ignore – sequence of strings, property names not to compare
- keys_only – boolean, if True only compare keys
- in_order – boolean. If False, all entities must have keys.
-
assert_equals
(expected, actual, msg=None, in_order=False)[source]¶ Pinpoints individual element differences in lists and dicts.
If in_order is False, ignores order in lists and tuples.
-
assert_multiline_equals
(expected, actual)[source]¶ Compares two multi-line strings and reports a diff style output.
Ignores leading and trailing whitespace on each line, and squeezes repeated blank lines down to just one.
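The whitespace handling described above could be implemented by normalizing both strings before comparing them. A sketch of such a normalizer follows; `normalize_multiline_sketch` is a hypothetical helper, not part of the documented API.

```python
import re


def normalize_multiline_sketch(text):
    """Strips each line and squeezes runs of blank lines down to one."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # Three or more newlines in a row means two or more blank lines.
    return re.sub(r'\n{3,}', '\n\n', '\n'.join(lines))
```

Two strings would then count as equal when their normalized forms match.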
-
assert_multiline_in
(expected, actual)[source]¶ Checks that a multi-line string is in another and reports a diff output.
Ignores leading and trailing whitespace on each line, and squeezes repeated blank lines down to just one.
-
expect_urlopen
(url, response=None, status=200, data=None, headers=None, response_headers={}, **kwargs)[source]¶ Stubs out urllib2.urlopen() and sets up an expected call.
If status isn’t 2xx, makes the expected call raise a urllib2.HTTPError instead of returning the response.
If data is set, url must be a urllib2.Request.
If response is unset, returns the expected call.
Parameters: - url – string, re.RegexObject or urllib2.Request or webob.Request
- response – string
- status – int, HTTP response code
- data – optional string POST body
- headers – optional expected request header dict
- response_headers – optional response header dict
- kwargs – other keyword args, e.g. timeout
-