rnginline API

This is the Python API reference for rnginline.

rnginline

rnginline.inline(source-arg[, optional-kwargs])

Load an XML document containing a RELAX NG schema, recursively loading and inlining any <include href="...">/<externalRef href="..."> elements to form a complete schema in a single XML document.

URLs in href attributes are dereferenced to obtain the RELAX NG schemas they point to using one or more URL Handlers. By default, handlers for file: and pydata: URLs are registered.

Keyword Arguments:
 
  • src – The source to load the schema from. Either an lxml.etree Element, a URL, filesystem path or file-like object
  • etree – Explicitly provide an lxml.etree Element as the source
  • url – Explicitly provide a URL as the source
  • path – Explicitly provide a filesystem path as the source
  • file – Explicitly provide a file-like object as the source
  • handlers – An iterable of UrlHandler objects which are asked, in turn, to fetch each href attribute’s URL. Defaults to rnginline.urlhandlers.file and rnginline.urlhandlers.pydata, in that order.
  • base_uri – A URI to override the base URI of the schema with. Useful when the source doesn’t have a sensible base URI, e.g. passing a file object like sys.stdin
  • postprocessors – An iterable of PostProcess objects which perform arbitrary transformations on the inlined XML before it’s returned or loaded as a schema. Defaults to the result of calling rnginline.postprocess.get_default_postprocessors()
  • create_validator – If True, a validator created via lxml.etree.RelaxNG() is returned instead of an lxml Element
  • default_base_uri – The root URI which all others are resolved against. Defaults to file:<current directory>, which allows relative file URLs such as 'external.rng' to be found relative to the current working directory.
  • inliner – The class to create the Inliner instance from. Defaults to rnginline.Inliner.
Returns:

An lxml.etree.RelaxNG validator created from the fully loaded and inlined XML, or the XML itself, depending on the create_validator argument.

Raises:

RelaxngInlineError – (or subclass) is raised if the schema can’t be loaded.
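
A minimal usage sketch built only from the keyword arguments documented above. The helper names and the idea of passing file paths in are illustrative assumptions, and the imports sit inside the functions only so that the snippet can be loaded without rnginline or lxml installed.

```python
def load_validator(schema_path):
    """Return an lxml RelaxNG validator for the schema at schema_path."""
    import rnginline  # third-party, imported lazily

    # path= names the source explicitly; create_validator=True asks for
    # a validator rather than the inlined XML.
    return rnginline.inline(path=schema_path, create_validator=True)


def load_inlined_xml(schema_path):
    """Return the inlined schema as XML instead of a validator."""
    import rnginline

    return rnginline.inline(path=schema_path, create_validator=False)


def is_valid(schema_path, document_path):
    """Validate the XML document at document_path against the schema."""
    from lxml import etree

    validator = load_validator(schema_path)
    # lxml validators expose validate(), which returns a bool.
    return validator.validate(etree.parse(document_path))
```

Calling is_valid('schema.rng', 'document.xml') with real files would load and inline the schema once, then validate the document against it.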

class rnginline.Inliner(handlers=None, postprocessors=None, default_base_uri=None)

Inliners merge references to external schemas into an input schema via their inline() method.

Typically you can ignore this class and just use rnginline.inline() which handles instantiating an Inliner and calling its inline() method.

__init__(handlers=None, postprocessors=None, default_base_uri=None)

Create an Inliner with the specified Handlers, PostProcessors and default base URI.

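Parameters:
  • handlers – An iterable of UrlHandler objects which are asked, in turn, to fetch each href attribute’s URL. Defaults to rnginline.urlhandlers.file and rnginline.urlhandlers.pydata, in that order.
  • postprocessors – An iterable of PostProcess objects which transform the inlined XML before it’s returned. Defaults to the result of calling rnginline.postprocess.get_default_postprocessors().
  • default_base_uri – The root URI which all others are resolved against. Defaults to file:<current directory>.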
inline([src, ]**kwargs)

Load an XML document containing a RELAX NG schema, recursively loading and inlining any <include>/<externalRef> elements to form a complete schema.

URLs in <include>/<externalRef> elements are resolved against the base URL of their containing document, and fetched using one of this Inliner’s URL Handlers.

Parameters:
  • src – The source to load the schema from. Either an lxml.etree Element, a URL, filesystem path or file-like object.
  • etree – Explicitly provide an lxml.etree Element as the source
  • url – Explicitly provide a URL as the source
  • path – Explicitly provide a filesystem path as the source
  • file – Explicitly provide a file-like object as the source
  • base_uri – A URI to override the base URI of the grammar with. Useful when the source doesn’t have a sensible base URI, e.g. passing sys.stdin as a file.
  • create_validator – If True, an lxml RelaxNG validator is created from the loaded XML document and returned. If False then the loaded XML is returned.
Returns:

An lxml.etree.RelaxNG validator created from the fully loaded and inlined XML, or the XML itself, depending on the create_validator argument.

Raises:

RelaxngInlineError – (or subclass) is raised if the schema can’t be loaded.
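
Resolving an href against the base URL of its containing document is ordinary relative-reference resolution. The sketch below illustrates the idea with the standard library's urllib.parse.urljoin. It is not rnginline's own resolver, and the schema URLs are made up for the example.

```python
from urllib.parse import urljoin


def resolve_href(base_uri, href):
    """Resolve an href relative to the URI of the document containing it."""
    return urljoin(base_uri, href)


base = "file:///schemas/main.rng"

# A relative href is resolved against the including document's location.
included = resolve_href(base, "parts/common.rng")

# An href inside the included file is resolved against *that* file's URI,
# not against the top-level schema.
nested = resolve_href(included, "../shared/types.rng")

# An href that already carries an absolute path replaces the base path.
absolute = resolve_href(base, "file:///elsewhere/other.rng")
```

This also shows why base_uri matters for sources such as sys.stdin: without a base, a relative href has nothing to be resolved against.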

rnginline.urlhandlers

This module contains the built-in URL Handlers provided by rnginline.

URL Handler objects are responsible for:

  • Saying if they can handle a URL — can_handle(url)
  • Fetching the data referenced by a URL — dereference(url)
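
The two responsibilities above amount to a small protocol. As a hedged illustration, here is a duck-typed handler for a made-up mem: scheme, plus a dispatcher that asks each handler in turn. Both InMemoryUrlHandler and fetch are hypothetical names. Whether rnginline requires subclassing its own UrlHandler base class is not stated here, and a real handler would raise the library's DereferenceError rather than LookupError.

```python
class InMemoryUrlHandler:
    """Hypothetical handler serving documents from a dict (mem: scheme)."""

    def __init__(self, documents):
        self._documents = dict(documents)

    def can_handle(self, url):
        # Say whether this handler supports the URL.
        return url.startswith("mem:")

    def dereference(self, url):
        # Fetch the data referenced by the URL as a byte string.
        try:
            return self._documents[url]
        except KeyError:
            raise LookupError(f"no document registered for {url!r}") from None


def fetch(url, handlers):
    """Ask each handler in order; the first that accepts the URL fetches it."""
    for handler in handlers:
        if handler.can_handle(url):
            return handler.dereference(url)
    raise LookupError(f"no handler accepts {url!r}")


handlers = [InMemoryUrlHandler({"mem:common.rng": b"<grammar/>"})]
data = fetch("mem:common.rng", handlers)
```

Because handlers are tried in order, an earlier handler in the list takes precedence when more than one could handle a URL.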

Default URL Handler instances

The following URL Handler objects are provided, ready to use:

rnginline.urlhandlers.file

A UrlHandler for file: URLs. This handler can resolve references to files on the local filesystem.

rnginline.urlhandlers.pydata

A URL Handler which allows data files in Python packages to be referenced.

The URLs handled by instances of this class are laid out as follows:

pydata://<package-path>/<path-under-package>

For example pydata://rnginline.test/data/loops/start.rng.

They’re also available via:

rnginline.urlhandlers.get_default_handlers()

Get a list of the default URL Handler objects.

URL Handler Classes

class rnginline.urlhandlers.FilesystemUrlHandler

A UrlHandler for file: URLs. This handler can resolve references to files on the local filesystem.

can_handle(url)

Check if this handler supports url.

This handler supports URLs with the file: scheme.

Parameters: url – A URL as a string.
Returns: True if url is supported by this handler, False otherwise
Return type: bool
dereference(url)

Read the contents of the file identified by url.

Parameters: url – A file: URL
Returns: The content of the file as a byte string
Raises: DereferenceError – if an IOError prevents the file being read
static makeurl(file_path, abs=False)

Create a relative or absolute URL pointing to the filesystem path file_path.

(Absolute refers to whether or not the URL has a scheme, not whether the path is absolute.)

Parameters:
  • file_path – The path on the filesystem to point to
  • abs – Whether the returned URL should be absolute (with a file scheme) or a relative URL (URI-reference) without the scheme
Returns:

A file URL pointing to file_path

Note

The current directory of the program has no effect on this function.

Examples

>>> from rnginline.urlhandlers import file
>>> file.makeurl('/tmp/foo')
'/tmp/foo'
>>> file.makeurl('/tmp/foo', abs=True)
'file:/tmp/foo'
>>> file.makeurl('file.txt')
'file.txt'
>>> file.makeurl('file.txt', abs=True)
'file:file.txt'
static breakurl(file_url)

Decode a file: URL into a filesystem path.

Parameters: file_url – The URL to decode. Can be an absolute URL with a file: scheme, or a relative URL without a scheme.
Returns: The filesystem path implied by the URL

Examples

>>> from rnginline.urlhandlers import file
>>> file.breakurl('file:/tmp/some%20file.txt')
'/tmp/some file.txt'
>>> file.breakurl('some/path/file%20name.dat')
'some/path/file name.dat'
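
The makeurl()/breakurl() examples above boil down to percent-encoding a path and optionally adding the file: scheme. The following stdlib-only sketch reproduces that behaviour for POSIX-style paths. make_file_url and break_file_url are illustrative names, not rnginline's implementation, and Windows paths are not considered.

```python
from urllib.parse import quote, unquote, urlsplit


def make_file_url(file_path, absolute=False):
    """Percent-encode a POSIX path, optionally prefixing the file: scheme."""
    encoded = quote(file_path)  # "/" is left intact, a space becomes %20
    return "file:" + encoded if absolute else encoded


def break_file_url(file_url):
    """Recover the filesystem path from an absolute or relative file URL."""
    # urlsplit() strips the scheme if present; unquote() decodes %XX escapes.
    return unquote(urlsplit(file_url).path)


url = make_file_url("/tmp/some file.txt", absolute=True)
path = break_file_url(url)
```

Neither function touches the filesystem or the current directory. Both are pure string transformations, so a path survives a round trip through make_file_url and break_file_url unchanged.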
class rnginline.urlhandlers.PackageDataUrlHandler

A URL Handler which allows data files in Python packages to be referenced.

The URLs handled by instances of this class are laid out as follows:

pydata://<package-path>/<path-under-package>

For example pydata://rnginline.test/data/loops/start.rng.

can_handle(url)

Check if this handler supports url.

This handler supports URLs with the pydata: scheme.

Parameters: url – A URL as a string.
Returns: True if url is supported by this handler, False otherwise
Return type: bool
dereference(url)

Get the contents of the data file identified by url.

pkgutil.get_data() is used to fetch the data.

Parameters: url – A pydata: URL pointing at a file under a Python package
Returns: A byte string
Raises: DereferenceError – If the data identified by the URL does not exist or cannot be read
classmethod makeurl(package, resource_path)

Create a URL referencing data under a Python package.

Parameters:
  • package – A dotted path you’d use to import the package in question
  • resource_path – The path under the package to a data file
Returns:

A URL of the form pydata://<package>/<resource_path>

Return type:

...

Example

>>> from rnginline.urlhandlers import pydata
>>> pydata.makeurl('mypkg.subpkg', 'some/file.txt')
'pydata://mypkg.subpkg/some/file.txt'
classmethod breakurl(url)

Deconstruct a pydata: URL into constituent parts.

Parameters: url – A pydata: URL
Returns: A 2-tuple of the package and path contained in the URL

Example

>>> from rnginline.urlhandlers import pydata
>>> pydata.breakurl('pydata://mypkg.subpkg/some/file.txt')
('mypkg.subpkg', 'some/file.txt')
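
To illustrate how dereference() can combine the pydata: URL layout with pkgutil.get_data(), here is a stdlib-only sketch. break_pydata_url and read_package_data are hypothetical helpers, not the handler's actual code, and the throwaway package rng_demo_pkg exists only so that the example has something to read.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path
from urllib.parse import urlsplit


def break_pydata_url(url):
    """Split pydata://<package>/<path> into (package, path)."""
    parts = urlsplit(url)
    if parts.scheme != "pydata":
        raise ValueError(f"not a pydata: URL: {url!r}")
    return parts.netloc, parts.path.lstrip("/")


def read_package_data(url):
    """Return the bytes of the package data file the URL points at."""
    package, resource_path = break_pydata_url(url)
    return pkgutil.get_data(package, resource_path)


# Build a temporary package containing one data file (illustration only).
with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "rng_demo_pkg"
    (pkg_dir / "schemas").mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "schemas" / "base.rng").write_bytes(b"<grammar/>")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        content = read_package_data("pydata://rng_demo_pkg/schemas/base.rng")
    finally:
        sys.path.remove(tmp)
```

Because the package is located through the import system rather than by filesystem path, the same URL keeps working wherever the package happens to be installed.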

rnginline.postprocess

rnginline.postprocess.datatypelibrary = <rnginline.postprocess.PropagateDatatypeLibraryPostProcess object>

Implements the propagation part of the simplification described in section 4.3 of the RELAX NG specification: datatypeLibrary attributes are resolved and explicitly set on each data and value element, then removed from all other elements.

This can be used to work around libxml2 not resolving datatypeLibrary attributes from div elements.

rnginline.postprocess.get_default_postprocessors()

Get a list containing the default postprocessor objects.

Currently contains just datatypelibrary.

class rnginline.postprocess.PropagateDatatypeLibraryPostProcess

Implements the propagation part of the simplification described in section 4.3 of the RELAX NG specification: datatypeLibrary attributes are resolved and explicitly set on each data and value element, then removed from all other elements.

This can be used to work around libxml2 not resolving datatypeLibrary attributes from div elements.
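
The propagation described above can be sketched with the standard library's xml.etree.ElementTree. This is an illustrative approximation, not the PostProcess implementation: the function name is made up, it operates on a plain ElementTree element rather than an lxml one, and it skips details such as URI escaping of the attribute value.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
DATA_TAGS = {f"{{{RNG_NS}}}data", f"{{{RNG_NS}}}value"}


def propagate_datatype_library(element, inherited=""):
    """Push datatypeLibrary down onto data/value elements, in place."""
    # An element's own attribute overrides whatever its ancestors declared.
    current = element.get("datatypeLibrary", inherited)
    if element.tag in DATA_TAGS:
        # data and value always end up with an explicit attribute.
        element.set("datatypeLibrary", current)
    else:
        # Every other element loses the attribute once it has been inherited.
        element.attrib.pop("datatypeLibrary", None)
    for child in element:
        propagate_datatype_library(child, current)


schema = ET.fromstring(
    '<element name="n" xmlns="http://relaxng.org/ns/structure/1.0">'
    '<div datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">'
    '<data type="integer"/>'
    "</div>"
    "</element>"
)
propagate_datatype_library(schema)
div = schema[0]
data = div[0]
```

After the call, the data element carries the datatypeLibrary that was declared on its enclosing div, and the div no longer has the attribute, so a consumer that ignores div-level declarations still sees the right library.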