EBML Library
Extensible Binary Meta Language read/write/modify library for Python

About

This is an Extensible Binary Meta Language (EBML) library for use with Python 2.x and 3.x. Its primary goals are ease of use and minimal memory usage.

Features

  • Encoding/decoding
    The ability to decode an EBML file into a document object model, and encode a document into a file. Encoding a decoded file with no changes will result in an identical file.
  • Document object model
    The EBML file is decoded into a DOM, similar to that of XML or HTML, but more strict. Elements can be created, moved, removed, and edited easily this way.
  • DOM selectors
    Elements can be selected using CSS-like selector objects. This makes it easy to quickly find the element(s) you want.
  • Element Pointers
    Any element is allowed to be a pointer, meaning that the decoding of its value from the source is deferred. This can save decoding time and memory.
  • Strictness settings
    Certain decoding errors can be configured to raise an exception, emit a warning, or be ignored.
  • Easy schema versioning
    Different version features can be added to a single schema and the version can be set later.

Download

  • ebml.py - The EBML library
  • mkv.py - The Matroska container schema setup

Documentation

The following section documents the public API of the classes and functions which are useful under normal circumstances. Modifying objects using other methods may break the internal state of an EBML document.

References

ebml Module

The following functions are available in the module and are used for encoding/decoding file-like streams.

  • ebml.(schema, stream...) : ElementContainer
    Decodes an EBML formatted stream into a document object model.
    schema : ebml.Schema
    The document semantics schema
    stream : input_stream
    The input stream to read from
    return : ElementContainer
    A root object which contains the document nodes.
  • ebml.(element, stream, pointers_temporary=True...)
    Encodes an EBML element into an output stream
    element : ebml.Element
    The element which should be output to the stream. Typically this will be a root element
    stream : output_stream
    The output stream to write to
    pointers_temporary : boolean
    If True, any pointer elements are decoded only temporarily and return to being pointers after their data is used
  • ebml.,
    ebml.,
    ebml.,
    ebml.,
    ebml.,
    ebml.,
    ebml.,
    ebml.
    These are the type constants included in the module for schema definitions

Schema

The Schema object is used to define the semantic information for decoding and validating an EBML document.

  • ebml.() : Schema
    Creates a new Schema which is used for decoding and creating nodes
    return : Schema
    A newly created Schema object, on which element definitions should then be made
  • Schema.(id, name, el_type, level="g", versions=0, validator=None, pointer=False...) : ebml.ElementDescriptor
    Creates an element definition in the schema
    id : list | tuple | string | bytes
    The binary id which is used when writing the file. If the value is invalid, an exception will be raised.
    If this value is a list or tuple, each entry of the object will be treated as a byte.
    If this value is a string, each set of 2 characters will be treated as a hexadecimal byte.
    If this value is a bytes object, each character will be treated as a byte.
    name : string
    The name of the element; it should be meaningful for humans. This is used for element creation and xml-ification.
    el_type : TYPE_CONSTANT
    This should be one of the type constants specified in the module
    level : string | integer
    If this value is an integer, then it can only appear on the specified level.
    If this value is "g", then it can appear anywhere in the document.
    If this value is a string formatted as "#+", then it can appear on level # and deeper. (it is recursive)
    versions : integer
    A set of version bit-flags specifying which version this feature appears in. If the value is 0, it can appear in any version.
    validator : function
    A validation function which takes one input (the value) and should return True if it's valid, and False otherwise
    pointer : boolean
    True if this value should be loaded as a pointer (the reading of its value from the input stream will be deferred), or False otherwise
    return : ebml.ElementDescriptor
    A new element descriptor object is returned
  • Schema.(name, value=None...) : ebml.Element
    Creates an element given an element name or descriptor
    name : string | ebml.ElementDescriptor
    If this value is a string, it uses the descriptor with the specified name property in the schema.
    If this value is an ebml.ElementDescriptor returned from a .define call, it creates an element of the descriptor's type.
    value : any
    The initial value to give the element. The valid values vary with the type of element being created, because the value is simply passed to the element's .set function
    return : ebml.Element
    The newly created element which can be added to a document object model
  • Schema.() : ebml.ElementContainer
    Creates a root container for the schema. The root container's metadata is not written to the output stream when encoding, only its children are
    return : ebml.ElementContainer
  • Schema.=True
    Set to True if the schema is allowed to decode data into pointers, or False if pointers should not be allowed
  • Schema.=0
    Sets the version number of the schema. If the value is 0, no version checking is performed. Otherwise, the schema version is combined with each element's version flags using bitwise AND (&); if the result is not 0, the element is considered valid for this version
  • Schema.=Schema.STRICT
    The strictness level for UTF8 decoding
  • Schema.=Schema.STRICT
    The strictness level for ASCII string decoding
  • Schema.=Schema.WARN
    The strictness level for elements in the document with IDs that are not in the schema
  • Schema.=Schema.WARN
    The strictness level for checking the version of elements
  • Schema.=Schema.STRICT
    The strictness level for making sure the decoded length of elements is correct
  • Schema.=Schema.STRICT
    The strictness level for using the validation functions on elements' values
  • Schema.,
    Schema.,
    Schema.
    These class constants can be used to set the strictness settings of a schema
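
The rules for the id, level, and versions parameters described above can be pictured with a small standalone sketch. It does not use the library. The helper names normalize_id, level_allows, and version_allows are hypothetical, and the library's real implementation may differ. The sketch targets Python 3, and the sample id is the standard four-byte EBML header id.

```python
def normalize_id(raw_id):
    """Return an element id as a bytes object (hypothetical helper)."""
    if isinstance(raw_id, (list, tuple)):
        # Each entry is treated as one byte value (0-255).
        return bytes(bytearray(raw_id))
    if isinstance(raw_id, bytes):
        # Already one byte per character.
        return raw_id
    if isinstance(raw_id, str):
        # Each pair of characters is one hexadecimal byte.
        return bytes.fromhex(raw_id)
    raise TypeError("unsupported id type: %r" % type(raw_id))


def level_allows(level, depth):
    """Check whether an element with this level setting may appear at depth."""
    if level == "g":
        return True  # allowed anywhere in the document
    if isinstance(level, int):
        return depth == level  # allowed on exactly one level
    if isinstance(level, str) and level.endswith("+"):
        return depth >= int(level[:-1])  # "#+": level # and deeper
    raise ValueError("unsupported level: %r" % (level,))


def version_allows(schema_version, element_versions):
    """Bit-flag version check: 0 on either side disables the check."""
    if schema_version == 0 or element_versions == 0:
        return True
    return (schema_version & element_versions) != 0


header_id = normalize_id("1A45DFA3")
print(header_id == normalize_id([0x1A, 0x45, 0xDF, 0xA3]))
print(level_allows("2+", 3), version_allows(0b100, 0b011))
```

All three input forms of the id reduce to the same bytes object, which is why the sketch normalizes first and validates afterwards.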

Element

The following functions are available on all Element instances. However, certain functions may not be implemented for certain subclasses, so they may raise exceptions.

  • Element.(decode_depth=-1...) : any
    Gets the value of an element. If the element is currently a pointer, it is decoded before returning the value
    decode_depth : integer
    This parameter sets how many levels of descendant elements should also be decoded if the element is a pointer.
    Negative values indicate all descendants should be decoded, unless the schema indicates otherwise.
    0 indicates only the element itself should be decoded; all child elements will be pointers.
    1 indicates only the element and its first-level children should be decoded; all other elements will be pointers...
    And so on
    return : any
    The return type will vary depending on the type of element.
  • Element.(value...)
    Sets the value of an element. If the value is invalid, an exception will be raised
    value : any (not None)
    The value to apply to the element. The type and validity of this value depend on the type of the element and the schema
  • Element.()
    Clears the value to the default value for the element.
    For string-type elements, this is an empty string.
    For numeric-type elements, this is a 0.
    For container-type elements, this removes all child elements.
  • Element.() : boolean
    Checks if an element is a pointer or not
    return : boolean
    True if the element is a pointer, False otherwise
  • Element.()
    Converts an element to a pointer. If the element was not decoded from a stream, it cannot be converted to a pointer
  • Element.() : string
    Returns the tag name for the element. This is the same value as the name specified in the schema
    return : string
    The tag name
  • Element.() : integer
    Returns the size in bytes that the element would take up in a stream
    return : integer
    The size in bytes that the element would take up in a stream
  • Element.(element, before=None, after=None, prepend=False...)
    Add a child element to a container element. If the self object is not a container element, an exception will be raised.
    element : ebml.Element
    The element that should be inserted
    before : ebml.Element
    If the value is not None, the element will be inserted before this child element.
    after : ebml.Element
    If before is None, and the value is not None, the element will be inserted after this child element.
    prepend : boolean
    If before and after are both None, this parameter specifies where the new element should be inserted.
    If the value is False, it's inserted after all of the children as the last element.
    If the value is True, it's inserted before all of the children as the first element.
  • Element.(element...)
    Removes a child element from a container element
    element : ebml.Element
    The child element to remove
  • Element.()
    Removes this element from its container element, if any
  • Element.(element...) : boolean
    Checks if an element is the child of the self element
    element : ebml.Element
    The child element to check
    return : boolean
    True if it is a child, False otherwise
  • Element.() : string
    Converts the element to a human-readable form. This is useful for testing and for locating elements. Element values, such as binary values, may be shortened for readability.
    return : string
    A human-readable XML document string
  • Element. : ebml.ContainerElement | None
    The parent of the element
  • Element. : ebml.Element | None
    The previous sibling of the element
  • Element. : ebml.Element | None
    The next sibling of the element
  • Element. : integer
    The depth of the element in the DOM tree. Root elements have a depth of -1

ElementDate's Date class

The following class is used to provide high-precision date objects for encoding/decoding dates.

  • ebml.ElementDate.(year, month, day, hour=0, minute=0, second=0, nanoseconds=0...) : ebml.ElementDate.Date
    Creates a new date object to be used with ElementDate elements
    year : integer
    The year
    month : integer
    The month of the year, between 1 and 12 (inclusive)
    day : integer
    The day of the month, between 1 and the number of days in the month (inclusive)
    hour : integer
    The hour, between 0 and 23 (inclusive)
    minute : integer
    The minute, between 0 and 59 (inclusive)
    second : integer
    The second, between 0 and 59 (inclusive)
    nanoseconds : integer
    The number of nanoseconds, between 0 and 999999999 (inclusive)
    return : ebml.ElementDate.Date
    A newly created date object
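
A separate nanoseconds field is useful because Python's datetime only resolves microseconds, while the EBML format stores dates as a signed 64-bit count of nanoseconds since 2001-01-01 00:00:00 UTC. The sketch below shows that conversion in Python 3 using the same parameters as the constructor above. The function name to_ebml_nanoseconds is hypothetical, and whether the library's Date class computes the value this way is an assumption.

```python
from datetime import datetime, timezone

# EBML dates count nanoseconds from the start of the third millennium (UTC).
EBML_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def to_ebml_nanoseconds(year, month, day, hour=0, minute=0, second=0,
                        nanoseconds=0):
    """Convert date fields to nanoseconds since the EBML epoch."""
    if not 0 <= nanoseconds < 10 ** 9:
        raise ValueError("nanoseconds must be between 0 and 999999999")
    moment = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
    delta = moment - EBML_EPOCH
    # Work in whole seconds to avoid any floating-point rounding.
    whole_seconds = delta.days * 86400 + delta.seconds
    return whole_seconds * 10 ** 9 + nanoseconds


print(to_ebml_nanoseconds(2001, 1, 1, 0, 0, 1, nanoseconds=5))
```

Dates before the epoch come out negative, which is why the stored value is signed.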

Selector

EBML selectors are similar to CSS selectors, and are used to easily select elements out of a document. The exact syntax of a valid selector string will not be described, as it is mostly identical to the standard CSS selectors.

The following are the available selectors:

  • element_name
    Selects elements by name
  • *
    Selects every element
  • [value]
    Selects elements with the specified value[1]
  • :root
    Selects only root elements
  • :empty
    Selects elements with no children
  • :first-child
    Selects elements that have no previous sibling
  • :first-of-type
    Selects elements that have no previous siblings of the same type
  • :last-child
    Selects elements that have no following sibling
  • :last-of-type
    Selects elements that have no following siblings of the same type
  • :not(...)
    Selects the inverse of the specified selector
  • :nth-child(...)
    Selects the nth child of any element given an n-expression[2]
  • :nth-of-type(...)
    Selects the nth child of a certain type of any element[2]
  • :nth-last-child(...)
    Selects the nth last child of any element[2]
  • :nth-last-of-type(...)
    Selects the nth last child of a certain type of any element[2]
  • :pointer
    Selects elements that are pointers
  • :type(...)
    Select elements of a certain type[3]
  • selector1,selector2
    Select elements that match either of the specified expressions
  • selector1>selector2
    Select elements that are direct children of another element
  • selector1 selector2
    Select elements that are descendants of another element
  • selector1+selector2
    Select elements that immediately follow another element
  • selector1~selector2
    Select elements that follow another element

  [1] Value matching depends on the type of element
  [2] n-expressions are the same as those in CSS; e.g. 3n+1, n+4, n, 4
  [3] The type can be any one of the type constants defined on the module, written as a lowercase string
      e.g. "int" represents ebml.INT elements
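
For readers unfamiliar with CSS n-expressions, the sketch below parses the forms listed above and tests a 1-based child position against them. An expression An+B matches position p when p = A*n + B for some integer n >= 0. The function names parse_nth and matches_nth are hypothetical, the CSS keywords odd and even are left out, and the library's own parser may behave differently.

```python
import re

_INTEGER_RE = re.compile(r"^[+-]?\d+$")
_AN_PLUS_B_RE = re.compile(r"^([+-]?\d*)n([+-]\d+)?$")


def parse_nth(expression):
    """Parse an n-expression such as '3n+1' into the pair (a, b)."""
    text = expression.replace(" ", "").lower()
    if _INTEGER_RE.match(text):
        return 0, int(text)  # a plain number such as "4"
    match = _AN_PLUS_B_RE.match(text)
    if match is None:
        raise ValueError("invalid n-expression: %r" % expression)
    a_text, b_text = match.group(1), match.group(2)
    if a_text in ("", "+"):
        a = 1  # "n" means "1n"
    elif a_text == "-":
        a = -1  # "-n" means "-1n"
    else:
        a = int(a_text)
    b = int(b_text) if b_text else 0
    return a, b


def matches_nth(a, b, position):
    """Return True if the 1-based position equals a*n + b for some n >= 0."""
    if a == 0:
        return position == b
    difference = position - b
    return difference % a == 0 and difference // a >= 0


a, b = parse_nth("3n+1")
print([p for p in range(1, 11) if matches_nth(a, b, p)])
```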

  • ebml.(selector...) : Selector
    Creates a new selector object
    selector : string
    A selector string to construct the selector from
    return : Selector
    A newly created selector object
  • Selector.(element...) : boolean
    Returns whether an element matches the selector
    element : ebml.Element
    The element to compare against
    return : boolean
  • Selector.(element...) : ebml.Element | None
    Finds the first element matching the selector
    element : ebml.Element
    The element to start comparing in. If this element is a container, all descendants are checked as well
    return : ebml.Element | None
    The first match found is returned, or None if no matches are found
  • Selector.(element...) : list
    Finds all elements matching the selector
    element : ebml.Element
    The element to start comparing in. If this element is a container, all descendants are checked as well
    return : list
    A list of all elements matching the selector
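
The difference between returning the first match and returning all matches can be sketched with a toy tree and a plain predicate function standing in for a compiled selector. Everything here is hypothetical: the Node class, the helper names, and the element names are for illustration only. The sketch also assumes that the starting element itself is tested and that matches are reported in document order, neither of which is confirmed by the descriptions above.

```python
class Node:
    """Toy stand-in for an element with a name and child elements."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def iter_tree(node):
    """Yield the node and then all of its descendants, depth first."""
    yield node
    for child in node.children:
        for descendant in iter_tree(child):
            yield descendant


def select_first(predicate, start):
    """Return the first node accepted by predicate, or None."""
    for node in iter_tree(start):
        if predicate(node):
            return node
    return None


def select_all(predicate, start):
    """Return a list of every node accepted by predicate."""
    return [node for node in iter_tree(start) if predicate(node)]


first_entry = Node("TrackEntry")
second_entry = Node("TrackEntry")
root = Node("Segment", [Node("Info"), Node("Tracks", [first_entry, second_entry])])


def is_track_entry(node):
    return node.name == "TrackEntry"


print(select_first(is_track_entry, root) is first_entry)
print(len(select_all(is_track_entry, root)))
```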