Pike Reference Manual

11. Parsers

Module Parser.XML

Inherit XML: inherit Parser._parser.XML : XML

Method autoconvert: string autoconvert(string xml)

Method node_to_struct

mapping(string:string|array|mapping) node_to_struct(.NSTree.NSNode|.Tree.Node rootnode)

Description

XML parsing made easy.

Returns

A hierarchical structure of nested mappings and arrays representing the XML structure starting at rootnode using a minimal depth.

`"" : string`	The text content of the node.
`"/" : mapping`	The arguments on this node.
`"..." : string`	The text content of a simple subnode.
`"..." : array`	A list of subnodes.
`"..." : mapping`	A complex subnode (recurse).

Example

Parser.XML.node_to_struct(Parser.XML.NSTree.parse_input("<foo>bar</foo>"));

Class Parser.XML.Simple

Method allow_rxml_entities: void allow_rxml_entities(bool yes_no)

Method compat_allow_errors

void compat_allow_errors(string version)

Description

Set whether the parser should allow certain errors for compatibility with earlier versions. version can be:

`"7.2"`	Allow more data after the root element.
`"7.6"`	Allow multiple and invalidly placed "<?xml ... ?>" and "<!DOCTYPE ... >" declarations (invalid "<?xml ... ?>" declarations are otherwise treated as normal PIs). Allow "<![CDATA[ ... ]]>" outside the root element. Allow the root element to be absent.

version can also be zero to enable all error checks.

Method define_entity: void define_entity(string entity, string s, function(:void) cb, mixed ... extras)
Description: Define an entity or an SMEG.
Parameter entity: Entity name, or SMEG name (if preceded by a "%").
Parameter s: Expansion of the entity. Entity evaluation will be performed.
See also: define_entity_raw()

Method define_entity_raw: void define_entity_raw(string entity, string raw)
Description: Define an entity or an SMEG.
Parameter entity: Entity name, or SMEG name (if preceded by a "%").
Parameter raw: Verbatim expansion of the entity.
See also: define_entity()

Method lookup_entity: string lookup_entity(string entity)
Returns: Returns the verbatim expansion of the entity.
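
The following is a minimal sketch of how define_entity_raw() and lookup_entity() might be used together. The entity name "company" and its expansion are illustrative values, not part of the module.

```pike
// Register an entity with a verbatim expansion and read it back.
// The entity name and expansion passed in are hypothetical examples.
string register_and_lookup(string name, string expansion)
{
  object parser = Parser.XML.Simple();

  // Store the expansion verbatim (no entity evaluation is performed).
  parser->define_entity_raw(name, expansion);

  // lookup_entity() returns the verbatim expansion of the entity.
  return parser->lookup_entity(name);
}

int main()
{
  write("%O\n", register_and_lookup("company", "Example Corp"));
  return 0;
}
```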

Method parse: array parse(string xml, string context, function(:void) cb, mixed ... extra_args)
array parse(string xml, function(:void) cb, mixed ... extra_args)
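
As a rough illustration of parse() with a callback, the sketch below collects the names of all elements in a document. It assumes the callback is invoked with a type string followed by the node name, and that the type strings ">" (element with content) and "<>" (empty element) identify elements. Those details come from the general Parser.XML.Simple documentation rather than from this section, so treat them as assumptions.

```pike
// Collect the names of all elements reported by the parser.
array(string) element_names(string xml)
{
  array(string) names = ({});

  Parser.XML.Simple()->parse(xml,
    lambda(string type, mixed name, mixed ... rest) {
      // Assumed type strings: ">" for an element with content,
      // "<>" for an empty element.
      if (type == ">" || type == "<>")
        names += ({ name });
      // Returning 0 contributes nothing to the parse result.
      return 0;
    });

  return names;
}

int main()
{
  write("%O\n", sort(element_names("<a><b/><c>text</c></a>")));
  return 0;
}
```

The names are sorted before printing because the order in which the callback is invoked for parents and children is not described in this section.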

Method parse_dtd: mixed parse_dtd(string dtd, string context, function(:void) cb, mixed ... extras)
mixed parse_dtd(string dtd, function(:void) cb, mixed ... extras)

Class Parser.XML.Simple.Context

Method create: Parser.XML.Simple.Context Parser.XML.Simple.Context(string s, string context, int flags, function(:void) cb, mixed ... extra_args)
Parser.XML.Simple.Context Parser.XML.Simple.Context(string s, int flags, function(:void) cb, mixed ... extra_args)
Parameter s
Parameter context: These two arguments are passed along to push_string().
Parameter flags: Parser flags.
Parameter cb: Callback function. This function gets called at various stages during the parsing.

Method parse_dtd: mixed parse_dtd()

Method parse_entity: string parse_entity()

Method parse_xml: mixed parse_xml()

Method push_string: void push_string(string s)
void push_string(string s, string context)
Description: Add a string to parse at the current position.
Parameter s: String to insert at the current parsing position.
Parameter context: Optional context used to refer to the inserted string. This is typically a URL, but may also be an entity (preceded by an "&") or a SMEG reference (preceded by a "%"). Not used by the XML parser as such, but is simply passed into the callbackinfo mapping as the field "context", where it can be useful e.g. for resolving relative URLs when parsing DTDs, or for determining where errors occur.

Class Parser.XML.Validating

Description

Validating XML parser.

Validates an XML file according to a DTD.

cf http://www.w3.org/TR/REC-xml/

Inherit Simple: inherit .Simple : Simple
Description: Extends the Simple XML parser.

Method get_external_entity

string|zero get_external_entity(string sysid, string|void pubid, mapping|void info, mixed ... extra)

Description

Get an external entity.

Called when a <!DOCTYPE> with a SYSTEM identifier is encountered, or when an entity reference needs expanding.

Parameter sysid

The SYSTEM identifier.

Parameter pubid

The PUBLIC identifier (if any).

Parameter info

The callbackinfo mapping containing the current parser state.

Parameter extra

The extra arguments as passed to parse() or parse_dtd().

Returns

Returns a string with a DTD fragment on success. Returns 0 (zero) on failure.

Note

Returning zero will cause the validator to report an error.

Note

In Pike 7.7 and earlier info had the value 0 (zero).

Note

The default implementation always returns 0 (zero). Override this function to provide other behaviour.

See also

parse(), parse_dtd()
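
Since the default implementation always returns zero, a subclass has to supply the DTD text. The sketch below resolves SYSTEM identifiers from an in-memory table; the class name LocalDTDValidator, the dtds mapping and its contents are hypothetical and exist only for illustration.

```pike
// A validator that resolves external entities from an in-memory table.
class LocalDTDValidator
{
  inherit Parser.XML.Validating;

  // Hypothetical table mapping SYSTEM identifiers to DTD fragments.
  mapping(string:string) dtds = ([
    "note.dtd": "<!ELEMENT note (#PCDATA)>",
  ]);

  string get_external_entity(string sysid, string|void pubid,
                             mapping|void info, mixed ... extra)
  {
    // An unknown identifier yields 0, which makes the validator
    // report an error.
    return dtds[sysid];
  }
}

int main()
{
  object v = LocalDTDValidator();
  write("%O\n", v->get_external_entity("note.dtd"));
  write("%O\n", v->get_external_entity("missing.dtd"));
  return 0;
}
```

A real implementation would more likely read the DTD from disk or fetch it over the network, using info->context to resolve relative identifiers.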

Method isname: int isname(string s)
Description: Check if s is a valid Name.

Method isnames: int isnames(string s)
Description: Check if s is a valid list of Names.

Method isnmtoken: int isnmtoken(string s)
Description: Check if s is a valid Nmtoken.

Method isnmtokens: int isnmtokens(string s)
Description: Check if s is a valid list of Nmtokens.

Method parse: array parse(string data, string|function(string, string, mapping, array|string, mapping(string:mixed), __unknown__ ... :mixed) callback, mixed ... extra)
FIXME: Document this function

Method parse_dtd: array parse_dtd(string data, string|function(string, string, mapping, array|string, mapping(string:mixed), __unknown__ ... :mixed) callback, mixed ... extra)
FIXME: Document this function

Method validate: private mixed validate(string kind, string name, mapping attributes, array|string contents, mapping(string:mixed) info, function(string, string|zero, mapping|zero, array|string, mapping(string:mixed), __unknown__ ... :mixed) callback, array(mixed) extra)
Description: The validation callback function.
See also: ::parse()

Class Parser.XML.Validating.Element

Description: XML Element node.

Module Parser.XML.NSTree

Description: A namespace aware version of Parser.XML.Tree. This implementation does as little validation as possible, so e.g. you can call your namespace xmlfoo without complaints.

Inherit Tree: inherit Parser.XML.Tree : Tree

Method parse_input: NSNode parse_input(string data, void|string default_ns)
Description: Takes an XML string data and produces a namespace node tree. If default_ns is given, it will be used as the default namespace.
Throws: Throws an error when an error is encountered during XML parsing.

Method visualize

string visualize(Node n, void|string indent)

Description

Makes a visualization of a node graph suitable for printing out on a terminal.

Example

> object x = parse_input("<a><b><c/>d</b><b><e/><f>g</f></b></a>");
> write(visualize(x));
Node(ROOT)
  NSNode(ELEMENT,"a")
    NSNode(ELEMENT,"b")
      NSNode(ELEMENT,"c")
      NSNode(TEXT)
    NSNode(ELEMENT,"b")
      NSNode(ELEMENT,"e")
      NSNode(ELEMENT,"f")
        NSNode(TEXT)
Result 1: 201

Class Parser.XML.NSTree.NSNode

Description: Namespace aware node.

Inherit Node: inherit Node : Node

Method add_namespace: void add_namespace(string ns, void|string symbol, void|bool chain)
Description: Adds a new namespace to this node. The preferred symbol to use to identify the namespace can be provided in the symbol argument. If chain is set, no attempts to overwrite an already defined namespace with the same identifier will be made.

Method change_namespace: void change_namespace(string from, string to)
Description: Change all elements and attributes in the subtree in namespace from to namespace to. In case an attribute is defined in both namespaces it will be overwritten.

Method child_namespaces: mapping child_namespaces(mapping(Node:mapping(string:string)) intermediate)
Description: Return the defined namespaces from the tree.
Parameter intermediate: If namespaces are clobbered, the nodes that need additional xmlns attributes are added to this mapping.

Method diff_namespaces: mapping(string:string) diff_namespaces()
Description: Returns the difference between this node's namespaces and its parent's namespaces.

Method get_default_ns: string get_default_ns()
Description: Returns the default namespace in the current scope.

Method get_defined_nss: mapping(string:string) get_defined_nss()
Description: Returns a mapping with all the namespaces defined in the current scope, except the default namespace.
Note: The returned mapping is the same as the one in the node, so destructive changes will affect the node.

Method get_ns: string get_ns()
Description: Returns the namespace in which the current element is defined.

Method get_ns_attributes: mapping(string:mapping(string:string)) get_ns_attributes()
Description: Returns all the attributes in all namespaces that are associated with this node.
Note: The returned mapping is the same as the one in the node, so destructive changes will affect the node.

Method get_ns_attributes: mapping(string:string) get_ns_attributes(string namespace)
Description: Returns the attributes in this node that are declared in the provided namespace.

Method get_ns_short: string get_ns_short(string ns)
Description: Returns the short name for the given namespace in this context. Returns the empty string if the namespace is the default namespace. Returns 0 if the namespace is unknown.

Method get_short_attributes: mapping(string:string) get_short_attributes()
Description: Returns the attributes of the element, with the names prefixed by their short namespace names.

Method get_xml_name: string get_xml_name()
Description: Returns the element name as it occurs in xml files. E.g. "zonk:name" for the element "name" defined in a namespace denoted with "zonk". It will look up a symbol for the namespace in the symbol tables for the node and its parents. If none is found a new label will be generated by hashing the namespace.

Method remove_child: void remove_child(NSNode child)
Description: remove_child has not been updated to take care of namespace issues. To properly remove all of the parent's namespaces from the child, call remove_node in the child.

Method rename_namespace: void rename_namespace(string from, string to)
Description: Renames the namespace prefix of a namespace. No checks will be made to see if the namespace represented is the same throughout the subtree.

Method render_xml: string render_xml(void|string encoding)
Description: Renders the object tree to a string.
Parameter encoding: The character encoding to be used. Defaults to the character encoding in the XML header, or UTF-8 if none.

Module Parser.XML.SloppyDOM

Description

A somewhat DOM-like library that implements lazy generation of the node tree, i.e. it's generated from the data upon lookup. There's also a little bit of XPath evaluation to do queries on the node tree.

Implementation note: This is generally more pragmatic than Parser.XML.DOM, meaning it's not so pretty and compliant, but more efficient.

Implementation status: There's only enough implemented to parse a node tree from source and access it, i.e. modification functions aren't implemented. Data hiding stuff like NodeList and NamedNodeMap is not implemented, partly since it's cumbersome to meet the "live" requirement. Also, Parser.HTML is used in XML mode to parse the input. Thus it's too error tolerant to be XML compliant, and it currently doesn't handle DTD elements, like "<!DOCTYPE", or the XML declaration (i.e. "<?xml version='1.0'?>").

Method parse: Document parse(string source, void|int raw_values)
Description: Normally entities are decoded, and Node.xml_format will encode them again. If raw_values is nonzero then all text and attribute values are instead kept in their original form.

Class Parser.XML.SloppyDOM.Document

Note: The node tree is very likely a cyclic structure, so it might be a good idea to destruct it when you're finished with it, to avoid garbage. Destructing the Document object always destroys all nodes in it.

Inherit NodeWithChildElements: inherit NodeWithChildElements : NodeWithChildElements

Method get_elements: array(Element) get_elements(string name)
Description: Note that this one looks among the top level elements, as opposed to get_elements_by_tag_name. This means that if the document is correct, you can only look up the single top level element here.
Note: Not DOM compliant.

Method get_raw_values: int get_raw_values()
Note: Not DOM compliant.

Class Parser.XML.SloppyDOM.Node

Description: Basic node.

Method get_text_content: string get_text_content()
Description: If the raw_values flag is set in the owning document, the text is returned with entities and CDATA blocks intact.
See also: parse

Method simple_path

Description

Access a node or a set of nodes through an expression that is a subset of an XPath RelativeLocationPath in abbreviated form.

That means one or more Steps separated by "/" or "//". A Step consists of an AxisSpecifier followed by a NodeTest and then optionally by one or more Predicates.

"/" before a Step causes it to be matched only against the immediate children of the node(s) selected by the previous Step. "//" before a Step causes it to be matched against any children in the tree below the node(s) selected by the previous Step. The initial selection before the first Step is this element.

The currently allowed AxisSpecifier NodeTest combinations are:

name to select all elements with the given name. The name can be "*" to select all.
@name to select all attributes with the given name. The name can be "*" to select all.
comment() to select all comments.
text() to select all text and CDATA blocks. Note that all entity references are also selected, under the assumption that they would expand to text only.
processing-instruction("name") to select all processing instructions with the given name. The name can be left out to select all. Either ' or " may be used to delimit the name. For compatibility, it can also occur without surrounding quotes.
node() to select all nodes, i.e. the whole content of an element node.
. to select the currently selected element itself.

A Predicate has the form [PredicateExpr], where PredicateExpr currently can be in any of the following forms:

An integer indexes one item in the selected set, according to the document order. A negative index counts from the end of the set.
A RelativeLocationPath as specified above. It's executed for each element in the selected set and those where it yields an empty result are filtered out while the rest remain in the set.
A RelativeLocationPath as specified above followed by ="value". The path is executed for each element in the selected set and those where the text result of it is equal to the given value remain in the set. Either ' or " may be used to delimit the value.

If xml_format is nonzero, the return value is an xml formatted string of all the matched nodes, in document order. Otherwise the return value is as follows:

Attributes are returned as one or more index/value pairs in a mapping. Other nodes are returned as the node objects. If the expression is of a form that can give at most one answer (i.e. there's a predicate with an integer index) then a single mapping or node is returned, or zero if there was no match. If the expression can give more answers then the return value is an array containing zero or more attribute mappings and/or nodes. The array follows document order.

Note

Not DOM compliant.
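
The path syntax described above can be sketched as follows. The example assumes that simple_path takes the path string as its first argument and that it can be called directly on the Document returned by Parser.XML.SloppyDOM.parse(); the element names "list" and "item" are made up for the example.

```pike
// Text content of every <item> directly below the top-level <list>.
array(string) item_texts(string xml)
{
  object doc = Parser.XML.SloppyDOM.parse(xml);

  // No integer predicate, so the result is an array of nodes.
  array items = doc->simple_path("list/item");
  array(string) texts = items->get_text_content();

  // The tree is likely cyclic, so destruct it when done.
  destruct(doc);
  return texts;
}

// Text content of the last <item>, or zero if there is none.
string last_item_text(string xml)
{
  object doc = Parser.XML.SloppyDOM.parse(xml);

  // A negative index counts from the end of the selected set and
  // yields a single node (or zero) instead of an array.
  object last = doc->simple_path("list/item[-1]");
  string text = last && last->get_text_content();

  destruct(doc);
  return text;
}

int main()
{
  string xml = "<list><item>a</item><item>b</item></list>";
  write("%O\n", item_texts(xml));
  write("%O\n", last_item_text(xml));
  return 0;
}
```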

Method xml_format: string xml_format()
Description: Returns the formatted XML that corresponds to the node tree.
Note: Not DOM compliant.

Class Parser.XML.SloppyDOM.NodeWithChildElements

Description: Node with child elements.

Inherit NodeWithChildren: inherit NodeWithChildren : NodeWithChildren

Method get_descendant_elements: array(Element) get_descendant_elements()
Description: Returns all descendant elements in document order.
Note: Not DOM compliant.

Method get_descendant_nodes: array(Node) get_descendant_nodes()
Description: Returns all descendant nodes (except attribute nodes) in document order.
Note: Not DOM compliant.

Method get_elements: array(Element) get_elements(string name)
Description: Lightweight variant of get_elements_by_tag_name that returns a simple array instead of a fancy live NodeList.
Note: Not DOM compliant.

Module Parser.XML.Tree

Description

XML parser that generates node-trees.

Has some support for XML namespaces (http://www.w3.org/TR/REC-xml-names/; see also RFC 2518, section 23.4).

Note

This module defines two sets of node trees: the SimpleNode-based and the Node-based. The main difference between the two is that the Node-based trees have parent pointers, which tend to generate circular data references and thus garbage.

There are some more subtle differences between the two. Please read the documentation carefully.

Constant DTD_ATTLIST: constant int Parser.XML.Tree.DTD_ATTLIST

Constant DTD_ELEMENT: constant int Parser.XML.Tree.DTD_ELEMENT

Constant DTD_ENTITY: constant int Parser.XML.Tree.DTD_ENTITY

Constant DTD_NOTATION: constant int Parser.XML.Tree.DTD_NOTATION

Constant STOP_WALK: constant int Parser.XML.Tree.STOP_WALK

Constant XML_ATTR: constant int Parser.XML.Tree.XML_ATTR
Description: Attribute nodes are created on demand.

Constant XML_COMMENT: constant int Parser.XML.Tree.XML_COMMENT

Constant XML_DOCTYPE: constant int Parser.XML.Tree.XML_DOCTYPE

Constant XML_ELEMENT: constant int Parser.XML.Tree.XML_ELEMENT

Constant XML_HEADER: constant int Parser.XML.Tree.XML_HEADER

Constant XML_NODE: constant Parser.XML.Tree.XML_NODE

Constant XML_PI: constant int Parser.XML.Tree.XML_PI

Constant XML_ROOT: constant int Parser.XML.Tree.XML_ROOT

Constant XML_TEXT: constant int Parser.XML.Tree.XML_TEXT

Method attribute_quote: string attribute_quote(string data, void|string ignore)
Description: Quotes the string given in data by escaping &, <, >, ' and ".

Method parse_file: Node parse_file(string path, bool|void parse_namespaces)
Description: Loads the XML file path, creates a node tree representation and returns the root node.

Method parse_input: RootNode parse_input(string data, void|bool no_fallback, void|bool force_lowercase, void|mapping(string:string) predefined_entities, void|bool parse_namespaces, ParseFlags|void flags)
Description: Takes an XML string and produces a node tree.
Note: flags is not used for PARSE_WANT_ERROR_CONTEXT, PARSE_FORCE_LOWERCASE or PARSE_ENABLE_NAMESPACES since they are covered by the separate flag arguments.

Method roxen_attribute_quote: string roxen_attribute_quote(string data, void|string ignore)
Description: Quotes strings just like attribute_quote, but entities in the form &foo.bar; will not be quoted.

Method roxen_text_quote: string roxen_text_quote(string data)
Description: Quotes strings just like text_quote, but entities in the form &foo.bar; will not be quoted.

Method simple_parse_file: SimpleRootNode simple_parse_file(string path, void|mapping predefined_entities, ParseFlags|void flags, string|void default_namespace)
Description: Loads the XML file path, creates a SimpleNode tree representation and returns the root node.

Method simple_parse_input: SimpleRootNode simple_parse_input(string data, void|mapping predefined_entities, ParseFlags|void flags, string|void default_namespace)
Description: Takes an XML string and produces a SimpleNode tree.

Method text_quote: string text_quote(string data)
Description: Quotes the string given in data by escaping &, < and >.
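
A small sketch of where the two quoting functions fit when building markup by hand. The helper make_tag and its parameters are hypothetical; only text_quote and attribute_quote come from this module.

```pike
// Build a single element with one attribute, quoting the attribute
// value and the text content so that they cannot break the markup.
string make_tag(string name, string attr, string value, string text)
{
  return sprintf("<%s %s=\"%s\">%s</%s>",
                 name, attr,
                 Parser.XML.Tree.attribute_quote(value),
                 Parser.XML.Tree.text_quote(text),
                 name);
}

int main()
{
  write("%s\n", Parser.XML.Tree.text_quote("1 < 2 & 3 > 2"));
  write("%s\n", make_tag("p", "title", "a<b", "x&y"));
  return 0;
}
```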

Enum Parser.XML.Tree.ParseFlags

Description: Flags used together with simple_parse_input() and simple_parse_file().

Constant PARSE_CHECK_ALL_ERRORS: constant Parser.XML.Tree.PARSE_CHECK_ALL_ERRORS

Constant PARSE_COMPAT_ALLOW_ERRORS_7_2: constant Parser.XML.Tree.PARSE_COMPAT_ALLOW_ERRORS_7_2

Constant PARSE_COMPAT_ALLOW_ERRORS_7_6: constant Parser.XML.Tree.PARSE_COMPAT_ALLOW_ERRORS_7_6

Constant PARSE_DISALLOW_RXML_ENTITIES: constant Parser.XML.Tree.PARSE_DISALLOW_RXML_ENTITIES

Constant PARSE_ENABLE_NAMESPACES: constant Parser.XML.Tree.PARSE_ENABLE_NAMESPACES

Constant PARSE_FORCE_LOWERCASE: constant Parser.XML.Tree.PARSE_FORCE_LOWERCASE

Constant PARSE_WANT_ERROR_CONTEXT: constant Parser.XML.Tree.PARSE_WANT_ERROR_CONTEXT

Class Parser.XML.Tree.AbstractNode

Annotations

@Pike.Annotations.Implements(AbstractSimpleNode)

Description

Base class for nodes with parent pointers.

Inherit AbstractSimpleNode: inherit AbstractSimpleNode : AbstractSimpleNode

Method add_child: AbstractNode add_child(AbstractNode c)
Description: Adds the node c to the list of children of this node. The new node is added last in the list.
Note: Returns the new child node, NOT the current node.
Returns: The new child node is returned.

Method add_child_after: AbstractNode add_child_after(AbstractNode c, AbstractNode old)
Description: Adds the node c to the list of children of this node. The node is added after the node old, which is assumed to be an existing child of this node. The node is added first if old is zero.
Returns: The current node.

Method add_child_before: AbstractNode add_child_before(AbstractNode c, AbstractNode old)
Description: Adds the node c to the list of children of this node. The node is added before the node old, which is assumed to be an existing child of this node. The node is added last if old is zero.
Returns: The current node.

Method clone: AbstractNode clone(void|int(-1..1) direction)
Description: Clones the node, optionally connected to parts of the tree. If direction is -1 the cloned node's parent will be set; if direction is 1 the cloned node's children will be set.

Method fix_tree: void fix_tree()
Description: Fix all parent pointers recursively in a tree that has been built with tmp_add_child.

Method get_ancestors: array(AbstractNode) get_ancestors(bool include_self)
Description: Returns a list of all ancestors, with the top node last. The list will start with this node if include_self is set.

Method get_following: array(AbstractNode) get_following()
Description: Returns all the nodes that follow after the current one.

Method get_following_siblings: array(AbstractNode) get_following_siblings()
Description: Returns all following siblings, i.e. all siblings present after this node in the parent's children list.

Method get_parent: AbstractNode get_parent()
Description: Returns the parent node.

Method get_preceding: array(AbstractNode) get_preceding()
Description: Returns all preceding nodes, excluding this node's ancestors.

Method get_preceding_siblings: array(AbstractNode) get_preceding_siblings()
Description: Returns all preceding siblings, i.e. all siblings present before this node in the parent's children list.

Method get_root: AbstractNode get_root()
Description: Follows all parent pointers and returns the root node.

Method get_siblings: array(AbstractNode) get_siblings()
Description: Returns all siblings, including this node.

Method low_clone: optional AbstractNode low_clone()
Description: Returns an initialized copy of the node.
Note: The returned node has no children, and no parent.

Method remove_child: void remove_child(AbstractNode c)
Description: Removes all occurrences of the provided node from the called node's list of children. The removed node's parent reference is set to null.

Method remove_node: void remove_node()
Description: Removes this node from its parent. The parent reference is set to null.

Method replace_child: AbstractNode|zero replace_child(AbstractNode old, AbstractNode|array(AbstractNode) new)
Description: Replaces the first occurrence of the old node child with the new node child or children. All parent references are updated.
Note: The returned value is NOT the current node.
Returns: Returns the new child node.

Method replace_children: void replace_children(array(AbstractNode) children)
Description: Replaces the node's children with the provided ones. All parent references are updated.

Method replace_node: AbstractNode|array(AbstractNode) replace_node(AbstractNode|array(AbstractNode) new)
Description: Replaces this node with the provided one.
Returns: Returns the new node.

Method set_parent: void set_parent(AbstractNode parent)
Description: Sets the parent node to parent.

Method tmp_add_child
Method tmp_add_child_before
Method tmp_add_child_after

AbstractNode tmp_add_child(AbstractNode c)
AbstractNode tmp_add_child_before(AbstractNode c, AbstractNode old)
AbstractNode tmp_add_child_after(AbstractNode c, AbstractNode old)

Description

Variants of add_child, add_child_before and add_child_after that don't set the parent pointer in the newly added children.

This is useful while building a node tree, to get efficient refcount garbage collection if the build stops abruptly. fix_tree has to be called on the root node when the building is done.

Class Parser.XML.Tree.AbstractSimpleNode

Description: Base class for nodes.

Method `[]: AbstractSimpleNode|zero res = Parser.XML.Tree.AbstractSimpleNode()[ pos ]
Description: The [] operator indexes among the node's children, so node[0] returns the first node and node[-1] the last.
Note: The [] operator will select a node from all the node's children, not just its element children.

Method add_child: AbstractSimpleNode add_child(AbstractSimpleNode c)
Description: Adds the given node to the list of children of this node. The new node is added last in the list.
Note: The return value differs from the one returned by Node()->add_child().
Returns: The current node.
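
The difference in return value between the two add_child variants can be illustrated as below. The sketch assumes the SimpleElementNode class, which is part of this module but not described in this section; the element names are arbitrary.

```pike
// SimpleNode-based trees: add_child() returns the current (parent) node.
int simple_add_returns_parent()
{
  object parent = Parser.XML.Tree.SimpleElementNode("parent", ([]));
  object child = Parser.XML.Tree.SimpleElementNode("child", ([]));
  return parent->add_child(child) == parent;
}

// Node-based trees: add_child() returns the newly added child node.
int node_add_returns_child()
{
  object parent = Parser.XML.Tree.ElementNode("parent", ([]));
  object child = Parser.XML.Tree.ElementNode("child", ([]));
  int res = parent->add_child(child) == child;
  // Node-based trees have parent pointers; destruct the tree when done.
  parent->zap_tree();
  return res;
}

int main()
{
  write("%d %d\n", simple_add_returns_parent(), node_add_returns_child());
  return 0;
}
```

Because of this difference, calls can be chained on the parent in SimpleNode-based trees, while the same chain in a Node-based tree descends into the new child.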

Method add_child_after: AbstractSimpleNode add_child_after(AbstractSimpleNode c, AbstractSimpleNode old)
Description: Adds the node c to the list of children of this node. The node is added after the node old, which is assumed to be an existing child of this node. The node is added first if old is zero.
Returns: The current node.

Method add_child_before: AbstractSimpleNode add_child_before(AbstractSimpleNode c, AbstractSimpleNode old)
Description: Adds the node c to the list of children of this node. The node is added before the node old, which is assumed to be an existing child of this node. The node is added last if old is zero.
Returns: The current node.

Method clone: optional AbstractSimpleNode clone()
Description: Returns a clone of the sub-tree rooted in the node.

Method count_children: int count_children()
Description: Returns the number of children of the node.

Method get_children: array(AbstractSimpleNode) get_children()
Description: Returns all the node's children.

Method get_descendants: array(AbstractSimpleNode) get_descendants(bool include_self)
Description: Returns a list of all descendants in document order. Includes this node if include_self is set.

Method get_last_child: AbstractSimpleNode|zero get_last_child()
Description: Returns the last child node or zero.

Method iterate_children: int iterate_children(function(AbstractSimpleNode, mixed ... :int|void) callback, mixed ... args)
Description: Iterates over the node's children from left to right, calling the function callback for every node. If the callback function returns STOP_WALK the iteration is promptly aborted and STOP_WALK is returned.

Method low_clone: optional AbstractSimpleNode low_clone()
Description: Returns an initialized copy of the node.
Note: The returned node has no children.

Method node_factory

optional this_program node_factory(int type, string name, mapping attr, string text)

Description

Optional factory for creating contained nodes.

Parameter type

Type of node to create. One of:

`XML_TEXT`	XML text. `text` contains a string with the text.
`XML_COMMENT`	XML comment. `text` contains a string with the comment text.
`XML_HEADER`	`<?xml?>` header. `attr` contains a mapping with the attributes.
`XML_PI`	XML processing instruction. `name` contains the name of the processing instruction and `text` the remainder.
`XML_ELEMENT`	XML element tag. `name` contains the name of the tag and `attr` the attributes.
`XML_DOCTYPE`	DTD information.
`DTD_ENTITY`
`DTD_ELEMENT`
`DTD_ATTLIST`
`DTD_NOTATION`

Parameter name

Name of the tag if applicable.

Parameter attr

Attributes for the tag if applicable.

Parameter text

Contained text of the tag if any.

This function is called during parsing to create the various XML nodes.

Define this function to provide application-specific XML nodes.

Returns

Returns one of

`AbstractSimpleNode`	A node object representing the XML tag.
`int(0)`	`0` (zero) if the subtree rooted here should be cut.
`zero`	`UNDEFINED` to fall back to the next level of parser (ie behave as if this function does not exist).

Note

This function is only relevant for XML_ELEMENT nodes.

Note

This function is not available in Pike 7.6 and earlier.

Note

In Pike 8.0 and earlier this function was only called in root nodes.

Method remove_child: void remove_child(AbstractSimpleNode c)
Description: Removes all occurrences of the provided node from the list of children of this node.

Method replace_child: AbstractSimpleNode|zero replace_child(AbstractSimpleNode old, AbstractSimpleNode|array(AbstractSimpleNode) new)
Description: Replaces the first occurrence of the old node child with the new node child or children.
Note: The return value differs from the one returned by Node()->replace_child().
Returns: Returns the current node on success, and 0 (zero) if the node old wasn't found.

Method replace_children: void replace_children(array(AbstractSimpleNode) children)
Description: Replaces the node's children with the provided ones.

Method walk_inorder: int walk_inorder(function(AbstractSimpleNode, mixed ... :int|void) callback, mixed ... args)
Description: Traverse the node subtree in inorder, left subtree first, then root node, and finally the remaining subtrees, calling the function callback for every node. If the function callback returns STOP_WALK the traverse is promptly aborted and STOP_WALK is returned.

Method walk_postorder: int walk_postorder(function(AbstractSimpleNode, mixed ... :int|void) callback, mixed ... args)
Description: Traverse the node subtree in postorder, first subtrees from left to right, then the root node, calling the function callback for every node. If the function callback returns STOP_WALK the traverse is promptly aborted and STOP_WALK is returned.

Method walk_preorder: int walk_preorder(function(AbstractSimpleNode, mixed ... :int|void) callback, mixed ... args)
Description: Traverse the node subtree in preorder, root node first, then subtrees from left to right, calling the callback function for every node. If the callback function returns STOP_WALK the traverse is promptly aborted and STOP_WALK is returned.
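
A minimal sketch of an early-terminating preorder walk. It assumes the accessors get_node_type() and get_tag_name(), which belong to the node classes of this module but are not described in this section; the helper name element_exists is made up.

```pike
// Returns 1 if an element with the given tag name exists in the
// document, stopping the traversal at the first match.
int element_exists(string xml, string tag)
{
  object root = Parser.XML.Tree.simple_parse_input(xml);

  int res = root->walk_preorder(lambda(object n) {
    if (n->get_node_type() == Parser.XML.Tree.XML_ELEMENT &&
        n->get_tag_name() == tag)
      return Parser.XML.Tree.STOP_WALK;  // abort the walk
    return 0;                            // keep walking
  });

  // walk_preorder() returns STOP_WALK only if the callback aborted.
  return res == Parser.XML.Tree.STOP_WALK;
}

int main()
{
  write("%d\n", element_exists("<a><b/><c/></a>", "c"));
  write("%d\n", element_exists("<a><b/><c/></a>", "d"));
  return 0;
}
```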

Method walk_preorder_2: int walk_preorder_2(function(AbstractSimpleNode, mixed ... :int|void) cb_1, function(AbstractSimpleNode, mixed ... :int|void) cb_2, mixed ... args)
Description: Traverse the node subtree in preorder, root node first, then subtrees from left to right. For each node we call cb_1 before iterating through children, and then cb_2 (which always gets called even if the walk is aborted earlier). If the callback function returns STOP_WALK the descent is aborted and STOP_WALK is returned once all waiting cb_2 functions have been called.

Method zap_tree: void zap_tree()
Description: Destruct the tree recursively. When the inheriting AbstractNode or Node is used, which have parent pointers, this function should be called for every tree that is no longer in use to avoid frequent garbage collector runs.
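
The sketch below shows the parent pointers of a Node-based tree and the zap_tree() call that cleans it up afterwards. It uses only parse_input(), the [] operator, get_parent(), get_root() and count_children() as described in this chapter; the input document is an arbitrary example.

```pike
// Parse a small document into a Node-based tree, inspect the parent
// pointers, and destruct the tree when finished.
array(int) parent_pointer_checks()
{
  object root = Parser.XML.Tree.parse_input("<a><b/></a>");
  object a = root[0];   // first child of the root node
  object b = a[0];      // first child of <a>

  array(int) checks = ({
    b->get_parent() == a,     // parent pointer of <b>
    b->get_root() == root,    // follows parent pointers to the root
    root->count_children(),   // the root has a single child, <a>
  });

  // Break the circular references explicitly instead of leaving
  // them to the garbage collector.
  root->zap_tree();
  return checks;
}

int main()
{
  write("%O\n", parent_pointer_checks());
  return 0;
}
```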

Class Parser.XML.Tree.AttributeNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.AttributeNode Parser.XML.Tree.AttributeNode(string name, string value)

Class Parser.XML.Tree.CommentNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.CommentNode Parser.XML.Tree.CommentNode(string text)

Class Parser.XML.Tree.DTDAttlistNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.DTDAttlistNode Parser.XML.Tree.DTDAttlistNode(string name, mapping(string:string) attrs, string contents)

Class Parser.XML.Tree.DTDElementNode

Annotations

@Pike.Annotations.Implements(Node)

@Pike.Annotations.Implements(DTDElementHelper)

Inherit DTDElementHelper: inherit DTDElementHelper : DTDElementHelper

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.DTDElementNode Parser.XML.Tree.DTDElementNode(string name, array expression)

Class Parser.XML.Tree.DTDEntityNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.DTDEntityNode Parser.XML.Tree.DTDEntityNode(string name, mapping(string:string) attrs, string contents)

Class Parser.XML.Tree.DTDNotationNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.DTDNotationNode Parser.XML.Tree.DTDNotationNode(string name, mapping(string:string) attrs, string contents)

Class Parser.XML.Tree.DoctypeNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.DoctypeNode Parser.XML.Tree.DoctypeNode(string name, mapping(string:string) attrs, array|zero contents)

Class Parser.XML.Tree.ElementNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.ElementNode Parser.XML.Tree.ElementNode(string name, mapping(string:string) attrs)

Class Parser.XML.Tree.HeaderNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.HeaderNode Parser.XML.Tree.HeaderNode(mapping(string:string) attrs)

Class Parser.XML.Tree.Node

Annotations

@Pike.Annotations.Implements(AbstractNode)

@Pike.Annotations.Implements(VirtualNode)

Description

XML node with parent pointers.

Inherit AbstractNode: inherit AbstractNode : AbstractNode

Inherit VirtualNode: inherit VirtualNode : VirtualNode

Method get_attr_name: string get_attr_name()
Description: Returns the name of the attribute node.

Method get_attribute_nodes: array(Node) get_attribute_nodes()
Description: Creates and returns an array of new nodes. They will not be added as proper children to the parent node, but the parent link in each node is set so that upwards traversal is possible.

Method get_tag_name: string get_tag_name()
Description: Returns the name of the element node, or the nearest element above if an attribute node.

Class Parser.XML.Tree.PINode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.PINode Parser.XML.Tree.PINode(string name, mapping(string:string) attrs, string contents)

Class Parser.XML.Tree.RootNode

Annotations

@Pike.Annotations.Implements(Node)

Description

The root node of an XML-tree consisting of Nodes.

Inherit Node: inherit Node : Node

Inherit XMLParser: inherit XMLParser : XMLParser

Method create: Parser.XML.Tree.RootNode Parser.XML.Tree.RootNode(string|void data, mapping|void predefined_entities, ParseFlags|void flags)

Method flush_node_id_cache: void flush_node_id_cache()
Description: Clears the node id cache built and used by get_element_by_id.

Method get_element_by_id: ElementNode get_element_by_id(string id, int|void force)
Description: Find the element with the specified id.
Parameter id: The XML id of the node to search for.
Parameter force: Force a regeneration of the id lookup cache. Needed the first time after the node tree has been modified by adding or removing element nodes, or by changing the id attribute of an element node.
Returns: Returns the element node with the specified id if any. Returns UNDEFINED otherwise.
See also: flush_node_id_cache

Class Parser.XML.Tree.SimpleCommentNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleCommentNode Parser.XML.Tree.SimpleCommentNode(string comment)

Class Parser.XML.Tree.SimpleDTDAttlistNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleDTDAttlistNode Parser.XML.Tree.SimpleDTDAttlistNode(string name, mapping(string:string) attrs, string contents)

Class Parser.XML.Tree.SimpleDTDElementNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

@Pike.Annotations.Implements(DTDElementHelper)

Inherit DTDElementHelper: inherit DTDElementHelper : DTDElementHelper

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleDTDElementNode Parser.XML.Tree.SimpleDTDElementNode(string name, array expression)

Class Parser.XML.Tree.SimpleDTDEntityNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleDTDEntityNode Parser.XML.Tree.SimpleDTDEntityNode(string name, mapping(string:string) attrs, string contents)

Class Parser.XML.Tree.SimpleDTDNotationNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleDTDNotationNode Parser.XML.Tree.SimpleDTDNotationNode(string name, mapping(string:string) attrs, string contents)

Class Parser.XML.Tree.SimpleDoctypeNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleDoctypeNode Parser.XML.Tree.SimpleDoctypeNode(string name, mapping(string:string) attrs, array|zero contents)

Class Parser.XML.Tree.SimpleElementNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleElementNode Parser.XML.Tree.SimpleElementNode(string name, mapping(string:string) attrs)

Class Parser.XML.Tree.SimpleHeaderNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleHeaderNode Parser.XML.Tree.SimpleHeaderNode(mapping(string:string) attrs)

Class Parser.XML.Tree.SimpleNode

Annotations

@Pike.Annotations.Implements(AbstractSimpleNode)

@Pike.Annotations.Implements(VirtualNode)

Description

XML node without parent pointers and attribute nodes.

Inherit AbstractSimpleNode: inherit AbstractSimpleNode : AbstractSimpleNode

Inherit VirtualNode: inherit VirtualNode : VirtualNode

Class Parser.XML.Tree.SimplePINode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimplePINode Parser.XML.Tree.SimplePINode(string name, mapping(string:string) attrs, string contents)

Class Parser.XML.Tree.SimpleRootNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Description

The root node of an XML-tree consisting of SimpleNodes.

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Inherit XMLParser: inherit XMLParser : XMLParser

Method create: Parser.XML.Tree.SimpleRootNode Parser.XML.Tree.SimpleRootNode(string|void data, mapping|void predefined_entities, ParseFlags|void flags, string|void default_namespace)

Method flush_node_id_cache: void flush_node_id_cache()
Description: Clears the node id cache built and used by get_element_by_id.

Method get_element_by_id: SimpleElementNode get_element_by_id(string id, int|void force)
Description: Find the element with the specified id.
Parameter id: The XML id of the node to search for.
Parameter force: Force a regeneration of the id lookup cache. Needed the first time after the node tree has been modified by adding or removing element nodes, or by changing the id attribute of an element node.
Returns: Returns the element node with the specified id if any. Returns UNDEFINED otherwise.
See also: flush_node_id_cache
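Example: A minimal sketch of looking up elements by id and of the force argument after the tree has changed. It assumes that ids are taken from an attribute named id, which the text above does not spell out; text_by_id is a hypothetical helper.

```pike
// text_by_id() is a hypothetical helper: it returns the text of the
// element with the given id, or 0 if there is no such element.
string|zero text_by_id(object root, string id, void|int force)
{
  object node = root->get_element_by_id(id, force);
  return node && node->value_of_node();
}

int main()
{
  object root = Parser.XML.Tree.SimpleRootNode(
    "<r><item id='a'>first</item><item>second</item></r>");

  write("%O\n", text_by_id(root, "a"));

  // Give the second item an id. The lookup cache is now stale, so the
  // next lookup passes force = 1 to rebuild it.
  object second = root->get_first_element("r")->get_elements("item")[1];
  second->replace_attributes(([ "id": "b" ]));
  write("%O\n", text_by_id(root, "b", 1));
  return 0;
}
```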

Class Parser.XML.Tree.SimpleTextNode

Annotations

@Pike.Annotations.Implements(SimpleNode)

Inherit SimpleNode: inherit SimpleNode : SimpleNode

Method create: Parser.XML.Tree.SimpleTextNode Parser.XML.Tree.SimpleTextNode(string text)

Class Parser.XML.Tree.TextNode

Annotations

@Pike.Annotations.Implements(Node)

Inherit Node: inherit Node : Node

Method create: Parser.XML.Tree.TextNode Parser.XML.Tree.TextNode(string text)

Class Parser.XML.Tree.VirtualNode

Description: Node in XML tree

Method cast: (int)Parser.XML.Tree.VirtualNode() (float)Parser.XML.Tree.VirtualNode() (string)Parser.XML.Tree.VirtualNode() (array)Parser.XML.Tree.VirtualNode() (mapping)Parser.XML.Tree.VirtualNode() (multiset)Parser.XML.Tree.VirtualNode()
Description: It is possible to cast a node to a string, which will return render_xml() for that node.

Method create: Parser.XML.Tree.VirtualNode Parser.XML.Tree.VirtualNode(int type, string|zero name, mapping|zero attr, string|zero text)

Method get_any_name: string get_any_name()
Description: Return name of tag or name of attribute node.

Method get_attributes: mapping(string:string) get_attributes()
Description: Returns this node's attributes. The returned mapping can be altered destructively to change the node's attributes.
See also: replace_attributes()

Method get_doc_order: int get_doc_order()

Method get_elements: array(AbstractNode) get_elements(string|void name, bool|void full)
Description: Returns all element children of this node.
Parameter name: If provided, only elements with that name are returned.
Parameter full: If specified, name matching will be done against the full name.
Returns: Returns an array with matching nodes.

Method get_first_element: AbstractNode|zero get_first_element(string|void name, bool|void full)
Description: Returns the first element child of this node.
Parameter name: If provided, the first element child with that name is returned.
Parameter full: If specified, name matching will be done against the full name.
Returns: Returns the first matching node, and 0 if no such node was found.
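Example: get_first_element and get_elements are commonly combined to pick out one container and then its children. A minimal sketch, assuming a tree built with the SimpleRootNode constructor; item_texts and the element names list and item are made up for illustration.

```pike
// item_texts() is a hypothetical helper that returns the text of every
// <item> directly below the first <list> element.
array(string) item_texts(string xml)
{
  object root = Parser.XML.Tree.SimpleRootNode(xml);
  object list = root->get_first_element("list");
  if (!list) return ({});  // get_first_element() gives 0 when nothing matches.

  return map(list->get_elements("item"),
             lambda(object item) { return item->value_of_node(); });
}

int main()
{
  string xml = "<list><item>one</item><note>skip</note><item>two</item></list>";
  write("%O\n", item_texts(xml));  // "one" and "two"
  return 0;
}
```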

Method get_full_name: string get_full_name()
Description: Return fully qualified name of the element node.

Method get_namespace: string get_namespace()
Description: Return the (resolved) namespace for this node.

Method get_node_type: int get_node_type()
Description: Returns the node type. See defined node type constants.

Method get_short_attributes: mapping get_short_attributes()
Description: Returns this node's namespace-adjusted attributes.
Note: set_short_namespaces() or set_short_attributes() must have been called before calling this function.

Method get_tag_name: string get_tag_name()
Description: Returns the name of the element node, or the nearest element above if an attribute node.

Method get_text: string|zero get_text()
Description: Returns text content in node.

Method render_to_file: void render_to_file(Stdio.File f, void|bool preserve_roxen_entities)
Description: Creates an XML representation of the node subtree and streams the output to the file f. If the flag preserve_roxen_entities is set, entities of the form &foo.bar; will not be escaped.

Method render_xml

string render_xml(void|bool preserve_roxen_entities, void|mapping(string:string) namespace_lookup, void|string encoding, void|int(2bit) quote_mode)

Description

Creates an XML representation of the node subtree. If the flag preserve_roxen_entities is set, entities of the form &foo.bar; will not be escaped.

Parameter namespace_lookup

Mapping from namespace prefix to namespace symbol prefix.

Parameter encoding

Force a specific output character encoding. By default the encoding set in the document XML processing instruction will be used, with UTF-8 as a fallback. Setting this value will change the XML processing instruction, if present.

Parameter quote_mode

`0`	Defaults to single quote, but use double quote if it avoids escaping.
`1`	Defaults to double quote, but use single quote if it avoids escaping.
`2`	Use only single quote.
`3`	Use only double quote.
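
Example

A minimal sketch of a parse, modify and render round trip, assuming a tree built with the SimpleRootNode constructor. The helper set_lang and the attribute name lang are hypothetical. The exact quoting of attribute values in the result depends on quote_mode, so no literal output is shown.

```pike
// set_lang() is a hypothetical helper: it sets a "lang" attribute on the
// top-level element and returns the re-rendered document.
string set_lang(string xml, string lang)
{
  object root = Parser.XML.Tree.SimpleRootNode(xml);
  object top = root->get_first_element();
  if (top) {
    // Merge the new attribute into the existing ones.
    mapping attrs = (top->get_attributes() || ([])) + ([ "lang": lang ]);
    top->replace_attributes(attrs);
  }
  return root->render_xml();
}

int main()
{
  write("%s\n", set_lang("<p>hello</p>", "en"));
  return 0;
}
```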

Method replace_attributes: void replace_attributes(mapping(string:string) attrs)
Description: Replace the entire set of attributes.
See also: get_attributes()

Method set_doc_order: void set_doc_order(int o)

Method set_short_attributes: void set_short_attributes(mapping short_attrs)
Description: Sets this node's namespace-adjusted attributes.

Method set_tag_name: void set_tag_name(string name)
Description: Change the tag name destructively. Can only be used on element and processing-instruction nodes.

Method set_text: string set_text(string txt)
Description: Change the text content destructively.

Method value_of_node: string value_of_node()
Description: If the node is an attribute node or a text node, its value is returned. Otherwise the child text nodes are concatenated and returned.

Class Parser.XML.Tree.XMLNSParser

Description: Namespace aware parser.

Method Enter: mapping(string:string) Enter(mapping(string:string) attrs)
Description: Check attrs for namespaces.
Returns: Returns the namespace expanded version of attrs.

Class Parser.XML.Tree.XMLParser

Description

Mixin for parsing XML.

Uses Parser.XML.Simple to perform the actual parsing.

Method node_factory

protected AbstractSimpleNode node_factory(int type, string name, mapping attr, string text)

Description

Factory for creating nodes.

Parameter type

Type of node to create. One of:

`XML_TEXT`	XML text. `text` contains a string with the text.
`XML_COMMENT`	XML comment. `text` contains a string with the comment text.
`XML_HEADER`	`<?xml?>` header. `attr` contains a mapping with the attributes.
`XML_PI`	XML processing instruction. `name` contains the name of the processing instruction and `text` the remainder.
`XML_ELEMENT`	XML element tag. `name` contains the name of the tag and `attr` the attributes.
`XML_DOCTYPE`	DTD information.
`DTD_ENTITY`
`DTD_ELEMENT`
`DTD_ATTLIST`
`DTD_NOTATION`

Parameter name

Name of the tag if applicable.

Parameter attr

Attributes for the tag if applicable.

Parameter text

Contained text of the tag, if any.

This function is called during parsing to create the various XML nodes.

Overload this function to provide application-specific XML nodes.

Returns

Returns a node object representing the XML tag, or 0 (zero) if the subtree rooted in the tag should be cut.

Note

This function is not available in Pike 7.6 and earlier.

See also

node_factory_dispatch(), AbstractSimpleNode()->node_factory()

Method node_factory_dispatch

protected AbstractSimpleNode node_factory_dispatch(int type, string name, mapping|zero attr, string text)

Description

Dispatcher of node_factory().

This function finds a suitable node_factory() given the current parser context to call with the same arguments.

Class Parser.HTML

Description

This is a simple parser for SGML structured markups. It's not really HTML, but it's useful for that purpose.

The simple way to use it is to give it some information about available tags and containers, and what callbacks those are to call.

The object is easily reused, by calling the clone() function.

See also

add_tag, add_container, finish

Method _inspect

mapping _inspect()

Description

This is a low-level way of debugging a parser. This gives a mapping of the internal state of the Parser.HTML object.

The format and contents of this mapping may change without further notice.

Method _set_tag_callback
Method _set_entity_callback
Method _set_data_callback

Description

These functions set up the parser object to call the given callbacks upon tags, entities and/or data. The callbacks will only be called if there isn't another tag/container/entity handler for these.

The callback function will be called with the parser object as first argument, and the active string as second. Note that no parsing of the contents has been done. Both endtags and normal tags are called; there is no container parsing.

The return values from the callbacks are handled in the same way as the return values from callbacks registered with add_tag and similar functions.

The data callback will be called as seldom as possible with the longest possible string, as long as it doesn't get called out of order with any other callback. It will never be called with a zero length string.

If a string or array is given instead of a function, it will act as the return value from the function. Arrays or empty strings are probably preferable, to avoid recursion.

Returns

Returns the object being called.
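
Example

A catch-all tag callback is enough to strip markup from a string. This is a minimal sketch; strip_tags is a hypothetical helper name.

```pike
// strip_tags() is a hypothetical helper that removes every tag and keeps
// the text between the tags.
string strip_tags(string html)
{
  Parser.HTML p = Parser.HTML();
  // Called for every tag without a more specific handler. Returning an
  // empty array replaces the tag with nothing and avoids any reparsing.
  p->_set_tag_callback(lambda(Parser.HTML parser, string tag) {
    return ({});
  });
  return p->finish(html)->read();
}

int main()
{
  write("%s\n", strip_tags("<p>Hello <i>world</i></p>"));  // Hello world
  return 0;
}
```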

Method add_tag
Method add_container
Method add_entity
Method add_quote_tag
Method add_tags
Method add_containers
Method add_entities

Parser.HTML add_tag(string name, mixed to_do)
Parser.HTML add_container(string name, mixed to_do)
Parser.HTML add_entity(string entity, mixed to_do)
Parser.HTML add_quote_tag(string name, mixed to_do, string end)
Parser.HTML add_tags(mapping(string:mixed) tags)
Parser.HTML add_containers(mapping(string:mixed) containers)
Parser.HTML add_entities(mapping(string:mixed) entities)

Description

Registers the actions to take when parsing various things. Tags, containers, entities are as usual. add_quote_tag() adds a special kind of tag that reads any data until the next occurrence of the end string immediately before a tag end.

Parameter to_do

This argument can be any of the following.

`function(:void)`	The function will be called as a callback function. It will get the following arguments, depending on the type of callback: `mixed tag_callback(Parser.HTML parser, mapping args, mixed ... extra)`; `mixed container_callback(Parser.HTML parser, mapping args, string content, mixed ... extra)`; `mixed entity_callback(Parser.HTML parser, mixed ... extra)`; `mixed quote_tag_callback(Parser.HTML parser, string content, mixed ... extra)`.
`string`	This tag/container/entity is then replaced by the string. The string is normally not reparsed, i.e. it's equivalent to writing a function that returns the string in an array (but a lot faster). If `reparse_strings` is set the string will be reparsed, though.
`array`	The first element is a function as above. It will receive the rest of the array as extra arguments. If extra arguments are given by `set_extra()`, they will appear after the ones in this array.
`int(0..)`	If there is a tag/container/entity with the given name in the parser, it's removed.

The callback function can return:

`string`	This string will be pushed on the parser stack and be parsed. Be careful not to return anything in this way that could lead to infinite recursion.
`array`	The element(s) of the array are the result of the function. They will not be parsed, which is useful for avoiding infinite recursion. The array can be of any size, so the empty array is the most efficient thing to return if you don't care about the result. If the parser is operating in `mixed_mode`, the array can contain anything. Otherwise only strings are allowed.
`int(0)`	This means "don't do anything", i.e. the item that generated the callback is left as it is, and the parser continues.
`int(1)`	Reparse the last item again. This is useful to parse a tag as a container, or vice versa: just add or remove callbacks for the tag and return this to jump to the right callback.

Returns

Returns the object being called.

See also

tags, containers, entities
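
Example

A minimal sketch of the two most common forms of to_do: a callback function for a container and a plain replacement string for a tag. The helper name render and the tag names are made up for illustration.

```pike
// render() is a hypothetical helper that upper-cases the content of <b>
// containers and replaces <hr> tags with a fixed string.
string render(string html)
{
  Parser.HTML p = Parser.HTML();

  // Container callback: returning an array means the result is used
  // as is and not parsed again.
  p->add_container("b", lambda(Parser.HTML parser, mapping args,
                               string content) {
    return ({ upper_case(content) });
  });

  // String replacement: the tag is replaced by the string.
  p->add_tag("hr", "----");

  return p->finish(html)->read();
}

int main()
{
  write("%s\n", render("a <b>bold</b><hr>z"));  // a BOLD----z
  return 0;
}
```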

Method at
Method at_line
Method at_char
Method at_column

array(int) at()
int at_line()
int at_char()
int at_column()

Description

Returns the current position. Characters and columns count from 0, lines count from 1.

at() gives an array with the following layout.

Array
`int 0`	Line.
`int 1`	Character.
`int 2`	Column.

Method case_insensitive_tag: int case_insensitive_tag(void|int value)
Description: All tags and containers are matched case insensitively, and argument names are converted to lowercase. Tags added with add_quote_tag() are not affected, though. Switching to case insensitive mode and back won't preserve the case of registered tags and containers.
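Example: A minimal sketch of case insensitive matching. The mode is enabled before the tag is registered, since switching modes does not preserve the case of already registered names. mark_breaks is a hypothetical helper.

```pike
// mark_breaks() is a hypothetical helper that replaces <br> in any
// letter case with a "|" marker.
string mark_breaks(string html)
{
  Parser.HTML p = Parser.HTML();
  p->case_insensitive_tag(1);  // Enable before registering tags.
  p->add_tag("br", "|");
  return p->finish(html)->read();
}

int main()
{
  write("%s\n", mark_breaks("a<BR>b<Br>c"));  // a|b|c
  return 0;
}
```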

Method clear_tags
Method clear_containers
Method clear_entities
Method clear_quote_tags

Parser.HTML clear_tags()
Parser.HTML clear_containers()
Parser.HTML clear_entities()
Parser.HTML clear_quote_tags()
Description: Removes all registered definitions in the different categories.
Returns: Returns the object being called.
See also: add_tag, add_tags, add_container, add_containers, add_entity, add_entities

Method clone

Parser.HTML clone(mixed ... args)

Description

Clones the Parser.HTML object. A new object of the same class is created, filled with the parse setup from the old object.

This is the simplest way of flushing a parse feed/output.

The arguments to clone are sent to the new object, simplifying work for custom classes that inherit Parser.HTML.

Returns

Returns the new object.

Note

create is called _before_ the setup is copied.
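
Example

A minimal sketch of the reuse pattern: one parser is set up once and a fresh clone is used for each input, so that no feed or output state leaks between documents. make_template and convert are hypothetical helper names.

```pike
// make_template() builds a parser that is only used as a template.
Parser.HTML make_template()
{
  Parser.HTML p = Parser.HTML();
  p->add_tag("hr", "----");
  return p;
}

// convert() parses one document with a fresh clone of the template.
string convert(Parser.HTML template, string html)
{
  return template->clone()->finish(html)->read();
}

int main()
{
  Parser.HTML template = make_template();
  write("%s\n", convert(template, "a<hr>b"));  // a----b
  write("%s\n", convert(template, "<hr>"));    // ----
  return 0;
}
```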

Method tags
Method containers
Method entities

mapping(string:mixed) tags()
mapping(string:mixed) containers()
mapping(string:mixed) entities()

Description

Returns the current callback settings. When matching is done case insensitively, all names will be returned in lowercase.

Implementation note: These run in constant time since they return copy-on-write mappings.

See also

add_tag, add_tags, add_container, add_containers, add_entity, add_entities

Method context

string context()

Description

Returns the current output context as a string.

`"data"`	In top level data. This is always returned when called from tag or container callbacks.
`"arg"`	In an unquoted argument.
`"splice_arg"`	In a splice argument.

The return value can also be a single character string, in which case the context is a quoted argument. The string contains the starting quote character.

This function is typically only useful in entity callbacks, which can be called both from text and argument values of different sorts.

See also

splice_arg

Method current: string current()
Description: Gives the current range of data, i.e. the whole tag/entity/etc. being parsed in the current callback. Returns zero if there's no current range, i.e. when the function is not called in a callback.

Method feed

Parser.HTML feed()
Parser.HTML feed(string s, void|int do_parse)

Description

Feed new data to the Parser.HTML object. This will start a scan and may result in callbacks. Note that not all of the fed data is necessarily processed; call finish() to process the rest.

If the function is called without arguments, no data is fed, but the parser is run. If the string argument is followed by a 0, ->feed(s,0);, the string is fed, but the parser isn't run.

Returns

Returns the object being called.

See also

finish, read, feed_insert
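
Example

A minimal sketch of incremental parsing, where a tag is split across two chunks. Output is collected with read() after each feed() and once more after finish(). parse_chunks and the tag name x are made up for illustration.

```pike
// parse_chunks() is a hypothetical helper that feeds the input piece by
// piece and collects whatever output is available after each piece.
string parse_chunks(array(string) chunks)
{
  Parser.HTML p = Parser.HTML();
  p->add_tag("x", "!");

  string out = "";
  foreach (chunks, string chunk) {
    p->feed(chunk);
    out += p->read();  // May be only part of what was fed so far.
  }
  // finish() processes anything still pending in the parser.
  out += p->finish()->read();
  return out;
}

int main()
{
  // The tag <x> is split across the two chunks.
  write("%s\n", parse_chunks(({ "a<", "x>b" })));  // a!b
  return 0;
}
```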

Method feed_insert: Parser.HTML feed_insert(string s)
Description: This pushes a string on the parser stack.
Returns: Returns the object being called.
Note: Don't use!

Method finish: Parser.HTML finish()
Parser.HTML finish(string s)
Description: Finish a parser pass. A string may be sent here, similar to feed().
Returns: Returns the object being called.

Method get_extra: array get_extra()
Description: Gets the extra arguments set by set_extra().
Returns: Returns the extra arguments as an array.

Method ignore_comments: int ignore_comments(void|int value)

Method ignore_tags: int ignore_tags(void|int value)
Description: Do not look for tags at all. Normally tags are matched even when there are no callbacks for them at all. When this is set, the tag delimiters '<' and '>' will be treated as any normal character.

Method ignore_unknown: int ignore_unknown(void|int value)
Description: Treat unknown tags and entities as text data, continuing parsing for tags and entities inside them.
Note: When functions are specified with _set_tag_callback() or _set_entity_callback(), all tags or entities, respectively, are considered known. However, if one of those functions returns 1 and ignore_unknown is set, they are treated as text data instead of making another call to the same function again.

Method lazy_argument_end: int lazy_argument_end(void|int value)
Description: A '>' in a tag argument closes both the argument and the tag, even if the argument is quoted.

Method lazy_entity_end: int lazy_entity_end(void|int value)
Description: Normally, the parser searches indefinitely for the entity end character (i.e. ';'). When this flag is set, the characters '&', '<', '>', '"', ''', and any whitespace break the search for the entity end, and the entity text is then ignored, i.e. treated as data.

Method match_tag: int match_tag(void|int value)
Description: Unquoted nested tag starters and enders will be balanced when parsing tags. This is the default.

Method max_parse_depth: int max_parse_depth(void|int value)
Description: Maximum recursion depth during parsing. Recursion occurs when a tag/container/entity/quote tag callback function returns a string to be reparsed. The default value is 10.

Method mixed_mode: int mixed_mode(void|int value)
Description: Allow callbacks to return arbitrary data in the arrays, which will be concatenated in the output.

Method nestling_entity_end: int nestling_entity_end(void|int value)

Method parse_tag_args: mapping parse_tag_args(string tag)
Description: Parses the tag arguments from a tag string without the name and surrounding brackets, i.e. a string of the form "some='tag' args".
Returns: Returns a mapping containing the tag arguments.
See also: tag_args

Method parse_tag_name: string parse_tag_name(string tag)
Description: Parses the tag name from a tag string without the surrounding brackets, i.e. a string of the form "tagname some='tag' args".
Returns: Returns the tag name or an empty string if none.
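Example: A minimal sketch of parse_tag_name and parse_tag_args on strings of the forms described above. name_of and args_of are hypothetical helpers, and the sample tag is made up.

```pike
// name_of() and args_of() are hypothetical helpers around a throwaway
// parser object.
string name_of(string tag)
{
  // Expects the tag text without the surrounding brackets.
  return Parser.HTML()->parse_tag_name(tag);
}

mapping args_of(string args)
{
  // Expects the argument part only, without the tag name.
  return Parser.HTML()->parse_tag_args(args);
}

int main()
{
  write("%O\n", name_of("img src='a.png' alt=\"A\""));  // "img"
  write("%O\n", args_of("src='a.png' alt=\"A\""));
  return 0;
}
```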

Method quote_stapling: int quote_stapling(int|void enable)
Description: Enable old-style attribute quoting by stapling.
Parameter enable: Enable/disable the mode. Defaults to keeping the old setting.
Returns: Returns the prior setting.
Note: Any use of this mode is discouraged, and is only provided for compatibility with versions of Pike prior to 8.0.
Note: This mode will output runtime warnings whenever it has had an effect on the parsing.

Method quote_tags

mapping(string:array(mixed|string)) quote_tags()

Description

Returns the current callback settings. The values are arrays ({callback, end_quote}). When matching is done case insensitively, all names will be returned in lowercase.

Implementation note: quote_tags() allocates a new mapping for every call and thus, unlike e.g. tags(), runs in linear time.

See also

add_quote_tag

Method read: string|array(mixed) read()
string|array(mixed) read(int max_elems)
Description: Read parsed data from the parser object.
Returns: Returns a string of parsed data if the parser isn't in mixed_mode, an array of arbitrary data otherwise.

Method reparse_strings: int reparse_strings(void|int value)
Description: When a plain string is used as a tag/container/entity/quote tag callback, it's not reparsed if this flag is unset. Setting it causes all such strings to be reparsed.

Method set_extra: Parser.HTML set_extra(mixed ... args)
Description: Sets the extra arguments passed to all tag, container and entity callbacks.
Returns: Returns the object being called.

Method splice_arg

string splice_arg(void|string name)

Description

If given a string, it sets the splice argument name to it. It returns the old splice argument name.

If a splice argument name is set, it's parsed in all tags, both those with callbacks and those without. Wherever it occurs, its value (after being parsed for entities in the normal way) is inserted directly into the tag. E.g:

<foo arg1="val 1" splice="arg2='val 2' arg3" arg4>

becomes

<foo arg1="val 1" arg2='val 2' arg3 arg4>

if "splice" is set as the splice argument name.

Method tag

array tag(void|mixed default_value)

Description

Returns the equivalent of the following calls.

Array
`string 0`	`tag_name()`
`mapping(string:mixed) 1`	`tag_args(default_value)`
`string 2`	`tag_content()`

Method tag_args: mapping(string:mixed) tag_args(void|mixed default_value)
Description: Gives the arguments of the current tag, parsed to a convenient mapping consisting of key:value pairs. If the current thing isn't a tag, it gives zero. default_value is used for arguments which have no value in the tag. If default_value isn't given, the value is set to the same string as the key.
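Example: A minimal sketch of the effect of default_value on arguments that have no value in the tag. input_args is a hypothetical helper, and the tag and argument names are made up.

```pike
// input_args() is a hypothetical helper. For each <input> tag it records
// the arguments twice: without and with an explicit default_value.
array(mapping) input_args(string html)
{
  array(mapping) seen = ({});
  Parser.HTML p = Parser.HTML();
  p->add_tag("input", lambda(Parser.HTML parser, mapping args) {
    seen += ({ parser->tag_args(), parser->tag_args(1) });
    return 0;  // Leave the tag as it is and continue.
  });
  p->finish(html);
  return seen;
}

int main()
{
  // First mapping:  "disabled" maps to the string "disabled".
  // Second mapping: "disabled" maps to the default value 1.
  write("%O\n", input_args("<input name=q disabled>"));
  return 0;
}
```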

Method tag_content: string tag_content()
Description: Gives the content of the current tag, if it's a container or quote tag. Otherwise returns zero.

Method tag_name: string|zero tag_name()
Description: Gives the name of the current tag, or zero. If used from an entity callback, it gives the string inside the entity.

Method write_out

Parser.HTML write_out(mixed ... args)

Description

Send data to the output stream, i.e. it won't be parsed and it won't be sent to the data callback, if any.

Any data is allowed when the parser is running in mixed_mode. Only strings are allowed otherwise.

Returns

Returns the object being called.

Method ws_before_tag_name: int ws_before_tag_name(void|int value)
Description: Allow whitespace between the tag start character and the tag name.

Method xml_tag_syntax

int xml_tag_syntax(void|int value)

Description

Whether or not to use XML syntax to tell empty tags and container tags apart.

`0`	Use HTML syntax only. If there's a `'/'` last in a tag, it's just treated as any other argument.
`1`	Use HTML syntax, but ignore a `'/'` if it comes last in a tag. This is the default.
`2`	Use XML syntax, but when a tag that does not end with `'/>'` is found which only got a non-container tag callback, treat it as a non-container (i.e. don't start to seek for the container end).
`3`	Use XML syntax only. If a tag got both container and non-container callbacks, the non-container callback is called when the empty element form (i.e. the one ending with `'/>'`) is used, and the container callback otherwise. If only a container callback exists, it gets the empty string as content when there's none to be parsed. If only a non-container callback exists, it will be called (without the content argument) for both kinds of tags.

Module Parser

Method decode_numeric_xml_entity: string|zero decode_numeric_xml_entity(string chref)
Description: Decodes the numeric XML entity chref, e.g. "&#x34;" and returns the character as a string. chref is the name part of the entity, i.e. without the leading '&' and trailing ';'. Returns zero if chref isn't in a recognized form or if the character number is too large to be represented in a string.
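Example: A minimal sketch showing that the leading '&' and trailing ';' must be removed before the call. decode_entity is a hypothetical wrapper that does the trimming.

```pike
// decode_entity() is a hypothetical wrapper that accepts a complete
// entity such as "&#65;" and passes only the name part on.
string|zero decode_entity(string entity)
{
  if (has_prefix(entity, "&") && has_suffix(entity, ";"))
    return Parser.decode_numeric_xml_entity(entity[1..<1]);
  return 0;
}

int main()
{
  write("%O\n", decode_entity("&#65;"));   // "A" (decimal form)
  write("%O\n", decode_entity("&#x41;"));  // "A" (hexadecimal form)
  write("%O\n", decode_entity("&amp;"));   // 0, not a numeric entity
  return 0;
}
```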

Method encode_html_entities

string encode_html_entities(string raw)

Description

Encode characters to HTML entities, e.g. turning "<" into "&lt;".

The characters that will be encoded are characters <= 32, "\"&'<>" and characters >= 127 and <= 160 and characters >= 255.
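
Example

A minimal sketch of escaping text before it is placed in markup. escape is a hypothetical wrapper. The sample input deliberately contains no spaces, since characters <= 32 are encoded as well.

```pike
// escape() is a hypothetical wrapper around Parser.encode_html_entities().
string escape(string raw)
{
  return Parser.encode_html_entities(raw);
}

int main()
{
  write("%s\n", escape("a<b&c"));  // a&lt;b&amp;c
  return 0;
}
```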

Method get_xml_parser: HTML get_xml_parser()
Description: Returns a Parser.HTML initialized for parsing XML. It has all the flags set properly for XML syntax and callbacks to ignore comments, CDATA blocks and unknown PI tags, but it has no registered tags and doesn't decode any entities.

Method html_entity_parser
Method parse_html_entities

HTML html_entity_parser()
string parse_html_entities(string in)
HTML html_entity_parser(int noerror)
string parse_html_entities(string in, int noerror)
Description: Parse any HTML entities in the string to Unicode characters. html_entity_parser() returns a complete parser (to build on or use), while parse_html_entities() parses a string directly. Unless noerror is set, an error is thrown if the string contains an unrecognized entity.
Note: Currently using XHTML 1.0 tables.
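Example: A minimal sketch of decoding a string with a few well-known entities. unescape is a hypothetical wrapper, and noerror is left unset, so an unrecognized entity would throw.

```pike
// unescape() is a hypothetical wrapper around Parser.parse_html_entities().
string unescape(string in)
{
  return Parser.parse_html_entities(in);
}

int main()
{
  write("%s\n", unescape("&lt;b&gt; &amp; &quot;x&quot;"));  // <b> & "x"
  return 0;
}
```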

Class Parser.CSV

Description

This is a parser for line oriented data that is either comma, semi-colon or tab separated. It extends the functionality of the Parser.Tabular with some specific functionality related to a header and record oriented parsing of huge datasets.

We document only the differences with the basic Parser.Tabular.

See also

Parser.Tabular

Inherit Tabular: inherit Parser.Tabular : Tabular

Method fetchrecord: mapping fetchrecord(void|array|mapping format)
Description: This function consumes a single record from the input. To be used in conjunction with parsehead().
Returns: It returns the mapping describing the record.
See also: parsehead(), fetch()

Method parsehead: int parsehead(void|string delimiters, void|string|object matchfieldname)
Description: This function consumes the header-line preceding a typical comma, semicolon or tab separated value list and autocompiles a format description from that. After this function has successfully parsed a header-line, you can proceed with either fetchrecord() or fetch() to get the remaining records.
Parameter delimiters: Explicitly specify a string containing all the characters that should be considered field delimiters. If not specified or empty, the function will try to autodetect the single delimiter in use.
Parameter matchfieldname: A string containing a regular expression, using Regexp.SimpleRegexp syntax, or an object providing a Regexp.SimpleRegexp.match() single string argument compatible method, that must match all the individual fieldnames before the header will be considered valid.
Returns: It returns true if a CSV head has successfully been parsed.
See also: fetchrecord(), fetch(), compile()
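
Example

A sketch of the parsehead()/fetchrecord() combination described above. It assumes the constructor inherited from Parser.Tabular accepts the input as a plain string; read_records is a hypothetical helper name and the sample data is invented.

```pike
// Hypothetical helper: returns every record of a CSV text as a mapping.
array(mapping) read_records(string data)
{
  Parser.CSV csv = Parser.CSV(data);
  array(mapping) records = ({});

  // Without arguments, parsehead() tries to autodetect the delimiter.
  if (!csv->parsehead())
    return records;

  mapping rec;
  while ((rec = csv->fetchrecord()))
    records += ({ rec });
  return records;
}

int main()
{
  foreach (read_records("name,age\nAlice,30\nBob,25\n"), mapping rec)
    write("%O\n", rec);
  return 0;
}
```

Fetching one record at a time keeps memory use flat for large inputs, which is the use case the class description mentions.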

Class Parser.RCS

Description: An RCS file parser that eats an RCS *,v file and presents nice Pike data structures of its contents.

Inherit _RCS: inherit Parser._RCS : _RCS

Constant max_revisions_supported: constant int Parser.RCS.max_revisions_supported
Description: Feature detection constant for the max_revisions argument to create(), parse() and parse_delta_sections().

Variable access: array(string) Parser.RCS.access
Description: The usernames listed in the ACCESS section of the RCS file.

Variable branch: string|int(0) Parser.RCS.branch
Description: The default branch (or revision), if present, 0 otherwise.

Variable branches: mapping(string:string) Parser.RCS.branches
Description: Maps branch numbers (indices) to branch names (values).
Note: The indices are short branch revision numbers (ie "1.1.2" and not "1.1.0.2").

Variable comment: string|int(0) Parser.RCS.comment
Description: The RCS file comment if present, 0 otherwise.

Variable description: string Parser.RCS.description
Description: The RCS file description.

Variable expand: string Parser.RCS.expand
Description: The keyword expansion options (as named by RCS) if present, 0 otherwise.

Variable head: string Parser.RCS.head
Description: Version number of the head version of the file.

Variable locks: mapping(string:string) Parser.RCS.locks
Description: Maps from username to revision for users that have acquired locks on this file.

Variable rcs_file_name: string Parser.RCS.rcs_file_name
Description: The filename of the RCS file as sent to create().

Variable revisions: mapping(string:Revision) Parser.RCS.revisions
Description: Data for all revisions of the file. The indices of the mapping are the revision numbers, whereas the values are the data from the corresponding revision.

Variable strict_locks: bool Parser.RCS.strict_locks
Description: 1 if strict locking is set, 0 otherwise.

Variable tags: mapping(string:string) Parser.RCS.tags
Description: Maps tag names (indices) to tagged revision numbers (values).
Note: This mapping typically contains raw revision numbers for branches (ie "1.1.0.2" and not "1.1.2").

Variable trunk: array(Revision) Parser.RCS.trunk
Description: Data for all revisions on the trunk, sorted in the same order as the RCS file stored them - ie descending, most recent first, I'd assume (rcsfile(5), of course, fails to state such irrelevant information).

Method create: Parser.RCS Parser.RCS(string|void file_name, string|int(0)|void file_contents, void|int max_revisions)
Description: Initializes the RCS object.
Parameter file_name: The path to the raw RCS file (includes trailing ",v"). Used mainly for error reporting (truncated RCS file or similar). Stored in rcs_file_name.
Parameter file_contents: If a string is provided, that string will be parsed to initialize the RCS object. If a zero (0) is sent, no initialization will be performed at all. If no value is given at all, but file_name was provided, that file will be loaded and parsed for object initialization.
Parameter max_revisions: Maximum number of revisions to process. If unset, all revisions will be processed.

Method expand_keywords_for_revision

string|zero expand_keywords_for_revision(string|Revision rev, string|void text, int|void expansion_mode)

Description

Expand keywords and return the resulting text according to the expansion rules set for the file.

Parameter rev

The revision to apply the expansion for.

Parameter text

If supplied, substitute keywords for that text instead using values that would apply for the given revision. Otherwise, revision rev is used.

Parameter expansion_mode

Expansion mode

`1`	Perform expansion even if the file was checked in as binary.
`0`	Perform expansion only if the file was checked in as non-binary with expansion enabled.
`-1`	Perform contraction if the file was checked in as non-binary.

Note

The Log keyword (which lacks sane quoting rules) is not expanded. Keyword expansion rules set in CVSROOT/cvswrappers are ignored. Only implements the -kkv, -ko and -kb expansion modes.

Note

Does not perform any line-ending conversion.

See also

get_contents_for_revision

Method get_contents_for_revision: string|zero get_contents_for_revision(string|Revision rev, void|bool dont_cache_data)
Description: Returns the file contents from the revision rev, without performing any keyword expansion. If dont_cache_data is set we will not keep intermediate revisions in memory unless they already existed. This will cut down memory use at the expense of slow access to older revisions.
See also: expand_keywords_for_revision()
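
Example

A sketch of typical high-level use, under the assumption that a small hand-written RCS file like the one below is accepted by the parser. The file name "demo,v", the author and the contents are invented for illustration. The contents are passed as a string, so nothing is read from disk.

```pike
// A tiny single-revision RCS file (illustrative data only).
constant demo_rcs = #"head 1.1;
access;
symbols;
locks; strict;
comment @# @;


1.1
date 2020.01.01.00.00.00; author alice; state Exp;
branches;
next ;


desc
@@


1.1
log
@Initial revision
@
text
@hello
@
";

int main()
{
  Parser.RCS rcs = Parser.RCS("demo,v", demo_rcs);
  write("head: %s\n", rcs->head);

  // trunk lists the revisions on the trunk in file order.
  foreach (rcs->trunk, Parser.RCS.Revision rev)
    write("%s by %s: %s", rev->revision, rev->author, rev->log);

  // Contents without keyword expansion.
  write("%O\n", rcs->get_contents_for_revision(rcs->head));
  return 0;
}
```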

Method parse: this_program parse(array raw, void|function(string:void) progress_callback, void|int max_revisions)
Description: Parse the RCS file raw and fully initialize all members of this object.
Parameter raw: The unprocessed RCS file.
Parameter progress_callback: Passed on to parse_deltatext_sections.
Parameter max_revisions: Maximum number of revisions to process. If unset, all revisions will be processed.
Returns: The fully initialized object (only returned for API convenience; the object itself is destructively modified to match the data extracted from raw)
See also: parse_admin_section, parse_delta_sections, parse_deltatext_sections, create

Method parse_admin_section: array parse_admin_section(string|array raw)
Description: Lower-level API function for parsing only the admin section (the initial chunk of an RCS file, see manpage rcsfile(5)) of an RCS file. After running parse_admin_section, the RCS object will be initialized with the values for head, branch, access, branches, tags, locks, strict_locks, comment and expand.
Parameter raw: The tokenized RCS file, or the raw RCS-file data.
Returns: The rest of the RCS file, admin section removed.
See also: parse_delta_sections, parse_deltatext_sections, parse, create
FIXME: Does not handle rcsfile(5) newphrase skipping.

Method parse_delta_sections: array parse_delta_sections(array raw, void|int max_revisions)
Description: Lower-level API function for parsing only the delta sections (the second chunk of an RCS file, see manpage rcsfile(5)) of an RCS file. After running parse_delta_sections, the RCS object will be initialized with the value of description and populated revisions mapping and trunk array. Their Revision members are however only populated with the members Revision->revision, Revision->branch, Revision->time, Revision->author, Revision->state, Revision->branches, Revision->rcs_next, Revision->ancestor and Revision->next.
Parameter raw: The tokenized RCS file, with admin section removed. (See parse_admin_section.)
Parameter max_revisions: Maximum number of revisions to process. If unset, all revisions will be processed.
Returns: The rest of the RCS file, delta sections removed.
See also: parse_admin_section, tokenize, parse_deltatext_sections, parse, create
FIXME: Does not handle rcsfile(5) newphrase skipping.

Method parse_deltatext_sections: void parse_deltatext_sections(array raw, void|function(string:void) progress_callback, array|void callback_args)
Description: Lower-level API function for parsing only the deltatext sections (the final and typically largest chunk of an RCS file, see manpage rcsfile(5)) of an RCS file. After a parse_deltatext_sections run, the RCS object will be fully populated.
Parameter raw: The tokenized RCS file, with admin and delta sections removed. (See parse_admin_section, tokenize and parse_delta_sections.)
Parameter progress_callback: This optional callback is invoked with the revision of the deltatext about to be parsed (useful for progress indicators).
Parameter callback_args: Optional extra trailing arguments to be sent to progress_callback
See also: parse_admin_section, parse_delta_sections, parse, create
FIXME: Does not handle rcsfile(5) newphrase skipping.

Method tokenize: array(array(string)) tokenize(string data)
Description: Tokenize an RCS file into tokens suitable as argument to the various parse functions
Parameter data: The RCS file data
Returns: An array with arrays of tokens

Class Parser.RCS.DeltatextIterator

Description

Iterator for the deltatext sections of the RCS file. Typical usage:

Example

string raw = Stdio.read_file(my_rcs_filename);
Parser.RCS rcs = Parser.RCS(my_rcs_filename, 0);
// Both lower-level parse functions return the remaining tokens as an array.
array rest = rcs->parse_delta_sections(rcs->parse_admin_section(raw));
foreach(rcs->DeltatextIterator(rest); int n; Parser.RCS.Revision rev)
  do_something(rev);

Method _iterator_index: protected int _iterator_index()
Returns: the number of deltatext entries processed so far (0..N-1, N being the total number of revisions in the rcs file)

Method _iterator_next

protected int _iterator_next()

Description

Advance the iterator one step.

Returns UNDEFINED when the iterator is finished, and otherwise the same as _iterator_index().

Method _iterator_value: protected Revision _iterator_value()
Returns: the Revision at whose deltatext data we are, updated with its info

Method create: Parser.RCS.DeltatextIterator Parser.RCS.DeltatextIterator(array deltatext_section, void|function(string, mixed ... :void) progress_callback, void|array(mixed) progress_callback_args)
Parameter deltatext_section: the deltatext section of the RCS file in its entirety
Parameter progress_callback: This optional callback is invoked with the revision of the deltatext about to be parsed (useful for progress indicators).
Parameter progress_callback_args: Optional extra trailing arguments to be sent to progress_callback
See also: the rcsfile(5) manpage outlines the sections of an RCS file

Variable n: int Parser.RCS.DeltatextIterator.n

Method read_next: protected bool read_next()
Description: Drops the leading whitespace before next revision's deltatext entry and sets this_rev to the revision number we're about to read.

Method parse_deltatext_section: protected int parse_deltatext_section(array raw, int o)
Description: Chops off the first deltatext section from the token array raw and returns the rest of the string, or the value 0 (zero) if we had already visited the final deltatext entry. The deltatext's data is stored destructively in the appropriate entry of the revisions array.
Note: raw+o must start with a deltatext entry for this method to work
FIXME: does not handle rcsfile(5) newphrase skipping
FIXME: if the rcs file is truncated, this method writes a descriptive error to stderr and then returns 0 - some nicer error handling wouldn't hurt

Class Parser.RCS.Revision

Description: All data tied to a particular revision of the file.

Variable added: int Parser.RCS.Revision.added
Description: The number of lines that were added from the previous revision to make this revision (for the initial revision too).
See also: lines, removed

Variable ancestor: string|zero Parser.RCS.Revision.ancestor
Description: The revision of the ancestor of this revision, or 0 if this was the initial revision.
See also: next

Variable author: string Parser.RCS.Revision.author
Description: The userid of the user that committed the revision.

Variable branch: string Parser.RCS.Revision.branch
Description: The branch name on which this revision was committed (calculated according to how cvs manages branches).

Variable branches

array(string) Parser.RCS.Revision.branches

Description

When there are branches from this revision, an array with the first revision number for each of the branches, otherwise 0.

Follow the next fields to get to the branch head.

Variable lines: int Parser.RCS.Revision.lines
Description: The number of lines this revision contained, altogether (not of particular interest for binary files).
See also: added, removed

Variable log: string Parser.RCS.Revision.log
Description: The log message associated with the revision.

Variable next: string|zero Parser.RCS.Revision.next
Description: The revision that succeeds this revision, or 0 if none exists (ie if this is the HEAD of the trunk or of a branch).
See also: ancestor

Variable rcs_next: string|zero Parser.RCS.Revision.rcs_next
Description: The revision stored next in the RCS file, or 0 if none exists.
Note: This field is straight from the RCS file, and has somewhat weird semantics. Usually you will want to use one of the derived fields next or prev or possibly rcs_prev.
See also: next, prev, rcs_prev

Variable rcs_prev

string|zero Parser.RCS.Revision.rcs_prev

Description

The revision that this revision is based on, or 0 if it is the HEAD.

This is the reverse pointer of rcs_next and branches, and is used by get_contents_for_revision() when applying the deltas to set text.

See also

rcs_next

Variable rcs_text: string Parser.RCS.Revision.rcs_text
Description: The raw delta as stored in the RCS file.
See also: text, get_contents_for_revision()

Variable removed: int Parser.RCS.Revision.removed
Description: The number of lines that were removed from the previous revision to make this revision.
See also: lines, added

Variable revision: string Parser.RCS.Revision.revision
Description: The revision number (i.e. rcs_file->revisions["1.1"]->revision == "1.1").

Variable state: string Parser.RCS.Revision.state
Description: The state of the revision - typically "Exp" or "dead".

Variable text

string|zero Parser.RCS.Revision.text

Description

The text as committed or 0 if get_contents_for_revision() hasn't been called for this revision yet.

Typically you don't access this field directly, but use get_contents_for_revision() to retrieve it.

See also

get_contents_for_revision(), rcs_text

Variable time: Calendar.TimeRange Parser.RCS.Revision.time
Description: The (UTC) date and time when the revision was committed (second precision).

Class Parser.SGML

Description

This is a handy, simple parser for SGML-like syntax such as HTML. It doesn't do anything advanced beyond finding the corresponding end tags.

It's used like this:

array res=Parser.SGML()->feed(string)->finish()->result();

The resulting structure is an array of atoms, where the atom can be a string or a tag. A tag contains a similar array, as data.

Example

A string
     "<gat> <gurka> </gurka> <banan> <kiwi> </gat>"
     results in
({
   tag "gat" object with data:
   ({
       tag "gurka" object with data:
       ({
           " "
       })
       tag "banan" object with data:
       ({
           " "
           tag "kiwi" object with data:
           ({
              " "
           })
       })
   })
})
I.e., simple "tags" (not containers) are not detected, but containers are ended implicitly by a surrounding container _with_ an end tag.

The 'tag' is an object with the following variables:

	 string name;              - name of tag
	 mapping args;             - arguments to tag
	 int line,char,column;     - position of tag
	 int eline,echar,ecolumn;  - end position of tag; src[char..echar-1] is the block (added by Xuesong Guo)
	 string file;              - filename (see create())
	 array(SGMLatom) data;     - contained data
	 int open;                 - set if the tag is not an empty element and has no end tag (added by Xuesong Guo)

Variable file: string Parser.SGML.file

Method create: Parser.SGML Parser.SGML()
Parser.SGML Parser.SGML(string filename, function(:void)|void name_formater, function(:void)|void argname_formater)
Description: This object is created with this filename. It's passed to all created tags, for debug and trace purposes. All tag names will be replaced with name_formater(name). All argument names will be replaced with argname_formater(arg_name).
Note: No, it doesn't read the file itself. See feed().

Method feed
Method finish
Method result

object feed(string s)
array(SGMLatom|string) finish()
array(SGMLatom|string) result(string s)

Description

Feed new data to the object, or finish the stream. No result can be used until finish() is called.

Both finish() and result() return the computed data.

feed() returns the called object.

Class Parser.SGML.SGMLatom

Variable name
Variable args
Variable line
Variable char
Variable column
Variable eline
Variable echar
Variable ecolumn
Variable file
Variable data
Variable open

string Parser.SGML.SGMLatom.name
mapping Parser.SGML.SGMLatom.args
int Parser.SGML.SGMLatom.line
int Parser.SGML.SGMLatom.char
int Parser.SGML.SGMLatom.column
int Parser.SGML.SGMLatom.eline
int Parser.SGML.SGMLatom.echar
int Parser.SGML.SGMLatom.ecolumn
string Parser.SGML.SGMLatom.file
array(SGMLatom) Parser.SGML.SGMLatom.data
int Parser.SGML.SGMLatom.open

Class Parser.Tabular

Description: This is a parser for line and block oriented data. It provides a flexible yet concise record-description language to parse character/column/delimiter-organised records.
See also: Parser.LR, http://www.wikipedia.org/wiki/Comma-separated_values, http://www.wikipedia.org/wiki/EDIFACT

Method compile

array|mapping compile(string|Stdio.File|Stdio.FILE input)

Description

Compiles the format description language into a compiled structure that can be fed to setformat, fetch, or create.

The format description is case sensitive.
The format description starts with a single line containing: [Tabular description begin]
The format description ends with a single line containing: [Tabular description end]
Any lines before the startline are skipped.
Any lines after the endline are not consumed.
Empty lines are skipped.
Comments start after a # or ;.
The depth level of a field is indicated by the number of leading spaces or colons at the beginning of the line.
The fieldname must not contain any whitespace.
An arbitrary number of single character field delimiters can be specified between brackets, e.g. [,;] or [,] would be for CSV.
When field delimiters are used: for CSV-type delimiters [\t,; ] the standard CSV quoting rules apply; for other delimiters no quoting is supported, and the last field on a line should not specify a delimiter but a field width of 0 instead.
A fixed field width can be specified by a plain decimal integer, a value of 0 indicates a field with arbitrary length that extends till the end of the line.
A matching regular expression can be enclosed in "", it has to match the complete field content and uses Regexp.SimpleRegexp syntax.
On records the following options are supported:

mandatory

This record is required.

fold

Fold this record's contents in the enclosing record.

single

This record is present at most once.

On fields the following options are supported:

drop

After reading and matching this field, drop the field content from the resulting mapping structure.

See also

setformat(), create(), fetch()

Example

Example of the description language:
[Tabular description begin]
csv
:gtz
::mybankno           [,]
::transferdate       [,]
::mutatiesoort       [,]
::volgnummer         [,]
::bankno             [,]
::name               [,]
::kostenplaats       [,]                     drop
::amount             [,]
::afbij              [,]
::mutatie            [,]
::reference          [,]
::valutacode         [,]
mt940
:messageheader1                     mandatory
::exporttime            "0000"               drop
::CS1                   " "                  drop
::exportday             "01"                 drop
::exportaddress      12
::exportnumber       5  "[0-9]+"
:messageheader3                     mandatory fold single
::messagetype           "940"                drop
::CS1                   " "                  drop
::messagepriority       "00"                 drop
:TRN                             fold
::tag                   ":20:"               drop
::reference             "GTZPB|MPBZ|INGEB"
:accountid                     fold
::tag                   ":25:"               drop
::accountno          10
:statementno                     fold
::tag                   ":28C:"              drop
::settlementno       0                       drop
:openingbalance                     mandatory      single
::tag                   ":60F:"              drop
::creditdebit        1
::date               6
::currency              "EUR"
::amount             0  "[0-9]+,[0-9][0-9]"
:statements
::statementline                     mandatory fold single
:::tag                  ":61:"               drop
:::valuedate         6
:::creditdebit       1
:::amount               "[0-9]+,[0-9][0-9]"
:::CS1                  "N"                  drop
:::transactiontype   3                          # 3 for Postbank, 4 for ING
:::paymentreference  0
::informationtoaccountowner                   fold single
:::tag                  ":86:"               drop
:::accountno            "[0-9]*( |)"
:::accountname       0
::description         fold
:::description       0  "|[^:].*"
:closingbalance                     mandatory      single
::tag                   ":62[FM]:"           drop
::creditdebit        1
::date               6
::currency              "EUR"
::amount             0  "[0-9]+,[0-9][0-9]"
:informationtoaccountowner                    fold single
::tag                   ":86:"               drop
::debit                 "D"                  drop
::debitentries       6
::credit                "C"                  drop
::creditentries      6
::debit                 "D"                  drop
::debitamount           "[0-9]+,[0-9][0-9]"
::credit                "C"                  drop
::creditamount          "[0-9]+,[0-9][0-9]"  drop
::accountname           "(\n[^-:][^\n]*)*"   drop
:messagetrailer                     mandatory      single
::start                 "-"
::end                   "XXX"
[Tabular description end]

Method create

Description

This function initialises the parser.

Parameter input

The input stream or string.

Parameter format

The format to be used (either precompiled or not). The format description language is documented under compile().

Parameter verbose

If >1, it specifies the number of characters to display of the beginning of each record as a progress indicator. Special values are:

`-4`	Turns on format debugging with visible mismatches.
`-3`	Turns on format debugging with named field contents.
`-2`	Turns on format debugging with field contents.
`-1`	Turns on basic format debugging.
`0`	Turns off verbosity. Default.
`1`	Is the same as setting it to `70`.

See also

compile(), setformat(), fetch()

Method feed: object feed(string content)
Parameter content: Is injected into the input stream.
Returns: This object.
See also: fetch()

Method fetch

mapping|zero fetch(void|array|mapping format)

Description

This function consumes as much input as needed to parse the full tabular structures at once.

Parameter format

Describes (precompiled only) formats to be parsed. If no format is specified, the format specified on create() is used, and empty lines are automatically skipped.

Returns

A nested mapping that contains the complete structure as described in the specified format.

If nothing matches the specified format, no input is consumed (except empty lines, if the default format is used), and zero is returned.

See also

compile(), create(), setformat(), skipemptylines()

Method setformat: array|mapping setformat(array|mapping format)
Parameter format: Replaces the default (precompiled only) format.
Returns: The previous default format.
See also: compile(), fetch()

Method skipemptylines: int skipemptylines()
Description: This function can be used to manually skip empty lines in the input. This is unnecessary if no argument is specified for fetch().
Returns: It returns true if EOF has been reached.
See also: fetch()

Module Parser.C

Method group: array(Token|array) group(array(string|Token) tokens, void|mapping(string:string) groupings)
Description: Fold sub blocks of an array of tokens into sub arrays, for grouping purposes.
Parameter tokens: The token array to fold.
Parameter groupings: Supplies the tokens marking the boundaries of blocks to fold. The indices of the mapping mark the start of a block, the corresponding values mark where the block ends. The sub arrays will start and end in these tokens. If no groupings mapping is provided, {}, () and [] are used as block boundaries.

Method hide_whitespaces: array hide_whitespaces(array tokens)
Description: Folds all whitespace tokens into the previous token's trailing_whitespaces.

Method reconstitute_with_line_numbers: string reconstitute_with_line_numbers(array(string|Token|array) tokens)
Description: Like simple_reconstitute, but adding additional #line n "file" preprocessor statements in the output wherever a new line or file starts.

Method simple_reconstitute: string simple_reconstitute(array(string|Token|array) tokens)
Description: Reconstitutes the token array into a plain string again; essentially reversing split() and whichever of the tokenize, group and hide_whitespaces methods may have been invoked.

Method split: array(string) split(string data, void|mapping(string:string) state)
Description: Splits the data string into an array of tokens. An additional element with a newline will be added to the resulting array of tokens. If the optional argument state is provided the split function is able to pause and resume splitting inside #"" and /**/ tokens. The state argument should be an initially empty mapping, in which split will store its state between successive calls.

Method strip_line_statements: array(Token|array) strip_line_statements(array(Token|array) tokens)
Description: Strips off all (preprocessor) line statements from a token array.

Method tokenize: array(Token) tokenize(array(string) s, void|string file)
Description: Returns an array of Token objects given an array of string tokens.
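
Example

A sketch of how the functions above can be chained: split, tokenize, hide_whitespaces, group, and back to a string with simple_reconstitute. The helper name round_trip and the file name "demo.c" are illustrative assumptions.

```pike
// Hypothetical helper: tokenizes C source and reconstitutes it again.
string round_trip(string src)
{
  array(string) parts = Parser.C.split(src);
  array(Parser.C.Token) tokens = Parser.C.tokenize(parts, "demo.c");

  // Fold whitespace into the preceding tokens, then fold the
  // {}, () and [] blocks into sub arrays.
  array grouped = Parser.C.group(Parser.C.hide_whitespaces(tokens));

  return Parser.C.simple_reconstitute(grouped);
}

int main()
{
  string src = "int main() { return 0; }";
  write("%O\n", round_trip(src));
  return 0;
}
```

Since split() appends an extra newline element, the reconstituted text is expected to start with the original source rather than be identical to it.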

Class Parser.C.Token

Description: Represents a C token, along with a selection of associated data and operations.

Variable file: string Parser.C.Token.file
Description: The file in which the token was found.

Variable line: int Parser.C.Token.line
Description: The line where the token was found.

Variable text: string Parser.C.Token.text
Description: The actual token.

Variable trailing_whitespaces: string Parser.C.Token.trailing_whitespaces
Description: Trailing whitespaces.

Method _sprintf: string sprintf(string format, ... Parser.C.Token arg ... )
Description: If the object is printed as %s it will only output its text contents.

Method `+: string res = Parser.C.Token() + s
Description: A string can be added to the Token, which will be added to the text contents.

Method `==: int res = Parser.C.Token() == foo
Description: Tokens are considered equal if the text contents are equal. It is also possible to compare the Token object with a text string directly.

Method `[]: int|string res = Parser.C.Token()[ a ]
Description: Characters and ranges may be indexed from the text contents of the token.

Method ``+: string res = s + Parser.C.Token()
Description: A string can be added to the Token, which will be added to the text contents.

Method cast: (int)Parser.C.Token() (float)Parser.C.Token() (string)Parser.C.Token() (array)Parser.C.Token() (mapping)Parser.C.Token() (multiset)Parser.C.Token()
Description: It is possible to cast a Token object to a string. The text content will be returned.

Method create: Parser.C.Token Parser.C.Token(string text, void|int line, void|string file, void|string trailing_whitespace)
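
Example

A short sketch exercising the operators documented above on a hand-made token. The text, line number and file name are arbitrary sample values.

```pike
int main()
{
  Parser.C.Token t = Parser.C.Token("foo", 3, "demo.c", " ");

  write("%s\n", (string)t);      // cast: the text contents
  write("%d\n", t == "foo");     // comparison against a plain string
  write("%c\n", t[0]);           // indexing into the text contents
  write("%s\n", t + "bar");      // concatenation yields a string
  write("%s at %s:%d\n", t->text, t->file, t->line);
  return 0;
}
```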

Class Parser.C.UnterminatedCharacterError

Description: Error thrown when an unterminated character token is encountered.

Inherit Generic: inherit Error.Generic : Generic

Variable err_char: string Parser.C.UnterminatedCharacterError.err_char
Description: The character that failed to be tokenized

Class Parser.C.UnterminatedCommentError

Description: Error thrown when an unterminated comment token is encountered.

Inherit Generic: inherit Error.Generic : Generic

Variable err_comment: string Parser.C.UnterminatedCommentError.err_comment
Description: The comment that failed to be tokenized

Class Parser.C.UnterminatedStringError

Description: Error thrown when an unterminated string token is encountered.

Inherit Generic: inherit Error.Generic : Generic

Variable err_str: string Parser.C.UnterminatedStringError.err_str
Description: The string that failed to be tokenized

Module Parser.ECMAScript

Description: ECMAScript/JavaScript token parser based on ECMAScript 2017 (ECMA-262), chapter 11: Lexical Grammar.

Method split: array(string) split(string data)
Description: Splits the ECMAScript source data in tokens.
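
Example

A minimal sketch of tokenizing a one-line script. The source string is a made-up sample, and the exact token boundaries (for instance how whitespace is reported) are not asserted here, so the tokens are simply printed.

```pike
int main()
{
  array(string) tokens = Parser.ECMAScript.split("var x = 1 + 2;");

  // Print each token on its own line, quoted, so whitespace is visible.
  foreach (tokens, string tok)
    write("%O\n", tok);
  return 0;
}
```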

Module Parser.LR

Description: LALR(1) parser generator.

Enum Parser.LR.SeverityLevel

Description: Severity level

Constant NOTICE
Constant WARNING
Constant ERROR

constant Parser.LR.NOTICE
constant Parser.LR.WARNING
constant Parser.LR.ERROR

Class Parser.LR.ErrorHandler

Description: Class handling reporting of errors and warnings.

Variable verbose

optional int(-1..1) Parser.LR.ErrorHandler.verbose

Description

Verbosity level

`-1`	Just errors.
`0`	Errors and warnings.
`1`	Also notices.

Method create: Parser.LR.ErrorHandler Parser.LR.ErrorHandler(int(-1..1)|void verbosity)
Description: Create a new error handler.
Parameter verbosity: Level of verbosity.
See also: verbose

Class Parser.LR.Parser

Description

This object implements an LALR(1) parser and compiler.

Normal use of this object would be:

 set_error_handler
 {add_rule, set_priority, set_associativity}*
 set_symbol_to_string
 compile
 {parse}*

Variable error_handler: function(SeverityLevel, string, string, mixed ... :void) Parser.LR.Parser.error_handler
Description: Compile error and warning handler.

Variable grammar: mapping(int:array(Rule)) Parser.LR.Parser.grammar
Description: The grammar itself.

Variable known_states: mapping(string:Kernel) Parser.LR.Parser.known_states
Description: LR0 states that are already known to the compiler.

Variable lr_error: int Parser.LR.Parser.lr_error
Description: Error code

Variable s_q: StateQueue|zero Parser.LR.Parser.s_q
Description: Contains all states used. In the queue section are the states that remain to be compiled.

Variable start_state: Kernel|zero Parser.LR.Parser.start_state
Description: The initial LR0 state.

Method _sprintf: string sprintf(string format, ... Parser.LR.Parser arg ... )
Description: Pretty-prints the current grammar to a string.

Method add_rule: void add_rule(Rule r)
Description: Add a rule to the grammar.
Parameter r: Rule to add.

Method cast: (int)Parser.LR.Parser() (float)Parser.LR.Parser() (string)Parser.LR.Parser() (array)Parser.LR.Parser() (mapping)Parser.LR.Parser() (multiset)Parser.LR.Parser()
Description: Implements casting.
Parameter type: Type to cast to.

Method compile: int compile()
Description: Compiles the grammar into a parser, so that parse() can be called.

Method item_to_string: string item_to_string(Item i)
Description: Pretty-prints an item to a string.
Parameter i: Item to pretty-print.

Method parse: mixed parse(object|function(void:string|array(string|mixed)) scanner, void|object action_object)
Description: Parse the input according to the compiled grammar. The last value reduced is returned.
Note: The parser must have been compiled (with compile()) prior to calling this function.
Bugs: Errors should be throw()n.
Parameter scanner: The scanner function. It returns the next symbol from the input. It should either return a string (terminal) or an array with a string (terminal) and a mixed (value). EOF is indicated with the empty string.
Parameter action_object: Object used to resolve those actions that have been specified as strings.
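
The scanner convention described above can be sketched as follows. The helper name make_scanner and the token list are hypothetical and exist only for this illustration; the only part taken from the description is the return convention (a terminal string, or an array pairing a terminal with a value, and the empty string at EOF).

```pike
// Hypothetical helper (not part of Parser.LR): wraps a pre-tokenized
// array in a scanner function that follows the convention above.
function(void:string|array) make_scanner(array(string|array) tokens)
{
  int pos = 0;
  return lambda() {
    // The empty string signals EOF to the parser.
    if (pos >= sizeof(tokens)) return "";
    return tokens[pos++];
  };
}

int main()
{
  // "+" is a bare terminal; ({ "number", 1 }) pairs a terminal with a value.
  function scanner =
    make_scanner(({ ({ "number", 1 }), "+", ({ "number", 2 }) }));

  mixed tok;
  while ((tok = scanner()) != "")
    write("%O\n", tok);
  return 0;
}
```

A function built this way could be passed as the first argument to parse() once the grammar has been compiled.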

Method rule_to_string: string rule_to_string(Rule r)
Description: Pretty-prints a rule to a string.
Parameter r: Rule to print.

Method set_associativity: void set_associativity(string terminal, int assoc)
Description: Sets the associativity of a terminal.
Parameter terminal: Terminal to set the associativity for.
Parameter assoc: Associativity; negative - left, positive - right, zero - no associativity.

Method set_error_handler: void set_error_handler(void|function(SeverityLevel, string, string, mixed ... :void) handler)
Description: Sets the error report function.
Parameter handler: Function to call to report errors and warnings. If zero or not specified, use the built-in function.

Method set_priority: void set_priority(string terminal, int pri_val)
Description: Sets the priority of a terminal.
Parameter terminal: Terminal to set the priority for.
Parameter pri_val: Priority; higher = prefer this terminal.
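
As a rough illustration, set_priority() and set_associativity() might be combined like this for two arithmetic operators. The sketch assumes that a Parser.LR.Parser can be instantiated without arguments; the terminals and priority values are made up for the example.

```pike
int main()
{
  // Assumption: the parser can be created without arguments.
  Parser.LR.Parser p = Parser.LR.Parser();

  // A higher value means the terminal is preferred.
  p->set_priority("+", 1);
  p->set_priority("*", 2);

  // Negative values mean left-associative.
  p->set_associativity("+", -1);
  p->set_associativity("*", -1);
  return 0;
}
```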

Method set_symbol_to_string: void set_symbol_to_string(void|function(int|string:string) s_to_s)
Description: Sets the symbol to string conversion function. The conversion function is used by the various *_to_string functions to make comprehensible output.
Parameter s_to_s: Symbol to string conversion function. If zero or not specified, use the built-in function.

Method state_to_string: string state_to_string(Kernel state)
Description: Pretty-prints a state to a string.
Parameter state: State to pretty-print.

Class Parser.LR.Parser.Item

Description: An LR(0) item, a partially parsed rule.

Variable counter: int Parser.LR.Parser.Item.counter
Description: Depth counter (used when compiling).

Variable direct_lookahead: multiset(string) Parser.LR.Parser.Item.direct_lookahead
Description: Look-ahead set for this item.

Variable error_lookahead: multiset(string) Parser.LR.Parser.Item.error_lookahead
Description: Look-ahead set used for detecting conflicts

Variable item_id: int Parser.LR.Parser.Item.item_id
Description: Used to identify the item. Equal to r->number + offset.

Variable master_item: Item|zero Parser.LR.Parser.Item.master_item
Description: Item representing this one (used for shifts).

Variable next_state: Kernel|zero Parser.LR.Parser.Item.next_state
Description: The state we will get if we shift according to this rule

Variable number: int Parser.LR.Parser.Item.number
Description: Item identification number (used when compiling).

Variable offset: int Parser.LR.Parser.Item.offset
Description: How far into the rule the parsing has progressed.

Variable r: Rule|zero Parser.LR.Parser.Item.r
Description: The rule

Variable relation: multiset(Item) Parser.LR.Parser.Item.relation
Description: Relation to other items (used when compiling).

Class Parser.LR.Parser.Kernel

Description: Implements an LR(1) state

Variable action

mapping(int|string:Kernel|Rule) Parser.LR.Parser.Kernel.action

Description

The action table for this state

`object(kernel)`	SHIFT to this state on this symbol.
`object(rule)`	REDUCE according to this rule on this symbol.

Variable closure_set: multiset Parser.LR.Parser.Kernel.closure_set
Description: The symbols that closure has been called on.

Variable item_id_to_item: mapping(int:Item) Parser.LR.Parser.Kernel.item_id_to_item
Description: Used to lookup items given rule and offset

Variable items: array(Item) Parser.LR.Parser.Kernel.items
Description: Contains the items in this state.

Variable rules: multiset(Rule) Parser.LR.Parser.Kernel.rules
Description: Used to check if a rule already has been added when doing closures.

Variable symbol_items: mapping(int:multiset(Item)) Parser.LR.Parser.Kernel.symbol_items
Description: Contains the items whose next symbol is this non-terminal.

Method add_item: void add_item(Item i)
Description: Add an item to the state.

Method closure: void closure(int nonterminal)
Description: Make the closure of this state.
Parameter nonterminal: Nonterminal to make the closure on.

Method do_goto: Kernel do_goto(int|string symbol)
Description: Generates the state reached when doing goto on the specified symbol, i.e. it compiles the LR(0) state.
Parameter symbol: Symbol to make goto on.

Method goto_set: multiset(int|string) goto_set()
Description: Make the goto-set of this state.

Class Parser.LR.Parser.StateQueue

Description: This is a queue, which keeps the elements even after they are retrieved.

Variable arr: array(Kernel) Parser.LR.Parser.StateQueue.arr
Description: The queue itself.

Variable head: int(0..) Parser.LR.Parser.StateQueue.head
Description: Index of the head of the queue.

Variable tail: int(0..) Parser.LR.Parser.StateQueue.tail
Description: Index of the tail of the queue.

Method next: Kernel|zero next()
Description: Return the next state from the queue.

Method push: Kernel push(Kernel state)
Description: Pushes the state on the queue.
Parameter state: State to push.
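
The idea of a queue that keeps its elements after retrieval can be sketched with a small stand-alone class. KeepQueue is a hypothetical stand-in that stores strings instead of Kernel objects so that it runs on its own; it is not the library implementation, only an illustration of how head, tail and arr relate.

```pike
// Hypothetical stand-in for StateQueue, storing strings instead of
// Kernel objects.
class KeepQueue
{
  array(string) arr = allocate(4);
  int head;   // Index of the next element to hand out.
  int tail;   // Index of the first free slot.

  string|zero next()
  {
    if (head == tail) return 0;   // Nothing left to hand out.
    return arr[head++];           // Advance head; the element stays in arr.
  }

  string push(string s)
  {
    // Double the storage when it is full.
    if (tail == sizeof(arr)) arr += allocate(sizeof(arr));
    arr[tail++] = s;
    return s;
  }
}

int main()
{
  KeepQueue q = KeepQueue();
  q->push("s0");
  q->push("s1");
  q->next();   // Hands out "s0".

  // "s0" has been retrieved but is still stored at index 0.
  write("%O head=%d tail=%d\n", q->arr[0], q->head, q->tail);
  return 0;
}
```

Everything between index 0 and head has already been handed out, while the range from head up to tail is what remains to be processed.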

Class Parser.LR.Priority

Description: Specifies the priority and associativity of a rule.

Variable assoc

int Parser.LR.Priority.assoc

Description

Associativity

`-1`	Left
`0`	None
`1`	Right

Variable value: int Parser.LR.Priority.value
Description: Priority value

Method create: Parser.LR.Priority Parser.LR.Priority(int p, int a)
Description: Create a new priority object.
Parameter p: Priority.
Parameter a: Associativity.

Class Parser.LR.Rule

Description: This object is used to represent a BNF-rule in the LR parser.

Variable action: function(:void)|string|zero Parser.LR.Rule.action
Description: Action to perform when reducing this rule. If it is a function, that function is called. If it is a string, the function with that name in the object given to the parser is called. The function is called with arguments corresponding to the values of the elements of the rule, and its return value becomes the value of this non-terminal. The default action is to return the first argument.

Variable has_tokens: int Parser.LR.Rule.has_tokens
Description: This rule contains tokens

Variable nonterminal: int Parser.LR.Rule.nonterminal
Description: Non-terminal this rule reduces to.

Variable num_nonnullables: int Parser.LR.Rule.num_nonnullables
Description: This rule has this many non-nullable symbols at the moment.

Variable number: int Parser.LR.Rule.number
Description: Sequence number of this rule (used for conflict resolving). Also used to identify the rule.

Variable pri: Priority|zero Parser.LR.Rule.pri
Description: Priority and associativity of this rule.

Variable symbols: array(string|int) Parser.LR.Rule.symbols
Description: The actual rule

Method create

Parser.LR.Rule Parser.LR.Rule(int nt, array(string|int) r, function(:void)|string|void a)

Description

Create a BNF rule.

Example

The rule

   rule : nonterminal ":" symbols ";" { add_rule };

might be created as

   Rule(4, ({ 9, ":", 5, ";" }), "add_rule");

where 4 corresponds to the nonterminal "rule", 9 to "nonterminal" and 5 to "symbols", and the function "add_rule" is to be called when this rule is reduced.

Parameter nt

Non-terminal to reduce to.

Parameter r

Symbol sequence that reduces to nt.

Parameter a

Action to perform when reducing according to this rule. If it is a function, that function is called. If it is a string, the function with that name in the object given to the parser is called. The function is called with arguments corresponding to the values of the elements of the rule, and its return value becomes the value of this non-terminal. The default action is to return the first argument.
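
A minimal sketch of a rule with a named action is shown below. The class Actions, the method add and the symbol numbers (1 for "expr", 2 for "term") are hypothetical and chosen only for this example.

```pike
// Hypothetical action object; the method name matches the string
// given as the rule's action.
class Actions
{
  // Receives one value per symbol in the rule: expr, "+" and term.
  int add(int left, mixed plus, int right)
  {
    return left + right;   // Becomes the value of the reduced non-terminal.
  }
}

int main()
{
  // Symbol numbers are arbitrary here: 1 = expr, 2 = term.
  Parser.LR.Rule r = Parser.LR.Rule(1, ({ 1, "+", 2 }), "add");

  // Priority 1, left-associative (-1).
  r->pri = Parser.LR.Priority(1, -1);

  write("nonterminal=%d symbols=%O action=%O\n",
        r->nonterminal, r->symbols, r->action);
  return 0;
}
```

An instance of Actions would then be handed to the parser as the action object, so that the string "add" can be resolved when the rule is reduced.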

Module Parser.LR.GrammarParser

Description

This module generates an LR parser from a grammar specified according to the following grammar:

	   directives : directive ;
	   directives : directives directive ;
	   directive : declaration ;
	   directive : rule ;
	   declaration : "%token" terminals ";" ;
	   rule : nonterminal ":" symbols ";" ;
	   rule : nonterminal ":" symbols action ";" ;
	   symbols : symbol ;
	   symbols : symbols symbol ;
	   terminals : terminal ;
	   terminals : terminals terminal ;
	   symbol : nonterminal ;
	   symbol : "string" ;
	   action : "{" "identifier" "}" ;
	   nonterminal : "identifier" ;
	   terminal : "string" ;

Variable lr_error: int Parser.LR.GrammarParser.lr_error
Description: Error code from the parsing.

Method make_parser: Parser make_parser(string str, object|void m)
Description: Compiles the parser-specification given in the first argument. Named actions are taken from the object if available, otherwise left as is.
Bugs: Returns error-code in both GrammarParser.error and return_value->lr_error.

Method make_parser_from_file: int|Parser make_parser_from_file(string fname, object|void m)
Description: Compiles the file specified in the first argument into an LR parser.
See also: make_parser
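
As an illustration, a small specification written in the syntax above might be handed to make_parser() like this. The grammar (a comma-separated list of "x" terminals) is made up for the example, and the error check is an assumption based on the Bugs note above, which says the error code is available as lr_error in the returned parser.

```pike
int main()
{
  // A comma-separated list of "x" terminals, in the specification
  // syntax described above.
  string spec =
    "list : item ;\n"
    "list : list \",\" item ;\n"
    "item : \"x\" ;\n";

  Parser.LR.Parser p = Parser.LR.GrammarParser.make_parser(spec);

  // Assumption: a missing parser or a non-zero lr_error means failure.
  if (!p || p->lr_error) {
    werror("The grammar could not be compiled.\n");
    return 1;
  }
  write("Grammar accepted.\n");
  return 0;
}
```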

Module Parser.Markdown

Description

This is a port of the JavaScript Markdown parser 'Marked' (https://github.com/chjj/marked). The only method you normally need is parse(), which transforms Markdown text to HTML.

For a description of Markdown, see the web page of its inventor: https://daringfireball.net/projects/markdown/.

Method encode_html: protected string encode_html(string html, void|bool enc)
Description: HTML-encode the characters <, >, " and '. If enc is true, & is encoded as well.

Method parse

string parse(string md, void|mapping options)

Description

Convert the Markdown text md to HTML.

Parameter options

`"gfm" : bool`	Enable GitHub Flavored Markdown. (true)
`"tables" : bool`	Enable GFM tables. Requires "gfm" (true)
`"breaks" : bool`	Enable GFM "breaks". Requires "gfm" (false)
`"pedantic" : bool`	Conform to obscure parts of markdown.pl as much as possible. Don't fix any of the original markdown bugs or poor behavior. (false)
`"sanitize" : bool`	Sanitize the output. Ignore any HTML that has been input. (false)
`"mangle" : bool`	Mangle (obfuscate) autolinked email addresses (true)
`"smart_lists" : bool`	Use smarter list behavior than the original markdown. (true)
`"smartypants" : bool`	Use "smart" typographic punctuation for things like quotes and dashes. (false)
`"header_prefix" : string`	Add prefix to ID attributes of header tags (empty)
`"xhtml" : bool`	Generate self closing XHTML tags (false)
`"newline" : bool`	Add a newline after tags. If false the output will be on one line (well, newlines in text will be kept). (false)
`"renderer" : Renderer`	Use this renderer to render output. (Renderer)
`"lexer" : Lexer`	Use this lexer to parse blocks of text. (Lexer)
`"inline_lexer" : InlineLexer`	Use this lexer to parse inline text. (InlineLexer)
`"parser" : Parser`	Use this parser instead of the default. (Parser)
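
A short usage sketch follows. The input text is made up, and the second call assumes that options left out of the mapping keep the defaults listed above.

```pike
int main()
{
  string md = "# Title\n\nSome *emphasised* text.\n";

  // Default options.
  write("%s\n", Parser.Markdown.parse(md));

  // Override selected options. Keys left out are assumed to keep
  // their defaults.
  write("%s\n", Parser.Markdown.parse(md, ([ "xhtml": 1, "newline": 1 ])));
  return 0;
}
```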

Method replace1: protected string replace1(string subject, string from, string to)
Description: Replaces the first occurrence of from in subject with to.

Class Parser.Markdown.InlineLexer

Description: Lexer used for inline text (e.g. bold text inside a paragraph).

Method output: string output(string src)
Description: Parse some inline Markdown and return the corresponding HTML.

Class Parser.Markdown.Lexer

Description: Block-level lexer (parses paragraphs, lists, tables, etc).

Variable links: mapping Parser.Markdown.Lexer.links
Note: Read only

Variable tokens: array(mapping) Parser.Markdown.Lexer.tokens
Note: Read only

Method lex: this_program lex(string src)
Description: Main lexing entry point. Subclass Lexer and override this to add post-processing or other changes.

Class Parser.Markdown.Parser

Description: Top-level parsing handler. It's usually easier to replace the Renderer instead.

Method parse: string parse(Lexer src)

Method parse_text: protected string parse_text()

Method tok: protected string tok()
Description: Render a token (or group of tokens) to a string.

Class Parser.Markdown.Renderer

Method attrs: string attrs(mapping token, mapping|void dflt)
Description: Collect additional attributes from the token and render them as HTML attributes. Default attributes can be provided.

Method blockquote: string blockquote(string text, mapping token)

Method html: string html(string text, mapping token)

Method text: string text(string t, mapping token)

Method strong: string strong(string t, mapping token)

Method em: string em(string t, mapping token)

Method del: string del(string t, mapping token)

Method codespan: string codespan(string t, mapping token)

Method br: string br(mapping token)

Method code: string code(string code, string lang, bool escaped, mapping token)

Method heading: string heading(string text, int level, string raw, mapping token)

Method hr: string hr()

Method image: string image(string url, string title, string text, mapping token)

Method link: string link(string href, string|zero title, string text, mapping token)

Method list: string list(string body, void|bool ordered, mapping token)

Method listitem: string listitem(string text, mapping token)

Method paragraph: string paragraph(string text, mapping token)

Method table: string table(string header, string body, mapping token)

Method tablecell: string tablecell(string cell, mapping flags, mapping token)

Method tablerow: string tablerow(string row, mapping token)

Module Parser.Pike

Description: This module parses and tokenizes Pike source code.

Inherit "C.pmod": inherit "C.pmod" : "C.pmod"
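
A typical tokenizing round trip might look like the sketch below. It relies on split() and tokenize(), which are functions of Parser.Pike not listed in this section; the source string and the file name are made up for the example.

```pike
int main()
{
  string code = "int x = 17; // answer\n";

  // split() cuts the source into raw token strings, including
  // whitespace and comments.
  array(string) parts = Parser.Pike.split(code);

  // tokenize() wraps the strings in Token objects that also carry
  // line and file information.
  array tokens = Parser.Pike.tokenize(parts, "example.pike");

  write("%O\n", parts);
  write("%d tokens\n", sizeof(tokens));
  return 0;
}
```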

Module Parser.Python

Method split: array(string) split(string data)
Description: Returns the provided string with Python code as an array with tokens.
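
A minimal call might look like this; the Python snippet is made up for the example.

```pike
int main()
{
  // Split one line of Python source into its tokens.
  array(string) tokens = Parser.Python.split("x = y + 1\n");
  write("%O\n", tokens);
  return 0;
}
```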

Module Parser._parser

Description: Low-level helpers for parsers.
Note: You probably don't want to use the modules contained in this module directly; use the other Parser modules listed below instead.
See also: Parser, Parser.C, Parser.Pike, Parser.RCS, Parser.HTML, Parser.XML

Module Parser._parser._C

Description: Low-level helpers for Parser.C.
Note: You probably want to use Parser.C instead of this module.
See also: Parser.C, _Pike.

Method tokenize: array(array(string)|string) tokenize(string code)
Description: Tokenize a string of C tokens.
Note: Don't use this function directly. Use Parser.C.tokenize() instead.
Returns: Returns an array with an array with C-level tokens, and the remainder (a partial token), if any.

Module Parser._parser._Pike

Description: Low-level helpers for Parser.Pike.
Note: You probably want to use Parser.Pike instead of this module.
See also: Parser.Pike, _C.

Method tokenize: array(array(string)|string) tokenize(string code)
Description: Tokenize a string of Pike tokens.
Returns: Returns an array with Pike-level tokens and the remainder (a partial token), if any.

Module Parser._parser._RCS

Description: Low-level helpers for Parser.RCS.
Note: You probably want to use Parser.RCS instead of this module.
See also: Parser.RCS

Method tokenize: array(array(string)) tokenize(string code)
Description: Tokenize a string of RCS tokens.
Note: Don't use this function directly. Use Parser.RCS.tokenize() instead.
See also: Parser.RCS.tokenize()
