stream

Usage

stream [ --input-file|:f filename | --input-pipe|:p filename | --input-string|:s expression] [ --output-file|:F filename | --output-pipe|:P filename | --output-string|:S $variable ] select xpath command-or-block [ select xpath command-or-block ... ]

Description

EXPERIMENTAL! This command provides a memory-efficient (though slower) way to process selected parts of an XML document with XSH2. A streaming XML parser (SAX parser) is used to parse the input. The parser has two states, referred to below as A and B. The initial state of the parser is A.

In state A, only a limited vertical portion of the DOM tree is built. All XML data coming from the input stream other than start-tags is immediately copied to the output stream. When a new start-tag of an element arrives, a new node is created in the tree and all siblings of the newly created node are removed. Thus, in state A, there is exactly one node on every level of the tree. After a node is added to the tree, all the xpath expressions following the select keyword are checked. If none matches, the parser remains in state A and copies the start-tag to the output stream. Otherwise, the first expression that matches is remembered and the parser changes its state to B.

In state B the parser builds a complete DOM subtree of the element that was last added to the tree before the parser changed its state from A to B. No data are sent to the output at this stage. When the subtree is complete (i.e. the corresponding end-tag for its topmost element is encountered), the command-or-block of instructions following the xpath expression that matched is invoked with the root element of the subtree as the current context node. The commands in command-or-block are allowed to transform the whole element subtree or even to replace it with a different DOM subtree or subtrees. They must, however, leave intact all ancestor nodes of the subtree. Failing to do so can result in an error or unpredictable results.

After the subtree processing command-or-block returns, all subtrees that now appear in the DOM tree in the place of the original subtree are serialized to the output stream. After that, they are deleted and the parser returns to state A.
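
The A/B cycle described above can be modelled in a short Python sketch. This is only an illustration of the algorithm, not XSH2's implementation: the class name StreamModel, the rule format (a predicate over the ancestor chain paired with a callback), and the convention that the callback returns the list of replacement subtrees are all assumptions made for the example. Comments, processing instructions, and namespaces are ignored for brevity.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


class StreamModel(xml.sax.ContentHandler):
    """Toy model of states A and B (hypothetical; not XSH2 code)."""

    def __init__(self, rules, out):
        super().__init__()
        self.rules = rules    # list of (predicate(spine), action(element))
        self.out = out        # text stream receiving the output
        self.spine = []       # state A: one (name, attrs) entry per level
        self.builder = None   # state B: collects the complete subtree
        self.depth = 0        # open elements inside the subtree
        self.action = None    # action of the rule that matched

    def startElement(self, name, attrs):
        attrs = dict(attrs)
        if self.builder is not None:
            # State B: keep building the subtree, write nothing.
            self.builder.start(name, attrs)
            self.depth += 1
            return
        # State A: add one node to the spine, then check the rules in order.
        self.spine.append((name, attrs))
        for matches, action in self.rules:
            if matches(self.spine):
                # First matching rule is remembered; switch to state B.
                self.action = action
                self.builder = ET.TreeBuilder()
                self.builder.start(name, attrs)
                self.depth = 1
                return
        # No rule matched: copy the start-tag to the output.
        attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
        self.out.write(f"<{name}{attr_text}>")

    def characters(self, content):
        if self.builder is not None:
            self.builder.data(content)
        else:
            self.out.write(escape(content))

    def endElement(self, name):
        if self.builder is None:
            # State A: drop the node from the spine and copy the end-tag.
            self.spine.pop()
            self.out.write(f"</{name}>")
            return
        self.builder.end(name)
        self.depth -= 1
        if self.depth == 0:
            # Subtree complete: run the action, serialize whatever it
            # returns, discard the subtree, and go back to state A.
            subtree = self.builder.close()
            for node in self.action(subtree):
                self.out.write(ET.tostring(node, encoding="unicode"))
            self.builder = None
            self.action = None
            self.spine.pop()


def is_open_item(spine):
    # Only the name and attributes of the new element (and of its
    # ancestors) are available when the rule is checked.
    name, attrs = spine[-1]
    return name == "item" and attrs.get("done") == "no"


def mark_done(element):
    element.set("done", "yes")
    return [element]


out = io.StringIO()
source = b'<list><item done="no">a</item><item done="yes">b</item></list>'
xml.sax.parseString(source, StreamModel([(is_open_item, mark_done)], out))
result = out.getvalue()
```

In this model only the first item is ever held in memory as a subtree; the second item and the enclosing list element are copied through tag by tag, which mirrors the memory behaviour the description attributes to state A.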

Note that this type of processing severely limits the amount of information the selecting XPath expressions can use. Most notably, elements cannot be selected by their content. The only information present in the tree at the time of the XPath evaluation is the element's name and attributes plus the same information for all its ancestors. There is no information at all about its possible child nodes or about the node's position within the list of its siblings.
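
To make the limitation concrete, the following Python sketch records everything that is known each time a start-tag arrives. It is an illustration only; the class name SpineRecorder and the sample document are made up for the example. Each snapshot contains names and attributes along the ancestor chain and nothing else, so a selector such as //chapter[@id='c1'] could be decided from it, while //chapter[title='Intro'] (content) or //chapter[2] (sibling position) could not.

```python
import xml.sax


class SpineRecorder(xml.sax.ContentHandler):
    """Record the ancestor chain visible at every start-tag (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.spine = []       # (name, attrs) for the open elements
        self.snapshots = []   # one path-like string per start-tag

    def startElement(self, name, attrs):
        self.spine.append((name, dict(attrs)))
        # Render the chain as name[@attr='value'] steps joined by "/".
        steps = []
        for elem_name, elem_attrs in self.spine:
            preds = "".join(f"[@{k}='{v}']" for k, v in elem_attrs.items())
            steps.append(elem_name + preds)
        self.snapshots.append("/".join(steps))

    def endElement(self, name):
        self.spine.pop()


recorder = SpineRecorder()
document = (
    b"<book><chapter id='c1'><title>Intro</title></chapter>"
    b"<chapter id='c2'/></book>"
)
xml.sax.parseString(document, recorder)
snapshots = recorder.snapshots
```

When the first chapter start-tag is seen, the snapshot is book/chapter[@id='c1']: the title text has not been read yet, and nothing records that this chapter is the first of two.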

The input parameters below are mutually exclusive. If none is given, standard input is processed.

--input-file or :f instructs the processor to stream from a given file.

--input-pipe or :p instructs the processor to stream the output of a given command.

--input-string or :s instructs the processor to use the result of a given expression as the input to be processed.

The output parameters below are mutually exclusive. If none is given, standard output is used.

--output-file or :F instructs the processor to save the output to a given file.

--output-pipe or :P instructs the processor to pipe the output to a given command.

--output-string or :S followed by a variable name instructs the processor to store the result in the given variable.