open

Usage

$doc := open [--format|:F html|xml|docbook] [--file|:f | --pipe|:p | --string|:s] [--switch-to|:w | --no-switch-to|:W] [--validate|:v | --no-validate|:V] [--recover|:r | --no-recover|:R] [--expand-entities|:e | --no-expand-entities|:E] [--xinclude|:x | --no-xinclude|:X] [--keep-blanks|:b | --no-keep-blanks|:B] [--pedantic|:n | --no-pedantic|:N] [--load-ext-dtd|:d | --no-load-ext-dtd|:D] [--complete-attributes|:a | --no-complete-attributes|:A] expression

Description

Parse an XML, HTML, or SGML DocBook document from a file or URL, from the output of a command, or from a string, and return a node-set consisting of the root of the resulting DOM tree.

The --format (:F) option specifies the file format. Possible values are xml (default), html, and docbook. Note, however, that support for parsing DocBook SGML files has been deprecated in recent libxml2 versions.

--file (:f) instructs the parser to treat the given expression as a file name or URL.

--pipe (:p) instructs the parser to treat the given expression as a system command and to parse its output.

--string (:s) instructs the parser to treat the given expression as a string of XML or HTML to parse.

--switch-to (:w) and --no-switch-to (:W) control whether the root of the new document becomes the current node. These options override the current global setting of switch-to-new-documents.

--validate (:v) and --no-validate (:V) turn DTD validation of the parsed document on or off. These options override the current global setting of validation.

--recover (:r) and --no-recover (:R) turn the parser's ability to recover from non-fatal errors on or off. These options override the current global setting of recovering.

--expand-entities (:e) and --no-expand-entities (:E) turn entity expansion on or off, overriding the current global setting of parser-expands-entities.

--xinclude (:x) and --no-xinclude (:X) turn XInclude processing on or off, overriding the current global setting of parser-expands-xinclude.

--keep-blanks (:b) and --no-keep-blanks (:B) control whether the parser preserves so-called ignorable whitespace. These options override the current global setting of keep-blanks.

--pedantic (:n) and --no-pedantic (:N) turn the pedantic parser flag on or off.

--load-ext-dtd (:d) and --no-load-ext-dtd (:D) control whether the external DTD subset is loaded with the document. These options override the current global setting of load-ext-dtd.

--complete-attributes (:a) and --no-complete-attributes (:A) turn on or off parse-time completion of attributes from the default values specified in the DTD. These options override the current global setting of parser-completes-attributes.

Example

# open an XML document
$x := open mydoc.xml

# open an HTML document from the Internet
$h := open --format html "http://www.google.com/?q=xsh"

# quote the file name if it contains whitespace
$y := open "document with a long name with spaces.xml"

# use --format html or --format docbook to load documents of these types
$z := open --format html index.htm

# use the --pipe flag to read the output of a command
$z := open --format html --pipe 'wget -O - xsh.sourceforge.net/index.html'
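
# use the --string flag to parse markup given directly in the expression
# (illustrative sketch; the inline document is made up for this example)
$s := open --string '<book><chapter><title>Intro</title></chapter></book>'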

# use a document variable to restrict an XPath search to a
# given document
ls $z//chapter/title
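
# combine parser flags to override the global settings for a single call
# (illustrative sketch; broken.xml is a hypothetical file name)
$b := open --recover --no-validate --no-keep-blanks broken.xml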