Along with XPath, Perl is one of the two XSH2 expression languages, and it lends XSH2 much of its expressive power. Perl is a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It has built-in regular expressions and powerful yet easy-to-learn data structures (scalars, arrays, hash tables). It's also a good language for many system management tasks. XSH2 itself is written in Perl (except for the XML engine, which uses the libxml2 library written in C by Daniel Veillard).
Perl expressions or blocks of code can be used as arguments to any XSH2 command. One such command is perl, which simply evaluates the given Perl block. Other commands, such as map, even require a Perl expression argument and make it possible to change DOM node content quickly. Perl expressions may also provide lists of strings to iterate over with a foreach loop, or serve as conditions for if, unless, and while statements.
To prevent conflicts between XSH2 internals and the evaluated Perl code, XSH2 runs such code in the context of a special namespace, XML::XSH2::Map. As described in the section Variables, XSH2 string variables may be accessed and possibly assigned from Perl code in the most obvious way, since they actually are Perl variables defined in the XML::XSH2::Map namespace.
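The effect of sharing a namespace can be sketched in plain Perl, outside XSH2. The snippet below only illustrates the mechanism and is not XSH2's actual implementation: the variable name $greeting is hypothetical, and the string eval is a stand-in for whatever XSH2 does internally to run a Perl block.

```perl
use strict;
use warnings;

# Stand-in for an XSH2 string variable: a package variable in XML::XSH2::Map.
# (The name $greeting is hypothetical.)
$XML::XSH2::Map::greeting = 'hello';

# Stand-in for an evaluated Perl block: code compiled in that package sees
# the variable under its short name and may assign to it.
my $seen = eval q{
    package XML::XSH2::Map;
    no strict 'vars';            # allow the unqualified package variable
    my $before = $greeting;      # read access
    $greeting  = uc $greeting;   # the assignment is visible to the other side
    $before;
};
die $@ if $@;

print "block saw: $seen\n";
print "now holds: $XML::XSH2::Map::greeting\n";
```

Because both sides refer to the same package variable, no copying or explicit export step is needed in this sketch.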
The interaction between XSH2 and Perl also works the other way round, so you may call back into XSH2 from the evaluated Perl code. For this purpose, the Perl function xsh is defined in the XML::XSH2::Map namespace. All parameters passed to this function are interpreted as XSH2 commands.
Moreover, the following Perl helper functions are defined:
xsh(string,...) - evaluates the given string(s) as XSH2 commands.
call(name) - calls the given XSH2 subroutine.
count(string) - evaluates the given string as an XPath expression and returns either the literal value of the result (for boolean, string and float result types) or the number of nodes in the returned node-set.
literal(string|object) - if passed a string, evaluates it as an XSH2 expression and returns the literal value of the result; if passed an object, returns the literal value of the object. For example, literal('$doc/expression') returns the same value as count('string($doc/expression)').
serialize(string|object) - if passed a string, first evaluates the string as an XSH2 expression to obtain a node-list object, then serializes the object into XML. The resulting string is equal to the output of the XSH2 command ls applied to the same expression or object, only without indentation and folding.
type(string|object) - if passed a string, first evaluates the string as an XSH2 expression to obtain a node-list object. It returns a list of strings representing the types of the nodes in the node-list (ordered in the canonical document order). The returned type strings are: element, attribute, text, cdata, pi, entity_reference, document, chunk, comment, namespace, unknown.
nodelist(string|object,...) - converts its arguments to objects if necessary and returns a node-list consisting of the objects.
xpath(string, node?) - evaluates the given string as an XPath expression in the context of the given node and returns the result.
echo(string,...) - prints the given strings on the XSH2 output.
Note that in the interactive mode, XSH2 redirects all output to a specific terminal file handle stored in the variable $OUT. So if, for example, you mean to pipe the result to a shell command, you should avoid using the STDOUT filehandle directly. You may either use the usual print without a filehandle, use the echo function, or use $OUT as a filehandle.
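Why printing straight to STDOUT bypasses the redirection can be illustrated with a small stand-alone Perl sketch. It assumes that the redirection behaves like Perl's select, which is an assumption made for illustration and not a statement about XSH2 internals; the $OUT used here is a local in-memory handle standing in for the real one, and the XSH2-only echo function is not shown.

```perl
use strict;
use warnings;

# Stand-in for XSH2's $OUT: an in-memory filehandle we can inspect later.
my $captured = '';
open my $OUT, '>', \$captured or die "cannot open in-memory handle: $!";

# Make $OUT the default output handle (assumed to mirror what the
# interactive mode does), remembering the previous default.
my $previous = select $OUT;

print "plain print\n";             # follows the default handle
print {$OUT} "explicit \$OUT\n";   # always goes to $OUT
print STDOUT "direct STDOUT\n";    # bypasses the redirection

select $previous;                  # restore the old default handle
close $OUT;

print "redirected text:\n$captured";
```

Only the first two lines end up in the redirected stream; the line written to STDOUT never reaches it, which is why it would be lost to a pipe attached to $OUT.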
In the following examples we use Perl to populate Middle-Earth with Hobbits whose names are read from a text file called hobbits.txt, unless there are some Hobbits in Middle-Earth already.
Example 7. Use Perl to read text files
unless (//creature[@race='hobbit']) {
  perl {
    open my $fh, '<', "hobbits.txt" or die "Cannot open hobbits.txt: $!";
    chomp(@hobbits = <$fh>);   # one name per line, trailing newlines removed
    close $fh;
  };
  foreach { @hobbits } {
    copy xsh:new-element("creature","name",.,"race","hobbit")
      into m:/middle-earth/creatures;
  }
}
Example 8. The same code as a single Perl block
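perl {
  # count() takes the XPath expression as a Perl string
  unless (count(q{//creature[@race='hobbit']})) {
    open my $file, '<', "hobbits.txt" or die "Cannot open hobbits.txt: $!";
    foreach (<$file>) {
      chomp;   # drop the trailing newline before using the name
      xsh(qq{insert element "<creature name='$_' race='hobbit'>"
        into m:/middle-earth/creatures});
    }
    close $file;
  }
};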
XSH2 allows users to extend the set of XPath functions by providing extension functions written in Perl. This can be achieved using the register-function command. The Perl code implementing an extension function works as a usual Perl routine, accepting its arguments in @_ and returning the result. The following conventions are used:
The arguments passed to the Perl implementation by the XPath engine are simple scalars for string, boolean and float argument types and XML::LibXML::NodeList objects for node-set argument types. The implementation is responsible for checking the number and types of its arguments. The implementation may use general Perl functions as well as XML::LibXML methods to process the arguments and return the result. Documentation for the XML::LibXML Perl module can be found, for example, at http://search.cpan.org/~pajas/XML-LibXML/.
Extension functions SHOULD NOT MODIFY the document DOM tree. Doing so could not only confuse the XPath engine but possibly even result in a critical error (such as a segmentation fault). Calling XSH2 commands from extension function implementations is also dangerous and is generally not recommended.
The extension function implementation must return a single value, which can be of one of the following types: a simple scalar (a number or string), an XML::LibXML::Boolean object reference (result is a boolean value), an XML::LibXML::Literal object reference (result is a string), an XML::LibXML::Number object reference (result is a float), an XML::LibXML::Node (or derived) object reference (result is a node-set consisting of a single node), or an XML::LibXML::NodeList (result is a node-set). For convenience, simple (non-blessed) array references consisting of XML::LibXML::Node objects can also be used for a node-set result instead of an XML::LibXML::NodeList.
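The conventions above can be sketched as an ordinary Perl routine. The function name join_values and its behaviour are hypothetical, chosen only for illustration, and registering the routine with register-function is not shown. The sketch deliberately needs only core Perl, so the node-set branch is exercised only when real XML::LibXML::NodeList objects are passed in.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical extension function: joins the string values of its first
# argument using the separator given as the second argument.
sub join_values {
    my @args = @_;

    # The implementation itself must check the number of arguments...
    die "join_values() expects exactly 2 arguments\n" unless @args == 2;
    my ($items, $separator) = @args;

    # ...and their types: the separator must be a plain scalar.
    die "join_values(): separator must be a string\n" if ref $separator;

    my @values;
    if (blessed($items) && $items->isa('XML::LibXML::NodeList')) {
        # Node-set arguments arrive as XML::LibXML::NodeList objects.
        @values = map { $_->textContent } $items->get_nodelist;
    }
    elsif (!ref $items) {
        # String, boolean and float arguments arrive as simple scalars.
        @values = ($items);
    }
    else {
        die "join_values(): unsupported first argument\n";
    }

    # A simple scalar is a valid return value (here: a string result).
    # The arguments are only read, never modified.
    return join $separator, @values;
}

print join_values('Frodo', ', '), "\n";
```

Returning a plain scalar keeps the sketch free of XML::LibXML; an implementation that wants to be explicit about the result type could wrap the value in one of the object types listed above instead.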
In the interactive mode, XSH2 interprets all lines starting with the exclamation mark (!) as shell commands and invokes the system shell to interpret the line (this is to mimic FTP and similar command-line interpreters).
xsh> !ls -l
-rw-rw-r-- 1 pajas pajas 6355 Mar 14 17:08 Artistic
drwxrwxr-x 2 pajas users 128 Sep 1 10:09 CVS
-rw-r--r-- 1 pajas pajas 14859 Aug 26 15:19 ChangeLog
-rw-r--r-- 1 pajas pajas 2220 Mar 14 17:03 INSTALL
-rw-r--r-- 1 pajas pajas 18009 Jul 15 17:35 LICENSE
-rw-rw-r-- 1 pajas pajas 417 May 9 15:16 MANIFEST
-rw-rw-r-- 1 pajas pajas 126 May 9 15:16 MANIFEST.SKIP
-rw-r--r-- 1 pajas pajas 20424 Sep 1 11:04 Makefile
-rw-r--r-- 1 pajas pajas 914 Aug 26 14:32 Makefile.PL
-rw-r--r-- 1 pajas pajas 1910 Mar 14 17:17 README
-rw-r--r-- 1 pajas pajas 438 Aug 27 13:51 TODO
drwxrwxr-x 5 pajas users 120 Jun 15 10:35 blib
drwxrwxr-x 3 pajas users 1160 Sep 1 10:09 examples
drwxrwxr-x 4 pajas users 96 Jun 15 10:35 lib
-rw-rw-r-- 1 pajas pajas 0 Sep 1 16:23 pm_to_blib
drwxrwxr-x 4 pajas users 584 Sep 1 21:18 src
drwxrwxr-x 3 pajas users 136 Sep 1 10:09 t
-rw-rw-r-- 1 pajas pajas 50 Jun 16 00:06 test
drwxrwxr-x 3 pajas users 496 Sep 1 20:18 tools
-rwxr-xr-x 1 pajas pajas 5104 Aug 30 17:08 xsh
To invoke a system shell command or program from the non-interactive mode or from a complex XSH2 construction, use the exec command.
Since UNIX shell commands are a very powerful tool for processing textual data, XSH2 supports direct redirection of the output of XSH2 commands to a system shell command. This is very similar to the redirection known from UNIX shells, except that here, of course, the first command in the pipeline is an XSH2 command. Since the semicolon (;) is used in XSH2 to separate commands, it has to be prefixed with a backslash if it should be used for other purposes.
Example 9. Use grep and less to display context of 'funny'
xsh> ls //chapter[5]/para | grep funny | less