This proposal defines a packaging system for several core XML technologies: XSLT, XQuery, and XProc. The goal is to define it generically enough that the same framework can be adapted to other technologies in the future (such as XML Schema, XForms, etc.). Besides enabling the delivery of libraries written in standard XSLT, XQuery and XProc, it supports processor-specific extensions, and allows new processors to be supported within the same framework.
XSLT, XQuery and XProc are amazing programming languages. But they lack a large choice of libraries, and when such libraries do exist, they are a challenge to install. There is no automatic install process, the rules are different for each processor, and library authors do not follow the same rules regarding the information they provide, the cataloging, the way they reference third-party libraries, etc.
All those problems (well, most of them) can be addressed by a packaging system that is broadly adopted by processor vendors and library authors. The cornerstone of such a system is the packaging format: a description of the information to be provided by library authors, and of how to provide and structure it.
A
A package has a unique name (a URI) as well as a convenient short name, also known as its abbrev.
A package provides all this information, as well as its components, as a single file. This makes it convenient to organize packages, publish them, give them to a processor to install, etc.
A
All the components composing the package, alongside an additional expath-pkg.xml at the root of the ZIP file, and containing information about the package (like its name and its version number) and about the components it provides and how to reference them.
The package descriptor is an XML file, named expath-pkg.xml and located at the root of the ZIP file. It describes the whole package and all of its components. Alongside this descriptor, the root of the ZIP file contains a directory named after the module abbrev, which contains the components and any other file the package needs. All the relative URIs used to identify components are relative to the package directory. See
Because this package format is designed to be extensible and used as a building block by other specifications, the ZIP file can contain other entries at the top level. They are simply ignored when the package is deployed as a simple package following this specification.
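The layout described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper names `build_xar` and `list_entries`, as well as the sample abbrev and file names, are hypothetical and are not defined by this specification.

```python
import io
import zipfile


def build_xar(abbrev, descriptor_xml, components):
    """Return the bytes of a ZIP archive laid out as described above.

    `components` maps paths (relative to the package directory) to content.
    All names used here are illustrative assumptions.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The package descriptor sits at the root of the ZIP file.
        archive.writestr("expath-pkg.xml", descriptor_xml)
        # Components live in a directory named after the abbrev.
        for relative_path, content in components.items():
            archive.writestr(f"{abbrev}/{relative_path}", content)
    return buffer.getvalue()


def list_entries(xar_bytes):
    """Return the sorted entry names of an archive built by build_xar."""
    with zipfile.ZipFile(io.BytesIO(xar_bytes)) as archive:
        return sorted(archive.namelist())
```

A tool reading such an archive would find the descriptor at the root and everything else under the abbrev directory; any other top-level entries would simply be ignored.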
All the elements in the package descriptor are defined in the namespace http://expath.org/ns/pkg. All the elements defined in this specification or used in samples and in text are in this namespace, even if no prefix is used. The root element is package, and contains, besides some naming and versioning attributes, a title, an optional home URI, an optional list of dependencies on other packages and on processors, and a list of components:
name is the name of the package. A package is named using an absolute URI, excluding file: scheme URIs (the most frequent choices are http: and urn: scheme URIs). abbrev is the package abbrev, and version its version. The package directory must have the exact same name as the abbrev. spec is the version of the packaging specification the package conforms to. The current specification requires the package to use the spec number 1.0 (no forward compatibility rules are defined; a processor conforming to this specification has to generate an error if the spec number is different from the string 1.0). The package then contains a title, which is a plain string intended as a simple description of the package for humans, and a home, which is a URI where more information about the package can be found. It then contains the list of its dependencies and the list of its components.
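As a rough illustration of the rules above, the following Python sketch checks the root element of a descriptor. It makes several assumptions that are not stated in this text: that all four attributes are required, that the title is carried by a child element named `title`, and the function name `check_descriptor` is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

PKG_NS = "http://expath.org/ns/pkg"


def check_descriptor(xml_text):
    """Return a list of problems found in a descriptor (empty if none)."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{PKG_NS}}}package":
        return [f"unexpected root element: {root.tag}"]
    problems = []
    # Assumption: every naming and versioning attribute is required.
    for attr in ("name", "abbrev", "version", "spec"):
        if not root.get(attr):
            problems.append(f"missing attribute: {attr}")
    # The name must be an absolute URI, and must not use the file: scheme.
    name = root.get("name")
    if name:
        scheme = urlparse(name).scheme
        if not scheme or scheme == "file":
            problems.append("name must be an absolute, non-file: URI")
    # Any spec number other than the string 1.0 is an error.
    if root.get("spec") not in (None, "1.0"):
        problems.append("spec must be the string 1.0")
    # Assumption: the title is a child element named "title".
    if root.find(f"{{{PKG_NS}}}title") is None:
        problems.append("missing title element")
    return problems
```

A real processor would more likely validate against the schema of the descriptor; the sketch only mirrors the constraints spelled out in the paragraph above.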
Dependencies are set by using the name of other packages this package depends on. The
dependency can also define the version of the package or the set of processors it
depends on by using one of the few available strategies; see
This section describes the standard component kinds supported by this specification, and how they contribute to the package descriptor document type.
An XSLT file is associated with a public import URI, intended to be used in an import instruction (xsl:import) to import the XSLT file provided in the package. This file is configured with the element xslt.
The element file contains the path to the file within the package structure, relative to the package directory. Both elements import-uri and file are of type anyURI.
An XQuery library module is referenced by its namespace URI. Thus the xquery element associates a namespace URI with an XQuery file. An importing module just needs to use an import statement of the form import module namespace xx = "<namespace-uri>";.
An XQuery main module is associated with a public URI. Usually an XQuery package will provide functions through library modules, but in some cases one may want to provide main modules as well.
Note that there is no way to set any location hint (such as the at clause in the import statement). To use this packaging system, an XQuery library module must be referenced by its target namespace.
An XProc pipeline, like an XSLT stylesheet, is associated with a public import URI, intended to be used in a p:import statement.
An XML schema can be imported using its target namespace. It is not possible to set several files as several sources for the schema. If the schema is spread over multiple files, there must be one top-level file that includes the other files.
The import-uri can be used to define a schema location for this schema component. This can be useful for schemas without a target namespace, or for some specific usages, such as xs:redefine.
A RelaxNG schema, like an XSLT stylesheet, is associated with a public import URI,
aimed to be used in an
A Schematron schema is associated with a public URI.
An NVDL script is associated with a public URI.
Documentation (such as the output of XSLStyle or xqDoc) is not taken into account in the packaging format, though it could be used by IDEs, for instance to provide documentation for functions in an editor with a live completion feature. Some support for documentation can of course be added to the package descriptor as a product-specific feature.
A
The installed package list is implementation-defined. Each implementation (for a
specific processor) can define its own way to install and remove packages, as long as it
properly documents it. A processor should use, when appropriate, the
Whether or not such a repository exists (or several repositories), the implementation must define an installed packages list. How this is done is outside the scope of this spec. An XML IDE could provide a way to select packages to activate for a specific scenario, or a web server container could activate packages on a per-web application basis.
When a file of a specific kind is referenced via an absolute URI, a processor must look up this URI in the corresponding URI space in the repository. How the repository is made known to the processor is implementation-defined (a processor can also use a list of repositories, and enable or disable some libraries in any implementation-defined way).
The URI space to use is defined by the nature of the reference. An XSLT href attribute on xsl:import will use the xslt URI space, while it will use the xsd space for xsl:import-schema.
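The idea that the same URI is looked up in different URI spaces depending on the kind of reference can be sketched as follows. This is a minimal illustration only: the in-memory `CATALOG`, its sample URIs and paths, and the function name `resolve` are all hypothetical, and a real implementation is free to organize its lookup differently.

```python
# (URI space, public URI) -> file path inside the repository.
# Sample data only; none of these entries come from the specification.
CATALOG = {
    ("xslt", "http://example.org/lib/style.xsl"): "lib-1.0/lib/style.xsl",
    ("xsd", "http://example.org/lib/types"): "lib-1.0/lib/types.xsd",
}

# URI space used by each kind of reference (only the two named above).
SPACE_FOR_REFERENCE = {
    "xsl:import": "xslt",
    "xsl:import-schema": "xsd",
}


def resolve(reference_kind, uri, catalog=CATALOG):
    """Return the repository path for a URI, or None if it is not installed."""
    space = SPACE_FOR_REFERENCE[reference_kind]
    return catalog.get((space, uri))
```

With this sketch, a stylesheet URI resolves only when it is reached through xsl:import; the same URI used in xsl:import-schema is looked up in the xsd space and is not found.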
An XProc processor in particular has to pay great attention to the space it uses for the step that is being evaluated. Any xsl:import instruction encountered on the stylesheet port of the step p:xslt has to be looked for in the xslt space (regardless of whether the stylesheet document is inlined in the pipeline, computed, loaded from the file system or retrieved from the Internet, or whether the containing stylesheet has itself been imported).
The XProc elements p:document and p:data, as well as the step p:load, are handled specially. They can be used to access any kind of resource, including but not limited to components in a repository. The user has to explicitly tell the processor what kind of component is looked for by using the pkg:kind extension attribute. For instance, a stylesheet can be loaded from a repository as input to the step xslt as follows:
Every package has a name (a URI) and a version number. It can also have a set of dependencies on other packages, identified by their names. Each dependency can also define a specific version (or range of versions) of the package it depends on. In addition to depending on other packages, a package can also depend on a specific processor (or on one among a set of specific processors). This section uses the following example package to illustrate those concepts:
At first glance, we can see that this package (which is named http://example.org/app and has the version number 1.0) depends on the package http://external.org/library (no specific version), on the package http://partner.com/lib-2 (version 2, whatever the minor revision numbers) and on the Saxon processor, Professional Edition. The rest of this section explains in detail the rules behind those dependencies, how to represent them, and their semantics.
The remainder of this section will use the term
A dependency on another package is represented by the element dependency, with an attribute package which is the name of the library the package depends on. The dependency can be versioned; that is, only some versions of the library are acceptable for this package.
The versioning attributes are versions, semver, semver-min and semver-max. They are all mutually exclusive, except semver-min and semver-max. If no versioning attribute is set on the dependency, any version of the secondary package, identified by its name (the URI in the package attribute), is acceptable. If the versions attribute is used, it defines the exact set of acceptable versions for the secondary package, separated by spaces.
The other versioning attributes use
If the semver attribute is used, the secondary package version must be compatible with this template. If semver-min is used, the secondary package version must be either compatible with that SemVer template, or greater than it. If semver-max is used, the secondary package version must be either compatible with that SemVer template, or lower than it. For instance, with semver-min="2.3" and semver-max="3", any version of the secondary package which is equal to or above 2.3.0 and strictly below 4.0.0 is valid (that is, 2.3.0, 3.0.0, 3.99.87, etc.)
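The matching rules above can be sketched in Python. This is a minimal sketch under assumptions, not a normative algorithm: it assumes that SemVer templates and the versions compared against them are dot-separated non-negative integers, that a version is "compatible" with a template when its leading numbers equal the template, and the function names are hypothetical. It also does not enforce the mutual exclusion between the attributes.

```python
def parse(text):
    """Turn "2.3.0" into (2, 3, 0). Assumes dot-separated integers."""
    return tuple(int(part) for part in text.split("."))


def compatible(version, template):
    """True if the leading numbers of the version equal the template."""
    v, t = parse(version), parse(template)
    return v[:len(t)] == t


def satisfies(version, versions=None, semver=None,
              semver_min=None, semver_max=None):
    """Check a secondary package version against one dependency."""
    if versions is not None:
        # Exact set of acceptable versions, separated by spaces.
        return version in versions.split()
    if semver is not None:
        return compatible(version, semver)
    ok = True
    if semver_min is not None:
        # Compatible with the template, or greater than it.
        ok = ok and (compatible(version, semver_min)
                     or parse(version) > parse(semver_min))
    if semver_max is not None:
        # Compatible with the template, or lower than it.
        ok = ok and (compatible(version, semver_max)
                     or parse(version) < parse(semver_max))
    return ok
```

With semver_min="2.3" and semver_max="3", this sketch accepts 2.3.0, 3.0.0 and 3.99.87, and rejects 2.2.9 and 4.0.0, in line with the example given in the text. When no versioning attribute is given, every version is accepted.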
In addition to depending on other packages, a dependency can tell which processor(s) is (are) needed for this package:
If there are several such dependencies, the package is supported by a specific processor as soon as one of the processor dependencies matches. The meaning of the versioning attributes on processor dependencies is defined by each processor.
How those rules are enforced is implementation-defined. For instance, a command-line tool that installs a package in a repository can generate an error if the required dependencies are not installed, try to install them automatically from the web, or just emit a warning and install the package, hoping the dependencies will be installed later on; it can also provide an option to select which behaviour to use. Another implementation can enforce the dependencies at compile time or at run time. An implementation SHOULD provide a way to enforce those rules, but can also provide an option to ignore them.
From the above text, a version number is not defined to have any particular
semantics, except when used with one of the several SemVer attributes. The only
constraint is that a version number cannot contain any whitespace character (as
defined by the
This section defines a standard structure for on-disk repositories. An implementation can choose not to support this kind of repository and to define its own (or even not to define it publicly, and just provide the ability to install and remove packages in a clearly documented way). However, there are several advantages to supporting this structure, the most obvious being the ability to benefit from existing tools that manage such repositories, as well as from existing libraries that access them.
The repository is a directory on the file system, within which the packages are installed. Each package results in a subdirectory in the repository, where the content of the XAR file is unzipped. Next to those package directories, there is also a subdirectory named .expath-pkg/, which contains administrative information about the packages installed in this repository. The package directories should be named after the package abbreviation (the abbrev attribute in the package descriptor) and the version number, separated by a dash. A package directory name must not contain the space character.
Because these directory names cannot be guaranteed to be unique among all the packages installed in the repository, and because processors will in some cases have to translate some characters (depending on the rules of the filesystem hosting the repository), they cannot be used as a perfect mapping between package abbreviations and directories. The naming scheme is only described here as a note, in the hope that the various implementations will use similar naming strategies, meaningful for humans.
The repository maintains a list of installed packages. For each of them, this list contains its name (the name URI), its subdirectory name, and its version number. This list is maintained within two files: .expath-pkg/packages.xml and .expath-pkg/packages.txt. Both contain the same information, the former in an XML format and the latter as plain text. The XML format looks like the following (see
The text file is a line-separated file (any line termination can be used: \r, \n or \r\n, though the standard Internet line termination is preferred: \r\n). Each line represents a package, and is a space-separated record. The first field is the package directory (the name of the subdirectory within the repository where the package has been unzipped), the second one is the package name (the name URI), and the last one is the version number. The first two fields cannot contain any space character. The version number is the only one that can contain a space character (though this is not recommended). Every field is separated by exactly one space character (the Unicode character with code point #x20). This file then looks like:
The overall repository thus looks like:
Because the package descriptor is unzipped in the package directory, as well as the rest of the files contained in the package, it is easy to find the mapping between package directories and package names and versions: just look at the package descriptor in every package directory, and you will find what you are looking for. But in order to allow other programs to easily access the list of installed packages, we provide the list in .expath-pkg/packages.xml. And because shell scripts must be able to access that list as well, a second, text-oriented file is provided; of course this is redundant information, but it is maintained automatically by the repository manager, so the redundancy is not an issue.
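To illustrate how a program might read the plain-text list described above, here is a small Python sketch. The function name `parse_packages_txt` and the sample records in the usage note are hypothetical; the only rules relied upon are those stated in the text: one record per line, fields separated by exactly one space, and only the last field (the version) allowed to contain spaces.

```python
def parse_packages_txt(text):
    """Parse packages.txt into (directory, name, version) tuples."""
    records = []
    # splitlines() accepts \r, \n and \r\n as line terminators.
    for line in text.splitlines():
        if not line:
            continue  # tolerate a trailing terminator or blank line
        # Split on the first two spaces only: the version number is the
        # only field that may itself contain a space character.
        directory, name, version = line.split(" ", 2)
        records.append((directory, name, version))
    return records
```

For instance, a file containing the two lines `app-1.0 http://example.org/app 1.0` and `lib-2.1 http://partner.com/lib-2 2.1 rc1` would yield two records, the second one with the version `2.1 rc1`.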
In addition to this structure, a specific processor can store information in subdirectories (either at the repository level or within each package directory), provided those directories start with a dot and are named unambiguously after the processor. For instance [repository]/.saxon/, or [repository]/[some-pkg]/.exist/.
This section provides a non-normative example to illustrate the concepts defined here.
Instead of using a
The first thing to do is to create a ZIP file with both of those components, alongside a package descriptor: 1) the package descriptor is named expath-pkg.xml and is located at the root of the package, 2) the library content is in a directory at the root of the package (aka the package directory). The name of this directory is the value of the abbrev attribute, and must be a valid NCName. The structure (the content) of the package directory is completely free. In our case, let's just put both component files directly in the package directory, and define the module name as functx:
The XQuery library module's target namespace is defined by the module itself. For the XSLT stylesheet, we have to define its public import URI, to be used in an xsl:import (or by any other means, for instance within XProc or an IDE scenario system). Let's define it as http://www.functx.com/functx.xsl.
The package descriptor thus looks like the following:
We just have to create a ZIP file with this structure and content. The convention is to call this file functx-1.0.xar
(that is,
[
The content of .expath-pkg/packages.xml is:
The content of .expath-pkg/packages.txt is:
The package directory functx-1.0/ has been named after the abbrev and the version number. Its content is simply the content of the XAR file, unzipped. A user can then import the FunctX library from either an XQuery module or an XSLT stylesheet by using respectively an import statement:
or an import instruction:
The package format defined in this specification is a complete system to package libraries, for the right definition of a library. But a specific kind of
Another extensibility point is the definition of additional component types. The component types defined here are the standard types, but an implementation may support additional implementation-defined types.
More generally, an implementation may define any extension element in the package descriptor to achieve its purposes, provided that the new elements it defines are neither in the EXPath Packaging namespace nor in the null namespace.