File Module

EXPath Candidate Proposed Module 03 December 2013

This version:
http://expath.org/spec/file/20131203
Latest version:
http://expath.org/spec/file
Previous versions:
http://expath.org/spec/file/20120614
http://expath.org/spec/file/20100517
Editors:
Christian Grün, BaseX GmbH
Matthias Brantner, 28msec GmbH
Contributor:
Gabriel Petrovay, 28msec GmbH

This document is also available in these non-normative formats: XML and Revision markup.


Abstract

This proposal provides a file system API for XPath. It defines extension functions to perform file system related operations such as listing, reading, or writing files or directories. It has been designed to be compatible with XQuery 1.0 and XSLT 2.0, as well as any other XPath 2.0 usage.

Table of Contents

1 Status of this document
2 Introduction
    2.1 Namespace conventions
    2.2 File Paths vs. URIs
    2.3 Query Execution
    2.4 Error Management
3 File Properties
    3.1 file:exists
    3.2 file:is-dir
    3.3 file:is-file
    3.4 file:last-modified
    3.5 file:size
4 Input/Output
    4.1 file:append
    4.2 file:append-binary
    4.3 file:append-text
    4.4 file:append-text-lines
    4.5 file:copy
    4.6 file:create-dir
    4.7 file:create-temp-dir
    4.8 file:create-temp-file
    4.9 file:delete
    4.10 file:list
    4.11 file:move
    4.12 file:read-binary
    4.13 file:read-text
    4.14 file:read-text-lines
    4.15 file:write
    4.16 file:write-binary
    4.17 file:write-text
    4.18 file:write-text-lines
5 Paths
    5.1 file:name
    5.2 file:parent
    5.3 file:base-name
    5.4 file:dir-name
    5.5 file:path-to-native
    5.6 file:path-to-uri
    5.7 file:resolve-path
6 System Properties
    6.1 file:dir-separator
    6.2 file:line-separator
    6.3 file:path-separator
    6.4 file:temp-dir

Appendices

A References
B Summary of Error Conditions


1 Status of this document

This document is in a final draft stage. Comments are welcome on the public-expath@w3.org mailing list (archive).

2 Introduction

2.1 Namespace conventions

This module defines functions and errors in the namespace http://expath.org/ns/file. In this document, the file prefix, when used, is bound to this namespace URI.

The output prefix is bound to the namespace http://www.w3.org/2010/xslt-xquery-serialization. It is used to specify serialization parameters.

2.2 File Paths vs. URIs

All file paths are specified as strings, and are resolved against the current working directory. An implementation must accept absolute and relative UNIX/Linux and Windows paths as well as absolute file URIs. Some examples:

  • C:\Test Dir\my file.xml: An absolute path on Windows platforms.

  • /Test Dir/my file.xml: An absolute path on UNIX-based platforms.

  • C:\\\Test Dir//\\my file.xml: An absolute path on Windows platforms that tolerates an arbitrary number of slashes and backslashes.

  • my file.xml: A relative path, pointing to a file in the current working directory.

  • file:///C:/Test%20Dir/my%20file.xml: An absolute file URI on Windows platforms.

  • file:///Test%20Dir/my%20file.xml: An absolute file URI on UNIX-based platforms.

Before further processing, all paths are normalized to an implementation-defined representation (which usually is the representation of the underlying operating system).

If a function returns a string that refers to a directory, it will always be suffixed with the system-specific directory separator.

The standard function fn:static-base-uri can be used to resolve file operations against the base URI:

let $filename := "input.txt"
let $dir := file:parent(static-base-uri())
let $path := concat($dir, $filename)
return file:read-text($path)

2.3 Query Execution

Some functions are marked as ·nondeterministic·, which means they are not guaranteed to perform the same operations and produce identical results from repeated calls. A query processor must ensure that these functions are not relocated or pre-evaluated and that their results are not cached when compiling and evaluating the query and serializing its results.

2.4 Error Management

Error conditions are identified by a code (a QName). When such an error condition is reached during the execution of the function, a dynamic error is thrown, with the corresponding error code (as if the standard XPath function fn:error had been called).

Error codes are defined throughout the specification. The generic error [file:io-error] with an appropriate message is raised for I/O faults, or for other errors that are specific to the underlying platform or programming language.

For a list of specific errors see the "Summary of Error Conditions" section of this document.
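
As an illustration of how these errors might be handled, the following sketch catches one specific error and the generic I/O error separately. It assumes an XQuery 3.0 processor (try/catch is not available in XQuery 1.0), that error codes are QNames in the file namespace as listed in the Summary of Error Conditions, and a hypothetical file name settings.txt.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical file name, used for illustration only. :)
let $path := "settings.txt"
return
  try {
    file:read-text($path)
  } catch file:not-found {
    (: The path does not exist: fall back to an empty string. :)
    ""
  } catch file:io-error {
    (: Any other I/O fault: re-raise with a more specific message. :)
    fn:error(xs:QName("file:io-error"), concat("Cannot read ", $path))
  }
```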

3 File Properties

3.1 file:exists

file:exists($path as xs:string) as xs:boolean

Tests if the file or directory pointed to by $path exists.

The function returns true() if a file or a directory exists at the location pointed to by $path.

This function is ·nondeterministic·.

3.2 file:is-dir

file:is-dir($path as xs:string) as xs:boolean

Tests if $path points to a directory. On UNIX-based systems the root and the volume roots are considered directories.

This function is ·nondeterministic·.

3.3 file:is-file

file:is-file($path as xs:string) as xs:boolean

Tests if $path points to a file.

This function is ·nondeterministic·.

3.4 file:last-modified

file:last-modified($path as xs:string) as xs:dateTime

Returns an xs:dateTime value representing the last modification time of a file or directory.

This function is ·nondeterministic·.

[file:not-found] is raised if $path does not exist.
[file:io-error] is raised if any other error occurs.

3.5 file:size

file:size($file as xs:string) as xs:integer

Returns the byte size of a file as an integer, or the value 0 for directories.

This function is ·nondeterministic·.

[file:not-found] is raised if $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:io-error] is raised if any other error occurs.
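
The property functions can be combined to describe a path before it is processed. The following is a minimal sketch, assuming a hypothetical path input.txt; the message texts are illustrative only.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical path, resolved against the current working directory. :)
let $path := "input.txt"
return
  if (file:is-file($path)) then
    (: fn:concat converts the integer and dateTime arguments to strings. :)
    concat($path, ": ", file:size($path), " bytes, last modified ",
           file:last-modified($path))
  else if (file:is-dir($path)) then
    concat($path, " is a directory")
  else
    concat($path, " does not exist")
```

Checking file:is-file first means file:size and file:last-modified are only called for an existing file.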

4 Input/Output

4.1 file:append

file:append($file as xs:string,
            $items as item()*) as empty-sequence()
file:append($file as xs:string,
            $items as item()*,
            $params as element(output:serialization-parameters)) as empty-sequence()

Appends a sequence of items to a file. If the file pointed to by $file does not exist, a new file will be created.

$params controls the way the items in $items are serialized. The semantics of $params is the same as for the fn:serialize function in [XQuery and XPath Functions and Operators 3.0]. This consists of an output:serialization-parameters element whose format is defined in [XSLT and XQuery Serialization 3.0]. In contrast to fn:serialize, the encoding stage will not be skipped by this function.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:no-dir] is raised if the parent directory of $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:io-error] is raised if any other error occurs.

4.2 file:append-binary

file:append-binary($file as xs:string,
                   $value as xs:base64Binary) as empty-sequence()

Appends a Base64 item as binary to a file. If the file pointed to by $file does not exist, a new file will be created.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:no-dir] is raised if the parent directory of $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:io-error] is raised if any other error occurs.

4.3 file:append-text

file:append-text($file as xs:string,
                 $value as xs:string) as empty-sequence()
file:append-text($file as xs:string,
                 $value as xs:string,
                 $encoding as xs:string) as empty-sequence()

Appends a string to a file. If the file pointed to by $file does not exist, a new file will be created.

The optional parameter $encoding, if not provided, is considered to be UTF-8.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:no-dir] is raised if the parent directory of $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:unknown-encoding] is raised if $encoding is invalid or not supported by the implementation.
[file:io-error] is raised if any other error occurs.

4.4 file:append-text-lines

file:append-text-lines($file as xs:string,
                       $values as xs:string*) as empty-sequence()
file:append-text-lines($file as xs:string,
                       $values as xs:string*,
                       $encoding as xs:string) as empty-sequence()

Appends a sequence of strings to a file, each followed by the system-dependent newline character. If the file pointed to by $file does not exist, a new file will be created.

The optional parameter $encoding, if not provided, is considered to be UTF-8.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:no-dir] is raised if the parent directory of $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:unknown-encoding] is raised if $encoding is invalid or not supported by the implementation.
[file:io-error] is raised if any other error occurs.
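
The append functions lend themselves to simple logging. The following sketch assumes a hypothetical log file app.log in the current working directory; the entries are placeholder text.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical log file; it is created on first use if it does not exist. :)
let $log := "app.log"
let $entries := ("job started", "job finished")
(: Each entry is written on its own line, encoded as UTF-8. :)
return file:append-text-lines($log, $entries, "UTF-8")
```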

4.5 file:copy

file:copy($source as xs:string,
          $target as xs:string) as empty-sequence()

Copies a file or a directory given a source and a target path/URI. The following cases may occur if $source points to a file:

  1. if $target does not exist, it will be created.
  2. if $target is a file, it will be overwritten.
  3. if $target is a directory, the file will be created in that directory with the name of the source file. If a file already exists, it will be overwritten.

The following cases may occur if $source points to a directory:

  1. if $target does not exist, it will be created as directory, and all files of the source directory are copied to this directory with their existing local names.
  2. if $target is a directory, the source directory with all its files will be copied into the target directory. At each level, if a file already exists in the target with the same name as in the source, it is overwritten. If a directory already exists in the target with the same name as in the source, it is not removed; it is recursed in place (if it does not exist, it is created before recursing).

Other cases will raise one of the errors listed below.

The function returns the empty sequence if the operation is successful. If an error occurs during the operation, no rollback to the original state will be possible.

This function is ·nondeterministic·.

[file:not-found] is raised if the $source path does not exist.
[file:exists] is raised if $source points to a directory and $target points to an existing file.
[file:no-dir] is raised if the parent directory of $source does not exist.
[file:is-dir] is raised if $source points to a file and $target points to a directory, in which a subdirectory exists with the name of the source file.
[file:io-error] is raised if any other error occurs.
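
As a sketch of the third file case above (copying a file into an existing directory), the following query checks the target first. The names input.txt and backup are hypothetical.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical source file and target directory. :)
let $source := "input.txt"
let $target := "backup"
return
  if (file:is-dir($target)) then
    (: The copy is created inside the directory, under the name input.txt. :)
    file:copy($source, $target)
  else
    (: Avoid silently creating or overwriting a file called "backup". :)
    concat("Target directory is missing: ", $target)
```

Without the check, a missing $target would fall under the first case and the copy would be created as a file named backup.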

4.6 file:create-dir

file:create-dir($dir as xs:string) as empty-sequence()

Creates a directory, or does nothing if the directory already exists. The operation will create all non-existing parent directories.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:exists] is raised if the specified path, or any of its parent directories, points to an existing file.
[file:io-error] is raised if any other error occurs.

4.7 file:create-temp-dir

file:create-temp-dir($prefix as xs:string,
                     $suffix as xs:string) as xs:string
file:create-temp-dir($prefix as xs:string,
                     $suffix as xs:string,
                     $dir as xs:string) as xs:string

Creates a temporary directory and all non-existing parent directories and returns the full path to the created directory.

The temporary directory will not be automatically deleted after query execution. It is guaranteed to not already exist when the function is called.

If $dir is not given, the directory will be created inside the system-dependent default temporary-file directory.

This function is ·nondeterministic·.

[file:no-dir] is raised if the specified directory does not exist.
[file:io-error] is raised if any other error occurs.

4.8 file:create-temp-file

file:create-temp-file($prefix as xs:string,
                      $suffix as xs:string) as xs:string
file:create-temp-file($prefix as xs:string,
                      $suffix as xs:string,
                      $dir as xs:string) as xs:string

Creates a temporary file and all non-existing parent directories and returns the full path to the created file.

The temporary file will not be automatically deleted after query execution. It is guaranteed to not already exist when the function is called.

If $dir is not given, the file will be created inside the system-dependent default temporary-file directory.

This function is ·nondeterministic·.

[file:no-dir] is raised if the specified directory does not exist.
[file:io-error] is raised if any other error occurs.
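
Because temporary files are not removed automatically, a query that creates one would typically delete it again. The sketch below uses a hypothetical prefix and suffix, and assumes that the processor evaluates the comma-separated expressions from left to right, which this module does not itself guarantee.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical prefix and suffix; the rest of the name is chosen by the implementation. :)
let $tmp := file:create-temp-file("job-", ".tmp")
return (
  (: Assumed to run first: store some scratch data. :)
  file:write-text($tmp, "scratch data"),
  (: Assumed to run last: remove the file again. :)
  file:delete($tmp)
)
```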

4.9 file:delete

file:delete($path as xs:string) as empty-sequence()
file:delete($path as xs:string,
            $recursive as xs:boolean) as empty-sequence()

Deletes a file or a directory from the file system.

If the optional parameter $recursive is set to true(), sub-directories will be deleted as well.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:not-found] is raised if $path does not exist.
[file:is-dir] is raised if $path points to a non-empty directory.
[file:io-error] is raised if any other error occurs.
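
A guarded recursive delete can avoid the not-found error for paths that are already gone. This is a minimal sketch with a hypothetical directory name build:

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical directory to clean up. :)
let $dir := "build"
return
  if (file:exists($dir)) then
    (: true() also removes sub-directories and their contents. :)
    file:delete($dir, true())
  else
    ()
```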

4.10 file:list

file:list($dir as xs:string) as xs:string*
file:list($dir as xs:string,
          $recursive as xs:boolean) as xs:string*
file:list($dir as xs:string,
          $recursive as xs:boolean,
          $pattern as xs:string) as xs:string*

Lists all files and directories in a given directory. The order of the items in the resulting sequence is not defined. The "." and ".." items are never returned. The returned paths are relative to the provided directory $dir.

If the optional parameter $recursive is set to true(), all directories and files will be returned that are found while recursively traversing the given directory.

The optional $pattern parameter defines a name pattern in the glob syntax. If this is provided, only the paths of the files and directories whose names match the pattern will be returned.

An implementation must support at least the following glob syntax for the pattern:

  • * for matching any number of unknown characters and
  • ? for matching one unknown character.

This function is ·nondeterministic·.

[file:no-dir] is raised if $dir does not point to an existing directory.
[file:io-error] is raised if any other error occurs.
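
Since the order of the returned items is not defined, a query that needs a stable listing can sort the result itself. The sketch below assumes a hypothetical directory data and uses only the * wildcard that every implementation must support.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical directory; true() descends into sub-directories. :)
for $name in file:list("data", true(), "*.xml")
(: The returned paths are relative to "data"; sort them for a stable result. :)
order by $name
return $name
```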

4.11 file:move

file:move($source as xs:string,
          $target as xs:string) as empty-sequence()

Moves a file or a directory given a source and a target path/URI. The following cases may occur if $source points to a file:

  1. if $target does not exist, it will be created.
  2. if $target is a file, it will be overwritten.
  3. if $target is a directory, the file will be created in that directory with the name of the source file. If a file already exists, it will be overwritten.

The following cases may occur if $source points to a directory:

  1. if $target does not exist, it will be created as directory, and all files of the source directory are moved to this directory with their existing local names.
  2. if $target is a directory, the source directory with all its files will be moved into the target directory. If the target directory contains a directory with the same name as the source, the error [file:is-dir] is raised.

Other cases will raise one of the errors listed below.

The function returns the empty sequence if the operation is successful. If an error occurs during the operation, no rollback to the original state will be possible.

This function is ·nondeterministic·.

[file:not-found] is raised if the $source path does not exist.
[file:exists] is raised if $source points to a directory and $target points to an existing file.
[file:no-dir] is raised if the parent directory of $source does not exist.
[file:is-dir] is raised if $source points to a file and $target points to a directory, in which a subdirectory exists with the name of the source file.
[file:io-error] is raised if any other error occurs.

4.12 file:read-binary

file:read-binary($file as xs:string) as xs:base64Binary
file:read-binary($file as xs:string,
                 $offset as xs:integer) as xs:base64Binary
file:read-binary($file as xs:string,
                 $offset as xs:integer,
                 $length as xs:integer) as xs:base64Binary

Returns the content of a file in its Base64 representation.

The optional parameters $offset and $length can be used to read chunks of a file.

This function is ·nondeterministic·.

[file:not-found] is raised if $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:out-of-range] is raised if $offset or $length is negative, or if the chosen values would exceed the file bounds.
[file:io-error] is raised if any other error occurs.
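
Reading a chunk is useful, for example, to inspect a file header without loading the whole file. The sketch below uses a hypothetical file name and assumes that $offset and $length are counted in bytes, starting at 0.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical file name. Offset 0 and length 8 select the first eight bytes,
   assuming both values are byte counts. :)
file:read-binary("image.png", 0, 8)
```

If the file is shorter than the requested range, the out-of-range error described above applies.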

4.13 file:read-text

file:read-text($file as xs:string) as xs:string
file:read-text($file as xs:string,
               $encoding as xs:string) as xs:string

Returns the content of a file in its string representation.

The optional parameter $encoding, if not provided, is considered to be UTF-8.

This function is ·nondeterministic·.

[file:not-found] is raised if $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:unknown-encoding] is raised if $encoding is invalid or not supported by the implementation.
[file:io-error] is raised if any other error occurs.

4.14 file:read-text-lines

file:read-text-lines($file as xs:string) as xs:string*
file:read-text-lines($file as xs:string,
                     $encoding as xs:string) as xs:string*

Returns the contents of a file as a sequence of strings, separated at newline boundaries.

The optional parameter $encoding, if not provided, is considered to be UTF-8.

The newline handling is the same as for the fn:unparsed-text-lines function in [XQuery and XPath Functions and Operators 3.0].

This function is ·nondeterministic·.

[file:not-found] is raised if $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:unknown-encoding] is raised if $encoding is invalid or not supported by the implementation.
[file:io-error] is raised if any other error occurs.
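
Because the result is a sequence of strings, ordinary sequence operations apply to the lines. A minimal sketch, assuming a hypothetical file input.txt, that counts the non-blank lines:

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical input file, read as UTF-8 by default. :)
let $lines := file:read-text-lines("input.txt")
(: Keep only lines that contain something other than whitespace. :)
let $non-blank := $lines[normalize-space(.) ne ""]
return count($non-blank)
```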

4.15 file:write

file:write($file as xs:string,
           $items as item()*) as empty-sequence()
file:write($file as xs:string,
           $items as item()*,
           $params as element(output:serialization-parameters)) as empty-sequence()

Writes a sequence of items to a file. If $file already exists, it will be overwritten; otherwise, it will be created.

$params controls the way the $items items are serialized. The semantics of $params is the same as for the fn:serialize function in [XQuery and XPath Functions and Operators 3.0]. This consists of an output:serialization-parameters element whose format is defined in [XSLT and XQuery Serialization 3.0]. In contrast to fn:serialize, the encoding stage will not be skipped by this function.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:no-dir] is raised if the parent directory of $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:io-error] is raised if any other error occurs.
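
The $params argument might be used as follows to write an indented XML document. The file name and the report element are hypothetical; the parameter element follows the output:serialization-parameters format referenced above.

```xquery
declare namespace file = "http://expath.org/ns/file";
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

file:write(
  (: Hypothetical target file; an existing file is overwritten. :)
  "report.xml",
  (: Hypothetical content to serialize. :)
  <report><entry>first</entry></report>,
  (: Serialize as XML with indentation. :)
  <output:serialization-parameters>
    <output:method value="xml"/>
    <output:indent value="yes"/>
  </output:serialization-parameters>
)
```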

4.16 file:write-binary

file:write-binary($file as xs:string,
                  $value as xs:base64Binary) as empty-sequence()
file:write-binary($file as xs:string,
                  $value as xs:base64Binary,
                  $offset as xs:integer) as empty-sequence()

Writes a Base64 item as binary to a file. If $file already exists, it will be overwritten; otherwise, it will be created.

If the optional parameter $offset is specified, data will be written to this file position. An existing file may be resized by that operation.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:no-dir] is raised if the parent directory of $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:out-of-range] is raised if $offset is negative, or if it exceeds the current file size.
[file:io-error] is raised if any other error occurs.
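
A binary value can be supplied through the xs:base64Binary constructor. In this sketch the file name is hypothetical, and the literal SGVsbG8= is the Base64 encoding of the five ASCII bytes of "Hello".

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical file name; the decoded bytes are written, not the Base64 text. :)
file:write-binary("greeting.bin", xs:base64Binary("SGVsbG8="))
```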

4.17 file:write-text

file:write-text($file as xs:string,
                $value as xs:string) as empty-sequence()
file:write-text($file as xs:string,
                $value as xs:string,
                $encoding as xs:string) as empty-sequence()

Writes a string to a file. If $file already exists, it will be overwritten; otherwise, it will be created.

The optional parameter $encoding, if not provided, is considered to be UTF-8.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:no-dir] is raised if the parent directory of $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:unknown-encoding] is raised if $encoding is invalid or not supported by the implementation.
[file:io-error] is raised if any other error occurs.

4.18 file:write-text-lines

file:write-text-lines($file as xs:string,
                      $values as xs:string*) as empty-sequence()
file:write-text-lines($file as xs:string,
                      $values as xs:string*,
                      $encoding as xs:string) as empty-sequence()

Writes a sequence of strings to a file, each followed by the system-dependent newline character. If $file already exists, it will be overwritten; otherwise, it will be created.

The optional parameter $encoding, if not provided, is considered to be UTF-8.

The function returns the empty sequence if the operation is successful.

This function is ·nondeterministic·.

[file:no-dir] is raised if the parent directory of $file does not exist.
[file:is-dir] is raised if $file points to a directory.
[file:unknown-encoding] is raised if $encoding is invalid or not supported by the implementation.
[file:io-error] is raised if any other error occurs.
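
Line-oriented formats can be written directly from a sequence of strings. The following sketch writes a small CSV file; the file name and the rows are hypothetical.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical rows: a header line followed by two records. :)
let $rows := ("id,name", "1,alpha", "2,beta")
(: Each string becomes one line, terminated by the system-dependent newline. :)
return file:write-text-lines("out.csv", $rows, "UTF-8")
```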

5 Paths

None of the functions in this section performs any check regarding the existence of the received or returned paths.

5.1 file:name

file:name($path as xs:string) as xs:string

Returns the name of a file or directory.

An empty string is returned if the path points to the root directory, or if it contains no directory separators.

This function is ·deterministic· (no path existence check is made).

5.2 file:parent

file:parent($path as xs:string) as xs:string?

Transforms the given path into an absolute path, as specified by file:resolve-path, and returns the parent directory.

An empty sequence is returned if the path points to a root directory.

This function is ·nondeterministic·.

5.3 file:base-name

file:base-name($path as xs:string) as xs:string
file:base-name($path as xs:string,
               $suffix as xs:string) as xs:string

Returns the last component from $path, deleting any trailing directory separators. If $path consists entirely of directory separators, the empty string is returned. If $path is the empty string, the string "." is returned, signifying the current working directory.

If $suffix is present, it will be trimmed from the end of the result. This can be used to eliminate file extensions.

No path existence check is made.

5.4 file:dir-name

file:dir-name($path as xs:string) as xs:string

This function returns a string denoting the parent directory of $path. Any trailing directory separators are not counted as part of the directory name. If the specified string is empty or contains no directory separators, it is replaced with a single dot (.), signifying the current directory. If the resulting string does not end with a directory separator, it will be suffixed with the system-dependent directory separator.

No path existence check is made.

5.5 file:path-to-native

file:path-to-native($path as xs:string) as xs:string

Transforms a URI, an absolute path, or relative path to a canonical, system-dependent path representation. A canonical path is both absolute and unique and thus contains no redirections such as references to parent directories or symbolic links.

No path existence check is made.

If the resulting path points to a directory, it will be suffixed with the system-specific directory separator.

This function is ·nondeterministic·.

[file:io-error] is raised if an error occurs while trying to obtain the native path.

5.6 file:path-to-uri

file:path-to-uri($path as xs:string) as xs:anyURI

Transforms a file system path into a URI with the file:// scheme. If the path is relative, it is first resolved against the current working directory.

This function is ·deterministic· (no path existence check is made).

5.7 file:resolve-path

file:resolve-path($path as xs:string) as xs:string

Transforms a relative path into an absolute operating system path by resolving it against the current working directory.

No path existence check is made.

If the resulting path points to a directory, it will be suffixed with the system-specific directory separator.

This function is ·nondeterministic·.
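
The path functions can be compared side by side on one input. The sketch below uses a hypothetical relative path; the results depend on the current working directory and the operating system, so no output is shown.

```xquery
declare namespace file = "http://expath.org/ns/file";

(: Hypothetical relative path; it does not need to exist. :)
let $rel := "data/input.txt"
return (
  file:resolve-path($rel),   (: absolute path below the current working directory :)
  file:path-to-uri($rel),    (: the same location expressed as a file URI :)
  file:parent($rel)          (: absolute path of the containing directory :)
)
```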

6 System Properties

6.1 file:dir-separator

file:dir-separator() as xs:string

Returns the value of the operating system-specific directory separator, which usually is / on UNIX-based systems and \ on Windows systems.

This function is ·nondeterministic·.

6.2 file:line-separator

file:line-separator() as xs:string

Returns the value of the operating system-specific line separator, which usually is &#10; on UNIX-based systems, &#13;&#10; on Windows systems and &#13; on Mac systems.

This function is ·nondeterministic·.

6.3 file:path-separator

file:path-separator() as xs:string

Returns the value of the operating system-specific path separator, which usually is : on UNIX-based systems and ; on Windows systems.

This function is ·nondeterministic·.

6.4 file:temp-dir

file:temp-dir() as xs:string

Returns the path to the default temporary-file directory of an operating system.

This function is ·nondeterministic·.
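
The system properties are mainly useful for building strings that are portable across platforms. The following sketch uses hypothetical file and directory names, and relies on the rule from Section 2.2 that returned directory paths end with the directory separator.

```xquery
declare namespace file = "http://expath.org/ns/file";

(
  (: A scratch file path inside the default temporary directory. :)
  concat(file:temp-dir(), "scratch.txt"),
  (: A search path joined with the platform's path separator. :)
  string-join(("/usr/bin", "/bin"), file:path-separator()),
  (: Two lines of text joined with the platform's line separator. :)
  string-join(("first line", "second line"), file:line-separator())
)
```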

A References

XSLT and XQuery Serialization 3.0
XSLT and XQuery Serialization 3.0. Henry Zongaro. W3C Working Draft 14 December 2010.
XQuery and XPath Functions and Operators 3.0
XPath and XQuery Functions and Operators 3.0. Michael Kay. W3C Working Draft 14 December 2010.

B Summary of Error Conditions

file:not-found
The specified path does not exist.
file:exists
The specified path already exists.
file:no-dir
The specified path does not point to a directory.
file:is-dir
The specified path points to a directory.
file:unknown-encoding
The specified encoding is not supported.
file:out-of-range
The specified offset or length is negative, or the chosen values would exceed the file bounds.
file:io-error
A generic file system error occurred.