XQuery Update

From BaseX Documentation

This article is part of the XQuery Portal. It summarizes the update features of BaseX.

BaseX offers a complete implementation of the XQuery Update Facility (XQUF). This article gives a brief introduction to XQUF: it first presents examples of updating expressions and then addresses the challenges that arise from the functional semantics of the language.

Features

Updating Expressions

XQUF adds five expressions for modifying data: insert, delete, replace, rename and transform. The first four are largely self-explanatory. The transform expression differs in that the nodes to be modified are copied in advance, so the original databases remain untouched.

An expression consists of one or more target nodes (the nodes we want to alter) and (depending on the expression type) additional information like nodes to be inserted, a QName, etc., and optional modifiers. You can find a few examples and additional information below.

insert

insert node (attribute { 'a' } { 5 }, 'text', <e/>) into /n

The insert expression adds a sequence of nodes to a single target node. Modifiers specify the exact insert location: into, as first into, as last into, before and after.

Note: in most cases, as last and after are evaluated faster than as first and before.
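The following sketch shows the modifiers side by side. It is wrapped in a copy/modify/return expression (see Main-Memory Updates) so that it runs without a database; the element names are made up for illustration.

```xquery
(: Illustrative only: <n>, <x/> and the inserted element names are assumptions. :)
copy $n := <n><x/></n>
modify (
  insert node <first/> as first into $n,   (: becomes the first child of <n> :)
  insert node <last/> as last into $n,     (: becomes the last child of <n> :)
  insert node <before/> before $n/x,       (: becomes the preceding sibling of <x/> :)
  insert node <after/> after $n/x          (: becomes the following sibling of <x/> :)
)
return $n
```

The children of the returned <n> element appear in the order first, before, x, after, last.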

delete

delete node //n

The example query deletes all <n> elements in your database. In contrast to other updating expressions, multiple nodes can be supplied as a target.
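As a minimal sketch of deleting several targets at once, the query below works on a copied main-memory fragment instead of a database. The <doc> and <keep/> elements are hypothetical.

```xquery
(: Illustrative fragment: every <n> element, nested or not, is a delete target. :)
copy $c := <doc><n/><keep/><n><n/></n></doc>
modify delete node $c//n
return $c
```

Only <keep/> remains as a child of <doc>. The nested <n> element disappears together with its parent.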

replace

replace node /n with <a/>

The target element is replaced by the new element <a/>. You can also replace the value of a node and its descendants by using the modifier value of:

replace value of node /n with 'newValue'

All descendants of /n are deleted, and the supplied text is inserted as the only child. The result of the insert sequence is either a single text node or an empty sequence. If the insert sequence is empty, all descendants of the target are deleted. Consequently, replacing the value of a node leaves the target with either a single text node or no descendants at all.
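The empty-sequence case can be sketched as follows. The example assumes a small main-memory element with an id attribute, both chosen for illustration only.

```xquery
(: Illustrative only: the element content and the id attribute are assumptions. :)
copy $n := <n id="1"><a/>text<b/></n>
modify (
  replace value of node $n/@id with '2',   (: an attribute simply receives the new value :)
  replace value of node $n with ()         (: an empty sequence removes all children :)
)
return $n
```

The returned element keeps its attribute, now with the value 2, and has no children left.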

rename

for $n in //originalNode
return rename node $n as 'renamedNode' 

All originalNode elements are renamed. A loop can be used to modify multiple nodes within a single statement. Nodes on the descendant or attribute axis of the target are not affected.
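The last point can be checked with a small main-memory sketch in which the target carries an attribute and a child of the same name. The id attribute is an assumption added for illustration.

```xquery
(: Only the target itself is renamed; its attribute and its child keep their names. :)
copy $c := <originalNode id="1"><originalNode/></originalNode>
modify rename node $c as 'renamedNode'
return $c
```

The result is a <renamedNode> element that still has the id attribute and an <originalNode/> child.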

Main-Memory Updates

With the following expressions, copies of nodes are created, which can then be modified with the already presented updating expressions. As the original node will not be changed, the expressions are called non-updating.

copy/modify/return

copy $c := doc('example.xml')//originalNode
modify rename node $c as 'copyOfNode'
return $c

A copy of the originalNode element is created, renamed and returned; the original document will not be updated.

In the following example, multiple update operations are performed on the copied node:

Query
copy $c :=
  <entry>
    <title>Transform expression example</title>
    <author>BaseX Team</author>
  </entry>
modify (
  replace value of node $c/author with 'BaseX',
  replace value of node $c/title with concat('Copy of: ', $c/title),
  insert node <author>Joey</author> into $c
)
return $c
Result
<entry>
  <title>Copy of: Transform expression example</title>
  <author>BaseX</author>
  <author>Joey</author>
</entry>

Instead of the main-memory <entry> element, a database node can be supplied:

copy $c := (db:get('example')//entry)[1]
...

In this case, the database node remains untouched, as all updates are performed on the node copy.

Entire documents can be copied and modified:

copy $doc := doc("zaokeng.kml")
modify (
  for $point in $doc//*:Point
  return insert node (
    <extrude>1</extrude>,
    <altitudeMode>relativeToGround</altitudeMode>
  ) before $point/*:coordinates
)
return $doc

update

The update expression is a BaseX-specific convenience operator for the bulky copy/modify/return construct. Similar to the XQuery 3.0 map operator, each node resulting from the first expression is bound as the context item, and the expression in curly braces performs updates on that item. The updated nodes are returned as the result:

for $item in db:get('data')//item
return $item update {
  delete node ./text()
}

If multiple nodes are supplied as input, the updates are performed on each node in turn:

db:get('data')//item update {
  delete node text()
}

It is easy to chain subsequent update expressions:

<root/> update {
  insert node <child/> into .
} update {
  insert node "text" into child
}

transform with

The transform with expression was added to the current XQuery Update 3.0 working draft. It is a simplified version of the update expression (it is limited to single input nodes and cannot be chained):

<xml>text</xml> transform with {
  replace value of node . with 'new-text'
}
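For comparison, the same result can be obtained with the longer copy/modify/return construct. This is a sketch of the equivalence, not a statement about how BaseX rewrites the expression internally.

```xquery
(: Long form of the transform with example above; $c is an arbitrary variable name. :)
copy $c := <xml>text</xml>
modify replace value of node $c with 'new-text'
return $c
```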

Functions

Built-in Functions

Numerous Database Functions exist in BaseX for performing document- and database-wide updates.

XQUF provides a single function fn:put() for serializing nodes to secondary storage:

  • The function will be executed after all other updates.
  • Serialized documents therefore reflect all changes made effective during a query.
  • No files will be created if the addressed nodes have been deleted.
  • Serialization parameters can be specified as third argument (more details are found in the XQUF 3.0 Specification).

If you want to write intermediate results to files, it is more flexible to use file:write.
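A minimal sketch of fn:put in combination with another update is shown below. It assumes that a file example.xml with originalNode elements exists. The target name example-renamed.xml is made up.

```xquery
(: Assumption: example.xml exists; example-renamed.xml is a hypothetical target path. :)
let $doc := doc('example.xml')
return (
  (: collected as update primitives first :)
  for $n in $doc//originalNode
  return rename node $n as 'renamedNode',
  (: executed after all other updates :)
  put($doc, 'example-renamed.xml')
)
```

Since fn:put runs last, the written file should already contain the renamed elements.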

User-Defined Functions

Functions that perform updates need to be marked with an %updating annotation:

declare %updating function local:add($target, $node) {
  insert node $node into $target
};

<node/> update {
  local:add(., <sub/>)
}

If update operations are defined in an anonymous function, it may be necessary to call the function with an additional updating keyword:

let $add := %updating function($target, $node) {
  insert node $node into $target
}
return <node/> update {
  updating $add(., <sub/>)
}

Concepts

In addition to simple expressions, XQUF introduces updating expressions:

  • All existing expressions are simple expressions. If such an expression is evaluated, the result is a sequence of items.
  • Updating expressions, which are presented in this article, result in a list of update primitives that are added to the Pending Update List.

Pending Update List

Updating statements are not executed immediately, but are first collected as update primitives within a set-like structure, the so-called Pending Update List (PUL). After the evaluation of the query, and after some consistency checks and optimizations, the update primitives will be applied in the following order:

If an inconsistency is found, an error message is returned and all accessed databases remain untouched (ensuring atomicity). For the user, this means that updates are only visible after the end of a snapshot.
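One such inconsistency is renaming the same node twice. The sketch below provokes this on a main-memory copy and catches the error. The element names and the message text are chosen for illustration.

```xquery
(: Two rename primitives for the same target are incompatible, :)
(: so the whole pending update list is rejected and nothing is applied. :)
try {
  copy $c := <a/>
  modify (
    rename node $c as 'b',
    rename node $c as 'c'
  )
  return $c
} catch * {
  'Rejected: ' || $err:code
}
```

The caught error code should be XUDY0015, which the XQUF specification defines for a node that is the target of more than one rename expression.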

It may be surprising to see db:create in the lower part of this list. This means that a newly created database cannot be accessed by the same query, which can be explained by the semantics of updating queries: all expressions can only be evaluated on databases that already exist while the query is evaluated. As a consequence, db:create is mainly useful in the context of Command Scripts, or Web Applications, in which a redirect to another page can be triggered after having created a database.

Example

The query…

insert node <b/> into /doc,
/doc/* ! (rename node . as 'renamed')

…applied on the document…

<doc> <a/> </doc>

…results in the following document:

<doc> <renamed/><b/> </doc>

Despite explicitly renaming all child nodes of <doc/>, the former <a/> element is the only one to be renamed. The <b/> element is inserted within the same snapshot and is therefore not yet visible to the user.

Returning Results

By default, it is not possible to mix different types of expressions in a query result. The root expression of a query must be a sequence of updating expressions. But there are two ways out:

  • The BaseX-specific update:output function bridges this gap: it caches the results of its arguments at runtime and returns them after all updates have been processed. The following example performs an update and returns a success message:
update:output("Update successful."), insert node <c/> into doc('factbook')/mondial
  • With MIXUPDATES, all updating constraints will be turned off. Returned nodes will be copied before they are modified by updating expressions. An error is raised if items are returned within a transform expression.

If you want to modify nodes in main memory, you can use the transform expression.

Effects

Original Files

In BaseX, all updates are performed on database nodes or in main memory. By default, update operations do not affect the original input file (the info string "Updates are not written back" appears in the query info to indicate this). The following solutions exist to write XML documents and binary resources to disk:

  • Updates on main-memory instances of files that have been retrieved via fn:doc or fn:collection will be propagated back to disk if WRITEBACK is turned on. This option can also be activated on command line via -u. Make sure you back up the original documents before running your queries.
  • Functions like fn:put or file:write can be used to write single XML documents to disk. With file:write-binary, you can write binary resources.
  • The EXPORT command can be used to write all resources of a database to disk.
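As a sketch of the second option, an updated main-memory copy can be written to a new file with file:write, leaving the input file as it is. Both file names are assumptions.

```xquery
(: Assumption: example.xml exists; example-updated.xml is a hypothetical output path. :)
let $updated := doc('example.xml') update {
  delete node .//comment()   (: the update is applied to a copy of the document :)
}
return file:write('example-updated.xml', $updated)
```

Writing to a separate path keeps the original document available as a backup.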

Indexes

Index structures are discarded after update operations when UPDINDEX is turned off (which is the default). More details are found in the article on Indexing.

Error Messages

Along with the Update Facility, a number of new error codes and messages have been added to the specification and BaseX. All errors are listed in the XQuery Errors overview.

Please remember that the collected updates will be executed after the query evaluation. All logical errors will be raised before the updates are actually executed.

Changelog

Version 10.0
  • Updated: db:put-binary is executed before XQuery Update expressions.
  • Updated: update: Curly braces are now mandatory.
Version 9.0
Version 8.5
Version 8.0
  • Added: MIXUPDATES option for Returning Results in updating expressions
  • Added: information message if files are not written back
Version 7.8
  • Added: update convenience operator