XQuery/All Leaf Paths
Motivation
You want to generate a list of all leaf paths in a document or document collection.
This process is a useful way to get to know a new data set. In data-style markup, the leaf elements of an XML file carry much of the data, and they frequently carry most of the semantics, or meaning, within the document. They form the basis for a semantic inventory of the document: each leaf element should be associated with a data definition.
Leaf elements are also good targets for indexing within your index configuration file.
Example
Method
We will use the functx leaf-elements() function:
functx:leaf-elements($root as node()?) as element()*
This function takes a single optional node as input and returns a sequence of the leaf elements found at or below that node.
Example Output
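For the demo play Hamlet, which is included in the eXist demo set as the file /db/shakespeare/plays/hamlet.xml, the following output is generated. Note that this list includes container elements such as PLAY and ACT, so it shows every distinct element name in the document rather than only the leaf elements.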
PLAY
TITLE
FM
P
PERSONAE
PERSONA
PGROUP
GRPDESCR
SCNDESCR
PLAYSUBT
ACT
SCENE
STAGEDIR
SPEECH
SPEAKER
LINE
Source Code to leaf-elements
declare namespace functx = "http://www.functx.com";
declare function functx:leaf-elements ($root as node()?) as element()* {
    $root/descendant-or-self::*[not(*)]
};
This query uses the descendant-or-self::* axis step with the predicate [not(*)] to select only elements that have no child elements. Elements that contain only text, as well as empty elements, both qualify.
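To see what the predicate selects, here is a minimal sketch that runs the function against a small inline document. The sample data is hypothetical and is not taken from the eXist demo set; it only borrows a few element names from the play markup.

```xquery
xquery version "1.0";
declare namespace functx = "http://www.functx.com";

declare function functx:leaf-elements($root as node()?) as element()* {
    $root/descendant-or-self::*[not(*)]
};

(: Hypothetical sample data, used only to illustrate the predicate :)
let $sample :=
    <PLAY>
        <TITLE>Sample</TITLE>
        <ACT>
            <SCENE>
                <STAGEDIR>Enter</STAGEDIR>
                <LINE>Stand and unfold yourself</LINE>
                <LINE/>
            </SCENE>
        </ACT>
    </PLAY>
return
    (: One name per leaf element, in document order, duplicates included :)
    for $leaf in functx:leaf-elements($sample)
    return local-name($leaf)
```

For this sample the result is TITLE, STAGEDIR, LINE, LINE. PLAY, ACT and SCENE are skipped because each has at least one child element, while the empty LINE element still counts as a leaf.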
Example XQuery
xquery version "1.0";
declare namespace functx = "http://www.functx.com";

(: Returns the distinct local names of all elements at or below $nodes :)
declare function functx:distinct-element-names($nodes as node()*) as xs:string* {
    distinct-values($nodes/descendant-or-self::*/local-name(.))
};

let $doc := doc('/db/shakespeare/plays/hamlet.xml')
let $distinct-element-names := functx:distinct-element-names($doc)
return
    <ol>{
        (: One list item per element name, sorted alphabetically :)
        for $distinct-element-name in $distinct-element-names
        order by $distinct-element-name
        return
            <li>{$distinct-element-name}</li>
    }</ol>
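The query above lists element names rather than paths. As a rough sketch of how the leaf paths named in the title could be produced, the leaf predicate can be combined with the ancestor-or-self axis. The functions local:path-to and local:distinct-leaf-paths are hypothetical helpers written for this illustration and are not part of the functx library; the inline sample is also made up, and could be replaced with doc('/db/shakespeare/plays/hamlet.xml') when running inside eXist.

```xquery
xquery version "1.0";

(: Hypothetical helper: slash-separated element names from the root
   element down to $node, for example /PLAY/ACT/SCENE/LINE :)
declare function local:path-to($node as element()) as xs:string {
    concat('/', string-join($node/ancestor-or-self::*/local-name(.), '/'))
};

(: Hypothetical helper: distinct paths of all elements that have
   no child elements :)
declare function local:distinct-leaf-paths($root as node()?) as xs:string* {
    distinct-values(
        for $leaf in $root/descendant-or-self::*[not(*)]
        return local:path-to($leaf)
    )
};

(: Hypothetical sample data :)
let $sample :=
    <PLAY>
        <TITLE>Sample</TITLE>
        <ACT>
            <SCENE>
                <STAGEDIR>Enter</STAGEDIR>
                <LINE>Stand and unfold yourself</LINE>
                <LINE/>
            </SCENE>
        </ACT>
    </PLAY>
for $path in local:distinct-leaf-paths($sample)
order by $path
return $path
```

For this sample the result is /PLAY/ACT/SCENE/LINE, /PLAY/ACT/SCENE/STAGEDIR and /PLAY/TITLE. The paths use local names only, so elements that differ only by namespace would be merged; using name(.) instead would keep any prefixes.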
Adding Attributes
You can also run a query that returns all the distinct attribute names. Attributes are always leaf data, since they can never have child elements.
declare function functx:distinct-attribute-names($nodes as node()*) as xs:string* {
    distinct-values($nodes//@*/name(.))
};
This query says, in effect, "get all the distinct attribute names in the input nodes".
For the MODS demo file: doc('/db/mods/01c73f2b05650de2e6124d9d113f40be.xml')
You will get the following attributes:
- type
- encoding
- authority
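The attribute query can also be tried without the demo database. The following is a minimal sketch that runs it against a hypothetical, MODS-like inline fragment (no namespace declared, and not the demo file referenced above) and sorts the resulting names.

```xquery
xquery version "1.0";
declare namespace functx = "http://www.functx.com";

declare function functx:distinct-attribute-names($nodes as node()*) as xs:string* {
    distinct-values($nodes//@*/name(.))
};

(: Hypothetical MODS-like sample data, used only for illustration :)
let $sample :=
    <mods>
        <titleInfo type="alternative"><title>Example</title></titleInfo>
        <subject authority="lcsh"><topic>XQuery</topic></subject>
        <dateIssued encoding="w3cdtf">2010</dateIssued>
        <genre authority="marcgt">book</genre>
    </mods>
return
    (: Each attribute name appears once, even if used on several elements :)
    for $name in functx:distinct-attribute-names($sample)
    order by $name
    return $name
```

For this sample the result is authority, encoding, type. The authority attribute is reported once even though it occurs on two different elements.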
References
Documentation on the xqueryfunctions.com web site.