Developing MarkLogic Applications I - XQuery

These are my notes from the MarkLogic training course Developing MarkLogic Applications I - XQuery.

Day 1: Learning XQuery Language

Understanding MarkLogic and XQuery

MarkLogic is good at supporting lots of data types/formats (XML, JSON, binaries, geospatial…)

Data can be enriched in the database, e.g. reverse geocoding coordinates and storing the result in the db. Enrichment can occur while loading the data or afterwards, once the initial data is loaded. The additional data enables faceted search, like Amazon's search constraints.

MarkLogic can use (semantic) triples to enrich stored data. Triples are freely available on the open web (e.g. DBpedia).

Triple :: Subject - Predicate - Object (value). E.g. US:capital:Washington

MarkLogic's value proposition is its ability to integrate data from many data sources and silos.

MarkLogic :: an ACID, transactional database.

Getting Started with XQuery

Course materials were/are available on the VM and at ftp://ftp.marklogic.com/outgoing/training/ml8-d1x.zip

Describe XQuery, XPath, XSLT and their relationship
  • XQuery
    • Server side
    • The application server can evaluate the code
    • The application server is also a web server
    • Is a functional programming language

XQuery is one of the available languages in ML. The core logic of XQuery is the FLWOR expression. XQuery files use the extension .xqy or .xq. XQuery is a W3C-specified language, but ML adds additional language extensions.

  • XPath
    • Ability to navigate and extract data
    • Visualises XML as a tree structure
  • XSLT
    • Designed for transformations
Load XML documents into a MarkLogic Server database
  • XQuery data loading methods
    xdmp :: XML data management platform
    xdmp:document-load()  ;; low-level
    info:load()  ;; higher-level
    
    xdmp:document-insert()
    
    ;; WebDAV is slow
    ;; Record Loader / MLCP is lightweight & faster
    
Install an XQuery editor

Lots of options are available. Some tools are provided by MarkLogic.

  • Notepad++,
  • Oxygen,
  • Eclipse + XQDT
Create an application using XQuery
  • Anatomy of an XQuery file

    Each file has a prolog & query body.

    1. Prolog
      1. Version declaration
      2. Namespace declaration
      3. Module import statements
      4. Variable & function declaration
    2. Query Body
  • XQuery Syntax
    • Semicolons terminate each declaration in the prolog
    • Markup and presentation can be mixed in the query body.
      • The app server evaluates anything in {} as XQuery code
    • $someVariable
    • declare variable $name := "asdf";
    • (: This is a comment. :)
    • (1,2,3) (Sequences can be empty)
    • Semicolons in the query body are transaction separators
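
The anatomy and syntax points above can be put together in one small file. This is a minimal sketch: the namespace URI, the local:greet function and the element names are made up for illustration and are not from the course material.

```xquery
xquery version "1.0-ml";                            (: version declaration :)

(: namespace declaration (hypothetical URI) :)
declare namespace ex = "http://example.com/notes";

(: variable declaration; each prolog declaration ends with a semicolon :)
declare variable $name as xs:string := "asdf";

(: function declaration (hypothetical helper) :)
declare function local:greet($who as xs:string) as xs:string
{
  fn:concat("Hello, ", $who)
};

(: query body: markup mixed with XQuery; anything inside {} is evaluated :)
<ex:result>
  <greeting>{ local:greet($name) }</greeting>
  <items>{ fn:count((1, 2, 3)) }</items>
  <empty>{ fn:count(()) }</empty>
</ex:result>
```

The greeting element comes back containing "Hello, asdf", and the two counts are 3 and 0, the second showing that an empty sequence is a valid value.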
Unit 2: Getting started with XQuery
  • Install MarkLogic
    1. Run the MSI installer.
    2. Start the MarkLogic service
    3. Set up the server at localhost:8001
      1. Skip the cluster setup
xquery version "1.0-ml";
declare variable $count := 4;
xdmp:set-response-content-type("text/html; charset=utf-8"),
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
<h3>{("There are ", $count, " tickets left.")}</h3>

(: Import the MarkLogic Admin API library module (in a real file, imports go in the prolog). :)
import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";

Writing XPath Expressions

xdmp:document-insert(
  "/library.xml",
  <bookstore>
    <book category="COOKING">
      <title xml:lang="en">Everyday Italian</title>
      <author>Giada De Laurentiis</author>
      <price>20.00</price>
    </book>
  </bookstore>
)

MarkLogic will recognise the xml:lang="en" attribute in XML. This allows multi-language content. MarkLogic can use this setting to change things like stemming rules while searching. If it is not defined, the default language for the database is used.

fn:doc("/library.xml")
doc("/library.xml")  (: the `fn` prefix is optional because it is the default function namespace. :)
fn:doc()  (: This returns all documents from the database. :)

XPath allows navigating XML. An XPath expression is a sequence of steps.

(: xpath can be dropped into expression :)
xquery version "1.0-ml";
/bookstore/book/title/string()

Not defining a namespace puts the document's elements into the empty namespace.

(: update by overwriting the document at the same URI :)
xdmp:document-insert(
  "/library.xml",
  <bookstore xmlns="http://www.marklogic/bookstore">
    <book category="COOKING">
      <title xml:lang="en">Everyday Italian</title>
      <author>Giada De Laurentiis</author>
      <price>20.00</price>
    </book>
  </bookstore>
)

A namespace-qualified name (prefix:local-name) is called a QName in XQuery.

(: xpath can be dropped into expression :)
declare namespace bks = "http://www.marklogic/bookstore";
/bks:bookstore/bks:book/bks:title/string()

declare default element namespace "http://www.marklogic/bookstore";
/bookstore/book/title/string()

Results from XPath come back in document order, i.e. not sorted. Results from search do not; by default they are ordered by relevance.

/bks:bookstore/bks:book[bks:price < 30]/bks:title/string()
/bks:bookstore/bks:book[bks:price le 30]/bks:title/string()
(: successive predicates are ANDed; [1] keeps the first match, note the 1-based index :)
/bks:bookstore/bks:book[bks:price le 30][1]/bks:title/string()
(: `//` :: abbreviation for the descendant-or-self axis, i.e. matches at any depth :)
(: `following-sibling` :: axis of later nodes at the same level in the hierarchy :)
//bks:title[following-sibling::bks:price < 30]/text()
/bks:bookstore/bks:book/@category
/bks:bookstore/bks:book[@category='COOKING']
(: an attribute node cannot be serialized on its own; take its string value :)
fn:string(/bks:bookstore/bks:book[@category='COOKING']/@category)
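
The path expressions above run against whatever is in the database. For experimenting without loading anything, the same steps can be tried on an in-memory element. This is a sketch; the second book is made-up sample data added so the predicates have something to filter.

```xquery
xquery version "1.0-ml";
declare namespace bks = "http://www.marklogic/bookstore";

let $doc :=
  <bookstore xmlns="http://www.marklogic/bookstore">
    <book category="COOKING">
      <title>Everyday Italian</title>
      <price>20.00</price>
    </book>
    <book category="WEB">
      <title>Learning XML</title>
      <price>39.95</price>
    </book>
  </bookstore>
return (
  (: value predicate: only the book cheaper than 30 :)
  $doc/bks:book[bks:price < 30]/bks:title/fn:string(),
  (: positional predicate (1-based) plus attribute step :)
  $doc/bks:book[2]/@category/fn:string(),
  (: descendant step: every title at any depth :)
  fn:count($doc//bks:title)
)
```

Here $doc is the bookstore element itself, so the paths start at its children rather than at a document root. The three results are "Everyday Italian", "WEB" and 2.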

Creating FLWOR Expressions

FOR :: creates a sequence of tuples
LET :: binds a sequence to a variable
WHERE :: filters the tuples on a boolean expression
ORDER BY :: sorts the tuples
RETURN :: gets evaluated once for every tuple
Requirements for a FLWOR statement
  1. At least one FOR or LET clause, and
  2. One RETURN clause
  3. WHERE and ORDER BY are optional
FLWOR by example
for $i in 1 to 8
let $squared := $i * $i
where $squared >= 25
order by $i descending
return $i
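
The same five clauses work over XML as well as over numbers. A small sketch, using made-up book elements held in a variable so that it runs without any documents in the database:

```xquery
xquery version "1.0-ml";

let $books := (
  <book><title>Everyday Italian</title><price>20.00</price></book>,
  <book><title>Learning XML</title><price>39.95</price></book>,
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
)
for $book in $books                      (: one tuple per book :)
let $price := xs:decimal($book/price)    (: bound once per tuple :)
where $price gt 25                       (: drops the 20.00 book :)
order by $price descending
return fn:concat($book/title, ": ", $price)
```

This returns "XQuery Kick Start: 49.99" followed by "Learning XML: 39.95". Casting the price with xs:decimal makes the comparison and the sort numeric rather than string based.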

In XQuery, everything is a sequence of items.

(: let binds the whole sequence to $i once :)
let $i := (1,2,3)
return $i

(: for binds $i to each item in turn :)
for $i in (1,2,3)
return $i
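
Both snippets above print 1 2 3, so the difference between let and for is easy to miss. Counting the items seen inside each return makes it visible. The variable names are just for this sketch.

```xquery
xquery version "1.0-ml";

let $seq := (1, 2, 3)
(: let: the return runs once and sees all three items :)
let $from-let := (let $i := $seq return fn:count($i))
(: for: the return runs three times and sees one item each time :)
let $from-for := (for $i in $seq return fn:count($i))
return ($from-let, $from-for)
```

The result is 3 followed by 1, 1, 1.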

Writing Conditional Expressions

An if expression always has all three parts: IF, THEN and ELSE. They can be nested.

for $t in (5, 22, 35)
return
  if ($t < 10)
  then "It's cold."
  else
    if ($t < 30)
    then "It's nice."
    else "It's warm."

Using XQuery Functions and Operators

Function
  • namespace reference
  • MarkLogic dialect of XQuery
    • Out of the box:
      cts :: core text search
      xs :: XML Schema
      xdmp :: database & app server
      fn :: default function namespace

text() is a node test and string() is a function. text() selects only an element's own text nodes and does not include nested children. string() does include the text of nested children.

xquery version "1.0-ml";
let $var := <rootnode>ABC <childnode id="99">123</childnode>XYZ</rootnode>
return (
  $var/text(),   (: the text nodes "ABC " and "XYZ" :)
  $var/string()  (: "ABC 123XYZ" :)
)

let $xml :=
  document {
    <customer>
      <phone>1112223333</phone>
    </customer>
  }
let $phone := fn:string($xml/customer/phone/text())
return fn:replace($phone, "(\d{3})(\d{3})(\d{4})", "($1) $2-$3")
Other 'helpful' string/generic functions
  • string
    • fn:substring
    • fn:string-join
    • fn:replace
    • fn:tokenize
  • generic
    • fn:contains
    • fn:string
    • fn:data
    • fn:distinct-values
  • date
    xs:date :: the date type ('yyyy-MM-dd')
    fn:current-date :: today in typed date format
    fn:days-from-duration :: the days component of a duration, e.g. the difference between two dates
  • http
    • xdmp:get-request-field("form-id")
  • math
    • fn:floor
    • xs:integer
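
A few of the functions listed above in one query, as a quick sketch. The input strings and dates are made up, and the integer conversion uses the xs:integer constructor.

```xquery
xquery version "1.0-ml";

let $csv    := "red,green,blue,green"
let $tokens := fn:tokenize($csv, ",")
let $start  := xs:date("2015-01-01")
let $end    := xs:date("2015-03-01")
return (
  fn:substring("MarkLogic", 1, 4),          (: "Mark", positions are 1-based :)
  fn:count($tokens),                        (: 4 :)
  fn:count(fn:distinct-values($tokens)),    (: 3, the duplicate "green" is dropped :)
  fn:string-join($tokens, "|"),             (: "red|green|blue|green" :)
  fn:contains($csv, "blue"),                (: true :)
  fn:days-from-duration($end - $start),     (: 59 :)
  fn:floor(3.7),                            (: 3 :)
  xs:integer("42") + 1                      (: 43 :)
)
```

Subtracting one xs:date from another gives a duration, which is why fn:days-from-duration is the function that turns a date difference into a number of days.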

div is the division operator in XQuery (it is an operator, not an fn: function). Using / for division would be super fun when dealing with XML paths.
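
A short sketch of the arithmetic operators that stand in for / and %:

```xquery
xquery version "1.0-ml";

(
  10 div 4,    (: 2.5 :)
  10 idiv 4,   (: 2, integer division :)
  10 mod 4     (: 2, remainder :)
)
```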