Developing MarkLogic Applications I - Search Concepts
This is a continuation of my notes from the MarkLogic training course Developing MarkLogic Applications I - XQuery.
Day 3: Search Concepts
Search
Everything in the Information Studio can be scripted via the admin module.
By default, cts:search is filtered.
cts:search( fn:collection(), cts:near-query((cts:word-query("cats"), cts:word-query("cats")), 2), "filtered" )
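For comparison, a minimal sketch of the same kind of search run unfiltered. The "books" collection and the word "dog" are only illustrative; the point is the third argument, which switches off the filtering pass that cts:search applies by default.

```xquery
xquery version "1.0-ml";

(: Unfiltered: results are resolved from the indexes alone. This is
   usually faster, but the results may include false positives when
   the indexes cannot fully resolve the query. :)
cts:search(
  fn:collection("books"),
  cts:word-query("dog"),
  "unfiltered"
)
```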
By default, reindexing occurs as soon as a configuration value is changed, i.e. reindexing is automatic.
xdmp:plan(cts:search(fn:collection("books"), cts:word-query("dog")))//qry:final-plan
Geospatial Search
The basis of geospatial search is latitude and longitude.
The basic MarkLogic type is cts:point.
Queries
cts:circle($radiusInMiles, cts:point(-33.8830, 151.2216))
cts:polygon($pointsSequence)
cts:box($south, $west, $north, $east)
cts:element-pair-geospatial-query()
cts:element-geospatial-query()
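As a rough sketch, the three region constructors can be built like this. The coordinates around the Sydney point are made up for illustration. Note that cts:point takes latitude first, and cts:box takes four numbers (south, west, north, east) rather than two corner points.

```xquery
xquery version "1.0-ml";

(: Illustrative coordinates only. :)
let $centre  := cts:point(-33.8830, 151.2216)
(: Radius is in miles by default. :)
let $circle  := cts:circle(10, $centre)
(: south, west, north, east :)
let $box     := cts:box(-34.0, 151.0, -33.7, 151.4)
(: A polygon is a sequence of vertices. :)
let $polygon := cts:polygon((
  cts:point(-34.0, 151.0),
  cts:point(-34.0, 151.4),
  cts:point(-33.7, 151.2)
))
return ($circle, $box, $polygon)
```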
Index Types
- Element -
<element> lat, long </element>
- Element Child -
<element> <child>lat, long</child> </element>
- Element Pair -
<element> <lat>lat</lat><long>long</long></element>
- Attribute -
<element lat="lat" long="long" />
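Each of these layouts has a matching query constructor. The sketch below pairs them up; the element and attribute names ("place", "pos", "lat", "lon") are assumptions for illustration, and running any of these queries in a search requires the corresponding geospatial index to be configured.

```xquery
xquery version "1.0-ml";

let $region := cts:circle(200, cts:point(25.0, -80))
return (
  (: Element: <place>lat, long</place> :)
  cts:element-geospatial-query(xs:QName("place"), $region),

  (: Element child: <place><pos>lat, long</pos></place> :)
  cts:element-child-geospatial-query(
    xs:QName("place"), xs:QName("pos"), $region),

  (: Element pair: <place><lat>lat</lat><lon>long</lon></place> :)
  cts:element-pair-geospatial-query(
    xs:QName("place"), xs:QName("lat"), xs:QName("lon"), $region),

  (: Attribute: <place lat="lat" lon="long"/> :)
  cts:element-attribute-pair-geospatial-query(
    xs:QName("place"), xs:QName("lat"), xs:QName("lon"), $region)
)
```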
Information Studio workspaces are XML.
Examples
cts:search( fn:collection(), cts:element-pair-geospatial-query( xs:QName("place"), xs:QName("lat"), xs:QName("lon"), cts:circle( 200, cts:point(25.0, -80) )))
xdmp:http-get("some-url", $options)
Snippets, Highlighting, Sorting and Pagination
Snippets
Search returns a snippeted result set. Each result includes a snippet that gives the context of the match.
Included snippet elements
- search:result
- snippet:sni
An option can be provided on the search request to apply a transform:
<options xmlns="http://marklogic.com/appservices/search"> <transform-results apply="snippet"> <preferred-elements> <element ns="http://marklogic.com/mlu/top-songs" name="descr" /> </preferred-elements> </transform-results> </options>
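A minimal sketch of passing these options to the Search API and pulling out just the snippets. The search term "beatles" is borrowed from the pagination example in these notes; everything else comes from the options node above.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Prefer the descr element when building each snippet. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <transform-results apply="snippet">
      <preferred-elements>
        <element ns="http://marklogic.com/mlu/top-songs" name="descr"/>
      </preferred-elements>
    </transform-results>
  </options>
(: Each search:result in the response carries a search:snippet. :)
return search:search("beatles", $options)//search:snippet
```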
Highlighting
The result set includes a highlight element around each matched string, which can then be wrapped in a class for styling.
Sort options
Sort options require range indexes.
The option node defines how the search results are sorted.
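As a sketch of what such an option might look like, the following sorts results by a title element. It assumes a string range index exists on a title element in the top-songs namespace with the default collation; both the element choice and the collation are assumptions, not something from the course material.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Assumes a string range index on title with this collation. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <sort-order type="xs:string"
                collation="http://marklogic.com/collation/"
                direction="ascending">
      <element ns="http://marklogic.com/mlu/top-songs" name="title"/>
    </sort-order>
  </options>
return search:search("beatles", $options)
```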
Pagination
- Default to 10 results
- Modified in the results set.
- search:response includes the total, start, and page-length attributes
search:search("beatles", (), 11)
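The third argument in the call above is the start position, so 11 is the first result of page two when the page length is 10. A small sketch of deriving the start from a page number; the variable names are my own.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

let $page        := 2
let $page-length := 10
(: Results are numbered from 1, so page 2 starts at result 11. :)
let $start       := ($page - 1) * $page-length + 1
let $response    := search:search(
  "beatles", (),
  xs:unsignedLong($start), xs:unsignedLong($page-length))
(: The response echoes the paging values back as attributes. :)
return (
  $response/@total,
  $response/@start,
  $response/@page-length
)
```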
Faceted Navigation
- Facets are grouped search results, produced by adding constraints to the query.
- Facets can be bucketed constraints (e.g. Decade 2010-2020). These have upper and lower bounds.
- Facets require a range index.
- String range indexes have a collation. A collation is a URI that identifies the string comparison rules (e.g. diacritics, case sensitivity).
- Facets are returned in the search results.
Creation of facets
- Create a constraint (with collation if needed)
- This may include buckets for bucket constraints.
- Configure the facet options.
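The steps above could be sketched as a bucketed range constraint like the one below. The year element, its namespace, and the bucket names are hypothetical, and the sketch assumes an xs:gYear range index on that element.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Hypothetical: assumes an xs:gYear range index on a year element.
   Each bucket has a lower (ge) and an upper (lt) bound. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <constraint name="decade">
      <range type="xs:gYear" facet="true">
        <element ns="http://marklogic.com/mlu/top-songs" name="year"/>
        <bucket name="2000s" ge="2000" lt="2010">2000s</bucket>
        <bucket name="2010s" ge="2010" lt="2020">2010s</bucket>
      </range>
    </constraint>
  </options>
(: Facet counts come back as search:facet elements in the response. :)
return search:search("beatles", $options)//search:facet
```

With this in place a query string such as "decade:2010s" would restrict the results to that bucket.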
Updating Content and Transactions
MVCC: Multi-Version Concurrency Control
- Inserts and Updates
On commit, the document is given a creation timestamp.
- The document and its indexes are inserted into the forest's in-memory stand, and
- An entry is made in the forest-level journal on disk.
Once the in-memory stand reaches a limit (document count, memory size, or another configurable ceiling), its documents are pushed from the in-memory stand into an on-disk stand.
Updates are the same except:
- The document URI already exists.
- The old document receives a delete timestamp equal to the new document's creation timestamp.
- Updates acquire a write lock on the document.
MarkLogic is an append-only database.
xdmp:document-insert("uri", $xml)
xdmp:node-replace(fn:doc("song1.xml")/top-song/title, <title>Trouble for Nothing</title>)
xdmp:node-insert-child(fn:doc("title")/book, <chapter no="2">...</chapter>)
xdmp:node-insert-after(fn:doc("title")/book/title, <author>Herman Melvill</author>)
xdmp:node-insert-child(fn:doc("moby_dick.xml")/book/author, attribute dob {"1819-08-01"})
xdmp:node-replace(fn:doc("moby_dick.xml")/book/author/@dob, attribute dob {"unknown"})
- Queries
Queries run at a timestamp, so the data as of that time is returned. This means queries never need to lock documents, because they read from a snapshot.
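One way to see this is to ask for the timestamp of the current request. A minimal sketch:

```xquery
xquery version "1.0-ml";

(: In a read-only (query) statement this returns the system timestamp
   the whole statement reads at. In an update statement it returns the
   empty sequence, because updates use locks instead of a snapshot. :)
xdmp:request-timestamp()
```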
- Merge / Update transactions
A merge combines stands together and purges old deleted documents from the system, so the merged stand only includes the current data.
- Delete
Delete marks the document with a delete timestamp. A delete is a document update.
xdmp:node-delete(fn:doc("moby_dick.xml")//chapter[2])
xdmp:node-delete(fn:doc("moby_dick.xml")/book/author/@dob)