Query Syntax
The filter query parameter filters the results of a GET request on an EMu based resource. The parameter defines a JSON based query used to restrict the records returned. Some of the operators may not be available due to restrictions imposed by EMu’s indexing system. Where this occurs it is noted in the documentation.
A query is defined as a JSON object containing zero or more conditions, where conditions are tied together with boolean operators. The simplest query is the empty object:
{}
This query returns all records, as no conditions have been imposed. If one or more restrictions are to be applied to the matching records then a boolean operator must be defined at the top level. The query language supports three boolean operators, namely:
- AND
- OR
- NOT
Each of these operators is a JSON array taking zero or more conditions or sub-queries. For example the query:
{
"AND": []
}
is a valid query that will result in all records matching. Queries can be nested so you can have something like:
{
"AND": [
"OR": [
"AND": []
]
]
}
The nesting of queries allows the full range of boolean operations to be constructed. A condition is a JSON object restricting values for a given field. The general format of a condition is:
{
"{column}": {
"{operator}": {
"{operand}": "{value}"
}
}
}
The {column} property is a fully qualified column name including the envelope around a matching record. The envelope being:
{
"id": "{id}",
"version": {version},
"data": {match}
}
If you wanted to search on the NamLast field in the Parties module then the column name would be data.NamLast. If you wanted to search on the id field then the column name would just be id.
The {operator} defines the how the {operand} and {value} are to match against the {column}. The next section will discuss each operator along with the expected operands and values. An example condition to restrict matches in the Parties module to those having a NamLast value of Smith would be:
{
"data.NamLast": {
"exact": {
"value": "Smith"
}
}
}
The full query required to find all Parties records with a surname of Smith would be:
{
"AND": [
{
"data.NamLast": {
"exact": {
"value": "Smith"
}
}
}
]
}
Since filter is a query parameter, it is part of the request URL. To conform with the list of acceptable characters in a URL, the filter value must be URL Encoded. All reserved and common characters must be percent encoded.
Each of the operators is discussed below.
contains
The contains operator looks for a word inside a text field. The match is case insensitive. The format of a contains condition is:
{
"{column}": {
"contains": {
"value": "{value}"
}
}
}
where {column} is the fully qualified column name to search and {value} is the word to find in the column.
range
The range operator finds records matching a given range of values. The operator can be used on date, time, integer, float, latitude and longitude columns. The format of a range condition is:
{
"{column}": {
"range": {
"gt": "{value}",
"gte": "{value}",
"lt": "{value}",
"lte": "{value}",
"mode": "{mode}"
}
}
}
The optional comparison operands are:
Operand | Description |
---|---|
gt | Greater than |
gte | Greater than or equal |
lt | Less than |
lte | Less than or equal |
The optional {mode} property defines how the {value} should be interpreted. Allowable values for {mode} are:
- date
- time
- latitude
- longitude
The type of the {value} must be consistent with the {column} type. If the {column} is of type integer then {value} must be an integer. For date, time, latitude and longitude values the {value} must be a string containing a correctly formatted value.
exact
The exact operator matches the complete contents of a column case insensitively. The format of an exact condition is:
{
"{column}": {
"exact": {
"value": "{value}",
"mode": "{mode}"
}
}
}
The optional {mode} property defines how the {value} should be interpreted. Allowable values for {mode} are:
- date
- time
- latitude
- longitude
If {mode} is not defined then {value} is interpreted as a string and a character by character match is performed. If {mode} is defined then the {value} matches on its actual value rather than its representation. For example a {value} of 2019-01-01 could be interpreted as either a string or a date depending on the {mode} setting.
exists
The exists operator tests whether or not a column contains a value. The format of an exists condition is:
{
"{column}": {
"exists": {
"value": {value},
}
}
}
where {value} is either true or false. A true value indicates the {column} should contain a value, while false specifies that the {column} has no value.
fuzzy
The fuzzy operator applies to text based searches and locates words that fall within a given Levenshtein distance. The format of a fuzzy condition is:
{
"{column}": {
"fuzzy": {
"value": "{value}",
"fuzziness": {fuzziness},
"prefix_length": {prefix_length},
"max_expansion": {max_expansion}
}
}
}
The optional operands are:
Operand | Description |
---|---|
fuzziness | Maximum Levenshtein distance to qualify for a match. A value of AUTO varies the fuzziness based on the length of the {value}. |
prefix_length | Number of leading characters to ignore when performing the fuzzy match. The default value is zero. |
max_expansion | Maximum number of terms a fuzzy search will exand to. The default value is fifty. |
The fuzzy operator is not supported by the shim as EMu does not support the operator.
geo_bounding_box
The geo_bounding_box operator matches against latitude and longitude columns for values residing in a geographical bound box. The format of a geo_bounding_box condition is:
{
"{column}": {
"geo_bounding_box": {
"top_left": {
"lat": {value},
"lon": {value}
},
"bottom_right": {
"lat": {value},
"lon": {value}
}
}
}
}
The lat/long {value} is expressed in decimal format.
The geo_bounding_box operator is not supported by the shim as EMu does not support the operator.
geo_distance
The geo_distance operator matches all records within a certain distance of a lat/long value. The format of a geo_distance condition is:
{
"{column}": {
"geo_distance": {
"lat": {value},
"lon": {value},
"distance": "{distance}"
}
}
}
The {distance} operand has a format of [value][unit] where value is a number and unit is one of the following:
- mi or miles
- yd or yards
- ft or feet
- in or inches
- km or kilometres
- m or metres
- cm or centimetres
- mm or millimetres
- NM or nauticalmiles
A value of 5km would find all records within five kilometres of the supplied lat/long value.
The geo_distance operator is not supported by the shim as EMu does not support the operator.
phonetic
The phonetic operator is a text based operator that looks for words with similar soundings. For example a search for Smythe would match Smith. The format of a phonetic condition is:
{
"{column}": {
"phonetic": {
"value": "{value}"
}
}
}
EMu has full support for phonetic searching. The retrieval time may vary depending on whether the {column} has phonetic indexing enabled.
phrase
The phrase operator is a text based operator that matches a list of consecutive words. The format of a phrase condition is:
{
"{column}": {
"phrase": {
"value": "{value}"
}
}
}
The {value} should consist of a number of words.
proximity
The proximity operator is applied to a text based {column} to locate words within a certain distance of each other. The format of a proximity condition is:
{
"{column}": {
"proximity": {
"value": "{value}",
"distance": {distance}
}
}
}
The {value} string is two or more words and the {distance} is the number of words within which all the words must appear. For example a distance of three indicates all the words in {value} must be within three words of each other.
regex
The regex operator uses a regular expression to identify matching records. The format of a regex condition is:
{
"{column}": {
"regex": {
"value": "{value}",
"flags": "{flags}"
}
}
}
The following regular expression meta characters are supported:
Operator | Description |
---|---|
. | Matches any single character |
? | Repeat the preceding character zero or one times |
+ | Repeat the preceding character one or more times |
* | Repeat the preceding character zero or more times |
{min,max} | Repeat the preceding character min to max times |
| | OR operator, matches of the longest pattern on the left side or right side |
(…) | Group a series of characters together to treat as a single character |
[…] | Match one of the characters in the square brackets. A range of characters may be specified (e.g. [a-z]). A ^ before a character or range negates the character or range (e.g. [^a-z]) |
A complete list with examples can be found at the REGEX Cheat Sheet. The {flags} setting is not supported by EMu and any value supplied will be ignored.
stemmed
The stemmed operator is a text based operator matching words with the same root. For example, elect would match election, elected, electioneering, elects but would not match electricity. The format of a stemmed condition is:
{
"{column}": {
"stemmed": {
"value": "{value}",
}
}
}
The {value} does not need to be the root word. It can be any of its variants and still match.
synonym
The synonym operator searches for a word and all of its synonyms. A pre-loaded dictionary is used to calculate the synonyms for a given word. The format of a synonym condition is:
{
"{column}": {
"synonym": {
"value": "{value}",
}
}
}
The synonym operator is not supported by the shim as EMu does not support the operator.
reference
The reference operator implements join queries between two resources. The format of a reference condition is:
{
"{column}": {
"reference": {
"{reference}": {
{search}
}
}
}
}
The {column} must be a reference column containing links to another resource. In EMu the column name will end in Ref or Ref_tab. The {reference} value is a URI referring to the referenced table, namely emu:/{tenant}/{resource}. The {search} value is a full search to be performed on the other table through the {column} links.
The reference operator is not supported by the shim as EMu does not support the operator.
Examples
Find all records where the NamLast column is either Smith or Smythe and the NamFirst column is John.
{
"AND": [
{
"data.NamFirst": {
"exact" : {
"value": "John"
}
}
},
{
"OR": [
{
"data.NamLast": {
"exact": {
"value": "smith"
}
}
},
{
"data.NamLast": {
"exact": {
"value": "smythe"
}
}
}
]
}
]
}
Find all records where AdmDateModified is between 1st Jan 2021 and 31st Mar 2021 inclusive and AdmModifiedBy is John Smith.
{
"AND": [
{
"data.AdmDateModified": {
"range": {
"gte": "2021-01-01",
"lte": "2021-03-31",
"mode": "date"
}
}
},
{
"data.AdmModifiedBy": {
"exact": {
"value": "John Smith"
}
}
}
]
}
Find all records where one of the entries in NamOtherNames_tab is Barry Humphries.
{
"AND": [
{
"data.NamOtherNames_tab": {
"exact": {
"value": "Barry Humphries"
}
}
}
]
}
JSON Schema
The JSON Schema definition for the filter query parameter is:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "emu:/shared/resources/search#",
"title": "Common Search Language JSON definitions",
"description": "Definitions to validate Common Search Language",
"definitions": {
"non_empty_string": {
"type": "string",
"minLength": 1
},
"non_negative_integer": {
"type": "integer",
"minimum": 0
},
"positive_integer": {
"type": "integer",
"minimum": 1
},
"non_empty_string_or_number": {
"oneOf": [
{ "$ref": "#/definitions/non_empty_string" },
{ "type": "number" }
]
},
"latlon": {
"type": "object",
"properties": {
"lat": {
"type": "number",
"exclusiveMimimum": -90.0,
"exclusiveMaximum": 90.0
},
"lon": {
"type": "number",
"exclusiveMimimum": -180.0,
"exclusiveMaximum": 180.0
}
},
"required": [ "lat", "lon" ]
},
"mode": {
"enum": [ "date", "time", "latitude", "longitude" ]
},
"search": {
"oneOf": [
{ "$ref": "#/definitions/and" },
{ "$ref": "#/definitions/or" },
{ "$ref": "#/definitions/not" },
{ "$ref": "#/definitions/empty" }
]
},
"and": {
"type": "object",
"properties": {
"AND": {
"$ref": "#/definitions/clauses"
}
},
"required": [ "AND" ]
},
"or": {
"type": "object",
"properties": {
"OR": {
"$ref": "#/definitions/clauses"
}
},
"required": [ "OR" ]
},
"not": {
"type": "object",
"properties": {
"NOT": {
"$ref": "#/definitions/clauses"
}
},
"required": [ "NOT" ]
},
"empty": {
"type": "object",
"maxProperties": 0
},
"clauses": {
"type": "array",
"items": { "$ref": "#/definitions/clause" }
},
"clause": {
"oneOf": [
{ "$ref": "#/definitions/condition" },
{ "$ref": "#/definitions/search" }
]
},
"condition": {
"type": "object",
"patternProperties": {
"^[a-zA-Z0-9_.]+$": {
"oneOf": [
{ "$ref": "#/definitions/contains" },
{ "$ref": "#/definitions/range" },
{ "$ref": "#/definitions/exact" },
{ "$ref": "#/definitions/exists" },
{ "$ref": "#/definitions/fuzzy" },
{ "$ref": "#/definitions/geo_bounding_box" },
{ "$ref": "#/definitions/geo_distance" },
{ "$ref": "#/definitions/phonetic" },
{ "$ref": "#/definitions/phrase" },
{ "$ref": "#/definitions/proximity" },
{ "$ref": "#/definitions/regex" },
{ "$ref": "#/definitions/stemmed" },
{ "$ref": "#/definitions/synonym" },
{ "$ref": "#/definitions/text" },
{ "$ref": "#/definitions/reference" }
]
}
},
"minProperties": 1,
"maxProperties": 1
},
"contains": {
"type": "object",
"properties": {
"contains": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" }
},
"required": [ "value" ]
}
},
"required": [ "contains" ]
},
"range": {
"type": "object",
"properties": {
"range": {
"type": "object",
"properties": {
"gt": { "$ref": "#/definitions/non_empty_string_or_number" },
"gte": { "$ref": "#/definitions/non_empty_string_or_number" },
"lt": { "$ref": "#/definitions/non_empty_string_or_number" },
"lte": { "$ref": "#/definitions/non_empty_string_or_number" },
"mode": { "$ref": "#/definitions/mode" },
"format": { "type": "string" }
},
"minProperties": 1,
"additionalProperties": false
}
},
"required": [ "range" ]
},
"exact": {
"type": "object",
"properties": {
"exact": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" },
"mode": { "$ref": "#/definitions/mode" },
},
"required": [ "value" ]
}
},
"required": [ "exact" ]
},
"exists": {
"type": "object",
"properties": {
"exists": {
"type": "object",
"properties": {
"value": { "type": "boolean" }
},
"required": [ "value" ]
}
},
"required": [ "exists" ]
},
"fuzzy": {
"type": "object",
"properties": {
"fuzzy": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" },
"prefix_length": { "$ref": "#/definitions/non_negative_integer" },
"max_expansion": { "$ref": "#/definitions/positive_integer" },
"fuzziness": {
"oneOf": [
{ "enum": [ "AUTO" ] },
{ "type": "integer", "minimum": 0 }
]
}
},
"required": [ "value" ]
}
},
"required": [ "fuzzy" ]
},
"geo_bounding_box": {
"type": "object",
"properties": {
"geo_bounding_box": {
"type": "object",
"properties": {
"top_left": { "$ref": "#/definitions/latlon" },
"bottom_right": { "$ref": "#/definitions/latlon" }
},
"required": [ "top_left", "bottom_right" ]
}
},
"required": [ "geo_bounding_box" ]
},
"geo_distance": {
"type": "object",
"allOf": [
{ "$ref": "#/definitions/latlon" },
{
"type": "object",
"properties": {
"distance": { "$ref": "#/definitions/non_empty_string" }
},
"required": [ "distance" ]
}
]
},
"phonetic": {
"type": "object",
"properties": {
"phonetic": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" }
},
"required": [ "value" ]
}
},
"required": [ "phonetic" ]
},
"phrase": {
"type": "object",
"properties": {
"phonetic": {
"type": "object",
"properties": {
"value": {"$ref": "#/definitions/non_empty_string" }
},
"required": [ "value" ]
}
},
"required": [ "phrase" ]
},
"proximity": {
"type": "object",
"properties": {
"proximity": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" },
"distance": { "$ref": "#/definitions/non_negative_integer" }
},
"required": [ "value", "distance" ]
}
},
"required": [ "proximity" ]
},
"regex": {
"type": "object",
"properties": {
"regex": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" },
"flags": { "$ref": "#/definitions/non_empty_string" }
},
"required": [ "value" ]
}
},
"required": [ "regex" ]
},
"stemmed": {
"type": "object",
"properties": {
"stemmed": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" }
},
"required": [ "value" ]
}
},
"required": [ "stemmed" ]
},
"synonym": {
"type": "object",
"properties": {
"synonym": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" }
},
"required": [ "value" ]
}
},
"required": [ "synonym" ]
},
"text": {
"type": "object",
"properties": {
"synonym": {
"type": "object",
"properties": {
"value": { "$ref": "#/definitions/non_empty_string" }
},
"required": [ "value" ]
}
},
"required": [ "text" ]
},
"reference": {
"type": "object",
"properties": {
"reference": {
"type": "object",
"patternProperties": {
"^emu:/[^/]+/[^/]+": { "$ref": "#/definitions/search" }
},
"minProperties": 1,
"maxProperties": 1
}
},
"required": [ "reference" ]
}
},
"anyOf": [
{
"$comment": "Required for Common Search Language",
"$ref": "#/definitions/search"
}
]
}