Mappings
The Mapping object specifies:
By default, a Darwin Core field is mapped to an EMu Catalog column by adding the prefix Dar to the Darwin Core field name. For example:
Darwin Core field | becomes | EMu Column |
---|---|---|
CatalogNumber | => | DarCatalogNumber |
ScientificName | => | DarScientificName |
The only exception to this rule is the Darwin Core field DateLastModified, which requires special handlers; see TexQL & Value handlers.
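The default naming rule can be sketched in a few lines of PHP. This is an illustration only: defaultEmuColumn() is a hypothetical helper, not a function of the provider, and it deliberately ignores the DateLastModified exception.

```php
<?php
// Sketch only: defaultEmuColumn() is a hypothetical helper, not part of the provider.
// It applies the default rule: EMu column name = "Dar" + Darwin Core field name.
function defaultEmuColumn($field)
{
    return 'Dar' . $field;
}

$catalogColumn = defaultEmuColumn('CatalogNumber');
$nameColumn    = defaultEmuColumn('ScientificName');
```

With this rule, CatalogNumber and ScientificName resolve to the columns shown in the table above.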
At time of writing (October 2008) the following Darwin Core versions are supported in the default mapping:
- 1.2
- 1.21
- 1.3
- 1.4
- 1.4 extensions
- OBIS extensions
- PALEO extensions (maybe!)
The default mapping can be modified (or a new mapping specified) by implementing the clientDigirProvider::getMappings() function. If this function exists, it is called with the default mapping object each time a request is processed. The Mapping::setMapping() function can then be used to add or modify mappings.
All of the default mappings can be cleared with the Mapping::clearMappings() function.
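The override pattern might look roughly like the sketch below. The real signatures of Mapping::setMapping() and Mapping::clearMappings() are not given here, so the example uses a stand-in class, SketchMapping, whose two-argument setMapping($field, $column) is an assumption, as are the getColumn() accessor and the column name CatNumber.

```php
<?php
// Stand-in for the real Mapping class. The method signatures are assumptions.
class SketchMapping
{
    private $columns = array();

    public function setMapping($field, $column)
    {
        $this->columns[$field] = $column;
    }

    public function clearMappings()
    {
        $this->columns = array();
    }

    // Hypothetical accessor, used only to inspect the result in this sketch.
    public function getColumn($field)
    {
        return isset($this->columns[$field]) ? $this->columns[$field] : null;
    }
}

// Plays the role of clientDigirProvider::getMappings(): it receives the
// default mapping, changes one entry and returns the mapping object.
function getMappings($map)
{
    $map->setMapping('CatalogNumber', 'CatNumber');   // 'CatNumber' is an invented column name
    return $map;
}

$map = new SketchMapping();
$map->setMapping('CatalogNumber', 'DarCatalogNumber');     // default rule
$map->setMapping('ScientificName', 'DarScientificName');   // default rule
$map = getMappings($map);
```

After the call, CatalogNumber points at the client's own column while ScientificName keeps its default mapping.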
There is no requirement for the EMu columns specified in the mapping to actually exist in the client back-end. When each request is processed, every requested column (as returned by its field mapping) is checked against the client back-end. A column that does not exist is removed from the query and a warning diagnostic is generated. If the missing column is part of a request filter, or is the inventory field of an inventory request, the request fails and an error diagnostic is generated.
This means that it is only necessary to define the columns that are important to the client. This will generally mean defining all of the columns from one or more of the official Darwin Core versions.
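The column check described above can be sketched as follows. This is not the provider's code: checkColumns(), its parameters and the diagnostic wording are all hypothetical, and DarCollector simply stands for a mapped column that was never created in the back-end.

```php
<?php
// Sketch only: function name, parameters and messages are hypothetical.
// $requested: columns needed by the request
// $existing:  columns that exist in the client back-end
// $critical:  columns used in the request filter or as the inventory field
function checkColumns($requested, $existing, $critical)
{
    $result = array('columns' => array(), 'warnings' => array(), 'error' => null);
    foreach ($requested as $column) {
        if (in_array($column, $existing)) {
            $result['columns'][] = $column;
        } elseif (in_array($column, $critical)) {
            // Missing filter or inventory column: the whole request fails.
            $result['error'] = "Column $column does not exist";
            return $result;
        } else {
            // Missing column that is only being returned: drop it and warn.
            $result['warnings'][] = "Column $column does not exist and was removed";
        }
    }
    return $result;
}

$existing = array('DarCatalogNumber', 'DarScientificName');
$ok     = checkColumns(array('DarCatalogNumber', 'DarCollector'), $existing, array());
$failed = checkColumns(array('DarCollector'), $existing, array('DarCollector'));
```

In the first call the missing column is dropped with a warning; in the second it is part of the filter, so an error is reported instead.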
The concept of Mandatory fields is defined in the Darwin Core standard but poorly supported by the DiGIR protocol. The idea is that a Darwin Core record (the set of Darwin Core fields specified by the Darwin Core version) is only valid if it contains values for these fields (those marked nillable=false in the Darwin Core version schema). If it does not contain values for these fields, the record is not valid and should not be included in the record set returned for a DiGIR request.
Unfortunately, different versions of the Darwin Core standard specify different Mandatory fields, and requests in the DiGIR protocol are not required to state which version of the Darwin Core standard they use (in fact, requests are not bound by the Darwin Core versions at all: any combination of Darwin Core fields may be used). The default Mandatory fields are the four that are common to all of the Darwin Core versions:
ScientificName
CatalogNumber
CollectionCode
InstitutionCode
Mandatory fields may be cleared or added to as desired.
- Use the Mapping::clearMandatory() function of the mapping object to clear all Mandatory fields.
- Use the Mapping::setMandatory() function to add a Mandatory field.
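The effect of Mandatory fields on a record set can be sketched as below. isValidRecord() is a hypothetical helper, not part of the provider, and the record values are invented sample data.

```php
<?php
// Sketch only: isValidRecord() is a hypothetical helper, not part of the provider.
// A record is valid only if every Mandatory field has a non-empty value.
function isValidRecord($record, $mandatory)
{
    foreach ($mandatory as $field) {
        if (!isset($record[$field]) || $record[$field] === '') {
            return false;
        }
    }
    return true;
}

// The four default Mandatory fields.
$mandatory = array('ScientificName', 'CatalogNumber', 'CollectionCode', 'InstitutionCode');

// Invented sample records.
$complete = array(
    'ScientificName'  => 'Example name',
    'CatalogNumber'   => '12345',
    'CollectionCode'  => 'IZ',
    'InstitutionCode' => 'NMNH',
);
$incomplete = $complete;
unset($incomplete['CollectionCode']);

$completeIsValid   = isValidRecord($complete, $mandatory);
$incompleteIsValid = isValidRecord($incomplete, $mandatory);
```

Passing an empty list of Mandatory fields, which corresponds to clearing them, makes every record valid.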
Given the somewhat fluid nature of the Darwin Core standard it is possible that the name of a Darwin Core field in one version of the standard has changed since a prior version, but that the concept (i.e. the data) has remained the same.
If a customer supports multiple Darwin Core versions and the Darwin Core field name changes between versions, it is possible to indicate that we want the new name to reference the old field, so that we only need to create one column in the client back-end. This is essentially the same as updating the existing default mapping (or creating a new mapping) with all of the same values as the old mapping (except, of course, the field name).
Note: See DarwinCoreVersions for numerous examples of name changes between versions.
Mapping references can be specified by using the Mapping::setMappingRef() function. No mapping references are specified by default. Mapping references should be set after all other mapping operations have been completed.
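Conceptually, a mapping reference is an alias from the new field name to the old one, resolved before the column lookup. The sketch below illustrates that idea only; resolveColumn() and both arrays are hypothetical stand-ins, the field names are placeholders rather than real Darwin Core fields, and the signature of Mapping::setMappingRef() is not shown because it is not documented here.

```php
<?php
// Sketch only: resolveColumn() and both arrays are hypothetical stand-ins
// for whatever the Mapping object keeps internally.
function resolveColumn($field, $mappings, $refs)
{
    // Follow a reference from the new field name to the field it points at.
    if (isset($refs[$field])) {
        $field = $refs[$field];
    }
    return isset($mappings[$field]) ? $mappings[$field] : null;
}

// Placeholder names: one mapped field and one reference to it.
$mappings = array('OldFieldName' => 'DarOldFieldName');
$refs     = array('NewFieldName' => 'OldFieldName');

$viaNewName = resolveColumn('NewFieldName', $mappings, $refs);
$viaOldName = resolveColumn('OldFieldName', $mappings, $refs);
```

Both names resolve to the same column, so only one column has to exist in the client back-end.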
TexQL and Value handlers provide a mechanism for retrieving data from the database and constructing the value presented in the DiGIR response when the simple one-to-one mapping of Darwin Core fields to EMu columns is not sufficient.
TexQL and Value handlers are set on a per-field basis. Typically it is necessary to specify both a TexQL and Value handler for the same Darwin Core field, but in some limited situations it may only be necessary to specify one or the other (see below).
TexQL handlers designate:
- The columns to retrieve from the database (part of the TexQL select statement) when a particular Darwin Core field is requested (i.e. when the field is included in the DiGIR request structure).
- The TexQL where statement to generate when that particular Darwin Core field is queried in the DiGIR request (i.e. when the field is included in the DiGIR request filter).
Value handlers designate:
- How to construct the value of the Darwin Core field in the DiGIR response.
The default mapping only specifies a TexQL and Value handler for the Darwin Core field DateLastModified.
See ~/web/webservices/digir/Mapping.php for this example.
The purpose of the handlers on this field is to use the existing EMu AdmDateModified and AdmTimeModified columns rather than creating a new column. Using the existing EMu columns makes it possible to take advantage of the range buckets of these columns for range queries on the Darwin Core DateLastModified field.
TexQL and Value handlers are specified by using Mapping::setTexqlHandler() and Mapping::setValueHandler() respectively.
The main reason for defining TexQL and Value handlers is to perform some kind of post-processing on a Darwin Core field.
One reason to apply post-processing to a field is that the field should contain data that does not already exist in the EMu database. An example of this might be the Darwin Core field RecordURL. This specifies the full URL required to locate the record. As most of this information is not likely to be stored in the EMu Catalog and the URL may change over time, it makes a good candidate for post-processing.
This is a simple example of the handlers that might be used for the RecordURL Darwin Core field (this code would be located in clientDigirProvider):
function getMappings($map)
{
    $this->baseRecordURL = "http://servername/webdir/pages/nmnh/iz/Display.php?irn=";

    ### The TexQL handler will be called like this
    ###     function($field, $operator, $value, $data)
    ### where
    ###     $field = "RecordURL"
    ###     $operator & $value depend on the DiGIR request
    ###     $data = $this->baseRecordURL
    ###
    $code =
    '
        ### preg_quote() makes the "/", "." and "?" in the base URL match literally
        $pattern = "/^" . preg_quote($data, "/") . "(\d+)$/";
        if ($operator == \'=\')
        {
            if (preg_match($pattern, $value, $match))
                return "irn $operator $match[1]";
        }
        else if ($operator == \'<>\')
        {
            if (preg_match($pattern, $value, $match))
                return "irn $operator $match[1]";
            else
                return \'TRUE\';
        }
        else
        {
            ### any other operator is not supported
            return null;
        }
        ### "=" with a value that does not start with our base URL
        return \'FALSE\';
    ';
    $map->setTexqlHandler('RecordURL', 'irn_1', $code, $this->baseRecordURL);

    ### The Value handler will be called like this
    ###     function($field, $record, $data)
    ### where
    ###     $field = "RecordURL"
    ###     $record is a hash of all required data retrieved from the database indexed by EMu column name
    ###     $data = $this->baseRecordURL
    ###
    $code =
    '
        if (empty($record[\'irn_1\']))
            return \'\';
        return $data . $record[\'irn_1\'];
    ';
    $map->setValueHandler('RecordURL', $code, $this->baseRecordURL);

    return $map;
}
A TexQL handler must return the correct TexQL for each operation that it supports (some operations are too difficult to handle entirely with code and should not be supported).
Take the example of the "=" query above; since the IRN is the only portion of the RecordURL field that we actually store in the EMu Catalog, it is necessary to determine whether the other portion (the base URL) gathered from the DiGIR request matches the base URL we have defined:
- If the base URL does match, it is necessary to generate the TexQL to determine if the IRN matches any values in the database.
- If the base URL does not match, we return the TexQL value FALSE: because the value from the DiGIR request does not match our base URL, it should not retrieve any values from the database.
Value handlers should use the required values retrieved from the database along with any pre-defined data to generate the correct value to display in the DiGIR response.
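To check the behaviour described above, the two RecordURL handlers can be written as ordinary PHP functions and called directly. This is a sketch rather than the provider's code: recordUrlTexql() and recordUrlValue() are hypothetical names, and preg_quote() is used so that the "/", "." and "?" characters in the base URL are matched literally.

```php
<?php
// Sketch only: the handler logic written as plain functions so it can be run directly.
function recordUrlTexql($field, $operator, $value, $data)
{
    // Escape the base URL so it is safe inside a "/"-delimited pattern.
    $pattern = '/^' . preg_quote($data, '/') . '(\d+)$/';
    $matched = preg_match($pattern, $value, $match);

    if ($operator == '=') {
        // A foreign base URL can never equal one of our record URLs.
        return $matched ? "irn = $match[1]" : 'FALSE';
    }
    if ($operator == '<>') {
        // A foreign base URL differs from every one of our record URLs.
        return $matched ? "irn <> $match[1]" : 'TRUE';
    }
    return null;    // other operators are not supported
}

function recordUrlValue($field, $record, $data)
{
    if (empty($record['irn_1'])) {
        return '';
    }
    return $data . $record['irn_1'];
}

$base = 'http://servername/webdir/pages/nmnh/iz/Display.php?irn=';

$equalOurs    = recordUrlTexql('RecordURL', '=', $base . '1234', $base);
$equalForeign = recordUrlTexql('RecordURL', '=', 'http://other/1234', $base);
$url          = recordUrlValue('RecordURL', array('irn_1' => '1234'), $base);
```

A request value that starts with the defined base URL produces a query on irn, a value with a different base URL produces the constant FALSE, and the Value handler rebuilds the full URL from the stored IRN.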
The easiest way to indicate that the client does not support (i.e. does not provide data for) a specific Darwin Core field is to do nothing. Non-existent columns specified in the mapping are handled by the DiGIR Provider; see Relation of Darwin Core fields to EMu columns.
On the other hand, if a mapped column has been implemented in the back-end and it is necessary to disable it, use the Mapping::setNoMapping() function.