Providing convenience APIs
Introduction
It is useful to also share spatial datasets via convenience APIs, so that the datasets can be downloaded in parts or easily visualized in common tools such as QGIS, OpenLayers or Leaflet. The data service standards of the Open Geospatial Consortium (OGC) are designed for this purpose. These APIs give direct access to subsets or map visualizations of a dataset.
In this paragraph you will be introduced to various standardised APIs, after which we introduce an approach to publish datasets, which builds on the data management approach introduced in the previous paragraphs.
Newer ways of sharing data over the web are also emerging, in which the data format itself allows clients to request subsets of the data, enabling efficient consumption straight from a repository or cloud storage service. Typical examples are Cloud Optimized GeoTIFF (COG) and GeoZarr for grid data, and GeoParquet for vector data.
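Formats such as COG support this because a client can fetch only the bytes it needs with an HTTP range request. The following is a minimal sketch using only the Python standard library; the URL is a hypothetical placeholder and the 16 KiB read size is an arbitrary value chosen for illustration, not something prescribed by the format.

```python
from urllib.request import Request, urlopen


def range_request(url, start, length):
    """Build a GET request asking the server for `length` bytes from offset `start`."""
    end = start + length - 1  # HTTP byte ranges are inclusive on both ends
    return Request(url, headers={"Range": f"bytes={start}-{end}"})


def fetch_range(url, start, length):
    """Send the request; servers that honour ranges answer 206 Partial Content."""
    with urlopen(range_request(url, start, length)) as response:
        return response.read()


# Hypothetical URL: replace with a real file on your repository or cloud storage.
request = range_request("https://example.com/data/elevation.tif", 0, 16384)
print(request.get_header("Range"))
```

Libraries that read these formats typically issue such requests for you; the sketch only shows the mechanism they rely on.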
Standardised data APIs
Standardised mapping APIs, such as Web Map Service (WMS), Web Feature Service (WFS) and Web Coverage Service (WCS), originate from the beginning of this century. In recent years several challenges have been identified around these standards, which led to a series of Spatial Data on the Web Best Practices. Building on these best practices and on the OGC Open Geospatial APIs White Paper, OGC then initiated a new generation of standards.
An overview of both generations:
OWS | OGC-API | Description |
---|---|---|
Web Map Service (WMS) | Maps | Provides a visualization of a subset of the data |
Web Feature Service (WFS) | Features | API to request a subset of the vector features |
Web Coverage Service (WCS) | Coverages | API to interact with grid sources |
Sensor Observation Service (SOS) | SensorThings | Retrieve subsets of sensor observations |
Web Processing Service (WPS) | Processes | Run processes on data |
Catalogue Service for the Web (CSW) | Records | Retrieve and filter catalogue records |
Notice that most mapping software supports the standards of both generations. However, because the OGC APIs were introduced only recently, expect occasional issues in their implementations.
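To make the difference between the two generations concrete, here is a minimal sketch that builds a request URL for each: a WMS GetMap request (key-value parameters on a single endpoint) and an OGC API Features items request (resource-oriented paths). The endpoint and the layer name `cities` are taken from the exercise further down and are assumptions here; the `f=json` parameter is a common convention among servers rather than part of the core standard.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layer, bbox, width=800, height=600):
    """Build a WMS 1.3.0 GetMap URL; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": layer,
        "styles": "",
        "crs": "CRS:84",  # longitude/latitude axis order in WMS 1.3.0
        "bbox": ",".join(str(value) for value in bbox),
        "width": width,
        "height": height,
        "format": "image/png",
    }
    return f"{endpoint}?{urlencode(params)}"


def ogcapi_items_url(endpoint, collection, bbox, limit=10):
    """Build an OGC API Features items URL for one collection."""
    params = {
        "bbox": ",".join(str(value) for value in bbox),
        "limit": limit,
        "f": "json",  # assumed output-format convention, check your server
    }
    return f"{endpoint}/collections/{collection}/items?{urlencode(params)}"


bbox = (4.0, 51.0, 6.0, 53.0)
print(wms_getmap_url("http://localhost/ows/foss4g", "cities", bbox))
print(ogcapi_items_url("http://localhost/ows/foss4g/ogcapi", "cities", bbox))
```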
Setting up an API
MapServer is server software that can expose datasets through various APIs. Examples of similar software are QGIS Server, ArcGIS Server, GeoServer and pygeoapi.
We’ve selected MapServer for this training because of its robustness, ease of configuration and low resource consumption. MapServer is configured using a configuration file called the mapfile. The mapfile defines metadata for the dataset and how users interact with the dataset, mainly the colour scheme (legend) used to draw a map of the dataset.
Various tools exist to write these configuration files, such as MapServer studio, QGIS Bridge, and a Visual Studio plugin to edit mapfiles.
The pyGeoDataCrawler, introduced in a previous paragraph, also has an option to generate mapfiles. A big advantage of this approach is the integration with existing metadata. pyGeoDataCrawler will, during mapfile generation, use the existing metadata, but also update the metadata so it includes a link to the MapServer service endpoint. This toolset enables a typical workflow of:
- Users find a dataset in a catalogue
- Then open the dataset via the linked service
The reverse also works: from a mapping application, users can access the metadata describing a dataset.
Mapfile creation exercise
- In a shell, navigate to a folder with data files.
- Verify that MCFs are available for the files; if not, create initial metadata with
crawl-metadata --mode=init --dir=.
- Add an index.yml file to the folder. This metadata is introduced in the mapfile to identify the service.
mcf:
    version: 1.0
identification:
    title: My new map service
    abstract: A map service for data about ...
contact:
    pointOfContact:
        organization: example
        email: info@example.com
        url: https://www.example.com
- Set some environment variables in the .env file: pgdc_md_url and pgdc_ms_url. Notice in the commands below that we include the .env file in the container.
- Generate the mapfile
cd /srv/data/foss4g
crawl-maps --dir=.
cd ./docker/
docker run -it --rm --env-file=.env -v $(pwd):/tmp \
pvgenuchten/geodatacrawler crawl-maps --dir=/tmp/data/foss4g
docker run -it --rm --env-file=.env -v "${PWD}:/tmp" `
pvgenuchten/geodatacrawler crawl-maps --dir=/tmp/data/foss4g
Test your MapServer configuration. The MapServer container includes a test tool for this purpose, map2img. With the Docker composition running, try:
docker exec mapserver map2img -m /srv/data/data/data.map \
-l cities -o /srv/data/data/test.png
docker exec mapserver map2img -m /srv/data/data/data.map `
-l cities -o /srv/data/data/test.png
Replace the value of -l (layer) with a layer from your mapfile. Notice that a file test.png is written to the data folder.
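A quick way to confirm that the rendered output is a real image, and not an empty or truncated file, is to check its PNG signature. This is a minimal sketch using the standard library; the host-side path assumes that ./docker/data is the folder mounted as /srv/data in the container, so adjust it to your setup.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first eight bytes of every PNG file


def is_png(path):
    """Return True if the file exists and starts with the PNG signature."""
    path = Path(path)
    if not path.is_file():
        return False
    with path.open("rb") as handle:
        return handle.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


# Assumed host-side location of the file written by map2img.
print(is_png("./docker/data/data/test.png"))
```

A valid signature only shows that an image was written; open the file to judge whether the layer is drawn as expected.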
The pyGeoDataCrawler tool internally uses the mappyfile library. Mappyfile is a library to work with mapfiles from Python or the command line. It offers mapfile creation, formatting and validation options. As an example, use the commands below to validate a mapfile.
pip3 install mappyfile
mappyfile validate ./data/text.map --no-expand
MapServer via Docker
For this workshop we’re using a MapServer image provided by Camptocamp, available from Docker Hub.
docker pull camptocamp/mapserver:8.4
First update the config file ./data/ms.conf. This config file lists all the mapfiles which are published on the container. Open the file ./data/ms.conf and populate the maps section. The maps section consists of key-value pairs of alias and path to the mapfile. The alias is used in the URL as http://localhost/ows/{alias}/ogcapi (for longtime MapServer users, the alias replaces the ?map=example.map syntax).
Notice that our local ./docker/data folder is mounted into the MapServer container as /srv/data. You may have to move some content from previous paragraphs into the ./docker/data folder. An alias for the foss4g map has already been added.
MAPS
"foss4g" "/srv/data/foss4g.map"
END
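The relation between aliases and service URLs can be sketched in a few lines of Python. This is not a full parser for the MapServer config format: it assumes the quoted "alias" "path" pairs shown above and ignores everything outside the MAPS block. The base URL http://localhost/ows is the one used in this workshop.

```python
import re

# One quoted alias followed by one quoted mapfile path.
PAIR = re.compile(r'^"([^"]+)"\s+"([^"]+)"$')


def parse_maps(conf_text):
    """Return {alias: mapfile_path} for the pairs inside a MAPS ... END block."""
    maps = {}
    in_maps = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.upper() == "MAPS":
            in_maps = True
        elif stripped.upper() == "END":
            in_maps = False
        elif in_maps:
            match = PAIR.match(stripped)
            if match:
                maps[match.group(1)] = match.group(2)
    return maps


def ogcapi_endpoints(maps, base_url="http://localhost/ows"):
    """Derive the OGC API endpoint for every alias."""
    return {alias: f"{base_url}/{alias}/ogcapi" for alias in maps}


sample = '''
MAPS
  "foss4g" "/srv/data/foss4g.map"
END
'''
print(ogcapi_endpoints(parse_maps(sample)))
```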
Run or restart Docker Compose.
Check http://localhost/ows/foss4g/ogcapi in your browser. If everything is set up correctly, it should show the OGC API landing page of the service. If not, check the container logs for errors.
You can also try the URL in QGIS. Add a WMS layer, using the service URL http://localhost/ows/foss4g?request=GetCapabilities&service=WMS.
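The same check can be scripted. The sketch below lists the layer names advertised in a WMS 1.3.0 capabilities document using only the standard library. It assumes the server answers with a version 1.3.0 document (hence the explicit version parameter added to the workshop URL); the helper names are illustrative.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

WMS_NS = {"wms": "http://www.opengis.net/wms"}  # XML namespace of WMS 1.3.0

CAPABILITIES_URL = (
    "http://localhost/ows/foss4g"
    "?service=WMS&version=1.3.0&request=GetCapabilities"
)


def layer_names(capabilities_xml):
    """Return the names of all named layers in a WMS 1.3.0 capabilities document."""
    root = ET.fromstring(capabilities_xml)
    # Only direct Name children of Layer elements, so style names are skipped.
    return [name.text for name in root.iterfind(".//wms:Layer/wms:Name", WMS_NS)]


def fetch_layer_names(url=CAPABILITIES_URL):
    """Download the capabilities document and list its layers."""
    with urlopen(url) as response:
        return layer_names(response.read())
```

With the Docker composition running, calling `fetch_layer_names()` should return the layers defined in the mapfile; an error or an empty list is a cue to check the container logs.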
Notice that the MCF files now include a link to the map service via which the datasets are shared. These links have been added by the crawl-maps method (and use the pgdc_ms_url environment variable). Publish the records to the catalogue again, so users will be able to find the service while browsing the catalogue.
pyGeoDataCrawler uses default (gray) styling for vector data and an average classification for grids. You can fine-tune the styling of layers through the robot section in index.yml or by providing a Styled Layer Descriptor (SLD) file for a layer, as {name}.sld. SLD files can be created using QGIS (export style as SLD).
Summary
This paragraph introduced the standards of the Open Geospatial Consortium and showed how you can publish your data according to these standards using MapServer. In the next section we’ll look at measuring service quality.