
Introduction

Welcome to the Resource Watch API Developer Documentation

API Architecture

This chapter covers the basic architectural details of the API. If you are new to RW API development, you should start here, as key concepts explained here will be needed as you go more hands-on with the API code and/or infrastructure.

Microservice architecture

The RW API is built on a microservices architecture, with Control Tower as the gateway application.

In this configuration, Control Tower (CT) offers gateway and core functionality, like routing or user management, while user-facing functionality is provided by a set of microservices that communicate with each other and the external world through Control Tower.

Internal communication between CT and the microservices is done through HTTP requests, and as such each microservice is built on top of a web server. These different web servers create a network within the API itself, to which we will refer throughout the rest of the documentation when we mention “internal network” or “internal requests”. By contrast, an “external request” refers to a request from/to the world wide web, and “external network” basically means “the internet”.

Control Tower itself acts both as an API in its own right - with endpoints for management and monitoring - and as a catch-all proxy that routes all other requests to the functional endpoints.

Lifecycle of a request

All incoming requests to the API are handled by CT, which, among other things, does the following:

- Checks if there’s a microservice capable of handling that request.
- Checks if authentication data is required and/or is present.
- Automatically filters out anonymous requests to authenticated endpoints.

We’ll go into more detail about these processes in the next sections.

Control Tower matches each incoming external request to a microservice by comparing its URI and request method. It then generates a new HTTP request to that microservice and waits for its response, which is used as the response to the original external request.
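The matching step described above can be sketched as follows. This is a hypothetical illustration, not Control Tower's actual code - `matchEndpoint` and the endpoint shapes are assumptions based on the registration format covered later in this chapter:

```javascript
// Pair an incoming request's method and URI against the endpoints that
// microservices have registered. Path segments starting with ":" act as
// wildcards (e.g. ":dataset"), matching any single segment.
function matchEndpoint(endpoints, method, uri) {
    const reqParts = uri.split('/').filter(Boolean);
    return endpoints.find((endpoint) => {
        if (endpoint.method !== method) return false;
        const pathParts = endpoint.path.split('/').filter(Boolean);
        if (pathParts.length !== reqParts.length) return false;
        return pathParts.every((part, i) => part.startsWith(':') || part === reqParts[i]);
    }) || null;
}

// Example registry, mirroring the endpoint declarations shown later:
const registered = [
    { method: 'GET', path: '/v1/dataset', redirect: { method: 'GET', path: '/api/v1/dataset' } },
    { method: 'GET', path: '/v1/query/:dataset', redirect: { method: 'POST', path: '/api/v1/document/query/:dataset' } }
];

matchEndpoint(registered, 'GET', '/v1/query/abc123'); // matches the second endpoint
matchEndpoint(registered, 'POST', '/v1/dataset');     // null - Control Tower would reply 404
```

When no endpoint matches, Control Tower answers the external request itself, as described in the next sections.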

Microservices can make requests to each other via Control Tower. They also have unrestricted access to the public internet, so 3rd party services can be accessed as they normally would be. The API infrastructure also has other resources, like databases (MongoDB, ElasticSearch, Postgres) or publish-subscribe queues (RabbitMQ), which can be accessed. We’ll cover those in more detail in a separate section.

Control Tower

This chapter covers Control Tower in depth - what it does and how it does it.

Overview

Control Tower is essentially a mix of 3 main concepts:

- An API proxy/router
- A lightweight management API
- A plugin system to extend functionality and its own API

We’ll cover those 3 topics individually in this section.

API proxy/router

Control Tower’s most basic functionality is accepting external requests to the API, “forwarding” them to the corresponding microservices, and returning whatever response the microservices produce.

Once an external request is received, Control Tower uses the HTTP request method and the URI of the request to match it to one of the known endpoints exposed internally by one of the microservices. Note that all external requests are handled exclusively by Control Tower, and microservices are not able to directly receive requests from outside the API’s internal network (they can, however, make external requests).

This matching process can have one of two results:

Handling a matching request

Should a match be found for the external request’s URI and method, a new, internal request is generated and dispatched to the corresponding microservice. The original external request is blocked until the microservice replies to the internal request; that reply is then used as the reply to the external request.

Handling a non-matching request

No URI and method match

At its most basic, the external request is matched by URI and HTTP method. Should no combination of URI and method be found, Control Tower will reply to the external request with a 404 HTTP code.

Authenticated

Microservice endpoints can be marked as requiring authentication. If a matching request is received, but no user authentication data is provided in the request, Control Tower will reject the request with a 401 HTTP code.

Application key

Similar to what happens with authenticated endpoints, microservice endpoints can also require application data to be provided in the request. If that requirement is not fulfilled, Control Tower will reject the request with a 401 HTTP code.

Filter error

Registered endpoints can also specify filters - a custom requirement that must be met by the request in order for a request match to be successful. For example, endpoints associated with dataset operations use a common URI and HTTP method, but will be handled by different microservices depending on the type of dataset being used - this type of functionality can be implemented using the filter functionality. In some scenarios, even if all the previous conditions are met, filters may rule out a given match, in which case a 401 HTTP code will be returned.

Microservice registration process

The matching process described above is carried out by Control Tower based on endpoints declared by each microservice. In this section, we’ll take a detailed look at the process through which microservices can declare their available endpoints to Control Tower.

Overview

Here’s a graphical overview of the requests exchanged between CT and a microservice:

Control Tower Request Microservice
<=== POST /v1/microservice <===
{"name": "microservice name", "url": "http://microservice-url.com", "active": true }
===> Reply ===>
HTTP OK
===> GET /api/info ===>
<=== Reply <===
{ /* JSON with microservice details */ }
(every 30 seconds)
===> GET /api/ping ===>
<=== Reply <===
pong

The registration process is started by the microservice. It announces its name, internal URL and active state to Control Tower. This tentatively registers the microservice in Control Tower’s database, and triggers the next step of the process.

Immediately after receiving the initial request from the microservice, Control Tower uses the provided URL to reach the microservice. This is used not only to load the endpoint data, but also to ensure that Control Tower is able to reach the microservice at the provided URL - if that does not happen, the registration process is aborted on Control Tower’s side, and the microservice data is dropped. When it receives this request, the microservice should reply with a JSON array of supported endpoints. We’ll dive deeper into the structure of that reply in a moment.

The last step of the process is Control Tower processing that JSON entity, and storing its data in its database. From this point on, the microservice is registered and is able to receive user requests through Control Tower.

Control Tower will, every 30 seconds, emit a ping request to the microservice, which must reply to it to confirm to Control Tower that it’s still functional. Should the microservice fail to reply to a ping request, Control Tower will assume its failure, and de-register the microservice and associated endpoints. Should this happen, it’s up to the microservice to re-register itself on Control Tower, to be able to continue accepting requests.
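A minimal sketch of this liveness check, with `checkOnce` and `fetchPing` as hypothetical names standing in for Control Tower's actual implementation:

```javascript
// `registry` maps microservice names to their registration data; `fetchPing`
// stands in for the real HTTP GET to the microservice's /api/ping endpoint.
async function checkOnce(registry, fetchPing) {
    for (const [name, ms] of Object.entries(registry)) {
        try {
            const reply = await fetchPing(`${ms.url}/api/ping`);
            if (reply !== 'pong') throw new Error('unexpected ping reply');
        } catch (err) {
            // Assume failure: de-register the microservice and its endpoints.
            // It must register itself again to resume receiving requests.
            delete registry[name];
        }
    }
}

// Control Tower runs this check on a 30-second cycle, roughly:
// setInterval(() => checkOnce(registry, fetchPing), 30000);
```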

Minimal configuration

During the registration process, each microservice is responsible for informing Control Tower of the endpoints it supports, if they require authentication, application info, etc. This is done through a JSON object, that has the following basic structure:

{
    "name": "dataset",
    "tags": ["dataset"],
    "endpoints": [
        {
            "path": "/v1/dataset",
            "method": "GET",
            "redirect": {
                "method": "GET",
                "path": "/api/v1/dataset"
            }
        }
    ],
    "swagger": {}
}

Breaking it down bit by bit:

- name: the name of the microservice.
- tags: a list of tags associated with the microservice.
- endpoints: the list of endpoints exposed by the microservice, to be registered on Control Tower.
- swagger: a Swagger object documenting the microservice’s public endpoints.

Within the endpoints array, the expected object structure is the following:

- path: the URI of the external request to match.
- method: the HTTP method of the external request to match.
- redirect: the method and path of the internal request that Control Tower will issue to the microservice when the external request is matched.

Taking the example above, that reply would be provided by the dataset microservice while registering on Control Tower. It would tell CT that it has a single public endpoint, available at GET <api public URL>/v1/dataset. When receiving that external request, the redirect portion of the endpoint configuration tells CT to issue a GET request to <microservice URL>/api/v1/dataset. It then follows the process previously described to handle the reply from the microservice and return its content to the user.

Advanced configuration

The example from the previous section covers the bare minimum a microservice needs to provide to Control Tower in order to register an endpoint. However, as discussed before, Control Tower can also provide support for more advanced features, like authentication and application data filtering. The JSON endpoint snippet below shows how these optional parameters can be used to configure said functionality.

{
    "name": "query-document",
    "tags": ["query-document"],
    "endpoints": [
        {
            "path": "/v1/query/:dataset",
            "method": "GET",
            "binary": true,
            "authenticated": true,
            "applicationRequired": true,
            "redirect": {
                "method": "POST",
                "path": "/api/v1/document/query/:dataset"
            },
            "filters": [{
                "name": "dataset",
                "path": "/v1/dataset/:dataset",
                "method": "GET",
                "params": {
                    "dataset": "dataset"
                },
                "compare": {
                    "data": {
                        "attributes": {
                            "connectorType": "document"
                        }
                    }
                }
            }]
        }   
    ]
 }

Within the endpoints array, you’ll notice a few changes. The path property now includes a URI which contains :dataset in it - this notation tells Control Tower to expect an arbitrary value in that part of the URI (delimited by /) and to refer to that value as dataset in other parts of the process.

The redirect.path references the same :dataset value - meaning the incoming value from the external request will be passed on to the internal requests to the microservice generated by Control Tower.
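The path parameter mapping described here could be sketched like this - a hypothetical illustration in which `buildInternalRequest` is an invented name, not part of Control Tower's API:

```javascript
// Translate a matched external request into the internal request dispatched
// to the microservice, mapping wildcard segments (":dataset") from the
// external URI onto the redirect path.
function buildInternalRequest(endpoint, microserviceUrl, externalUri) {
    const extParts = externalUri.split('/').filter(Boolean);
    const pathParts = endpoint.path.split('/').filter(Boolean);
    const params = {};
    pathParts.forEach((part, i) => {
        if (part.startsWith(':')) params[part.slice(1)] = extParts[i];
    });
    // Substitute the collected params into the redirect path, so that
    // /v1/query/abc123 becomes /api/v1/document/query/abc123.
    const internalPath = endpoint.redirect.path
        .split('/')
        .map((part) => (part.startsWith(':') ? params[part.slice(1)] : part))
        .join('/');
    return { method: endpoint.redirect.method, url: microserviceUrl + internalPath };
}
```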

You’ll also notice new fields that were not present before:

- binary: flags the endpoint’s responses to be handled as binary data.
- authenticated: when true, only requests carrying user authentication data will match this endpoint - anonymous requests are rejected with a 401 HTTP code.
- applicationRequired: when true, requests must also provide application data, or they are rejected with a 401 HTTP code.
- filters: additional conditions, based on data loaded from other microservices, that must be met for the match to succeed.

The filtering section has the most complex structure, so let’s analyse it with a real-world example. The above example would match a request like GET <api external URL>/v1/query/<dataset id>. Once that request is received by Control Tower, it is tentatively matched to this endpoint, pending the validation of the filter section.

To process the filter, Control Tower will issue a request with the method given by filters.method (GET, in this case) and the URI given by filters.path. As the URI contains a :dataset variable, it will use the filters.params object to map that value to the dataset value from the external request. This internal request is then issued, and the matching process is briefly paused until a response is returned.

Once the response is returned, the filters.compare object comes into play: Control Tower will see if the response body object matches the set value in the filters.compare object. In this particular example, the resulting comparison would be something like response.body.data.attributes.connectorType == "document". Should this evaluate to true, the filter is considered to have matched, and this endpoint is used to dispatch the internal request to. Otherwise, this endpoint is discarded from the list of possible matches for this external request.
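The comparison performed through filters.compare can be sketched as a recursive subset check. This is an illustrative approximation, not Control Tower's actual code:

```javascript
// Returns true when every value declared in the `compare` object is present,
// recursively, in the filter request's response body. Extra properties in
// the response are ignored.
function filterMatches(compare, responseBody) {
    return Object.entries(compare).every(([key, expected]) => {
        if (expected !== null && typeof expected === 'object') {
            return responseBody != null
                && typeof responseBody[key] === 'object'
                && filterMatches(expected, responseBody[key]);
        }
        return responseBody != null && responseBody[key] === expected;
    });
}

// With the compare block from the example above:
const compare = { data: { attributes: { connectorType: 'document' } } };
filterMatches(compare, { data: { attributes: { connectorType: 'document', name: 'x' } } }); // true
filterMatches(compare, { data: { attributes: { connectorType: 'carto' } } });               // false
```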

The data loaded as part of the filtering process is then passed on to the microservice that will handle the call, either as query arguments (DELETE or GET requests) or as part of the body (PATCH, POST and PUT requests). The query param/body property is named after the filters.name value configured in the filter. In our example, the request issued to the microservice that matches the filter would be of type POST, so its body will contain a dataset property containing the response from querying the filter endpoint - /v1/dataset/:dataset.

It’s worth noting at this stage that there’s no restriction regarding uniqueness of internal endpoints - two microservices may support the same endpoint, and use filters to differentiate between them based on the underlying data. The example above illustrates how a microservice can support the /v1/query/:dataset external endpoint, but only for datasets of type document. A different microservice may support the same endpoint, but with a different filter value (for example carto) and offer the same external functionality with a completely different underlying implementation.

Should multiple endpoints match an external request, one of them is chosen and used - there are no guarantees in terms of which is picked, so while this scenario does not produce an error, you probably want to avoid it.

Management API

In the previous section, we discussed how microservices can register their endpoints on Control Tower, exposing their functionality to the outside world. That registration process uses part of Control Tower’s own API, which we’ll discuss in finer detail in this section.

Microservice management endpoints

The microservice registration endpoint is one of 4 endpoints that exist around microservice management:

GET /microservice/

This endpoint shows a list of registered microservices and their corresponding endpoints.

curl -X GET \
  http://<CT URL>/api/v1/microservice \
  -H 'Authorization: Bearer <your user token>' \
  -H 'Content-Type: application/json'
[
    {
        "infoStatus": {
            "numRetries": 0,
            "error": null,
            "lastCheck": "2018-12-05T07:33:30.244Z"
        },
        "pathInfo": "/info",
        "pathLive": "/ping",
        "status": "active",
        "cache": [],
        "uncache": [],
        "tags": [
            "dataset"
        ],
        "_id": "5aa66766aee7ae846a419c0c",
        "name": "Dataset",
        "url": "http://dataset.default.svc.cluster.local:3000",
        "version": 1,
        "endpoints": [
            {
                "redirect": {
                    "method": "GET",
                    "path": "/api/v1/dataset"
                },
                "path": "/v1/dataset",
                "method": "GET"
            }
        ],
        "updatedAt": "2018-11-23T14:27:10.957Z",
        "swagger": "{}"
    }
]

GET /microservice/status

Lists information about the operational status of each microservice - like errors detected by Control Tower when trying to contact the microservice, or the number of retries attempted.

curl -X GET \
  http://<CT URL>/api/v1/microservice/status \
  -H 'Authorization: Bearer <your user token>' \
  -H 'Content-Type: application/json'
[
    {
        "infoStatus": {
            "numRetries": 0,
            "error": null,
            "lastCheck": "2018-12-05T07:36:30.199Z"
        },
        "status": "active",
        "name": "Dataset"
    }
]

POST /microservice/

This is the endpoint used by microservices to register on Control Tower. You can find a detailed analysis of its syntax in the previous section.

DELETE /microservice/:id

This endpoint is used to unregister a microservice’s endpoints from Control Tower. Control Tower does not actually delete the microservice information, nor does it immediately remove the endpoints associated with it. Instead, this endpoint iterates over all endpoints associated with the microservice to be unregistered and flags them for deletion - the actual deletion is carried out by a cron task that runs in the background. Until that happens, the microservice and associated endpoints will continue to be available, and external requests to those endpoints will be matched and handled as they were before. However, you will notice that endpoints scheduled for deletion will have a toDelete value of true - more on this in the next section.

curl -X DELETE \
  http://<CT URL>/api/v1/microservice/<microservice id> \
  -H 'Authorization: Bearer <your user token>' \
  -H 'Content-Type: application/json'
{
    "infoStatus": {
        "numRetries": 0
    },
    "pathInfo": "/info",
    "pathLive": "/ping",
    "status": "active",
    "cache": [],
    "uncache": [],
    "tags": [
        "dataset",
        "dataset",
        "dataset"
    ],
    "_id": "5c0782831b0bf92a37a754e2",
    "name": "Dataset",
    "url": "http://127.0.0.1:3001",
    "version": 1,
    "updatedAt": "2018-12-05T07:49:36.754Z",
    "endpoints": [
        {
            "redirect": {
                "method": "GET",
                "path": "/api/v1/dataset"
            },
            "path": "/v1/dataset",
            "method": "GET"
        }
    ],
    "swagger": "{}"
}
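The flag-then-purge behaviour described above can be sketched as follows. The `microserviceId` field and function names are hypothetical; only the `toDelete` flag comes from the documented responses:

```javascript
// Unregistering only flags the microservice's endpoints; they keep matching
// and serving external requests until the background task removes them.
function unregisterMicroservice(endpoints, microserviceId) {
    endpoints
        .filter((e) => e.microserviceId === microserviceId)
        .forEach((e) => { e.toDelete = true; });
}

// The background cron task then prunes the flagged endpoints.
function cronPurge(endpoints) {
    return endpoints.filter((e) => !e.toDelete);
}
```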

Endpoint management endpoints

GET /endpoint

This endpoint lists all microservice endpoints known by Control Tower. Note that it does not contain endpoints offered by Control Tower itself or any of its plugins.

curl -X GET \
  http://<CT URL>/api/v1/endpoint \
  -H 'Authorization: Bearer <your user token>' \
  -H 'Content-Type: application/json'
[
    {
        "pathKeys": [],
        "authenticated": false,
        "applicationRequired": false,
        "binary": false,
        "cache": [],
        "uncache": [],
        "toDelete": true,
        "_id": "5c0784c88dcce0323abe705d",
        "path": "/v1/dataset",
        "method": "GET",
        "pathRegex": {},
        "redirect": [
            {
                "filters": null,
                "_id": "5c0784c88dcce0323abe705e",
                "method": "GET",
                "path": "/api/v1/dataset",
                "url": "http://127.0.0.1:3001"
            }
        ],
        "version": 1
    }
]

DELETE /endpoint/purge-all

This endpoint purges the complete HTTP cache for all microservices. It does not support any kind of parametrization, so it’s not possible to use this endpoint to clear only parts of the cache. As such, we recommend not using this endpoint unless you are certain of its consequences, as it will have a noticeable impact on end-user perceived performance.

curl -X DELETE \
  http://<CT URL>/api/v1/endpoint/purge-all \
  -H 'Authorization: Bearer <your user token>' \
  -H 'Content-Type: application/json'

Documentation management endpoints

GET /doc/swagger

Generates a complete Swagger JSON file documenting all API endpoints. This Swagger file is compiled by Control Tower based on the Swagger files provided by each microservice. As such, the Swagger details for a given endpoint will only be as good as the information provided by the microservice itself.

curl -X GET \
  http://<CT URL>/api/v1/doc/swagger \
  -H 'Authorization: Bearer <your user token>' \
  -H 'Content-Type: application/json'
{
    "swagger": "2.0",
    "info": {
        "title": "Control Tower",
        "description": "Control Tower - API",
        "version": "1.0.0"
    },
    "host": "tower.dev:9000",
    "schemes": [
        "http"
    ],
    "produces": [
        "application/vnd.api+json",
        "application/json"
    ],
    "paths": {
        "/api/v1/doc/swagger": {
            "get": {
                "description": "Return swagger files of registered microservices",
                "operationId": "getSwagger",
                "tags": [
                    "ControlTower"
                ],
                "produces": [
                    "application/json",
                    "application/vnd.api+json"
                ],
                "responses": {
                    "200": {
                        "description": "Swagger json"
                    },
                    "500": {
                        "description": "Unexpected error",
                        "schema": {
                            "$ref": "#/definitions/Errors"
                        }
                    }
                }
            }
        }
    },
    "definitions": {
        "Errors": {
            "type": "object",
            "description": "Errors",
            "properties": {
                "errors": {
                    "type": "array",
                    "items": {
                        "$ref": "#/definitions/Error"
                    }
                }
            }
        },
        "Error": {
            "type": "object",
            "properties": {
                "id": {
                    "type": "integer",
                    "format": "int32",
                    "description": "A unique identifier for this particular occurrence of the problem."
                },
                "links": {
                    "type": "object",
                    "description": "A links object",
                    "properties": {
                        "about": {
                            "type": "string",
                            "description": "A link that leads to further details about this particular occurrence of the problem."
                        }
                    }
                },
                "status": {
                    "type": "string",
                    "description": "The HTTP status code applicable to this problem, expressed as a string value"
                },
                "code": {
                    "type": "string",
                    "description": "An application-specific error code, expressed as a string value"
                },
                "title": {
                    "type": "string",
                    "description": "A short, human-readable summary of the problem that SHOULD NOT change from occurrence to occurrence of the problem, except for purposes of localization."
                },
                "detail": {
                    "type": "string",
                    "description": "A human-readable explanation specific to this occurrence of the problem. Like title, this field's value can be localized"
                },
                "source": {
                    "type": "object",
                    "description": "An object containing references to the source of the error, optionally including any of the following members",
                    "properties": {
                        "pointer": {
                            "type": "string",
                            "description": "A JSON Pointer [RFC6901] to the associated entity in the request document"
                        },
                        "parameter": {
                            "type": "string",
                            "description": "A string indicating which URI query parameter caused the error."
                        }
                    }
                },
                "meta": {
                    "type": "object",
                    "description": "A meta object containing non-standard meta-information about the error."
                }
            }
        }
    }
}
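The compilation step could be sketched as a merge of each microservice's `paths` and `definitions` into a base document. This is an assumption about the approach, not Control Tower's actual code:

```javascript
// Merge the Swagger fragments provided by each registered microservice
// into a single API-wide Swagger document, without mutating the base.
function compileSwagger(base, microserviceSwaggers) {
    const merged = { ...base, paths: { ...base.paths }, definitions: { ...base.definitions } };
    for (const swagger of microserviceSwaggers) {
        Object.assign(merged.paths, swagger.paths || {});
        Object.assign(merged.definitions, swagger.definitions || {});
    }
    return merged;
}
```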

Plugin management endpoints

Control Tower has a plugin system of its own, which we’ll cover in detail in the next section. As part of that system, it has a few API endpoints to support certain actions.

GET /plugin

Lists all currently enabled plugins, along with their configuration.

curl -X GET \
  http://<CT URL>/api/v1/plugin \
  -H 'Authorization: Bearer <your user token>' \
  -H 'Content-Type: application/json'
[
    {
        "active": true,
        "_id": "5bfd440834d5076bb4609f9f",
        "name": "manageErrors",
        "description": "Manage Errors",
        "mainFile": "plugins/manageErrors",
        "config": {
            "jsonAPIErrors": true
        }
    }
]

PATCH /plugin/:id

Updates the settings of a given plugin.

curl -X PATCH \
  http://<CT URL>/api/v1/plugin/:pluginId \
  -H 'Authorization: Bearer <your user token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "config": {
        "jsonAPIErrors": false
    }
}'
{
    "_id": "5bfd440834d5076bb4609f9f",
    "name": "manageErrors",
    "description": "Manage Errors",
    "mainFile": "plugins/manageErrors",
    "config": {
        "jsonAPIErrors": false
    }
}

Control Tower plugins

Control Tower provides basic API management functionality, but it also has a set of plugins that allow decoupling non-core functionality from the core code base. In this section we’ll briefly cover the functional aspect of each plugin, without focusing too much on the underlying implementation, which will be covered separately.

Time Request

This plugin measures the time elapsed between the external request being received and the reply to it being dispatched back to the external API client. It adds the computed value as an X-Response-Time response header.

Manage errors

This plugin intercepts replies that represent errors and formats them properly.

CORS

This plugin adds the necessary headers to support CORS.

Invalidate cache endpoints

Varnish cache integration plugin that invalidates cache entries.

Response formatter

Handles response formats other than the default JSON, setting headers and formatting the response body according to the requested content type. Currently only supports XML.

Statistics

Collects and stores statistics about the API usage.

MongoDB sessions

Adds support for storing session data in MongoDB.

Oauth plugin

User management and authentication plugin. Supports email+password based registration, as well as Facebook, Twitter and Google+ OAuth-based authentication.

Redis cache

Redis-based caching for Control Tower.

Application key authorization

Application key handling.

Fastly cache

Integrates HTTP caching using Fastly.

Read-only mode

This plugin activates a read-only mode that passes through all calls to GET endpoints and rejects calls to POST/PATCH/PUT/DELETE endpoints. Rejected requests receive HTTP status 503 and the following message:

"API under maintenance, please try again later."

The plugin provides the following configurations for increased flexibility:

- blacklist: GET requests whose path matches an entry in this list are also rejected.
- whitelist: POST/PATCH/PUT/DELETE requests whose path matches an entry in this list are allowed through.

Configuration of white and black lists:

{
    // This blacklist prevents GETting widgets
    "blacklist": ["/api/v1/widget"],

    // This whitelist allows POST/PATCH/DELETE calls for datasets and layers
    "whitelist": ["/api/v1/dataset", "/api/v1/layer"]
}

Note: Please ensure you know what you’re doing when activating this plugin, since it will highly restrict the usage of the API.
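Assuming the white/black list configuration shown above, the plugin's accept/reject decision can be sketched as follows; `readOnlyCheck` is an illustrative name, not the plugin's actual API:

```javascript
// Decide whether a request passes through in read-only mode:
// GET requests pass unless blacklisted; write requests pass only if whitelisted.
function readOnlyCheck(config, method, path) {
    if (method === 'GET') {
        return !config.blacklist.includes(path);
    }
    return config.whitelist.includes(path);
}

const config = {
    blacklist: ['/api/v1/widget'],
    whitelist: ['/api/v1/dataset', '/api/v1/layer']
};

readOnlyCheck(config, 'GET', '/api/v1/dataset');  // true - request passes through
readOnlyCheck(config, 'POST', '/api/v1/widget');  // false - rejected with a 503
```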

Control Tower plugin development

This chapter covers Control Tower plugin development basics. It aims at providing basic understanding of how existing Control Tower plugins work, and give you the foundations to develop your own plugins.

Implementing functionality

Control Tower and its plugins are implemented using Nodejs and Koa. Be sure to familiarize yourself with the key concepts behind this framework before exploring this section, as it assumes you are comfortable with concepts like route declaration or Koa middleware.

Existing plugin functionality for Control Tower falls within one of two categories: middleware that intercepts requests as they are processed by Control Tower, or new endpoints that are added to Control Tower’s API. Some plugins combine the two approaches to implement their functionality.

Middleware

Middleware-based functionality consists of intercepting the external request and/or response and either gathering data about them (for logging or statistics) or modifying them (for example, for caching). This type of functionality can be implemented using Koa’s middleware API, as Control Tower itself does not modify or extend that in any way.
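For example, functionality like the Time Request plugin's can be implemented as plain Koa middleware. This is an illustrative sketch, not the plugin's actual source:

```javascript
// Koa-style middleware: wrap downstream processing and attach the elapsed
// time as an X-Response-Time header, in the spirit of the Time Request plugin.
function timeRequestMiddleware() {
    return async (ctx, next) => {
        const start = Date.now();
        await next(); // let the rest of the middleware chain handle the request
        ctx.set('X-Response-Time', `${Date.now() - start}ms`);
    };
}

// Registered on Control Tower's Koa app like any other middleware:
// app.use(timeRequestMiddleware());
```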

Endpoint creation

Plugins can also extend Control Tower’s endpoint list by registering new endpoints of their own. An example of this is the Control Tower Oauth plugin, which exposes a number of endpoints to support actions like registering, logging in, recovering a password, etc. As with the middleware approach, this can be implemented by relying on Koa principles, and Control Tower does not modify this behavior.
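A sketch of the pattern - the route paths shown are hypothetical, not the Oauth plugin's actual routes:

```javascript
// A plugin receives the app (or a router) during initialization and
// registers extra handlers on it, extending Control Tower's own API.
function registerAuthRoutes(router) {
    router.get('/auth/login', async (ctx) => { /* render login form */ });
    router.post('/auth/login', async (ctx) => { /* validate credentials */ });
    router.post('/auth/reset-password', async (ctx) => { /* send recovery email */ });
}
```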

Data storage

Control Tower uses a Mongo database to store data like known microservices or endpoints. Plugins can also access this database and create their own collections, should they wish to. Control Tower and existing plugins rely on Mongoose to that effect, and you’ll find examples of its usage throughout the code of Control Tower and its plugins.

Plugin bootstrap and config

During its first initialization, Control Tower will load plugin settings from this file. It includes a list of plugins to be initialized, whether or not they should be active, and their default configuration. On subsequent executions, this information will be loaded from the database instead, so additional changes you may want to do should be done on the database, and not on this file.

An important part of this file and of the corresponding database entries is plugin configuration. This data is stored within the plugins MongoDB collection managed by Control Tower but it’s made available to plugins as they are initialized and ran.

Microservice

This section of the docs covers the details of a microservice.

Microservice overview

As described in the API Architecture section, microservices are small web applications that expose a REST API through a web server. This means that microservices can be built using any programming language, just as long as it supports HTTP communication. In practical terms, most of this API’s core microservices are built using nodejs, with python and Rails being distant 2nd and 3rd respectively. This is due to personal preference of the team behind the API, as there really isn’t a technical reason or limitation preventing the creation of microservices in PHP, Go, Elixir, etc.

Control Tower integration

While they could technically work as standalone applications, microservices are built from the ground up to work through Control Tower. As such, not only do they lack built-in functionality provided by Control Tower itself (for example, user management), they also need to handle their own integration with Control Tower. Control Tower provides integration libraries for certain languages and frameworks, which you can use to ease development:

- nodejs package for Koa
- Python module for Flask
- Rails engine

These libraries provide 2 basic features that we’ll cover in detail in this chapter. You can also use them as reference in case you want to implement a microservice in a different programming language or framework.

We’ll use the nodejs library as reference and example in the following sections, as it’s the most commonly used language in this API. Other libraries provide the same underlying functionality, but may have different ways to operate. Refer to each library’s specific documentation for more details.

Registering on Control Tower

The first feature provided by these libraries, and that a microservice must perform, is registering on Control Tower. Most of the details of this process can be found on Control Tower’s documentation, which you should read at this point if you haven’t already.

// dataset microservice registration example

const ctRegisterMicroservice = require('ct-register-microservice-node');
const Koa = require('koa');
const logger = require('logger');
const config = require('config');

const app = new Koa();

const server = app.listen(process.env.PORT, () => {
    ctRegisterMicroservice.register({
        info: require('../microservice/register.json'),
        swagger: require('../microservice/public-swagger.json'),
        mode: (process.env.CT_REGISTER_MODE && process.env.CT_REGISTER_MODE === 'auto') ? ctRegisterMicroservice.MODE_AUTOREGISTER : ctRegisterMicroservice.MODE_NORMAL,
        framework: ctRegisterMicroservice.KOA2,
        app,
        logger,
        name: config.get('service.name'),
        ctUrl: process.env.CT_URL,
        url: process.env.LOCAL_URL,
        token: process.env.CT_TOKEN,
        active: true
    }).then(() => {
    }, (error) => {
        logger.error(error);
        process.exit(1);
    });
});

Covering the arguments in detail:

This registration call usually takes place right after the microservice’s start process has ended, and the corresponding web server is available. Keep in mind that the call above will trigger an HTTP request to Control Tower, which in turn will call the microservice’s web server - so make sure the microservice’s web server is up and running when you attempt to register it.

Requests to other microservices

Besides contacting Control Tower to register themselves, microservices also need to contact Control Tower to make requests to other microservices.

// Microservice call to another microservice's endpoint
const ctRegisterMicroservice = require('ct-register-microservice-node');
const tags = ['tag1', 'tag2'];

ctRegisterMicroservice.requestToMicroservice({
    uri: `/graph/dataset/${id}/associate`,
    method: 'POST',
    json: true,
    body: {
        tags
    }
});

In this example, the dataset microservice makes a call to the /graph/dataset/<id>/associate endpoint to tag a dataset with the given tag list. This endpoint is implemented by the Graph microservice, but the request is actually handled by Control Tower. Taking a deeper look at the code that implements the call above, we learn a few things:

// Implementation of call to another microservice

requestToMicroservice(config) {
    logger.info('Adding authentication header ');
    try {
        let version = '';
        if (process.env.API_VERSION) {
            version = `/${process.env.API_VERSION}`;
        }
        if (config.version === false) {
            version = '';
        }
        config.headers = Object.assign(config.headers || {}, { authentication: this.options.token });
        if (config.application) {
            config.headers.app_key = JSON.stringify({ application: config.application });
        }
        config.uri = this.options.ctUrl + version + config.uri;
        return requestPromise(config);
    } catch (err) {
        logger.error('Error to doing request', err);
        throw err;
    }

}

As explained above, although the call is ultimately to another microservice, the request is sent to Control Tower, which is then responsible for issuing another internal request to the destination microservice, getting the reply from that call, and passing it back to the microservice that initiated the process.

Another thing you’ll notice is that this call depends on preexisting configuration stored in the this.options property. These configuration options are stored within the object during the Control Tower registration process, meaning you should not attempt to make a call to another microservice unless you have previously registered your microservice on Control Tower. Keep in mind that this is a restriction of this particular integration library, and not of Control Tower itself - a different implementation could do internal requests to microservices through Control Tower without being registered as a microservice.
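For illustration only, the URL composition performed by `requestToMicroservice` can be sketched as a standalone function (`buildInternalUrl` is a hypothetical name, not part of the library):

```javascript
// Hypothetical standalone sketch of how requestToMicroservice composes the
// final URL: Control Tower base URL + optional API version + endpoint URI.
function buildInternalUrl(ctUrl, uri, apiVersion, useVersion = true) {
  // The version segment is only added when an API version is configured
  // and the caller has not explicitly opted out (config.version === false).
  const version = (useVersion && apiVersion) ? `/${apiVersion}` : '';
  return ctUrl + version + uri;
}

console.log(buildInternalUrl('http://control-tower', '/graph/dataset/123/associate', 'v1'));
// → http://control-tower/v1/graph/dataset/123/associate
console.log(buildInternalUrl('http://control-tower', '/graph/dataset/123/associate', 'v1', false));
// → http://control-tower/graph/dataset/123/associate
```

This mirrors the `version` handling in the implementation above: the `API_VERSION` environment variable supplies the version segment, and passing `version: false` in the request config suppresses it.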

Logging

An important part of microservice operation is logging events as requests are processed. Many errors only surface during staging and production execution, and without proper logging there is no way to identify how they can be reproduced, so they can then be fixed.

Common development languages often come with either built-in or 3rd party logging libraries that make logging easier to handle. Current nodejs microservices use Bunyan to manage logs, which eases managing log destinations (stdout, file, etc.) and log levels. Other libraries, for nodejs and other languages, offer similar functionality.

For microservice logs, the main output channel should be stdout, so it can seamlessly integrate with the infrastructure to which microservices will be deployed when going live - more on this later. If you prefer, you can also log to file or other output channels for development purposes - it's in this sort of scenario that logging libraries become useful, as they decouple the logging action from the destination channels.

const logger = require('logger');

logger.info('Validating Dataset Update');

The example above logs that the validation process for input data associated with a dataset update has started. You'll notice that the info() function is called - this sets the logging level for this message. While different logging tools implement different strategies to differentiate logs, this approach is rather common and widespread. It's up to you to define your log levels, how they translate into entries on the logging output, or how different logging levels are forwarded to different channels - just remember to keep the day-to-day logs on stdout.
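To make the level mechanics concrete, here is a simplified, hypothetical sketch of what a leveled logger writing JSON lines to stdout does internally - this is not the actual Bunyan API, just an illustration of the threshold-based filtering behaviour:

```javascript
// Simplified, hypothetical sketch (NOT the actual Bunyan API) of what a
// leveled logger writing JSON lines to stdout does under the hood.
const LEVELS = { debug: 20, info: 30, warn: 40, error: 50 };

function createLogger(name, minLevel = 'info') {
  const log = (level, msg) => {
    // Messages below the configured threshold are silently dropped
    if (LEVELS[level] < LEVELS[minLevel]) return;
    process.stdout.write(`${JSON.stringify({
      name, level: LEVELS[level], msg, time: new Date().toISOString(),
    })}\n`);
  };
  return {
    debug: (msg) => log('debug', msg),
    info: (msg) => log('info', msg),
    warn: (msg) => log('warn', msg),
    error: (msg) => log('error', msg),
  };
}

const logger = createLogger('dataset');
logger.info('Validating Dataset Update'); // emitted as a JSON line
logger.debug('raw request payload');      // dropped: below the 'info' threshold
```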

If you want to access the logging output of a microservice that's already deployed on either staging or production, you'll need access to the kubernetes logging UI or CLI. The details of this are discussed in a separate section.

Microservice internal architecture - nodejs

Nodejs microservices are based on the Koa framework for nodejs. To understand the following code snippets, we assume you are familiar with the basics of the framework, like how routes are declared and handled, and what middleware are and how they work. You should also be somewhat familiar with tools like npm, mongo and mongoose, Jenkins CI, docker, and docker-compose.

Anatomy of a (nodejs) microservice

In this section, we’ll use the dataset microservice as an example, but these concepts should apply to most if not all nodejs microservices:

Since we are interested in the microservice’s functional bits, we’ll analyse the app folder content in more detail. It’s worth mentioning that, depending on how you run the microservice, the respective docker compose files may contain relevant information and configuration, as do the files inside the config folder.

The app folder contains the following structure:

The grunt file includes several task definitions that may be useful during day-to-day development. However, grunt is semi-deprecated (it's still needed, so don't remove it), in the sense that it's recommended to define useful tasks in the package.json file instead - those tasks will, in turn, call grunt tasks.

Inside the app/src folder you'll find the following structure. The folders below will be commonly found on all microservices, unless stated otherwise:

Adding a new endpoint

In this section we’ll cover how you can add a new endpoint with new functionality to an existing microservice. The aim is not to be a comprehensive guide to cover all cases, but more of a quick entry point into day-to-day actions you may want to perform, which should be complemented by your own learning of how a microservice works - remember that all microservices, despite being structurally similar, have their own custom code and functionality.

To add a new endpoint, here’s the short tasklist you have to tackle:

Register your route in koa

This can be done in the app/src/routes/api/v1/dataset.router.js file, usually at the bottom of it:


// router object declaration, usually at the top of the file
const router = new Router({
    prefix: '/dataset',
});

// routes declaration, usually at the bottom of the file
router.get('/', DatasetRouter.getAll);
router.post('/find-by-ids', DatasetRouter.findByIds);
router.post('/', validationMiddleware, authorizationMiddleware, authorizationBigQuery, DatasetRouter.create);
// router.post('/', validationMiddleware, authorizationMiddleware, authorizationBigQuery, authorizationSubscribable, DatasetRouter.create);
router.post('/upload', validationMiddleware, authorizationMiddleware, DatasetRouter.upload);
router.post('/:dataset/flush', authorizationMiddleware, DatasetRouter.flushDataset);
router.post('/:dataset/recover', authorizationRecover, DatasetRouter.recover);

router.get('/:dataset', DatasetRouter.get);
router.get('/:dataset/verification', DatasetRouter.verification);
router.patch('/:dataset', validationMiddleware, authorizationMiddleware, DatasetRouter.update);
router.delete('/:dataset', authorizationMiddleware, DatasetRouter.delete);
router.post('/:dataset/clone', validationMiddleware, authorizationMiddleware, DatasetRouter.clone);

In here you’ll find the already existing routes. As you can see from the rather explicit syntax, you need to call the method that matches the desired HTTP verb on the router object, and pass it a variable number of arguments - more on this in the next section. One thing to keep in mind is that all the routes in a file are typically prefixed, as defined in the router object declaration.
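As an aside, the `:dataset` segments in the routes above are path parameters. Their matching logic can be illustrated with a small framework-free sketch (`matchRoute` is a hypothetical helper, not part of koa-router):

```javascript
// Hypothetical, framework-free sketch of how a koa-router-style pattern
// such as '/dataset/:dataset/flush' matches a URL and extracts parameters.
function matchRoute(pattern, url) {
  const patternParts = pattern.split('/').filter(Boolean);
  const urlParts = url.split('/').filter(Boolean);
  if (patternParts.length !== urlParts.length) return null;
  const params = {};
  for (let i = 0; i < patternParts.length; i += 1) {
    if (patternParts[i].startsWith(':')) {
      // ':dataset' becomes params.dataset = <url segment>
      params[patternParts[i].slice(1)] = urlParts[i];
    } else if (patternParts[i] !== urlParts[i]) {
      return null; // static segment mismatch
    }
  }
  return params;
}

console.log(matchRoute('/dataset/:dataset/flush', '/dataset/abc-123/flush'));
// → { dataset: 'abc-123' }
console.log(matchRoute('/dataset/:dataset/flush', '/widget/abc-123/flush'));
// → null
```

In the real router, the extracted parameters are made available to handlers via `ctx.params` (for example, `ctx.params.dataset`).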

Endpoints

This section of the documentation refers to endpoints that can only be used for the purposes of development. These endpoints can only be called by other microservices via Control Tower.

Finding users by ids

To retrieve the information of multiple users by ids, use the /auth/user/find-by-ids endpoint.

This endpoint requires authentication, and can only be called from another microservice.

# retrieve info for multiple users with the given ids
curl -X POST https://api.resourcewatch.org/auth/user/find-by-ids \
-H "Authorization: Bearer <your-token>" \
-H "Content-Type: application/json"  -d \
'{
    "ids": [
        "0706f055b929453eb1547392123ae99e",
        "0c630aeb81464fcca9bebe5adcb731c8"
    ]
}'

Example response:

    {
        "data": [
            {
                "provider": "local",
                "role": "USER",
                "_id": "0706f055b929453eb1547392123ae99e",
                "email": "example@user.com",
                "createdAt": "2016-08-22T11:48:51.163Z",
                "extraUserData": {
                    "apps": [
                        "rw",
                        "gfw"
                    ]
                },
                "updatedAt": "2019-12-18T15:59:57.333Z"
            },
            {
                "provider": "local",
                "role": "ADMIN",
                "_id": "0c630aeb81464fcca9bebe5adcb731c8",
                "email": "example2@user.com",
                "createdAt": "2016-08-22T11:48:51.163Z",
                "extraUserData": {
                    "apps": [
                        "rw",
                        "gfw",
                        "prep",
                        "aqueduct",
                        "forest-atlas",
                        "data4sdgs",
                        "gfw-climate",
                        "gfw-pro",
                        "ghg-gdp"
                    ]
                },
                "updatedAt": "2019-12-18T15:59:57.333Z"
            }
        ]
    }

Microservices

A list of information related to the microservices

List all registered microservices

To obtain a list of all the registered microservices:

curl -X GET https://api.resourcewatch.org/api/v1/microservice \
-H "Authorization: Bearer <your-token>"

Example response:

[
    {
        "infoStatus": {
            "numRetries": 0,
            "error": null,
            "lastCheck": "2019-02-04T14:05:30.748Z"
        },
        "pathInfo": "/info",
        "pathLive": "/ping",
        "status": "active",
        "cache": [],
        "uncache": [],
        "tags": [
            "dataset"

        ],
        "_id": "id",
        "name": "Dataset",
        "url": "http://dataset.default.svc.cluster.local:3000",
        "version": 1,
        "endpoints": [
            {
                "redirect": {
                    "method": "GET",
                    "path": "/api/v1/dataset"
                },
                "path": "/v1/dataset",
                "method": "GET"
            },
            {
                "redirect": {
                    "method": "POST",
                    "path": "/api/v1/dataset/find-by-ids"
                },
                "path": "/v1/dataset/find-by-ids",
                "method": "POST"
            }
        ],
        "updatedAt": "2019-01-24T13:04:46.728Z",
        "swagger": "{}"
    },
    {
        "infoStatus": {
            "numRetries": 0,
            "error": null,
            "lastCheck": "2019-02-04T14:05:30.778Z"
        },
        "pathInfo": "/info",
        "pathLive": "/ping",
        "status": "active",
        "cache": [
            "layer"
        ],
        "uncache": [
            "layer",
            "dataset"
        ],
        "tags": [
            "layer"
        ],
        "_id": "5aa667d1aee7ae16fb419c23",
        "name": "Layer",
        "url": "http://layer.default.svc.cluster.local:6000",
        "version": 1,
        "endpoints": [
            {
                "redirect": {
                    "method": "GET",
                    "path": "/api/v1/layer"
                },
                "path": "/v1/layer",
                "method": "GET"
            },
            {
                "redirect": {
                    "method": "POST",
                    "path": "/api/v1/dataset/:dataset/layer"
                },
                "path": "/v1/dataset/:dataset/layer",
                "method": "POST"
            },
            {
                "redirect": {
                    "method": "GET",
                    "path": "/api/v1/dataset/:dataset/layer"
                },
                "path": "/v1/dataset/:dataset/layer",
                "method": "GET"
            }
        ],
        "updatedAt": "2018-11-08T12:07:38.014Z",
        "swagger": "{}"
    }
]

Filters

The microservice list provided by the endpoint can be filtered with the following attributes:

Filter Description Accepted values
status Status of the microservice pending, active or error
url Internal URL of the microservice within the cluster String

Filtering by status

curl -X GET https://api.resourcewatch.org/api/v1/microservice?status=active \
-H "Authorization: Bearer <your-token>"

Get a microservice by id

To obtain the details of a single microservice, use:

curl -X GET https://api.resourcewatch.org/api/v1/microservice/5aa667d1aee7ae16fb419c23 \
-H "Authorization: Bearer <your-token>"

Example response:

{
  "data": {
    "id": "5aa667d1aee7ae16fb419c23",
    "infoStatus": {
        "numRetries": 0,
        "error": null,
        "lastCheck": "2019-02-04T14:05:30.778Z"
    },
    "pathInfo": "/info",
    "pathLive": "/ping",
    "status": "active",
    "cache": [
        "layer"
    ],
    "uncache": [
        "layer",
        "dataset"
    ],
    "tags": [
        "layer"
    ],
    "name": "Layer",
    "url": "http://layer.default.svc.cluster.local:6000",
    "version": 1,
    "endpoints": [
        {
            "redirect": {
                "method": "GET",
                "path": "/api/v1/layer"
            },
            "path": "/v1/layer",
            "method": "GET"
        },
        {
            "redirect": {
                "method": "POST",
                "path": "/api/v1/dataset/:dataset/layer"
            },
            "path": "/v1/dataset/:dataset/layer",
            "method": "POST"
        },
        {
            "redirect": {
                "method": "GET",
                "path": "/api/v1/dataset/:dataset/layer"
            },
            "path": "/v1/dataset/:dataset/layer",
            "method": "GET"
        }
    ],
    "updatedAt": "2018-11-08T12:07:38.014Z",
    "swagger": "{}"
   }
}

Delete microservice

To remove a microservice:

curl -X DELETE https://api.resourcewatch.org/api/v1/microservice/:id \
-H "Authorization: Bearer <your-token>"

This will delete the microservice and its associated endpoints from the gateway’s database. It does not remove the actual running microservice application instance, which may re-register and become available once again.

Subscriptions

When communicating with the Subscriptions microservice from other microservices, you have access to special actions that are not available when using the public API. This section concerns subscriptions endpoints that offer special functionality when handling requests from other microservices.

Creating a subscription for another user

Creating a subscription for user with ID 123 - only works when called by other MS!

curl -X POST https://api.resourcewatch.org/v1/subscriptions \
-H "Authorization: Bearer <your-token>" \
-H "Content-Type: application/json"  -d \
 '{
    "name": "<name>",
    "datasets": ["<dataset>"],
    "params": { "geostore": "35a6d982388ee5c4e141c2bceac3fb72" },
    "datasetsQuery": [
        {
            "id": ":subscription_dataset_id",
            "type": "test_subscription",
            "threshold": 1
        }
    ],
    "application": "rw",
    "language": "en",
    "env": <environment>,
    "resource": { "type": "EMAIL", "content": "email@address.com" },
    "userId": "123"
}'

You can create a subscription for another user by providing the user id in the body of the request.

This can only be done when performing requests from another microservice.

Field Description Type Required
userId Id of the owner of the subscription - if not provided, it’s set as the id of the user in the token. String No

Updating a subscription for another user

If the request comes from another microservice, then it is possible to modify subscriptions belonging to other users. Otherwise, you can only modify subscriptions if you are the owner of the subscription.

The following fields are available to be provided when modifying a subscription:

Field Description Type Required
userId Id of the owner of the subscription - if not provided, it's set as the id of the user in the token. String No

Finding subscriptions by ids

curl -X POST https://api.resourcewatch.org/v1/subscriptions/find-by-ids \
-H "Authorization: Bearer <your-token>" \
-H "Content-Type: application/json"  -d \
 '{ "ids": ["5e4d273dce77c53768bc24f9"] }'

Example response:


{
    "data": [
        {
            "type": "subscription",
            "id": "5e4d273dce77c53768bc24f9",
            "attributes": {
                "createdAt": "2020-02-19T12:17:01.176Z",
                "userId": "5e2f0eaf9de40a6c87dd9b7d",
                "resource": {
                    "type": "EMAIL",
                    "content": "henrique.pacheco@vizzuality.com"
                },
                "datasets": [
                    "20cc5eca-8c63-4c41-8e8e-134dcf1e6d76"
                ],
                "params": {},
                "confirmed": false,
                "language": "en",
                "datasetsQuery": [
                    {
                        "threshold": 1,
                        "lastSentDate": "2020-02-19T12:17:01.175Z",
                        "historical": [],
                        "id": "20cc5eca-8c63-4c41-8e8e-134dcf1e6d76",
                        "type": "COUNT"
                    }
                ],
                "env": "production"
            }
        }
    ]
}

You can find a set of subscriptions given their ids using the following endpoint.

Finding subscriptions for a given user

curl -X POST https://api.resourcewatch.org/v1/subscriptions/user/5e2f0eaf9de40a6c87dd9b7d \
-H "Authorization: Bearer <your-token>"

Example response:


{
    "data": [
        {
            "type": "subscription",
            "id": "5e4d273dce77c53768bc24f9",
            "attributes": {
                "createdAt": "2020-02-19T12:17:01.176Z",
                "userId": "5e2f0eaf9de40a6c87dd9b7d",
                "resource": {
                    "type": "EMAIL",
                    "content": "henrique.pacheco@vizzuality.com"
                },
                "datasets": [
                    "20cc5eca-8c63-4c41-8e8e-134dcf1e6d76"
                ],
                "params": {},
                "confirmed": false,
                "language": "en",
                "datasetsQuery": [
                    {
                        "threshold": 1,
                        "lastSentDate": "2020-02-19T12:17:01.175Z",
                        "historical": [],
                        "id": "20cc5eca-8c63-4c41-8e8e-134dcf1e6d76",
                        "type": "COUNT"
                    }
                ],
                "env": "production"
            }
        }
    ]
}

You can find all the subscriptions associated with a given user id using the following endpoint.

This endpoint supports the following optional query parameters as filters:

Field Description Type
application Application to which the subscription is associated. String
env Environment to which the subscription is associated. String

Finding all subscriptions

curl -X GET https://api.resourcewatch.org/v1/subscriptions/find-all \
-H "Authorization: Bearer <your-token>"

Example response:

{
    "data": [
        {
            "type": "subscription",
            "id": "57bc7f9bb67c5da7720babc3",
            "attributes": {
                "name": null,
                "createdAt": "2019-10-09T06:17:54.098Z",
                "userId": "57bc2631f077ce98007988f9",
                "resource": {
                    "type": "EMAIL",
                    "content": "your.email@resourcewatch.org"
                },
                "datasets": [
                    "umd-loss-gain"
                ],
                "params": {
                    "geostore": "d3015d189631c8e2acddda9a547260c4"
                },
                "confirmed": true,
                "language": "en",
                "datasetsQuery": [],
                "env": "production"
            }
        }
    ],
    "links": {
        "self": "https://api.resourcewatch.org/v1/subscriptions/find-all?page[number]=1&page[size]=10",
        "first": "https://api.resourcewatch.org/v1/subscriptions/find-all?page[number]=1&page[size]=10",
        "last": "https://api.resourcewatch.org/v1/subscriptions/find-all?page[number]=1&page[size]=10",
        "prev": "https://api.resourcewatch.org/v1/subscriptions/find-all?page[number]=1&page[size]=10",
        "next": "https://api.resourcewatch.org/v1/subscriptions/find-all?page[number]=1&page[size]=10"
    },
    "meta": {
        "total-pages": 1,
        "total-items": 1,
        "size": 10
    }
}

You can find all the subscriptions using the following endpoint.

This endpoint supports the following optional query parameters as filters:

Field Description Type Example
application Application to which the subscription is associated. String ‘rw’
env Environment to which the subscription is associated. String 'production’
updatedAtSince Filter returned subscriptions by the updatedAt date being after the date provided. Should be a valid ISO date string. String '2020-03-25T09:16:22.068Z'
updatedAtUntil Filter returned subscriptions by the updatedAt date being before the date provided. Should be a valid ISO date string. String '2020-03-25T09:16:22.068Z'
page[size] The number of elements per page. The maximum allowed value is 100 and the default value is 10. Number 10
page[number] The page to fetch. Defaults to 1. Number 1

Microservice reference

This document should give developers a bird’s eye view of existing microservices, their status and resources, organized by namespace.

Core

Name URL Travis Status Code Coverage
arcgis Github Build Status Test Coverage
bigquery Github Build Status Test Coverage
carto Github Build Status Test Coverage
control-tower Github Build Status Test Coverage
converter Github Build Status Test Coverage
dataset Github Build Status Test Coverage
doc-executor Github Build Status Test Coverage
doc-orchestrator Github Build Status Test Coverage
doc-writer Github Build Status Test Coverage
document Github Build Status Test Coverage
gee Github
gee-tiles Github
geostore Github Build Status Test Coverage
graph-client Github Build Status Test Coverage
layer Github Build Status Test Coverage
metadata Github Build Status Test Coverage
mail Github Build Status Test Coverage
query Github Build Status Test Coverage
rw-lp Github
task-async Github
vocabulary Github Build Status Test Coverage
webshot Github Build Status Test Coverage
widget Github Build Status Test Coverage

GFW

Name URL Travis Status Code Coverage
analysis-gee Github
arcgis-proxy Github Build Status Test Coverage
area Github Build Status Test Coverage
forest-change Github Build Status Test Coverage
gfw-forma Github Build Status Test Coverage
gfw-guira Github Build Status Test Coverage
gfw-ogr Github Build Status Test Coverage
gfw-prodes Github Build Status Test Coverage
gfw-umd Github Build Status Test Coverage
gfw-user Github Build Status Test Coverage
gs-pro-config Github Build Status Test Coverage
glad-analysis-athena Github Build Status Test Coverage
high-res Github Build Status Test Coverage
imazon Github Build Status Test Coverage
quicc Github Build Status Test Coverage
story Github Build Status Test Coverage
subscriptions Github Build Status Test Coverage
true-color-tiles Github Build Status Test Coverage
viirs-fires Github Build Status Test Coverage

FW

Name URL Travis Status Code Coverage
forest-watcher-api Github Build Status Test Coverage
forms Github Build Status Test Coverage
fw-alerts Github Build Status Test Coverage
fw-contextual-layers Github Build Status Test Coverage
fw-teams Github Build Status Test Coverage

Aqueduct

Name URL Travis Status Code Coverage
aqueduct-analysis Github Build Status Test Coverage

PREP

Name URL Travis Status Code Coverage
nexgddp Github Build Status Test Coverage
prep-api Github Build Status Test Coverage
prep-app Github
prep-manager Github
proxy Github Build Status Test Coverage

Climate Watch

Name URL Travis Status Code Coverage
Climate Watch Flagship Github
Climate Watch India Platform Github
Climate Watch Indonesia Platform Github
Climate Watch South Africa Platform Github
Climate Watch: Emissions Scenario Portal Github

RW

Name URL Travis Status Code Coverage
resource-watch-manager Github Build Status Test Coverage

API Smoke Tests

This chapter covers the existing API Smoke Tests, including instructions on how to manage existing tests and create new ones.

The API Smoke Tests are implemented using Canaries provided by AWS Synthetics (docs here).

Template for smoke tests

Template for an AWS Synthetics Canary

const synthetics = require('Synthetics');
const log = require('SyntheticsLogger');
const AWS = require('aws-sdk');
const https = require('https');
const http = require('http');

const apiCanaryBlueprint = async function () {

  const verifyRequest = async function (requestOption, body = null) {
    return new Promise((resolve, reject) => {
      // Prep request
      log.info("Making request with options: " + JSON.stringify(requestOption));
      let req = (requestOption.port === 443) ? https.request(requestOption) : http.request(requestOption);

      // POST body data
      if (body) { req.write(JSON.stringify(body)); }

      // Handle response
      req.on('response', (res) => {
        log.info(`Status Code: ${res.statusCode}`)

        // Assert the status code returned
        if (res.statusCode !== 200) {
          reject("Failed: " + requestOption.path + " with status code " + res.statusCode);
        }

        // Grab body chunks and piece returned body together
        let body = '';
        res.on('data', (chunk) => { body += chunk.toString(); });

        // Resolve providing the returned body
        res.on('end', () => resolve(JSON.parse(body)));
      });

      // Reject on error
      req.on('error', (error) => reject(error));
      req.end();
    });
  }

  // Build request options
  let requestOptions = {
    hostname: "api.resourcewatch.org",
    method: "GET",
    path: "/v1/dataset",
    port: 443,
    headers: {
      'User-Agent': synthetics.getCanaryUserAgentString(),
      'Content-Type': 'application/json',
    },
  };

  // Find and use secret for auth token
  const secretsManager = new AWS.SecretsManager();
  await secretsManager.getSecretValue({ SecretId: "gfw-api/token" }, function(err, data) {
    if (err) log.info(err, err.stack);
    log.info(data);
    requestOptions.headers['Authorization'] = "Bearer " + JSON.parse(data["SecretString"])["token"];
  }).promise();

  // Find and use secret for hostname
  await secretsManager.getSecretValue({ SecretId: "wri-api/smoke-tests-host" }, function(err, data) {
    if (err) log.info(err, err.stack);
    log.info(data);
    requestOptions.hostname = JSON.parse(data["SecretString"])["smoke-tests-host"];
  }).promise();

  const body = await verifyRequest(requestOptions);
  const id = body.data[0].id;

  // Change needed request options
  requestOptions.method = "GET";
  requestOptions.path = "/v1/dataset/"+id;

  // Make second request
  await verifyRequest(requestOptions);
};

exports.handler = async () => {
  return await apiCanaryBlueprint();
};

New tests should be based on the template displayed on the side, in order to take advantage of the configurations already in place.

Tests can execute multiple requests, but please minimize the number of interactions with databases to avoid creating junk data (for this reason, smoke testing POST, PATCH and DELETE endpoints is not recommended).

Another thing to notice is the usage of AWS secrets for storing a token to execute the request (gfw-api/token), as well as the hostname where the test will be executed (wri-api/smoke-tests-host).

The template on the side executes a GET request to /v1/dataset, grabs the first ID in the response data and executes a second GET request to the /v1/dataset/:id endpoint.

The test will pass if there are no exceptions thrown or promise rejections during the execution of the test. For the example on the side, the test will fail if any of the requests performed returns a status code that is not 200.
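The failure condition used by the template can be reduced to a small status-code assertion (`assertStatus` is a hypothetical name, shown only to illustrate the pass/fail semantics):

```javascript
// Hypothetical distillation of the template's failure condition: any
// non-200 status throws, which rejects the handler's promise and fails
// the canary run.
function assertStatus(statusCode, path) {
  if (statusCode !== 200) {
    throw new Error(`Failed: ${path} with status code ${statusCode}`);
  }
}

assertStatus(200, '/v1/dataset'); // passes silently
// assertStatus(503, '/v1/dataset'); // would throw, failing the canary
```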

Things to pay attention to

Use a user to run the tests

Please ensure that all tests are run using a token for a user that was specifically created for running the tests. Also, it goes without saying: please don't share either the token or the credentials of the user running the tests with anyone.

Always configure alarms for the smoke tests

Smoke tests are created without an associated alarm by default. When managing or creating smoke tests, please ensure that each test has a unique alarm associated with it.

Also, please ensure that the created alarm has an action defined to notify someone in case of failure of a test.