
Architecture of OSH v2

Alex Robin edited this page Oct 7, 2019 · 7 revisions

General Architecture

This page discusses API and architecture changes introduced in OSH v2.

Decoupling

Several improvements have been introduced in OSH v2 to reduce coupling between components. Among other things, these changes will allow much more scalable deployments of OSH in private or public cloud infrastructure.

Module API vs Procedure API

OSH v2 separates the concept of a Module from the Entities it implements (such as sensor drivers).

In OSH v1.x, there was a one-to-one relationship between a Module and a Sensor driver (or, more generally, a Procedure). A Sensor driver could therefore only be implemented as a Module, with one driver per module, and each instance was referred to by its module ID (for example, in the config of another module). As a result, a single module could not expose several procedures, and references between components were tied to module IDs rather than to the procedures themselves.

In OSH v2, a single Module can now implement many different Entities, including several Procedures but also a mix of Entities if desired (for example, a Module providing access to both real-time and historical data from a given source can now register both interfaces at once). Procedures are now always referred to using their unique ID (or URI, as provided in their SensorML description).

To this end, new interfaces for sensor drivers that don't inherit from IModule have been introduced. These new interfaces are rooted in the IProcedureWithState interface, which represents any procedure registered on the hub. In particular, sensor drivers and processes extend the IDataProducer interface, which itself extends IProcedureWithState.
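The relationship between these interfaces can be pictured with a minimal sketch. Only the interface names IModule, IProcedureWithState and IDataProducer come from the text above; every method name and both concrete classes are hypothetical and are not taken from osh-core.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the interfaces named above.
// All method names here are assumptions made for illustration.
interface IModule {
    String getLocalID();
}

interface IProcedureWithState {
    String getUniqueIdentifier(); // e.g. the URI from the SensorML description
}

interface IDataProducer extends IProcedureWithState {
    List<String> getOutputNames();
}

// A driver is a plain data producer; it no longer has to be a module.
class WeatherSensor implements IDataProducer {
    private final String uid;

    WeatherSensor(String uid) { this.uid = uid; }

    @Override public String getUniqueIdentifier() { return uid; }
    @Override public List<String> getOutputNames() { return List.of("weather"); }
}

// One module can host several procedures at once.
class WeatherNetworkModule implements IModule {
    private final List<IProcedureWithState> procedures = new ArrayList<>();

    @Override public String getLocalID() { return "weather-network"; }

    void addProcedure(IProcedureWithState procedure) { procedures.add(procedure); }

    List<IProcedureWithState> getProcedures() { return procedures; }
}
```

In this sketch, other components would refer to each sensor by the value of getUniqueIdentifier() rather than by the ID of the hosting module.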

Procedure Registry

The concept of Procedure Registry was introduced to provide access to information regarding all Sensors and Processes (more generally Procedures) registered on the hub.

The main goal is to reduce the need for direct connections between Modules. For example, a Service Module that needs to retrieve information from a Sensor would now retrieve it from the Procedure Registry rather than directly from the sensor module.

The IProcedureRegistry interface provides this functionality, and only one instance can exist on a given sensor hub. The default implementation provided in osh-core works by creating Procedure Shadows that expose the same interface as the actual Procedures they represent, but maintain the last procedure state independently of the lifecycle of the source module. The latest information about a procedure is thus accessible to all other modules at any time, even if the sensor driver has been turned off. This also provides a generic mechanism to persist procedure state across hub restarts (optional, depending on the actual registry implementation used).

The new Procedure Registry, along with the new Event Bus architecture, allows discovery of, and access to, the information and data generated by all Procedures registered on the hub.
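The shadow idea can be illustrated with a toy registry. This is a sketch under stated assumptions, not the osh-core implementation: the ProcedureState and ShadowRegistry classes and all of their methods are hypothetical, and the state is kept in memory only.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical snapshot of a procedure's last known state.
final class ProcedureState {
    final String uid;
    final String description;
    final boolean enabled;

    ProcedureState(String uid, String description, boolean enabled) {
        this.uid = uid;
        this.description = description;
        this.enabled = enabled;
    }
}

// Toy registry that keeps one shadow per procedure, keyed by unique ID.
class ShadowRegistry {
    private final Map<String, ProcedureState> shadows = new ConcurrentHashMap<>();

    // Called when a driver registers or reports a new state.
    void update(String uid, String description) {
        shadows.put(uid, new ProcedureState(uid, description, true));
    }

    // Called when the driver stops: the shadow is kept and only flagged.
    void markDisabled(String uid) {
        shadows.computeIfPresent(uid,
            (key, state) -> new ProcedureState(key, state.description, false));
    }

    // Other modules read the shadow, never the driver itself.
    Optional<ProcedureState> get(String uid) {
        return Optional.ofNullable(shadows.get(uid));
    }
}
```

Because readers only ever see the shadow, the last description remains available after markDisabled is called, which mirrors the behavior described above for drivers that have been turned off.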

Database Registry

Similar to the Procedure Registry, a Database Registry has also been introduced to provide federated access to all historical procedure data, including observations, features of interest and procedure description history (i.e. history of SensorML descriptions), from a single place.

One important feature is that the Database Registry provides read-only access to all historical data in a federated manner whether data is stored in a single or multiple databases. However, writing is done by connecting directly to the database.

A new, more generic, DataStore API was also introduced to replace the old Persistence API and will be discussed later.

Procedure API / Sensor API / Process API

As described above, these APIs have been separated from the Module API to improve flexibility.

Specific APIs for "multi-source" producers have been simplified. Now, any data producer can generate data for multiple FOIs by simply inserting the FOI unique ID in the corresponding data event.
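As a rough illustration of tagging events with an FOI, the sketch below shows one producer emitting records for two different features of interest. The DataEvent and RiverGaugeProducer classes, their fields and the example URNs are all hypothetical; OSH's actual event classes may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event type carrying the FOI unique ID alongside the data.
final class DataEvent {
    final String procedureUID; // producer that generated the record
    final String foiUID;       // feature of interest the record refers to
    final double value;

    DataEvent(String procedureUID, String foiUID, double value) {
        this.procedureUID = procedureUID;
        this.foiUID = foiUID;
        this.value = value;
    }
}

// A single producer reporting measurements for several FOIs.
class RiverGaugeProducer {
    private final String uid;
    private final List<DataEvent> published = new ArrayList<>();

    RiverGaugeProducer(String uid) { this.uid = uid; }

    // Each event is tagged with the unique ID of the FOI it describes.
    DataEvent publish(String foiUID, double value) {
        DataEvent event = new DataEvent(uid, foiUID, value);
        published.add(event);
        return event;
    }

    List<DataEvent> getPublished() { return published; }
}
```

A consumer can then group or filter the events by foiUID without the producer needing a dedicated multi-source API.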

Procedure Registry

All Data Producers and other Procedures that are enabled on the hub must be registered with the Procedure Registry, so they can be discovered by other OSH components.

Note that the Procedure Registry only references live Procedures, i.e. the ones that are currently active on the hub. Procedures for which only historical data is available can only be discovered via the Database Registry, which also acts as a federated database.

DataStore API

  • Storage instances can now be associated with several sensors at once, and even with an entire sensor group

The Federated Database Registry

OSH v2 provides a Database Registry that is a single read-only entry point to query data from all attached databases, in a federated manner.

Federated IDs

Since each database instance assigns its own internal IDs independently of each other, the federated database needs to transform these internal IDs into "federated IDs" that are unique across a given sensor hub.

This design was retained because it provides the most flexibility. In particular it helps with the following aspects:

  • Database connectors can be implemented without a priori knowledge of a federated ID scheme
  • Database connectors can directly use IDs assigned by the underlying database engine, without transformation
  • Database instances can be moved from one hub to another without rewriting the IDs
  • Database clients can still use internal IDs for efficient references across several data stores (i.e. tables), whether within the same database or across databases
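The text does not say how federated IDs are computed. One possible scheme, shown here purely as an assumption, is to give each attached database a small number and fold it into the low digits of the internal ID. The FederatedIds class, the MAX_DATABASES limit and all method names are hypothetical.

```java
// Sketch of one possible federated ID scheme (an assumption, not OSH's):
// federatedId = internalId * MAX_DATABASES + databaseNum
final class FederatedIds {
    // Hypothetical upper bound on databases attached to one hub.
    static final int MAX_DATABASES = 100;

    private FederatedIds() {}

    // Turns a database-local ID into an ID unique across the hub.
    static long toFederated(int databaseNum, long internalId) {
        if (databaseNum < 0 || databaseNum >= MAX_DATABASES)
            throw new IllegalArgumentException("invalid database number");
        if (internalId < 0 || internalId > (Long.MAX_VALUE - databaseNum) / MAX_DATABASES)
            throw new IllegalArgumentException("internal ID out of range");
        return internalId * MAX_DATABASES + databaseNum;
    }

    // Recovers which database a federated ID belongs to.
    static int databaseNum(long federatedId) {
        return (int) (federatedId % MAX_DATABASES);
    }

    // Recovers the ID as stored inside that database.
    static long internalId(long federatedId) {
        return federatedId / MAX_DATABASES;
    }
}
```

With a scheme of this kind the mapping is pure arithmetic, so a database keeps its own internal IDs untouched and can be attached to another hub under a different database number.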

Several other approaches have been investigated:

  1. Only use internal IDs for optimization within a database but don't expose them outside. So we would always use object keys publicly (e.g. for procedures, we would always write a key as URI + valid time). This leads web services/APIs to expose large IDs, usually of type String, and composed of multiple parts (e.g. urn:osh:group01:sensor006@156389412556).

  2. Let services handle the conversion to numerical IDs. The problem with this is that each service would have to maintain its own persistent map to convert numerical IDs to object keys. Indeed, the conversion from object key to numerical ID can usually be implemented with a hash function, but the reverse transformation can only be done with the help of a persistent map.
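The asymmetry described in the second approach can be sketched as follows. The forward direction is a plain hash, while the reverse direction needs a stored table. The KeyIdMap class is hypothetical, an in-memory HashMap stands in for the persistent map, and hash collisions are ignored for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the per-service mapping each service would have to maintain.
class KeyIdMap {
    // Stand-in for a persistent map; a real service would store this on disk.
    private final Map<Long, String> idToKey = new HashMap<>();

    // Forward direction: computable from the key alone with a hash function.
    static long toNumericId(String objectKey) {
        return Integer.toUnsignedLong(objectKey.hashCode());
    }

    // Records the key so the ID can later be resolved back.
    long register(String objectKey) {
        long id = toNumericId(objectKey);
        idToKey.put(id, objectKey);
        return id;
    }

    // Reverse direction: only possible for keys that were registered.
    String lookup(long id) {
        return idToKey.get(id);
    }
}
```

An ID that was never registered resolves to null here, which is exactly why every service would need its own durable copy of this table.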

Dispatching queries

When queries are received by the federated database, they must be dispatched to the proper database instance(s). Depending on the query, this dispatching can be more or less optimized and targeted to the right DB. In the worst case, the query is flooded to all available DB instances.
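A minimal sketch of targeted versus flooded dispatch is shown below. It assumes, purely for illustration, that the owning database number can be recovered from a federated ID as federatedId % 100; the QueryDispatcher class and its methods are hypothetical and do not reflect the actual OSH dispatching logic.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy dispatcher choosing which attached databases should receive a query.
class QueryDispatcher {
    // Assumed ID layout: the low two decimal digits hold the database number.
    static final int MAX_DATABASES = 100;

    // Database number -> database name, kept sorted for stable output.
    private final Map<Integer, String> databases = new TreeMap<>();

    void attach(int databaseNum, String name) {
        databases.put(databaseNum, name);
    }

    // Returns the databases a query should be sent to.
    List<String> targets(Collection<Long> federatedIds) {
        // No ID filter: nothing to narrow on, so flood every database.
        if (federatedIds == null || federatedIds.isEmpty())
            return new ArrayList<>(databases.values());

        // ID filter present: keep only the databases that own those IDs.
        Set<Integer> nums = new TreeSet<>();
        for (long id : federatedIds)
            nums.add((int) (id % MAX_DATABASES));

        List<String> result = new ArrayList<>();
        for (int num : nums) {
            String name = databases.get(num);
            if (name != null)
                result.add(name);
        }
        return result;
    }
}
```

Queries that filter on other criteria, such as time ranges or FOI locations, would need additional per-database metadata to be narrowed in the same way; without it they fall back to the flooding case.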