CDP organizes digital information about the physical world. Assets are digital representations of physical objects or groups of objects, and assets are organized into an asset hierarchy. For example, an asset can represent a water pump which is part of a subsystem on an oil platform.
Assets are used to connect related data together, even if the data comes from different sources: time series of data points, events, and files are all connected to one or more assets. The pump asset can be connected to a time series measuring pressure within the pump, as well as events recording maintenance operations, and a file with a 3D diagram of the pump.
At the top of an asset hierarchy is a root asset (e.g., the oil platform). Each project can have multiple root assets. Every asset has a name, and every asset except a root asset has a parent asset. No two assets with the same parent can have the same name.
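The hierarchy rules above can be illustrated with a small in-memory sketch. The `Asset` class and its methods are hypothetical names chosen for illustration and are not part of the CDP API; the sketch only models a root asset without a parent and the rule that siblings cannot share a name.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(eq=False)
class Asset:
    """Hypothetical in-memory asset; not the CDP data model."""
    name: str
    parent: Optional["Asset"] = None
    children: Dict[str, "Asset"] = field(default_factory=dict)

    def add_child(self, name: str) -> "Asset":
        # Assets that share a parent must have distinct names.
        if name in self.children:
            raise ValueError(f"{self.name!r} already has a child named {name!r}")
        child = Asset(name=name, parent=self)
        self.children[name] = child
        return child

    def path(self) -> str:
        # Walk up to the root asset to build a readable path.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))


platform = Asset("Oil platform")  # root asset: it has no parent
subsystem = platform.add_child("Cooling subsystem")
pump = subsystem.add_child("Water pump")
print(pump.path())
```

Adding a second "Water pump" under the same subsystem raises a `ValueError`, while the same name under a different parent is accepted.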
A time series consists of a sequence of data points connected to a single asset.
For example, a water pump asset can have a temperature time series that records a data point in units of °C every second.
A single asset can have several time series. The water pump could have additional time series measuring pressure within the pump, rpm, flow volume, power consumption, and more.
Time series store data points as either numbers or strings. This is controlled by the is_string flag on the time series object. Numerical data points can be aggregated before they are returned from a query (e.g., to find the average temperature for a day). String data points, on the other hand, cannot be aggregated by CDP, but can store arbitrary information like states (e.g., “open”/“closed”) or more complex information (JSON).
Cognite stores discrete data points, but the underlying process measured by the data points can vary continuously. When interpolating between data points, we can either assume that each value stays the same until the next measurement, or that it changes linearly between the two measurements. This is controlled by the is_step flag on the time series object. For example, if we estimate the average over a time period containing two data points, the average will either be close to the first value (is_step is set) or close to the mean of the two (is_step is not set).
Deprecation warning: In the future, CDP will phase out name as a unique identifier for time series and instead use externalId as the primary key. Time series names must currently be unique across all time series in the same project. In version 0.6 of CDP, time series names will no longer need to be unique.
A data point stores a single piece of information, a number or a string, associated with a specific time. Data points are identified by their timestamps, measured in milliseconds since the Unix epoch (00:00 UTC, January 1, 1970). Milliseconds are the finest time resolution supported by CDP, i.e., fractional milliseconds are not supported. Leap seconds are not counted.
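As a worked illustration of the two interpolation modes, the sketch below computes a time-weighted average over a list of data points. The function name and the `(timestamp_ms, value)` tuple layout are assumptions made for this example; it is not CDP's actual aggregation code.

```python
def time_weighted_average(points, is_step):
    """Average of a signal defined by (timestamp_ms, value) pairs.

    points must be sorted by timestamp and contain at least two entries.
    """
    total = 0.0
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        duration = t1 - t0
        if is_step:
            # Step: the value stays at v0 until the next measurement.
            total += v0 * duration
        else:
            # Linear: the value moves in a straight line from v0 to v1.
            total += (v0 + v1) / 2 * duration
    return total / (points[-1][0] - points[0][0])


points = [(0, 10.0), (1000, 20.0)]
print(time_weighted_average(points, is_step=True))   # 10.0
print(time_weighted_average(points, is_step=False))  # 15.0
```

With two data points, the step average equals the first value and the linear average equals the mean of the two, matching the description above.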
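A minimal sketch of converting between Python datetimes and millisecond timestamps, using only the standard library. The helper names `to_epoch_ms` and `from_epoch_ms` are hypothetical; anything finer than a millisecond is dropped, since finer resolution is not supported.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt):
    """Whole milliseconds since the Unix epoch for a timezone-aware datetime."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Floor division by one millisecond discards any sub-millisecond part.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_epoch_ms(ms):
    """UTC datetime for a millisecond timestamp."""
    return EPOCH + timedelta(milliseconds=ms)


new_year = datetime(2018, 1, 1, tzinfo=timezone.utc)
print(to_epoch_ms(new_year))         # 1514764800000
print(from_epoch_ms(1514764800000))  # 2018-01-01 00:00:00+00:00
```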
Numerical data points can be aggregated before they are retrieved from CDP. This allows for faster queries by reducing the amount of data transferred. You can aggregate data points by specifying one or more aggregates (e.g. average, minimum, maximum) as well as the time granularity over which the aggregates should be applied (e.g. “1h” for one hour).
Aggregates are aligned to the start time modulo the granularity unit. For example, if you ask for daily average temperatures since Monday afternoon last week, the first aggregated data point will contain the average for Monday, the second for Tuesday, etc. Determining aggregate alignment without considering data point timestamps allows CDP to pre-calculate aggregates (e.g., to quickly return daily average temperatures for a year). As a consequence, aggregating over 60 minutes can return a different result than aggregating over 1 hour, because the two queries are aligned differently: the first to a whole minute, the second to a whole hour.
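The alignment rule can be made concrete with a short sketch. This is one reading of the rule described above, not CDP's implementation: the start time is rounded down to a whole granularity unit (minute, hour, and so on), and buckets then advance by the full granularity. The unit letters in `UNIT_MS`, the "60m" spelling, and the function names are assumptions for illustration.

```python
# Milliseconds per granularity unit (the unit letters are assumptions).
UNIT_MS = {"s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def parse_granularity(granularity):
    """Split a string such as "60m" into (60, "m")."""
    return int(granularity[:-1]), granularity[-1]


def bucket_starts(start_ms, end_ms, granularity):
    """Start timestamps of the aggregate buckets covering [start_ms, end_ms)."""
    multiple, unit = parse_granularity(granularity)
    unit_ms = UNIT_MS[unit]
    # Align to the unit (minute, hour, ...), not to the full granularity.
    first = start_ms - start_ms % unit_ms
    step = multiple * unit_ms
    return list(range(first, end_ms, step))


start = 10 * 3_600_000 + 20 * 60_000 + 30 * 1_000  # 10:20:30 on 1970-01-01 UTC
end = start + 2 * 3_600_000                         # two hours later

print(bucket_starts(start, end, "60m"))  # buckets start at 10:20, 11:20, 12:20
print(bucket_starts(start, end, "1h"))   # buckets start at 10:00, 11:00, 12:00
```

Both queries use one-hour buckets, but the bucket boundaries differ by 20 minutes, so the aggregated values can differ as well.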
Event objects store complex information about multiple assets over a time period. For example, an event can describe two hours of maintenance on a water pump and some associated pipes, or a future time window where the pump is scheduled for inspection. This is in contrast with data points in time series that store single pieces of information about one asset at specific points in time (e.g., temperature measurements).
An event’s time period is defined by a start time and an end time, both millisecond timestamps since the Unix epoch. The timestamps can be in the future. Events can also be categorized by a type (e.g., “fault”) and a subtype (e.g., “electrical”), both arbitrary strings defined when creating the event. In addition, events can have a text description as well as arbitrary metadata and properties.
A file stores a sequence of bytes connected to one or more assets. For example, a file can contain a piping and instrumentation diagram (P&ID) showing how multiple assets are connected.
Each file is identified by a unique ID that is generated when it is created, as well as a name and a directory path. File names are limited to 256 bytes, and directory paths to 512 bytes. The combination of file name and directory path must be unique within a project.
Directories in CDP differ from those in normal file systems: they exist only as string attributes on individual file objects. This means that directories themselves cannot be created, deleted, or moved. There is no particular path separator, and no notion of a directory hierarchy.
Files are created in two steps: first the metadata is stored in a file object, and then the file contents are uploaded. This means that files can exist in a non-uploaded state.
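As a rough sketch, an event can be pictured as a record like the one below. The class and field names are hypothetical and do not reflect the exact CDP schema; treating the end time as inclusive is also an assumption made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    """Hypothetical event record; field names are illustrative only."""
    start_time: int            # milliseconds since the Unix epoch
    end_time: int              # milliseconds since the Unix epoch
    asset_ids: List[int]       # an event can concern several assets
    type: str = ""
    subtype: str = ""
    description: str = ""
    metadata: Dict[str, str] = field(default_factory=dict)

    def covers(self, timestamp_ms: int) -> bool:
        # Assumption: both ends of the time period are inclusive.
        return self.start_time <= timestamp_ms <= self.end_time


maintenance = Event(
    start_time=1_514_764_800_000,
    end_time=1_514_764_800_000 + 2 * 3_600_000,  # two hours later
    asset_ids=[101, 102],                        # the pump and a pipe
    type="maintenance",
    subtype="scheduled",
    description="Replaced seals on the water pump",
)
print(maintenance.covers(1_514_764_800_000 + 3_600_000))
```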
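To illustrate what it means for directories to be plain strings, the sketch below filters a list of file records by exact directory match and checks the uniqueness rule for name and directory path. The record layout and function names are hypothetical.

```python
files = [
    {"id": 1, "name": "pump.pdf", "directory": "/drawings/pumps"},
    {"id": 2, "name": "pump.pdf", "directory": "/drawings"},
    {"id": 3, "name": "valve.pdf", "directory": "/drawings"},
]


def files_in_directory(files, directory):
    # Exact string match: "/drawings" does not include "/drawings/pumps",
    # because there is no directory hierarchy.
    return [f["name"] for f in files if f["directory"] == directory]


def is_available(files, name, directory):
    # The (name, directory) pair must be unique within a project.
    return all((f["name"], f["directory"]) != (name, directory) for f in files)


print(files_in_directory(files, "/drawings"))
print(is_available(files, "pump.pdf", "/drawings"))
print(is_available(files, "pump.pdf", "/manuals"))
```

The same file name appears twice in the list without conflict because the directory strings differ.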
Cursors and pagination
When fetching data from the Cognite API, the results are wrapped in one of two data types. The only difference between them is that one, DataWithCursor, also has cursors that you can use to navigate through pages of results. A cursor is a random string that can be copied and sent with subsequent requests to move from one page to the next.
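A typical way to consume cursor-based pages is a loop that keeps passing the last cursor back until none is returned. The sketch below simulates this against an in-memory list; `fetch_page`, the `items` and `nextCursor` keys, and the cursor format are all assumptions for illustration and do not describe the actual response schema.

```python
RECORDS = ["a", "b", "c", "d", "e"]


def fetch_page(cursor=None, limit=2):
    """Hypothetical stand-in for one API request; returns a single page."""
    offset = 0 if cursor is None else int(cursor.split(":")[1])
    next_offset = offset + limit
    # A real cursor is opaque; clients should never parse or construct one.
    next_cursor = f"offset:{next_offset}" if next_offset < len(RECORDS) else None
    return {"items": RECORDS[offset:next_offset], "nextCursor": next_cursor}


def fetch_all(fetch):
    """Follow cursors until the last page has been read."""
    items, cursor = [], None
    while True:
        page = fetch(cursor)
        items.extend(page["items"])
        cursor = page["nextCursor"]
        if cursor is None:
            return items


print(fetch_all(fetch_page))  # ['a', 'b', 'c', 'd', 'e']
```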
Requests to the CDP API must be authenticated. Users and services authenticate differently.
Users authenticate by presenting a token obtained from the identity provider configured for the project. This enables users to authenticate using their existing identities, which are managed by the user's organization.
Services authenticate by presenting an API key. The API key is a secret string that grants access to a project when making requests to the API. Each API key connects exactly one service to one project. A single service can have multiple API keys for the same project.
3D models and revisions
The Cognite platform uses 3D models of physical assets to give data a visual and geometrical context. For example, we can connect a pump asset with a 3D model of the plant floor where it is placed. Seeing asset data rendered in 3D enables you to quickly find the sensor data you are interested in.
3D data is organized into models and revisions. A model is just a placeholder for a set of revisions, or versions. Revisions contain the actual 3D data. For example, you can have a model named Compressor and upload a revision under that model. When you create a revision, you need to attach a 3D file. For each new version of the 3D model, you upload a new revision under the same model. A revision can be published or unpublished, a status that applications use to decide whether to list the revision. Multiple revisions can be published at the same time, since they do not necessarily represent the evolution of the 3D model over time, but can instead be different versions (e.g., high detail vs. low detail).
When you upload a new revision, Cognite needs to process the 3D data to optimize it for our renderer. Depending on the complexity of the 3D file, this can take some time. A revision has a processing status (for example, Failed), which can be tracked during processing.
3D data is typically built up as a hierarchical structure. This is very similar to how we organize our internal asset hierarchy. Each 3D node is assigned a random ID, nodeId (uint64). If a user clicks on an object on the screen, the application can get a callback containing the nodeId of the clicked object. We support endpoints to extract the full 3D node hierarchy, and endpoints to create mappings between 3D nodes and nodes in Cognite's asset hierarchy. You can then use the nodeId to connect the 3D data to asset information such as metadata and time series.
We also deliver a web-based 3D viewer to embed the 3D model in your own web page.
Projects are used to separate customers from one another, and all objects in CDP are linked to a project. A customer usually has only one project. The project object contains configuration for how to authenticate users.
Automatically assigned object IDs are unique only within each project.