A big part of what SolarNetwork does is collect and store data from all sorts of devices and services. The type of data collected varies quite a bit, so SolarNetwork has a flexible design for storing it, which we call datum.
A datum is a collection of properties tied to a specific timestamp and object and source. An object is either a SolarNode device (a node) or a physical location (such as a building address or GPS coordinates). A source is a user-assigned name.
For example, a node datum collected from a power meter might have properties like this:
Property | Value |
---|---|
ts |
2021-01-01 12:00:00+13 |
node_id |
123 |
source_id |
/meter/1 |
current |
9.733 |
voltage |
239.5 |
watts |
2331 |
wattHours |
340983098 |
status |
OK |
Until now, SolarNetwork stored this data as a set of JSON objects, based on a simple classification of instantaneous vs accumulating vs status property types. Along with columns for the timestamp, object ID, and source ID of the datum, the database table had JSON columns for each set of properties, sort of like this:
Column | Datatype | Value |
---|---|---|
ts |
timestamp | 2021-01-01 12:00:00+13 |
node_id |
bigint | 123 |
source_id |
text | /meter/1 |
jdata_i |
JSON | {"current":9.733,"voltage":239.5,"watts":2331} |
jdata_a |
JSON | {"wattHours":340983098} |
jdata_s |
JSON | {"status":OK} |
Additionally, users are allowed to store arbitrary metadata for object + source ID pairs, in the form of a JSON object. In the old database design a separate table holds this information, using the object ID and source ID pair as its primary key. A row in the database table for node source metadata might look like this:
Column | Datatype | Value |
---|---|---|
node_id |
bigint | 123 |
source_id |
text | /meter/1 |
jdata |
JSON | {"description":"Power meter for house supply."} |
Although quite flexible (and having served SolarNetwork well for many years), the database design had some notable drawbacks:
To overcome these design problems, we’ve developed a new database design that preserves the general flexibility of how datum are stored but solves the design issues. We introduced the concept of a datum stream, with these changes:
A diagram makes this easier to take in:
The new da_datm
table has a primary key defined by a stream ID and timestamp. The datum
property values are stored as numeric[]
and text[]
array columns.
The new da_datm_meta
and da_loc_datm_meta
tables associate stream IDs to node/location ID and
source ID pairs. These tables hold the datum property names for each property classification in
text[]
array columns (the names_*
columns). They also hold user-defined JSON metadata for each
stream in the jdata
column, formerly stored in the da_meta
and da_loc_meta
tables.
Using the same example datum as shown earlier, in the new database design there would be one row
in the da_datm_meta
table like this:
Column | Datatype | Value |
---|---|---|
stream_id |
UUID | a4840264-2b5b-4673-83f7-66a404d6faf0 |
node_id |
bigint | 123 |
source_id |
text | /meter/1 |
names_i |
text[] | [current,voltage,watts] |
names_a |
text[] | [wattHours] |
names_s |
text[] | [status] |
Then there would be one row in the da_datm
table like this:
Column | Datatype | Value |
---|---|---|
stream_id |
UUID | a4840264-2b5b-4673-83f7-66a404d6faf0 |
ts |
timestamp | 2021-01-01 12:00:00+13 |
data_i |
numeric[] | [9.733,239.5,2331] |
data_a |
numeric[] | [340983098] |
data_s |
text[] | [OK] |
The new design solves the first problem of renaming source IDs quite nicely. Now a source ID is
just a bit of metadata for a datum stream: updating a single row in the da_datm_meta
table is all
it takes to update the source ID for an entire stream. In fact, object IDs are now essentially
another bit of metadata for a datum stream, and could be easily updated if needed.
The new design solves the second problem of duplicate property names by turning the property names
into metadata on a datum stream as well. As more and more datum are collected for a given stream,
there will still be only one row in the da_datm_meta
table. SolarNetwork is also flexible
enough that property names can be added over time to a given stream, and NULL
(empty) values
are allowed (and stored efficiently) if a datum doesn’t always contain values for all properties.
Our charitable aim is to protect the environment through supporting energy education and using technology to help people understand energy conservation and renewable energy.
Take control of your energy usage and costs with our powerful platform.