Device Data and Limitations of Relational Schema

Introduction

Not unlike Network Management Systems (NMS), Data Center Infrastructure Management (DCIM) software also monitors diverse set of equipment. The equipment ranges from server, network switches, Power Distribution Units (PDUs), panels, sensors, Diesel Generator (DG) sets. These devices have different protocols – MODBUS, SNMP, and BACNET. In addition, the parameters, that are monitored, are also different. For example the monitored parameters from a DG set may be output voltage, output power, and output current for all phases. Now in the case of a sensor it may be temperature and relative humidity. The software needs to capture the data from the various devices, keep in persistent store and report/alert on the data. This poses a problem if we want to store in traditional row/column format of relational data base. We will explore the implementation options and the method adopted.

Implementation Options in RDBMS

If we choose to store the monitored data in traditional relational form we have couple of options:

Build a super set of column list from all the monitored devices
If we choose this option then let’s say that we have 3 devices A, B & C and for A the monitored parameters are x, y, for B the monitored parameters are y, z, and for C the monitored parameters are x, z. So if we have a table with columns x, y and z it should suffice. Well in the real world the number of devices can run into hundreds of types with each device having multiple unique parameters. In that case you will see that the number of columns will easily run into few hundreds making the table design unwieldy. Furthermore, when it is populated with data it will be sparse. Of course, every time a new device is added with unique parameter, one will have to add columns to the table making the design untenable.
Have a table per device
This approach is somewhat better than the previous one – in the design add a table, which is unique to the type of device. For example there will be a table for DG set with columns for parameters that are monitored for a DG set, a table for a sensor with temperature and relative humidity as columns, so on and so forth. It sounds logical. However, this design also suffers from similar deficiencies as stated above. Let us say you have 2 DG sets from two different manufacturers and their monitored parameters, although having overlaps, are not exactly same. So what do we have to do – add two different tables for 2 DG sets? There goes the design principle for a toss!

How to retrofit in a RDBMS based solution?

Having described the issues that we encountered, how do we design the persistence of monitored data? The natural choice would have been NOSQL databases such as Cassandra or similar persistent store. The NOSQL data model is a dynamic schema, column-oriented data model. This means that, unlike a relational database, you do not need to model all of the columns required by your application up front, as each row is not required to have the same set of columns. Columns and their metadata can be added by your application as they are needed without incurring downtime to your application.Since we had to retrofit the design into an already existing relational schema, we chose have a single column of text (varchar field in RDBMS terminology) sufficiently large to hold the monitored data. However, we devised a scheme such that when we acquire data we say what field it is, what is the unit and what is the value. For example if from a sensor we acquire temperature and relative humidity, the data that is written into the table will be “field = temp, unit = Celsius, value = 22/field = RH, unit = %, value = 50”. Similarly for a generator a data row may be “field = voltage, unit = volt, value = 240/field = power, unit = KW, value = 100”. Both these data points will go into the same column and another column for their unique device id. Having done this we simplified the design, its maintenance and reporting. A separate reporting module which normalizes the data after suitably extracting from the monitored table suffices to do all kind of reporting from each unique device. It is flexible enough to add new devices with its own unique parameters without changing the core tables. This is how we married structure and unstructured data.