Storing and processing large volumes of data is not a trivial task, especially when this information comes from multiple sources and is received in different formats. Different types of data, when combined, have the potential to influence strategic analyses and decision making, but they must be gathered and processed with agility and reliability.
Big data technology has helped companies manage large databases, but the flow of information keeps increasing, and solution providers will have to adapt to ever-growing requirements. According to research by IDC (International Data Corporation), the digital universe doubles every two years and may reach 44 trillion gigabytes by 2020. How can we store, process, and especially extract value from so much data in its different forms, sources, and volumes?
It is in this context that the concept of a data lake stands out. Data lakes are data management platforms designed to hold and analyze large quantities of information from different data sources. With this technology, different types of data are stored in their original form, and from there data is extracted, combined, correlated, and utilized in different ways, according to the needs of each business.
The concept of a data lake fits quite well with the energy and utilities management scenario. Energy and utilities management systems may be separated into two distinct classes. The first class comprises systems geared towards gathering and logging measurement data in time series. Many manufacturers of management devices supply their own systems, some with more features, some with fewer. This class of system is closely related to the so-called PIMS (Plant Information Management Systems), traditionally used in process industries to store time series sampled at different frequencies from multiple meters. The second class comprises generic management systems of a transactional nature, geared towards handling isolated problems such as contract management, costs, and budgeting. Many software companies, particularly those providing ERP (Enterprise Resource Planning) suites, supply such solutions. These systems manage different types of data, such as accounting transactions, financial projections, and energy and utilities contracts.
In the past, these classes of systems were supplied by different software providers, had very different internal architectures, and were utilized by distinct groups of users within companies. However, we now live in a moment of convergence. The advance of new technologies allows these systems to be unified in integrated systems with architectures based on data lakes, enabling new information and analyses based on multiple varieties of data. Consumption time series may be compared to equipment production data, detecting inefficiencies as soon as they happen. Contract terms may directly influence energy supply strategies, with real-time updates.
The number of variables and the amount of data are ever growing, and the search for new combinations has become a powerful tool for finding new solutions and becoming more competitive. Innovation possibilities based on technologies such as data lakes are boundless, and so are the gains for better energy management and efficiency.
The Viridis platform explores the concept of a data lake to combine time series sampled in real time with transactional data from management systems. To deal with large volumes of data, the system has a high-performance data historian with efficient storage and response times, capable of holding thousands of time series for many years. All this measurement data is combined with data from legacy systems such as ledger transactions, production orders, maintenance orders, laboratory data, among others. The system automatically correlates multiple types of data, temporal and transactional, allowing disruptive views on energy performance and operational efficiency.
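To make the idea of correlating temporal and transactional data more concrete, the following is a minimal sketch in Python. It is not the Viridis implementation. The `Reading` and `ProductionOrder` types, the `energy_per_order` function, and all sample values are hypothetical and chosen only for illustration. The sketch assumes that each reading carries the energy consumed in the interval ending at its timestamp, and that each production order has a start time, an end time, and an output in tonnes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reading:
    """Hypothetical time-series sample: kWh consumed in the interval ending at `timestamp`."""
    timestamp: datetime
    kwh: float


@dataclass
class ProductionOrder:
    """Hypothetical transactional record, as it might arrive from an ERP system."""
    order_id: str
    start: datetime
    end: datetime
    tonnes: float


def energy_per_order(readings, orders):
    """Return {order_id: kWh per tonne} for every order with a positive output.

    A reading is attributed to an order when start < timestamp <= end.
    """
    result = {}
    for order in orders:
        if order.tonnes <= 0:
            continue  # skip orders without output to avoid dividing by zero
        total_kwh = sum(
            r.kwh for r in readings if order.start < r.timestamp <= order.end
        )
        result[order.order_id] = total_kwh / order.tonnes
    return result


# Illustrative data: eight hourly readings and two consecutive four-hour orders.
t0 = datetime(2024, 1, 1, 8, 0)
readings = [
    Reading(t0 + timedelta(hours=h), 100.0 if h <= 4 else 150.0)
    for h in range(1, 9)
]
orders = [
    ProductionOrder("PO-1", t0, t0 + timedelta(hours=4), 20.0),
    ProductionOrder("PO-2", t0 + timedelta(hours=4), t0 + timedelta(hours=8), 20.0),
]

intensity = energy_per_order(readings, orders)
print(intensity)
```

With this sample data, both orders produce the same output, but the second one consumes more energy per tonne. This is the kind of inefficiency that only becomes visible once measurement data and transactional data are examined together.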