This post was originally published on the ChartIO blog on 3/14/17.
It’s never been more important to know what’s happening now
Pretty much every technological development we are seeing – be it IoT, augmented reality or any one of the thousand other changes happening right now – involves doing things over shorter and shorter timescales.
This means that to run your business, you need to know what’s happening now. Lags between an event happening and knowing about the event impede our ability to act – you can’t do something if you don’t have the information. Fast-moving systems also allow failures to escalate and propagate before you’ve seen them. Since you can’t stop what you don’t know about, high-performance, real-time command and control systems are a critical element of 21st century infrastructure.
Mass-scale, accurate counting is hard
Accurate counting of events is key to understanding your business, but counting very large numbers of events accurately is hard.
The traditional approach is to store all the data until a defined time period has elapsed and then count it. The totals for a period only appear once that period closes, so if your time period is five minutes, your numbers can be up to five minutes behind reality. You can get closer by using smaller time windows, but you can never get to now, and as the windows get smaller and smaller the computational overhead rises quickly.
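To make the lag concrete, here is a minimal sketch of window-based counting. The event names, timestamps and the `windowed_counts` helper are made up for illustration and are not taken from any particular product.

```python
from collections import Counter


def windowed_counts(events, window_seconds):
    """Batch-count (timestamp, key) events into fixed time windows.

    Returns a list of (visible_at, window_start, counts) tuples, where
    visible_at is the earliest moment the window's totals can be reported.
    """
    windows = {}
    for timestamp, key in events:
        # Assign each event to the window that contains its timestamp.
        window_start = (timestamp // window_seconds) * window_seconds
        windows.setdefault(window_start, Counter())[key] += 1
    return [
        (start + window_seconds, start, dict(counts))
        for start, counts in sorted(windows.items())
    ]


# Hypothetical events: (seconds since start, event type).
events = [(0, "bus_arrival"), (10, "bus_arrival"), (299, "trade"), (300, "trade")]
results = windowed_counts(events, window_seconds=300)

# The event at t=0 is not visible in any total until its window closes.
lag_of_first_event = results[0][0] - events[0][0]
```

With a 300-second window, the event that arrived at t=0 cannot show up on a dashboard until t=300. Shrinking the window shortens that wait but never removes it.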
Another approach is to increment whatever counters are needed as each event happens, ideally in an ACID manner. This works well in small-scale demos or during simplistic ‘Kung Fu Villain’ benchmarks that involve all the data showing up in a neat and predictable order. But traditional databases don’t handle simultaneous updates to the same counters well, and are prone to locking issues. They cope especially badly when a dashboard queries them at the same time, as the large number of incomplete transactions makes it very hard to provide a read-consistent view of state as of the start of the query.
NoSQL solutions generally aren’t much better, as they often rely on eventual consistency or other architectures that don’t guarantee accurate, up-to-date numbers.
While ChartIO essentially solves the dashboard creation problem, you still need a high-performance system to classify and count streams of data events.
Getting to ‘now’ with VoltDB and ChartIO
VoltDB is a NewSQL, cloud-friendly RDBMS that excels at accurately counting and classifying events. Major US telephone companies use it to keep track of mobile phone credit usage in real time, even when multiple people, possibly in multiple locations, share accounts.
A key strength of VoltDB is its proven ability to handle hundreds of thousands of transactions per second. In a ChartIO dashboard scenario, each of those transactions could be an asynchronous report of an event we want to measure, such as a bus arriving at a stop, the amount of money at stake in an organization with multiple trading desks, or the latency profile of a time-sensitive application.
Each transaction we receive can update one or more rows in one or more tables keeping track of state. It is perfectly normal for VoltDB to keep track of millions of ‘things’ at once.
We can then use VoltDB’s materialized views to turn the ‘thing’ level data into totals that ChartIO’s JDBC engine can work with. A key feature of VoltDB’s views is that they are re-calculated every time the underlying table changes, which means their values are always correct – and they can still be read without the contention and read-consistency issues seen with legacy RDBMS products.
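The idea of a view that is kept in step with its underlying table can be modelled in a few lines. This is a toy sketch of the concept only, not VoltDB's implementation: the `CountingView` class, the desk names and the amounts are all hypothetical. In VoltDB itself the equivalent would be declared in SQL with a `CREATE VIEW` statement over a `GROUP BY` query.

```python
from collections import defaultdict


class CountingView:
    """Toy model of a view holding a row count and an amount total per group."""

    def __init__(self):
        self.rows = []  # the base table of 'thing'-level rows
        self.totals = defaultdict(lambda: {"count": 0, "amount": 0})

    def insert(self, group, amount):
        # Update the base table and the view together, in one step,
        # so a reader never sees one without the other.
        self.rows.append((group, amount))
        total = self.totals[group]
        total["count"] += 1
        total["amount"] += amount

    def read(self, group):
        # Reading the view is a lookup, not a scan over the base table.
        return dict(self.totals[group])


# Hypothetical trading-desk events.
view = CountingView()
view.insert("desk_a", 100)
view.insert("desk_a", 250)
view.insert("desk_b", 40)
desk_a_totals = view.read("desk_a")
```

Because the totals are adjusted as part of each insert, a dashboard query reads a handful of pre-computed rows rather than re-counting millions of underlying ones.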
- In order to get this to work you’ll need to make VoltDB’s port (21212 by default) visible from ChartIO.
- You’ll then need to use the VoltDB JDBC driver, which is available within ChartIO’s user interface.
- VoltDB will run happily on any Linux distro, but obviously very heavy loads will require some form of sizing/proof of concept exercise.
- VoltDB is now available by the hour in AWS, so it’s easy to spin up a PoC to try it yourself.
- Create a ChartIO account
- Download VoltDB from here
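Before pointing ChartIO at VoltDB, it can help to confirm that the port mentioned in the list above is actually open. The sketch below is a plain TCP check using Python's standard library. The `port_is_reachable` helper and the hostname in the comment are hypothetical, and the check only tells you the port is reachable from the machine running the script, which may differ from what ChartIO's servers can see.

```python
import socket


def port_is_reachable(host, port=21212, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Example call (hypothetical hostname, replace with your own server):
# port_is_reachable("voltdb.example.com")
```

A `False` result usually points to a firewall or security group rule, or to VoltDB listening on a port other than the default.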
ChartIO and VoltDB are complementary. VoltDB has the ability to ingest, classify and count massive real-time streams of data as fast as it receives them. ChartIO can then be used to create an enterprise dashboard that’s as up to date as your data is.