Comparing Calpont InfiniDB and a Leading Row-based Database
Learn about the results of an independent benchmark test comparing the Calpont InfiniDB 2.0 column-based database with a leading row-based database. The results were published in January 2010 in the white paper "Data Warehouse Benchmark: Comparing Calpont InfiniDB® and a Row-based Database." The benchmark's author, Bert Scalzo, is a database expert and Oracle ACE who has published several books and articles and speaks on the subject of data warehouse design and performance. He provides a narrative and highlights of the performance results.
A Benchmark Comparison Between the InfiniDB Column Database and a Leading Row-Based Database
As an example of how a column database can outperform a legacy RDBMS, Calpont Corporation recently commissioned a well-known data warehouse industry expert to benchmark the leading row-based database (which the expert has many years of experience tuning for fast performance) against Calpont’s InfiniDB Server, which has a column-oriented design as one of its core features. The Star Schema-style benchmark was conducted on two different machines to gauge performance on both mid-sized and large servers. The mid-sized server had 8 CPUs, 8GB of RAM, and 14 SATA 7200 RPM drives in RAID-0 with no cache; the large server had 16 CPUs, 16GB of RAM, and 14 SAS 15K RPM drives in RAID-0 with a 512MB cache. Both ran 64-bit CentOS 5.4. The raw database size was 2TB.
As can be seen in the graphs below, various configurations were used for the leading row-based database. No matter the configuration, however, the InfiniDB column-based database consistently and dramatically beat the legacy database in storage footprint, load time, and query speed.
In addition to producing faster query speeds overall, the InfiniDB column database also delivered much more predictable query times. Whereas the leading row-based database produced wildly varying minimum and maximum query times over the various runs, InfiniDB's runs were far more tightly grouped. This translates into much better dependability from a business standpoint, helping ensure that BI reports and queries meet whatever service-level agreements business users impose.
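One simple way to express the predictability described above is the ratio between the slowest and fastest run of the same query. The sketch below is only an illustration: the function name `spread` is hypothetical, and the timing values are placeholders chosen for the example, not figures from the benchmark.

```python
def spread(timings_s):
    """Return (fastest, slowest, slowest/fastest) for query times in seconds."""
    if not timings_s:
        raise ValueError("need at least one timing")
    fastest = min(timings_s)
    slowest = max(timings_s)
    if fastest <= 0:
        raise ValueError("timings must be positive")
    return fastest, slowest, slowest / fastest


# Placeholder values for illustration only; NOT taken from the benchmark.
erratic_runs = [2.0, 40.0, 8.0, 120.0]
steady_runs = [4.0, 5.0, 6.0, 5.0]

erratic_ratio = spread(erratic_runs)[2]
steady_ratio = spread(steady_runs)[2]
```

A ratio close to 1 means the runs are tightly grouped, which is the property that makes it easier to commit to a response-time target in a service-level agreement.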
In addition to better performance, the column orientation of a column database supplies a number of useful benefits to those wishing to deploy fast business intelligence databases.
First, there is no need for indexing, as there is with traditional row-based databases. The elimination of indexing means: (1) column databases consume less storage overall, because indexes in legacy RDBMSs often balloon the storage cost of a database to double or more the initial data size; (2) data loads are faster because no indexes need to be maintained; (3) ad-hoc DML work is faster because no index updates are performed; (4) no index design or tuning work is imposed on the database IT staff.
Second, far less design work is forced on database architects when a column-based database is used. The need for complicated partitioning schemes, materialized views or summary tables, and other such work is completely removed because column databases need none of these components to achieve superior query performance.
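The reason a column layout can serve analytic queries without indexes can be sketched with a toy example. The code below is a minimal illustration of row versus column layout in general, not a description of InfiniDB's actual storage engine; the table, the column names, and the helper functions are all hypothetical.

```python
# Hypothetical toy table; names and values are illustrative only.
rows = [
    {"order_id": 1, "region": "east", "amount": 10.0},
    {"order_id": 2, "region": "west", "amount": 25.0},
    {"order_id": 3, "region": "east", "amount": 5.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_layout(rows, column):
    """Row layout: each whole row is fetched to reach a single column."""
    total = 0.0
    values_read = 0
    for row in rows:
        values_read += len(row)  # every field of the row is read
        total += row[column]
    return total, values_read


def sum_column_layout(columns, column):
    """Column layout: only the requested column's values are read."""
    data = columns[column]
    return sum(data), len(data)


columns = to_columns(rows)
row_total, row_reads = sum_row_layout(rows, "amount")
col_total, col_reads = sum_column_layout(columns, "amount")
```

Both functions return the same total, but the row layout reads all nine stored values while the column layout reads only the three in `amount`. On a wide fact table that gap grows with the number of columns, which is one reason a column store can scan the needed data directly instead of relying on indexes.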
In summary, the InfiniDB server saves on storage costs, supplies faster access to new and incoming data, and runs queries much faster than its row-based competitor.