Columnar and Relational Database Structures


Jamie Clark
Dimensional Insight

Dimensional Insight

business analytics and
integrated business intelligence solutions

Diver Platform

Supply Chain, Manufacturing,
Healthcare, Beverage Alcohol, ...

> spectre dive basic.cbase \
    -q -d Direction -c Value 
Direction       Value
East       2,937.7687
North      2,520.4438
South      1,641.8041
West       2,924.6749

Jamie Clark

15+ years building business intelligence tools
author of DivePort
Spectre team lead
first poultry and egg conference


☑ storage geometry:
row-oriented vs. column-oriented

☑ database types:
MOLAP vs. in-memory ROLAP


(usually) two-dimensional

but they must be stored
on one-dimensional media
so a decision must be made...


Title Year Gross
Bolt 2008 114M
Juno 2007 143M
Thor 2011 181M

Bolt 2008 114M Juno 2007 143M Thor 2011 181M


Title Year Gross
Bolt 2008 114M
Juno 2007 143M
Thor 2011 181M

Bolt Juno Thor 2008 2007 2011 114M 143M 181M

so what's the difference?

locality of reference

disk is very far away

data: Jonas Bonér


Bolt 2008 114M Juno 2007 143M Thor 2011 181M


Bolt Juno Thor 2008 2007 2011 114M 143M 181M


  • add new rows quickly
  • fast access to many columns on few rows


  • add/remove columns quickly
  • fast access to many rows on few columns

analytical tasks prefer column-oriented transactional tasks prefer row-oriented

Business Analytics

(OLAP: online analytical processing)

it's possible to do analysis
in a transactional database
using SQL

a business intelligence solution
makes it easier and faster

let's look at two structures
for busines analytics...



multidimensional OLAP

8 x 7 = ?

8 x 7 = 56


pre-computed query results


to answer a query: identify the right table, load it

this is very fast

(when the answer was pre-computed)

this works on weak hardware

this takes time to prepare

and cannot be in-memory

cannot change summary functions at run time

must limit number of dimensions

MOLAP is fast at run-time
but inflexible


relational OLAP

102 x 11 = ?

102 x 11 = 1,122


no pre-computed results
just the detail table


queries are run directly against detail table

this is very flexible

and takes much less space

but it requires strong hardware

even then, it can be slow

there's a balance between speed and flexibility

but it's complicated...
if ROLAP fits in memory
and we have multiple processors
and the questions are complex
and the data changes frequently
and the engine can optimize the query
and nobody is leaning against the server
and the moon is waxing
and your favorite show is on

ROLAP is very flexible,
and was considered slow at run-time
but with modern hardware
and enough memory
in-memory ROLAP is looking good


  • fast query time
  • especially on poor hardware

in-memory ROLAP

  • fast access time
  • fast build time
  • scales better
  • more flexible


let's bring it back to storage geometry

MOLAP "data cubes" and ROLAP data tables
could be row- or column-oriented

for MOLAP it doesn't matter much

for ROLAP it matters (among many other factors)

transactional databases are usually row-oriented

"optimize" the data by moving it
to a column-oriented database


Dimensional Insight started in 1989
with row-major MOLAP

in 2014 we introduced Spectre,
column-oriented in-memory ROLAP


ROLAP has a lot of advantages

and two disadvantages
(performance and hardware requirements)

the disadvantages were impossible to overcome

but things change

memory is very cheap


CPU cores are becoming plentiful

we've ridden Moore's law to a tipping point
where cheap processors and abundant memory
permit fast, dynamic, in-memory analysis


things change
buy lots of memory
know how your data is stored
modern hardware opens up possibilities

columns under Gaudí's Parc Güell, Barcelona