What is Columnar?
Columnar intro
Columnar storage is a key part of the data warehouse, but why is that? In this article we will review what columnar storage is and why itβs such a key part of the data warehouse.
The heap table
By default, data in Postgres is stored in a heap table. Heap tables are arranged row-by-row, similar to how you would arrange data when creating a large spreadsheet. Data can be added forever by appending data to the end of the table.
In Postgres, heap tables are organized into pages. Each page holds 8kb of data. Each page holds pointers to the start of each row in the data.
Image credit: The Internals of PostgreSQL
Advantages
Heap tables are optimized for transactional queries. Heap tables can use indexes to quickly find the row of data you are looking for β an index holds the page and row number for particular values of data. Generally, transactional workloads will read, insert, update, or delete small amounts of data at a time. Performance can scale so long as you have indexes to find the data youβre looking for.
Shortcomings
Heap tables perform poorly when data cannot be found by an index, known as a table scan. In order to find the data, all data in the table must be read. Because the data is organized by row, you must read each row to find it. When you active dataset size grows beyond the available memory on the system, you will find these queries slow down tremendously.
Additionally, scans assisted by an index can only go so far if you are requesting a large amount of data. For example, if you would like to know the average over a given month and have an index on the timestamp, the index will help Postgres find the relevant data, but it will still need to read every target row individually in the table to compute the average.
Enter Columnar
Columnar tables are organized transversely from row tables. Instead of rows being added one after another, rows are inserted in bulk into a stripe. Within each stripe, data from each column is stored next to each other. Imagine rows of a table containing:
This data would be stored as follows in columnar:
In columnar, you can think of each stripe as a row of metadata that also holds up to 150,000 rows of data. Within each stripe, data is divided into chunks of 1000 rows, and then within each chunk, there is a βrowβ for each column of data you stored in it. Additionally, columnar stores the minimum, maximum, and count for each column in each chunk.
Advantages
Columnar is optimized for table scans β in fact, it doesnβt use indexes at all. Using columnar, itβs much quicker to obtain all data for a particular column. The database does not need to read data that you are not interested in. It also uses the metadata about the values in the chunks to eliminate reading data. This is a form of βauto-indexingβ the data.
Because similar data is stored next to each other, very high data compression is possible. Data compression is an important benefit because columnar is often used to store huge amounts of data. By compressing the data, you can effectively read data more quickly from disk, which both reduces I/O and increasing effective fetch speed. It also has the effect of making better use of disk caching as data is cached in its compressed form. Lastly, you greatly reduce your storage costs.
Shortcomings
Columnar storage is not designed to do common transactional queries like βfind by IDβ - the database will need to scan a much larger amount of data to fetch this information than a row table.
Additionally, columnar storage on Hydra is append-only, though we working on implementing updates and deletes. Even so, updates and deletes will principally be for convenience, and will not be as performant as row tables.
Lastly, columnar tables need to be inserted in bulk in order to create efficient stripes. This makes them ideal for long-term storage of data you already have, but not ideal when data is still streaming into the database. For these reasons, itβs best to store data in row-based tables until it is ready to be archived into columnar.
Using a columnar table
Create a Columnar table by specifying USING columnar
when creating the table.
Insert data into the table and read from it like normal (subject to the limitations listed below). Note that columnar supports only btree and hash indexes and their associated constraints. Review columnar limitations in Hydra documentation.
Converting From Row to Columnar
Hydra has a convenience function that will copy your row table to columnar.
CREATE TABLE my_table (i INT8);
Data can also be converted manually by copying. For instance:
Partitioning
Columnar tables can be used as partitions; and a partitioned table may be made up of any combination of row and columnar partitions. You can use this feature to have archived data from previous months or years stored in columnar tables while active data is added to a heap table.
When performing operations on a partitioned table with a mix of row and columnar partitions, take note of the following behaviors for operations that are supported on row tables but not columnar (e.g. UPDATE
, DELETE
, tuple locks, etc.):
If the operation is targeted at a specific row partition (e.g. UPDATE p2 SET i = i + 1
), it will succeed; if targeted at a specified columnar partition (e.g. UPDATE p1 SET i = i + 1
), it will fail.
If the operation is targeted at the partitioned table and has a WHERE
clause that excludes all columnar partitions (e.g. UPDATE parent SET i = i + 1 WHERE ts = '2020-03-15'
), it will succeed.
If the operation is targeted at the partitioned table, but does not exclude all columnar partitions, it will fail; even if the actual data to be updated only affects row tables (e.g. UPDATE parent SET i = i + 1 WHERE n = 300
).
Note that the columnar engine supports btree
and hash
indexes (and the constraints requiring them) but does not support gist
, gin
, spgist
and brin
indexes. For this reason, if some partitions are columnar and if the index is not supported by columnar, then it's impossible to create indexes on the partitioned (parent) table directly. In that case, you need to create the index on the individual row partitions. Similarly for the constraints that require indexes, e.g.:
Options
Set options using:
For example:
The following options are available:
compression: [none|pglz|zstd|lz4|lz4hc]
- set the compression type for newly-inserted data. Existing data will not be recompressed/decompressed. The default value is zstd
.
compression_level: <integer>
- Sets compression level. Valid settings are from 1 through 19. If the compression method does not support the level chosen, the closest level will be selected instead.
stripe_row_limit: <integer>
- the maximum number of rows per stripe for newly-inserted data. Existing stripes of data will not be changed and may have more rows than this maximum value. The default value is 150000
.
chunk_group_row_limit: <integer>
- the maximum number of rows per chunk for newly-inserted data. Existing chunks of data will not be changed and may have more rows than this maximum value. The default value is 10000
.
View options for all tables with:
SELECT * FROM columnar.options;
You can also adjust options with a SET
command of one of the following configuration variables:
columnar.compression
columnar.compression_level
columnar.stripe_row_limit
columnar.chunk_group_row_limit
These settings only affect newly-created tables, not any newly-created stripes on an existing table.
Last updated