Table Of Contents

1.1. Primer

1.1.1. What is quasardb?

quasardb is a column-oriented, distributed timeseries database (see api/time_series).

It has been designed to reliably store large amounts of data, while delivering a performance level that enables analysts to work interactively.

Where would you want to use quasardb? Here are a couple of use cases:

  • Market data store for back testing and analysis
  • Trades store for compliance
  • Time deviation database for compliance
  • Sensors data repository for predictive maintenance
  • Monitoring database for large deployments

1.1.2. Shall we dance?

To start a quasardb server, just run it!

To get through this primer you will only need Docker installed on your machine.

We provide packages for many platforms in binary format needing manual installation, you will find instructions in the documentation if you wish to go down this road.

You will first need to install the quasardb docker images, first the server

docker pull bureau14/qdb

Then the shell

docker pull bureau14/qdbsh

We can know start the server in a terminal type:

docker run -it -p 2836:2836   -e QDB_DISABLE_SECURITY=true --name qdb-server bureau14/qdb

You will have after a couple of seconds the daemon log on the console that it is ready to accept incoming requests.

quasardb is typically used via its multi-language API. For the purpose of this introduction, we will restrict ourselves to the Python API the shell (see reference/qdb_shell).

The shell also comes with a query language similar to SQL (see queries/index).

You can now start a quasardb shell in another terminal:

docker run -ti --link qdb-server:qdb-server bureau14/qdbsh --cluster qdb://qdb-server:2836

You should see a prompt type:

qdbsh >

1.1.3. Inserting timeseries data

We’re going to create a single-column timeseries and insert data. The name of the timeseries will be stocks and the name of its unique column will be close. The type of the data we will insert in the column is 64-bit floating point numbers.

In the shell window type:

CREATE TABLE stocks (close DOUBLE)

You create the timeseries only once, like you create a table in a SQL database only once.

You can check this was successful by typing

SHOW TABLES
$timestamp                       timeseries   name
---------------------------------------------------
                        (void)      $tables stocks

Returned 1 row in 1,265 us
Scanned 0 rows in 1,265 us (0 rows/sec)

If you wish to delete (drop) the whole timeseries, you do the following.

DROP TABLE stocks

Note

Operations performed on timeseries are transactional. quasardb uses MVCC and 2PC to provide you with a high level of reliability.

Let’s add 3 values into our timeseries using a simple query in the shell:

INSERT INTO stocks ($timestamp, close) VALUES (2017-01-01, 1.0), (2017-01-02, 2.0), (2017-01-03, 3.0)

In this example we added 3 points, 1 point per day, for simplicity sake. In the examples directory of the Python API, you will find several examples showcasing the different ways you can insert data into a quasardb cluster.

1.1.4. Working with the data

The data we inserted has been normalized and is now accessible from all other APIs. We will now use our shell to visualize it:

Note

Everything we do with the shell can be done via the API of your choice.

Let’s first check that our timeseries exists:

SHOW stocks

This will output:

Shard size: 86400000 ms
2 columns
 0. timestamp index
 1. close - 64-bit double

We can also dump the content of our timeseries:

select * from stocks in range(2017-01-01, 2017-01-10)

Output:

$timestamp                       timeseries            close
-------------------------------------------------------------
2017-01-01T00:00:00.000000000Z       stocks                1
2017-01-02T00:00:00.000000000Z       stocks                2
2017-01-03T00:00:00.000000000Z       stocks                3

Returned 3 rows in 3,688 us
Scanned 3 rows in 3,688 us (813 rows/sec)

As you can see the timestamp allows for nanosecond precision. Time definition syntax in quasardb is very flexible:

select * from stocks in range(2017, +10d)

Output:

    $timestamp                       timeseries            close
-------------------------------------------------------------
2017-01-01T00:00:00.000000000Z       stocks                1
2017-01-02T00:00:00.000000000Z       stocks                2
2017-01-03T00:00:00.000000000Z       stocks                3

Returned 3 rows in 1,509 us
Scanned 3 rows in 1,509 us (1,988 rows/sec)

You can narrow down your searches with the WHERE keyword, exactly as you would in a regular SQL query:

select * from stocks in range(2017, +10d) where close < 2

Output:

$timestamp                       timeseries            close
-------------------------------------------------------------
2017-01-01T00:00:00.000000000Z       stocks                1

Returned 1 row in 15,163 us
Scanned 3 rows in 15,163 us (197 rows/sec)

The database is also able to perform server-side aggregations for maximum performance. For example, you can ask for the average value of a timeseries, without having to retrieve all the data. Aggregations are able to leverage the enhanced instruction set of your CPU, when available.

For example, we can request the arithmetic mean of our stocks for the same interval:

select arithmetic_mean(close) from stocks in range(2017, +10d)

Output:

$timestamp                       timeseries   arithmetic_mean(close)
---------------------------------------------------------------------
                        (void)       stocks                        2

Here are some queries you can try for yourself:

  • Show the minimum and maximum open value for the last 20 years:

    select min(open), max(open) from stocks_A in range(now(), -20y)
    
  • Display the open and close of two different time series:

    select open, close from stocks_A, stocks_B in range(2017, +10d)
    
  • Daily averages over a year:

    select arithmetic_mean(close) from stocks in range(2017, +1y) group by day
    
  • Display the last known value of a timeseries with respect to the timestamps of another:

    select value from bids in range(2017, +1d) asof(trades)
    

Note

For a list of supported functions, see api/time_series.

1.1.5. Organizing your data

When you start to have a lot of timeseries, you probably want to find them based on criteria different than the name.

Out of the box, all timeseries are searchable by prefix and suffix:


prefix_get sto 100

::output

  1. stocks
suffix_get cks 100

::output

  1. stocks

This will return the 100 first matches for entries starting with “sto”, and the 100 first matches for entries ending with “cks”.

However, sometimes you want to organize your timeseries according to arbitraty criteria. For example, for our stocks, we may want to say that it comes from the NYSE and that the currency is USD.

quasardb has a powerful feature named “tags” that enables you to tag timeseries and do reverse lookups based on those tags

based on an exchange for example

attach_tag stocks nyq

Or a currency: .. code-block:: sql

attach_tag stocks usd

Then you can look up based on those tags

get_tagged nyq

output:

1. stocks - timeseries
 get_tagged usd

output::

   1. stocks - timeseries

It’s also possible to ask more complex questions, such as “give me everything that is tagged with ‘usd’ but not ‘nyq’”:

find(tag='usd' and not tag='nyq')

Output

An entry matching the provided alias cannot be found.

Which is, in our case, the correct answer!

Can you inject the result of a find in a select? Of course you can:

select arithmetic_mean(close) from find(tag='nyq') in range(2017, +10d)

Output

timestamp                      arithmetic_mean(close)
------------------------------------------------------
2017-01-01T00:00:00.000000000Z 2.00

1.1.6. Going further

We hope this quick tour left you wanting for more! quasardb is feature-rich yet simple to use and operate. If you want to go further, the best course of action is to start with looking through Python samples Python notebooks and then the full documentation of the API for the language of your choice (api/index).

If you’d like to learn more about building a quasardb cluster, head over to the administrative section (administration/index).

Curious about the underlying concepts, we have a section dedicated to it (concepts/index).

1.1.7. Wrap up

Things to remember about quasardb:

  • Fast and scalable
  • High-performance binary protocol
  • Multi-platform: FreeBSD, Linux 2.6+, OS X and Windows (32-bit and 64-bit)
  • Peer-to-peer network distribution
  • Transparent persistence
  • Native timeseries support
  • Distributed transactions
  • Rich typing
  • Tag-based search
  • Fire and forget: deploy, run and return to your core business.
arrow_backPrevious
1. Primer
Next arrow_forward
1.2. python notebooks