7.13. Python

7.13.1. Introduction

The quasardb module contains multiple classes to make working with a quasardb cluster simple. It is written in C++ on top of numpy <https://www.numpy.org/>, and provides high-performance access to the QuasarDB cluster using a simple API.

7.13.2. Requirements

The QuasarDB Python API is built and tested against the following versions:

  • Python 3.5
  • Python 3.6
  • Python 3.7

In addition to this, we support the following environments:

  • MacOS 10.9+
  • Microsoft Windows
  • Linux
  • FreeBSD

7.13.3. Installation

The QuasarDB Python module is distributed through PyPI and installed with pip.

Windows and MacOS

On Windows and MacOS, the QuasarDB Python module is distributed in binary format, and can be installed without any additional dependencies as follows:

pip install quasardb

This will download the API and install all its dependencies.

Linux

For Linux users, installation via pip through PyPi will trigger a compilation of this module. This will require additional packages to be installed:

  • A modern C++ compiler (llvm, g++)
  • CMake 3.5 or higher
  • QuasarDB C API

Ubuntu / Debian

On Ubuntu or Debian, the installation can be achieved as follows:

$ apt install apt-transport-https ca-certificates -y
$ echo "deb [trusted=yes] https://repo.quasardb.net/apt/ /" > /etc/apt/sources.list.d/quasardb.list
$ apt update
$ apt install qdb-api cmake g++
$ pip install wheel
$ pip install quasardb

RHEL / CentOS

On RHEL or CentOS, the process is a bit more involved because we need a modern GCC compiler and cmake. It can be achieved as follows:

# Enable SCL for recent gcc
$ yum install centos-release-scl -y

# Enable EPEL for recent cmake
$ yum install epel-release -y

# Enable QuasarDB Repository
$ echo $'[quasardb]\nname=QuasarDB repo\nbaseurl=https://repo.quasardb.net/yum/\nenabled=1\ngpgcheck=0' > /etc/yum.repos.d/quasardb.repo

$ yum install devtoolset-7-gcc-c++ cmake3 make qdb-api

# Make cmake3 the default
$ alternatives --install /usr/bin/cmake cmake /usr/bin/cmake3 10

# Start using gcc 7
$ scl enable devtoolset-7 bash

# Install the Python module
$ pip install wheel
$ pip install quasardb

7.13.4. Verifying installation

You can verify the QuasarDB Python module is installed correctly by trying to print the installed version:

$ python
Python 3.7.2 (default, Feb 13 2019, 15:08:44)
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import quasardb
>>> print(quasardb.version())
3.1.0

This tells you that the installed version of the Python module, and of the QuasarDB C API it is linked against, is 3.1.0. Ensure that this version also matches the version of the QuasarDB daemon you’re connecting to.
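
The version comparison can also be scripted. The following is a minimal sketch, assuming that "matching" means the major and minor components agree; parse_version and versions_match are hypothetical helpers, not part of the quasardb module, and the daemon version string has to be supplied by you.

```python
def parse_version(text):
    """Split a dotted version string such as '3.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def versions_match(client_version, daemon_version, components=2):
    """Return True when the first `components` fields of both versions agree.

    Comparing only major.minor is an assumption of this sketch.
    """
    client = parse_version(client_version)
    daemon = parse_version(daemon_version)
    return client[:components] == daemon[:components]


# In a real session the client string would come from quasardb.version().
print(versions_match("3.1.0", "3.1.2"))
print(versions_match("3.1.0", "3.2.0"))
```

Pass components=3 if you prefer to require an exact match on the patch level as well.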

7.13.5. Getting started

Establishing a connection with the QuasarDB cluster is easy:

import quasardb

c = quasardb.Cluster("qdb://127.0.0.1:2836")
b = c.blob("entry")
b.put(b"content")
print(b.get())

The execution of the above code snippet will output:

b'content'

If your cluster is set up using security features, you can provide authentication details as follows:

c = quasardb.Cluster(uri='qdb://127.0.0.1:2836',
                     user_name='qdbuser',
                     user_private_key='/var/lib/qdb/user_private.key')

7.13.6. Timeout

The default timeout is one minute. To specify a different timeout, you must pass it as a parameter when constructing your quasardb Cluster object:

import datetime

c = quasardb.Cluster("qdb://127.0.0.1:2836", datetime.timedelta(minutes=2))

7.13.7. Expiry

Expiry is either set at creation or through the expires_at and expires_from_now methods for the given data type.

Danger

The behavior of expires_from_now is undefined if the time zone or the clock of the client computer is improperly configured.

To set the expiry time of an entry to 1 minute, relative to the call time:

b.expires_from_now(datetime.timedelta(minutes=1))

To set the expiry time of an entry to January 1st, 2020:

b.expires_at(datetime.datetime(year=2020, month=1, day=1))

Or alternatively:

b.update(b"content", datetime.datetime(year=2020, month=1, day=1))

To prevent an entry from ever expiring:

b.expires_at(quasardb.never_expires)

By default, entries never expire. To obtain the expiry time of an existing entry as a datetime.datetime object:

print(b.get_expiry_time())

For an entry whose expiry was set to January 1st, 2020 as above, this will print:

2020-01-01
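
The reference further down states that expires_at expects an offset-aware datetime, and describes the delta of Integer.expires_from_now in milliseconds, while the example above passes a timedelta. The sketch below uses only the standard library to prepare both forms; utc_expiry and delta_to_milliseconds are hypothetical helper names, and which argument type your installed version accepts is not settled here.

```python
import datetime


def utc_expiry(year, month, day):
    """Build an offset-aware datetime at midnight UTC for an absolute expiry."""
    return datetime.datetime(year, month, day, tzinfo=datetime.timezone.utc)


def delta_to_milliseconds(delta):
    """Convert a timedelta to a whole number of milliseconds."""
    return delta // datetime.timedelta(milliseconds=1)


expiry = utc_expiry(2020, 1, 1)
print(expiry.isoformat())
print(delta_to_milliseconds(datetime.timedelta(minutes=1)))
```

Attaching an explicit time zone keeps the absolute expiry independent of the local time zone configured on the client.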

7.13.8. Tags

To get the list of tags for an entry, use the get_tags method.

tags = b.get_tags()

To find the entries matching a tag, create a tag object and call its get_entries method. For example, to list all entries having the tag “my_tag”:

c = quasardb.Cluster("qdb://127.0.0.1:2836")
tag = c.tag("my_tag")
entries = tag.get_entries()
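
Building on get_entries, entries carrying several tags at once can be found by intersecting the lists on the client. This is a minimal sketch, assuming get_entries returns a list of alias strings; entries_with_all_tags is a hypothetical helper, and FakeTag and FakeCluster are in-memory stand-ins so the code runs without a server.

```python
def entries_with_all_tags(cluster, tag_names):
    """Return the sorted aliases that carry every tag in tag_names."""
    common = None
    for name in tag_names:
        aliases = set(cluster.tag(name).get_entries())
        common = aliases if common is None else common & aliases
    return sorted(common or [])


class FakeTag:
    """Hypothetical stand-in for a tag object; not part of quasardb."""

    def __init__(self, aliases):
        self._aliases = aliases

    def get_entries(self):
        return list(self._aliases)


class FakeCluster:
    """Hypothetical in-memory stand-in so the sketch runs without a server."""

    def __init__(self, tagged):
        self._tagged = tagged

    def tag(self, name):
        return FakeTag(self._tagged.get(name, []))


fake = FakeCluster({"my_tag": ["a", "b", "c"], "other_tag": ["b", "c", "d"]})
print(entries_with_all_tags(fake, ["my_tag", "other_tag"]))
```

With a real connection, the cluster object c from the example above would take the place of fake.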

7.13.9. Timeseries

You first create an instance of a timeseries object, then you specify the columns. In this example we will create a timeseries with three columns, “close”, “volume”, and “value_date”. The respective types of the columns are double precision floating point values, 64-bit signed integers, and high resolution nanosecond-precise timestamps.

ts = c.ts("my_table")

columns = [quasardb.ColumnInfo(quasardb.ColumnType.Double, "close"),
           quasardb.ColumnInfo(quasardb.ColumnType.Int64, "volume"),
           quasardb.ColumnInfo(quasardb.ColumnType.Timestamp, "value_date")]

ts.create(columns)

Insertion of the data into timeseries can be done through the batch API, which is a row oriented, bulk insert API.

The first step is to create a bulk inserter that matches all the timeseries and columns you want to insert to:

batch_columns = [quasardb.BatchColumnInfo("my_table", "close", 100),
                 quasardb.BatchColumnInfo("my_table", "volume", 100),
                 quasardb.BatchColumnInfo("my_table", "value_date", 100)]

batch_inserter = c.ts_batch(batch_columns)

Note that there is no need to specify the type of the column as the batch API will query the server for that.

The number you specify for each column is the number of rows you expect to send per push. It enables the API to pre-allocate the right amount of memory, which significantly increases performance. If you specify 100, it means you expect to call push every 100 rows.

You can insert data into multiple timeseries at the same time by simply specifying multiple columns from multiple timeseries, such as this:

batch_columns = [quasardb.BatchColumnInfo("my_table", "close", 100),
                 quasardb.BatchColumnInfo("my_table", "volume", 100),
                 quasardb.BatchColumnInfo("my_table", "value_date", 100),
                 quasardb.BatchColumnInfo("other_table", "close", 100),
                 quasardb.BatchColumnInfo("other_table", "volume", 100),
                 quasardb.BatchColumnInfo("other_table", "value_date", 100)]

Once the batch_inserter object is created you insert the data row by row:

import numpy as np

# All timestamps are numpy datetime64 with nanosecond precision
batch_inserter.start_row(np.datetime64('2018-01-01', 'ns'))
batch_inserter.set_double(0, 1.0) # set close
batch_inserter.set_int64(1, 231) # set volume
batch_inserter.set_timestamp(2, np.datetime64('2018-02-02', 'ns'))

# send to the server
batch_inserter.push()

Note that no data is sent to the server until you call push. Pushing after every row is not advised, as it is extremely inefficient.
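
One way to follow this advice is to push once per fixed number of rows and flush the remainder at the end. The sketch below uses the start_row, set_double, set_int64, set_timestamp and push calls shown above; insert_rows is a hypothetical helper, RecordingInserter is a stand-in that only counts calls so the code runs without a server, and the string timestamps are placeholders for numpy datetime64 values.

```python
def insert_rows(batch_inserter, rows, push_every=100):
    """Write rows through the batch inserter, pushing once per `push_every` rows.

    Each row is (timestamp, close, volume, value_date), matching the column
    order used when the inserter was created. Returns the number of pushes.
    """
    if push_every < 1:
        raise ValueError("push_every must be at least 1")
    pushes = 0
    pending = 0
    for timestamp, close, volume, value_date in rows:
        batch_inserter.start_row(timestamp)
        batch_inserter.set_double(0, close)
        batch_inserter.set_int64(1, volume)
        batch_inserter.set_timestamp(2, value_date)
        pending += 1
        if pending == push_every:
            batch_inserter.push()
            pushes += 1
            pending = 0
    if pending:
        # Flush the final, partially filled batch.
        batch_inserter.push()
        pushes += 1
    return pushes


class RecordingInserter:
    """Hypothetical stand-in that counts calls; not part of quasardb."""

    def __init__(self):
        self.rows = 0
        self.pushes = 0

    def start_row(self, timestamp):
        self.rows += 1

    def set_double(self, index, value):
        pass

    def set_int64(self, index, value):
        pass

    def set_timestamp(self, index, value):
        pass

    def push(self):
        self.pushes += 1


recorder = RecordingInserter()
rows = [("t%d" % i, float(i), i, "v%d" % i) for i in range(250)]
print(insert_rows(recorder, rows, push_every=100))
```

Choosing push_every equal to the row count given to BatchColumnInfo keeps the pre-allocation hint and the actual batch size in line.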

You can run a query directly from the Python API and have the results as a dictionary of numpy arrays.

q = c.query("select * from my_table in range(2018, +10d)")
res = q.run()

for col in res.tables["my_table"]:
    # col.name is a string for the name of the column
    # col.data is a numpy array of the proper type
    print(col.name, ": ", col.data)
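
If you prefer to look columns up by name rather than iterate over them, the name and data attributes shown above are enough to build a plain dictionary. This is a small sketch; columns_to_dict is a hypothetical helper, and the SimpleNamespace objects stand in for the columns of a real result, which would hold numpy arrays.

```python
from types import SimpleNamespace


def columns_to_dict(columns):
    """Map each column name to its data so columns can be looked up by name."""
    return {col.name: col.data for col in columns}


# Hypothetical stand-in for res.tables["my_table"].
sample = [SimpleNamespace(name="close", data=[1.0, 2.0]),
          SimpleNamespace(name="volume", data=[231, 240])]

by_name = columns_to_dict(sample)
print(by_name["volume"])
```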

7.13.10. Example Client

This module creates a simple save() and load() wrapper around the API:

import quasardb

# Assuming we have a quasardb server running on dataserver.mydomain:3001
# Note this will throw an exception if the quasardb cluster is not available.
c = quasardb.Cluster("qdb://dataserver.mydomain:3001")

# We want to silently create or update the object
# depending on the existence of the key in the cluster.
def save(key, obj):
    b = c.blob(key)
    b.update(obj)

# We want to simply return None if the key is not found in the cluster.
def load(key):
    try:
        b = c.blob(key)
        return b.get()
    except quasardb.Error:
        return None
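
The wrapper above stores whatever is handed to it, and the blob examples in this section use bytes. To store structured Python objects, one option is to serialize them first. The sketch below assumes JSON is an acceptable format for your data; encode and decode are hypothetical helpers, not part of the quasardb module.

```python
import json


def encode(obj):
    """Serialize a JSON-compatible object to UTF-8 bytes for storage in a blob."""
    return json.dumps(obj, sort_keys=True).encode("utf-8")


def decode(content):
    """Rebuild the object from blob content; None (missing key) passes through."""
    if content is None:
        return None
    return json.loads(content.decode("utf-8"))


payload = encode({"symbol": "ABC", "close": 1.5})
print(payload)
print(decode(payload))
```

Combined with the wrapper, this would read save(key, encode(obj)) and decode(load(key)).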

7.13.11. Reference

class quasardb.Deque(handle, alias, *args, **kwargs)

Bases: quasardb.RemoveableEntry

An unlimited, distributed, concurrent deque.

alias()
Returns:The alias of the entry
attach_tag(tag)

Attach a tag to the entry

Parameters:tag (str) – The tag to attach
Returns:True if the tag was successfully attached, False if it was already attached
Raises:Error
back()

Returns the last Entry of the deque. The deque must exist and must not be empty.

Raises:Error
Returns:The last Entry of the deque
detach_tag(tag)

Detach a tag from the entry

Parameters:tag (str) – The tag to detach
Returns:True if the tag was successfully detached, False if it was not attached
Raises:Error
front()

Returns the first Entry of the deque. The deque must exist and must not be empty.

Raises:Error
Returns:The first Entry of the deque
get_tags()

Returns the list of tags attached to the entry

Returns:A list of aliases (strings) of tags
Raises:Error
has_tag(tag)

Test if a tag is attached to the entry

Parameters:tag (str) – The tag to test
Returns:True if the entry has the specified tag, False otherwise
Raises:Error
pop_back()

Atomically returns and removes the last Entry of the deque. The deque must exist and must not be empty.

Raises:Error
Returns:The last Entry of the deque
pop_front()

Atomically returns and removes the first Entry of the deque. The deque must exist and must not be empty.

Raises:Error
Returns:The first Entry of the deque
push_back(data)

Appends a new Entry at the end of the deque. The deque will be created if it does not exist.

Parameters:data (str) – The content for the Entry
Raises:Error
push_front(data)

Inserts a new Entry at the beginning of the deque. The deque will be created if it does not exist.

Parameters:data (str) – The content for the Entry
Raises:Error
remove()

Removes the given Entry from the repository. It is an error to remove a non-existing Entry.

Raises:Error
size()

Returns the current size of the deque.

Raises:Error
Returns:The current size of the deque.
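
As a purely local analogy for the method names above, the standard library's collections.deque offers the same operations under different names. The sketch only illustrates which end each call works on; it does not touch a cluster and says nothing about distribution or concurrency.

```python
from collections import deque

local = deque()
local.append("b")        # push_back
local.append("c")        # push_back
local.appendleft("a")    # push_front

print(local[0], local[-1], len(local))   # front, back, size
print(local.popleft())   # pop_front
print(local.pop())       # pop_back
print(len(local))        # size
```
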
exception quasardb.Error(error_code=0)

Bases: Exception

The quasardb database exception, based on the API error codes.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class quasardb.HSet(handle, alias, *args, **kwargs)

Bases: quasardb.RemoveableEntry

An unlimited, distributed, concurrent hash set.

alias()
Returns:The alias of the entry
attach_tag(tag)

Attach a tag to the entry

Parameters:tag (str) – The tag to attach
Returns:True if the tag was successfully attached, False if it was already attached
Raises:Error
contains(data)

Tests if the Entry exists in the hash set. The hash set must exist.

Raises:Error
Returns:True if the Entry exists, False otherwise
detach_tag(tag)

Detach a tag from the entry

Parameters:tag (str) – The tag to detach
Returns:True if the tag was successfully detached, False if it was not attached
Raises:Error
erase(data)

Erases an existing Entry from an existing hash set.

Raises:Error
get_tags()

Returns the list of tags attached to the entry

Returns:A list of aliases (strings) of tags
Raises:Error
has_tag(tag)

Test if a tag is attached to the entry

Parameters:tag (str) – The tag to test
Returns:True if the entry has the specified tag, False otherwise
Raises:Error
insert(data)

Inserts a new Entry into the hash set. If the hash set does not exist, it will be created.

Raises:Error
remove()

Removes the given Entry from the repository. It is an error to remove a non-existing Entry.

Raises:Error
size()

Returns the current size of the hash set.

Raises:Error
Returns:The current size of the hash set.
class quasardb.Integer(handle, alias, *args, **kwargs)

Bases: quasardb.ExpirableEntry

A 64-bit signed integer. Depending on your Python implementation and platform, the number represented in Python may or may not be a 64-bit signed integer.

add(addend)

Adds the supplied addend to an existing integer. The operation is atomic and thread safe. The entry must exist. If addend is negative, the value will be subtracted from the existing entry.

Parameters:addend (long) – The value to add to the existing entry.
Returns:The value of the entry post add
Raises:Error
alias()
Returns:The alias of the entry
attach_tag(tag)

Attach a tag to the entry

Parameters:tag (str) – The tag to attach
Returns:True if the tag was successfully attached, False if it was already attached
Raises:Error
detach_tag(tag)

Detach a tag from the entry

Parameters:tag (str) – The tag to detach
Returns:True if the tag was successfully detached, False if it was not attached
Raises:Error
expires_at(expiry_time)

Sets the expiry time of an existing Entry. If the value is None, the Entry never expires.

Parameters:expiry_time (datetime.datetime) – The expiry time, must be offset aware
Raises:Error
expires_from_now(expiry_delta)

Sets the expiry time of an existing Entry relative to the current time, in milliseconds.

Parameters:expiry_delta (long) – The expiry delta in milliseconds
Raises:Error
get()

Returns the current value of the entry, as an integer.

Returns:The value of the entry as an integer
Raises:Error
get_expiry_time()

Returns the expiry time of the Entry.

Returns:datetime.datetime – The expiry time, offset aware
Raises:Error
get_tags()

Returns the list of tags attached to the entry

Returns:A list of aliases (strings) of tags
Raises:Error
has_tag(tag)

Test if a tag is attached to the entry

Parameters:tag (str) – The tag to test
Returns:True if the entry has the specified tag, False otherwise
Raises:Error
put(number, expiry_time=None)

Creates an integer of the specified value. The entry must not exist.

Parameters:
  • number (long) – The value of the entry to create
  • expiry_time (datetime.datetime) – The expiry time for the alias
Raises:

Error

remove()

Removes the given Entry from the repository. It is an error to remove a non-existing Entry.

Raises:Error
update(number, expiry_time=None)

Updates an integer to the specified value. The entry may or may not exist.

Parameters:
  • number (long) – The new value of the entry
  • expiry_time (datetime.datetime) – The expiry time for the alias
Raises:

Error
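
Because quasardb.Integer holds a 64-bit signed value while a Python int is unbounded, it can be useful to check the range on the client before calling put, update or add. This is a minimal sketch; fits_int64 and checked_add are hypothetical helpers, and how the server itself reacts to an out-of-range value is not covered here.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def fits_int64(number):
    """Return True when number is an int representable as a 64-bit signed integer."""
    return isinstance(number, int) and INT64_MIN <= number <= INT64_MAX


def checked_add(current, addend):
    """Local model of add(): return current + addend, or raise if out of range."""
    result = current + addend
    if not fits_int64(result):
        raise OverflowError("result does not fit in a 64-bit signed integer")
    return result


print(fits_int64(2**63 - 1))
print(fits_int64(2**63))
print(checked_add(10, -3))
```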

quasardb.QuasardbException

alias of Error

class quasardb.Tag(handle, alias, *args, **kwargs)

Bases: quasardb.Entry

A tag to perform tag-based queries, such as listing all entries having the tag.

alias()
Returns:The alias of the entry
attach_tag(tag)

Attach a tag to the entry

Parameters:tag (str) – The tag to attach
Returns:True if the tag was successfully attached, False if it was already attached
Raises:Error
count()

Returns an approximate count of the entries matching the tag, up to the configured maximum cardinality.

Returns:The approximate count of tagged entries
Raises:Error
detach_tag(tag)

Detach a tag from the entry

Parameters:tag (str) – The tag to detach
Returns:True if the tag was successfully detached, False if it was not attached
Raises:Error
get_entries()

Returns all entries with the tag

Returns:The list of aliases of the tagged entries
Raises:Error
get_tags()

Returns the list of tags attached to the entry

Returns:A list of aliases (strings) of tags
Raises:Error
has_tag(tag)

Test if a tag is attached to the entry

Parameters:tag (str) – The tag to test
Returns:True if the entry has the specified tag, False otherwise
Raises:Error
class quasardb.TimeSeries(handle, alias, *args, **kwargs)

Bases: quasardb.RemoveableEntry

An unlimited, distributed, time series with nanosecond granularity and server-side aggregation capabilities.

class BlobAggregationResult(t, r, ts, count, content, content_length)

Bases: object

An aggregation result holding the range on which the aggregation was performed, the result value as well as the timestamp if it applies.

class BlobColumn(ts, col_name)

Bases: quasardb.Column

A column whose values are blobs

aggregate(aggregations)

Aggregates values over the given intervals

Parameters:aggregations (A list of (quasardb.TimeSeries.Aggregation, (datetime.datetime, datetime.datetime)) couples) – The aggregations to perform
Raises:Error
Returns:The list of aggregation results
erase_ranges(intervals)

Erase points within the specified intervals, left inclusive.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be erased
Raises:Error
Returns:The number of erased points
fast_insert(vector)

Inserts values into the time series.

Parameters:vector (quasardb.BlobPointsVector) – A vector of blob points
Raises:Error
get_ranges(intervals)

Returns the ranges matching the provided intervals.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be returned
Raises:Error
Returns:A flattened list of (datetime.datetime, string) couples
insert(tuples)

Inserts values into the time series.

Parameters:tuples (A list of (datetime.datetime, string) couples) – The list of couples to insert into the time series
Raises:Error
name()

Returns the name of the column

Returns:The name of the column
class Column(ts, col_name)

Bases: object

A column object within a time series on which one can get ranges and run aggregations.

aggregate(ts_func, aggregations)

Aggregates values over the given intervals

Parameters:
  • ts_func – Function to call
  • aggregations – The aggregations to perform
Raises:

Error

Returns:

A list of aggregation results

erase_ranges(intervals)

Erase points within the specified intervals, left inclusive.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be erased
Raises:Error
Returns:The number of erased points
name()

Returns the name of the column

Returns:The name of the column
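
The "left inclusive" wording used for erase_ranges and get_ranges can be modelled locally as a half-open interval test. The sketch below assumes the right bound is exclusive, which the text implies but does not state; in_interval and select_points are hypothetical helpers that operate on plain Python lists, not on a cluster.

```python
import datetime


def in_interval(timestamp, interval):
    """Return True when begin <= timestamp < end (right bound assumed exclusive)."""
    begin, end = interval
    return begin <= timestamp < end


def select_points(points, intervals):
    """Keep the (timestamp, value) couples that fall in at least one interval."""
    return [(ts, value) for ts, value in points
            if any(in_interval(ts, interval) for interval in intervals)]


day = datetime.datetime(2018, 1, 1)
points = [(day + datetime.timedelta(hours=h), float(h)) for h in (0, 6, 12, 24)]
intervals = [(day, day + datetime.timedelta(days=1))]

for ts, value in select_points(points, intervals):
    print(ts, value)
```

Under that assumption the point at midnight of January 1st is selected, while the point exactly one day later is not.
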
class ColumnInfo(col_name, col_type)

Bases: object

An object holding column information such as the name and the type.

class DoubleAggregationResult(t, r, ts, count, value)

Bases: object

An aggregation result holding the range on which the aggregation was performed, the result value as well as the timestamp if it applies.

class DoubleColumn(ts, col_name)

Bases: quasardb.Column

A column whose values are double precision floats

aggregate(aggregations)

Aggregates values over the given intervals

Parameters:aggregations (The list of (quasardb.TimeSeries.Aggregation, (datetime.datetime, datetime.datetime)) couples) – The aggregations to perform
Raises:Error
Returns:The list of aggregation results
erase_ranges(intervals)

Erase points within the specified intervals, left inclusive.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be erased
Raises:Error
Returns:The number of erased points
fast_insert(vector)

Inserts values into the time series.

Parameters:vector (quasardb.DoublePointsVector) – A vector of double points
Raises:Error
get_ranges(intervals)

Returns the ranges matching the provided intervals, left inclusive.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be returned
Raises:Error
Returns:A flattened list of (datetime.datetime, float) couples
insert(tuples)

Inserts values into the time series.

Parameters:tuples (A list of (datetime.datetime, float) couples) – The list of couples to insert into the time series
Raises:Error
name()

Returns the name of the column

Returns:The name of the column
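
The intervals parameter accepted by get_ranges and erase_ranges is a list of (datetime.datetime, datetime.datetime) couples. When a long period needs to be processed in smaller pieces, such a list can be generated with the standard library. This is a sketch only; split_intervals is a hypothetical helper and the one-day step is an arbitrary choice.

```python
import datetime


def split_intervals(begin, end, step):
    """Split [begin, end) into consecutive (start, stop) couples of at most `step`."""
    if step <= datetime.timedelta(0):
        raise ValueError("step must be positive")
    intervals = []
    start = begin
    while start < end:
        stop = min(start + step, end)
        intervals.append((start, stop))
        start = stop
    return intervals


begin = datetime.datetime(2018, 1, 1)
end = datetime.datetime(2018, 1, 3, 12)
for start, stop in split_intervals(begin, end, datetime.timedelta(days=1)):
    print(start, "->", stop)
```

The resulting list has the shape expected by the intervals parameter; the last couple is shorter when the period is not a multiple of the step.
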
class Int64Column(ts, col_name)

Bases: quasardb.Column

A column whose values are signed 64-bit integers

aggregate(ts_func, aggregations)

Aggregates values over the given intervals

Parameters:
  • ts_func – Function to call
  • aggregations – The aggregations to perform
Raises:

Error

Returns:

A list of aggregation results

erase_ranges(intervals)

Erase points within the specified intervals, left inclusive.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be erased
Raises:Error
Returns:The number of erased points
fast_insert(vector)

Inserts values into the time series.

Parameters:vector (quasardb.Int64PointsVector) – A vector of int64 points
Raises:Error
get_ranges(intervals)

Returns the ranges matching the provided intervals, left inclusive.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be returned
Raises:Error
Returns:A flattened list of (datetime.datetime, int) couples
insert(tuples)

Inserts values into the time series.

Parameters:tuples (A list of (datetime.datetime, int) couples) – The list of couples to insert into the time series
Raises:Error
name()

Returns the name of the column

Returns:The name of the column
class LocalTable(ts, columns=None)

Bases: object

A local table object that enables row by row bulk inserts.

append_row(timestamp, *args)

Appends a row to the local table. The content will not be sent to the cluster until push is called

Parameters:
  • timestamp (datetime.datetime) – The timestamp at which to add the row
  • args (floating points or blobs or None) – The columns of the row, to skip a column, use None
Raises:

Error

Returns:

The current row index

fast_append_row(timestamp, *args)

Appends a row to the local table. The content will not be sent to the cluster until push is called

Parameters:
  • timestamp (qdb_timespec_t) – The timestamp at which to add the row
  • args (floating points or blobs or None) – The columns of the row, to skip a column, use None
Raises:

Error

Returns:

The current row index

push()

Pushes the content of the local table to the remote cluster.

Raises:Error
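
Since append_row takes the column values positionally and uses None to skip a column, rows that arrive as dictionaries have to be put in column order first. The sketch below shows one way to do this; row_args is a hypothetical helper, and the column names are those of the timeseries example earlier in this section.

```python
def row_args(column_names, values):
    """Order a dict of values by column, using None for columns to skip."""
    unknown = set(values) - set(column_names)
    if unknown:
        raise KeyError("unknown columns: %s" % ", ".join(sorted(unknown)))
    return [values.get(name) for name in column_names]


columns = ["close", "volume", "value_date"]
print(row_args(columns, {"close": 1.0, "value_date": "2018-02-02"}))
```

The result can then be unpacked into the call, for example table.append_row(timestamp, *row_args(columns, values)).
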
class TimestampColumn(ts, col_name)

Bases: quasardb.Column

A column whose values are nanosecond-precise timestamps

aggregate(ts_func, aggregations)

Aggregates values over the given intervals

Parameters:
  • ts_func – Function to call
  • aggregations – The aggregations to perform
Raises:

Error

Returns:

A list of aggregation results

erase_ranges(intervals)

Erase points within the specified intervals, left inclusive.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be erased
Raises:Error
Returns:The number of erased points
fast_insert(vector)

Inserts values into the time series.

Parameters:vector (quasardb.TimestampPointsVector) – A vector of nanosecond-precise timestamps
Raises:Error
get_ranges(intervals)

Returns the ranges matching the provided intervals, left inclusive.

Parameters:intervals (A list of (datetime.datetime, datetime.datetime) couples) – The intervals for which the ranges should be returned
Raises:Error
Returns:A flattened list of (datetime.datetime, datetime.datetime) couples
insert(tuples)

Inserts values into the time series.

Parameters:tuples (A list of (datetime.datetime, datetime.datetime) couples) – The list of couples to insert into the time series
Raises:Error
name()

Returns the name of the column

Returns:The name of the column
alias()
Returns:The alias of the entry
attach_tag(tag)

Attach a tag to the entry

Parameters:tag (str) – The tag to attach
Returns:True if the tag was successfully attached, False if it was already attached
Raises:Error
column(col_info)

Accesses an existing column.

Parameters:col_info (TimeSeries.ColumnInfo) – A description of the column to access
Raises:Error
Returns:A TimeSeries.Column matching the provided information
columns()

Returns all existing columns.

Raises:Error
Returns:A list of all existing columns as TimeSeries.Column objects
columns_info()

Returns all existing columns information.

Raises:Error
Returns:A list of all existing columns as TimeSeries.ColumnInfo objects
create(columns, shard_size=datetime.timedelta(1))

Creates a time series with the provided columns information

Parameters:
  • columns (a list of TimeSeries.ColumnInfo) – A list describing the columns to create
  • shard_size (datetime.timedelta) – The length of a single timeseries shard (bucket).
Raises:

Error

Returns:

A list of columns matching the created columns

detach_tag(tag)

Detach a tag from the entry

Parameters:tag (str) – The tag to detach
Returns:True if the tag was successfully detached, False if it was not attached
Raises:Error
get_tags()

Returns the list of tags attached to the entry

Returns:A list of aliases (strings) of tags
Raises:Error
has_tag(tag)

Test if a tag is attached to the entry

Parameters:tag (str) – The tag to test
Returns:True if the entry has the specified tag, False otherwise
Raises:Error
local_table(columns=None)

Returns a LocalTable matching the provided columns or the full time series if None.

Parameters:columns (A list of quasardb.ColumnInfo) – A list of columns to build the LocalTable
Raises:Error
Returns:An initialized quasardb.LocalTable
remove()

Removes the given Entry from the repository. It is an error to remove a non-existing Entry.

Raises:Error
quasardb.build()

Returns the build tag and build date as a string

Returns:str – The API build tag
quasardb.make_error_string(error_code)

Returns a meaningful error message corresponding to the quasardb error code.

Parameters:error_code – The error code to translate
Returns:str – An error string
quasardb.version()

Returns the API’s version number as a string

Returns:str – The API version number