# 5.7. quasardb insertion tool¶

## 5.7.1. Introduction¶

The quasardb import tool (qdb_import, a.k.a. railgun) enables you to insert CSV files into the database in the fastest way possible. To know how to read the values and where to store them, qdb_import needs a configuration file. Otherwise, it will try to guess the configuration and ask you questions (it’s the interactve mode).

## 5.7.2. Quick Reference¶

Option

Usage

Default

display help

--help-config

display information about config file format

--help-parsers

--version

display the build version and date

set the cluster URI

qdb://127.0.0.1:2836

specify the file to insert

--config

specify the configuration file

--i-make-config

interactive mode: create a config file then exit

--i-auto

interactive mode: guess fields without asking

--i-verbose

interactive mode: becomes more explicit

--check-syntax

check the file’s syntax and display errors

--async

insert the data asynchronously

(disabled by default)

--max-threads

1

--max-error-count

maximum number of errors ignored

0

--user-credentials-file

path to the user’s private file

--cluster-public-key-file

path to the cluster’s public file

## 5.7.3. Configuration file¶

### Basic configuration¶

The configuration file is in this JSON format:

{
"delimiter": <One of ',', ';' or '\t', character separating the values>,
"has_header": <'true' or 'false', if it has a first line containing the column names>,
"default_timeseries": <The timeserie's name in which the values are inserted>,
"columns": [
// In case of a timestamp key value
{
"parser": <One of 'time, date, datetime, iso_8601, epoch, epoch_ms, epoch_ns'>,
"into": "$timestamp" }, // Otherwise, to insert the value { "parser": <One of 'time, date, datetime, iso_8601, epoch, epoch_ms, epoch_ns, double, int64, blob, quoted_blob'>, "into": <default timeseries's column where to put the value> } ] }  Example of configuration: • The first column creates the timestamp key. • The second column contains text inserted in the table ‘names’ of the time series ‘VIP’. • The third column contains integer inserted in the table ‘ages’ of the time series ‘VIP’. { "delimiter": ",", "has_header": false, "default_timeseries": "VIP", "columns": [ { "parser": "datetime", "into": "$timestamp"
},
{
"parser": "blob",
"into": "names"
},
{
"parser": "skip_column",
"into": "ages"
}
]
}


This configuration file would allow to insert values like this:

2018.15.05 11:03:02,Steve,21
2018.15.05 11:03:04,Duke Nukem,54
2018.15.05 11:03:05,Germaine,78
2018.15.05 11:03:07,,24
2018.15.05 11:03:11,Molly,
...


Notice that empty values are allowed if they are not timestamp values.

If several column values are marked as timestamps, the key used will the sum of these timestamps. If can be useful for example if one value is the current date, and the other value the current time.

The main JSON structure can give an optional “shard_size”, a mandatory parameter if there are mentionned timeseries that do not exist yet (otherwise, qdb_import will ask to enter it in command line). Units available are ‘d, h, min, s, ms’.

If a column’s parser is one of ‘skip_column, skip_quoted_column’, the value is ignored.

“into” can also be equal to a structure to have more flexibility:

"into": {
"timeseries": <The timeseries's name. Overrides "default_timeseries">
"column":     <The column's name>
}


Finally, “into” can be an array of several ‘into’ values. In this case, it will be inserted in multiple tables, perhaps in different timeseries.

Here is a more complex configuration example:

The first and the third columns create the timestamp key. The second column contains text inserted in: - The table ‘names’ of the time series ‘VIP’. - The table ‘persons’ of the time series ‘traces’. - The fourth columm is ignored.

{
"delimiter": ",",
"default_timeseries": "VIP",
"shard_size": "20min",
"columns": [
{
"parser": "date",
"into": "$timestamp" }, { "parser": "blob", "into": [ "names", { "timeseries": "traces", "column": "persons" } ] }, { "parser": "time", "into": "$timestamp"
},
{
"parser": "skip_column"
}
]
}


This second configuration file would allow to insert values like this:

2018.15.05,Steve,11:03:02,aaa
2018.15.05,Duke Nukem,11:03:04,1234
2018.15.05,Germaine,11:03:05,
2018.15.05,,11:03:07,xxx
2018.15.05,Molly,11:03:11,boop
...


Moreover, it allows to create the timeseries automatically (without asking the shard_size).

## 5.7.4. Parsers¶

Timestamp parsers :

• time: Parse a time of the day which may contains seconds and nanoseconds, eg. 11:03 or 02:49:31.123456789

• date: Parse a date at the format YYYY$MM$DD where $is ‘/’, ‘.’ or ‘-‘, eg. 2018.05.15 or 2019/02/26 • datetime: Parse a date and a time separated by spaces, eg. 2018-05-15 11:03:58.007 • iso_8601: Parse a date and a time in ISO format, eg. 2018-05-15T11:03:58.007 • epoch: Parse a number as seconds, eg. 2 or 2.45 • epoch_ms: Parse a number as milliseconds, eg. 2000 • epoch_ns: Parse a number as nanoseconds, eg. 2000000000 Other parsers : • double: Parse a floating value, eg. -1234.567, 123, -.2 or 3 • int64: Parse a signed integer, eg. 123 or -54 • blob: Parse text, eg. 123, bip bop or @#$%^&*{[<

• quoted_blob: Parse optionally quoted text, e.g. “commas,,” or bop bip. Double quotes escape quotes : “it”“s raining”

• skip_column: Skip a value that could be parsed by ‘blob’

• skip_quoted_column: Skip a value that could be parsed by ‘quoted_blob’

## 5.7.5. Program Options¶

-h, --help

Displays help about the program’s options.

--help-config

Displays information about config file format, for basic and advanced use cases.

--help-parsers

--version

Displays the build version and date.

-c, --cluster=<cluster_uri>

Specifies the address and port of the quasardb cluster to which the insertion tool must connect.

Argument

The URI (list of comma-separated endpoints, i.e. addresses and ports, preceded by qdb://) of a cluster on which the tests will be run.

Default value

qdb://127.0.0.1:2836, the IPv4 localhost address and the port 2836.

-f, --file=<path_to_file>

Specifies the file used for insertion.

Argument

Path to the file.

--config=<path_to_file>

Specifies the configuration file used to read the CSV file and insert it’s values.

Argument

Path to the file.

--i-make-config=<path_to_new_file>

The interactive mode outputs a configuration file at the given path then quits.

Argument

Path to the new file.

--i-auto

The interactive mode does not ask questions and quits on missing or ambiguous guess.

--i-verbose

The interactive mode is verbose, and asks more questions if ‘–i-auto’ is not specified.

--check-syntax

The program will locally check the file’s syntax then display it’s errors and quit. The program can run without any cluster.

--async

The program insert the data asynchronously.

--max-threads=<max_threads>

Maximum number of threads used. If the option is mentionned without value, it will be set to the number of logical cores availables on your machine.

Default value

1

--max-error-count=<max_error_count>

Maximum number of errors ignored (one per wrong line).

Default value

0

--user-credentials-file=<path_to_file>

Specifies the user’s private key file with both username and secret_key.

Argument

Path to the file.

--cluster-public-key-file=<path_to_file>

Specifies the path to the cluster’s public key file.

Argument

Path to the file.

5.6. quasardb benchmarking tool
5.8. quasardb export tool