Databases

These databases are supported by data_check. The instructions assume you use pipx. If you want to install the database drivers inside a virtual environment, replace pipx with pip.

Note: Do not store the credentials, especially the password directly, in the configuration! Use environment variables instead. For example postgresql://${DB_USER}:${DB_PASSWORD}@${DB_CONNECTION} instead of postgresql://username:password@db_host:5432/db_name.

PostgreSQL

Installation

Use data-check[postgres] to install data_check with PostgreSQL support:

pipx install data-check[postgres]

This will install psycopg2-binary as the database driver. psycopg2-binary should work on most systems without any additional dependencies.

Connection string

postgresql://username:password@db_host:5432/db_name

MySQL/MariaDB

Installation

Use data-check[mysql] to install data_check with MySQL/MariaDB support:

pipx install data-check[mysql]

This will install PyMySQL[rsa] as described in https://pypi.org/project/PyMySQL/ with additional cryptography dependencies.

Connection string

mysql+pymysql://username:password@db_host:3306/db_name

Microsoft SQL Server

Installation

Use data-check[mssql] to install data_check with Microsoft SQL Server support:

pipx install data-check[mssql]

This will install pyodbc which needs unixodbc and the development package (unixodbc-dev) on Linux.

Additionally you must install the Microsoft ODBC driver for SQL Server on your system.

Connection string

mssql+pyodbc://username:password@db_host:1433/db_name?driver=ODBC+Driver+17+for+SQL+Server

Oracle

You can choose between oracledb and cx_Oracle for Oracle. oracledb is preferred.

oracledb

Installation

Use data-check[oracledb] to install data_check with Oracle support:

pipx install data-check[oracledb]

This will install python-oracledb which does not requires any extra libraries.

Connection string

oracle+oracledb://username:password@db_host:1521/?service_name=XEPDB1

cx_Oracle

Installation

Use data-check[oracle] to install data_check with Oracle support:

pipx install data-check[oracle]

cx_Oracle needs Oracle client libraries. See https://cx-oracle.readthedocs.io/en/latest/user_guide/installation.html how to install them.

Connection string

oracle+cx_oracle://username:password@db_host:1521/?service_name=XEPDB1

DuckDB

Installation

Use data-check[duckdb] to install data_check with DuckDB support:

pipx install data-check[duckdb]

This will install duckdb-engine.

Connection string

duckdb:///path/to/duck.db

Note: This will use a relative path to the database file. Use duckdb:////full/path/to/duck.db to specify a full path.

Limitations

The load modes replace and upsert will not work with DuckDB since duckdb-engine does not support reflection on indices yet.

Databricks

Installation

Use data-check[databricks] to install data_check with Databricks support:

pipx install data-check[databricks]

This will install databricks-sql-python with the sqlalchemy option.

Connection string

databricks://token:${access_token}@${host}?http_path=${http_path}&catalog=${catalog}&schema=${schema}

Limitations

Lookups do not work yet.

The upsert load mode is not tested for Databricks.