Databases
These databases are supported by data_check. The instructions assume you use pipx. If you want to install the database drivers inside a virtual environment, replace pipx with pip.
Note: Do not store credentials, especially the password, directly in the configuration! Use environment variables instead.
For example, use postgresql://${DB_USER}:${DB_PASSWORD}@${DB_CONNECTION} instead of postgresql://username:password@db_host:5432/db_name.
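To illustrate how a ${NAME} placeholder in a connection string resolves against environment variables, here is a minimal sketch using only the Python standard library. It is not data_check's own implementation; the function name expand_connection and the example values are hypothetical.

```python
import os
from string import Template


def expand_connection(template, env=None):
    """Replace ${NAME} placeholders with values from env (defaults to os.environ)."""
    values = os.environ if env is None else env
    # substitute() raises KeyError when a variable is missing, so a
    # misconfigured environment fails early instead of yielding a broken URL.
    return Template(template).substitute(values)


# Example values only; in practice these come from the real environment.
example_env = {
    "DB_USER": "alice",
    "DB_PASSWORD": "s3cret",
    "DB_CONNECTION": "db_host:5432/db_name",
}

url = expand_connection(
    "postgresql://${DB_USER}:${DB_PASSWORD}@${DB_CONNECTION}", example_env
)
print(url)
```

With this approach the configuration file contains only the placeholder names, and the secret values exist only in the environment of the process that runs the checks.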
PostgreSQL
Installation
Use data-check[postgres] to install data_check with PostgreSQL support:
pipx install data-check[postgres]
This will install psycopg2-binary as the database driver. psycopg2-binary should work on most systems without any additional dependencies.
Connection string
postgresql://username:password@db_host:5432/db_name
MySQL/MariaDB
Installation
Use data-check[mysql] to install data_check with MySQL/MariaDB support:
pipx install data-check[mysql]
This will install PyMySQL[rsa], as described in https://pypi.org/project/PyMySQL/, with additional cryptography dependencies.
Connection string
mysql+pymysql://username:password@db_host:3306/db_name
Microsoft SQL Server
Installation
Use data-check[mssql] to install data_check with Microsoft SQL Server support:
pipx install data-check[mssql]
This will install pyodbc, which needs unixodbc and the development package (unixodbc-dev) on Linux.
Additionally you must install the Microsoft ODBC driver for SQL Server on your system.
Connection string
mssql+pyodbc://username:password@db_host:1433/db_name?driver=ODBC+Driver+17+for+SQL+Server
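The driver parameter in the connection string above is the URL-encoded name of the installed ODBC driver, with spaces written as +. The following sketch shows one way to build such a string with the standard library so that the driver name and any special characters in the credentials are encoded consistently. The helper mssql_url is hypothetical and not part of data_check, and the driver name must match whichever ODBC driver is actually installed on your system.

```python
from urllib.parse import quote_plus


def mssql_url(user, password, host, port, database, driver):
    """Assemble a mssql+pyodbc connection string with URL-encoded parts."""
    # quote_plus turns spaces into "+" and escapes characters such as "@"
    # or "/" that would otherwise be read as URL delimiters.
    return (
        f"mssql+pyodbc://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?driver={quote_plus(driver)}"
    )


url = mssql_url(
    "username", "password", "db_host", 1433, "db_name",
    "ODBC Driver 17 for SQL Server",
)
print(url)
```

If the credentials come from environment variables, the same encoding would need to be applied to the values before they are placed in the environment.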
Oracle
You can choose between oracledb and cx_Oracle for Oracle. oracledb is preferred.
oracledb
Installation
Use data-check[oracledb] to install data_check with Oracle support:
pipx install data-check[oracledb]
This will install python-oracledb, which does not require any extra libraries.
Connection string
oracle+oracledb://username:password@db_host:1521/?service_name=XEPDB1
cx_Oracle
Installation
Use data-check[oracle] to install data_check with Oracle support:
pipx install data-check[oracle]
cx_Oracle needs Oracle client libraries. See https://cx-oracle.readthedocs.io/en/latest/user_guide/installation.html for how to install them.
Connection string
oracle+cx_oracle://username:password@db_host:1521/?service_name=XEPDB1
DuckDB
Installation
Use data-check[duckdb] to install data_check with DuckDB support:
pipx install data-check[duckdb]
This will install duckdb-engine.
Connection string
duckdb:///path/to/duck.db
Note: This will use a relative path to the database file. Use duckdb:////full/path/to/duck.db to specify a full path.
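The difference between three and four slashes follows from how the URL is assembled: the prefix duckdb:/// is followed by the path exactly as written, so an absolute path contributes its own leading slash. A small sketch of that rule, using a hypothetical helper duckdb_url that is not part of data_check:

```python
from pathlib import PurePosixPath


def duckdb_url(db_path):
    """Build a DuckDB connection string from a relative or absolute POSIX path."""
    # The prefix always ends in three slashes; an absolute path adds a
    # fourth one because it starts with "/" itself.
    return "duckdb:///" + PurePosixPath(db_path).as_posix()


relative_url = duckdb_url("path/to/duck.db")
absolute_url = duckdb_url("/full/path/to/duck.db")
print(relative_url)
print(absolute_url)
```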
Limitations
The load modes replace and upsert will not work with DuckDB since duckdb-engine does not support reflection on indices yet.
Databricks
Installation
Use data-check[databricks]
to install data_check with Databricks support:
pipx install data-check[databricks]
This will install databricks-sql-python with the sqlalchemy option.
Connection string
databricks://token:${access_token}@${host}?http_path=${http_path}&catalog=${catalog}&schema=${schema}
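Because this connection string carries several values in its query part, it can be useful to check that an expanded string contains all of them before using it. The sketch below does this with the standard library only. The helper missing_databricks_params is hypothetical, and the token, host, and http_path values are made-up placeholders rather than real Databricks values.

```python
from urllib.parse import parse_qs, urlsplit

# Query parameters used in the connection string template above.
REQUIRED_PARAMS = ("http_path", "catalog", "schema")


def missing_databricks_params(url):
    """Return the names of required query parameters absent from the URL."""
    query = parse_qs(urlsplit(url).query)
    return [name for name in REQUIRED_PARAMS if name not in query]


# Placeholder values for illustration only.
example_url = (
    "databricks://token:example-token@example-host.invalid"
    "?http_path=/sql/1.0/warehouses/abc123&catalog=main&schema=default"
)

parts = urlsplit(example_url)
print(parts.username, parts.hostname)
print(missing_databricks_params(example_url))
```

An empty list means all three parameters are present; a variable that was left unset would typically show up here as a missing or empty parameter.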
Limitations
Lookups do not work yet.
The upsert load mode is not tested for Databricks.