mirror of
https://github.com/HannahVernon/pg-data-comparer.git
synced 2026-05-27 13:54:19 -05:00
A cross-platform command-line utility that exports PostgreSQL table data to tab-separated values (TSV) files for side-by-side comparison with diff tools such as [Beyond Compare](https://www.scootersoftware.com/).
- C# 100%
|
|
||
|---|---|---|
| .github | ||
| .gitignore | ||
| CODE_OF_CONDUCT.md | ||
| CONTRIBUTING.md | ||
| Directory.Build.props | ||
| global.json | ||
| LICENSE.md | ||
| pg-data-comparer.csproj | ||
| Program.cs | ||
| README.md | ||
| SECURITY.md | ||
pg-data-comparer
A cross-platform command-line utility that exports PostgreSQL table data to tab-separated values (TSV) files for side-by-side comparison with diff tools such as Beyond Compare.
Features
- Cross-platform — runs on Windows, macOS, and Linux (.NET 9)
- Parallel export — source and target tables are queried concurrently
- Auto-detected sort order — automatically discovers primary key columns for deterministic row ordering
- Per-table WHERE clauses — shared or independent filters for source and target
- Exclude generated columns — optionally omit identity, serial, and sequence-default columns
- Secure password input — prompts with masked input when passwords are not provided on the command line
- TSV output — clean tab-separated format with escaped special characters, ready for diff tools
- Progress reporting — logs progress every 100,000 rows for large tables
Prerequisites
- .NET 9 SDK or later
- Network access to the PostgreSQL servers being compared
Building
dotnet restore
dotnet build
To create a single-file executable:
dotnet publish -c Release -r win-x64 --self-contained
dotnet publish -c Release -r linux-x64 --self-contained
dotnet publish -c Release -r osx-x64 --self-contained
Usage
pg-data-comparer [options]
Connection Options
Each side (source and target) requires its own connection parameters:
| Option | Required | Default | Description |
|---|---|---|---|
--source-host |
Yes | — | Source PostgreSQL host |
--source-port |
No | 5432 |
Source PostgreSQL port |
--source-database |
Yes | — | Source database name |
--source-schema |
No | public |
Source schema name |
--source-table |
Yes | — | Source table name |
--source-user |
Yes | — | Source database username |
--source-password |
No | (prompts) | Source database password |
--target-host |
Yes | — | Target PostgreSQL host |
--target-port |
No | 5432 |
Target PostgreSQL port |
--target-database |
Yes | — | Target database name |
--target-schema |
No | public |
Target schema name |
--target-table |
Yes | — | Target table name |
--target-user |
Yes | — | Target database username |
--target-password |
No | (prompts) | Target database password |
Query Options
| Option | Default | Description |
|---|---|---|
--where |
— | WHERE clause applied to both queries (without the WHERE keyword) |
--source-where |
— | WHERE clause for the source query only (overrides --where for source) |
--target-where |
— | WHERE clause for the target query only (overrides --where for target) |
--sort-by |
(auto-detect PK) | Comma-separated column list for ORDER BY |
--exclude-generated |
false |
Exclude identity, serial, and sequence-default columns |
Output Options
| Option | Default | Description |
|---|---|---|
--output-dir |
. (current directory) |
Directory for output TSV files |
--null-text |
<NULL> |
Text representation for NULL values |
Examples
Basic comparison — same table on two servers
pg-data-comparer \
--source-host prod-db --source-database myapp --source-table users \
--source-user appuser \
--target-host staging-db --target-database myapp --target-table users \
--target-user appuser
Passwords will be prompted interactively with masked input.
Filtered comparison with a shared WHERE clause
pg-data-comparer \
--source-host prod-db --source-database billing --source-schema billing_own \
--source-table accountingentry --source-user billing_app --source-password secret1 \
--target-host staging-db --target-database billing --target-schema billing_own \
--target-table accountingentry --target-user billing_app --target-password secret2 \
--where "created_date >= '2026-01-01'" \
--output-dir C:\temp\compare
Independent WHERE clauses per side
pg-data-comparer \
--source-host db1 --source-database app --source-table orders \
--source-user reader --source-password pass1 \
--source-where "region = 'US' AND status = 'COMPLETE'" \
--target-host db2 --target-database app --target-table orders \
--target-user reader --target-password pass2 \
--target-where "region = 'CA' AND status = 'COMPLETE'"
Excluding auto-generated columns
pg-data-comparer \
--source-host prod --source-database mydb --source-table events \
--source-user admin --source-password secret \
--target-host staging --target-database mydb --target-table events \
--target-user admin --target-password secret \
--exclude-generated
This omits columns that are:
- Identity columns (
GENERATED ALWAYS AS IDENTITY/GENERATED BY DEFAULT AS IDENTITY) - Serial/bigserial columns (default of
nextval(...)) - Generated stored columns (
GENERATED ALWAYS AS (...))
Custom sort order
pg-data-comparer \
--source-host db1 --source-database app --source-table audit_log \
--source-user reader --source-password pass \
--target-host db2 --target-database app --target-table audit_log \
--target-user reader --target-password pass \
--sort-by "event_date, user_id"
Output
The tool produces two TSV files named after the connection details:
Source TSV: C:\temp\compare\prod-db_billing_billing_own_accountingentry.tsv
Target TSV: C:\temp\compare\staging-db_billing_billing_own_accountingentry.tsv
When both sides would produce the same filename (e.g., same host/database/schema/table), _source and _target suffixes are appended automatically.
TSV Format Details
- Header row — column names, tab-separated
- Data rows — values tab-separated, one row per line
- NULL values — rendered as
<NULL>(configurable via--null-text) - Special characters — tabs, carriage returns, and newlines within field values are escaped as
\t,\r,\n - Dates — formatted as ISO 8601 round-trip (
Oformat) - Binary data — hex-encoded
- Encoding — UTF-8 without BOM
Comparing the Output
Open both TSV files in your preferred diff tool:
- Beyond Compare — File > Open, select both TSV files; use "Table Compare" session type for best results
- WinMerge — File > Open, select both files
- VS Code — Use a diff extension or run
code --diff source.tsv target.tsv - CLI —
diff source.tsv target.tsv(Linux/macOS)
How It Works
- Connects to both PostgreSQL servers in parallel
- Auto-detects primary key columns for deterministic ordering (unless
--sort-byis specified) - Optionally queries
information_schema.columnsto identify and exclude generated columns - Streams rows via
SequentialAccessfor memory efficiency - Writes TSV output with escaped special characters
- Reports progress every 100,000 rows
Dependencies
| Package | Version | License |
|---|---|---|
| Npgsql | 9.0.3 | PostgreSQL License |
| System.CommandLine | 2.0.0-beta4 | MIT |
License
MIT