A cross-platform command-line utility that exports PostgreSQL table data to tab-separated values (TSV) files for side-by-side comparison with diff tools such as [Beyond Compare](https://www.scootersoftware.com/).
Find a file
Hannah Vernon b8709714f0
Merge pull request #5 from HannahVernon/dev
Merge dev to main: community health files
2026-04-20 14:17:21 -05:00
.github Add community health files and templates 2026-04-20 12:46:59 -05:00
.gitignore Initial commit: pg-data-comparer CLI utility 2026-03-31 14:56:57 -05:00
CODE_OF_CONDUCT.md Add community health files and templates 2026-04-20 12:46:59 -05:00
CONTRIBUTING.md Add community health files and templates 2026-04-20 12:46:59 -05:00
Directory.Build.props Add automated version bump on merge to main 2026-04-08 14:20:09 -05:00
global.json Add automated version bump on merge to main 2026-04-08 14:20:09 -05:00
LICENSE.md Initial commit: pg-data-comparer CLI utility 2026-03-31 14:56:57 -05:00
pg-data-comparer.csproj Initial commit: pg-data-comparer CLI utility 2026-03-31 14:56:57 -05:00
Program.cs Add --include-postgres-system-objects CLI flag 2026-04-08 14:14:03 -05:00
README.md Initial commit: pg-data-comparer CLI utility 2026-03-31 14:56:57 -05:00
SECURITY.md Add community health files and templates 2026-04-20 12:46:59 -05:00

pg-data-comparer

A cross-platform command-line utility that exports PostgreSQL table data to tab-separated values (TSV) files for side-by-side comparison with diff tools such as Beyond Compare.

Features

  • Cross-platform — runs on Windows, macOS, and Linux (.NET 9)
  • Parallel export — source and target tables are queried concurrently
  • Auto-detected sort order — automatically discovers primary key columns for deterministic row ordering
  • Per-table WHERE clauses — shared or independent filters for source and target
  • Exclude generated columns — optionally omit identity, serial, and sequence-default columns
  • Secure password input — prompts with masked input when passwords are not provided on the command line
  • TSV output — clean tab-separated format with escaped special characters, ready for diff tools
  • Progress reporting — logs progress every 100,000 rows for large tables

Prerequisites

  • .NET 9 SDK or later
  • Network access to the PostgreSQL servers being compared

Building

dotnet restore
dotnet build

To create a single-file executable:

dotnet publish -c Release -r win-x64 --self-contained
dotnet publish -c Release -r linux-x64 --self-contained
dotnet publish -c Release -r osx-x64 --self-contained

Usage

pg-data-comparer [options]

Connection Options

Each side (source and target) requires its own connection parameters:

Option Required Default Description
--source-host Yes Source PostgreSQL host
--source-port No 5432 Source PostgreSQL port
--source-database Yes Source database name
--source-schema No public Source schema name
--source-table Yes Source table name
--source-user Yes Source database username
--source-password No (prompts) Source database password
--target-host Yes Target PostgreSQL host
--target-port No 5432 Target PostgreSQL port
--target-database Yes Target database name
--target-schema No public Target schema name
--target-table Yes Target table name
--target-user Yes Target database username
--target-password No (prompts) Target database password

Query Options

Option Default Description
--where WHERE clause applied to both queries (without the WHERE keyword)
--source-where WHERE clause for the source query only (overrides --where for source)
--target-where WHERE clause for the target query only (overrides --where for target)
--sort-by (auto-detect PK) Comma-separated column list for ORDER BY
--exclude-generated false Exclude identity, serial, and sequence-default columns

Output Options

Option Default Description
--output-dir . (current directory) Directory for output TSV files
--null-text <NULL> Text representation for NULL values

Examples

Basic comparison — same table on two servers

pg-data-comparer \
  --source-host prod-db --source-database myapp --source-table users \
  --source-user appuser \
  --target-host staging-db --target-database myapp --target-table users \
  --target-user appuser

Passwords will be prompted interactively with masked input.

Filtered comparison with a shared WHERE clause

pg-data-comparer \
  --source-host prod-db --source-database billing --source-schema billing_own \
  --source-table accountingentry --source-user billing_app --source-password secret1 \
  --target-host staging-db --target-database billing --target-schema billing_own \
  --target-table accountingentry --target-user billing_app --target-password secret2 \
  --where "created_date >= '2026-01-01'" \
  --output-dir C:\temp\compare

Independent WHERE clauses per side

pg-data-comparer \
  --source-host db1 --source-database app --source-table orders \
  --source-user reader --source-password pass1 \
  --source-where "region = 'US' AND status = 'COMPLETE'" \
  --target-host db2 --target-database app --target-table orders \
  --target-user reader --target-password pass2 \
  --target-where "region = 'CA' AND status = 'COMPLETE'"

Excluding auto-generated columns

pg-data-comparer \
  --source-host prod --source-database mydb --source-table events \
  --source-user admin --source-password secret \
  --target-host staging --target-database mydb --target-table events \
  --target-user admin --target-password secret \
  --exclude-generated

This omits columns that are:

  • Identity columns (GENERATED ALWAYS AS IDENTITY / GENERATED BY DEFAULT AS IDENTITY)
  • Serial/bigserial columns (default of nextval(...))
  • Generated stored columns (GENERATED ALWAYS AS (...))

Custom sort order

pg-data-comparer \
  --source-host db1 --source-database app --source-table audit_log \
  --source-user reader --source-password pass \
  --target-host db2 --target-database app --target-table audit_log \
  --target-user reader --target-password pass \
  --sort-by "event_date, user_id"

Output

The tool produces two TSV files named after the connection details:

Source TSV: C:\temp\compare\prod-db_billing_billing_own_accountingentry.tsv
Target TSV: C:\temp\compare\staging-db_billing_billing_own_accountingentry.tsv

When both sides would produce the same filename (e.g., same host/database/schema/table), _source and _target suffixes are appended automatically.

TSV Format Details

  • Header row — column names, tab-separated
  • Data rows — values tab-separated, one row per line
  • NULL values — rendered as <NULL> (configurable via --null-text)
  • Special characters — tabs, carriage returns, and newlines within field values are escaped as \t, \r, \n
  • Dates — formatted as ISO 8601 round-trip (O format)
  • Binary data — hex-encoded
  • Encoding — UTF-8 without BOM

Comparing the Output

Open both TSV files in your preferred diff tool:

  • Beyond Compare — File > Open, select both TSV files; use "Table Compare" session type for best results
  • WinMerge — File > Open, select both files
  • VS Code — Use a diff extension or run code --diff source.tsv target.tsv
  • CLIdiff source.tsv target.tsv (Linux/macOS)

How It Works

  1. Connects to both PostgreSQL servers in parallel
  2. Auto-detects primary key columns for deterministic ordering (unless --sort-by is specified)
  3. Optionally queries information_schema.columns to identify and exclude generated columns
  4. Streams rows via SequentialAccess for memory efficiency
  5. Writes TSV output with escaped special characters
  6. Reports progress every 100,000 rows

Dependencies

Package Version License
Npgsql 9.0.3 PostgreSQL License
System.CommandLine 2.0.0-beta4 MIT

License

MIT