Tools for ongoing data comparison between identically structured tables in Oracle and PostgreSQL
Great question. Checking ongoing data consistency between Oracle and PostgreSQL (with identical schemas) is a very common use case, and there are solid options among both AWS managed services and open source tools. Let's break them down:
AWS Managed Services
These are a good fit if you want a hands-off, scalable solution without managing infrastructure:
AWS Glue
AWS's managed ETL service works well for cross-database data validation. Use its JDBC connections to reach both Oracle and PostgreSQL, then write Python or Scala scripts that implement your comparison logic: checking row counts per primary key group, computing row-level hashes for quick comparisons, or validating individual field values. For ongoing checks, run the Glue jobs on a schedule (Glue triggers, or Amazon EventBridge, formerly CloudWatch Events), write results to S3 for auditing, and use CloudWatch Alarms to notify you of inconsistencies. If you prefer low-code, Glue DataBrew offers visual validation workflows too.
AWS Database Migration Service (DMS)
While DMS is primarily for migration and sync, it includes a built-in data validation feature. If you're already using DMS to replicate data between Oracle and PostgreSQL, enable this feature to automatically compare source and target data. It supports validation during both full load and ongoing CDC (Change Data Capture), so you can monitor consistency in near real time as data changes. DMS records validation failures in a table on the target database (awsdms_validation_failures_v1) and reports validation state in the task's table statistics; task logs go to CloudWatch.
AWS Lambda + CloudWatch Events
For full customization, write a Lambda function using database drivers like cx_Oracle (or its successor, python-oracledb) for Oracle and psycopg2 for PostgreSQL to query and compare data directly. Schedule the function at your desired interval using CloudWatch Events (now Amazon EventBridge), and use Amazon SNS to send alerts when mismatches are found. This is ideal if you have very specific validation rules that don't fit out-of-the-box tools.
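A minimal sketch of the comparison logic such a Lambda function might run. It is written against the generic Python DB-API, so the two connections could come from python-oracledb and psycopg2; here sqlite3 in-memory databases stand in for both sides so the example runs anywhere. The table my_table, its columns, and the helper names are hypothetical.

```python
import sqlite3  # stand-in for the real drivers so the sketch runs anywhere


def fetch_rows(conn, query):
    """Run a query on any DB-API 2.0 connection and return {primary_key: other_columns}."""
    cur = conn.cursor()
    cur.execute(query)
    rows = {row[0]: tuple(row[1:]) for row in cur.fetchall()}
    cur.close()
    return rows


def compare_tables(source_conn, target_conn, query):
    """Run the same query on both sides and compare rows keyed by the first column."""
    source = fetch_rows(source_conn, query)
    target = fetch_rows(target_conn, query)
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "extra_in_target": sorted(target.keys() - source.keys()),
        "mismatched": sorted(
            key for key in source.keys() & target.keys() if source[key] != target[key]
        ),
    }


def make_demo_db(rows):
    """Build an in-memory demo table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    return conn


source_db = make_demo_db([(1, "a", "x"), (2, "b", "y"), (3, "c", "z")])
target_db = make_demo_db([(1, "a", "x"), (2, "b", "CHANGED"), (4, "d", "w")])
report = compare_tables(source_db, target_db, "SELECT id, col1, col2 FROM my_table")
print(report)
```

In a real function, the report would typically be published to SNS when any of the three lists is non-empty. Loading whole tables into memory only suits small tables; larger ones call for hashing or range-based checks.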
Open Source Tools
These are ideal for teams that want flexibility, cost savings, or full control over the validation logic:
Apache NiFi
A visual data flow tool that lets you build automated comparison pipelines without heavy coding. Use its JDBC-based processors to pull data from both databases, then match records by primary key and flag inconsistencies, for example with a comparison processor such as CompareContent if your NiFi distribution provides one, or with a scripted processor otherwise. NiFi handles scheduled full-table scans and, when fed by a CDC source, near-real-time checks. Mismatched records can be routed to storage (like S3) or to alerting tools (Slack, email).
Debezium + Kafka Streams
For real-time ongoing validation, Debezium captures CDC events from both Oracle and PostgreSQL, streaming change logs to Apache Kafka. Use Kafka Streams to write lightweight processing logic that compares corresponding change events from both databases: if a record is updated in Oracle but not reflected correctly in PostgreSQL, you can trigger an immediate alert. This suits high-throughput, low-latency scenarios.
Custom SQL Scripts + Cron
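For smaller-scale or simpler checks, write SQL scripts to generate row-level hashes or aggregate metrics for each table. Two details matter for the hashes to line up. Oracle has no built-in MD5() function, so use STANDARD_HASH (Oracle 12c and later) and lowercase its hex output to match PostgreSQL's md5(). Oracle also treats NULL as an empty string when concatenating, while PostgreSQL returns NULL for the whole expression, so wrap the PostgreSQL columns in COALESCE. Non-text columns (dates, numbers) need an explicit, identical text format on both sides. For example:
- Oracle:
  SELECT id, LOWER(RAWTOHEX(STANDARD_HASH(col1 || '|' || col2 || '|' || col3, 'MD5'))) AS row_hash FROM my_table ORDER BY id;
- PostgreSQL:
  SELECT id, MD5(COALESCE(col1::text, '') || '|' || COALESCE(col2::text, '') || '|' || COALESCE(col3::text, '')) AS row_hash FROM my_table ORDER BY id;
Export the results to CSV, then use a tool like diff or a Python script to compare them. Schedule this workflow with cron (Linux) or Task Scheduler (Windows) for ongoing checks.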
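The compare step for the exported files could look like the sketch below. It assumes each export has a header row followed by two columns, the id and the row hash; the in-memory strings stand in for the real CSV files, and the hash values are made up for illustration.

```python
import csv
import io


def load_hashes(csv_file):
    """Read an export with a header row and two columns (id, row_hash) into a dict."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header line
    # Lowercase the hash so hex case differences between databases do not matter.
    return {row[0]: row[1].strip().lower() for row in reader if row}


def diff_hashes(source, target):
    """Return ids missing on either side and ids whose hashes differ."""
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "extra_in_target": sorted(target.keys() - source.keys()),
        "mismatched": sorted(
            key for key in source.keys() & target.keys() if source[key] != target[key]
        ),
    }


# In-memory stand-ins for the two exported files (made-up hash values).
oracle_export = io.StringIO("ID,ROW_HASH\n1,AAA111\n2,BBB222\n3,CCC333\n")
postgres_export = io.StringIO("id,row_hash\n1,aaa111\n2,bbb999\n4,ddd444\n")

result = diff_hashes(load_hashes(oracle_export), load_hashes(postgres_export))
print(result)
```

With real files, replace the StringIO objects with open("oracle.csv", newline="") and the PostgreSQL equivalent (both file names hypothetical). Unlike diff, this comparison does not depend on the rows being exported in the same order.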
DataDiff (Open Source)
Purpose-built open source tools like DataDiff support direct comparisons between Oracle and PostgreSQL. Configure connection details, specify tables to compare, and the tool will generate detailed reports of missing records, mismatched field values, and row count differences. You can wrap this in a cron job to run it automatically on a schedule.
Quick Tips
- For large tables, prefer hash-based comparisons over row-by-row checks to reduce load: compute hashes per partition or per primary key range instead of over the full table, then drill into only the ranges that differ.
- If you’re running ongoing sync between the two databases, use CDC-based tools to only compare changed data, reducing resource usage.
- AWS services handle scaling and maintenance, while open source tools offer full customization for niche validation rules.
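The range-based hashing from the first tip can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes integer primary keys in the first column, hashes in Python rather than in the database, and uses SHA-256 (any stable digest works when both sides are hashed by the same code). All function names are hypothetical.

```python
import hashlib


def row_hash(row):
    """Digest of the '|'-joined column values, with None rendered as an empty string."""
    joined = "|".join("" if value is None else str(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def range_digests(rows, bucket_size):
    """Fold the row hashes of each primary-key range into one digest per range."""
    buckets = {}
    for row in sorted(rows, key=lambda r: r[0]):
        bucket = row[0] // bucket_size
        buckets.setdefault(bucket, hashlib.sha256()).update(row_hash(row).encode("ascii"))
    return {bucket: digest.hexdigest() for bucket, digest in buckets.items()}


def mismatched_ranges(source_rows, target_rows, bucket_size=1000):
    """Return the (first_id, last_id) ranges whose digests differ between the two sides."""
    source = range_digests(source_rows, bucket_size)
    target = range_digests(target_rows, bucket_size)
    return sorted(
        (bucket * bucket_size, (bucket + 1) * bucket_size - 1)
        for bucket in source.keys() | target.keys()
        if source.get(bucket) != target.get(bucket)
    )


# Demo data: 3000 rows, with a single changed value at id 1500 on the target side.
source_rows = [(i, f"name{i}") for i in range(1, 3001)]
target_rows = [(i, "renamed" if i == 1500 else f"name{i}") for i in range(1, 3001)]

ranges = mismatched_ranges(source_rows, target_rows)
print(ranges)  # → [(1000, 1999)]
```

Only the flagged range then needs a row-level comparison, which keeps the expensive check small even when the table is large.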
The question comes from Stack Exchange; it was asked by Punter Vicky.




