Forward replication issues
Performance troubleshooting
If MOLT Replicator appears hung or performs poorly:
- Enable trace logging with -vv to get more visibility into the replicator's state and behavior.
- If MOLT Replicator is in an unknown, hung, or erroneous state, collect performance profiles to include with support tickets. Replace {host} and {metrics-port} with your Replicator host and the port specified by --metricsAddr:

curl '{host}:{metrics-port}/debug/pprof/trace?seconds=15' > trace.out
curl '{host}:{metrics-port}/debug/pprof/profile?seconds=15' > profile.out
curl '{host}:{metrics-port}/debug/pprof/goroutine?seconds=15' > gr.out
curl '{host}:{metrics-port}/debug/pprof/heap?seconds=15' > heap.out

- Monitor lag metrics and adjust performance parameters as needed.
Unable to create publication or slot
This error occurs when logical replication is not supported.
Resolution: If you are connected to a replica, connect to the primary instance instead. Replicas cannot create or manage logical replication slots or publications.
Verify that the source database supports logical replication by checking the wal_level parameter on PostgreSQL:
SHOW wal_level;
If wal_level is not set to logical, update it and restart PostgreSQL:
ALTER SYSTEM SET wal_level = 'logical';
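If wal_level is already logical and slot creation still fails, it can help to confirm that the server has capacity for another replication slot and WAL sender. A minimal check using standard PostgreSQL settings (not MOLT-specific; limits vary by deployment):

-- Confirm there is headroom for another replication slot and WAL sender
SHOW max_replication_slots;
SHOW max_wal_senders;

-- Count the slots currently in use
SELECT count(*) FROM pg_replication_slots;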
Replication slot already exists
ERROR: replication slot "molt_slot" already exists
Resolution: Either create a new slot with a different name, or drop the existing slot to start fresh:
SELECT pg_drop_replication_slot('molt_slot');
Dropping a replication slot is destructive and can delete data that has not yet been replicated. Only drop the slot if you want to restart replication from the current position.
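Before dropping the slot, you can inspect its state to see whether it is in use and how far it has progressed. A quick check against the standard pg_replication_slots view:

-- Check whether the slot is active and its replication progress before dropping it
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name = 'molt_slot';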
Publication does not exist
run CREATE PUBLICATION molt_fetch FOR ALL TABLES;
Resolution: Create the publication on the source database. Ensure you also create the replication slot:
CREATE PUBLICATION molt_publication FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
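To confirm that the publication exists on the database you are connected to, you can query the standard PostgreSQL catalog:

-- Verify the publication exists and whether it covers all tables
SELECT pubname, puballtables FROM pg_publication;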
Could not connect to PostgreSQL
could not connect to source database: failed to connect to `user=migration_user database=migration_db`
Resolution: Verify the connection details including user, host, port, and database name. Ensure the database name in your --sourceConn connection string matches exactly where you created the publication and slot. Verify you're connecting to the same host and port where you ran the CREATE PUBLICATION and SELECT pg_create_logical_replication_slot() commands. Check if TLS certificates need to be included in the connection URI.
Wrong replication slot name
run SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput'); in source database
Resolution: Create the replication slot or verify the correct slot name:
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
Repeated binlog syncing restarts
If Replicator repeatedly restarts binlog syncing or starts replication from an unexpectedly old location, this indicates an invalid or purged GTID. When an invalid GTID is provided, the binlog syncer will fall back to the first valid GTID.
Resolution: Verify the GTID set is valid and not purged:
-- Check if GTID is in executed set
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_executed) AS in_executed;
-- Check if GTID is purged
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_purged) AS in_purged;
Interpret the results as follows:
- If in_executed returns 1 and in_purged returns 0, the GTID is valid for replication.
- If in_purged returns 1, the GTID has been purged and you must find a newer consistent point.
- If both return 0, the GTID doesn't exist in the server's GTID history and is invalid.
If the GTID is purged or invalid, follow these steps:
Increase binlog retention by configuring binlog_expire_logs_seconds in MySQL:

-- Increase binlog retention (example: 7 days = 604800 seconds)
SET GLOBAL binlog_expire_logs_seconds = 604800;

Note: For managed MySQL services (such as Amazon RDS, Google Cloud SQL, or Azure Database for MySQL), binlog retention is typically configured through the provider's console or CLI. Consult your provider's documentation for how to adjust binlog retention settings.
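To confirm the retention currently in effect, you can query the server variable directly. This applies to MySQL 8.0+; older versions use expire_logs_days instead:

-- Check the current binlog retention setting (MySQL 8.0+)
SELECT @@GLOBAL.binlog_expire_logs_seconds;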
Get a current GTID set to restart replication:
-- For MySQL < 8.0:
SHOW MASTER STATUS;

-- For MySQL 8.0+:
SHOW BINARY LOG STATUS;

+---------------+----------+--------------+------------------+-------------------------------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                         |
+---------------+----------+--------------+------------------+-------------------------------------------+
| binlog.000005 | 197      |              |                  | 77263736-7899-11f0-81a5-0242ac120002:1-38 |
+---------------+----------+--------------+------------------+-------------------------------------------+

Use the Executed_Gtid_Set value for the --defaultGTIDSet flag.
Invalid GTID format
Invalid GTIDs can occur when GTIDs are purged due to insufficient binlog retention, when connecting to a replica instead of the primary host, or when passing a GTID that has valid format but doesn't exist in the binlog history.
Resolution: Use a valid GTID from SHOW MASTER STATUS (MySQL < 8.0) or SHOW BINARY LOG STATUS (MySQL 8.0+) and ensure you're connecting to the primary host. If GTIDs are being purged, increase binlog retention.
Table/column names exceed 30 characters
Oracle LogMiner excludes tables and columns with names longer than 30 characters from redo logs.
Resolution: Rename tables and columns to 30 characters or fewer before migration.
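To find affected objects ahead of time, you can query the Oracle data dictionary. A sketch that assumes the schema is named MIGRATION_SCHEMA (adjust the owner filter):

-- Find table and column names longer than 30 characters
SELECT table_name FROM all_tables
WHERE owner = 'MIGRATION_SCHEMA' AND LENGTH(table_name) > 30;

SELECT table_name, column_name FROM all_tab_columns
WHERE owner = 'MIGRATION_SCHEMA' AND LENGTH(column_name) > 30;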
Unsupported data types
LogMiner and replication do not support:
- Long BLOB/CLOBs (4000+ characters)
- User-defined types (UDTs)
- Nested tables
- Varrays
- GEOGRAPHY and GEOMETRY
Resolution: Convert unsupported data types or exclude affected tables from replication.
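To locate tables that may be affected before enabling replication, you can scan the data dictionary for these types. A sketch assuming the MIGRATION_SCHEMA owner and a non-exhaustive type list:

-- Find columns whose data types LogMiner may not support
SELECT owner, table_name, column_name, data_type
FROM all_tab_columns
WHERE owner = 'MIGRATION_SCHEMA'
  AND data_type IN ('CLOB', 'NCLOB', 'BLOB', 'SDO_GEOMETRY');

-- Find columns that use user-defined object types
SELECT owner, table_name, column_name, data_type
FROM all_tab_columns
WHERE owner = 'MIGRATION_SCHEMA'
  AND data_type_owner IS NOT NULL;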
LOB column UPDATE statements
UPDATE statements that only modify LOB columns are not supported by Oracle LogMiner.
Resolution: Avoid LOB-only updates during replication, or use Binary Reader for Oracle 12c.
JSONB null handling
SQL NULL and JSON null values are not distinguishable in JSON payloads during replication.
Resolution: Avoid using nullable JSONB columns where the distinction between SQL NULL and JSON null is important.
Missing redo logs or unavailable SCN
If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable.
Resolution: Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment.
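To review the current online redo log configuration and confirm that archiving is keeping up, you can query the standard dynamic performance views (requires SELECT privileges on them):

-- Inspect online redo log groups and sizes
SELECT group#, thread#, bytes/1024/1024 AS size_mb, status FROM v$log;

-- Confirm archived logs have been generated recently
SELECT MAX(first_time) AS latest_archived FROM v$archived_log;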
Replicator lag
If the replicator process is lagging significantly behind the current Oracle SCN, you may see log messages like: replicator is catching up to the current SCN at 5000 from 1000…. This indicates that replication is progressing but is still behind the most recent changes on the source database.
Schema drift errors
This indicates that the source and target schemas are mismatched:
WARNING: schema drift detected in "database"."table" at payload object offset 0: unexpected columns: column_name
Resolution: Align schemas or use userscripts to transform data.
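To see exactly which columns exist on the target side, you can query information_schema and compare the result against the column named in the warning. A sketch with hypothetical schema and table names:

-- List the target table's columns to compare against the drift warning
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'employees'
ORDER BY ordinal_position;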
Apply flow failures
Apply flow failures occur when the target database encounters errors during apply operations, such as unique constraint violations, an unavailable target database, or incorrect data (missing or extraneous columns):
WARNING: warning during tryCommit: ERROR: duplicate key value violates unique constraint
ERROR: maximum number of retries (10) exceeded
Resolution: Check target database constraints and connection stability. MOLT Replicator will log warnings for each retry attempt. If you see warnings but no final error, the apply succeeded after retrying. If all retry attempts are exhausted, Replicator will surface a final error and restart the apply loop to continue processing.
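If the failure is a constraint violation, inspecting the target table's constraints can show which unique constraint or index is being hit. A sketch assuming a CockroachDB target and a hypothetical employees table:

-- Show the constraints defined on the target table
SHOW CONSTRAINTS FROM employees;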
Failback issues
If the changefeed shows connection errors in SHOW CHANGEFEED JOB:
Connection refused
transient error: Post "https://replicator-host:30004/migration_db/migration_schema": dial tcp [::1]:30004: connect: connection refused
This indicates that Replicator is down, the webhook URL is incorrect, or the port is misconfigured.
Resolution: Verify that MOLT Replicator is running on the port specified in the changefeed INTO configuration. Confirm the host and port are correct.
Incorrect schema path errors
This error occurs when the CockroachDB changefeed webhook URL path does not match the target database schema naming convention:
transient error: 400 Bad Request: unknown schema:
The webhook URL path is specified in the INTO clause when you create the changefeed. For example: webhook-https://replicator-host:30004/database/schema.
Resolution: Verify the webhook path format matches your target database type:
- PostgreSQL targets should use /database/schema format. For example, webhook-https://replicator-host:30004/migration_db/migration_schema.
- MySQL targets should use /database format. For example, webhook-https://replicator-host:30004/migration_db.
- Oracle targets should use /SCHEMA format in uppercase. For example, webhook-https://replicator-host:30004/MIGRATION_SCHEMA.
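For reference, a changefeed with a correctly formed webhook path for a PostgreSQL target might look like the following sketch. The table name and options are illustrative only, not a complete failback configuration:

CREATE CHANGEFEED FOR TABLE defaultdb.public.employees
  INTO 'webhook-https://replicator-host:30004/migration_db/migration_schema'
  WITH updated, resolved = '1s';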
For details on configuring the webhook sink URI, refer to Webhook sink.
GC threshold error
batch timestamp * must be after replica GC threshold
This indicates that the changefeed is starting from an invalid cursor whose data has already been garbage collected.
Resolution: Double-check the cursor to ensure it represents a valid range that has not been garbage collected, or extend the GC TTL on the source CockroachDB cluster:
ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = {gc_ttl_in_seconds};
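To check the GC TTL currently in effect before changing it, inspect the zone configuration on the source cluster:

-- View the current zone configuration, including gc.ttlseconds
SHOW ZONE CONFIGURATION FROM DATABASE defaultdb;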
Duplicated data re-application
This occurs when resuming a changefeed from a cursor causes excessive data duplication.
Resolution: Clear the staging database to prevent duplication. This deletes all checkpoints and buffered data, so use with caution:
DROP DATABASE _replicator;
For more targeted cleanup, delete mutations from specific staging tables:
DELETE FROM _replicator.employees WHERE true;
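To see which staging tables exist before deleting from them, list the contents of the staging database:

-- List staging tables in the replicator staging database
SHOW TABLES FROM _replicator;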