Understand snapshot and offset behaviour in mssql source connector using CDC #40726
Unanswered
ritikanaidu-trakstar
asked this question in
Connector Questions
Hi,
I am using the Airbyte OSS version to ingest data from SQL Server (MSSQL) to Snowflake using CDC. The CDC retention period is set to 10 days in SQL Server. Every 11th day, the job fails with this error -
To recover from this, I have to reset the data and trigger a full refresh so that Airbyte can take an initial snapshot again.
Looking at the events leading up to this error, the LSN that the log reports as missing from the server -
0017bf25:0002d9d7:0004
is the LSN from the initial snapshot taken 10 days ago. The logs from 10 days ago show the snapshot offset with commit_lsn =
0017bf25:0002d9d7:0004
In the latest logs, Airbyte tries to locate that initial offset on the server and fails because the LSN has been cleaned up by SQL Server after the retention period.
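To make the suspected failure condition concrete, here is a minimal Python sketch of the comparison I think is involved. It is only an illustration under my own assumptions, not Airbyte's or Debezium's actual logic: it assumes an LSN written as three colon-separated hex fields can be compared field by field, and that a saved offset is usable only if it is not older than the oldest LSN still retained for the capture instance (the value SQL Server exposes through sys.fn_cdc_get_min_lsn). The function names and the "oldest retained" value below are hypothetical; only the saved LSN comes from my logs.

```python
from typing import Tuple


def parse_lsn(lsn: str) -> Tuple[int, int, int]:
    """Parse an LSN such as '0017bf25:0002d9d7:0004' into three integers.

    Assumption: the three colon-separated fields are hexadecimal and
    compare in order, most significant field first.
    """
    parts = lsn.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected three ':'-separated hex fields, got {lsn!r}")
    first, second, third = (int(p, 16) for p in parts)
    return (first, second, third)


def offset_still_retained(saved_lsn: str, min_retained_lsn: str) -> bool:
    """Return True if the saved offset is not older than the oldest retained LSN.

    Hypothetical helper: this mirrors the check I suspect is failing,
    not the connector's real implementation.
    """
    return parse_lsn(saved_lsn) >= parse_lsn(min_retained_lsn)


# LSN recorded in the snapshot offset (taken from the logs above).
saved_offset = "0017bf25:0002d9d7:0004"

# Hypothetical value for the oldest LSN still on the server after cleanup.
oldest_retained = "0017c000:00000010:0001"

if not offset_still_retained(saved_offset, oldest_retained):
    print("Saved offset is older than the oldest retained LSN; a new snapshot is needed.")
```

If this is roughly what happens, then the stored offset never moving past the snapshot LSN would explain why the job breaks exactly when the retention period elapses.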
My understanding of Airbyte and Debezium internals is limited. Is it normal behaviour for Airbyte to look for an LSN that is guaranteed to expire once the retention period passes? Shouldn't it be looking for the incremental min_lsn on the server instead? Or should it be taking incremental (or blocking) snapshots periodically to build on the initial snapshot taken during the full refresh? I would appreciate any help with this, since the issue has been plaguing us for several weeks now. Thanks!