About Change Data Capture

Tram Ho

Currently, there are many ways to save change data when performing Insert, Update, Delete such as: triggers , compare source and destination tables, cdc, change tracking, Row version, time stamps, etc. Each solution has pros and cons, and developers need to choose to suit each system This document will introduce cdc

  1. Mechanism of action

When enabling cdc, we will have 2 jobs. 1 capture job does change logging and 1 clean job clears changes log, to avoid system overload

The first is Job capture: This job will execute exec sp sys.sp_MScdc_capture_job to define the following parameter:

Continuous (default 1): 1 if the capture job is configured to run continuously, 0 if it is only run once

Maxtrans (default 500): maximum number of transactions in each log scan

Maxscans (default 10): number of log scans in each scan phase before pausing (if continuous = 1) or stopped completely (if continuous = 0)

Pollinginterval (default 5): pausing time (seconds) between each scan phase

After defining the params, exec sp sys.sp_cdc_scan. This sp does two things: exec sp sys.sp_replcmds to read the log, and insert the log into the cdc tables. Each time the above two tasks are done, it is a maxtrans scan, also called a log scan session

Since the Capture Process (reads the log and writes the log to the cdc table) reads data from the transaction log, there will be a delay between when the transaction occurs and when the Capture Process records the change as well as when the log is recorded on the cdc .table_CT. We will talk more about delay optimization in the cdc job configuration section below

With Cleanup Job: perform a clean change table and default is to run periodically at 2am so that change data can only be saved up to 3 days. This job will be configured by 2 parameters

Retention: the time (minutes) the log was saved on the cdc . table

Threshold: the maximum number of log records on the cdc table that are deleted at each cleanup run. Each time the clean job runs, there will be multiple cleanups to ensure the Retention value

  1. Configuration cdc

To enable cdc, we execute enable cdc in the database and on the table that we want to capture change data USE MyDBGO
EXEC sys.sp_cdc_enable_db
GO
USE MyDB
GO

EXEC sys.sp_cdc_enable_table
@source_schema = N’dbo’,
@source_name = N’MyTable’,
@role_name = N’MyRole’,
@filegroup_name = N’MyDB_CT’,
@supports_net_changes = 1
GO
To check the job configuration parameters, select the table msdb.dbo.cdc_jobs

To change the above parameters, execute the following command:

EXEC sp_cdc_change_job @job_type=’capture’, @maxtrans = 500, @maxscans = 10, @continuous = 1, @pollinginterval = 5

EXEC sp_cdc_change_job @job_type=’cleanup’, @retention = 4320, @threshold = 5000

Note: after resetting the above parameters, need to stop and start the capture job again

To disable cdc on board or above database USE MyDB
GO
EXEC sys.sp_cdc_disable_db
GO
USE MyDB
GO
EXEC sys.sp_cdc_disable_table
@source_schema = N’dbo’,
@source_name = N’MyTable’,
@capture_instance = N’dbo_MyTable’
GO
3. Conclude

Thus, we have understood the mechanism of operation and how to configure cdc. In the next article, we will learn how to optimize the parameters in cdc

Share the news now