Garbage Collection of Checkpoint Files
What is garbage collection, and how does it work?
From Books Online (BOL):
A data row that has been deleted by a transaction that is no longer active is considered stale and is eligible for garbage collection. References to deleted rows are recorded in DELTA files. Garbage collection in In-Memory OLTP is:
- Non-blocking. Garbage collection is distributed over time with minimal impact on the workload.
- Cooperative. User transactions participate in garbage collection together with the main garbage-collection thread.
- Efficient. User transactions delink stale rows in the access path (the index) being used. This reduces the work required when the row is finally removed.
- Responsive. Memory pressure leads to aggressive garbage collection.
- Scalable. After commit, user transactions do part of the work of garbage collection. The more transaction activity, the more the transactions delink stale rows.
Garbage collection is controlled by the main garbage-collection thread, which runs every minute, or sooner when the number of committed transactions exceeds an internal threshold. The tasks of the garbage collector are to:
- Identify transactions that have deleted or updated a set of rows and have committed before the oldest active transaction.
- Identify row versions created by these old transactions.
- Group old rows into one or more units of 16 rows each. This is done to distribute the work of the garbage collector into smaller units.
- Move these work units into the garbage collection queues, one for each scheduler. Refer to the garbage collector DMVs for the details:
  - sys.dm_xtp_gc_stats (Transact-SQL)
  - sys.dm_db_xtp_gc_cycle_stats (Transact-SQL)
  - sys.dm_xtp_gc_queue_stats (Transact-SQL)
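As a quick sanity check, the three DMVs can be queried directly; only the db-scoped one needs to be run in the database that holds the memory-optimized tables:

```sql
-- Overall garbage-collection activity (instance-wide)
SELECT * FROM sys.dm_xtp_gc_stats;

-- GC cycles and generations (run in the database with memory-optimized tables)
SELECT * FROM sys.dm_db_xtp_gc_cycle_stats;

-- One garbage-collection queue per scheduler
SELECT * FROM sys.dm_xtp_gc_queue_stats;
```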
After a user transaction commits, it identifies all queued items associated with the scheduler it ran on and then releases the memory. If the garbage collection queue on the scheduler is empty, it searches for any non-empty queue in the current NUMA node. If there is low transactional activity and there is memory pressure, the main garbage-collection thread can garbage collect rows from any queue. If there is no transactional activity after (for example) deleting a large number of rows and there is no memory pressure, the deleted rows will not be garbage collected until transactional activity resumes or there is memory pressure.
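One way to observe that deleted rows keep holding memory until they are garbage collected is to check the per-table memory DMV before and after the delete; the table name below is a placeholder:

```sql
-- Memory still held by a memory-optimized table (includes stale rows not yet collected)
SELECT OBJECT_NAME(object_id) AS table_name,
       memory_used_by_table_kb,
       memory_allocated_for_table_kb
FROM sys.dm_db_xtp_table_memory_stats
WHERE object_id = OBJECT_ID(N'dbo.MyInMemTable'); -- hypothetical table name
```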
The following are the full steps, showing how garbage collection works end to end.
Add an In-Memory OLTP filegroup with two containers (FILESTREAM files). The checkpoint files will be distributed across the two containers: the first container will be used for DATA files and the second for DELTA files.
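The filegroup setup might look like the following; the database name matches the demo, but the filegroup name and container paths are illustrative:

```sql
ALTER DATABASE AdventureWorks2014
    ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;

-- Two containers: checkpoint DATA and DELTA files are spread across them
ALTER DATABASE AdventureWorks2014
    ADD FILE (NAME = 'imoltp_c1', FILENAME = 'C:\IMOLTP\Container1') -- illustrative path
    TO FILEGROUP imoltp_fg;

ALTER DATABASE AdventureWorks2014
    ADD FILE (NAME = 'imoltp_c2', FILENAME = 'C:\IMOLTP\Container2') -- illustrative path
    TO FILEGROUP imoltp_fg;
```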
Check the number of files created. My test machine has 16 GB of memory and one physical CPU (dual core with 4 logical processors).
SELECT state, file_type_desc, state_desc, internal_storage_slot,
       file_size_in_bytes, file_size_used_in_bytes,
       inserted_row_count, deleted_row_count,
       lower_bound_tsn, upper_bound_tsn,
       last_backup_page_count, drop_table_deleted_row_count
FROM sys.dm_db_xtp_checkpoint_files
ORDER BY file_type_desc, upper_bound_tsn;
SELECT file_type_desc, state_desc, relative_file_path, internal_storage_slot
FROM sys.dm_db_xtp_checkpoint_files
ORDER BY container_id, file_type_desc, upper_bound_tsn;
Insert 90,000 records; two new checkpoint file pairs (CFPs) were created in the UNDER CONSTRUCTION state.
Let's insert some more records.
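A simple batch like the following can generate the inserts; the table here is a hypothetical stand-in for whatever memory-optimized table you are testing with:

```sql
-- Hypothetical memory-optimized test table
CREATE TABLE dbo.MyInMemTable
(
    id  INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 131072),
    pad CHAR(200) NOT NULL DEFAULT 'x'
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

SET NOCOUNT ON;
DECLARE @i INT = 0;
WHILE @i < 90000
BEGIN
    INSERT dbo.MyInMemTable DEFAULT VALUES;
    SET @i += 1;
END;
```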
The column file_size_used_in_bytes will be updated after the next checkpoint.
Automatic checkpoint completion: this happens when the transaction log generated since the last automatic checkpoint exceeds 512 MB. In other words, an automatic checkpoint is kicked off for every 512 MB of accumulated log records, containing changes from both memory-optimized and disk-based tables. Unlike a checkpoint for disk-based tables, the persistence of data for memory-optimized tables is done continuously by a background thread; the completion of a checkpoint updates internal metadata information.
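For a demo you usually do not want to wait for 512 MB of log to accumulate; a manual checkpoint in the database closes the open checkpoint immediately:

```sql
USE AdventureWorks2014;
CHECKPOINT;
```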
After the checkpoint, the UNDER CONSTRUCTION files move to the ACTIVE state.
I have disabled automatic merge so that I can issue a manual merge for this demo.
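For reference, disabling automatic merge is typically done with trace flag 9851; note that this is an undocumented flag and should be used for demos and testing only:

```sql
-- Undocumented, demo/testing only: disable automatic merge of checkpoint files
DBCC TRACEON (9851, -1);
```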
EXEC sys.sp_xtp_merge_checkpoint_files 'AdventureWorks2014', 2, 44;

SELECT * FROM sys.dm_db_xtp_merge_requests;
Perform a log backup and a log-truncating checkpoint to get the merge request installed.
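The backup-then-checkpoint sequence can look like this; the backup path is illustrative:

```sql
BACKUP LOG AdventureWorks2014 TO DISK = N'C:\Backup\AW2014.trn'; -- illustrative path
CHECKPOINT;
```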
Check the files which were merged.
SELECT state, file_type_desc, state_desc, internal_storage_slot,
       file_size_in_bytes, file_size_used_in_bytes,
       inserted_row_count, deleted_row_count,
       lower_bound_tsn, upper_bound_tsn,
       last_backup_page_count, drop_table_deleted_row_count
FROM sys.dm_db_xtp_checkpoint_files
WHERE state = 4
ORDER BY container_id, file_type_desc, upper_bound_tsn;
Once the merge is complete, the source files should be in the REQUIRED FOR BACKUP/HA state.
After the backup and the log-truncating checkpoint, the [REQUIRED FOR BACKUP/HA] files will be moved to the TOMBSTONE state.
Issue a manual garbage collection.
EXEC sp_filestream_force_garbage_collection
The files marked as TOMBSTONE were removed from disk.
Without the manual garbage collection, the same cleanup happens automatically after several checkpoints and log backups.