What Is Dirty Reads In SQL Server


The data which is modified in sql server table in data file but not committed is called as dirty data and when sql server read this uncommitted data, It is called dirty read.

A dirty read takes no notice of any lock taken by another process. The read is officially “dirty” when it reads data that is uncommitted. This can become problematic if the uncommitted transaction fails or for some other reason is rolled back.

Is that enough? No....

To understand what is dirty data i am going to run some test with below example. 

Example: To be more specific if i am running a very huge transaction in sql server, May be update on a 1 TB table. So let say this transaction updates billion rows of around 500GB size and run for 2 hours and my server have 200 GB ram. That means all data cant be modified and left in Ram so sql server have to update tables on data file and the data which get modified called as dirty data and the pages which get modified with dirty data called as dirty page.

In this scenario below things will happen.

1. In Sql server Check point / Lazy writer move data from cache to Data file even though data is uncommitted.
2. In Transaction log file until transaction is committed, all data can be rolled back any time in case of disaster.

To read uncommitted date in SQL server we have to use nolock hist 
Example:  Select * from table with (nolock)

And when you read uncommitted data, doing this there is no guarantee that data read will ever be committed meaning the data could well be bad.

In Below demo I am trying to read dirty data from table 

/*
I have create 3 tables and trying to update 1st table from 2nd and 2nd from third
*/

Use Tempdb
GO
/******create 4 tables**********/
CREATE TABLE t1 (i1 int);
CREATE TABLE t2 (i2 int);
CREATE TABLE t3 (i3 int);


/******Insert Record in 4 Tables***********/
INSERT t1 SELECT 1;
INSERT t2 SELECT 2;
INSERT t3 SELECT 3;

/*************************/


Now we will Run 2 transactions in 2 different query analyzer

/************** Query Window 1 **************/
BEGIN TRAN
UPDATE t2 SET  t2.i2 = t3.i3
FROM   t2 , t3
Waitfor Delay '00:00:20' -- 20 Seconds

Rollback TRAN
/************* Query Window 2 *************/
Select * from t2 with (Nolock)
Go
Waitfor Delay '00:00:20'
Go
Select * from t2
GO


/********************************************/
















 






So now you can see we have 2 different  data as my transaction is rolled back. First data is dirty data from table (using read uncommitted isolation level) due to update statement which is going to rollback after 20 seconds. In real time scenario this could happen due to power outage in huge transaction also.

No comments:

Post a Comment