6.12.10

Data Compression in SQL Server 2008

Data compression is a new feature introduced in SQL Server 2008. It enables the DBA’s to effectively manage the MDF files and Backup files. There are two types of compressions,

1. Row Level Compression: This type of compression will work on the row level of the data page.

a. Operations like changing the fixed length datatype to Variable length data type. For instance, Char(10) is a fixed length datatype and If we store “Venkat” as data. The space occupied by this name is 6 and remaining 4 spaces will be wasted in the legacy system. Whereas, In SQL Server 2008, it is utilised effectively. Only 6 spaces will be given to this variable.
b. Removal of Null value and zeros. These values will not be stored in the disk. Instead, they will have a reference in the CI Structure.
c. Reduce the amount of metadata used to store the row.

2. Page Level Compression: This compression will be effective in the page level.

a. This compression follows Row Level Compression. On top of that, Page level compression will work.
b. Prefix Compression - This compression will work on the column level. Repeated data will be removed and a reference will be stored in the Compression information (CI) structure which is located next to the page header.
c. Dictionary Compression – This compression will be implemented as a whole on the page. It will remove all the repeated data and a reference will be placed on the page.

How it works:

Considering, If you a user is requesting for a data. In that case, Relational Engine will take care of getting the request compile, parse and it will request the data from the Storage engine.

Now, our data is in the compressed format. Storage engine will send the compressed data to the Buffer cache which in turn will take care of sending the data to relational engine in uncompressed format. Relational engine will do the modifications on the uncompressed data and it will send the same to buffer cache. Buffer cache will take care of compressing the data and have it for future use. In turn, it will send a copy to the Storage Engine.

Advantages:

1. More data will be stored in the Buffer cache. So, no need to go and search in the disk which inturn reduce the I/O.
2. Disk space is highly reduced.

Disadvantages:

1. More CPU cycles will be used to decompress the data.
2. It will be a negative impact, if the data doesn’t have more null values, zeros and compact/exact data (Equivalent to the declared data type).

Cheers,
Venkatesan Prabu .J

Indexing Table variables in SQL Server

Table variables is a very nice handy tool to store the result set. The major advantage of table variable is that, it will log a very minimal information in the temp database. Initially, I had a thoughts that table variable will reside only in the memory and it won’t interact with any of the database. But, it’s not the case. The data will reside in the temp database but logged very minimally. Index created on the table variables will reside in the tempdb.

Let’s see, how to create indexes on the table variable.

-- Temp table created
DECLARE @VENKAT_VAR TABLE (ID INT)

-- Trying to create an index on the column. It's throwing error.
CREATE CLUSTERED INDEX IDX_VENKAT_VAR ON @VENKAT_VAR(ID)


-- Trying to alter the table variable. It's throwing error.
ALTER TABLE @VENKAT_VAR ADD CONSTRAINT CON_VENKAT_VAR PRIMARY KEY (ID)


-- Here is an option to create the clustered index or primary key on the temp variable.Index on the table variable can be created during table variable creation itself.

DECLARE @VENKAT_VAR TABLE
(ID INT PRIMARY KEY CLUSTERED )

-- We can query on the sys.indexes table to get the index details created on the table variable.

SELECT * FROM TEMPDB.SYS.indexes A INNER JOIN TEMPDB.SYS.tables B
ON A.object_id=B.object_id ORDER BY create_date DESC


Cheers,
Venkatesan Prabu .J