Two ways of improving your ETL with SQL 2012 #3 Change data capture

In an earlier post we saw how the new project deployment functionality will make it much easier to deploy and keep track of versions of a SSIS package.

The other cool feature I will talk about is change data capture. How do you transfer data from your datasources (e.g. an ERP-system) to a staging database? Well, I see multiple ways, but the purpose is to transfer as few records as possible, and some of the methods are:

1) If a table has a last changed, then keep track of the “Last changed date” in the staging database, and then transfer all data since the last “Last Changed”.

2) If table as a rowID that are incremental, then transfer all new rows since the last transfer.

3) Transfer e.g. the last two days data and then work with them.

All methods has their ups and downs, but now we can utilize the Change Data Capture functionality, and I will show you how it works.

Basically, we must

1) add change data capture to a table

2) Then we must create the SSIS package

3) Run it initially the first time.

4) Setup the package to run continiously.

 

It is very easy, and I will try to give you a short demo on that now.

[youtube=http://www.youtube.com/watch?v=vlOZvI9xmuY&w=448&h=252&hd=1]

 

<– Project Deployment

two ways of improving your ETL with SQL 2012 #2 – Project deployment

How did you deploy SSIS packages earlier? Well, I went into the project properties, and create a deploymentutility, and then I ran it. I had to do a lot of work.

Now I

  1. setup a SSIS catalog in SQL Server Management Studio.
  2. Right click in Data Tools (Visual studio) and deploy.

Let’s see how that is done in SQL Server 2012. Please don’t mind the quality of the video. I have a son that are sick today, so I had to do a lot of breaks.

[youtube=http://www.youtube.com/watch?v=SsjEftMLswU&w=448&h=252&hd=1]

 

<—Introduction   Change data capture –>

Two ways of improving your ETL with SQL 2012 #1

SQL 2012 has improved a lot when it comes to the possibility to create a better and more efficient ETL, and in this short intro, I would like to show you some of the things I find useful.

In this series of blogs I will show you:

  1. Projectdeployment instead of package deployment.
  2. Using Change Data Capture to improve performance.

Projectdeployment

One of the things that has always irritated me was the need to deploy a package instead of a complete project. To be practical, then I will show you how to deploy a project to a SQL Server 2012 instead of. It is very easy and gives you a lot of benefits. And it makes it easier to configure values that must change depending on the installation. On a scale from 1 to 10, where 1 is no new value and 10 is great new value, then I give this functionality 9.

Change data capture

This is also not new, and we have been able to setup change data capture since version 2008, but now even a newbie to ETL can setup changedatacapture and use it in Integration services. We have some new components that helps us. I would give this 10 out of 10.

This is simply brilliant, since it addresses one of the biggest problems in building a datawarehouse and getting a good performance: How do we only update and insert changed and new rows.

So let’s get started with projectdeployment.

Project deployment –>