SQL Internals er forskellige blog indlæg om hvordan en SQL Server fungerer

Get started with SQL Server on Linux

The first time I heard about this, I thought it was a joke, but now I have been working with Microsoft SQL Server for Linux, for some time, and I think the idea is great. Why not have Microsoft SQL server on as many platforms as possible.

Below I have described how I got my first solution op running. It is an Easy to use Guide with some links to usefull ressources.

IMPORTANT!!! I have absolutely no experience with LINUX, and therefore don’t be afraid of getting started. If I can do it, then you can also do it.

I have installed everything on a computer running windows 10.

To get running you just need windows 10 (with a version that can run Hyper-V). The steps are:

  1. Install Hyper-V on you work computer
  2. Install Ubuntu 16.04
  3. Install Microsoft SQL server on Ubuntu
  4. Install Management tools on Ubuntu
  5. Test sqlcmd on Ubuntu
  6. Connect from Windows and SQL Server Management Studio

Install Hyper-V on you work computer

If you have not already done, that then go to control panel and program / features. Click on Hyper-V and ok. Hyper-V is installed:

install-hyper-v

 

Install Ubuntu 16.04

After installing hyper-V, restart the computer, and when it’s up running again, press windows-S (for search) and choose Hyper-V.

search-for-hyper-v

The following screen will appear, and you are ready to work.

hyper-v

 

First of all, you need to download Ubuntu 16.04. go to the link below and click Download

https://www.ubuntu.com/download/desktop

ubuntu-download

The following screen will appear, and you can choose to download for free (1) or you can “Pay with Paypal”. For now, I suggest that you just download without paying, and if you start to using Linux in production, then play fair and pay a small amount.

So now you just need to wait for download to complete. when it is completed, go back to Hyper-V, and rightclick on you instance, and then choose new virtual Machine (that starts a wizard):

hypev_001

Skip the welcome screen, and choose a name for you virtual Machine (1), A location (2) and click Next (3)

hyperv_002

Then choose the version of Hyper-V Virtual Machine (I have choosen generation 2).

hyperv_003

Next you need to add memory, and I suggest that you choose at least 4 GB (4096), and if it is for test and not production, then choose Dynamically assign memory). Click Next.

hyperv_004

Then you need to configure you network. That requires that you have configured that before you started this wizard (if not, then look at the buttom of this guide, to see how you can install a network). Choose your network and then click Next.

hyperv_005

Now we need to add a harddrive, and I suggest that you just choose the default settings if you are testing (1). Size is 127 GB, but that is thin provisioned, so it is first used on disk when you use it. click Next.

hyperv_006

Now point your installation to the downloaded Ubuntu installation kit (2), and click next.

hyperv_007

You are at the end of the wizard. Click Finish, and you have created your virtual machine (no installation done yet).

hyperv_008

You should have a screen that looks like this:

hyperv_009

Right click your machine and choose settings. We need to setup a few things before we start the machine.

hyperv_010

First thing is to change security. Remove check from Enable Secure boot. Next thing is to Add more CPU. I have choosen two, because I am just testing. You might need to add more under different circumstances.

hyperv_011

Click OK, and return to the mainscreen of Hyper-V. Right click your machine and click Start:

hyperv_009

when that is done, then simple Right click again, and then click Connect (or simply just double click). You should see the following screen.

ubuntu_001

Choose Install Ubuntu (note that you can maximize the screen by using the menu at top of the window (View).

Follow the steps (I have just commented the most important, but basically there are the following steps:

  1. Welcome (choose your language. I prefer English).
  2. Preparing to install Ubuntu (I suggest “Download updates while¬†Installing Ubuntu)
  3. Installation type (Choose what you want, I just Choose Erase Disk and Install ubuntu). You a popup is shown, just click Continue again.
  4. Where are you… Choose what you want.
  5. Keyboard layout… Choose what you want.
  6. Who are you? Fill in the required fields.
  7. INSTALLATION IS RUNNING AND YOU ARE READY TO GO.

THATS IT!! I am very positive surprised on how easy it is to setup Ubuntu, and if you are finished correctly, then you should be ready to go and have a picture like this:

ready-to-go

I had to restart the virtual machine to get started, and I had some problems that I could not remove installation media. I simply just turned of the virtual machine and started it again. Then it was up running.

NOTE!!! Please note that you should of course try to set a fixed IP-address, test performance, security and so on. this is, how ever, just a simple guide in getting started.

 

Install Microsoft SQL Server on Ubuntu

So, now lets get started, and remember that we have three steps to do.

  1. Install SQL server
  2. Install the tools
  3. test if we are running.

This part just describes how to install the SQL server, and all steps are described in the following link from Microsoft:

https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-setup-ubuntu

First you need to open a terminal. Right click the desktop (1) and choose terminal:

sql_001

When terminal is open, then you need to do one thing, that are not in the description from Microsoft. You need to write Sudo su. You are required to enter your password.

sql_003

Now it is Very simple. Write the following commands (one-by-one):

  1. curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add –
  2. curl https://packages.microsoft.com/config/ubuntu/16.04/mssql-server.list | sudo tee /etc/apt/sources.list.d/mssql-server.list
  3. sudo apt-get update
  4. sudo apt-get install -y mssql-server
  5. sudo /opt/mssql/bin/sqlservr-setup (this will setup the last things. Remember to save the SA password). Start the instance.

And now you just need to confirm that the instance is running:

  1. systemctl status mssql-server

If you see an output like this, then everything is perfect, and you have setup your Microsoft SQL Server on LINUX!!!!!

sql_004

 

Install Management tools on Ubuntu

So, now we need to setup connectivity. At current times the installation of Management tools will install sqlcmd and bcp, there are no Management studio for now. You need to install that on a windows machine to get running.

  1. Open a new Terminal window
  2. Write the Following commands:
  3. sudo su (remember to enter your password)
  4. curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add –
  5. curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list | sudo tee /etc/apt/sources.list.d/msprod.list
  6. sudo apt-get update
  7. sudo apt-get install mssql-tools unixodbc-dev

You are reguired to accept licens terms after step 7 (2 times).

If you want to make life easy for your self, then add some references to the two application (Sqlcmd and bcp):

  1. open a terminal windows
  2. Write the following commands:
  3. ln -sfn /opt/mssql-tools/bin/sqlcmd-13.0.1.0 /usr/bin/sqlcmd
  4. ln -sfn /opt/mssql-tools/bin/bcp-13.0.1.0 /usr/bin/bcp

SO If no errors in the terminal window, then you are up running.

(I had to restart my ubuntu before I could test).

Let’s test it

Test sqlcmd on Ubuntu

  1. Open a new terminal window
  2. Write the command in below picture (1) to connect. Remember to add your SA password
  3. Write the three lines (select * from sys.objects go) (2). When you hit enter after go, you should see all the rubbish below go.

If you see the text below GO, then everything is fine and you are up running!!!


sql_005

Connect from windows and SQL server Management Studio

So, now the final and most important test. How do we test from a windows machine, running Management studio. Before we start, I will do three things (Please note!! I am in a test environment. Be careful in production)

  1. Turn of firewall on the linux machine (Only do that in test. NEVER in production!!!!)
  2. Download the correct version of Management studio (it must support the next version of SQL server – vNext)
  3. Find my IP address.

To turn of the firewall on linux, open a terminal and write

  1. sudo su (remember password). I am not sure this is required, but I do it.
  2. sudo ufw disable

To download the correct version of Management studio, og to this link and download it.

https://msdn.microsoft.com/library/mt238290.aspx

Choose the newest link (When I am writing, It is 17.0 RC1).

Finally you need to find the name of your machine or the IP. I just use IP. In Linux open a terminal and write

  1. ip addr show

Find the IP address. Mine is 192.168.8.102.

So now open Management studio and connect:

ssms_001

and if you see the window below, then you are connected and up running.

 

ssms_002

Please note!! A lot of things needs to be done, but this is the basics!!!

 

(off topic) How to setup a networkcard in Hyper-V

I promissed to show how you can do this in Hyper-V and on my machine it is easy, and if you are on a big network, then it requires a lot of skills.

In Hyper-V choose click on this menu item:

network_001

Choose Create new virtual Switch

network_002

Enter a name (1) and choose type (2). I choose external network and then my network card.

Click OK, and you can connect your virtual Machine to the internet ūüôā

network_003

 

Primary Keys and clustered indexes

One of the common misunderstandings in relation to the Microsoft SQL server is that a primary key always will be enforced with a clustered index.

That is not true, and in the following video I demonstrate that.

Microsoft SQL Server 2012 Internals

Microsoft SQL server 2012 Internals

Microsoft SQL server 2012 Internals

This book is a “must have book”, if you want to dig into the SQL server. Not only because it is written by some of the most Genius SQL persons (Kalen Delaney, Jonathan Kehayias, Benjamin Nevarez and Paul S. Randal), but because with this book, you get deep dive knowledge about how the SQL Server 2012 works.

The book is split into 14 chapters, starting with a very good introduction about the SQL server architecture, the SQLOS and databases.

Some of the new features are described in chapter 4, where an introduction to Contained databases are given.

As a MCT, then I really like when a book is focused and understandable, and one good example of this is on page 23, where it is stated that “SQL Server Books Only lists only 17 trace flags that are fully supported”. That means: If you want to pass an exam for MCSE on SQL server, then you must know 17 trace flags!!! (of course now the MCSE is changed so that it also covers Microsoft SQL Server 2014).

Chapter 6-9 is about indexes and storage, and that is of course important knowledge to have if you e.g. want to optimize performance.

Chapter 10-13 is a deep dive into how the SQL Server executes, how the Query optimizer works, and this is among my favorites. You can’t get enough information about how the Execution is done, how SQL server works with concurrency and how the SQL server is working with execution plans.

 

Knowing that the last chapter about DBCC is written by Paul Randal, then it is a “must read chapter”. Paul Randal has been writing a lot of the DBCC functionality, so here we get information about the DBCC command directly from one of the architects. And he knows what he is talking about ūüôā

On the downside, then the book is not for beginners. I think that the reader should have at least 3-5 years of experience with Microsoft SQL Server, and he / she must have a good understanding about tables, indexes, queries, maintenance, Backup and so on.

Another downside is that in some places of the books, then we have output and examples that span multiple pages. It can be a little bit difficult to read.

I think this book deserves 4 out of 5 stars. I was a little bit in doubt about if it should be 4 or 5 stars, but on the other hand, then I must also say that this book is an update of the same book for SQL server 2008, and even though it describes new things in Microsoft SQL Server 2012 very well, then I can’t give 5 stars, when a lot of the material has been published before (to be fair. Everything has been rewritten and changed to SQL 2012).

Executionplan with crazy percentages

I got the following executionplan from a customer (IT IS AN ESTIMATED EXECUTIONPLAN)

2013-12-09 00_40_56-10.5.1.51 - Remote Desktop Connection

and as you can see, then the sum of the percentages is more than 100%. That should be impossible, but when I started to look into it, the reason was quite obviously:

The shown executionplan shows the complete SQL statement, but only a part of it is executed, and the sum of the executed part is 100%. So lets look into the SQL statement (you must have the Adventureworks2008R2 installed)

if (select count(*)
from Person.Person)=1234
begin
	select *
	from person.person
	cross join
	sales.SalesOrderDetail
end

There are 19972 rows in the Person.Person table, so we will never execute the cross join between the begin and end. Therefore the 100% in the executionplan will only be calculated on the Select count(*) from Person.Person.

Now we know how much CPU+IO the executed parts takes (hover over the Cond With Query Cost operator). On my server with my databsae it will take 0.0787.

If I hover over the Nested Loops with a cost of 12866355% that wil NEVER be executed, then the operator cost is approx. 10127,94. Now the formula is 100*10127,94 /0.0787 which approximately equals 12866355%.

So that is the reason for the strange percentages: 100% equals the executed part and then we divide the operator cost for each ‚ÄúNon Executed‚ÄĚ operators with the total cost for the executed parts.

What happens if the IF was true so that the cross join will be executed?

Well, then we get the same result.

If I execute the query then I get the following:

 

2013-12-09 00_52_50-10.5.1.51 - Remote Desktop Connection

Strange. But basically I think I understand why the estimated executionplan acts as it does.

Imagine a SQL Statement like this:

if (select count(*) from Person.Person)=19972
begin
	select *
	from person.person
	cross join
		sales.SalesOrderDetail
end
else
begin
	select *
	from person.person
	cross join
		sales.SalesOrderDetail
end

Now the Query got more complicated, but if the sum should get 100% no matter what, then it would also be wrong because we will never execute all code, so I like to conclude:

  1. The code that the Optimizer is 100% sure to execute will calculate 100%.
  2. All other path‚Äôs of the code will related to the ‚Äú100% sure exetued 100%‚ÄĚ so that you have an idea about how long time it takes.

Please challenge me on this.

Practical Troubleshooting The database Engine

This book is an oldie, but a goodie, and I finally managed to get through all the pages. I hate my self for not having done that for many years ago.

 

This book is a technical book, and it covers a lot of serious aspects related to troubleshooting; from DBCC to Memory preassure; from IO problems to CPU problems.

 

The best thing is that it mentions a lot of Dynamic management views that are still very relevant.

 

So. If you a skilled SQL person, and you want to learn more about Troubleshooting, then this is a must have book.

Buy it on Amazone.com.

IAM Page Error – Fix it with a page restore

Today I have been playing around with situations, where your IAM page get corrupt, and I have tried to figure out what to do, and let me say it like this: If you have a valid backup then use that, and depending on the situation consider using page restore.

NOTE!! THIS BLOG IS TECHNICAL AND THERE ARE THINGS THAT YOU SHOULD NOT DO UNLESS YOU ARE 100% SURE ABOUT WHAT YOU DO. WE TAKE NO RESPONSIBILITY IF YOU TRY WHAT WE WRITE IN THE BLOG.

So, lets get started. I want to create a simple database, where I create one table, and I will corrupt that tables IAM Page. Basically that means: You can’t access your data.

So, lets start the blog by setting up a new database, a table and fill in 1.000 rows.

use master

go

 

if¬†DATABASEPROPERTYEX(‘PageRestore’,‘Version’)>0

drop database pagerestore

 

create database Pagerestore

go

 

use PageRestore

go

 

create table LargeTable(id intidentity, name char(4000))

 

go

 

insert into¬†Large Table¬†Values (‘Soak AS’)

go 1000

So, now we have a database called PAGERESTORE and a table called LARGETABLE with 1.0000 rows.

What I want to see now, is what pages the table LargeTable occupies:

dbcc ind(PageRestore , largetable,1)

go

On my computer the result is something like.

Skærmbillede 2013 04 07 kl 17 54 51

I can see that page 119 in file 1 has Null in IAMFID and IAMPID (and that PageType = 10), and then I know that it is a IAM Page.

Before we start, lets do a backup (create the physical folder C:\SQL

BACKUP¬†DATABASE¬†[Pagerestore]¬†TO¬† DISK=N’c:\sql\pagerestore.bak’¬†WITH¬†FORMAT,¬†INIT,¬† NAME=N’Pagerestore-Full Database Backup’, ¬† STATS= 20

go 

After you have taken the backup, then we corrupt the IAM page. NOTE!!! DON’T DO THIS ON A PRODUCTSERVER!!!!!! NEVER!!!!

use master

go

alter database page restore set single_user with no_wait

 

dbcc writepage(PageRestore, 1, 119, 200, 1, 0x50,1)

alter database pagerestore set multi_user with no_wait

DBCC WritePage is an undocumented command, that writes values to the physical page. Imagine what that does to a database. Terrible thins, so don’t use it, unless you really want to get into troubles. I assume that your IAM page is at page 119. If not, then change the number 119. 200 is byte 200 on the page. It was just a random byte, to destroy the page.

Now….. try to run DBCC CHECKDB. It willnot look nice.

Skærmbillede 2013 04 07 kl 18 02 34

and at the bottom of the output, I can see the following:

Skærmbillede 2013 04 07 kl 18 03 10

I think the most important thing is the second line from the bottom. It says that the minimum repair level is Reparir_allow_data_loss, which means that if I run checkDB to fix the error, then I will loose the data in my table (basically).

If I try to run query against the table LargeTable, I will get an error:

Skærmbillede 2013 04 07 kl 18 05 13

So basically… I am in deep trouble.

Lucky for me‚Ķ I have a backup, so what I can do is to restore that backup, and it is very simple with SQL 2012. Right click the database –> chose tasks –> choose restore –> Choose page, and the following dialog will occur:

Skærmbillede 2013 04 07 kl 18 08 21

Choose Check database Pages, and if no backup is selected, then select it. PLEASE NOTE!!! There might be some difficulties with selecting backup’s but when you have the right backups, then just click OK.

What happens, and what does everybody with SQL 2012 need to do and understand?

Well, I could restore a single page by writing:

RESTORE DATABASE¬†[Pagerestore]¬†PAGE=‘1:119’¬†FROM¬† DISK=N’c:\sql\pagerestore.bak’¬†WITH¬† FILE= 1,¬† NORECOVERY

And what happens, then is that the page is marked with Recovery pending. That is because we must restore the necessary log backups, and before we do that we must start taking a backup of the tail of the database. So, to restore a page, we now know that we need these three steps:

1) Restore the page.

2) Backup the tail of the log.

3) Restore necessary log backups, including the tail of the log.

So the complete steps, if you don’t have SQL 2012 and a nice UI will be:

USE[master]

go

RESTORE DATABASE¬†[Pagerestore]¬†PAGE=‘1:119’¬†FROM¬† DISK=N’c:\sql\pagerestore.bak’¬†WITH¬† FILE= 1,¬† NORECOVERY

backup¬†log[Pagerestore]¬†to¬†disk=N’c:\sql\pagerestore.trn’¬†with¬†format

restore¬†log¬†[Pagerestore]¬†from¬†disk=N’c:\sql\pagerestore.trn’

 

If you have done everything correctly, then you should be able to run DBCC CHECKDB and to query the table ūüôā

that’s it. Very easy, but of course there are difficulties. But now you have the general perspective and overview.

 

 

DBCC checkDB – A Nice trick to see progress

I hate the situation where I have started a DBCC CHECKDB and I don’t know for how long time it has been running. Well, I have created this very small T-SQL statement that you can run.

select session_id, command, percent_complete

from sys.dm_exec_requests

where session_id=55

 

Session_ID is the session that has started the dbcc command. The result looks like the picture below.
Skærmbillede 2013 04 07 kl 09 20 13
Execute it continuously and see that percent_complete will grow.
It is important to know that DBCC CHECKDB executes three statements:
  • DBCC CHECKALLOC
  • DBCC CHECKTABLE
  • DBCC CHECKATALOG
So actually the percentage will have to go to 100% three times. When the line disappears, then it is finished.
One other thing to note is that DBCC is using space, and to see how much space is required to run a dbcc CHECKDB, then simply write

dbcc¬†checkdb(‘adventureworks2012’)¬†with¬†Estimateonly

 

Change adventure works to your database.  On my server, the result is:

 

Skærmbillede 2013 04 07 kl 11 23 15

In my case DBCC CHECKDB requires 336 KB in TEMPDB.

Extended events – Your first setup

Extended events are used to track down what happens in the SQL Server, and you can very easily setup an  extended event to e.g. montioring what SQL Statements a user is executing, what waits there are on the servers and e.g.

I recommend that you try this demo first, and then read the webpages from MSDN that introduces Extended events (http://msdn.microsoft.com/en-us/library/bb630354%28v=sql.105%29.aspx).

Learning Extened events are quite easy when you have an SQL server 2008 (and in 2012 it is getting easier again). It is a three step process:

  1. Setting up your first extened event session.
  2. Understanding the output
  3. Learning what events and actions you can use.

But let me start with a warning. Extened events are lightweight and very efficent, but if you set it up wrongly and try to catch to many events and outputs, then you can basically drain the performance on the SQL Server.

So lets try to get working (you need to have the adventureworks database installed, and you can download it from here. You..  also need an SQL Server 2008 or never).
OK. Lets start to setup an extended event that catches all sql statements in the adventureworks database.

  1. Open a command promt (go to start and type CMD).
  2. Write MD c:\XEtest hit enter

Extened events requires a folder to store the XML files that are the output (at least we choose to output to a file in this example). The service account that has started the SQL Server must have access to write to that folder.

What are table hints and how are they used

A question that I often get is what a table hint is and how it is used?

Well, basically when you write some code in TSQL, then the SQL Server needs to translate it to something it can execute.

That is a multistep process, and the last steep: is to optimize the execution. Basically the SQL server says: ‚ÄúI can now run the statement, but how can I run it in the best possible way?‚ÄĚ. This is a costbased process, where the Query optimizer (that is the name of the last step) calculates a cost for different execution plans, and then it chooses the one with the lowest cost.

So if you e.g. have a statement like:

select *
from Person.address
where City=’Bothell’

Then the SQL Server finds what it think is an optimal plan of execution. It is not necessarily the best plan, but it is a good plan.

For the above query, the executionplan looks like:

image

The executionplan is read from right to left, and for this moment it does not matter what it means. the important think is that with tablehints you can override the way the query is executed.

Try the following:

select *
from Person.address with (index(1))
where City=’Bothell’

The executionplan looks like this:

image

So by adding the table hint WITH (index(1)), then you can change the way the sql server executes your statement.

What else does table hints do?

Well, it does a lot of things, and you can read about it here, but one thing that I would like to show is using tablehints in a practical world. If you have a system with a lot of users and you want to extract some data from a table that are heavily used, then you could use the table hint with (NOLOCK).

It means that the query does not take any locks and does not care about any locks taken on an object. In other terms: This query will execute no matter if other users are updating the table or not AND it will not wait on them to complete.

The risk is that you get inconsistent data, but the benefit is that you get data!! This is very useful for reporting, where e.g. you need data that are out of the range of data that are updated.

Extended events‚Äď the objects

Before reading this blog, it might be a good idea to read the blog Your First extended event if you are not familiar with Extended events.

A friend of mine asked me (for some month ago if there was an easy way to figure out what events you can use, and that give me the idea of writing this blogpost (and today I have the time for it).

What events can we track? Well, when we setup an event session we write something like:

Create event session testSQL on server
add event sqlserver.sql_statement_completed

The text written in red is the event and the package the event belongs to. If we look at the package (sqlserver), then you can see the current packages by writing:

select name, description, guid
from sys.dm_xe_packages

It gives you the name, description and the guid for the package.

The main source of information is the view sys.dm_xe_objects, and you can utilize that by writing

select *
from sys.dm_xe_objects t1

As you can see, there is an object_type column, and it has multiple values (7 in 2008R2), but the once we are interested in now are:

  1. event
  2. target
  3. action

Combining that with the view sys.dm_xe_packages, then we get an SQL statement like the following to get the events we can use:

select pck.name as package, obj.name as eventName, obj.description
from sys.dm_xe_objects AS obj
inner join sys.dm_xe_packages as pck on obj.package_guid=pck.guid
where object_type = ‘event’
order by pck.name, obj.name

So now you know what to write when you want to create an event session for a given event. If you are in doubt about what events you can use, then just google the event name and Extended event.

If we look at the action part of an event session, then it could look like:

Create event session testSQL on server
add event sqlserver.sql_statement_completed
(    action (sqlserver.sql_text)
where sqlserver.database_ID=30)

As you can see, then we have an action, where we will capture the sql_text for a database with the id = 30.

You can find the actions by replacing object_type=’event’ with object_type=’action’ in the above query:

select pck.name as package, obj.name as actionName, obj.description
from sys.dm_xe_objects AS obj
inner join sys.dm_xe_packages as pck on obj.package_guid=pck.guid
where object_type = ‘action’
order by pck.name, obj.name

That’s very easy Smile.

The second last step we need to investigate when setting up an eventsession is the target:

Create event session testSQL on server
add event sqlserver.sql_statement_completed
(    action (sqlserver.sql_text)
where sqlserver.database_ID=30)
add target package0.asynchronous_file_target
(set filename = N’C:\xetest\SQLTEST.xml’
, metadatafile= N’C:\xetest\SQLTESTmeta.xml’
)
with (max_dispatch_latency = 1 seconds)

As you can see, I have a target called package0.asynchronous_file_target, and I can find these targets by writing:

select pck.name as package, obj.name as actionName, obj.description
from sys.dm_xe_objects AS obj
inner join sys.dm_xe_packages as pck on obj.package_guid=pck.guid
where object_type = ‘target’ and pck.name=’package0′
order by pck.name, obj.name

So the final part is now to find out how we can setup the WITH clause for an event session, and that is very easy. Read the document from Microsoft.