Thursday, February 23, 2017

SQL Server Date, Datetime and Datetime2 - What should I use when I get the value as a string?

When we have to store a datetime (or date) value, in most cases the application accepts it as a datetime and sends it to SQL Server as a datetime value. However, if the value is sent as a string (for example, via a CSV upload), what is the best way of formatting the value, and how can we convert it to datetime without making a mistake?

It is always recommended to use the ISO 8601 standard when exchanging datetime values. The standard describes the way of passing a datetime value; generally it is YYYY-MM-DDTHH:MM:SS.sss. You can read more on this in my post: Best way to pass datetime values to SQL Server – SS SLUG Dec 2013 – Brain Bashers - Demo III.

With SQL Server 2016, the default string format for dates is YYYY-MM-DD. If you pass the value with this format, regardless of the current language set, SQL Server will accurately read the value. However, this does not work with all datetime data types as expected. Have a look at the following code;

SET LANGUAGE English;

DECLARE @Date date
DECLARE @Datetime datetime
DECLARE @Datetime2 datetime2

SET @Date = '2017-05-06';
SET @Datetime = '2017-05-06';
SET @Datetime2 = '2017-05-06';

SELECT DATENAME(mm, @Date) As WithEnglish;
SELECT DATENAME(mm, @Datetime) As WithEnglish;
SELECT DATENAME(mm, @Datetime2) As WithEnglish;
GO


SET LANGUAGE German;

DECLARE @Date date
DECLARE @Datetime datetime
DECLARE @Datetime2 datetime2

SET @Date = '2017-05-06';
SET @Datetime = '2017-05-06';
SET @Datetime2 = '2017-05-06';


SELECT DATENAME(mm, @Date) As WithGerman;
SELECT DATENAME(mm, @Datetime) As WithGerman;
SELECT DATENAME(mm, @Datetime2) As WithGerman;


As you see, the Datetime conversion happens based on the language set, but the Date and Datetime2 data types are accurately interpreted regardless of the language set. This is something you have to remember. If you receive datetime values as strings and session-related settings can change, then it is always better to use either Date or Datetime2.

If you need to make sure that the date is properly interpreted regardless of the settings (language), then stick to ISO 8601. If you change the values of the variables as below, you will get the same month, May, for both English and German.

SET @Date = '2017-05-06T00:00:00';
SET @Datetime = '2017-05-06T00:00:00';
SET @Datetime2 = '2017-05-06T00:00:00';

Note that ISO 8601 accepts a value like 24:00:00 as the time for midnight, but SQL Server does not support it.
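A quick way to see this is with TRY_CONVERT (available from SQL Server 2012 onwards), which returns NULL instead of raising an error when a value cannot be converted;

```sql
-- hour 24 is rejected by SQL Server; midnight has to be written as 00:00:00
SELECT TRY_CONVERT(datetime2, '2017-05-06T24:00:00') AS MidnightAs24 -- returns NULL
 , TRY_CONVERT(datetime2, '2017-05-07T00:00:00') AS MidnightAs00;   -- converts fine
```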

Tuesday, February 21, 2017

SQL Server Sequence suddenly starts with -2147483648

Have you faced this? Assume that you have created a Sequence object for generating values sequentially for a table, with the data type set as int, and suddenly it shows the next number as -2147483648.

See below code;

-- creating sequence
CREATE SEQUENCE [dbo].[OrderSequence]
AS int
START WITH 1 
INCREMENT BY 1 
MAXVALUE 2 
CYCLE
GO

-- requesting numbers three times
SELECT (NEXT VALUE FOR   dbo.OrderSequence) AS OrderSequence
GO 3



What could be the reason? If you analyze the code written above, you can easily find the issue. I have not used the MINVALUE property when creating the Sequence, hence it takes the lowest value of the data type set for the Sequence, which is -2147483648 for the int data type. Since CYCLE is specified, once the Sequence passes its MAXVALUE it restarts from MINVALUE, not from START WITH. You may experience the same; if so, check and see whether MINVALUE has been set or not.
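To fix it, the Sequence can be re-created with MINVALUE set explicitly; a sketch based on the code above;

```sql
-- re-creating the Sequence with MINVALUE set,
-- so that CYCLE restarts from 1 instead of the int minimum
DROP SEQUENCE [dbo].[OrderSequence];
GO

CREATE SEQUENCE [dbo].[OrderSequence]
AS int
START WITH 1
INCREMENT BY 1
MINVALUE 1
MAXVALUE 2
CYCLE;
GO

-- now the generated values cycle 1, 2, 1 instead of 1, 2, -2147483648
SELECT (NEXT VALUE FOR dbo.OrderSequence) AS OrderSequence
GO 3
```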

Monday, February 20, 2017

Can we access the SQL Server temporary table created in different database?

Temporary tables are nothing new and we have been using them for a long time. There are two types of temporary tables: Local, which starts with a single pound sign (#), and Global, which starts with double pound signs (##). Local Temporary Tables are limited to the connection that created them and are discarded automatically when the connection is closed. Global Temporary Tables are global to the instance and can be accessed by anyone connected to the instance. They are dropped automatically when the last referencing connection is closed.

Now the question is, when a Local Temporary Table is created, can I access it in another database?

Answer is yes and no. See the code below.

USE Sales;
GO

CREATE TABLE #TempTable
(
 Id int
);
GO

-- This works without any issue
SELECT * FROM #TempTable;

The created table can be accessed without any issue because the access is done in the same database using the same connection. If we try the SELECT against another database in a different window (different connection);

USE Sales;
GO

-- This will throw "Invalid object name '#TempTable'." error.
SELECT * FROM #TempTable;

You will see an error as above. However, if I try to access the table from the same connection but a different database;

--USE Sales;
--GO

--CREATE TABLE #TempTable
--(
-- Id int
--);
--GO

--SELECT * FROM #TempTable;

-- Same first connection but different database
USE AdventureWorks2014;
GO

-- This will work
SELECT * FROM #TempTable;

As you see, it is possible. Remember, Local Temporary Tables are limited to the connection, not to the database in which they were created, hence the created table can be accessed from any database as long as the connection is the same.

Sunday, February 19, 2017

Changing Schema of SQL Server objects

I had a requirement today to change the schema of a set of tables to a new schema, but I did not find a direct method to change the schema of all objects using a single statement. The ALTER SCHEMA statement supports transferring one object from one schema to another, but it cannot be executed against multiple tables at once.

USE AdventureWorks2014;
GO

-- transferring the Person table from the Person schema to Sales
-- once executed, the table becomes Sales.Person
ALTER SCHEMA Sales TRANSFER Person.Person;

Therefore I wrote a simple piece of code for transferring multiple tables (of course, the code can be changed for addressing any object type) and thought to share it, because you may be looking for something similar if you have the same need.

USE AdventureWorks2014;
GO


DECLARE @TableNames TABLE (Id int identity(1,1) PRIMARY KEY, Name varchar(500));
DECLARE @Messages TABLE (Id int identity(1,1) PRIMARY KEY, Message varchar(1000));

DECLARE @TableName varchar(500);
DECLARE @TableId int = 1
DECLARE @NewSchema varchar(20) = 'Sales';
DECLARE @OldSchema varchar(20) = 'Production';
DECLARE @Statement varchar(500)

-- collecting all the table names needed
INSERT INTO @TableNames
 (Name)
SELECT s.name + '.' + t.name TableName
FROM sys.tables t
 INNER JOIN sys.schemas s
  ON t.schema_id = s.schema_id
WHERE s.name = @OldSchema
 AND t.type = 'U';

-- making the ALTER SCHEMA statement for all tables
-- and execute them using EXEC
WHILE EXISTS (SELECT * FROM @TableNames WHERE Id = @TableId)
BEGIN
 SELECT @TableName = Name FROM @TableNames WHERE Id = @TableId;
 SET @Statement = 'ALTER SCHEMA ' + @NewSchema + ' TRANSFER ' + @TableName;
 BEGIN TRY

  EXEC (@Statement);
  -- record success message
  INSERT INTO @Messages (Message) VALUES ('Successfully transferred ' + @TableName + ' to ' + @NewSchema);
 END TRY
 BEGIN CATCH
  
  -- record the error
  INSERT INTO @Messages (Message) VALUES ('Transfer unsuccessful: ' + @TableName + ' [' + ERROR_MESSAGE() + ']');
 END CATCH

 SET @TableId += 1;
END

-- checking the output
SELECT * FROM @Messages;


Thursday, February 16, 2017

Creating multiple administrative accounts in Azure SQL Database Server

You know that we can create multiple administrator accounts in SQL Server if it is an on-premises instance or an instance configured in a VM (hosted in the cloud or on-premises). What we generally do is create a login and add the login to the sysadmin fixed server role. Can we do the same in Azure SQL Database (Server)?

There is only one unrestricted administrative account that can be created in an Azure SQL Database Server. It is created when creating the server. In addition to that, you can add either one Azure Active Directory account or a Security Group account (that has multiple accounts as members). If you open the Properties of your server, you can see your administrative accounts;


However, if you need to add multiple administrative accounts (not unrestricted administrative accounts), then there is a way of adding them. There are two server-level administrative roles available in the master database to which user accounts can be added, granting permissions for creating and managing databases and logins.

The first role is dbmanager. This role has permission to create new databases, hence members of this role can create databases. This role exists in the master database. Therefore, only users in the master database can become members of this role.

-- In master database
-- creating a user
CREATE USER Dinesh2 WITH PASSWORD = 'Pa$$w0rd';

-- adding the user to dbmanager role
ALTER ROLE dbmanager ADD MEMBER Dinesh2;

The other role is loginmanager. This role has permissions for creating logins in the master database. Again, this role exists in the master database, hence only users in the master database can become members of it.

-- In master database
-- creating a user
CREATE USER Dinesh3 WITH PASSWORD = 'Pa$$w0rd';

-- adding the user to loginmanager role
ALTER ROLE loginmanager ADD MEMBER Dinesh3;

Wednesday, February 15, 2017

Should I create Azure SQL Logins or Azure SQL Users?

Unless it is a contained database, we must create a login for connecting to SQL Server in order to access a database. Databases are accessed using Users, and Users are created using Logins. Is it the same with Azure SQL Databases? Should I create a login before creating a user?


With Azure SQL Database, it is not a must, as an Azure SQL database is a contained database (portable). We can create a user account with a password for accessing the database without creating a login. However, it is still possible to create a login in the master database and then create a user in our database using that login.

What if I created a user in my database using a login?
The main advantage is maintenance. If the login needs to be dropped or disabled, it can simply be done in the master database. Not only that, since multiple user accounts can be created in multiple databases using the same login, one login can be used for connecting to multiple databases.
The biggest disadvantage with this is that the database becomes non-portable. If the database needs to be copied, moved or replicated, the login must be created on the new servers.

What if I created a user in my database without creating a login?
The main benefit we get is a portable database. Since all users with passwords are maintained inside the database, it can be copied, moved or replicated without any issue.
However, if the environment has multiple databases and users need to access more than one database, then multiple user accounts have to be created.
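The two approaches can be sketched as below; the user, login and role names are my own examples, not anything required by Azure.

```sql
-- Option 1: contained user (portable database, no login required)
-- run in the user database
CREATE USER AppUser WITH PASSWORD = 'Pa$$w0rd';
ALTER ROLE db_datareader ADD MEMBER AppUser;

-- Option 2: login-based user (easier central maintenance)
-- run in the master database:
CREATE LOGIN AppLogin WITH PASSWORD = 'Pa$$w0rd';
-- then run in each user database that needs access:
CREATE USER AppUser2 FROM LOGIN AppLogin;
```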

Monday, February 13, 2017

The Integration between Azure Data Lake Store and Azure SQL Data Warehouse

As a part of my presentation, I had to show the integration between Azure Data Lake Store (ADL) and Azure SQL Data Warehouse (Azure SQL DW), hence I created the image below;


Generally, ADL is used for storing unlimited amounts of data in any format (structured and unstructured) and processing it using various tools built on Massively Parallel Processing (MPP). We use Azure SQL DW for storing a large volume of structured data, again using MPP. In most cases, Azure SQL DW is the source for Business Intelligence. Earlier, Azure SQL DW could not ingest data from ADL, but now it is possible with PolyBase; hence, the best practice is to load data into ADL, process it, and then transfer it to Azure SQL DW for reporting.
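As a rough illustration of that load path, a PolyBase setup in Azure SQL DW looks something like the sketch below; all object names, the ADL account and the credential details are placeholders, and the real values depend on your Azure AD application and your store.

```sql
-- a hypothetical PolyBase setup in Azure SQL DW for reading from ADL;
-- the identity string combines the Azure AD application (client) id with
-- the OAuth 2.0 token endpoint, and the secret is the application key
CREATE DATABASE SCOPED CREDENTIAL ADLCredential
WITH IDENTITY = '<client id>@<OAuth 2.0 token endpoint>', SECRET = '<application key>';

CREATE EXTERNAL DATA SOURCE MyADLStore
WITH (TYPE = HADOOP
 , LOCATION = 'adl://mydatalake.azuredatalakestore.net'
 , CREDENTIAL = ADLCredential);

CREATE EXTERNAL FILE FORMAT PipeDelimitedText
WITH (FORMAT_TYPE = DELIMITEDTEXT, FORMAT_OPTIONS (FIELD_TERMINATOR = '|'));

-- external table over the processed files in ADL
CREATE EXTERNAL TABLE ext.SalesProcessed
(
 SaleId int
 , Amount money
)
WITH (LOCATION = '/processed/sales/'
 , DATA_SOURCE = MyADLStore
 , FILE_FORMAT = PipeDelimitedText);

-- loading into Azure SQL DW using CTAS
CREATE TABLE dbo.Sales
WITH (DISTRIBUTION = HASH(SaleId))
AS SELECT * FROM ext.SalesProcessed;
```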

Sunday, February 12, 2017

Converting Attributes to Columns - SQL Server Brain Basher of the Week #065

Let's talk about something common to all database management systems rather than something specific to SQL Server. This is the interview question of the week, and let me start it with a conversation I had with an interviewee.


Me: Let's assume that you have been given a document that contains identified attributes for a particular entity, for example, the Customer entity. The Business Analyst has mentioned that it needs an attribute called Customer Name. How do you take this entity and design your database table?

Interviewee: All I have to do is understand the given attributes and create a table called Customer with the relevant columns. We need to make sure that the right data type is selected for each and every attribute, or column.

Me: How do you decide the data type? Let's talk about this specific attribute: Customer Name.

Interviewee: Yes, I will be setting varchar for this attribute, probably with the size as 200. Another thing: if the solution is a multilingual application, we have to use nvarchar instead of varchar.

Me: Good, anything else to be considered on that?

Interviewee: Regarding Customer Name, I think that is all I have to consider. We might have to change the size of it, but 200 is reasonable.

Okay, what do you think? He is not a senior person; that is why I started with the basics, but I expected something additional, something extra.

The first thing you need to understand is that the BA does not know, or does not think, about database design. It is all about the business. It is our duty to convert the identified business into a logical and physical database design. So, even though the BA has identified Customer Name as an attribute, or as a single attribute, it does not mean that we need to stick to it. This is where we apply normalization rules. Not only that, we need to think and see;
  • How this attribute is going to be filled
  • How this attribute is going to be read
  • How often the entity is searched based on this attribute

For example, the Customer entity (or the table) might be searched by Customer Last Name. Have we addressed this requirement? Do not say that the BA has not mentioned it; if it is not mentioned, it is always good to get it clarified, but it is something we need to consider.

We should consider other common things, whether the BA has mentioned them or not. Customer Name is not going to be a single column. You will surely have at least two columns called First Name and Last Name. This makes sure that your table is designed properly and is ready for standard or most common analysis, as well as holding data efficiently. Remember, breaking an attribute into multiple columns is not only for well-known attributes such as Employee Name or Location. We might break attributes like Product Code and Invoice Number as well. For example, Product Code might be formed using two elements, such as a Product Type Code and a Unique Number. In that case, it is better to have two columns for Product Type Code and Unique Number, and another Computed Column for Product Code (if required).
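As a quick illustration of the points above, here is a hypothetical design; the names, sizes and code format are my own examples, not anything given by the BA.

```sql
-- the name attribute broken into two columns to support common searches
CREATE TABLE dbo.Customer
(
 CustomerId int identity(1,1) primary key
 , FirstName nvarchar(100) not null
 , LastName nvarchar(100) not null   -- supports searching by last name
);

-- the code elements stored separately, with the full code as a computed column
CREATE TABLE dbo.Product
(
 ProductId int identity(1,1) primary key
 , ProductTypeCode char(2) not null
 , UniqueNumber int not null
 , ProductCode AS ProductTypeCode + RIGHT('00000' + CONVERT(varchar(5), UniqueNumber), 5)
);
```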

Does it make sense? Although this is a simple thing, many miss it, hence DO NOT FORGET SIMPLE THINGS AND FUNDAMENTALS RELATED TO DATABASE DESIGN :).



Saturday, February 11, 2017

Azure Feature Pack for Integration Services (SSIS) - New Version Available

Microsoft has made the latest version of the Azure Feature Pack for Integration Services (SSIS) available for downloading. This was released on 28-Dec-2016 and there are a few additional components when comparing with the previous version.

Why do we need this? Simply to work with Big Data as part of our ETL solution, with the support of the Azure Big Data platform, which is HDInsight and its supportive projects.

I wrote a few posts on the previous version of this; you can read more on them at;

The above posts explain how to download and install the feature pack, configure tasks for accessing HDInsight and process some unstructured data using tasks given with SSIS.

You can get the latest version of the Azure Feature Pack from https://msdn.microsoft.com/en-us/library/mt146770.aspx

I noticed one new Control Flow task, Azure SQL DW Upload Task, and two new Data Flow items, Azure Data Lake Store Source and Azure Data Lake Store Destination, that were not available with the previous version.

Previous Version:


Latest Version:


I need to test these new items; they must be really interesting, and I will surely share them via blog posts.

Friday, February 10, 2017

SQL Server Data Tools (Visual Studio) - The length of the full path for the solution, project or item you are creating exceeds the maximum path length allowed

While trying to create a new SSIS project with default values for Project Name, Location and Solution Name, I experienced the following limitation;

The length of the full path for the solution, project or item you are creating exceeds the maximum path length allowed by the system. You must reduce the length of the name or the location.


This is not something new, but I had not experienced it for a long time. What is the reason for this?

This is because of a limitation of the OS. The limit is set at 260 characters for the path, hence when it is exceeded, Visual Studio shows this message without letting us create the project. There are two ways I tried to overcome this issue;
  1. Of course, the first way is to select a path that can be referred to with a smaller number of characters.
  2. The second one is creating the project without creating the solution. For that, we need to un-check the Create directory for solution checkbox and then create the project. Once created, we can open it and add it to a solution.
This check is based on the following calculation;

Path length + 1 (separator) + 
Solution name length + 1 (separator) + 
Project name length + 1 (separator) + 
Project name length + 
80 (Reserved space)



Thursday, February 9, 2017

Passing .Net data table as a User Defined Table to a Stored Procedure

When there is an updated data set that needs to be sent to a Stored Procedure for updating all records in the data set, what most developers do is iterate the data set, access records one by one and call the stored procedure for each and every record. This is not a best practice, and it affects the performance of the application. The time it needs may not be noticed if the data set contains a few records, but it is noticeable when there are many records to be updated. What if we could send the entire data set with a single call and the stored procedure could update all records with a single execution?

This is not something new, but many still do not use it. SQL Server 2008 introduced the User-Defined Table Type that can be used for creating a new type, similar to a structure in programming languages. It can be used for accepting a data set into a stored procedure via a parameter and, since it is a table, data in the table can be accessed just like a normal table for manipulation.

Let's see how we can do it. Assume that I have a data set that contains an updated recordset and I need to pass it to a stored procedure. For that, first of all, we need to create a table type;

CREATE TYPE [dbo].[NoteSectionTagTable] AS TABLE
(
 NoteSectionId int not null
 , TagId int not null
)

Next is the stored procedure. See how it can be set. Note the READONLY keyword, which is a must.

CREATE PROCEDURE [dbo].[UpdateNoteSectionTags]
 @NoteSectionTagTable NoteSectionTagTable READONLY
AS
BEGIN

 DELETE dbo.NoteSectionTags
 WHERE NoteSectionId IN (SELECT NoteSectionId FROM @NoteSectionTagTable);

 INSERT INTO dbo.NoteSectionTags
  (NoteSectionId, TagId)
 SELECT NoteSectionId, TagId
 FROM @NoteSectionTagTable;

 RETURN 0
END

Everything is ready at the database level. Here is the way of creating a data table and passing it to the stored procedure using C#.Net.

// creating the table
System.Data.DataTable noteSectionTags = new System.Data.DataTable();

// adding columns
System.Data.DataColumn noteSectionIdColumn = new System.Data.DataColumn("NoteSectionId", typeof(int));
noteSectionTags.Columns.Add(noteSectionIdColumn);

System.Data.DataColumn tagIdColumn = new System.Data.DataColumn("TagId", typeof(int));
noteSectionTags.Columns.Add(tagIdColumn);

System.Data.DataRow noteSectionTag;

// adding rows
// (item and ViewState here come from the ASP.Net page that uses this code)
noteSectionTag = noteSectionTags.NewRow();
noteSectionTag["TagId"] = Convert.ToInt32(item.Value);
noteSectionTag["NoteSectionId"] = Convert.ToInt32(ViewState["NoteSectionId"]);
noteSectionTags.Rows.Add(noteSectionTag);

// we can add multiple rows like above

//creating sql command for calling the stored procedure
System.Data.SqlClient.SqlCommand command = new System.Data.SqlClient.SqlCommand("dbo.UpdateNoteSectionTags");
command.CommandType = CommandType.StoredProcedure;

//note the parameter data type
System.Data.SqlClient.SqlParameter parameterNoteSectionTagTable = new System.Data.SqlClient.SqlParameter("NoteSectionTagTable", SqlDbType.Structured);
parameterNoteSectionTagTable.Value = noteSectionTags;
command.Parameters.Add(parameterNoteSectionTagTable);

// getting the connection for the config
string connectionString = ConfigurationManager.ConnectionStrings["DefaultConnection"].ConnectionString;

using (System.Data.SqlClient.SqlConnection connection = new System.Data.SqlClient.SqlConnection(connectionString))
{

    command.Connection = connection;
    connection.Open();
    command.ExecuteNonQuery();
    connection.Close();
}

Wednesday, February 8, 2017

How do I set the size of SqlParameter of SqlCommand for varchar(max)?

If you need to execute a Stored Procedure implemented with a varchar(max) parameter using a .Net application, how do you set the size of the parameter?

Since the constructor of SqlParameter does not accept a string value for the size property, we have no way of passing 'max' as the size;

// This will not compile
SqlParameter parameterNote = new SqlParameter("Note", SqlDbType.NVarChar, max);

But we can pass -1 as the size, which indicates that it is max;

SqlParameter parameterNote = new SqlParameter("Note", SqlDbType.NVarChar, -1);


Tuesday, February 7, 2017

Splitting values in a string variable and inserting values as rows - II - STRING_SPLIT function

Yesterday I wrote a post on splitting string values using an Extended Stored Procedure. The biggest issue with it was its unavailability in Azure SQL Database. However, we have a new function to achieve the same, and it works in both SQL Server 2016 and Azure SQL Database.

This is how it works;

DECLARE @String nvarchar(4000) = 'SQL,Business Intelligence,Azure';

SELECT *
FROM STRING_SPLIT(@String, ',');


No more complex queries for splitting values :).
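If you need to load the split values into a table, as in yesterday's scenario, the rows can be inserted directly; note that STRING_SPLIT returns its output in a column named value. The dbo.Tags table here is the one created in yesterday's post.

```sql
DECLARE @String nvarchar(4000) = 'SQL,Business Intelligence,Azure';

-- inserting the split values straight into the table
INSERT INTO dbo.Tags
 (Tag)
SELECT value
FROM STRING_SPLIT(@String, ',');
```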

Monday, February 6, 2017

Splitting values in a string variable and inserting values as rows - I

Challenges are interesting, and finding various ways to solve them is a thrilling adventure.

One of the pieces of code I had to write today splits a string value received from an ASP.Net application and inserts the values into a table. There are many ways of splitting a string value (or sometimes converting columns into rows), but this was a bit different. The values I receive from the application are something like;

"SQL Business_Intelligence Azure"
"Personal Fun_time Crazy_Stuff"

Now how can I convert them into individual values (as rows) and insert them into a table?

There is a useful extended stored procedure that allows us to split values in a string, considering space as the separator: xp_sscanf. It has some limitations, but it can be useful in some scenarios.


Here is the function I wrote for splitting values;

USE tempdb;
GO

CREATE OR ALTER FUNCTION dbo.SplitString (@String nvarchar (4000))
RETURNS @Tags TABLE
(
 Tag nvarchar(200)
)
AS
BEGIN

 DECLARE @Tag1 nvarchar(200)
   , @Tag2 nvarchar(200)
   , @Tag3 nvarchar(200)
   , @Tag4 nvarchar(200)
   , @Tag5 nvarchar(200)
   , @Tag6 nvarchar(200)
   , @Tag7 nvarchar(200)
   , @Tag8 nvarchar(200)
   , @Tag9 nvarchar(200)
   , @Tag10 nvarchar(200)

 EXEC xp_sscanf @String, '%s %s %s %s %s %s %s %s %s %s',   
  @Tag1 OUTPUT, @Tag2 OUTPUT, @Tag3 OUTPUT, @Tag4 OUTPUT, @Tag5 OUTPUT
  , @Tag6 OUTPUT, @Tag7 OUTPUT, @Tag8 OUTPUT, @Tag9 OUTPUT, @Tag10 OUTPUT;  

 INSERT INTO @Tags
  (Tag)
 SELECT T.*
 FROM (
  SELECT REPLACE(@Tag1, '_', ' ') AS NewTag
  UNION ALL
  SELECT REPLACE(@Tag2, '_', ' ')
  UNION ALL
  SELECT REPLACE(@Tag3, '_', ' ')
  UNION ALL
  SELECT REPLACE(@Tag4, '_', ' ')
  UNION ALL
  SELECT REPLACE(@Tag5, '_', ' ')
  UNION ALL
  SELECT REPLACE(@Tag6, '_', ' ')
  UNION ALL
  SELECT REPLACE(@Tag7, '_', ' ')
  UNION ALL
  SELECT REPLACE(@Tag8, '_', ' ')
  UNION ALL
  SELECT REPLACE(@Tag9, '_', ' ')
  UNION ALL
  SELECT REPLACE(@Tag10, '_', ' ')) AS T
 WHERE T.NewTag IS NOT NULL;

 RETURN;
END
GO

And this is how I can use it;

USE tempdb;
GO

IF OBJECT_ID('dbo.Tags') IS NOT NULL
 DROP TABLE dbo.Tags;

CREATE TABLE dbo.Tags
(
 TagId int identity(1,1) primary key
 , Tag nvarchar(200)
);
GO

DECLARE @String nvarchar(4000) = 'SQL Business_Intelligence Azure';

INSERT INTO dbo.Tags
 (Tag)
SELECT Tag
FROM dbo.SplitString (@String);

SELECT * FROM dbo.Tags;


As you see with the second script, I can simply pass the string received, get them split and insert to the required table.

Is this working in Azure SQL Database?
Unfortunately, it does not work in Azure SQL Database, as Azure SQL does not support Extended Stored Procedures. However, the good news is that SQL Server 2016 has a new function that can be used with both SQL Server 2016 and Azure SQL Database. Here is a sample code for it.

Sunday, February 5, 2017

On-Premises or Cloud - SQL Server Brain Basher of the Week #064

Here is the Brain Basher of this week.

During interviews, questions on deciding the hosting place for a planned database are very common. Sometimes we ask for the best place for a database by giving certain criteria, but sometimes we ask how you would decide considering general factors.

If you were asked the below question, what would be your answer?

"You have to create a database for one of the business processes related to the company. Hosting environment for the database is still undecided. We can either create it as an Azure SQL Database or can use the existing On-Premises SQL Server for hosting this database. What do you think?"

This is the Interview Question of the week.

Since there is no specific requirement, particularly on the business need, most of the time the interviewee gets stuck in certain areas. However, as per my experience, it would be great if the interviewee could explain the factors related to making such decisions, along with their pros and cons. Let's list out possible factors that influence the decision;
  • Stable Connection - How will this database be used? Frequently, with heavy transactions, or infrequently, with fewer transactions? This is very important because if the database is an OLTP database with heavy transactions and the business depends on it, you need a stable, high-speed internet connection. If the company has no facility that can satisfy the required speed and bandwidth, it is better not to move to the cloud. However, if getting a high-speed internet connection is a plan that will be executed soon, hosting in the cloud should be considered.
  • Sensitive data - Some companies, especially some financial companies, have policies on sensitive data. If the policy says that data should be maintained only on local servers, then hosting the database in the cloud is not going to work. Beyond the policies established by the company, there are some regulations enforced by the government, hence they have to be considered too. We may go for a hybrid version that hosts part of the database in the cloud and the rest on an on-premises server, making sure that sensitive data is not moved out. This is possible with Microsoft SQL Server (SQL Server on-premises database files maintained in Azure).
  • Accessibility - How is our database going to be accessed? What we need to make sure is that it can be accessed by the audience who need it. Users may access it via a web application, using reports, or using tools like Microsoft Excel or Power BI by connecting directly. If we host it in the cloud, can all of them access data as they want? Does it require special software to be installed? Does it require special permissions? Do we have to purchase more licenses? These are the things we have to consider here. If we have to spend more than what is in the budget, or it makes the usage complex, we might have to consider on-premises. However, certain scenarios make on-premises hosting complex and costly as well.
  • Size - Is it going to be in megabytes, gigabytes or terabytes? If you expect that the database will grow into terabytes within a short time period, then it should be hosted on a local machine, as 1 TB is one of the limitations of Azure SQL Database.
  • Security - The most common factor that pushes an implementation away from the cloud. Many think that having the database hosted on a local server is more secure than placing the database in the cloud. However, it is not true. If you go with a good provider like Microsoft or Amazon, there have been no security breaches recorded and they guarantee security. Securing a local environment is very difficult; you have to consider security implementations both electronically and physically, and someone has to continuously watch it, patch it, upgrade it when required and monitor it. Therefore, experts recommend hosting in the cloud rather than in-house when security is considered.
  • High Availability and Disaster Recovery - If this is required and you have already implemented it using methods like Mirroring, Clustering or even a backup/restore strategy, and most importantly it addresses the required RTO and RPO (read more at http://dinesql.blogspot.com/2016/10/determine-backup-strategy-sql-server-interview-question.html), then you can consider a local implementation. However, if you have not implemented anything, then it is always better to move to the cloud, as almost all providers support HA and DR.
  • Cost - No doubt, the initial cost is always very high when it comes to hosting on a local server. You may spend on purchasing servers, software licenses and some extra hardware for speeding up the processes. But the initial cost is very low when hosting in the cloud. You may calculate for multiple years and compare, but in most cases, hosting in the cloud is cheaper than hosting on an on-premises server.
  • Maintenance - If it is local, you need someone to do this: installing required service packs, checking space, performance tuning, monitoring... many things. If it is cloud, most of these are addressed by the provider and only a few need our attention. For example, Azure SQL Database suggests enhancements on indexing and statistics, and we can use these recommendations for performance tuning, which saves time and resources. So, if you do not have a dedicated person for performing these tasks, go for the cloud.
  • Features and Functionalities - You may need to implement some processes that need a certain set of functionalities. For example, if you need to send an email from a SQL Server database, you need to use the Database Mail feature. If you need to execute business logic at a scheduled time, you need to use SQL Server Agent. You know that these features are not available with Azure SQL Database. Not only that, it does not support all TSQL commands, and standard partitioning among multiple volumes is not available (with Azure SQL Database, partitioning is possible, but volumes cannot be decided; if needed, federation can be considered). Therefore, we need to think about the features and functionalities that will be used with the database and then decide whether we can host it on a local server or in the cloud.
There are more, but the interviewee should talk about at least the mentioned factors. That makes sure that the interviewee is smart and knowledgeable on the subject, not just a person who implements anything when asked.


Saturday, February 4, 2017

What if I create an Azure SQL Database using Management Studio (SSMS)?

Will there be differences if I create my Azure SQL Database using Management Studio instead of Azure Portal?

Yes, there are. One major difference I noticed is Snapshot Isolation. By default, it is on in an Azure SQL Database, along with Is Read Committed Snapshot On;

[Database created using the portal]

However, if you create a database using SSMS, it is not set as expected;

[Database created using SSMS]


As you see, even though Snapshot Isolation is on, Is Read Committed Snapshot On is set to false. I am not sure how this happens; I have already asked the experts and will update this post with the reasons once I receive a reply.
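If you face this, the settings can be checked and corrected with TSQL; here I assume the database created via SSMS is called Sales, as in my other posts.

```sql
-- checking both settings for all databases on the server
SELECT name, snapshot_isolation_state_desc, is_read_committed_snapshot_on
FROM sys.databases;

-- turning Read Committed Snapshot on for the database created via SSMS
ALTER DATABASE Sales SET READ_COMMITTED_SNAPSHOT ON;
```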

Friday, February 3, 2017

Is it possible to read data while they are being modified in Azure SQL Database with default settings?

Somewhere in 2011, I wrote an article titled Isolation Level in SQL Server that explains different Isolation Levels in SQL Server. It is based on an older version of SQL Server, but it is still valid for new versions as well.

Now that we have SQL Server as a service, we are slowly moving to Azure SQL Database. As you know, Azure SQL Database is not the same as an on-premises SQL Server database; it has some differences, and one of them is related to the Isolation Level.

Before talking about the difference, let's write a sample code and see how it works in both Azure SQL Server and On-Premises SQL Server databases. For this, I have created a database called Sales in both environments.

Once the databases are available, the below code, which adds a table and inserts records, needs to be run in both the local and Azure SQL databases;

CREATE TABLE dbo.Products
(
 ProductId int identity(1,1) Primary Key
 , Name varchar(100) not null
 , Color varchar(20) not null
);
GO

INSERT INTO dbo.Products
 (Name, Color)
VALUES
 ('Product 01', 'Black')
 , ('Product 02', 'Red')
 , ('Product 03', 'Blue');

Now we have the same table in both databases. Let's first test with local database.

Open a new connection and execute the following code. It starts a transaction and updates one of the records. Note that we neither roll back nor commit.

-- First connection

BEGIN TRAN
 UPDATE dbo.Products
  SET Color = 'Green'
 WHERE ProductId = 2;

Open another connection and execute the code given below. This connection tries to retrieve the record that is being modified by the first connection.

-- Second connection

SELECT * FROM dbo.Products
WHERE ProductId = 2;

The result of the second connection is shown in the image below. As you see, we cannot see the record immediately because it is being modified.


Let's go back to the first connection and rollback the transaction.

-- First connection

ROLLBACK TRAN

Once it is done, the second connection will be able to see the records.


This is the standard/default behavior of SQL Server. Generally, we cannot access records that are being modified because the default Isolation Level is Read Committed. Now let's do the same in Azure SQL Database.

Open a new connection to the Azure Sales database and execute the same code that starts the transaction.

Open another connection to the Azure Sales database and execute the same code for retrieving records. Here is the result of it.


As you see, there is no issue with the Azure Sales database. The second connection can retrieve records even though the record is being modified by another connection.

This is one of the differences between Azure SQL Database and a local SQL Server database. The reason for this is that Snapshot Isolation is enabled by default in Azure SQL Database. Because of that, a version of the old record is created before the modification, and it is served to other users on request.
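You can confirm these defaults yourself; the following query (a quick check, run against the Sales database used above) shows both settings:

```sql
-- Both columns should report ON / 1 for an Azure SQL Database
SELECT name
 , snapshot_isolation_state_desc -- Snapshot Isolation
 , is_read_committed_snapshot_on -- Read Committed Snapshot
FROM sys.databases
WHERE name = 'Sales';
```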


Can we turn this behavior off in Azure SQL Database? No, it is not possible; if you try to change it via SSMS, you will experience the below error;


However, you can change Is Read Committed Snapshot On, which changes the default behavior.


If you set this to false, the second connection will not be able to see records while the first connection is modifying them, but the second connection can set the Isolation Level manually and see the records.
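As a sketch of that last point: because Snapshot Isolation itself remains enabled, the second connection can opt back into row versioning for its own session:

```sql
-- Second connection (sketch): read the last committed version of the row
-- even when Is Read Committed Snapshot On has been set to false
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

SELECT * FROM dbo.Products
WHERE ProductId = 2;
```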




Thursday, February 2, 2017

Is Azure SQL Database a PaaS, DaaS or SaaS?

I was confused by this too, but added the most common answer to my slides when presenting my session, Getting Ready for Azure SQL Database. Is Azure SQL Database a Platform-as-a-Service, Database-as-a-Service or Software-as-a-Service?

First of all, let's try to understand these models. In addition to the ones mentioned, there is another model called IaaS - Infrastructure as a Service. Here is a simple, shortened explanation of these models;
  • IaaS - Infrastructure as a Service
    In Cloud Computing, getting the required servers, storage and network as an on-demand service, where their maintenance is someone else's (like Azure's or Amazon's) responsibility, is called IaaS. It is as if we have outsourced the entire infrastructure service, including the physical components. Virtual Machines in Microsoft Azure are one example of this.
  • PaaS - Platform as a Service
    This model sits on IaaS, offering a complete development and deployment environment. In addition to the services you get with IaaS, it offers services such as programming languages and database services for building solutions. Microsoft Azure is a good example of this.
  • SaaS - Software as a Service
    This is the top layer of Cloud Computing and it sits on PaaS, meaning on top of both IaaS and PaaS. It offers the required software as an on-demand service or as a subscription. For example, Gmail from Google allows us to use an email service just using the browser.
  • DaaS - Database as a Service
    This is not particularly popular and is not always referred to, because it is an extension, or an optimized form, of the SaaS model. It provides all RDBMS functionalities with cloud computing and sits on PaaS.
This is how it looks like;


And this is the distribution of management;

Not sure of the original author of the second image; it was taken from one of the Microsoft pages.

Now, is Azure SQL Database a PaaS, DaaS or SaaS? It is commonly considered as PaaS, but it is actually a SaaS. Can we consider it as a DaaS? Yes, it is possible, but I think we should consider it as SaaS because it offers more than what a traditional RDBMS offers.





Wednesday, February 1, 2017

Azure SQL Database - Auditing & Threat Detection

Has your database been attacked by someone?

Most of our applications are data-driven, and attackers can use various techniques for getting into the database. One of the most famous and common techniques is SQL Injection, a code-injection technique that modifies the statement written in the application and gets it executed as the attacker wants.

Here is a link for reading more on SQL-Injection: http://www.w3schools.com/sql/sql_injection.asp

Whether the database is an on-premises database or a cloud database, if the code has been written poorly, it is exposed to attackers. And the biggest issue is that, in most cases, we do not know whether the database has been attacked or not. What if someone could detect possible attacks and notify you immediately? Microsoft Azure SQL Database can do it.

If you have worked with Azure SQL servers, you have already noticed a setting called Auditing & Threat Detection in the SQL Server blade. It allows you to set up Auditing and, in addition to that, we can enable a feature called Threat Detection, which is still in Preview mode (meaning it may not always work as expected until the final version is released).

Read the post written on Auditing for more info on it: How to check success and failure logins attempts in Azure SQL Database

Threat Detection detects anomalous database activities that are considered security threats and notifies us with details. This means it can detect possible SQL Injections as well. However, in my experience, it does not detect the initial attempts immediately, but after that, notifications are sent immediately. As per some experts, it needs some time to learn the patterns before notifying, but it is too early to make such a statement as this is still in Preview mode.

Let's see how we can use Threat Detection with an example. I created an Azure SQL Database with the following schema and inserted a set of records. Assume that this Sales database contains sales made by sales representatives.

-- Create User table
CREATE TABLE dbo.Users
(
 UserId int identity(1,1) PRIMARY KEY
 , UserName varchar(10) NOT NULL
 , Password varbinary(1000) NOT NULL
);
GO

INSERT INTO dbo.Users
 (UserName, Password)
VALUES
 ('Dinesh', ENCRYPTBYPASSPHRASE('DK?DF%23:D', 'Dinesh123'))
 , ('Jane', ENCRYPTBYPASSPHRASE('DK?DF%23:D', 'Jane123'))
 , ('Jack', ENCRYPTBYPASSPHRASE('DK?DF%23:D', 'Jack123'));

-- Create Sales table
CREATE TABLE dbo.Sales
(
 SalesId int identity(1,1) PRIMARY KEY
 , UserId int NOT NULL
 , Constraint FK_Sales_Users FOREIGN KEY (UserId) REFERENCES dbo.Users (UserId)
 , SalesDate date NOT NULL
 , SalesAmount decimal(16,2)
);
GO

INSERT INTO dbo.Sales
 (UserId, SalesDate, SalesAmount)
VALUES
 (1, '2016-01-01', 10000.00)
 , (1, '2016-01-02', 12000.00)
 , (1, '2016-01-03', 14000.00)
 , (2, '2016-01-01', 9000.00)
 , (2, '2016-01-02', 110000.00)
 , (3, '2016-01-01', 17000.00)
 , (3, '2016-01-02', 126000.00)
 , (3, '2016-01-03', 19000.00)

The next step is creating a simple web application. I created an ASP.Net Web Application that connects to the Azure SQL Database created above. I added a Web Form that contains two input boxes for accepting the User Id and Password, a button for loading the user's sales, and a GridView for showing the retrieved data.


This is the Button-Click method:

string sql = "SELECT u.UserName, s.SalesDate, s.SalesAmount FROM dbo.Sales s INNER JOIN dbo.Users u ON s.UserId = u.UserId WHERE ";
sql += " u.UserName = '" + TextBoxUserId.Text + "' AND DECRYPTBYPASSPHRASE('DK?DF%23:D', u.Password) =  '" + TextBoxPassword.Text + "'";

string connectionString = ConfigurationManager.ConnectionStrings["DefaultConnection"].ConnectionString;

SqlConnection connection = new SqlConnection(connectionString);
SqlCommand command = new SqlCommand(sql, connection);

connection.Open();
SqlDataReader reader = command.ExecuteReader();

DataTable table = new DataTable();
table.Columns.Add("UserName", System.Type.GetType("System.String"));
table.Columns.Add("SalesDate", System.Type.GetType("System.DateTime"));
table.Columns.Add("SalesAmount", System.Type.GetType("System.Decimal"));

DataRow row;
while (reader.Read())
{
    row = table.NewRow();
    row["UserName"] = reader["UserName"].ToString();
    row["SalesDate"] = Convert.ToDateTime( reader["SalesDate"]);
    row["SalesAmount"] = Convert.ToDecimal( reader["SalesAmount"]);
    table.Rows.Add(row);
}
GridView1.DataSource = table;
GridView1.DataBind();

connection.Close();

As you see, the SqlCommand is not a parameterized command; hence, an attacker can use a simple SQL Injection to modify the statement and get a different set of data.
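As a side note, the fix is straightforward. Here is a sketch of the same lookup written as a parameterized statement in T-SQL using sp_executesql (the table and passphrase are the ones created above); user input is bound as parameters, so injected text cannot change the statement:

```sql
DECLARE @sql nvarchar(max) = N'
 SELECT u.UserName, s.SalesDate, s.SalesAmount
 FROM dbo.Sales s
  INNER JOIN dbo.Users u ON s.UserId = u.UserId
 WHERE u.UserName = @UserName
  AND CONVERT(varchar(100), DECRYPTBYPASSPHRASE(''DK?DF%23:D'', u.Password)) = @Password;';

-- Values supplied by the user are passed as parameters, never concatenated
EXEC sp_executesql @sql
 , N'@UserName varchar(10), @Password varchar(100)'
 , @UserName = 'Dinesh'
 , @Password = 'Dinesh123';
```

The same applies on the application side: adding parameters to the SqlCommand instead of concatenating strings closes this hole.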

All done. If I check for user Dinesh, I see the sales related to him.


However, if a SQL Injection like the one below is used, all sales records can be retrieved regardless of the user.


We are not going to talk about the SQL Injection itself or the poor code written in the web application. Let's see how we can get a notification from Azure when something like this is attempted.

As you know, all we have to do is enable Threat Detection. This is how it has to be set up in the Azure SQL Server blade via the Auditing & Threat Detection setting.


Everything required is done. As you see, Threat Detection is ON and all Threat Detection types have been selected. I have added my email address for receiving alerts. Note that you need to enable Auditing in order to enable Threat Detection, and it should be configured for all Audited Events.

Now if you try again with a SQL-Injection, you will immediately get a mail with details;


I can click on View Related SQL Statements to see more details. If I click it, it opens the Audit Records blade that shows all recorded activities.



As you see, it is very easy to set up, and you can be alerted on all suspicious activities. Try it and see.

Tuesday, January 24, 2017

Creating a Testing Hadoop Cluster

I once wrote a post on the Hortonworks HDP Sandbox (Looking for a Hadoop Cluster for testing? Let's configure Hortonworks HDP Sandbox), which can be used for learning and testing many Hadoop-related sub-projects. Now a new version is available and it is a bit different. Since I did a session on Hive yesterday (https://www.meetup.com/LKBigData/events/236812354/) and used it for the demonstration, I thought to make a new post on it: how to get the Sandbox configured on your machine.

The Hortonworks Sandbox is available as a virtual machine. It is, indeed, a portable Apache Hadoop 2.5 with many sub-projects, and can be downloaded as a Sandbox for VMWARE, VIRTUALBOX or DOCKER. In order to download it, you need to visit http://hortonworks.com/products/sandbox.

This post covers Oracle VirtualBox. If you download the Sandbox for VirtualBox, follow the steps below to add the downloaded Sandbox to VirtualBox.

1. Open VirtualBox and click on File menu -> Import Appliance menu.


2. Browse the downloaded HDP_2.5_virtualbox.ova file and click on Next.


3. Change the configuration as you need. You may increase the CPU and Memory; to change a value, double-click on the default and edit it.


4. That's all. Click on Import. It will add Sandbox to your virtual environment. This is what you should see finally.


Now you can start the virtual machine. Select it and click on Start or just double click on it. 


Once it is started, you can log in by pressing Alt+F5 and using root as the user and hadoop as the password.


However, unlike the older version, we can actually work with this without logging in to it. As the image says, we can use the browser to start a session using the given URL, which is http://127.0.0.1:8888/.


You can open the dashboard, which can be used to find various information related to the components running inside, but before that you need to get the password for each component. Here is the way of accessing some of the components.

1. Click on Quick Links for opening ADVANCE HDP QUICK LINKS.
2. Move the mouse over Ambari and note the user id and password.


3. Go back to main page and click on Dashboard.
4. Use the noted user id and password and click on Sign In.


5. This shows all running services and metrics related to Hadoop components. If you need to see HDFS or need to open a window for executing Hive queries, click on the button next to Admin and select the menu item needed.


6. This is how you see the Files View.


7. This is how you see the Hive View.


As you see, it is very easy to set it up and start using Hadoop. Of course, you cannot set up multiple nodes with it, but it is good for learning and testing.


Wednesday, January 18, 2017

SQL Server Identity value suddenly became 1000+

I noticed that the values of an identity column in one of my tables suddenly started from 1000+ for newly inserted records. The last values before this sudden jump were 27, 28, 29, but then it suddenly started from 1000+. How did it happen without resetting the seed or increment of the identity property?

This is not a bug or something specific to SQL Server 2016. It is a result of a new implementation introduced with SQL Server 2012. As per my reading, since SQL Server 2012, SQL Server caches 1,000 values for an identity property if the data type is int, and 10,000 values for bigint and numeric. If an unexpected restart happens (in my case, I believe an update caused the restart), the cache can be lost; hence the next value starts from the next available value. This is normal and you do not need to worry if you experience it.

By enabling Trace Flag 272, you can get the old behavior, or you can use a Sequence instead. Read this thread for understanding these two options: http://stackoverflow.com/questions/17587094/identity-column-value-suddenly-jumps-to-1001-in-sql-server
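If you prefer the Sequence route, here is a minimal sketch (the object names are assumptions for illustration); creating the sequence with NO CACHE avoids the jump, at the cost of some insert performance:

```sql
-- Sketch: a sequence with caching disabled does not skip values on restart
CREATE SEQUENCE dbo.OrderNumber AS int
 START WITH 1
 INCREMENT BY 1
 NO CACHE;

-- Use the sequence as the default for the key column
CREATE TABLE dbo.Orders
(
 OrderId int NOT NULL
  CONSTRAINT DF_Orders_OrderId DEFAULT (NEXT VALUE FOR dbo.OrderNumber)
  PRIMARY KEY
 , OrderDate date NOT NULL
);
```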

Friday, January 13, 2017

SQL Server - Adding an Authenticator when encrypting data

Among the multiple methods given for securing data stored in a SQL Server database, even though the latest one, Always Encrypted, is available, we still use Keys, Passphrases and Certificates for encrypting data. When keys such as Symmetric Keys or Asymmetric Keys, or Passphrases, are used for encrypting data, an additional parameter called an Authenticator can be supplied. Since I recently used this in one of my database solutions, I thought to make a note on it.

What is the usage of the Authenticator? Why should we use it? Let's take an example and understand it with that.

The following code creates a database and a table that holds Customers. The Security Code of the customer will be encrypted.

-- Create a database
CREATE DATABASE Sales;
GO

-- Connect with the newly created database
USE Sales;
GO

-- Create a master key
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Pa$$w0rd';
GO 

-- Create a certificate for protecting our key
CREATE CERTIFICATE SalesCertificate WITH SUBJECT = 'Sales Certificate';
GO

-- Create the key for encrypting data
-- Note that the created certificate protects it.
CREATE SYMMETRIC KEY SalesKey
WITH ALGORITHM = AES_128  
ENCRYPTION BY CERTIFICATE SalesCertificate;
GO

-- Creating a table that holds customers
-- Note the SecurityCode column; it is varbinary
-- because the code will be encrypted and stored
CREATE TABLE dbo.Customer
(
 CustomerId int identity(1,1) PRIMARY KEY
 , Name varchar(100) NOT NULL
 , SecurityCode varbinary(256) NOT NULL
);
GO

Let's insert some records.

OPEN SYMMETRIC KEY SalesKey DECRYPTION BY CERTIFICATE SalesCertificate ;
GO

INSERT INTO dbo.Customer
 (Name, SecurityCode)
VALUES
 ('Dinesh', ENCRYPTBYKEY(KEY_GUID('SalesKey'), 'CUS005XZ'))

INSERT INTO dbo.Customer
 (Name, SecurityCode)
VALUES
 ('Yeshan', ENCRYPTBYKEY(KEY_GUID('SalesKey'), 'CUS02ANX3'))


Once inserted, the data will look like below;


And if I try to retrieve the records, I need to decrypt the encrypted values;
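For completeness, this is roughly the query behind that screenshot; the symmetric key must be open before DECRYPTBYKEY can return the plain value:

```sql
OPEN SYMMETRIC KEY SalesKey DECRYPTION BY CERTIFICATE SalesCertificate;

SELECT CustomerId, Name
 , CONVERT(varchar(100), DECRYPTBYKEY(SecurityCode)) AS SecurityCode
FROM dbo.Customer;

CLOSE SYMMETRIC KEY SalesKey;
```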


So far so good. Now let's try to understand the usage of the Authenticator. Assume that Yeshan needs to access some securables that can be accessed only by Dinesh through an application. For that, all Yeshan needs is to log in to the application using Dinesh's security code. Since he does not know Dinesh's security code, one way of accessing the application using Dinesh's account is replacing Dinesh's encrypted code with his own. Let's see whether it is possible.

The following code updates Dinesh's account with Yeshan's code. And as you see, it gets updated, and now Yeshan can use Dinesh's account as he knows the code.
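The screenshot shows an update along these lines (a sketch of the attack, not the exact code): Dinesh's encrypted value is simply overwritten with Yeshan's.

```sql
-- Sketch: replace Dinesh's (CustomerId = 1) encrypted code
-- with Yeshan's (CustomerId = 2) encrypted code
UPDATE dbo.Customer
 SET SecurityCode = (SELECT SecurityCode FROM dbo.Customer WHERE CustomerId = 2)
WHERE CustomerId = 1;
```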


This should not be allowed, and since it is possible, we should make sure that the encrypted code cannot be replaced like that. That is what we can do with an Authenticator.

Look at the following code. It passes two additional values when encrypting. The third one, which is 1, says that this has an Authenticator. The fourth parameter is the data from which to derive the Authenticator.

OPEN SYMMETRIC KEY SalesKey DECRYPTION BY CERTIFICATE SalesCertificate ;
GO

-- Update Security codes with CustomerId as the Authenticator
UPDATE dbo.Customer
 SET SecurityCode = ENCRYPTBYKEY(KEY_GUID('SalesKey'), 'CUS005XZ', 1, Convert(varbinary(256), CustomerId))
WHERE CustomerId = 1;

UPDATE dbo.Customer
 SET SecurityCode = ENCRYPTBYKEY(KEY_GUID('SalesKey'), 'CUS02ANX3', 1, Convert(varbinary(256), CustomerId))
WHERE CustomerId = 2;


Just like the previous code, the values are encrypted now. However, if Yeshan tries to do the same, see the result;


As you see, even though Dinesh's code has been replaced with Yeshan's code, when we try to decrypt Dinesh's value, the result is NULL because the Authenticator is different. This is the usage of the Authenticator.
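Decryption works the same way as before, except that the Authenticator must be supplied again. A sketch of the retrieval:

```sql
OPEN SYMMETRIC KEY SalesKey DECRYPTION BY CERTIFICATE SalesCertificate;

-- Returns NULL for any row whose value was encrypted with a different
-- Authenticator (for example, a value copied in from another row)
SELECT CustomerId, Name
 , CONVERT(varchar(100),
    DECRYPTBYKEY(SecurityCode, 1, CONVERT(varbinary(256), CustomerId))) AS SecurityCode
FROM dbo.Customer;

CLOSE SYMMETRIC KEY SalesKey;
```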

Note that we used the CustomerId as the Authenticator, but you can use something else, something uncommon, as the Authenticator to make it more secure and avoid malicious activities like this.


Thursday, January 12, 2017

Incorrect syntax near 'TRIPLE_DES'. - SQL Server throws an error when try to use algorithms

Assume that you use SQL Server 2016 and are trying to create a Symmetric Key or Asymmetric Key for encrypting data. If you try to use an algorithm like TRIPLE_DES, you will get the mentioned error;

Msg 102, Level 15, State 1, Line 20
Incorrect syntax near 'TRIPLE_DES'.

Here is a sample code for seeing this error;

-- Create a database
CREATE DATABASE Sales;
GO

-- Connect with the newly created database
USE Sales;
GO

-- Create a master key
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Pa$$w0rd';
GO 

-- Create a certificate for protecting our key
CREATE CERTIFICATE SalesCertificate WITH SUBJECT = 'Sales Certificate';
GO

-- Create the key for encrypting data
-- Note that the created certificate protects it.
-- However, this will throw an error
CREATE SYMMETRIC KEY SalesKey
WITH ALGORITHM = TRIPLE_DES  
ENCRYPTION BY CERTIFICATE SalesCertificate;
GO

Now, what is the reason for this? The reason is that this algorithm is deprecated in SQL Server 2016. Not only that, all algorithms except AES_128, AES_192, and AES_256 are deprecated in SQL Server 2016.

What if you still need one of these deprecated algorithms? It is possible, but you need to downgrade the database by changing the Compatibility Level to 120 or below.

This code shows the way of doing it.


USE master;
GO

-- Change the compatibility level to 120
ALTER DATABASE Sales
SET COMPATIBILITY_LEVEL = 120;

-- And check again
USE Sales;
GO

-- This will work now without any error
CREATE SYMMETRIC KEY SalesKey
WITH ALGORITHM = TRIPLE_DES  
ENCRYPTION BY CERTIFICATE SalesCertificate;
GO

Even though you can use the algorithm after changing the Compatibility Level, remember that a lower Compatibility Level might not let you use all the functionalities available with SQL Server 2016. Therefore, if possible, use the allowed algorithms only.