DQS – Data Quality Services in SQL Server 2012

TRINUG – Data SIG, July 2013

SQL Server Data Quality Services (DQS) is a knowledge-driven data quality product. DQS enables you to build a knowledge base and use it to perform a variety of critical data quality tasks, including correction, enrichment, standardization, and de-duplication of your data.

Using DbUpdater with MySql

DbUpdater can be used with mysql.

Here are the files you need to jumpstart the integration – DbUpdater-MySql.zip

1. You might need to change the path to mysql.exe in mysql-exec.bat.

2. Modify values in mysql-sample-command-line.bat.

Make sure mysql.data.dll is placed in GAC or in the same directory as DbUpdater.exe. Connector binaries can be downloaded from here –
http://dev.mysql.com/downloads/connector/net/6.1.html

SQL Azure Notes

My invitation code for SQL Azure CTP finally arrived. I had been waiting for this.

sqlazure-manage.png

You can create a database in from the SQL Azure Management page, shown above.

After you have created the database, of course, you can use DbUpdater with your SQL Azure databases. Here is the command-line :
DbUpdater.exe ..\SqlScripts /server:”tcp:SERVER.ctp.database.windows.net” /db:DATABASE /user:”USER@SERVER” /password:PASSWORD /confirm:true

And you will need this DbUpdater.exe.config file :

<?xml version=”1.0″ encoding=”utf-8″ ?>
<configuration>
<appSettings>
<add key=”file-prefix” value=”db” />
<add key=”file-extension” value=”sql” />
<add key=”schema-versions-name” value=”schema-versions-table.sql” />
<add key=”baseline-name” value=”baseline.sql” />
<add key=”post-script-prefix” value=”post” />
<add key=”post-script-always-apply” value=”true” />
<add key=”exe-file” value=”C:\Program Files\Microsoft SQL Server\90\Tools\Binn\SQLCMD.EXE” />
<add key=”exe-args” value=’-U %UserName% -P %Password% -S “%ServerName%” -d %DbName% -i “%SqlFile%”‘ />
<add key=”dialect” value=”NHibernate.Dialect.MsSql2000Dialect” />
<add key=”driver_class” value=”NHibernate.Driver.SqlClientDriver” />
<add key=”connection_string” value=”Server=%ServerName%;Database=%DbName%;User ID=%UserName%;Password=%Password%;Trusted_Connection=False;” />
</appSettings>
</configuration>

You can run sqlcmd.exe directly to execute t-sql scripts on SQL Azure. Here is a sample command :
sqlcmd.exe -S SERVER.ctp.database.windows.net -U USER@SERVER -P PASSWORD -d DB -i “baseline.sql”

I received these errors when I executed my SQL Server 2005 scripts on a SQL Azure database –

  • ‘ANSI_NULLS’ is not a recognized SET option.
  • Deprecated feature ‘SET ANSI_PADDING OFF’ is not supported in this version of SQL Server.
  • Deprecated feature ‘More than two-part column name’ is not supported in this version of SQL Server.
  • Deprecated feature ‘Data types: text ntext or image’ is not supported in this version of SQL Server.
  • Deprecated feature ‘Table hint without WITH’ is not supported in this version of SQL Server.

These are not the limitations of the SQL Azure platform. SQL Azure is based on SQL Server 2008 Database Engine. The complete list of deprecated features in SQL Server 2008 are listed here along with the replacements (if any) : Deprecated Database Engine Features in SQL Server 2008 – MSDN.
After modifying the scripts to replace the deprecated features, the scripts could be executed without any further issues. All stored procedures, triggers and views were created without any errors.

The following error is a SQL Azure limitation.

  • ‘Filegroup reference and partitioning scheme’ is not supported in this version of SQL Server.

Here is more detailed information on unsupported T-SQL – Unsupported Transact-SQL Statements (SQL Azure Database) – MSDN

Using NHibernate With SQLite in DbUpdater

DbUpdater (version 1.3 onwards) can be used with sqlite3.

Review the files included in DbUpdater-sqlite3.zip

1. DbUpdater.exe.config must be modified. Note changes to the following keys – exe-file, exe-args, dialect, driver_class, connetion_string.
2. SchemaVersion.hbm.xml must be modified. The ‘class’ attributes value for the ‘generator’ of ‘id’ must be changed to ‘identity’. No other changes are needed here.
3. \SqlScripts\schema-version-table.sql is modified. Note that the datatypes of all columns are changed to TEXT, except for SchemaChangeId, which is changed to integer primary key. integer primary key is used for auto-incremented columns in sqlite.
4. \SqlScripts\baseline.sql is modified. Note the use of current_timestamp function. The actual sql syntax in this file must conform to sqlite dialect.
5. All other .sql scripts in \SqlScripts directory are also modified to conform to sqlite dialect.

The following files are expected to be in DbUpdater.exe directory (as configured in the DbUpdater.exe.config file) –

1. sqlite3.exe : download.
2. System.Data.SQLite.dll : download.
3. createdb.bat file : This file is included. This batch script will call sqlite3.exe to execute sql queries.

Another very useful tool for working with sqlite database files is SQLite Database Browzer.

WPF – LINQ to SQL Sample

This exercise in binding a WPF ListView control with LINQ to SQL was a lot of fun ! The following image shows data from AdventureWorks sample database displayed in the ListView control. The end result is very admirable, even in this very basic demo. Feeling a complete sense of freedom is unavoidable when working with WPF.

Linq2SqlSample1.Png

This sample demonstrates :

  1. LINQ to SQL mapping.
  2. WPF ListView Control DataBinding.
  3. Displaying images stored in database as varbinary in WPF control.

The sample code can be downloaded here : WPF-LINQ to SQL Sample
License : Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.

Creative Commons License

Database Schema Version Control Tool

It is essential to treat database schema as source code and apply all the norms and best practices applicable to it. From what I hear and read, database schema version control is non-existent in too many projects.

Various techniques to accomplish version control of database schemas have been discussed and published. Martin Fowler and Pramod Sadagale has written a comprehensive article on Evolutionary Database Design. It is a must-read for everyone involved with database development.

K. Scott Allen writes in detail about an excellent system here – Versioning Databases – Change Scripts. This approach is sufficiently light-weight (for my taste) and provides complete control over the whole process.

I am providing a free tool here that you can use to implement this system. It will even create the SchemaVersionsLog table for your database, if it doesn’t exist. If you are using this tool with MSSQL Server, you can get started with very little effort. If you are using a different DBMS, you can still use this tool. It uses NHibernate to access the SchemaVersionsLog table. So, any DBMS supported by NHibernate is acceptable, as long as it comes with a command line tool to execute the sql scripts.

Download DbUpdater here.