Pages

Advertisement

Wednesday, July 11, 2007

SQL Server 2005: Creating Scalar User-Defined Functions from Assemblies

Scalar functions are not much different from stored procedures. Indeed, for the most part, they have the very same differences that the T-SQL versions. Much as with stored procedures, we utilize the same core CREATE syntax that we used in the T-SQL user-defined functions (UDFs) we created in Chapter 11, "Getting Procedural: Stored Procedures and User-Defined Functions," of the book Professional SQL Server 2005 Programming (Wrox, 2006, ISBN: 0764584340).

CREATE FUNCTION [<schema name>.]<function name>
( [ <@parameter name> [AS]
[<schema name>.]<scalar data type> [ = <default value>]
[ ,...n ] ] )
RETURNS {<scalar type>|TABLE [(<Table Definition>)]}
[ WITH [ENCRYPTION]|[SCHEMABINDING]|
[ RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT ]
| [EXECUTE AS {
CALLER|SELF|OWNER|<'user name'>} ]
]
[AS] { EXTERNAL NAME <<i>external method</i>> |
BEGIN
[<function statements>]
{RETURN <type as defined in RETURNS clause>|RETURN
(<SELECT statement>)}
END }[;]
 

There are one or two new things once you get inside of the .NET code. Of particular note is that there are some properties that you can set for your function. Among those, probably the most significant is that you must indicate if the function is deterministic (the default is nondeterministic). We'll see an example of this in use shortly.

For the example this time, start a new SQL Server project in Visual Studio, but instead of adding a stored procedure as we did in our original assembly example, add a user-defined function.

SQL Server starts you out with a simple template:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction]
public static SqlString ExampleUDF()
{
// Put your code here
return new SqlString("Hello");
}
};

This is actually a workable template "as is." You could compile it and add it to SQL Server as an assembly, and it would work right out of the box (though just getting back the string "Hello" is probably not all that useful).

We'll replace that, but this time we're going to write something that is still amazingly simple. In the end, we'll see that, while simple, it is much more powerful than our stored procedure example.

In previous books, I have often lamented the issues with trying to validate e-mail fields in tables. E-mail, when you think about it, is a strongly typed notion, but one that SQL Server has only been able to perform minimal validation of. What we need are regular expressions.

We could approach this issue by writing a validation function and implementing it as a user-defined data type. This approach would have some validity, but has a problem — the rules for validating e-mails change on occasion (such as when new country codes or added, or when the .biz and .info top domains were added several years ago). Instead, we're going to implement simple regex functionality and then utilize a call to that function in a constraint.

We can do this with relatively minimal changes to the function template that SQL Server gave us. First, we can get rid of a few library declarations, since we won't be really working with SQL Server data to speak of, and add back two of our own. We wind up with just three using declarations:

using System;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;

We're then ready to implement out function with very few changes:

[SqlFunction(IsDeterministic = true, IsPrecise = true)]
public static bool RegExIsMatch(string pattern,
string matchString)
{
Regex reg = new Regex(pattern.TrimEnd(null));
return reg.Match(matchString.TrimEnd(null)).Success;
}

Oh sure, we completely replaced the old function, but not by much. Indeed, we only have two more lines of code — and that's including the determinism declaration!

I'm not going to review it much here, but take a look at Chapter 11, "Getting Procedural: Stored Procedures and User-Defined Functions" of the book Professional SQL Server 2005 Programming (Wrox, 2006, ISBN: 0764584340). If you need to be reminded of how determinism works The key thing is that, given the same inputs, the function must always yield the same outputs.

Go ahead and compile this, and we're ready to upload the assembly:

USE AdventureWorks
CREATE ASSEMBLY ExampleUDF
FROM '<solution path>\ExampleUDF\bin\Debug\ExampleUDF.dll'

And then create the function reference:

CREATE FUNCTION fCLRExample
(
@Pattern nvarchar(max),
@MatchString nvarchar(max)
)
RETURNS BIT
AS EXTERNAL NAME ExampleUDF.UserDefinedFunctions.RegExIsMatch

Notice the use of the nvarchar type instead of varchar. The string data type is a Unicode data type, and our function data type declaration needs to match.

This done, we're ready to test things out a bit:

SELECT ContactID, FirstName, LastName, EmailAddress, Phone
FROM Person.Contact
WHERE dbo.fCLRExample(
'[a-zA-Z0-9_\-]+@([a-zA-Z0-9_\-]+\.)+(com|org|edu|mil|net)',
EmailAddress) = 1

If you have the default data, then this will actually return every row in the table since they all are adventure-works.com addresses. So, let's try a simple test to show what works versus what doesn't:

DECLARE @GoodTestMail varchar(100),
@BadTestMail varchar(100)
SET @GoodTestMail = 'robv@professionalsql.com'
SET @BadTestMail = 'misc. text'
SELECT dbo.fCLRExample(
'[a-zA-Z0-9_\-]+@([a-zA-Z0-9_\-]+\.)+(com|org|edu|nz|au)',
@GoodTestMail) AS ShouldBe1
SELECT dbo.fCLRExample(
'[a-zA-Z0-9_\-]+@([a-zA-Z0-9_\-]+\.)+(com|org|edu|nz|au)',
@BadTestMail) AS ShouldBe0

For the sake of brevity, I have not built the full e-mail regex string here. It would need to include all of the valid country code top domans such as au, ca, uk, and us. There are a couple hundred of these, so it wouldn't fit all that well. That said, the basic construct is just fine, you can tweak it to meet your particular needs.


This gets us back what we would expect:

ShouldBe1
---------
1
(1 row(s) affected)
ShouldBe0
---------
0
(1 row(s) affected)

But let's not stop there. We have this nice function, let's apply it a little further by actually applying it as a constraint to the table.

ALTER TABLE Person.Contact
ADD CONSTRAINT ExampleFunction
CHECK (dbo.fCLRExample(
'[a-zA-Z0-9_\-]+@([a-zA-Z0-9_\-]+\.)+(com|org|edu|nz|au)',
EmailAddress) = 1)

Now we try to update a row to insert some bad data into our column, and it will be rejected:

UPDATE Person.Contact
SET EmailAddress = 'blah blah'
WHERE ContactID = 1

And SQL Server tells you the equivalent of "no way!":

Msg 547, Level 16, State 0, Line 1
The UPDATE statement conflicted with the
CHECK constraint "ExampleFunction".
The conflict occurred in database "AdventureWorks",
table "Person.Contact", column 'EmailAddress'
The statement has been terminated.

This article is adapted from Professional SQL Server 2005 Programming 2.0 by Joe Robert Vieira (Wrox, 2006, ISBN: 0-7645-8434-0), from Chapter 14, "Nothing But NET!"

 

No comments:

Post a Comment