Friday, June 3, 2016

Forcing a view to return a nullable column for a static value

I came across a weird scenario wherein I had two tables, one essentially a staging table and the other a slightly transformed version of that. However, they are more or less the same and both have a large amount of applications and ORMs tied to them directly. As a first step to remove the ETL process that moves the data from one table to the other unnecessarily, I planned to turn one into a view of the other. 

Due to restrictions, I essentially have to do this in a way that won't cause the ORM database mappings to break - so the view has to perfectly mimic the table - same name, same data types, same null-able status for each column. Naming and data types are easy to adjust, casting or converting where needed. Changing whether a column is null-able or not is a little more complicated, though.

I first started by tackling base columns that are null-able but had been made non-null during the ETL process by wrapping them in an ISNULL:

--Create a test table
CREATE TABLE [dbo].[Sales] (
    [ID] INT NOT NULL
   ,[Date] DATE NULL
   ,[Amount] INT NOT NULL
   ,[Code] INT NOT NULL
   ,[SalesPersonName] NVARCHAR(50) NULL
);
GO

--Use a view to switch from NULL to NOT NULL
CREATE VIEW [dbo].[vw_SalesView] AS
   SELECT ISNULL([SalesPersonName], N'') AS [SalesPersonName_NotNull]
         ,[Date]
   FROM [dbo].[Sales];
GO

--Check the view's output column meta
sp_help '[dbo].[vw_SalesView]';

When we check the output of the view, we see that the [SalesPersonName_NotNull] column is what we expect, NOT NULL:



This is a pretty quick and obvious solution and has been written about a lot, and the inverse works pretty similarly. When we want to turn a non-null column into null-able, a CASE statement is an easy way to achieve this:

--Drop old view
DROP VIEW [dbo].[vw_SalesView];
GO
--Add a column to make [Amount] nullable
CREATE VIEW [dbo].[vw_SalesView] AS
   SELECT ISNULL([SalesPersonName], N'') AS [SalesPersonName_NotNull]
         ,[Date]
         ,CASE 
             WHEN [Amount] IS NOT NULL
                THEN [Amount]
             ELSE NULL
          END AS [Amount]
   FROM [dbo].[Sales];
GO

--Check the view's output columns
sp_help '[dbo].[vw_SalesView]';

Which gives us a null-able [Amount] column, even though the underlying [Sales] table has the column as non-null:



At this point things are going great and most use cases for adjusting the null-ability of a column are covered. However, I came across a strange situation: the view needed to return a static integer value that wasn't from a table, yet needed be null-able to fit with the original table's makeup. The ETL process had just been setting the column equal to 1000 for every single row, so my view should do the same. My first thought was to use a CASE statement with a hard coded logical comparison to fool SQL Server into making it null-able, the opposite of what I had just done above:

DROP VIEW [dbo].[vw_SalesView];
GO
--Use 1 = 1 try to force a NULLABLE column via CASE
CREATE VIEW [dbo].[vw_SalesView] AS
   SELECT ISNULL([SalesPersonName], N'') AS [SalesPersonName_NotNull]
         ,[Date]
         ,CASE 
             WHEN [Amount] IS NOT NULL
                THEN [Amount]
            ELSE NULL
          END AS [Amount]
         ,CASE 
             WHEN 1 = 1 
                THEN 1000
             ELSE NULL
          END AS [SalesGoal]
   FROM [dbo].[Sales];
GO

--Check the view's output columns
sp_help '[dbo].[vw_SalesView]';

But that didn't work, SQL Server knows this will always evaluate to true and therefor leaves the column as non-null:



So next I began to think of a non-deterministic logic that would for all intensive purposes always return true, but still be potentially false, thereby tricking SQL Server while always giving me the desired result:

DROP VIEW [dbo].[vw_SalesView];
GO

--Use a potentially false condition that will always return true with GETDATE()
CREATE VIEW [dbo].[vw_SalesView] AS
   SELECT ISNULL([SalesPersonName], N'') AS [SalesPersonName_NotNull]
         ,[Date]
         ,CASE 
            WHEN [Amount] IS NOT NULL
              THEN [Amount]
            ELSE NULL
          END AS [Amount]
         ,CASE 
             WHEN DATEPART(YEAR, GETDATE()) > 2015
              THEN 1000
             ELSE NULL
          END AS [SalesGoal]
   FROM [dbo].[Sales];
GO

--Check the view's output columns
sp_help '[dbo].[vw_SalesView]';

Lo and behold, this worked!



I now had my static value returning as a null-able column in my view. And while it's pretty safe to assume neither me nor the database will be time traveling back to pre-2016, something about this approach left me a little unsettled. Who knows what obscure bug or sequence of events could end up giving me a bad date value. So what was something more reliable? I thought about using @@SERVERNAME, @@VERSION (how often do you downgrade to SQL 2000 ?!), or some other @@ value, but having to make assumptions or do some string manipulation seemed like an over complication and a route that would lead to reduced readability, making it even harder to discern what I was trying to achieve with this crazy CASE statement.

I finally settled on something that fulfilled my criteria and didn't leave a pit in my stomach: an approach using a built-in function for low cost, no extraneous manipulation to compare values, and astronomically low odds that I would ever accidentally get a NULL value when I didn't want one: GUIDs.

DROP VIEW [dbo].[vw_SalesView];
GO

--Compare GUIDs to ensure a true outcome, but still register as potentially false
CREATE VIEW [dbo].[vw_SalesView] AS
   SELECT ISNULL([SalesPersonName], N'') AS [SalesPersonName_NotNull]
         ,[Date]
         ,CASE 
            WHEN [Amount] IS NOT NULL
              THEN [Amount]
            ELSE NULL
          END AS [Amount]
         ,CASE 
            WHEN NEWID() <> NEWID()
              THEN 1000
            ELSE NULL
         END AS [SalesGoal]
   FROM [dbo].[Sales];
GO
--Check the view's output columns
sp_help '[dbo].[vw_SalesView]';



Since the odds of two generated GUIDs on the same server are extremely low and right off the bat it's obvious what the statement is doing, I think this is the best approach to take for this very specific, strange problem.