# Calculating year to day targets by month

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38``` ```DECLARE @TargetDate DATETIME = '01 mar 2018' IF OBJECT_ID(N'tempdb..#temp') IS NOT NULL DROP TABLE #temp ;WITH cte AS ( SELECT CAST('01 jan 2018' AS DATE) AS dateid UNION ALL SELECT dateadd(month,1,dateid) FROM cte WHERE MONTH(dateid) <> 12 ) SELECT dateid,'averageLoan' AS [TargetName], month(dateid) AS [Target] INTO #temp FROM cte UNION ALL SELECT dateid,'target_1', 2 AS [Target] FROM cte UNION ALL SELECT dateid,'target_2', 3 AS [Target] FROM cte UNION ALL SELECT dateid,'target_3', 4 AS [Target] FROM cte UNION ALL SELECT dateid,'target_4', 5 AS [Target] FROM cte SELECT t1.targetName AS TargetName, SUM(t1.[target] * t2.[target]) AS TargetValue FROM #temp t1 INNER JOIN #temp t2 ON t1.dateid = t2.dateid WHERE MONTH(t1.dateid) <= MONTH(@TargetDate) AND t1.targetName <> 'averageLoan' AND t2.targetName = 'averageLoan' GROUP BY t1.targetName ```

# Average start time of an overnight job

### Problem:

You have an overnight job which is triggered by the arrival of some external data.  You keep a record of the date and time that the overnight job is run.   Someone wants to know what the average start time of the job is.  You can’t use the average function on dates and times in T-SQL so you need to write some custom code to calculate the average time the job starts.

### Solution:

First, you need to identify the point in time that you want the calculation to consider the start of a new day.  For a job which runs during the day could be 0 – indicating midnight.  For an overnight job you would probably want to consider the possibility that the job could be started the next morning.  In that situation you might choose 18 – indicating 6pm as the start of a new day.

Then, in the calculation itself:

1. identify the number of seconds since the day division
2. get the average of number of seconds
3. add to a date time and extract the time portion

```-- Defines the hour of the day when a new day starts
DECLARE @DayDivision INT = 12

IF OBJECT_ID(N'tempdb..#StartTimes') IS NOT NULL DROP TABLE #StartTimes

CREATE TABLE #StartTimes(
start DATETIME NOT NULL
)

INSERT INTO #StartTimes
VALUES
('2018-01-09 00:01:38.000')
,('2018-01-09 23:43:22.000')

SELECT
-- 3. Add the number of seconds to a day starting at the
--    day division hour, then extract the time portion
-- 2. Average number of seconds
AVG(
-- 1. Get the number of seconds from the day division point (@DayDivision)
DATEDIFF(SECOND,
CASE WHEN DATEPART(HOUR,start) < @DayDivision THEN
ELSE
SMALLDATETIMEFROMPARTS(YEAR(start),MONTH(start),DAY(start),@DayDivision,0)
END
,start)
)
,'01 jan 1900 ' + CAST(@DayDivision AS VARCHAR(2)) + ':00') AS TIME) AS AverageStartTime
FROM #StartTimes
```

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32``` ```/* Date : 17/01/2018 Notes: Catches a bad xml character error and displays messages to help with debugging */ DECLARE @x VARCHAR(MAX) SET @x = 'dave' + CHAR(160) + 'smith' BEGIN TRY SELECT CAST(@x AS XML) AS xmlOutput END TRY BEGIN CATCH DECLARE @errorMessage VARCHAR(MAX) ,@errPos INT SELECT @errorMessage = ERROR_MESSAGE() IF PATINDEX('%illegal xml character%', @errorMessage) > 0 BEGIN SELECT @errPos = SUBSTRING(@errorMessage, PATINDEX('%character %', @errorMessage) + 10, PATINDEX('%, illegal%', @errorMessage) - PATINDEX('%character %', @errorMessage) - 10 ) PRINT 'Error caused by illegal character in XML:' PRINT 'Character:' + SUBSTRING(@x,@errPos,1) PRINT 'Ascii Code:' + CAST(ASCII(SUBSTRING(@x,@errPos,1)) AS VARCHAR(MAX)) PRINT 'Surrounding text:' + SUBSTRING(@x,@errPos-100,200) PRINT '' END END CATCH ```

# T-SQL defensive programming overview

The four principles of defensive programming are

2. Test rigorously against user cases
3. Layout code in modules
4. Reuse code

# The other space character… ascii 160

I recently discovered that the space character in ascii has an evil twin.  Like all evil twins it looks exactly the same but it does evil…

Here they are in SSMS:

### They look the same – why is there another space character in ascii?

The second space is a non-breaking space character – so any program displaying the text should not separate a line at that point.

### Why is ascii 160 evil?

The other day I was trying to extract some data into xml but I kept getting an illegal xml character error.

The source data had a single ascii 160 character in it, but all the other spaces were ascii 32.  Even when you find the row of data which has the non breaking space it still requires a lot of work to actually find where the character is!

### How can you solve this error?

The best way is to make sure that the data going into your system is cleaned of the non breaking space – if you intend to extract it in XML.

Another possibility is to replace non breaking spaces with breaking spaces in the output.

Or, if the system that will receive the XML is happy with it, you could change the encoding to UTF-16.

# Adding many-to-many relationships to a data warehouse

Many to many relationships frequently occur in OLTP databases.  For instance, consider a retail system which stores the details of products in one table. Each product  could belong to  one or more product types; equally, each product type could apply to more than one product.  This is often represented in an OLTP database as:

However, in a data warehouse this structure is likely to cause performance problems.

Assuming Products has it’s own dimension in the DW.  Here are some alternatives for storing the associated Product Type data:

Proposed structure Issues to condsider
Add a bitmap column to the Product dimension
• Not easily readable (for humans)
• Good for binary indicators
• Good for values which are quite static
• Very good for performance
• Needs to be documented well so that all using it clearly understand what each bit indicates
Add an xml column to the Product dimension
• Need to pay attention to the performance implications
• May need to add an XML index to resolve performance issues
• Extends easily – it is XML after all!
Add a bridging table to a dimension
• Does have performance implications due to the joins that will need to be made to connect the tables together
• How many connections is the dimension likely to have to the bridging table? If most don’t have connections then the performance hit may be acceptable
• Consider the dimension type – the fewer rows of a type I dimension will have much less of a performance hit than a type II dimension
• Requires more ETL work than previous options
Add a bridging table to a fact table
• Consider which type of fact table it is – has the potential to add a lot of redundant information to the DW if not relevant to reporting requirements
• Good for capturing frequently changing values that need to be reported on
• Requires more ETL work than previous options