June 2023

SQL: Quick and dirty comparing result sets

When doing regression testing – I need to prove that the changes that I have done to two queries returned the same result sets.

If result sets are on the same database then you can do a MINUS operator. 

If not – it can be a pain to shift result sets from database to database; especially if they are large.

For quick and dirty testing; I like using the tsql function CHECKSUM and pl sql function ORA_HASH.  It can give you some quick results to test whether a column of data has a high probability of been the same as another column.

To test a result set; I have the following excel formulas saved in my “Handy formulas” notebook:

TSQL
=",SUM(CAST(CHECKSUM("&A1&") AS BIGINT)) AS "&A1
PL/SQL
=",SUM(ORA_HASH("&A1&")) AS "&A1

With a list of column name you can then make an checksum sql statement from Excel

Running this query on both result sets; differences can be found by seeing if there is a difference in the retuned values

Excel & VBA – Concatenating two fields together with formatting

A work colleague asked me to do a “Mate’s rate” job for his team yesterday.

They had a MS Word document with a huge table in it that they needed to convert to Excel to load into an application.

Paraphrased  it looked like this:

My first thought was to copy this table out and paste it into excel to see how it looks:

Two problems:

  1. The cell “Some data” is merged across three rows
  2. Line 1, Line 2, Line 3 are now split across three rows instead of being in the one cell

I wrote a macro to split out the merge cells; but when it came to combining cells for Line 1, Line 2 and Line 3 I ran into a problem.

None of the usual commands and from my quick google searching could combine the cells and retained the formatting.

So commands like:

=CONCAT(B2:B4)

=B2 & B3 & B4

Would not retain the formatting; let alone add in line feed characters to make it a multiline cell

I’m sure there is a better way – but strapped for time I programmed the following subroutine in VBA

Public Sub combineCells(cell1 As Range, cell2 As Range, output As Range)
    Dim formatIndex As Long
    Dim formatOffset As Long
    Dim hyperlink1 As String
    Dim hyperlink2 As String

    hyperlink1 = ""
    hyperlink2 = ""

    ' Check if the cell has a hyperlin; but don't have a text version of the hyperlink in the Value 2
    If cell1.Hyperlinks.Count <> 0 And InStr(cell1.Value2, "https:") < 1 Then
        hyperlink1 = " [" & cell1.Hyperlinks(1).Address & "]"
    End If

    If cell2.Hyperlinks.Count <> 0 And InStr(cell2.Value2, "https:") < 1 Then
        hyperlink2 = " [" & cell2.Hyperlinks(1).Address & "]"
    End If

 
    ' Handling if the first cell is blank.  If so we don't want a LF at the top of the cell
    If Trim(cell1.Value2) <> "" Then
        output.Value2 = cell1.Value2 & hyperlink1 & vbLf & cell2.Value2 & hyperlink2
        formatOffset = Len(cell1.Value2) + Len(hyperlink1) + 1
    Else
        output.Value2 = cell2.Value2 & hyperlink2
        formatOffset = Len(cell1.Value2)
    End If

  
    ' Copies the formatting from cell1 to the final cell
    ' You can add more options to transfer over different formatting
    For formatIndex = 1 To Len(cell1.Value2)
        output.Characters(formatIndex, 1).Font.Bold = cell1.Characters(formatIndex, 1).Font.Bold
        output.Characters(formatIndex, 1).Font.Italic = cell1.Characters(formatIndex, 1).Font.Italic
        'output.Characters(formatIndex, 1).Font.Underline = cell1.Characters(formatIndex, 1).Font.Underline
    Next

  
    ' Copies the formatting from cell2 to the final cell
    For formatIndex = 1 To Len(cell2.Value2)
        output.Characters(formatIndex + formatOffset, 1).Font.Bold = cell2.Characters(formatIndex, 1).Font.Bold
        output.Characters(formatIndex + formatOffset, 1).Font.Italic = cell2.Characters(formatIndex, 1).Font.Italic
        'output.Characters(formatIndex + formatOffset, 1).Font.Underline = cell2.Characters(formatIndex, 1).Font.Underline
    Next

End Sub

Oh boy it runs slow – to combine a couple of thousands cells took half an hour and I had the typical worries that Excel crashed because everything was locked up.

But it worked for the quick task that I was doing

Qlik Replicate – Simple trick for Source file connectors

Quick post on a trick I use in Qlik Replicate

Qlik Replicate has a Source connector to consume flat files from a drive. 

More details can be found on the Qlik website

For change data capture of these files; Qlik will only consume files that have a newer “Last Change Date” than when the task is started.

So if for some reason you have to stop and reload files that are already present; you have to change the “Last Modified Date”

A simple Powershell command that I got from itechtics.com can be used to reset all the dates in a directory of your choice.

Get-ChildItem -force PathToItem * | ForEach-Object{$_.LastWriteTime = ("yyyy-mm-dd hh24:mi:ss")}

So to change all the dates of the files in the directory C:tempload to the current time (as of writing):

Get-ChildItem -force c:\temp\load * | ForEach-Object{$_.LastWriteTime = ("2023-06-08 11:05:00")}

If a Qlik Replicate task is running and watching this directory; it will reload the file my_load.file.csv.