
Power BI and the lost tnsnames.ora file

A distraught user contacted me today: her Power BI dashboard was not refreshing and was returning the following error:

OLE DB:ORA-12154: TNS:could not resolve the connect identifier specified

The problem was not isolated to her; it was common across her whole team.

I’m not an expert on Oracle and Power BI, so I kindly suggested she contact the Helpdesk.  She came back a few minutes later saying the Helpdesk had just passed her off unceremoniously to another team.

Deciding to give her a hand, I had a think about the error message.

TNS files seem to end up in all sorts of weird and wonderful locations on our organisation’s systems.

I also remembered that the Oracle client had been patched just last week.

I asked her to do a search for tnsnames.ora on her C drive.

Meanwhile I ran the following command on my own computer, which should have a similar configuration:

echo %TNS_ADMIN%

She got back:

C:\app\product\12.1.0\client_1\network\admin

And I got back:

C:\app\product\19.3.0\client_1\network\admin

My hypothesis was that IT installed a new version of the Oracle client and this process updated the %TNS_ADMIN% variable.

What the patching didn’t do was move the tnsnames.ora file to the new directory, C:\app\product\19.3.0\client_1\network\admin.

When Power BI tried to run an Oracle query, it used the new client installation and hence could not find the tnsnames.ora file.

After copying the tnsnames.ora file into the correct directory, she was able to refresh her Power BI dashboard right away.
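
For next time, a quick check along these lines would confirm the diagnosis before going file-hunting.  This is only a minimal sketch: %TNS_ADMIN% is from the story above, the rest is plain standard-library Python.

import os

# Where the current Oracle client expects to find tnsnames.ora (per %TNS_ADMIN%)
tns_admin = os.environ.get("TNS_ADMIN")

if not tns_admin:
    print("TNS_ADMIN is not set - the client will fall back to its default admin directory")
else:
    tns_file = os.path.join(tns_admin, "tnsnames.ora")
    if os.path.isfile(tns_file):
        print("Found " + tns_file)
    else:
        print("TNS_ADMIN points at " + tns_admin + " but tnsnames.ora is missing - copy it over")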

Qlik Replicate – “Json doesn’t start with ‘{‘ [xxxxxxx] (at_cjson.c:1773)” error

Had a shocker of a week.

You know those weeks, where everything goes wrong.

Busy fixing other systems after operating system patching by our IT team, I didn’t look closely at our Qlik Replicate nodes, which had been running smoothly over the past year.

After all, there were no alerts, and at a quick glance all our tasks were in a running status; none were suspended or in error.

The next day one of our junior admins pointed out some Qlik tasks using a high amount of memory.

I looked in and my stomach dropped.

Although the task was “green” and running, no changes were getting through to the destinations (AWS S3 and GCS).  The log file was filled with errors like:

00002396: YYYY-MM-DDT15:21:14 [AT_GLOBAL ]E: Json doesn't start with '{' [xxxxxxx] (at_cjson.c:1773)
00002396: YYYY-MM-DDT15:21:14 [AT_GLOBAL ]E: Cannot parse json: [xxxxxxx(at_protobuf.c:1420)

And hundreds of thousands of transactions were waiting to be written out.

The problem only existed on one QR cluster, and only for jobs writing to AWS S3 and GCS; the Kafka job was fine.  The other QR clusters were running fine.

The usual “turn it off and on again” didn’t work: neither stopping and resuming the task nor restarting the server helped.

In the end I contacted Qlik Support.

They hypothesised that the blanket patching caused the Qlik Replicate cluster to fail over and corrupt the captured changes that were stored up waiting to be written out in the next batch process.  When QR tried to read the captured changes, the json was corrupted.

Their fix strategy was:

  1. Stop the task
  2. Using the log file, find the last successful time or stream position for the task.  This is usually found near the end of the log file.
  3. Using the Run -> Advanced Run Options dialog, restart the task from that time or position.

If this didn’t work, they recommended rebuilding the whole task and then following the above steps.

Luckily their first steps worked.  After finding the correct timestamps we could restart the QR tasks from the correct position.

Now I’m looking into some alerting to catch this problem earlier next time.
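
As a first pass I’m thinking of something simple that scans the task logs for that error string.  This is only a sketch; the log directory and file pattern are assumptions for our environment, not a documented Qlik location.

import glob
import os

# Assumed location of the Qlik Replicate task logs on our nodes - adjust to suit your install
LOG_GLOB = r"C:\Program Files\Attunity\Replicate\data\logs\*.log"
ERROR_MARKERS = ("Json doesn't start with '{'", "Cannot parse json")

def tasks_with_json_errors(log_glob=LOG_GLOB):
    """Return the log files whose most recent lines contain the corrupted-json errors."""
    flagged = []
    for log_file in glob.glob(log_glob):
        with open(log_file, "r", errors="ignore") as f:
            tail = f.readlines()[-200:]   # only the latest lines matter for alerting
        if any(marker in line for line in tail for marker in ERROR_MARKERS):
            flagged.append(os.path.basename(log_file))
    return flagged

if __name__ == "__main__":
    for name in tasks_with_json_errors():
        print("ALERT: " + name + " is logging corrupted json errors")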

Python – Unpacking a COMP-3 number

New business requirement today.  We have some old mainframe files we have to run through Qlik Replicate.

Three problems we have to overcome:

  1. The files are in fixed width format
  2. The files are in EBCDIC format, specifically code page 1047
  3. Inside the records there are COMP-3 packed fields

The mainframe team kindly provided us with a schema file that showed how many bytes make up each field, so we could divide up the fixed-width file by reading a set number of bytes per field.

Python provides a decode function to convert the fields we read into a readable format:

focus_data_ascii = focus_data.decode("cp1047").rstrip()
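
To give an idea of how the schema file gets used, here is roughly how a record is cut up.  The field names and widths below are made up for illustration; only the cp1047 decode comes from the real code.

# Hypothetical schema: (field name, width in bytes) as listed in the mainframe schema file
SCHEMA = [("CUST_ID", 8), ("CUST_NAME", 20), ("BALANCE", 5)]

def split_record(record_bytes):
    """Slice one fixed-width EBCDIC record into named fields."""
    fields = {}
    offset = 0
    for name, width in SCHEMA:
        focus_data = record_bytes[offset:offset + width]
        if name == "BALANCE":
            fields[name] = focus_data                             # COMP-3 field - keep the raw bytes for later
        else:
            fields[name] = focus_data.decode("cp1047").rstrip()   # plain text - decode from EBCDIC
        offset += width
    return fields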

The hard part was the COMP-3 packed fields.  They are built with some bit magic, and working with bits and shifts is not my strongest suit.

I have been a bit spoilt so far working with Python, where most problems can be solved by “finding the module to do the magic for you.”

But after ages of scouring for a module to handle the conversion for me, I had only false leads, testing questionable code with no luck.

Eventually I stumbled upon:

zorchenhimer/cobol-packed-numbers.py

Thank goodness.

It still uses bit ‘n’ shift magic, but it works on the data that I have, and I now have readable text.

I extended the code to fulfil the business requirements:

# Source https://gist.github.com/zorchenhimer/fd4d4208312d4175d106
from array import array

def unpack_number(field, no_decimals):
    """ Unpack a COMP-3 (packed decimal) number. """
    a = array('B', field)
    value = float(0)

    # For all but last digit (half byte)
    for focus_half_byte in a[:-1]:
        value = (value * 100) + ( ( (focus_half_byte & 0xf0) >> 4) * 10) + (focus_half_byte & 0xf)

    # Last digit
    focus_half_byte = a[-1]
    value = (value * 10) + ((focus_half_byte & 0xf0) >> 4)

    # Negative/Positive check.  If the sign nibble is 0xd, it is a negative value
    if (focus_half_byte & 0xf) == 0xd:
        value = value * -1

    # If no_decimals = 0; it is just an int
    if no_decimals == 0:
        return_int = int(value)
        return (return_int)
    else:
        return_float = value / pow(10, no_decimals)
        return (return_float)
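
A quick sanity check with a hand-built packed value (the bytes below are made up for illustration):

# 0x12 0x34 0x5C packs the digits 12345 with a positive sign nibble (0xC)
print(unpack_number(b'\x12\x34\x5c', 2))   # 123.45
# A sign nibble of 0xD flags a negative value
print(unpack_number(b'\x12\x34\x5d', 2))   # -123.45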

Qlik Replicate – “SYS-E-HTTPFAIL, SYS-E-UNRECREQ, Unrecognized request pattern” problem

After a weekend of patching in our dev environment, we came in to discover that half of our Qlik Replicate nodes were offline in Enterprise Manager, with the following error message:

4 2022-08-08 13:19:24 [ServerDto      ] [ERROR] Test connection failed for server:MY_QRNODE. Message:'SYS-E-HTTPFAIL, SYS-E-UNRECREQ, Unrecognized request pattern 'GET: /servers/local/adusers/devdomain/qlik_user?directory_lookup=false'..'.

We could still access each node through its individual web GUI, and the tasks were still happily running in the background.  The other half of our QR nodes were still online in Enterprise Manager.

The usual troubleshooting of restarting the services and restarting the server didn’t fix it, and the patching team washed their hands of the problem.  (I can’t really list the patches they applied due to security issues.)

A database administrator mentioned it might be a TLS problem, as the server team had been changing TLS settings in various places.  This led me to compare the connection settings between a QR node that was working and one that was not (we also checked other things like patch levels).

Some servers had the username in domain\username format, while others didn’t.  Coincidentally, the ones with the domain\username format were the ones not working.

Removing the domain section of the username resolved the problem.

So I’m not sure what in the patching caused this issue, but it is something we will have to check on our prod servers before the patches are applied there.

The Fake LEFT OUTER JOIN

In my role I review a wide variety of SQL code.

Some of it is highly professional and above my skill level, so I absorb as much learning from it as I can.

On the other side are self-taught SQLers from the business world, building weird and wonderful reports with SQL held together with sticky tape, who don’t truly comprehend the code they have written.

An example – I see the following a lot from people who don’t understand the data and are trying to program stability into their code, like this:

SELECT 
    A.BDATE,
    A.ID,
    A.CUST_ID,
    C.CUSTOMER_NAME,
    C.PHONE
FROM dbo.ACCOUNTS A
LEFT OUTER JOIN dbo.CUSTOMER C
ON   C.BDATE = @DATE AND
     C.CUST_ID = A.CUST_ID
WHERE
    A.BDATE = @DATE AND
    A.ACCOUNT_TYPE = 'XX';

This means that if for some reason there is an integrity problem between ACCOUNTS and CUSTOMER, they will still get the account data on the report.

The problem arises when the business requirements of the report evolve and the developer needs to filter on the left-joined table.

For instance – “Report on Accounts on type XX but remove deceased customers so we don’t ring them” 

I quite often see this:

SELECT 
    A.BDATE,
    A.ID,
    A.CUST_ID,
    C.CUSTOMER_NAME,
    C.PHONE
FROM dbo.ACCOUNTS A
LEFT OUTER JOIN dbo.CUSTOMER C
ON   C.BDATE = @DATE AND
     C.CUST_ID = A.CUST_ID
WHERE
    A.BDATE = @DATE AND
    A.ACCOUNT_TYPE = 'XX' 

    AND

    C.DECEASED = 'N';

Without knowing it, the developer has broken the “get all type XX accounts” business rule by effectively turning their LEFT OUTER JOIN into an INNER JOIN.

The correct way to write the query is:

SELECT 
    A.BDATE,
    A.ID,
    A.CUST_ID,
    C.CUSTOMER_NAME,
    C.PHONE
FROM dbo.ACCOUNTS A
LEFT OUTER JOIN dbo.CUSTOMER C
ON  C.BDATE = @DATE AND
    C.CUST_ID = A.CUST_ID

    -------------------------------- 
    AND

    C.DECEASED = 'N'
    -------------------------------- 
WHERE
    A.BDATE = @DATE AND
    A.ACCOUNT_TYPE = 'XX' ;

You can get a good gauge of a developer’s skill level by checking their LEFT OUTER JOINs.  If they are using them on every join, even where you know there is integrity between the two data objects, you can make an educated guess that you’re reviewing a novice developer.

This is where you have to pay extra attention to their joins and WHERE predicates, and make them aware of the consequences of the code.

Qlik Replicate – The Wide Text file problem

A wide problem

We had a new business requirement for our team: create a source for a CSV file through Qlik Replicate.

OK – simple enough: create the Source Connector and define the file’s schema in the table section of the connector.

But the file was 250 columns wide.

I have had nightmares before of defining file schemas in programs like SSIS, where one little mistake can lead to hours of debugging.

A stroke of luck

Luckily we knew the CSV file was the output of a database view, so the first part of the problem was solved – we had the schema.  Now to import the schema into a Qlik Replicate task.

  1. Create a new File Source connector and prefill as much known information as you can
  2. In the Tables definition of the File Source, add a dummy field
  3. Create a NULL Target connector
  4. Create a dummy task with your new File Source and the NULL Target, and save it
  5. Export the task with endpoints to save out the json definition of the task

Creating the table definition in JSON

Knowing the view definition, a simple Python script can convert it to json:

import csv
import json

if __name__ == '__main__':
    source_file = "FileDef.txt"
    field_array = []

    # FileDef.txt is tab delimited: column name, data type, nullable, column order
    with open(source_file, 'r') as f:
        reader = csv.reader(f, delimiter='\t')
        schema_items = list(reader)

    for item in schema_items:
        focus_item = {}
        focus_item["name"] = item[0]
        focus_item["nullable"] = True
        focus_string = item[1]

        if focus_string.find("NUMBER") != -1:
            focus_item["type"] = "kAR_DATA_TYPE_NUMERIC"
            tokens = focus_string[focus_string.find("(") + 1 : -1].split(",")

            precision = tokens[0]
            focus_item["precision"] = int(precision)

            # A scale is only present for NUMBER(p,s) definitions
            if len(tokens) == 2:
                scale = tokens[1]
                focus_item["scale"] = int(scale)

        elif focus_string.find("VARCHAR2") != -1:
            focus_item["type"] = "kAR_DATA_TYPE_STR"
            length = focus_string[focus_string.find("(") + 1 : focus_string.find("BYTE") - 1]
            focus_item["length"] = int(length)

        elif focus_string.find("DATE") != -1:
            focus_item["type"] = "kAR_DATA_TYPE_TIMESTAMP"

        field_array.append(focus_item)

    columns = {}
    columns["columns"] = field_array

    with open("out.json", "w") as f:
        f.write(json.dumps(columns, indent=4))

FileDef.txt

The following text file contains the schema of the view, with the tab-delimited columns of:

  • Column Name
  • Data type
  • Nullable
  • Column order
UDF_VARCHAR1 VARCHAR2(200 BYTE) Yes 1
UDF_VARCHAR2 VARCHAR2(200 BYTE) Yes 2
UDF_VARCHAR3 VARCHAR2(200 BYTE) Yes 3
UDF_NUMERIC1 NUMBER(10,6) Yes 4
UDF_NUMERIC2 NUMBER(10,6) Yes 5
UDF_NUMERIC3 NUMBER(10,6) Yes 6
UDF_INT1 NUMBER(10,0) Yes 7
UDF_INT2 NUMBER(10,0) Yes 8
UDF_INT3 NUMBER(10,0) Yes 9
UDF_DATE1 DATE Yes 10
UDF_DATE2 DATE Yes 11
UDF_DATE3 DATE Yes 12

This particular view definition is from Oracle, so it uses Oracle data types, but it would be simple to change the code over to use a definition from an MS SQL database or another source.
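
For reference, the first VARCHAR2 and the first NUMBER rows above come out of the script looking like this in out.json (abridged):

{
    "columns": [
        {
            "name": "UDF_VARCHAR1",
            "nullable": true,
            "type": "kAR_DATA_TYPE_STR",
            "length": 200
        },
        {
            "name": "UDF_NUMERIC1",
            "nullable": true,
            "type": "kAR_DATA_TYPE_NUMERIC",
            "precision": 10,
            "scale": 6
        }
    ]
}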

Sticking it all back together

  1. Open up the exported task json file
  2. Find the dummy field that you created in the table definition
  3. Overwrite the dummy field with the json created by the script
  4. Save and reimport the task.

If all goes well, the File Source connector will be overwritten with the full table definition.
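
If hand-editing a 250-column blob of json sounds error-prone, the splice can be scripted as well.  This is only a sketch: the dummy field name and file names are placeholders, and the exact nesting of the exported task json varies, which is why it just searches recursively for the columns list containing the dummy field.

import json

DUMMY_COLUMN = "DUMMY_FIELD"   # placeholder name for the dummy field added to the File Source table definition

def splice_columns(node, new_columns):
    """Recursively find the columns list containing the dummy field and replace it."""
    if isinstance(node, dict):
        for key, value in node.items():
            if (key == "columns" and isinstance(value, list)
                    and any(isinstance(col, dict) and col.get("name") == DUMMY_COLUMN for col in value)):
                node[key] = new_columns
                return True
            if splice_columns(value, new_columns):
                return True
    elif isinstance(node, list):
        return any(splice_columns(item, new_columns) for item in node)
    return False

with open("exported_task.json") as f:          # the task exported with endpoints
    task = json.load(f)
with open("out.json") as f:                    # the definition built by the earlier script
    new_columns = json.load(f)["columns"]

if splice_columns(task, new_columns):
    with open("exported_task_full.json", "w") as f:
        json.dump(task, f, indent=4)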