
Qlik Enterprise Manager – Upgrading from 2021.05 to 2022.11

Our developers wanted to take advantage of new features in Qlik Enterprise Manager (QEM), so we had a business requirement to upgrade our system from 2021.05 to 2022.11.

This was not a smooth process, and we encountered several issues along the way.

First Attempt – Straight Upgrade

Simple to upgrade?

Just download the latest installer from Qlik’s website and kick it off?

Unfortunately, I was busy with other tasks before Christmas, so I gave the developers the keys to our dev QEM box to do the upgrade.

They gave it a go without a backout plan.

A couple of days later I tried logging into the QEM console and couldn’t get on.

The AttunityEnterpriseManager service kept restarting every minute, and the following errors were in C:\Program Files\Attunity\Enterprise Manager\data\logs\EnterpriseManager.log:

[START: 2023-03-01 09:40:18 Attunity EnterpriseManager Version: 2022.11.0.335]
1 2023-01-02 09:40:18 [Command ] [INFO ] Executing ServiceRunCommand command.
6 2023-01-02 09:40:19 [Host ] [INFO ] Setting up web host
6 2023-01-02 09:40:20 [Host ] [INFO ] The server will listen on the following location: http://xxxxxxx.xxx.xxx/attunityenterprisemanager
6 2023-01-02 09:40:20 [Host ] [INFO ] The server will listen on the following location: https://xxxxxx.xxx.xxx/attunityenterprisemanager
6 2023-01-02 09:40:20 [Host ] [ERROR] Object reference not set to an instance of an object.
6 2023-01-02 09:40:20 [Host ] [INFO ] Stopping service on error...

I tried the usual desperation tricks of restarting the service and then restarting the box.

In the end I had to uninstall QEM and reinstall 2021.05 as there was no backout plan.

The developers were subsequently banned from doing any more work on the QEM upgrade.

Second Attempt – Two-Jump Upgrade

Next attempt was upgrading to 2022.05 and then jumping to 2022.11.

This time I took a backup of C:\Program Files\Attunity\Enterprise Manager\data so I could easily back out the change if I had problems.

The upgrade from 2021.05 to 2022.05 went smoothly, but when jumping from 2022.05 to 2022.11 the same error occurred.

I then raised a case with Qlik and also posted on their forum asking if anyone else had come across this problem.

Final Success

With help from Qlik support we developed an upgrade workaround.  Qlik said that a patch is in the works, as other users have had the same problem.

If you follow along – note that this method might differ for your system.  In the three upgrades I have done internally, they all differed slightly.

Make sure you TEST and HAVE A BACKOUT PLAN

Upgrade Prerequisites

This upgrade was done on a Windows 2019 server.

You will need:

  • Admin rights to QEM’s server
  • Licenses for QEM and Analytics
  • Username/password that AEM uses to connect to the analytics database
  • A privileged user account on the analytics database
  • Password for the user that you use to connect QR nodes to QEM
  • A SQLite database client
  • Handy to have – an editor that can validate JSON

Method

  1. Log on to QEM console
  2. For each server under the server tab – right-click and go to “Stop Monitoring”
  3. Remote desktop to the QEM server
  4. Stop the AttunityEnterpriseManager service
  5. Back up C:\Program Files\Attunity\Enterprise Manager\data
  6. Run the upgrade to QEM 2022.05
  7. Once the upgrade is done, log onto QEM and ensure that version 2022.05 is working correctly
  8. Back up C:\Program Files\Attunity\Enterprise Manager\data again
  9. Stop the AttunityEnterpriseManager service again
  10. Run the following commands from the command line:
    cd "C:\Program Files\Attunity\Enterprise Manager\bin"
    aemctl repository export -f C:\temp\qem_repo_202205.json -w

  11. Script out the Postgres database with the following commands:
    cd "C:\Program Files\PostgreSQL\10\bin"
    pg_dump -U <username> -s -C -c <analyticdbname> > C:\temp\cdcanalyticsp.sql

  12. Edit C:\temp\cdcanalyticsp.sql.  Find and replace "<analyticdbname>" with "cdcanalytics_2022_11_p".  Save the file
  13. Run the following command to create a new temporary database on the postgres server:
    psql -d postgres -U <username> -f C:\temp\cdcanalyticsp.sql

  14. This is where the problem lies in our instance.  Open up C:\temp\qem_repo_202205.json.
    Find any code blocks with "type": "AttunityLicense" in them and remove the whole code block.  Save the edited file as C:\temp\qem_repo_202205_Fix.json.  (A rough script to automate this edit is sketched just after this method.)
    1. On our dev system it was the section with "name": "Replication Analytics", but in prod it was the section with "name": "Replication Management".
    2. You may need to experiment to find the correct section
  15. Go to Add/Remove Programs and uninstall Enterprise Manager
  16. Once removed, delete the folder C:\Program Files\Attunity\Enterprise Manager\data
  17. Install QEM 2022.11.  When prompted, do not install Postgres
  18. Log into QEM and check that it is up and running correctly.
  19. Stop the AttunityEnterpriseManager service again
  20. Run the following commands from the command line:
    cd "C:\Program Files\Attunity\Enterprise Manager\bin"
    aemctl repository import -f C:\temp\qem_repo_202205_Fix.json -w

  21. Start the AEM service and check that it is running correctly
  22. Re-add the QEM and Analytics licenses
  23. Go to Settings -> Repository connections.
    Enter the username and password, and set the database to cdcanalytics_2022_11_p
  24. Initialize the repository and start the connector
  25. Stop the AttunityEnterpriseManager service again
  26. Open up C:\Program Files\Attunity\Enterprise Manager\data\Java\GlobalRepository.sqlite in an SQLite editor
  27. Browse the objects table
    Look for the row with name = “AnalyticsConfiguration” and change the JSON value “cdcanalytics_2022_11_p” back to the original analytics database name
  28. Save the SQLite database
  29. Start the AEM service and check that it is running correctly
  30. Go to the server tab.  Edit each server and re-add the password.  Click on the button “Monitor this server’s tasks and messages”.  Test the connection and click OK
  31. PVT QEM
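
If you have several license blocks to hunt down in step 14, a rough Python sketch like the one below can do the edit for you.  This is my own scratch script rather than anything from Qlik – the repository JSON layout may differ between versions, so test it against your own export first.  Note it strips every "AttunityLicense" block it finds; as per step 14, you may only need to remove one of them.

import json

# Strip any object with "type": "AttunityLicense" from the exported repository.
# File names match the method above; adjust to suit your system.
with open(r"C:\temp\qem_repo_202205.json", "r", encoding="utf-8") as f:
    repo = json.load(f)

def strip_licenses(node):
    """Recursively drop any list entry whose "type" is "AttunityLicense"."""
    if isinstance(node, dict):
        return {k: strip_licenses(v) for k, v in node.items()}
    if isinstance(node, list):
        return [strip_licenses(item) for item in node
                if not (isinstance(item, dict) and item.get("type") == "AttunityLicense")]
    return node

with open(r"C:\temp\qem_repo_202205_Fix.json", "w", encoding="utf-8") as f:
    json.dump(strip_licenses(repo), f, indent=4)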

If things go wrong

Backout process

  1. Uninstall QEM
  2. Delete the folder C:\Program Files\Attunity\Enterprise Manager\data
  3. Reinstall either QEM 2021.05 or QEM 2022.05
  4. Stop the AttunityEnterpriseManager service
  5. Restore the appropriate backed up data folder to C:\Program Files\Attunity\Enterprise Manager\data
  6. Restart the AEM service

Facebook links + Former Girlfriend and Scam Weight loss pills

I swear I am working hard this close to Christmas!

But I did happen to check my Facebook account and saw a suspicious post that has been getting around my feed.

I have seen posts like this before.  A public post with a couple of dozen people tagged in it.  In this case, one of the people tagged in the post was my (very nice) former girlfriend.

I have seen variations of posts like this with sunglasses and shoes, but this seems to be a new type that has hit my news feed.

But this one intrigued me.  The picture looks very convoluted, with one image superimposed over another.  If you look carefully you can see tiled floors and other architecture.  I am guessing the author did this to throw off image-matching software that FB might use to match known scams or copyrighted images.

Looking down the Rabbit hole

I am not a security or a web design expert but decided to check out the website.

I suspect it is a Facebook worm of some sort.

One thing I didn’t want to do when investigating the website was trigger the worm’s propagation process and make matters even worse.

As a precaution I opened the link on my home computer in an incognito window with NoScript running.  The last thing I wanted to do was blow up my work computer by opening a malicious link.

Here is the website in all its glory:

The short link resolved to a fuller domain name and I ran a whois check to find out more.

 

This actually surprised me.  Other scam sites that I have checked with whois usually have domain names registered quite recently.

For instance, with a phishing email I got the other day, the domain name was only a day old when I looked it up in whois.

Checking out the source code behind the site – the actual HTML code was very neat and tidy.  Nice – I like well-formed code.

Other things I noted:

  • All the main menu links jump to a particular spot on the page, which links off to another website where you can enter contact details like name, address, phone, credit card number, etc.  I didn’t go any deeper to see the response to putting in fake details, as this was out of my depth.
  • There were Chinese characters in the HTML comments.  This is by no means implying that people who write comments in Chinese are dodgy – but it seems unusual that a “US Today” website would have foreign-language comments in it.

Image searching

The website was filled with “Before and After” images and testimonials for their product.

Using Google Lens, I searched some of these images.

Our dubious website:

Google Lens came up with:

Google Lens found the image on a Dutch city website for a nutritional consultant, Stein van Vida Vitaal.  Our subject’s name has changed from “Gerald” to “Fred”.

One more check for the end of the day was our “Before and After” bikini model photo in the comments:

Google Lens returned an Insider article by Emily DiNuzzo about “Before and After Body image Photos”.

The author of this website actually got the photo from Shutterstock, and she referenced it (link broken) on the website.

Summary

A proper web developer and security expert could probably pull apart this website and explain how the malicious part of it works – but I am not at that level.

One of the biggest risks to security is the component that sits between the computer keyboard and the chair.  

For all the diving into the code, checking domains and looking up reused images over the past half an hour, the biggest thing that stood out to me is the language in the Facebook post:

“I use it. I didn’t change my diet or exercise. Photos before and after use”

It doesn’t seem right.  The post is very vague. 

What did she use?  Why aren’t there more details?

If she lost a heap of weight, wouldn’t her language be prouder?

More descriptive?

Wouldn’t she be showing photos of herself instead of some convoluted photo?

Why is my former girlfriend tagged in this post?  She doesn’t have the personality to use questionable diet products.  If the poster is her friend – would she really share something in this manner?  If it was for my former girlfriend – why not send it in a private message or a group chat?

Don’t make hackers’ lives easy.  Think before you click.

Why can’t Qlik Replicate and Enterprise Manager be friends?

A nice quiet evening with the family was interrupted with a critical alert coming through my phone.

Once again server patching had knocked off our Qlik Replicate node.

Logging in, I could see that Enterprise Manager could not reach one of our nodes, with the following error message in the log:

2022-12-15 19:15:40 [ServerDto ] [ERROR] Test connection failed for server:MY_QR_CLUSTER. Message:'Unable to connect to the remote serverA connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond xxx.xxx.xxx.xxx:443'.

I have seen this problem before and it is usually resolved by:

  1. Failing over the QR windows cluster
  2. Restarting the new passive node

Our IT team is aware of the problem and has been researching a cause and a fix.  Their prevailing theory was that when the cluster fails over during server patching, some residual connections to the previous active node remain.

But tonight, after multiple failovers and stopping and starting the QR roles, Enterprise Manager still couldn’t connect to that QR node.

I did the following checks:

  1. The repsrv.log log file had no error messages and the Web UI service was active
  2. From the Enterprise Manager server, I could ping the QR cluster address and the active node successfully
  3. From a Chrome session on the Enterprise Manager server, I could not get to the QR web console
  4. From a Chrome session on the QR server, I could get to the QR web console

A senior IT member joined the troubleshooting call and suggested that we reboot the Enterprise manager server.

So we did – and then we couldn’t access the Enterprise Manager console at all.

At this point I wanted to quit IT and become a nomad in Mongolia.

Then the senior IT member worked it out.

The Windows server was randomly turning on the Windows Firewall.

This was blocking our inbound connections, making the console inaccessible from other locations – except when you were logged onto the server itself.

This also explains why, when this problem previously arose, restarting the server would eventually work: the server group policy would eventually get applied and turn off the Windows Firewall.

Lessons learnt

If you come across this problem in your environment, try accessing the QR console from multiple locations:

  • From the Enterprise Manager server
  • From within the QR server with a local address like: https://localhost/attunityreplicate/login/
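
One more check worth adding to that list: see whether the Windows Firewall has quietly been switched back on.  From an elevated command prompt on the Enterprise Manager server:

netsh advfirewall show allprofiles

If the profiles report “State ON” when your group policy says they should be off, you have probably hit the same problem we did.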

Good luck

Power BI and the lost tnsnames.ora file

A distraught user contacted me today with a problem: her Power BI dashboard was not refreshing and was returning the following error:

OLE DB:ORA-12154: TNS:could not resolve the connect identifier specified

The problem was not isolated to her, but common across all her team.

I’m not an expert on Oracle and Power BI, so I kindly suggested she contact the Helpdesk.  She came back a few minutes later saying that the Helpdesk had just passed her off unceremoniously to another team.

Deciding to give her a hand – I had a think about the error message.

TNS files seem to end up in all sorts of weird and wonderful locations on our organisation’s systems.

I also remembered that the Oracle client had been patched just last week.

I asked her to do a search for tnsnames.ora on her C drive.

Meanwhile I ran the following command on my own computer, which should have a similar configuration:

echo %TNS_ADMIN%

She got back:

C:\app\product\12.1.0\client_1\network\admin

And I got back:

C:\app\product\19.3.0\client_1\network\admin

My hypothesis was that IT installed a new version of the Oracle client, and this process updated the %TNS_ADMIN% variable.

What the patching didn’t do was move the tnsnames.ora file to the new directory, C:\app\product\19.3.0\client_1\network\admin.

When Power BI tried to run an Oracle query, it used the new client installation and hence could not find the tnsnames.ora file.
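
For reference, the fix amounted to copying the file across to the new client’s admin directory – something like the line below, with the paths obviously depending on your old and new client versions:

copy "C:\app\product\12.1.0\client_1\network\admin\tnsnames.ora" "C:\app\product\19.3.0\client_1\network\admin"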

After copying the TNS file into the correct directory, she was able to rerun her Power BI dashboard right away.

Qlik Replicate – “Json doesn’t start with ‘{‘ [xxxxxxx] (at_cjson.c:1773)” error

Had a shocker of a week.

You know those weeks, where everything went wrong.

I was busy fixing other systems, so after operating system patching by our IT team I didn’t look closely at our Qlik Replicate nodes, which had been running smoothly over the past year.

After all, there were no alerts, and at a quick glance all our tasks were in a running status – none were suspended or in error.

The next day, one of our junior admins pointed out some Qlik tasks using a high amount of memory.

I looked in and my stomach dropped.

Although the task was “green” and running, no changes were getting through to the destinations (AWS S3 and GCS).  The log file was filled with errors like:

00002396: YYYY-MM-DDT15:21:14 [AT_GLOBAL ]E: Json doesn't start with '{' [xxxxxxx] (at_cjson.c:1773)
00002396: YYYY-MM-DDT15:21:14 [AT_GLOBAL ]E: Cannot parse json: [xxxxxxx(at_protobuf.c:1420)

And hundreds of thousands of transactions were waiting to be written out.

The problem only existed on one QR cluster, and only for jobs writing to AWS S3 and GCS – the Kafka job was fine.  The other QR clusters were running normally.

The usual “turn it off and on again” didn’t work, whether stopping and resuming the task or restarting the server.

In the end I contacted Qlik Support.

They hypothesised that the blanket patching caused the Qlik Replicate cluster to fail over and corrupt the captured changes stored up waiting to be written out in the next batch process.  When QR tried to read the captured changes, the JSON was corrupted.

Their fix strategy was:

  1. Stop the task
  2. Using the log file, find the last successful time or stream position that the task processed.  This is usually found at the end of the log files.
  3. Using the Run -> Advanced Run option, restart the task from the time last written out.

If this didn’t work, they recommended rebuilding the whole task and then following the above steps.

Luckily their first strategy worked.  After finding the correct timestamps we could restart the QR tasks from the correct position.

Now I am looking into some alerting to stop this problem going unnoticed again.

Python – Unpacking a COMP-3 number

New business requirement today.  We have some old mainframe files we have to run through Qlik Replicate.

Three problems we have to overcome:

  1. The files are in fixed width format
  2. The files are in EBCDIC format, specifically code page 1047
  3. Inside the records there are comp-3 packed fields 

The mainframe team kindly provided us with a schema file showing how many bytes make up each field, so we could divide up the fixed-width file by reading a certain number of bytes per field.

Python provides a decode function to convert the fields we read into a readable format:

focus_data_ascii = focus_data.decode("cp1047").rstrip()
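
Splitting the fixed-width records was then just a matter of walking the schema and slicing off the right number of bytes per field.  Here is a minimal sketch of that approach – the field names and lengths are made up for illustration, not our real schema:

# Hypothetical (field name, byte length) pairs taken from a schema file
schema = [("CUST_ID", 8), ("NAME", 20), ("BALANCE", 5)]

def split_record(record: bytes) -> dict:
    """Slice one fixed-width record into raw byte fields."""
    fields = {}
    offset = 0
    for name, length in schema:
        fields[name] = record[offset:offset + length]
        offset += length
    return fields

Text fields can then be decoded with .decode("cp1047") as above, while the packed fields need the COMP-3 handling described below.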

The hard part now was the COMP-3 packed fields.  They are made up with some bit magic, and working with bits and shifts is not my strongest suit.

I have been a bit spoilt so far working with Python, where most problems can be solved by “finding the module to do the magic for you”.

But after ages of scouring for a module to handle the conversion for me, I had a lot of false leads – testing questionable code with no luck.

Eventually I stumbled upon:

zorchenhimer/cobol-packed-numbers.py

Thank goodness.

It still works with bits ’n’ shift magic – but it works on the data that I have, and I now have readable text.

I extended the code to fulfil the business requirements:

# Source https://gist.github.com/zorchenhimer/fd4d4208312d4175d106
from array import array

def unpack_number(field, no_decimals):
    """ Unpack a COMP-3 number. """
    a = array('B', field)
    value = float(0)

    # For all but the last byte: each byte packs two digits
    for focus_half_byte in a[:-1]:
        value = (value * 100) + ( ( (focus_half_byte & 0xf0) >> 4) * 10) + (focus_half_byte & 0xf)

    # Last byte: one digit in the high nibble, sign in the low nibble
    focus_half_byte = a[-1]
    value = (value * 10) + ((focus_half_byte & 0xf0) >> 4)

    # Negative/positive check.  If the sign nibble is 0xd, the value is negative
    if (focus_half_byte & 0xf) == 0xd:
        value = value * -1

    # If no_decimals = 0; it is just an int
    if no_decimals == 0:
        return_int = int(value)
        return (return_int)
    else:
        return_float = value / pow(10, no_decimals)
        return (return_float)
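
A quick sanity check with a hand-packed value (my own test data, not from the gist): the bytes 0x12 0x34 0x5D pack the digits 1-2-3-4-5 with a 0xD (negative) sign nibble, so with two implied decimals the result should be -123.45.

print(unpack_number(bytes([0x12, 0x34, 0x5D]), 2))  # -123.45
print(unpack_number(bytes([0x00, 0x12, 0x3C]), 0))  # 123 (0xC sign nibble = positive)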

Qlik Replicate – “SYS-E-HTTPFAIL, SYS-E-UNRECREQ, Unrecognized request pattern” problem

After a weekend of patching in our dev environment, we came in to discover that half of our Qlik Replicate nodes were offline in Enterprise Manager, with the following error message:

4 2022-08-08 13:19:24 [ServerDto      ] [ERROR] Test connection failed for server:MY_QRNODE. Message:'SYS-E-HTTPFAIL, SYS-E-UNRECREQ, Unrecognized request pattern 'GET: /servers/local/adusers/devdomain/qlik_user?directory_lookup=false'..'.

We could still access the nodes through their individual web GUIs, and the tasks were still happily running in the background.  The other half of our QR nodes were still online in Enterprise Manager.

The usual troubleshooting of restarting the services and restarting the server didn’t fix the problem, and the patching team washed their hands of it.  (I can’t really list the patches they applied due to security concerns.)

A database administrator mentioned it might be a TLS problem, as the server team had been changing TLS settings in various places.  This led me to compare the connection settings between a QR node that was working and one that was not (we also checked other things like patch levels).

Some servers had the username in domain\username format, while others didn’t.  Coincidentally, the ones with the domain\username format were the ones not working.

Removing the domain section of the username resolved the problem.

So I am not sure what in the patching caused this issue, but it is something we have to check on our prod servers before the patches are applied to them.

The Fake LEFT OUTER JOIN

In my role I review a wide variety of SQL code.

Some is highly professional and above my skill level, so I absorb as much learning as I can from it.

On the other side are self-taught SQLers from the business world, trying to build weird and wonderful reports with SQL held together with sticky tape, without truly comprehending the code they have written.

An example – I see the following a lot from people who don’t understand the data and try to program stability into their code, like this:

SELECT 
    A.BDATE,
    A.ID,
    A.CUST_ID,
    C.CUSTOMER_NAME,
    C.PHONE
FROM dbo.ACCOUNTS A
LEFT OUTER JOIN dbo.CUSTOMER C
ON   C.BDATE = @DATE
     AND C.CUST_ID = A.CUST_ID 
WHERE
    A.BDATE = @DATE AND
    A.ACCOUNT_TYPE = 'XX';

This means that if for some reason there is an integrity problem between ACCOUNTS and CUSTOMER, they will still get the account data on the report.

The problem arises when the business requirements of the report evolve and the developer needs to filter on the left-joined table.

For instance – “Report on accounts of type XX but remove deceased customers so we don’t ring them”.

I quite often see this:

SELECT 
    A.BDATE,
    A.ID,
    A.CUST_ID,
    C.CUSTOMER_NAME,
    C.PHONE
FROM dbo.ACCOUNTS A
LEFT OUTER JOIN dbo.CUSTOMER C
ON   C.BDATE = @DATE
     AND C.CUST_ID = A.CUST_ID 
WHERE
    A.BDATE = @DATE AND
    A.ACCOUNT_TYPE = 'XX' 

    AND

    C.DECEASED = 'N';

Without knowing it, the developer has broken their “get all type XX accounts” business rule by effectively turning their LEFT OUTER JOIN into an INNER JOIN.

The correct way to write the query is:

SELECT 
    A.BDATE,
    A.ID,
    A.CUST_ID,
    C.CUSTOMER_NAME,
    C.PHONE
FROM dbo.ACCOUNTS A
LEFT OUTER JOIN dbo.CUSTOMER C
ON  C.BDATE = @DATE
    AND C.CUST_ID = A.CUST_ID 

    -------------------------------- 
    AND

    C.DECEASED = 'N'
    -------------------------------- 
WHERE
    A.BDATE = @DATE AND
    A.ACCOUNT_TYPE = 'XX' ;

You can get a good gauge of a developer’s skill level by checking their LEFT OUTER JOINs.  If they are using them on every join, even where you know there is integrity between the two data objects, you can make an educated guess that you’re reviewing a novice developer.

This is where you have to pay extra attention to their joins and WHERE predicates, and make them aware of the consequences of the code.

Qlik Replicate – The Wide Text file problem

A wide problem

We had a new business requirement for our team: create a new CSV source through Qlik Replicate.

OK – simple enough: create the source connector and define the file’s schema in the table section of the connector.

But the file was 250 columns wide.

I have had nightmares defining file schemas before in programs like SSIS, where one little mistake can lead to hours of debugging.

A Stroke of Luck

Luckily we knew the CSV file was an output from a database view, so the first part of the problem was solved – we had the schema.  Now to import the schema into a Qlik Replicate task.

  1. Create a new File Source connector and prefill as much known information as you can
  2. In the Tables definition of the File Source; add a dummy field
  3. Create a NULL Target connector
  4. Create a dummy task with your new file source and the Null target and save it
  5. Export the task with endpoints to save out the JSON definition of the task

Creating the table definition in JSON

Knowing the view definition, a simple Python script can convert it to JSON.

import csv
import json

if __name__ == '__main__':
    source_file = "FileDef.txt"
    field_array = []

    # Read the tab-delimited schema file
    with open(source_file, 'r') as f:
        reader = csv.reader(f, delimiter='\t')
        schema_items = list(reader)

    for item in schema_items:
        focus_item = {}
        focus_item["name"] = item[0]
        focus_item["nullable"] = True
        focus_string = item[1]

        if focus_string.find("NUMBER") != -1:
            focus_item["type"] = "kAR_DATA_TYPE_NUMERIC"
            # Pull precision (and optional scale) out of e.g. NUMBER(10,6)
            tokens = focus_string[focus_string.find("(") + 1:-1].split(",")
            focus_item["precision"] = int(tokens[0])
            if len(tokens) == 2:
                focus_item["scale"] = int(tokens[1])

        elif focus_string.find("VARCHAR2") != -1:
            focus_item["type"] = "kAR_DATA_TYPE_STR"
            # Pull length out of e.g. VARCHAR2(200 BYTE)
            length = focus_string[focus_string.find("(") + 1:focus_string.find("BYTE") - 1]
            focus_item["length"] = int(length)

        elif focus_string.find("DATE") != -1:
            focus_item["type"] = "kAR_DATA_TYPE_TIMESTAMP"

        field_array.append(focus_item)

    columns = {"columns": field_array}

    with open("out.json", "w") as f:
        f.write(json.dumps(columns, indent=4))

FileDef.txt

The following text file contains the schema of the view, with tab-delimited columns of:

  • Column Name
  • Data type
  • Nullable
  • Column order
UDF_VARCHAR1 VARCHAR2(200 BYTE) Yes 1
UDF_VARCHAR2 VARCHAR2(200 BYTE) Yes 2
UDF_VARCHAR3 VARCHAR2(200 BYTE) Yes 3
UDF_NUMERIC1 NUMBER(10,6) Yes 4
UDF_NUMERIC2 NUMBER(10,6) Yes 5
UDF_NUMERIC3 NUMBER(10,6) Yes 6
UDF_INT1 NUMBER(10,0) Yes 7
UDF_INT2 NUMBER(10,0) Yes 8
UDF_INT3 NUMBER(10,0) Yes 9
UDF_DATE1 DATE Yes 10
UDF_DATE2 DATE Yes 11
UDF_DATE3 DATE Yes 12

This particular view definition is from Oracle, so it uses Oracle data types, but it would be simple to adapt the code to a definition from an MS SQL database or another source.
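
For reference, running the script over the FileDef.txt above produces field entries in out.json along these lines (the first string and first numeric field shown):

{
    "name": "UDF_VARCHAR1",
    "nullable": true,
    "type": "kAR_DATA_TYPE_STR",
    "length": 200
},
{
    "name": "UDF_NUMERIC1",
    "nullable": true,
    "type": "kAR_DATA_TYPE_NUMERIC",
    "precision": 10,
    "scale": 6
}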

Sticking it all back together

  1. Open up the exported task JSON file
  2. Find the dummy field that you created in the table definition
  3. Overwrite the dummy field with the JSON created by the script
  4. Save and reimport the job.

If all goes well, the File source connector will be overwritten with the full table definition.