
Qlik Replicate – MS SQL dates to Confluent Avro

I am just writing a brief post about a conversation I was asked to join.

The question paraphrased:

For dates in a Microsoft SQL Server database, when they get passed through Qlik Replicate to Kafka Avro – how are they interpreted? As an epoch in UTC? Or in the current time zone?

I didn’t know the answer so I created the following simple test:


CREATE TABLE dbo.JD_DATES_TEST
(
 ID INT IDENTITY(1,1) PRIMARY KEY,
 TEST_SMALLDATETIME SMALLDATETIME,
 TEST_DATE date,
 TEST_DATETIME datetime,
 TEST_DATETIME2 datetime2,
 TEST_DATETIMEOFFSET datetimeoffset
);

GO

INSERT INTO dbo.JD_DATES_TEST VALUES(current_timestamp, current_timestamp, current_timestamp, current_timestamp, current_timestamp);

SELECT * FROM dbo.JD_DATES_TEST;
/*
ID          TEST_SMALLDATETIME      TEST_DATE  TEST_DATETIME           TEST_DATETIME2              TEST_DATETIMEOFFSET
----------- ----------------------- ---------- ----------------------- --------------------------- ----------------------------------
1           2025-06-12 12:04:00     2025-06-12 2025-06-12 12:04:16.650 2025-06-12 12:04:16.6500000 2025-06-12 12:04:16.6500000 +00:00

(1 row(s) affected)


*/

On the other side, after passing through Qlik Replicate and on to Kafka in Avro format, we got:

{
  "data": {
    "ID": {
      "int": 1
    },
    "TEST_SMALLDATETIME": {
      "long": 1749729840000000
    },
    "TEST_DATE": {
      "int": 20251
    },
    "TEST_DATETIME": {
      "long": 1749729856650000
    },
    "TEST_DATETIME2": {
      "long": 1749729856650000
    },
    "TEST_DATETIMEOFFSET": {
      "string": "2025-06-12 12:04:16.6500000 +00:00"
    },
    "x_y": {
      "string": "1.0.0"
    }
  },
  "beforeData": null,
  "headers": {
    "operation": "REFRESH",
    "changeSequence": "",
    "timestamp": "",
    "streamPosition": "",
    "transactionId": "",
    "changeMask": null,
    "columnMask": null,
    "transactionEventCounter": null,
    "transactionLastEvent": null
  }
}

So 1749729856650000 (microseconds since the Unix epoch) equals Thursday, June 12, 2025 12:04:16.650 PM – which is the local time that was stored in the source.
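As a quick sanity check (assuming the Avro long is microseconds since the Unix epoch), the value can be decoded back in T-SQL:

DECLARE @avro_micros BIGINT = 1749729856650000;
DECLARE @seconds INT = @avro_micros / 1000000;        -- whole seconds since 1970-01-01
DECLARE @millis  INT = (@avro_micros / 1000) % 1000;  -- left-over milliseconds

-- Decodes to 2025-06-12 12:04:16.650, i.e. the value that was stored in the source
SELECT DATEADD(MILLISECOND, @millis, DATEADD(SECOND, @seconds, '1970-01-01')) AS decoded_value;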

“Idiots doing Idiot things” – The Top Eight Worst SQL Queries Ever

For over fifteen years I have worked with a Microsoft SQL Data Warehouse that is over twenty-five years old.  Although it is ancient in this world of huge cloud-based Data Vaults – it churns out an abundance of value for the organisation I work for.

The problem with this value is that it is open to anyone who wants access to build their queries off it.  This means we get a wide range of SQL queries running on the database – from precise, efficiently built queries that use every optimisation trick in the book, all the way to “WTF are you trying to do?!” queries.

The second category is the one we have the most problems with.  With the limited resources of an on-prem SQL Server, some users write atrocious queries with no apparent care for other people using the database.  There is nothing more frustrating than getting a support call-out because of a failed data load, only to find the table is locked by someone whose query has been running for the past six hours.

To counter this problem, I created a Python script that checks what is running on the database every fifteen minutes.  If a user has a query running for over half an hour, I get an alert showing what they are running.
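A simplified sketch of the DMV check behind that alert (the real script wraps something like this in Python and logs the results to a DB):

-- Anything a user session has had running for more than 30 minutes
SELECT
    s.login_name,
    s.host_name,
    r.start_time,
    r.total_elapsed_time / 60000 AS elapsed_minutes,
    r.blocking_session_id,
    t.text AS running_sql
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s
    ON s.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE s.is_user_process = 1
  AND r.total_elapsed_time > 30 * 60 * 1000;   -- total_elapsed_time is in milliseconds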

This allows our team to:

  1. Work out if someone is running a query that is blocking or taking resources away from our critical processes – resulting in timeouts
  2. Use our experience to help users optimise poorly running queries so they get a better outcome
  3. Stop any stupid queries from running

Since the queries are logged to a DB, I have four years of history covering 14,000 unique queries.  This gives me plenty of insight into what users are running and what problems they face.

From this learning, here are the Top Eight most horrible queries I see running on our database.

8. The “Discovery” Query

This query comes from new and experienced users alike, exploring the data in the database.  They will run something like this:

SELECT *
FROM dbo.A_VERY_LARGE_TABLE;

They look at the data, grab what they need and then just leave the query running in the background in SQL Server Management Studio.  When we contact them an hour later, they act surprised that the query is still running in the background – returning millions upon millions of rows to their client.

This is an easy fix with training.  Using TOP to limit the number of rows returned, using sp_help to return the schema of an object, or providing a sandbox for people to explore in are all relatively simple fixes for this problem.
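For example, using the placeholder table from above:

-- Peek at a sample of rows instead of streaming the whole table to the client
SELECT TOP (100) *
FROM dbo.A_VERY_LARGE_TABLE;

-- Or just inspect the structure without reading any data at all
EXEC sp_help 'dbo.A_VERY_LARGE_TABLE';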

7. The “Application Preview” Query

Our SAS application has a nasty default setting: if a user previews the data in a table, it will try to return the whole table – with the user completely unaware that this is happening.

Even worse, if the application locks up returning the data, the user kills the application from Task Manager, opens it back up and performs the same steps.  Since the first query was not killed, it is still running on the database.  So quite often you will see half a dozen copies of the same query running on the database, all started at different times.

To prevent this, it is important to assess new DB client software to see what it does in the background against the database.  What does “preview” mean?  Return 10, 100 or all of the rows?  Is there a setting that restricts the number of rows returned, or a timeout that stops complex views running forever just to return a small set of data?

If these features are available, it is important to either set them as defaults in the application, or make them part of the initial user setup / training program.

6. The “I am going to copy this to Excel and Analyse the Data” Query

This is similar to the exploratory queries above.  The user is running a query like:

SELECT *
FROM dbo.A_VERY_LARGE_TABLE;

“Why are you running this query?” I politely enquire.
“Oh – I am going to copy the data into excel and search/pivot/do fancy things with it”
“Ummm – this table has 8 billion rows in it…”

I can sympathise with queries like these.

The user thinks, “Why run the same query over and over again to get the data that I want to explore and present in different ways?  Let’s just get one large cut and then manipulate it on the client.”

What users don’t realise is that dealing with a huge amount of data on the client side is difficult.  Exporting data out of the query client into an application like Excel can be frustrating, with memory and CPU limitations compared to a spec’d-out server.  Plus, Excel is not going to handle 8 billion rows.

The requirements behind queries like these need to be analysed – does the user really need that atomic-level data?  Can strategic aggregation tables be provided to the user, so they have a more manageable amount of data to return?  Can different technology, like OLAP cubes built on top of the large tables, answer the user’s needs?
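As an illustration only (the summary table and its columns are made up), a strategic aggregation table can be built once overnight and handed to analysts instead of the atomic rows:

-- Build a daily summary once, rather than every analyst dragging billions of
-- atomic rows back to their client
SELECT
    BUSINESS_DATE,
    BRANCH,
    COUNT(*)    AS TRANSACTION_COUNT,
    SUM(AMOUNT) AS TOTAL_AMOUNT
INTO dbo.A_VERY_LARGE_TABLE_DAILY_SUMMARY
FROM dbo.A_VERY_LARGE_TABLE
GROUP BY BUSINESS_DATE, BRANCH;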

5. The “I got duplicates so I am going to get rid of them with a DISTINCT” Query

This is a pet peeve of mine.

The user has duplicates in their results and doesn’t know why. Instead of investigating why there is duplication (usually the result of a poor join), they just slap a DISTINCT at the top of the query and call it done.  Here is a simplified example that I regularly come across:

DECLARE @ACCOUNTS TABLE
(
	ORG INT,
	ACCOUNT_ID INT,
	BRANCH INT,
	BALANCE NUMERIC(18,2),
	PRIMARY KEY (ORG, ACCOUNT_ID)
);

INSERT INTO @ACCOUNTS VALUES(1, 18, 60, 50);
INSERT INTO @ACCOUNTS VALUES(1, 19, 60, 150);

DECLARE @BRANCHES TABLE
(
	ORG INT,
	BRANCH INT,
	BRANCH_NAME VARCHAR(20),
	BRANCH_STATE VARCHAR(3),
	PRIMARY KEY (ORG, BRANCH)
);

INSERT INTO @BRANCHES VALUES(1, 60, 'Branch of ORG 1', 'VIC');
INSERT INTO @BRANCHES VALUES(2, 60, 'Branch of ORG 2', 'VIC');


SELECT 
	A.ORG,
	A.ACCOUNT_ID, 
	A.BRANCH,
	A.BALANCE,
	B.BRANCH_STATE
FROM @ACCOUNTS A
JOIN @BRANCHES B
ON	-- B.ORG = A.ORG	-- User forgot this predicate in the join
	B.BRANCH = A.BRANCH;

Results:

ORG  ACCOUNT_ID  BRANCH  BALANCE  BRANCH_STATE
1    18          60      50.00    VIC
1    18          60      50.00    VIC
1    19          60      150.00   VIC
1    19          60      150.00   VIC

The user looks – “Oh Dear – I’ve got duplicates! Let’s get rid of them.”

SELECT DISTINCT
	A.ORG,
	A.ACCOUNT_ID, 
	A.BRANCH,
	A.BALANCE,
	B.BRANCH_STATE
FROM @ACCOUNTS A
JOIN @BRANCHES B
ON	-- B.ORG = A.ORG	-- User forgot this predicate in the join
	B.BRANCH = A.BRANCH;

This poses many problems:

  1. The database must work harder using the incomplete join on the indexes to bring across the data, and then work harder again suppressing the duplicates.
  2. If the field list changes, then the DISTINCT might not work anymore.

For example, with the above query, if the user brings in the column “BRANCH_NAME” then the duplicates come back:

SELECT DISTINCT
	A.ORG,
	A.ACCOUNT_ID, 
	A.BRANCH,
	A.BALANCE,
	--------- Add in new column ---------
	B.BRANCH_NAME,
	-------------------------------------
	B.BRANCH_STATE
FROM @ACCOUNTS A
JOIN @BRANCHES B
ON	-- B.ORG = A.ORG	-- User forgot this predicate in the join
	B.BRANCH = A.BRANCH;

Results:

ORG  ACCOUNT_ID  BRANCH  BALANCE  BRANCH_NAME      BRANCH_STATE
1    18          60      50.00    Branch of ORG 1  VIC
1    18          60      50.00    Branch of ORG 2  VIC
1    19          60      150.00   Branch of ORG 1  VIC
1    19          60      150.00   Branch of ORG 2  VIC

Preventing queries like this comes down to user experience and training.  If users see duplicates in their data, their first thought should be “Why do I have duplicates?”

Duplicates might be legitimate and a DISTINCT might be OK – but quite often they are the result of a bad join.  For an inexperienced user it might be overwhelming to break down a large table to find where the duplicates are coming from, and they might need help from a more experienced user.  This is where internal data forums are useful – places where people can help each other with their problems.
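For the record, the real fix for the example above is to join on the full key, which removes the duplicates without needing a DISTINCT at all:

SELECT 
	A.ORG,
	A.ACCOUNT_ID, 
	A.BRANCH,
	A.BALANCE,
	B.BRANCH_NAME,
	B.BRANCH_STATE
FROM @ACCOUNTS A
JOIN @BRANCHES B
ON	B.ORG = A.ORG AND	-- The predicate the user forgot
	B.BRANCH = A.BRANCH;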

4. The “Ever lengthening” Query

This is a query I see scheduled daily by users building trends over time for a BI tool:

SELECT *
FROM dbo.A_TABLE A
JOIN dbo.B_TABLE B
ON	B.KEY_1 = A.KEY_1
WHERE
	--------------- Start Date ---------------
	A.BUSINESS_DATE >= '2025-01-01' AND
	------------------------------------------
	A.PREDICATE_1 = 'X' AND
	B.PREDICATE_2 = 'Y';

Initially it starts off OK – running fast, and the user is happy.  But as time goes by, the query gets slower and slower, returning more and more data.  Eventually it gets to the point where the query goes off into the nether and never returns.  Since it always starts from the same date, it is constantly re-querying the same data over and over again.

There are a couple of options for handling queries like these:

  1. Really examine the user’s requirements. How much data is actually relevant for their trending report? In 2027, will data from 2025 still be useful to the reader of the report?
  2. Create a table specifically for this report and append one timeframe of data (e.g. daily) onto it. The report can then query this table without regenerating complex joins over and over again for previous timeframes – see the sketch below.
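A sketch of option 2, where the snapshot table name is hypothetical and the column handling would depend on the real report:

-- Nightly job: append only yesterday's slice onto a pre-built snapshot table
INSERT INTO dbo.TREND_REPORT_SNAPSHOT
SELECT A.*, B.PREDICATE_2
FROM dbo.A_TABLE A
JOIN dbo.B_TABLE B
ON	B.KEY_1 = A.KEY_1
WHERE
	A.BUSINESS_DATE = CAST(DATEADD(DAY, -1, GETDATE()) AS DATE) AND
	A.PREDICATE_1 = 'X' AND
	B.PREDICATE_2 = 'Y';

-- The BI tool then reads the small snapshot table instead of re-joining history
SELECT *
FROM dbo.TREND_REPORT_SNAPSHOT;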

3. The “Linked Database” Query

With the growth of our Data Warehouse, other databases on different servers started building process flows off our database – sometimes without our knowledge, until we see a linked server connection coming up.

The pattern in which they use our database is where the problems lie.  One downstream database runs this query every day:

SELECT *
FROM REMOTE_SERVER.SOME_DATABASE.dbo.A_BIG_HISTORICAL_TABLE;

So, they are truncating a landing table on their database and grabbing the whole table.  Hundreds of millions of rows transferring across a slow linked server, taking hours and hours.  And over time this will only get slower as the source table’s history grows.

This is a tricky one to tackle with the downstream users.  Odds are that if they designed a cumbersome process like this, their appetite to change to a faster (yet more complex) solution will not be high – especially if their process ‘works’.

If you can get around the political hurdle and convince them the process must change, there are many options to improve the performance.

  1. Use an ETL tool to transfer the data across.  Even a simple Python script copying files across batch by batch will be quicker than using a linked server.
  2. Assess the downstream use cases for the data.  Do they need the whole table with the complete history to satisfy their requirements?  Do they need all the columns, or can some be trimmed off to reduce the amount of data getting transferred?
  3. If the source table is a historical table, only bring across the period deltas (sketched below).  With the downstream process I am having trouble with now, they are bringing across twenty-one million rows daily.  If they only brought across the deltas from the night loads, that would be a whopping five thousand rows – 0.02% of the total row count.  Add in a row count check between the two systems to have confidence that the two tables are in sync, and the process will be astronomically quicker.
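A sketch of the delta approach from option 3, where LOAD_DATE and the local copy's name are illustrative (the real table will have its own load or business date column):

-- Append only last night's delta instead of truncating and re-copying everything
INSERT INTO dbo.LOCAL_HISTORICAL_COPY
SELECT *
FROM REMOTE_SERVER.SOME_DATABASE.dbo.A_BIG_HISTORICAL_TABLE
WHERE LOAD_DATE >= CAST(DATEADD(DAY, -1, GETDATE()) AS DATE);

-- Row count check to confirm both sides are still in sync
SELECT
	(SELECT COUNT(*) FROM REMOTE_SERVER.SOME_DATABASE.dbo.A_BIG_HISTORICAL_TABLE) AS source_rows,
	(SELECT COUNT(*) FROM dbo.LOCAL_HISTORICAL_COPY) AS local_rows;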

2. The “I don’t know where that query comes from” Query.

We have a continuing problem with a Microsoft Power BI report that runs a query looking at data from the beginning of the month to the current date.  It was a poorly written query, and as the month went along its performance got worse and worse.

Since the username associated with the query was the Power BI server’s account, we had no idea who the query belonged to – even though a couple of simple fixes could have drastically improved its performance.

We contacted the team that manages the Power BI server and asked them to find the query and update the SQL code.

They said it was completed, and the report’s SQL was updated.

But soon the SQL was back – running for hours and hours.

We contacted the team again –

“Hey that query is back.”

They tried updating the report again, but soon the query was back.

So, either someone was constantly uploading an original copy – or another copy of the report was buried somewhere on the server that the admins could not find. 

Since we do not have access to their system, it is hard to determine what is happening.

The barbaric solution would be to block the Power BI user from our system, but goodness knows how many critical business processes that would disrupt.

The best solution we have at the minute is to just keep killing the running SQL on the database, in the hope that someone will identify the constantly failing report.  Even this can be tricky if the Power BI server automatically tries rerunning failed reports.

1. The “I don’t listen to your advice” Query.

Disclaimer – this is a frustration rant that I have to get off my chest.

We have a user who constantly runs a long-running crook query.  When I collated all the long-running queries in research for this post, his was at the top by a significant margin.

The problem with the query is a join on an uncommonly used business key that is not indexed on the table.  The fix is quite simple: use the primary keys in the join.

But he has been running the same code for months – it has locked our nightly loads several times and caused incidents.

We tried the carrot approach.  “Hey, here is optimised code.  It will make your query run quicker and be less of a burden on our database.”  We got indifferent replies. 

After more locked loads we included his manager in the correspondence, but she did not seem to want to be involved.

From my carrot-approach communication, I gained the impression that he has a self-righteous personality and thinks his role and his report are above question.

Enough was enough – it was time for the stick approach.

We raised an operational risk against his activity.

The Ops risk manager asked, “Do you have evidence that you attempted to reason and help?”

“Yep,” I replied, “Here is a zipped-up folder of dozens of emails and chat messages that I sent him.”

Senior managers were involved, with my senior manager commenting to my manager in the background, “Is that guy all there?”

Anyway, my manager wrote a very terse email saying to correct the query, and that if they crash the loads again they will have their access to the database removed.

No committal reply.  No apology.  And to this day they are still running the same query – just far enough under the radar that it is not locking our loads. 

I am watching him like a hawk – waiting for the day that I can cut his access from the database.

It is a pity that our database does not have a charge-back capability for the processing used.  I bet if his manager got a bill of $$$ for one person running one query, she would be more proactive about this problem. 

In retrospect, I would have campaigned to have his access cut right away and made him justify why he should have it back.  When there are hundreds of other people doing the right thing on the database, it is not fair that their deliverables are impacted by the indifferent attitude of one user.

Installing Animal Shelter Manager 3 on Docker – So close

Intro – Do Unto what the Sister In-law commands

My sister-in-law runs a dog training business in Queensland. Part of her business is keeping records of her canine clients – especially notes, vaccinations and when they’re due, medical records and certificates.

In a previous job she had experience with Animal Shelter Manager (ASM3). She’s familiar enough with its features and interface to know it will cover her needs.

Her business is not big enough to justify the price of the SaaS version of ASM3; so, being the tech-savvy (debatable) one in the family, it was my task to get it up and running for her.

This led to several nights of struggling, annoyance and failure.

Fitting the pieces together

To start off I wanted to get a demo version running so I could see what I needed to do to deploy it for her.

I checked out the ASM3 github repository and yes! There is a Dockerfile and a docker-compose.yml.

But no – it is six years old and didn’t work when I tried to build it. I tried a couple of other miscellaneous sites offering hope, but to no avail.

In the end, after many Google searches, I stumbled across https://lesbianunix.dev/about with the following guide:

Apart from a domain name that was sure to set off all the “appropriate content” filters at work, with a few modifications I could get it to work. The instructions from the author Ræn are substantially different to the old Dockerfile and the instructions on the ASM3 home page.

Let’s build it

With my base working version, I cobbled up some Dockerfiles for ASM3 and Postgres and a Docker Compose file to tie them together:

https://github.com/jon-donker/asm3_docker

(Note that this is not a production version – passwords etc. will have to be obscured in the final version.)

The containers build just fine and fire up with no problem.

Visiting the website

http://localhost/

ASM3 redirects and builds the database – but then goes to a login page. I enter the username and password, but it loops back to the login page.

I think the problem lies with the base_url and service_url in asm3.conf, or possibly with the http-asm3.conf settings.

Anyway – I logged an issue with ASM3 to see if it is something simple that I missed; otherwise I may have to start pulling apart the source code to find out what it is trying to do.

I’ll update this post when I find something.

Qlik Replicate to AWS Postgres (Why are you so slow?!)

Wait – Performance problems writing to an AWS RDS Postgres database? 

Haven’t we been here before?

Yes, we have, and I wrote quite a few posts about the trials and tribulations we went through.

But this is a new problem we came across, and it resulted in several messages back and forth between us and Qlik before we worked it out.

Core System Migration.

Our organisation has several core systems, which makes maintaining them expensive and holds us back from using their data in modern tools like advanced AI.  Over the past several years, various projects have been running to consolidate the systems together. 

This is all fun and games for the downstream consumers, as they have lots of migration data coming down the pipelines – for instance, shell accounts getting created on the main system from the sacrificial system.

One downstream system wanted to exclude the migration data from their normal feed and branch it into another database, so they could manipulate the migrated data to fit it into their pipeline.

I created the Qlik Replicate task to capture the migration data.  It was a simple task to create.  Unusually, the downstream users created their own tables that they wanted me to pipe the data into.  In the past, we would let Qlik Replicate create the table in the lower environment, copy the schema and use that schema going forwards.

Ready to go, we fired up the task in testing to capture the test run-through of the migration.

Slow.  So Slow.

The test migration ran on the main core system, and we were ready to capture the changes under the user account running the migration process.

It was running slow.  So slow.

We knew data was getting loaded as we were periodically running a SELECT COUNT(*) on the destination table.  But we were running at less than 20 tps.

Things we checked:

  1. The source and target databases were not under duress.
  2. The QR server (although a busy server) was not maxed out on CPU or memory.
  3. There were no critical errors in the error log.
  4. Records were not getting written to the attrep_apply_exceptions table.
  5. There were no triggers built off the landing table that might be slowing down the process.
  6. We knew from previous testing that we could get a higher tps.

I bumped up the logging on “Target Apply” to see if we could capture more detail on the problem.

One by One.

After searching the log files, we came across an interesting message:

00007508: 2025-02-20T08:33:24:595067 [TARGET_APPLY    ]I:  Bulk apply operation failed. Trying to execute bulk statements in 'one-by-one' mode  (bulk_apply.c:2430)

00007508: 2025-02-20T08:33:24:814779 [TARGET_APPLY    ]I:  Applying INSERTS one-by-one for table 'dbo'.'DESTINATION_TABLE' (4)  (bulk_apply.c:4849)

For some reason, instead of using a bulk load operation, QR was loading the records one by one.  This accounted for the slow performance.

But why was it switching to this one-by-one mode?  What caused the bulk insert to fail, while one-by-one still worked?

Truncating data types

First, I suspected that the column names might be mismatched.  I got the source and destination schema out and compared the two. 

All the column names aligned correctly.

Turning up the logging we got the following message:

00003568: 2025-02-19T15:24:50 [TARGET_APPLY    ]T:  Error code 3 is identified as a data error  (csv_target.c:1013)

00003568: 2025-02-19T15:24:50 [TARGET_APPLY    ]T:  Command failed to load data with exit error code 3, Command output: psql:C:/Program Files/Attunity/Replicate/data/tasks/MY_CDC_TASK_NAME/data_files/0/LOAD00000001.csv.sql:1: ERROR:  value too long for type character varying(20)

CONTEXT:  COPY attrep_changes472AD6934FE46504, line 1, column col12: "2014-08-10 18:33:52.883" [1020417]  (csv_target.c:1087)

All the varchar fields between the source and target aligned correctly.

Then I noticed the column MODIFIED_DATE.  On the source it is a datetime, while on the Postgres target it is just a date.

My theory was that the bulk copy could not handle the conversion – but in one-by-one mode it could truncate the time component off the date and still load successfully.

The downstream team changed the field from a date to a timestamp, and I reloaded the data.  With this fix the task went blindingly quick – from hours for just a couple of thousand records to all done within minutes.
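On the Postgres side the change itself is a one-liner (the table and column identifiers here are illustrative, following the example in the logs):

-- Widen the landing column so it can hold the full source datetime
ALTER TABLE dbo.destination_table
    ALTER COLUMN modified_date TYPE timestamp;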

Conclusion

I suppose the main conclusion from this exercise is that “Qlik Replicate Knows best.” 

Unless you have a very accurate mapping process from the source to the target, let QR create the destination table in a lower environment.  Use that table’s schema as the starting point and build on it. 

It will save a lot of time and heartache later on.

Postgres: EBCDIC decoding through a JavaScript Function

EBCDIC? Didn’t that die out with punch cards and the Dinosaurs?

EBCDIC (Extended Binary Coded Decimal Interchange Code) is an eight-bit character encoding that was created by IBM in the ’60s.

While the rest of the world moved on with ASCII and UTF-8, we still find fields in our DB2 database encoded in EBCDIC 037, just to make our lives miserable.

When replicating these fields with its default settings, Qlik Replicate brings them across as a normal “string”, which becomes quite unusable when loaded into a destination system.

Decoding EBCDIC in Postgres

To have the flexibility to decode particular EBCDIC fields, we need to bring them across as BYTES instead of the data type QR suggests. This can be done in the Table Settings for the table in question:

On the destination Postgres database, load the table into a bytea field.

Now, with a UDF in Postgres, we can decode the EBCDIC bytea fields into something readable:

CREATE OR REPLACE FUNCTION public.fn_convert_bytes2_037(
    in_bytes bytea)
    RETURNS character varying
    LANGUAGE 'plv8'
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
    const hex_037 = new Map([
        ["40", " ",],
        ["41", " ",],
        ["42", "â",],
        ["43", "ä",],
        ["44", "à",],
        ["45", "á",],
        ["46", "ã",],
        ["47", "å",],
        ["48", "ç",],
        ["49", "ñ",],
        ["4a", "¢",],
        ["4b", ".",],
        ["4c", "<",],
        ["4d", "(",],
        ["4e", "+",],
        ["4f", "|",],
        ["50", "&",],
        ["51", "é",],
        ["52", "ê",],
        ["53", "ë",],
        ["54", "è",],
        ["55", "í",],
        ["56", "î",],
        ["57", "ï",],
        ["58", "ì",],
        ["59", "ß",],
        ["5a", "!",],
        ["5b", "$",],
        ["5c", "*",],
        ["5d", ")",],
        ["5e", ";",],
        ["5f", "¬",],
        ["60", "-",],
        ["61", "/",],
        ["62", "Â",],
        ["63", "Ä",],
        ["64", "À",],
        ["65", "Á",],
        ["66", "Ã",],
        ["67", "Å",],
        ["68", "Ç",],
        ["69", "Ñ",],
        ["6a", "¦",],
        ["6b", ",",],
        ["6c", "%",],
        ["6d", "_",],
        ["6e", ">",],
        ["6f", "?",],
        ["70", "ø",],
        ["71", "É",],
        ["72", "Ê",],
        ["73", "Ë",],
        ["74", "È",],
        ["75", "Í",],
        ["76", "Î",],
        ["77", "Ï",],
        ["78", "Ì",],
        ["79", "`",],
        ["7a", ":",],
        ["7b", "#",],
        ["7c", "@",],
        ["7d", "'",],
        ["7e", "=",],
        ["7f", ","],
        ["80", "Ø",],
        ["81", "a",],
        ["82", "b",],
        ["83", "c",],
        ["84", "d",],
        ["85", "e",],
        ["86", "f",],
        ["87", "g",],
        ["88", "h",],
        ["89", "i",],
        ["8a", "«",],
        ["8b", "»",],
        ["8c", "ð",],
        ["8d", "ý",],
        ["8e", "þ",],
        ["8f", "±",],
        ["90", "°",],
        ["91", "j",],
        ["92", "k",],
        ["93", "l",],
        ["94", "m",],
        ["95", "n",],
        ["96", "o",],
        ["97", "p",],
        ["98", "q",],
        ["99", "r",],
        ["9a", "ª",],
        ["9b", "º",],
        ["9c", "æ",],
        ["9d", "¸",],
        ["9e", "Æ",],
        ["9f", "¤",],
        ["a0", "µ",],
        ["a1", "~",],
        ["a2", "s",],
        ["a3", "t",],
        ["a4", "u",],
        ["a5", "v",],
        ["a6", "w",],
        ["a7", "x",],
        ["a8", "y",],
        ["a9", "z",],
        ["aa", "¡",],
        ["ab", "¿",],
        ["ac", "Ð",],
        ["ad", "Ý",],
        ["ae", "Þ",],
        ["af", "®",],
        ["b0", "^",],
        ["b1", "£",],
        ["b2", "¥",],
        ["b3", "·",],
        ["b4", "©",],
        ["b5", "§",],
        ["b6", "¶",],
        ["b7", "¼",],
        ["b8", "½",],
        ["b9", "¾",],
        ["ba", "[",],
        ["bb", "]",],
        ["bc", "¯",],
        ["bd", "¨",],
        ["be", "´",],
        ["bf", "×",],
        ["c0", "{",],
        ["c1", "A",],
        ["c2", "B",],
        ["c3", "C",],
        ["c4", "D",],
        ["c5", "E",],
        ["c6", "F",],
        ["c7", "G",],
        ["c8", "H",],
        ["c9", "I",],
        ["ca", "­",],
        ["cb", "ô",],
        ["cc", "ö",],
        ["cd", "ò",],
        ["ce", "ó",],
        ["cf", "õ",],
        ["d0", "}",],
        ["d1", "J",],
        ["d2", "K",],
        ["d3", "L",],
        ["d4", "M",],
        ["d5", "N",],
        ["d6", "O",],
        ["d7", "P",],
        ["d8", "Q",],
        ["d9", "R",],
        ["da", "¹",],
        ["db", "û",],
        ["dc", "ü",],
        ["dd", "ù",],
        ["de", "ú",],
        ["df", "ÿ",],
        ["e0", "\\",],
        ["e1", "÷",],
        ["e2", "S",],
        ["e3", "T",],
        ["e4", "U",],
        ["e5", "V",],
        ["e6", "W",],
        ["e7", "X",],
        ["e8", "Y",],
        ["e9", "Z",],
        ["ea", "²",],
        ["eb", "Ô",],
        ["ec", "Ö",],
        ["ed", "Ò",],
        ["ee", "Ó",],
        ["ef", "Õ",],
        ["f0", "0",],
        ["f1", "1",],
        ["f2", "2",],
        ["f3", "3",],
        ["f4", "4",],
        ["f5", "5",],
        ["f6", "6",],
        ["f7", "7",],
        ["f8", "8",],
        ["f9", "9",],
        ["fa", "³",],
        ["fb", "Û",],
        ["fc", "Ü",],
        ["fd", "Ù",],
        ["fe", "Ú"]
    ]);
 
    let in_varchar = "";
    let build_string = "";
     
    for (var loop_bytes = 0; loop_bytes < in_bytes.length; loop_bytes++)
    {
        /* Converts a byte character to a hex representation*/
        let focus_char = ('0' + (in_bytes[loop_bytes] & 0xFF).toString(16)).slice(-2); 
        let return_value = hex_037.get(focus_char.toLowerCase());
 
        /* If no mapping found - replace the character with a space */
        if(return_value === undefined)
        {
            return_value = " ";
        }
 
        build_string = build_string.concat(return_value)
    }
 
    return build_string
$BODY$;

The function can now be used in SQL:

SELECT public.fn_convert_bytes2_037(my_EBCDIC_byte_column)
FROM public.foo;
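A quick sanity check with a literal value: in EBCDIC 037 the bytes C1 C2 C3 are the letters "ABC".

SELECT public.fn_convert_bytes2_037('\xc1c2c3'::bytea);
-- Returns: ABC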

Reference

JavaScript bytes to HEX string function: Code Shock – How to Convert Between Hexadecimal Strings and Byte Arrays in JavaScript

Postgres JavaScript missing variables (But it is #$%^ there!)

It’s OK

I only cried and contemplated quitting working in IT and becoming a Nomad for a couple of hours.

But I got there in the end – although the following error message will probably plague my nightmares for a couple of weeks:

ERROR:  ReferenceError: inNumber1 is not defined
CONTEXT: fn_js_number_adder() LINE 2: let total = inNumber1 + inNumber2

SQL state: XX000

JavaScript: When in Rome – Do what the Romans do

My job today was to write a JavaScript function in Postgres to convert EBCDIC 037 byte values into readable text. The aim is to decommission some duplicate pipelines coming from our DB2 database via Qlik Replicate that deliver ASCII-converted fields as well as the EBCDIC version.

I haven’t worked in JavaScript since my uni days and am well entrenched in the Python world for my day-to-day job. Over the past years I have converted the naming conventions in my code from camelCase to under_score to match Python’s standard.

So going back to JavaScript – I knew that camelCase is the expected format. Since I didn’t know where my code was going to end up, I wanted it to look professional, as it is a reflection on me.

So I wrote a JavaScript function paraphrased as:

CREATE OR REPLACE FUNCTION fn_js_number_adder(inNumber1 numeric, inNumber2 numeric)
RETURNS numeric
as
$$
	let total = inNumber1 + inNumber2

	return total
$$
LANGUAGE plv8;

Looks good – compiles with no errors.

But when I went to test it, I got the error shown at the top of this post.

The error drove me crazy! It’s THERE! The variable is THERE!

The original function was a lot more extensive than above so I cut as much out of it as possible in case something else was causing the variable not to be recognised.

Still no luck.

I went to the functions section in pgAdmin as I wanted to compare it against an existing function I created to see what the difference was.

Interesting…

The function’s parameters have changed from inNumber1 and inNumber2 to innumber1 and innumber2.

Scripting out the function I got:

CREATE OR REPLACE FUNCTION public.fn_js_number_adder(
	innumber1 numeric,
	innumber2 numeric)
    RETURNS numeric
    LANGUAGE 'plv8'
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
	let total = inNumber1 + inNumber2

	return total
$BODY$;

So either Postgres or pgAdmin changed the case of the parameters from camelCase to lower case – Postgres folds unquoted identifiers to lower case, so the parameters became innumber1 and innumber2. This caused the variables not to be found later in the code.

The fix – Back to under_scores we go

My fix for this instance (whether standard or not) is to go back to under_scores:

CREATE OR REPLACE FUNCTION fn_js_number_adder(in_number1 numeric, in_number2 numeric)
RETURNS numeric
as
$$
	let total = in_number1 + in_number2

	return total
$$
LANGUAGE plv8;

This works, and I can run the function.

As for naming conventions, I suppose using under_score isn’t too much of a sin, since it is a common standard on databases. If you want to stay true to camelCase, the parameters can just be in under_score and the rest of the variables in camelCase.
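Another option (untested here, so treat it as a sketch) is to double-quote the parameter names, since Postgres only folds unquoted identifiers to lower case:

CREATE OR REPLACE FUNCTION fn_js_number_adder("inNumber1" numeric, "inNumber2" numeric)
RETURNS numeric
AS
$$
	/* The quoted parameters keep their camelCase names, assuming plv8 exposes them verbatim as JavaScript variables */
	let total = inNumber1 + inNumber2;

	return total;
$$
LANGUAGE plv8;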

At least it is working… now on to the EBCDIC conversion.

Qlik Replicate: Oh Oracle – you’re a fussy beast

It’s all fun and games – until Qlik Replicate must copy 6 billion rows from a very wide Oracle table to GCS…

…in a small time window

…with the project not wanting to perform Stress and Volume testing

Oh boy.

Our Dev environment had 108 million rows to play with, which ran “quick” relative to the amount of data it had to copy.  But with Dev being 33 times smaller, even if it takes an hour in Dev, extrapolating the time out gives over 30 hours of run time.

The project forged ahead with the implementation, and QR had only processed 2% of the changes before we ran out of the time window.

The QR servers didn’t seem stressed performance-wise; they had plenty of CPU and RAM.  I suspect the bottleneck was the bandwidth going out to GCS, but there was no way to monitor how much of the connection was being used.

When in doubt – change the file type

After the failed implementation, we tried to work out how we could improve the throughput to GCS in our Dev environment.

I thought changing the destination’s file type might be a way.  JSON is a chunky file format, and my hypothesis was that if the JSON was compressed it would transfer to GCS quicker.  We tested a NULL connector, raw JSON, GZIP JSON and Parquet.  As a test using Dev, we let a test task run for 20 minutes to see how much data was copied across.

Full Load Tuning:

  • Transaction consistency timeout (seconds): 600
  • Commit rate during full load: 100000

Endpoint settings:

  • Maximum file size(KB): 1000000 KB (1GB)

Results

Unfortunately, my hypothesis on compressed JSON was incorrect.  We speculated that compressing the JSON might have been taking up as much time as transferring it.  I would have liked to test this theory on a quieter QR server, but time was of the essence.

Parquet seemed to be the winner in the limited testing, offering a nice little throughput boost over the JSON formats.  But it wasn’t the silver bullet for our throughput problems.  Added to this, the downstream users would need to spend time modifying their ingestion pipelines.

Divide and conquer – until Oracle says no.

The next stage was to look at whether we could divide the table up into batches and transfer it across a section at a time.  Looking at the primary key, it was an identity column with little meaningful relationship to the data, so it was not easy to divide into batches.

There was another indexed column called RUN_DATE, a date relating to when the record was entered.

OK – let’s turn on Passthrough filtering and test it out.

First of all, let’s test the syntax out in SQL Developer:

SELECT COUNT(*)
FROM xxxxx.TRANSACTIONS
WHERE
    RUN_DATE >= '01/Jan/2023' AND
    RUN_DATE < '01/Jan/2024';

The query ran fine, meaning the date syntax was right.

Looking good – let’s add the filter to the Full Load Passthru Filter.

But when running the task, it goes into “recoverable error” mode.

Looking into the logs:

00014204: 2024-11-26T08:38:26:184700 [SOURCE_UNLOAD   ]T:  Select statement for UNLOAD is 'SELECT "PK_ID","RUN_DATE", "LOTS", "OF", "OTHER, "COLUMNS"  FROM "xxxxx"."TRANSACTIONS" WHERE (RUN_DATE >= '01/Jan/2023' AND RUN_DATE < '01/Jan/2024')'  (oracle_endpoint_utils.c:1941)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]T:  ORA-01858: a non-numeric character was found where a numeric was expected  [1020417]  (oracle_endpoint_unload.c:175)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]T:  Failed to init unloading table 'xxxxx'.'TRANSACTIONS' [1020417]  (oracle_endpoint_unload.c:385)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]E:  ORA-01858: a non-numeric character was found where a numeric was expected  [1020417]  (oracle_endpoint_unload.c:175)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]E:  Failed to init unloading table 'xxxxx'.'TRANSACTIONS' [1020417]  (oracle_endpoint_unload.c:385)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]T:  Error executing source loop [1020417]  (streamcomponent.c:1942)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]T:  Stream component 'st_1_SRC_DEV_B1_xxxxx' terminated [1020417]  (subtask.c:1643)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]T:  Free component st_1_SRC_DEV_B1_xxxxx  (oracle_endpoint.c:51)
00011868: 2024-11-26T08:38:26:215961 [TASK_MANAGER    ]I:  Task error notification received from subtask 1, thread 0, status 1020417  (replicationtask.c:3603)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]E:  Error executing source loop [1020417]  (streamcomponent.c:1942)
00014204: 2024-11-26T08:38:26:215961 [TASK_MANAGER    ]E:  Stream component failed at subtask 1, component st_1_SRC_DEV_B1_xxxxx  [1020417]  (subtask.c:1474)
00014204: 2024-11-26T08:38:26:215961 [SOURCE_UNLOAD   ]E:  Stream component 'st_1_SRC_DEV_B1_xxxxx' terminated [1020417]  (subtask.c:1643)
00011868: 2024-11-26T08:38:26:231570 [TASK_MANAGER    ]W:  Task 'TEST_xxxxx' encountered a recoverable error  (repository.c:6200)

Error code ORA-01858 seems to be the key to the problem. As an experiment, I copied out the SELECT statement and ran it in SQL Developer.

Works fine 🙁

OK – maybe it is a quirk of SQL Developer?

Using sqlplus I ran the same code from the command line.

Again works fine 🙁

Resorting to good old Google – I searched ORA-01858.

The top hit was this article from Stack Overflow that recommended confirming the format of the date with the TO_DATE function.

OK Oracle, if you want to be fussy with your dates, let’s explicitly define the date format with TO_DATE in the Full Load Passthru Filter:

RUN_DATE >= TO_DATE('01/Jan/2023','DD/Mon/YYYY') AND RUN_DATE < TO_DATE('01/Jan/2024','DD/Mon/YYYY')

Ahhhh – that works better and Qlik Replicate now runs successfully with the passthrough filter.

Conclusion

I tried a different set of date formats, including an ISO date format, and Oracle spat them all out. So using TO_DATE is the simplest way to avoid the ORA-01858 error. I can understand Oracle refusing to run on a date like 03/02/2024 – I mean, is it the 3rd of Feb 2024, or for Americans the 2nd of Mar 2024? But I am surprised that something very clear like an ISO date format, or 03/Feb/2024, did not work.

Maybe SQL Developer and sqlplus interact with the database differently to QR – for instance, with different session date format (NLS_DATE_FORMAT) settings – which leads to the difference in how filters on dates behave.

Python & openpyxl – If the fraud team tells you to do something; do it quickly!

A member of our Fraud team contacted me unexpectedly at work.

Hey. I got this excel workbook of a customer’s statement with the data spread across multiple sheets. Do you know how to consolidate them into one sheet?

“OK,” I thought, “It can’t be that hard.”

She sent me the workbook. It had 447 sheets in it! The statement was initially in PDF format and the Fraud team exported it to Excel. This resulted in a sheet for each page, looking like this:

The data that we were interested in was between A7 and E29 on each page. I visually spot-checked half a dozen sheets across the workbook and they were in a similar format.

Ditching VBA for openpyxl

In yesteryear, I would have used VBA inside Excel to consolidate the sheets. But I felt it was time to grow up from VBA and see what Python had to offer.

After a quick Google, I came across Data Camp’s Python Excel Tutorial.

It was surprisingly quick and easy to program up. I honestly think it would have taken me as long, if not longer, in VBA.

Here is the scratch code for your perusal:

import openpyxl 
import logging
from datetime import datetime
import csv

# Formats a row into a format for the delimited file
def format_row(in_record):

    try:
        in_record[0] = datetime.strftime(in_record[0], "%Y-%m-%d")  # Converts the python date to YYYY-MM-DD
    except TypeError:
        root_logger.warning("Cannot convert " + str(in_record[0]) + " to date/time")

    in_record[1] = in_record[1].replace("\n", " \u21B5")    # Adds in a return arrow

    if in_record[2] is None:
        in_record[2] = ""

    if in_record[3] is None:
        in_record[3] = ""

    return in_record


if __name__ == '__main__':

    now = datetime.now()

    log_formatter = logging.Formatter("%(asctime)s [%(threadName)-12.12s] [%(levelname)-8.8s]  %(message)s")
    root_logger = logging.getLogger()
    root_logger.setLevel(logging.INFO)

    log_console_handler = logging.StreamHandler()
    log_console_handler.setFormatter(log_formatter)
    root_logger.addHandler(log_console_handler)

    final_array = []

    wb = openpyxl.load_workbook("CONSOLIDATEDExcel.xlsx") 

    for focus_sheet_name in wb.sheetnames:

        root_logger.info("Processing sheet: " + focus_sheet_name)
        
        focus_sheet = wb[focus_sheet_name]
        root_logger.debug("Total number of rows: " + str(focus_sheet.max_row) + '. And total number of columns: ' + str(focus_sheet.max_column))


        if focus_sheet["A6"].value == "Date":   # Checks for the key field "Date" in position A6

            my_list = []

            for sheet_row in focus_sheet.iter_rows(
                min_row=7, max_row=26, min_col=1, max_col=5,    # Gets specific section of the worksheet
                values_only=True):

                sheet_row_list = list(sheet_row)

                if sheet_row_list[0] is not None:   # Ignore lines that are blank
                    final_array.append(format_row(sheet_row_list))
        else:
            root_logger.info("Cannot find key column \"DATE\".  Skipping sheet")


    # Output records to a tab delimited file
    with open("Consolidated_output.tab", 'w', newline='', encoding="utf-8") as myfile:
        wr = csv.writer(myfile, quoting=csv.QUOTE_ALL, delimiter="\t")
        wr.writerows(final_array)
        

Also, it ran blindingly quick – as quick as I feel VBA would have been with Application.ScreenUpdating = False.

Anyway, the Fraud team was happy I turned their request around quickly. I’m guessing that if the Fraud team is involved, someone is about to have a bad day.

Docker, Qlik Replicate and Postgres – stitching them together

So far – what do we have?

We have:

  1. Postgres working in a Docker container
  2. Qlik Replicate working in a Docker container

Let’s see if we can get them talking to each other

Working on Postgres Config files

The first modification needed is to postgresql.conf for the Postgres Docker image. If you were following the docker-compose.yml in my previous post, you can find the file under:

~/apps/postgres-plv8/postgresql.conf

The following changes need to be made in postgresql.conf, as per the Qlik documentation:

wal_level = logical
max_replication_slots = 3       # max number of replication slots

Now add the following lines to:

~/apps/postgres-plv8/pg_hba.conf
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     all                                     trust
host    replication     all             127.0.0.1/32            trust
host    replication     all             ::1/128                 trust

Once done, the Docker container needs to be restarted for the changes to take effect.

Setting up QR

Log into QR using the address:

https://127.0.0.1:3552/attunityreplicate

Create a new postgres connection

Fill out the following fields:

Field      Value
Server     host.docker.internal
Port       9432
Username   Username from the docker-compose.yml file for postgres
Password   Password from the docker-compose.yml file for postgres
Database   Database from the docker-compose.yml file for postgres

From here, a QR task can be built to read from the Postgres database.

Qlik Replicate: You’re trapped in a Docker container now!

In Qlik Replicate we have tasks that are unable to resume when we have nasty server failures (for instance the CrowdStrike outage in July 2024).

This only happens to tasks that go from an RDBMS to a cloud storage system like AWS S3 or GCS. 

In the task log the error message takes the form of:

00002396: 2022-08-26T15:21:14 [AT_GLOBAL ]E: Json doesn't start with '{' [1003001] (at_cjson.c:1773)
00002396: 2022-08-26T15:21:14 [AT_GLOBAL ]E: Cannot parse json: [1000251] (at_protobuf.c:1420)

This error gives us problems: I can’t resume the task, as the error reappears.  I can’t even start it from the stream position, and must rely on restarting the QR task from a timestamp, which is extremely dangerous given the chance of missing data for that split second.

I suspect the problem is that the “staging” file on the QR server gets corrupted mid-write when the server fails, and on resume QR can’t parse it.

But trying to recreate the problem in a safe environment to diagnose it is tricky.  Our DTL environment doesn’t create enough traffic to trigger the issue.  Also, I don’t want to be abruptly turning off our DTL QR servers and interrupting other people’s testing.  As for trying to recreate the problem in production – the pain of all the red tape is not worth the effort.

I needed a safer space to work in.  A space where I can pump large volumes of data through QR and kick the QR service around trying to provoke the error.  Armed with my little Linux VM, Docker containers were the answer.

CentOS? Why CentOS?

My goal was to build a Docker container with Qlik Replicate and Postgres drivers so I can use it on my Linux VM.

Under Support articles, Qlik has a guide on how to run Qlik Replicate in a Docker container.

Following the instructions, I ran into some initial problems.  The first major problem was the CentOS Docker image.  The issue is that I must use the packages in my company’s Artifactory and not external packages.  Although the company had the CentOS image, there were no other packages available to update and install.  Since my VM cannot reach http://vault.centos.org, the CentOS image was a lame duck.

With CentOS off the cards, I had to use the Red Hat image that my company provided.  With Red Hat, the Artifactory had all the packages that I needed.

The second problem was that I wanted to use the 2023.11 image to match our environment.  With 2023.11 there are some extra steps needed in the Dockerfile compared to 2024.05.  The differences are noted in Qlik’s support article.

The Dockerfile

Here is the Dockerfile

FROM my.companys.repo/redhat/ubi9


ENV QLIK_REPLICATE_BASE_DIR=/opt/attunity/replicate/
ENV ReplicateDataFolder=/replicate/data
ENV ReplicateAdminPassword=AB1gL0ngPa33w0rd
ENV ReplicateRestPort=3552
ENV LicenseFile=/tmp/replicate_license_exp2025-06-29_ser60038556.txt

# Copy across installation packages and licenses
ADD postgresql*.rpm /tmp/
ADD areplicate-*.rpm /tmp/
ADD systemctl /usr/sbin
ADD replicate_license_exp2025-06-29_ser60038556.txt /tmp/

# Update packages
RUN dnf -y update
RUN dnf makecache

# To get ps command
RUN dnf -y install procps-ng
RUN dnf -y install unixODBC unzip
RUN dnf -y install libicu.x86_64
RUN rm -f /etc/odbcinst.ini

# Installing posgres packages
RUN rpm -ivh /tmp/postgresql13-libs-13.9-1PGDG.rhel9.x86_64.rpm
RUN rpm -ivh /tmp/postgresql13-odbc-13.02.0000-2PGDG.rhel9.x86_64.rpm
RUN rpm -ivh /tmp/postgresql13-13.9-1PGDG.rhel9.x86_64.rpm

ADD odbcinst.ini /etc/

# Installing Qlik Replicate
RUN systemd=no yum -y install /tmp/areplicate-2023.11.0-468.x86_64.rpm
RUN yum clean all
RUN rm -f /tmp/areplicate-*.rpm

RUN export LD_LIBRARY_PATH=/opt/attunity/replicate/lib:\$LD_LIBRARY_PATH
RUN echo "export LD_LIBRARY_PATH=/usr/pgsql-13/lib:\$LD_LIBRARY_PATH" >> /opt/attunity/replicate/bin/site_arep_login.sh

ADD start_replicate.sh /opt/attunity/replicate/bin/start_replicate.sh
RUN chmod 775 /opt/attunity/replicate/bin/start_replicate.sh
RUN chown attunity:attunity /opt/attunity/replicate/bin/start_replicate.sh
RUN source $QLIK_REPLICATE_BASE_DIR/bin/arep_login.sh >>~attunity/.bash_profile
ENTRYPOINT /opt/attunity/replicate/bin/start_replicate.sh ${ReplicateDataFolder} ${ReplicateAdminPassword} ${ReplicateRestPort} ${LicenseFile} ; tail -f /dev/null

The postgres packages can be obtained from https://download.postgresql.org/pub/repos/yum/13/redhat/rhel-9-x86_64/

The file odbcinst.ini content is:

[PostgreSQL]
Description = ODBC for PostgreSQL
Driver      = /usr/lib/psqlodbcw.so
Setup       = /usr/lib/libodbcpsqlS.so
Driver64    = /usr/pgsql-13/lib/psqlodbcw.so
Setup64     = /usr/lib64/libodbcpsqlS.so
FileUsage   = 1

The systemctl file is:

# Run LS command - remove this line 
ls

And of course you need the rpm for Qlik replicate and your license file.

Once the Dockerfile and supporting files are collated in a directory, build the image with:

docker build --no-cache -t ccc/replicate:2023.11 .

If all goes well, a Docker image will be built and ready to be used.

Docker Compose

To make running the Docker image easier, create a Docker Compose file:

version: '3.3'

services:
  replicate:
    image: docker.io/ccc/replicate:2023.11
    container_name: replicate_2023_11
    ports: 
      - "3552:3552"

    environment:
      - ReplicateRestPort=3552
      - TZ=Australia/Melbourne

    volumes:
      - /dockermount/data/replicate/data:/replicate/data

    extra_hosts:
      - host.docker.internal:host-gateway

volumes:
  replicate:

Save the docker-compose.yml in a directory and, from that directory, start the container with the command:

docker-compose up -d

Once it is up, run the docker ps command to verify everything is working:

docker ps

So far, so good. Further confirmation can be had by connecting into the container and observing the QR processes running:

docker exec -it qr_container_id bash
ps -aux

There should be two main processes, plus a process for each individual QR task running:

With everything confirmed, the QR console can be accessed from a browser:

https://127.0.0.1:3552/attunityreplicate/