
Qlik Replicate – The saga of replicating to AWS Part 5 – Is MS-SQL the answer?

For those who are following along at home…

We have been toiling on replicating to AWS Postgres RDS with Qlik Replicate for the past two months, trying to achieve a baseline of 300 tps.

After many suggestions, tuning and tweaks, conference calls, benchmarks, learning experiences and prayers, we couldn’t get our tps close to that baseline.

You can read the main findings in the following pages:

Things that were suggested to us that I did not add to this blog series:

  • Try async commits on RDS Postgres.  We tried it and saw only negligible gains (a sketch of the setting follows this list).
  • Shift the QR server to AWS.  A bad idea, as there would be even more traffic from the busy DB2 source database going across the link for QR to consume.
  • Use Amazon Aurora instead of RDS.  The downstream developers did not have the appetite to try Aurora, especially with the evidence leaning towards network speed.
  • Use the GCP version of Postgres instead of AWS.  The downstream developers did not want to commit to another cloud provider.
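
Postgres asynchronous commit comes down to a single parameter; a minimal sketch of what we tried (session level here – on RDS the same parameter can also be set through the DB parameter group):

-- Trade durability of the very latest transactions for faster commits
SET synchronous_commit = off;

-- Check the value currently in effect
SHOW synchronous_commit;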

The problem is how the network connectivity behaves with the Postgres ODBC driver and the round trips it must make between our location and the AWS data centre.  We can keep trying – but we are bound by the laws of physics and the speed of light.

Decisions to be made

All through our benchmarking and investigation, we have been replicating in parallel to a development MS-SQL database in the same data centre as the DB2 source database, to give us an idea of what speed we could potentially reach.  Without triggers on the MS-SQL destination table, we were easily hitting 300 tps.  Ramping the changes up, we could hold 1K tps with no creep in latency.

We were happy with these results, especially as the MS-SQL database was just a small, underpowered, shared dev machine, not a full-blown dedicated server.

It took a brave solution architect to propose that we shift from AWS RDS Postgres to an on-prem MS-SQL server, especially when our senior management strategy is to push everything to the cloud to reduce the number of on-prem servers.

In the end, with all our evidence on performance and the project’s willingness to push on with the proposed solution, the stakeholders agreed to move the destination to an on-prem database.

They initially wanted us to go with an on-prem Postgres database; but since all our database administrators are either Oracle or MS-SQL experts and we have no Postgres experts – we went with good old MS-SQL.

It worked; but…damn triggers.

I volunteered to convert the Postgres SQL code into T-SQL as I have worked with T-SQL for the past decade.  The conversion went smoothly, and I took the opportunity to optimise several sections of the code to make the solution more maintainable and to run faster.

With our new MS-SQL database all coded up and the triggers turned off, the SVT (stress and volume testing) ran at the TPS we were aiming for.

But when we turned on the triggers, the performance absolutely crashed.

I was mortified – had my coding and the additional changes I made actually made the performance worse?

I checked the triggers.  I checked the primary keys and the joins.  I checked the horizontal partitioning.  I checked the database server stats for CPU and memory usage. 

Nothing – I could not locate the performance problem.

I went back to Qlik Replicate and examined the log files.

Ahh – here is something. The log file was full of entries like this:

00012472: 2024-05-27T16:03:54 [TARGET_APPLY ]W: Source changes that would have had no impact were not applied to the target database. Refer to the 'attrep_apply_exceptions' table for details (endpointshell.c:7632)

Looking inside attrep_apply_exceptions, there were corresponding entries like:

UPDATE [dbo].[TEST_DESTINATION] 
SET	[ACCOUNT_ID] = 2,
	[DATA_1] = 'Updated', 
	[DATA_2] = 'Data' 
WHERE 
	[ACCOUNT_ID] = 2;
	-- 0 rows affected

This was confusing; I checked the destination table, and the update had been applied.  Why was this update deemed a failure and logged to the attrep_apply_exceptions table?  It had to be something in the trigger.
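
For anyone chasing the same symptom, the quickest check is to read the control table on the target.  A minimal query (assuming the control tables live in the default dbo schema; I select everything rather than assume the exact column layout, which can vary between Replicate versions):

SELECT TOP (50) *
FROM dbo.attrep_apply_exceptions;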

The cause of the problem

Our code can be paraphrased like this:

CREATE TABLE dbo.TEST_DESTINATION
(
	ACCOUNT_ID int NOT NULL,
	DATA_1 varchar(100) NULL,
	DATA_2 varchar(100) NULL,
	PRIMARY KEY CLUSTERED 
	(
		ACCOUNT_ID ASC
	)
);

GO

CREATE TABLE dbo.TEST_MERGE_TABLE
(
	ACCOUNT_ID int NOT NULL,
	DATA_1 varchar(100) NULL,
	DATA_2 varchar(100) NULL,
	DATA_3 varchar(100) NULL,
	PRIMARY KEY CLUSTERED 
	(
		ACCOUNT_ID ASC
	)
)

GO

CREATE OR ALTER TRIGGER dbo.TR_TEST_DESTINATION__INSERT
ON dbo.TEST_DESTINATION
AFTER INSERT 
AS
	INSERT INTO dbo.TEST_MERGE_TABLE
	SELECT
		ACCOUNT_ID,
		DATA_1,
		DATA_2,
		'TRIGGER INSERT' AS DATA_3
	FROM INSERTED;

GO

CREATE OR ALTER TRIGGER [dbo].[TR_TEST_DESTINATION__UPDATE]
ON [dbo].[TEST_DESTINATION]
AFTER UPDATE 
AS
	UPDATE dbo.TEST_MERGE_TABLE
	SET ACCOUNT_ID = X.ACCOUNT_ID,
		DATA_1 = X.DATA_1,
		DATA_2 = X.DATA_2,
		DATA_3 = 'TRIGGER DATA'
	FROM dbo.TEST_MERGE_TABLE T
	JOIN INSERTED X
	ON X.ACCOUNT_ID = T.ACCOUNT_ID
	WHERE
		1 = 0;  -- In the real trigger this filter can legitimately match zero rows; we force that case here with 1 = 0

GO

The problem is in how the trigger TR_TEST_DESTINATION__UPDATE behaves when its UPDATE affects 0 rows.  This can be a legitimate occurrence, depending on the join inside the trigger.

If I run a simple update like:

UPDATE dbo.TEST_DESTINATION
SET DATA_1 = 'Trigger Update'
WHERE
	ACCOUNT_ID = 1;

The SQL engine returns:

(0 row(s) affected)    -- Returned from the trigger

(1 row(s) affected)    -- Returned from updating dbo.TEST_DESTINATION

My theory is that, when Qlik Replicate reads the row counts returned from executing the SQL statement on the destination server, it only considers the first count when deciding whether the change succeeded.  Since the first count is the trigger’s output of 0 rows affected, Qlik considers the update a failure and therefore logs it to the attrep_apply_exceptions table.

Apart from this being incorrect – the trigger code is logically working correctly – Qlik Replicate must make another round trip to write to the exception table.  This resulted in drastically increased latency.

Fixing the issue

The fix (once the problem is known) is relatively straightforward: any row counts returned from the trigger need to be suppressed. For example:

CREATE OR ALTER TRIGGER [dbo].[TR_TEST_DESTINATION__UPDATE]
ON [dbo].[TEST_DESTINATION]
AFTER UPDATE 
AS
BEGIN
	SET NOCOUNT ON;  -- Suppress returning the row count

	UPDATE dbo.TEST_MERGE_TABLE
	SET ACCOUNT_ID = X.ACCOUNT_ID,
		DATA_1 = X.DATA_1,
		DATA_2 = X.DATA_2,
		DATA_3 = 'TRIGGER DATA'
	FROM dbo.TEST_MERGE_TABLE T
	JOIN INSERTED X
	ON X.ACCOUNT_ID = T.ACCOUNT_ID
	WHERE
		1 = 0;  -- As above, the real filter can legitimately match zero rows
END

When the update statement is run again, the following is returned:

(1 row(s) affected)

Qlik Replicate will now consider the update as a success and not log it as an exception.

Qlik Replicate – The saga of replicating to AWS Part 2 – Lions and Tigers and Primary Keys (Oh My!)

The Story so far:

For those who are following from home:

We deduced that replicating through the AWS RDS proxy was a bad idea for performance; but even going directly to the AWS RDS Postgres database from our on-prem system did not give us any noticeable performance gains.

It was time to look in a different area.

Raw SQL

In the squad investigating the problem, someone asked, “What SQL is Qlik running on the Postgres server?  Is it poorly optimised?”

To find out what Qlik Replicate was running, I bumped the TARGET_APPLY logging up to Verbose, let transactions run through for a couple of minutes and then assessed the log file.  Since 99% of the changes on the source database are updates (we know this from a production job running against the same source), our logs were full of update statements like this:

UPDATE dbo.my_dest_table
SET	key_1 = ?,
	key_2 = ?,
	key_3 = ?,
	field_1 = ?,
	field_2 = ?,
	field_3 = ?
WHERE
	key_1 = ? AND
	key_2 = ? AND
	key_3 = ?;

The first thing I noticed was that Qlik was updating ALL the columns, even if only one column’s data had changed on the source database.

This also included the Primary key.

Primary keys are computationally expensive to update compared to regular data columns.  As an example, I wrote a simple POC in T-SQL:

/*
CREATE TABLE dbo.my_dest_table
(
	key_1 SMALLDATETIME,
	key_2 INT,
	key_3 VARCHAR(10),

	field_1 INT,
	field_2 VARCHAR(256),
	field_3 SMALLDATETIME,
	PRIMARY KEY(key_1, key_2, key_3)
)
*/

/*
UPDATE dbo.my_dest_table
SET	key_1 = ?,
	key_2 = ?,
	key_3 = ?,
	field_1 = ?,
	field_2 = ?,
	field_3 = ?
WHERE
	key_1 = ? AND
	key_2 = ? AND
	key_3 = ?;
*/

/*
INSERT INTO dbo.my_dest_table VALUES ('2024-01-01', 1, 'key 3', 1, 2, '2023-03-03');
INSERT INTO dbo.my_dest_table VALUES ('2024-01-01', 2, 'key 3', 2, 3, '2024-04-04');
*/

SET STATISTICS TIME ON;

UPDATE dbo.my_dest_table
SET key_1 = '2024-01-01',
	key_2 = 1,
	key_3 = 'key 3',

	field_1 = 10,
	field_2 = 2,
	field_3 = '2023-03-03'
WHERE
	key_1 = '2024-01-01' AND
	key_2 = 1 AND
	key_3 = 'key 3';
/*
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 41 ms.  <------- Oh dear

(1 row affected)
*/

UPDATE dbo.my_dest_table
SET field_1 = 10,
	field_2 = 3,
	field_3 = '2024-04-04'
WHERE
	key_1 = '2024-01-01' AND
	key_2 = 2 AND
	key_3 = 'key 3';
/*
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 6 ms.

(1 row affected)
*/

As demonstrated, updating the PK columns (even with the same data) was roughly seven times more expensive (41 ms vs 6 ms) than updating just the other data columns.

Asking for help

I asked for help on the Qlik Community forum.  I got a response back from the specialists that there is a setting in Qlik’s endpoints that stops it from updating the primary key.

The setting is:

$info.query_syntax.pk_segments_not_updateable

Set to TRUE

I made this change in the endpoint, restarted the task and captured a few more update SQL samples. With the setting changed, the generated updates no longer touched the PK columns.
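
Paraphrasing the captured statements after the change (reusing the same anonymised table as above), the key columns drop out of the SET list and remain only in the WHERE clause:

UPDATE dbo.my_dest_table
SET	field_1 = ?,
	field_2 = ?,
	field_3 = ?
WHERE
	key_1 = ? AND
	key_2 = ? AND
	key_3 = ?;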

BUT…

This might be a helpful setting to note for the future, but in this case it did not really help us.  After making the change we saw no noticeable difference in our TPS throughput, which increasingly confirmed it was a network issue.

I could also see other problems with this approach.

The source system we are reading from is exceptionally large, has been entrenched in our organisation for 30 years, and there is constant work being done in the background as other core systems are merged into it.  We, as a downstream user, cannot guarantee the rules around updating primary keys.  The gut feeling from day-to-day operation is that once the PKs are set, they are set in stone in the database.  But to base the data integrity of critical reporting on a “gut feeling” is not a good data integrity model.

Who is to say that the owners of the source system won’t update a PK to fix an incident, or that a new source system consolidation won’t include PK updates in its steps?  Our organisation’s change management is not tight and detailed enough for us to be forewarned of changes like this.

If data integrity were not so critical, we could run a full refresh periodically to reset the data to a baseline and capture any PK updates.  But this would require significant rework of the solution to manage these periodic refreshes.

So, we did learn.  We did assess.  But this didn’t help us out.

SQL: Quick and dirty result set comparison

When doing regression testing, I need to prove that the two versions of a query I have changed return the same result set.

If the result sets are on the same database, you can use the MINUS operator (EXCEPT in T-SQL).
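
As a minimal sketch (the table names here are made up for illustration), run the comparison in both directions so you catch rows missing from either side:

-- Rows in A that are not in B (EXCEPT in T-SQL, MINUS in Oracle)
SELECT * FROM dbo.RESULT_SET_A
EXCEPT
SELECT * FROM dbo.RESULT_SET_B;

-- And the reverse: rows in B that are not in A
SELECT * FROM dbo.RESULT_SET_B
EXCEPT
SELECT * FROM dbo.RESULT_SET_A;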

If not – it can be a pain to shift result sets from one database to another, especially if they are large.

For quick and dirty testing, I like using the T-SQL function CHECKSUM and the PL/SQL function ORA_HASH.  They give you quick results to test whether a column of data has a high probability of being the same as another column.

To test a result set, I have the following Excel formulas saved in my “Handy formulas” notebook:

TSQL
=",SUM(CAST(CHECKSUM("&A1&") AS BIGINT)) AS "&A1
PL/SQL
=",SUM(ORA_HASH("&A1&")) AS "&A1

With a list of column names, you can then build a checksum SQL statement from Excel.
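
The generated statement ends up looking something like this (I am reusing the columns from the TEST_DESTINATION table above purely as an example; the leading commas come straight from the Excel formula):

-- T-SQL side: CHECKSUM is cast to BIGINT so the SUM does not overflow
SELECT
	 SUM(CAST(CHECKSUM(ACCOUNT_ID) AS BIGINT)) AS ACCOUNT_ID
	,SUM(CAST(CHECKSUM(DATA_1) AS BIGINT)) AS DATA_1
	,SUM(CAST(CHECKSUM(DATA_2) AS BIGINT)) AS DATA_2
FROM dbo.TEST_DESTINATION;

-- Oracle side: the same idea with ORA_HASH
SELECT
	 SUM(ORA_HASH(ACCOUNT_ID)) AS ACCOUNT_ID
	,SUM(ORA_HASH(DATA_1)) AS DATA_1
	,SUM(ORA_HASH(DATA_2)) AS DATA_2
FROM TEST_DESTINATION;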

Running this query on both result sets, differences can be found by checking whether the returned values differ.