{"id":1405,"date":"2024-05-27T23:09:11","date_gmt":"2024-05-27T23:09:11","guid":{"rendered":"https:\/\/craftcookcode.com\/?p=1405"},"modified":"2026-01-13T21:40:07","modified_gmt":"2026-01-13T21:40:07","slug":"qlik-replicate-the-saga-of-replicating-to-aws-part-5-is-ms-sql-the-answer","status":"publish","type":"post","link":"https:\/\/craftcookcode.com\/?p=1405","title":{"rendered":"Qlik Replicate \u2013 The saga of replicating to AWS Part 5 &#8211; Is MS-SQL the answer?"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">For those who are following along at home&#8230;<\/h2>\n\n\n\n<p>We have been toiling on replicating to AWS Postgres RDS with Qlik Replicate for the past two months; trying to achieve a baseline of 300tps.<\/p>\n\n\n\n<p>After many suggestions, tuning and tweaks, conference calls, benchmarks learning experiences, prayers; we couldn\u2019t get our tps close to our baseline.<\/p>\n\n\n\n<p>You can read the main findings in the following pages:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/craftcookcode.com\/?p=1241\" title=\"\">Intro<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/craftcookcode.com\/?p=1434\" title=\"\">Part 1 &#8211; pg_bench<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/craftcookcode.com\/?p=1381\" title=\"\">Part 2 &#8211; Primary Keys<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/craftcookcode.com\/?p=1388\" title=\"\">Part 3 &#8211; Wireshark<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/craftcookcode.com\/?p=1402\" title=\"\">Part 4 &#8211; Stream size<\/a><\/li>\n<\/ul>\n\n\n\n<p>Things that were suggested to us that I did not add to this blog series:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Try Async commits on RDS Postgres.&nbsp; Tried it and got negligible increases.<\/li>\n\n\n\n<li>Shifting the QR server to AWS.&nbsp; Bad idea as there will be even more traffic from the busy DB2 database going across for QR to consume.<\/li>\n\n\n\n<li>Use Amazon Aurora instead of RDS.&nbsp; The downstream developers did not have the appetite to try Aurora; especially with the issues leaning towards network speed.<\/li>\n\n\n\n<li>Use GCP version of Postgres instead of AWS.&nbsp; The downstream developers did not want to commit to another cloud provider.<\/li>\n<\/ul>\n\n\n\n<p>The problem is how the network connectivity behaves with the Postgres ODBC and the round trips it must do between our location and the AWS data centre.&nbsp; We can try \u2013 but we are bound by the laws of physics and the speed of light.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Decisions to be made.<\/h2>\n\n\n\n<p>All though our benchmarking and investigation; we have been replicating to a Development MS-SQL database in the data centre as the DB2 database in parallel to give us an idea of what speed we could potentially reach.&nbsp; Without triggers on the MS-SQL destination table; we were easily hitting 300tps.&nbsp; Ramping the changes up; we can hold at 1K tps with no creep in latency.&nbsp;<\/p>\n\n\n\n<p>We were happy with these results; especially with the MS-SQL database was just a small underpowered shared Dev machine; not a full-blown dedicated server.<\/p>\n\n\n\n<p>It took a brave solution architect to propose that we shift from AWS RDS Postgres to an on prem MS-SQL server; especially when our senior management strategy is to push everything to the cloud to reduce the number of on prem servers.<\/p>\n\n\n\n<p>In the end with all our evidence on the performance and the project\u2019s willingness to push on with the proposed solution, the solution stakeholders agreed to move the destination to an on prem database.<\/p>\n\n\n\n<p>They initially wanted us to go with a on prem Postgres database; but since all our Database Administrators are either Oracle or MS-SQL experts and we have no Postgres experts \u2013 we went to good old MS-SQL.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">It worked; but\u2026damn triggers.<\/h2>\n\n\n\n<p>I volunteered to convert the Postgres SQL code into T-SQL as I have worked with T-SQL for the past decade.&nbsp; The conversion went smoothly, and I took the opportunity to optimise several sections of the code to make the solution more maintainable and to run faster.<\/p>\n\n\n\n<p>With our new MS-SQL database all coded up and the triggers turned off; the SVT (stress and volume testing) ran at the TPS for which we were aiming.<\/p>\n\n\n\n<p>But when we turned on the triggers; the performance absolutely crashed.<\/p>\n\n\n\n<p>I was mortified \u2013 was it my coding shot and the additional changes that I made the performance worse?<\/p>\n\n\n\n<p>I checked the triggers. &nbsp;I checked the primary keys and the joins.&nbsp; I checked the horizontal partitioning.&nbsp; I checked the database server stats for CPU and memory usage.&nbsp;<\/p>\n\n\n\n<p>Nothing \u2013 could not locate the performance problem.<\/p>\n\n\n\n<p>I went back to Qlik Replicate and examined the log files.<\/p>\n\n\n\n<p>Ahh &#8211; here is something.  The log file was full of entries like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>00012472: 2024-05-27T16:03:54 &#91;TARGET_APPLY ]W: Source changes that would have had no impact were not applied to the target database. Refer to the 'attrep_apply_exceptions' table for details (endpointshell.c:7632)<\/code><\/pre>\n\n\n\n<p>Looking inside the attrep_apply_exceptions there corresponding entries like:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nUPDATE &#x5B;dbo].&#x5B;TEST_DESTINATION] \nSET\t&#x5B;ACCOUNT_ID] = 2,\n\t&#x5B;DATA_1] = &#039;Updated&#039;, \n\t&#x5B;DATA_2] = &#039;Data&#039; \nWHERE \n\t&#x5B;ACCOUNT_ID] = 2;\n\t-- 0 rows affected\n<\/pre><\/div>\n\n\n<p>Which was confusing; I checked the destination table, and the update was applied.&nbsp; Why was this update deemed a failure and logged to the attrep_apply_exceptions table?  It must be an error in the trigger.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The cause of the problem<\/h2>\n\n\n\n<p>Our code can be paraphrased like:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nCREATE TABLE dbo.TEST_DESTINATION\n(\n\tACCOUNT_ID int NOT NULL,\n\tDATA_1 varchar(100) NULL,\n\tDATA_2 varchar(100) NULL,\n\tPRIMARY KEY CLUSTERED \n\t(\n\t\tACCOUNT_ID ASC\n\t)\n);\n\nGO\n\nCREATE TABLE dbo.TEST_MERGE_TABLE\n(\n\tACCOUNT_ID int NOT NULL,\n\tDATA_1 varchar(100) NULL,\n\tDATA_2 varchar(100) NULL,\n\tDATA_3 varchar(100) NULL,\n\tPRIMARY KEY CLUSTERED \n\t(\n\t\tACCOUNT_ID ASC\n\t)\n)\n\nGO\n\nCREATE OR ALTER TRIGGER dbo.TR_TEST_DESTINATION__INSERT\nON dbo.TEST_DESTINATION\nAFTER INSERT \nAS\n\tINSERT INTO dbo.TEST_MERGE_TABLE\n\tSELECT\n\t\tACCOUNT_ID,\n\t\tDATA_1,\n\t\tDATA_2,\n\t\t&#039;TRIGGER INSERT&#039; AS DATA_3\n\tFROM INSERTED;\n\nGO\n\nCREATE OR ALTER TRIGGER &#x5B;dbo].&#x5B;TR_TEST_DESTINATION__UPDATE]\nON &#x5B;dbo].&#x5B;TEST_DESTINATION]\nAFTER UPDATE \nAS\n\tUPDATE dbo.TEST_MERGE_TABLE\n\tSET ACCOUNT_ID = X.ACCOUNT_ID,\n\t\tDATA_1 = X.DATA_1,\n\t\tDATA_2 = X.DATA_2,\n\t\tDATA_3 = &#039;TRIGGER DATA&#039;\n\tFROM dbo.TEST_MERGE_TABLE T\n\tJOIN INSERTED X\n\tON X.ACCOUNT_ID = T.ACCOUNT_ID\n\tWHERE\n\t\t1 = 0;  -- This predicate can be either true or false.  For example we set it false\n\nGO\n<\/pre><\/div>\n\n\n<p>The problem is in how the trigger TR_TEST_DESTINATION__UPDATE behaves if it returns 0 rows.&nbsp; This can be a legitimate occurrence depending on a join in the trigger.<\/p>\n\n\n\n<p>If I run a simple update like:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nUPDATE dbo.TEST_DESTINATION\nSET DATA_1 = &#039;Trigger Upate&#039;\nWHERE\n\tACCOUNT_ID = 1;\n<\/pre><\/div>\n\n\n<p>The SQL engine returns:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>(0 row(s) affected)    -- Returned from the trigger\n\n(1 row(s) affected)    -- Returned from updating dbo.TEST_DESTINATION<\/code><\/pre>\n\n\n\n<p>My theory is that Qlik Replicate when reading the rows returned from executing the SQL statement on the destination server; only considers the first row to determine if the change was a success or not.&nbsp; Since the first row is an output from the trigger with 0 rows affected; Qlik considers that the update was a failure and therefore logs it into the attrep_apply_exceptions table.  <\/p>\n\n\n\n<p>Apart from this been incorrect as the trigger code is logically working correctly; Qlik Replicate must make another trip to write to the exception table.&nbsp; This resulted in drastically increased latency.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Fixing the issue<\/h2>\n\n\n\n<p>The fix (once the problem is known) is relatively straight forward.  Any rows returned needs to be supressed from the trigger.  For example:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nCREATE OR ALTER TRIGGER &#x5B;dbo].&#x5B;TR_TEST_DESTINATION__UPDATE]\nON &#x5B;dbo].&#x5B;TEST_DESTINATION]\nAFTER UPDATE \nAS\nBEGIN\n\tSET NOCOUNT ON;  -- Supress returning the row count\n\n\tUPDATE dbo.TEST_MERGE_TABLE\n\tSET ACCOUNT_ID = X.ACCOUNT_ID,\n\t\tDATA_1 = X.DATA_1,\n\t\tDATA_2 = X.DATA_2,\n\t\tDATA_3 = &#039;TRIGGER DATA&#039;\n\tFROM dbo.TEST_MERGE_TABLE T\n\tJOIN INSERTED X\n\tON X.ACCOUNT_ID = T.ACCOUNT_ID\n\tWHERE\n\t\t1 = 0;  -- This predicate can be either true or false\nEND\n<\/pre><\/div>\n\n\n<p>When the update statement is run again; the following is returned:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>(1 row(s) affected)<\/code><\/pre>\n\n\n\n<p>Qlik Replicate will now consider the update as a success and not log it as an exception.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For those who are following along at home&#8230; We have been toiling on replicating to AWS Postgres RDS with Qlik Replicate for the past two months;&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":1247,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[16],"tags":[36,53],"class_list":["post-1405","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-qlik-replicate","tag-qlikreplicate","tag-sql"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1405","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1405"}],"version-history":[{"count":6,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1405\/revisions"}],"predecessor-version":[{"id":1572,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1405\/revisions\/1572"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/media\/1247"}],"wp:attachment":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1405"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1405"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1405"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}