{"id":1560,"date":"2024-11-25T22:09:34","date_gmt":"2024-11-25T22:09:34","guid":{"rendered":"https:\/\/craftcookcode.com\/?p=1560"},"modified":"2024-11-25T22:09:35","modified_gmt":"2024-11-25T22:09:35","slug":"qlik-replicate-oh-oracle-youre-a-fussy-beast","status":"publish","type":"post","link":"https:\/\/craftcookcode.com\/?p=1560","title":{"rendered":"Qlik Replicate: Oh Oracle &#8211; you&#8217;re a fussy beast"},"content":{"rendered":"\n<p>It&#8217;s all fun and games &#8211; until Qlik Replicate must copy 6 billion rows from a very wide Oracle table to GCS\u2026<\/p>\n\n\n\n<p>\u2026in a small time window<\/p>\n\n\n\n<p>\u2026with the project not wanting to perform Stress and Volume testing<\/p>\n\n\n\n<p>Oh boy.<\/p>\n\n\n\n<p>Our Dev environment had 108 milling rows to play with, which ran \u201cquick\u201d in relationship the amount of data it had to copy.&nbsp; But being 33 times smaller; even if it takes an hour in Dev \u2013 extrapolating the time out will relate to over 30 hours of run time.<\/p>\n\n\n\n<p>The project forged ahead in the implementation and QR only processed 2% of the changes before we ran out of the time window.<\/p>\n\n\n\n<p>&nbsp;The QR servers didn\u2019t seem stressed performance wise; had plenty of CPU and RAM.&nbsp; I suspect the bottle neck was in the bandwidth going out to GCS; but there was no way to monitor how much of the connection has been used.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">When in doubt &#8211; change the file type<\/h2>\n\n\n\n<p>After the failed implementation, we tried to work out how we can improve the throughput to GCS in our dev environment.<\/p>\n\n\n\n<p>I thought changing the destination\u2019s file type might be a way.\u00a0 JSON is a chunky file format, and my hypothesis was if the JSON was compressed it would transfer to GCS quicker.\u00a0 We tested out a NULL connector, raw JSON, GZIP JSON and Parquet.\u00a0 As a test using Dev \u2013 we let a test task run for 20min to see how much data is copied across.<\/p>\n\n\n\n<p>Full Load Tuning:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Transaction consistency timeout (seconds): 600<\/li>\n\n\n\n<li>Commit rate during full load: 100000<\/li>\n<\/ul>\n\n\n\n<p>Endpoint settings:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Maximum file size(KB): 1000000 KB (1GB)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Results<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"622\" src=\"http:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_2-1024x622.png\" alt=\"\" class=\"wp-image-1564\" srcset=\"https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_2-1024x622.png 1024w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_2-300x182.png 300w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_2-768x466.png 768w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_2-1536x932.png 1536w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_2.png 1939w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Unfortunately, my hypothesis on compressed JSON was incorrect.\u00a0 We speculated that compressing the JSON might have been taking up as much time as transferring it.\u00a0 I would have like to test this theory on a quieter QR server, but time is of the essence.<\/p>\n\n\n\n<p>Parquet seemed to be the winner with the limited testing offering a nice little throughput boost over the JSON formats.\u00a0 But it wasn\u2019t the silver bullet to our throughput problems.  Added onto this; the downstream users would need to spend time modifying their ingestion pipelines.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Divide and conquer \u2013 until Oracle says no.<\/h2>\n\n\n\n<p>The next stage was to look if we could divide the table up into batches and transfer across section at a time.&nbsp; Looking at the primary key; it was an identity column that had little meaningful relation to easily divide up into batches.<\/p>\n\n\n\n<p>There was another indexed column called RUN_DATE; which is a date relation to when the record was entered.<\/p>\n\n\n\n<p>OK \u2013 let\u2019s turn on <a href=\"https:\/\/community.qlik.com\/t5\/Official-Support-Articles\/Filter-for-last-90-days-of-data-in-Qlik-Replicate\/ta-p\/1880761\" target=\"_blank\" rel=\"noopener\" title=\"\">Passthrough filtering<\/a> and test it out.<\/p>\n\n\n\n<p>First of all to test the syntax out in <a href=\"https:\/\/www.oracle.com\/au\/database\/sqldeveloper\/\" target=\"_blank\" rel=\"noopener\" title=\"\">SQL Developer<\/a><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nSELECT COUNT(*)\nFROM xxxxx.TRANSACTIONS\nWHERE\n    RUN_DATE &gt;= &#039;01\/Jan\/2023&#039; AND\n    RUN_DATE &lt; &#039;01\/Jan\/2024&#039;;\n\n<\/pre><\/div>\n\n\n<p>The query ran fine meaning that the date syntax was right.<\/p>\n\n\n\n<p>Looking good &#8211; let&#8217;s add the filter to the <em>Full Load Passthru Filter<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"563\" src=\"https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_3-1024x563.png\" alt=\"\" class=\"wp-image-1567\" srcset=\"https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_3-1024x563.png 1024w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_3-300x165.png 300w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_3-768x422.png 768w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_3-1536x845.png 1536w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2024\/11\/code_qr_vs_oracle_3.png 1626w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>But when running the task; it goes into &#8220;recoverable error&#8221; mode.<\/p>\n\n\n\n<p>Looking into the logs:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n00014204: 2024-11-26T08:38:26:184700 &#x5B;SOURCE_UNLOAD   ]T:  Select statement for UNLOAD is &#039;SELECT &quot;PK_ID&quot;,&quot;RUN_DATE&quot;, &quot;LOTS&quot;, &quot;OF&quot;, &quot;OTHER, &quot;COLUMNS&quot;  FROM &quot;xxxxx&quot;.&quot;TRANSACTIONS&quot; WHERE (RUN_DATE &gt;= &#039;01\/Jan\/2023&#039; AND RUN_DATE &amp;lt; &#039;01\/Jan\/2024&#039;)&#039;  (oracle_endpoint_utils.c:1941)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]T:  ORA-01858: a non-numeric character was found where a numeric was expected  &#x5B;1020417]  (oracle_endpoint_unload.c:175)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]T:  Failed to init unloading table &#039;xxxxx&#039;.&#039;TRANSACTIONS&#039; &#x5B;1020417]  (oracle_endpoint_unload.c:385)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]E:  ORA-01858: a non-numeric character was found where a numeric was expected  &#x5B;1020417]  (oracle_endpoint_unload.c:175)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]E:  Failed to init unloading table &#039;xxxxx&#039;.&#039;TRANSACTIONS&#039; &#x5B;1020417]  (oracle_endpoint_unload.c:385)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]T:  Error executing source loop &#x5B;1020417]  (streamcomponent.c:1942)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]T:  Stream component &#039;st_1_SRC_DEV_B1_xxxxx&#039; terminated &#x5B;1020417]  (subtask.c:1643)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]T:  Free component st_1_SRC_DEV_B1_xxxxx  (oracle_endpoint.c:51)\n00011868: 2024-11-26T08:38:26:215961 &#x5B;TASK_MANAGER    ]I:  Task error notification received from subtask 1, thread 0, status 1020417  (replicationtask.c:3603)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]E:  Error executing source loop &#x5B;1020417]  (streamcomponent.c:1942)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;TASK_MANAGER    ]E:  Stream component failed at subtask 1, component st_1_SRC_DEV_B1_xxxxx  &#x5B;1020417]  (subtask.c:1474)\n00014204: 2024-11-26T08:38:26:215961 &#x5B;SOURCE_UNLOAD   ]E:  Stream component &#039;st_1_SRC_DEV_B1_xxxxx&#039; terminated &#x5B;1020417]  (subtask.c:1643)\n00011868: 2024-11-26T08:38:26:231570 &#x5B;TASK_MANAGER    ]W:  Task &#039;TEST_xxxxx&#039; encountered a recoverable error  (repository.c:6200)\n<\/pre><\/div>\n\n\n<p>Error code <strong>ORA-01858<\/strong> seems to be the key to the problem.  As an experiment I copied out the select code and ran it into SQL Developer.<\/p>\n\n\n\n<p>Works fine \ud83d\ude41<\/p>\n\n\n\n<p>OK &#8211; maybe it is a quirk of SQL Developer?  <\/p>\n\n\n\n<p>Using sqlplus I ran the same code from the command line.<\/p>\n\n\n\n<p>Again works fine \ud83d\ude41<\/p>\n\n\n\n<p>Resorting to good old Google &#8211; I searched ORA-01858.<\/p>\n\n\n\n<p>The top hit was this article from <a href=\"https:\/\/stackoverflow.com\/questions\/29852389\/getting-error-ora-01858-a-non-numeric-character-was-found-where-a-numeric-was\" target=\"_blank\" rel=\"noopener\" title=\"\">Stack Overflow<\/a> that recommended confirming the format of the date with the TO_DATE function.<\/p>\n\n\n\n<p>OK Oracle; if you want to be fussy with your dates &#8211; let&#8217;s explicitly define the date format with TO_DATE in the <em>Full Load Passthru Filter<\/em>.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nRUN_DATE &gt;= TO_DATE(&#039;01\/Jan\/2023&#039;,&#039;DD\/Mon\/YYYY&#039;) AND RUN_DATE &lt; TO_DATE(&#039;01\/Jan\/2024&#039;,&#039;DD\/Mon\/YYYY&#039;)\n<\/pre><\/div>\n\n\n<p>Ahhhh &#8211; that works better and Qlik Replicate now runs successfully with the passthrough filter.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>I tried a different set of date formats; including an ISO date format and Oracle spat them all out.  So using TO_DATE is the simplest way to avoid the <strong>ORA-01858<\/strong> error.  I can understand Oracle refusing to run on a date like 03\/02\/2024; I mean is it the 3rd of Feb 2024; or for Americans the 2nd of Mar 2024? But surprised something very clear like an ISO date format; or 03\/Feb\/2024 did not work.<\/p>\n\n\n\n<p>Maybe how SQL Developer and SQLplus interacts with the database is different than QR that leads to the different in behaviour of how filters on dates work.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s all fun and games &#8211; until Qlik Replicate must copy 6 billion rows from a very wide Oracle table to GCS\u2026 \u2026in a small time&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":1561,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[73,16],"tags":[74,27,36],"class_list":["post-1560","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oracle","category-qlik-replicate","tag-ora-01858","tag-oracle","tag-qlikreplicate"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1560","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1560"}],"version-history":[{"count":5,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1560\/revisions"}],"predecessor-version":[{"id":1569,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1560\/revisions\/1569"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/media\/1561"}],"wp:attachment":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1560"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1560"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1560"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}