{"id":833,"date":"2022-07-29T10:44:26","date_gmt":"2022-07-29T10:44:26","guid":{"rendered":"https:\/\/craftcookcode.com\/?p=833"},"modified":"2024-07-10T02:08:53","modified_gmt":"2024-07-10T02:08:53","slug":"qlik-replicate-the-wide-text-file-problem","status":"publish","type":"post","link":"https:\/\/craftcookcode.com\/?p=833","title":{"rendered":"Qlik Replicate &#8211; The Wide Text file problem"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\">A wide problem<\/h3>\n\n\n\n<p>We had a new business requirement for our team to create a new source for a CSV through Qlik Replicate.<\/p>\n\n\n\n<p>OK &#8211; simple enough; create the Source Connector and define the file&#8217;s schema in the table section of the connector.<\/p>\n\n\n\n<p>But the file was 250 columns wide.<\/p>\n\n\n\n<p>I had nightmares before of defining file schemas before in programs like SSIS where one little mistake can lead to hours of debugging.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">A stroke of Luck<\/h3>\n\n\n\n<p>Luckily we knew the source of the CSV file was an output from a view of a database; so the first part of the problem was solved &#8211; we had the schema.&nbsp; Now to import the schema into a Qlik Replicate task.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Create a new File Source connector and prefill as much known information as you can<\/li>\n\n\n\n<li>In the Tables definition of the File Source; add a dummy field<\/li>\n\n\n\n<li>Create a NULL Target connector<\/li>\n\n\n\n<li>Create a dummy task with your new file source and the Null target and save it<\/li>\n\n\n\n<li><em>Export task with End Points<\/em> to save out the json definition of the task<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Creating the table definition in JSON<\/h3>\n\n\n\n<p>Knowing the view definition; a simple python script can convert it to json.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nimport csv\nimport json\n\nif __name__ == &#039;__main__&#039;:\n\u00a0 final_dic = {}\n\u00a0 source_file = &quot;FileDef.txt&quot;\n\u00a0 field_array = &#x5B;]\n\n\u00a0 \u00a0 with open(source_file, &#039;r&#039;) as f:\n\u00a0 \u00a0 \u00a0 reader = csv.reader(f, delimiter=&#039;t&#039;)\n\u00a0 \u00a0 \u00a0 schema_items = list(reader)\n\n\u00a0 \u00a0 \u00a0 \u00a0 for item in schema_items:\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item = {}\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item&#x5B;&quot;name&quot;] = item&#x5B;0]\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item&#x5B;&quot;nullable&quot;] = True\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_string = item&#x5B;1]\n\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 if item&#x5B;1].find(&quot;NUMBER&quot;) != -1:\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item&#x5B;&quot;type&quot;] = &quot;kAR_DATA_TYPE_NUMERIC&quot;\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 tokens = focus_string&#x5B;focus_string.find(&quot;(&quot;) + 1: -1].split(&quot;,&quot;)\n\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 precision = tokens&#x5B;0]\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item&#x5B;&quot;precision&quot;] = int(precision)\n\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 if len(tokens) == 2:\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 scale = tokens&#x5B;1]\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item&#x5B;&quot;scale&quot;] = int(scale)\n\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 elif item&#x5B;1].find(&quot;VARCHAR2&quot;) != -1: \u00a0#VARCHAR2\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item&#x5B;&quot;type&quot;] = &quot;kAR_DATA_TYPE_STR&quot;\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 length = focus_string&#x5B;focus_string.find(&quot;(&quot;) + 1 : focus_string.find(&quot;BYTE&quot;) - 1]\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item&#x5B;&quot;length&quot;] = int(length)\n\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 elif item&#x5B;1].find(&quot;DATE&quot;) != -1:\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 focus_item&#x5B;&quot;type&quot;] = &quot;kAR_DATA_TYPE_TIMESTAMP&quot;\n\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 field_array.append(focus_item)\n\n\u00a0 \u00a0 \u00a0 \u00a0 columns = {}\n\u00a0 \u00a0 \u00a0 columns&#x5B;&quot;columns&quot;] = field_array\n\n\u00a0 \u00a0 f = open(&quot;out.json&quot;, &quot;w&quot;)\n\u00a0 f.write(json.dumps(columns, indent=4))\n\u00a0 f.close()\n<\/pre><\/div>\n\n\n<h4 class=\"wp-block-heading\">FileDef.txt<\/h4>\n\n\n\n<p>The following text file contains the schema of the view with the table delimited columns of:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Column Name<\/li>\n\n\n\n<li>Data type<\/li>\n\n\n\n<li>Nullable<\/li>\n\n\n\n<li>Column order<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-preformatted\">UDF_VARCHAR1 VARCHAR2(200 BYTE) Yes 1<br>UDF_VARCHAR2 VARCHAR2(200 BYTE) Yes 2<br>UDF_VARCHAR3 VARCHAR2(200 BYTE) Yes 3<br>UDF_NUMERIC1 NUMBER(10,6) Yes 4<br>UDF_NUMERIC2 NUMBER(10,6) Yes 5<br>UDF_NUMERIC3 NUMBER(10,6) Yes 6<br>UDF_INT1 NUMBER(10,0) Yes 7<br>UDF_INT2 NUMBER(10,0) Yes 8<br>UDF_INT3 NUMBER(10,0) Yes 9<br>UDF_DATE1 DATE Yes 10<br>UDF_DATE2 DATE Yes 11<br>UDF_DATE3 DATE Yes 12<\/pre>\n\n\n\n<p>This particular view definition is from Oracle so uses oracle data types; but it would be simple to change the code over to use a definition from a MS SQL database or other source.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Sticking it all back together<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open up the exported task file json<\/li>\n\n\n\n<li>Find the dummy field that you created in the table definition<\/li>\n\n\n\n<li>Overwrite the dummy field with the json created in the script file<\/li>\n\n\n\n<li>Save and reimport the job.<\/li>\n<\/ol>\n\n\n\n<p>If all goes well; the File source connector will be overwritten with the full table definition.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A wide problem We had a new business requirement for our team to create a new source for a CSV through Qlik Replicate. OK &#8211; simple&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[13,16],"tags":[],"class_list":["post-833","post","type-post","status-publish","format-standard","hentry","category-code","category-qlik-replicate"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/833","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=833"}],"version-history":[{"count":5,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/833\/revisions"}],"predecessor-version":[{"id":1446,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/833\/revisions\/1446"}],"wp:attachment":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=833"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=833"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=833"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}