{"id":1655,"date":"2025-08-06T06:23:01","date_gmt":"2025-08-06T06:23:01","guid":{"rendered":"https:\/\/craftcookcode.com\/?p=1655"},"modified":"2025-08-07T00:42:32","modified_gmt":"2025-08-07T00:42:32","slug":"qlik-replicate-stripping-latency-out-of-log-files","status":"publish","type":"post","link":"https:\/\/craftcookcode.com\/?p=1655","title":{"rendered":"Qlik Replicate: Stripping latency out of log files"},"content":{"rendered":"\n<p>We have been doing a bit of &#8220;Stress and Volume&#8221; testing in Qlik Replicate over the past few days, investigating how much latency is introduced to an MS SQL Server task if we run it through a log stream.<\/p>\n\n\n\n<p>If you&#8217;re not aware &#8211; you can get regular latency readings from Qlik Replicate by turning the &#8220;Performance&#8221; logging up to &#8220;Trace&#8221; or higher:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"572\" height=\"1024\" src=\"https:\/\/craftcookcode.com\/wp-content\/uploads\/2025\/08\/code_latency_1-572x1024.png\" alt=\"\" class=\"wp-image-1656\" srcset=\"https:\/\/craftcookcode.com\/wp-content\/uploads\/2025\/08\/code_latency_1-572x1024.png 572w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2025\/08\/code_latency_1-168x300.png 168w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2025\/08\/code_latency_1-768x1374.png 768w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2025\/08\/code_latency_1-858x1536.png 858w, https:\/\/craftcookcode.com\/wp-content\/uploads\/2025\/08\/code_latency_1.png 1072w\" sizes=\"(max-width: 572px) 100vw, 572px\" \/><\/figure>\n\n\n\n<p>This will result in messages like the following being produced in the task&#8217;s log file:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n00011044: 2025-08-04T14:57:56 &#x5B;PERFORMANCE     ]T:  Source latency 1.23 seconds, Target latency 2.42 seconds, Handling latency 1.19 seconds  
(replicationtask.c:3879)\n00011044: 2025-08-04T14:58:26 &#x5B;PERFORMANCE     ]T:  Source latency 1.10 seconds, Target latency 2.27 seconds, Handling latency 1.16 seconds  (replicationtask.c:3879)\n00009024: 2025-08-04T14:58:30 &#x5B;SOURCE_CAPTURE  ]I:  Throughput monitor: Last DB time scanned: 2025-08-04T14:58:30.700. Last LSN scanned: 008050a1:00002ce8:0004. #scanned events: 492815.   (sqlserver_log_utils.c:5000)\n00011044: 2025-08-04T14:58:57 &#x5B;PERFORMANCE     ]T:  Source latency 0.61 seconds, Target latency 1.87 seconds, Handling latency 1.25 seconds  (replicationtask.c:3879)\n00011044: 2025-08-04T14:59:27 &#x5B;PERFORMANCE     ]T:  Source latency 1.10 seconds, Target latency 1.55 seconds, Handling latency 0.45 seconds  (replicationtask.c:3879)\n\n<\/pre><\/div>\n\n\n<p>I created a simple Python script to parse a folder of log files and output the results to an Excel spreadsheet. Since the data ends up in a pandas DataFrame, it is easy to manipulate and output it in whatever shape you need:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport os\nimport pandas as pd\nfrom datetime import datetime\n\n\ndef strip_seconds(inString):\n\n    # Pulls the leading number out of strings like &quot;1.23 SECONDS, ...&quot;\n    index = inString.find(&quot; &quot;)\n    return float(inString&#x5B;0:index])\n\n\ndef format_timestamp(in_timestamp):\n\n    # Timestamps like &quot;2025-08-01T10:42:36&quot; get a microsecond section added\n    if len(in_timestamp) == 19:\n        in_timestamp = in_timestamp + &quot;:000000&quot;\n\n    date_format = &quot;%Y-%m-%dT%H:%M:%S:%f&quot;\n\n    # Converts the date string to a datetime object\n    date_obj = datetime.strptime(in_timestamp, date_format)\n\n    return date_obj\n\n\ndef process_file(in_file_path):\n\n    # Scans one log file and returns a row per PERFORMANCE latency message\n    return_array = &#x5B;]\n\n    with open(in_file_path, &quot;r&quot;) as in_file:\n\n        for line in in_file:\n\n            upper_case = line.upper().strip()\n\n            if upper_case.find(&#039;&#x5B;PERFORMANCE     ]&#039;) &gt;= 0:\n                timestamp_temp = upper_case&#x5B;10:37]\n                timestamp = timestamp_temp.split(&quot; &quot;)&#x5B;0]\n\n                split_string = upper_case.split(&quot;LATENCY &quot;)\n\n                if len(split_string) == 4:\n\n                    source_latency = strip_seconds(split_string&#x5B;1])\n                    target_latency = strip_seconds(split_string&#x5B;2])\n                    handling_latency = strip_seconds(split_string&#x5B;3])\n\n                    # Makes the date compatible with Excel\n                    date_obj = format_timestamp(timestamp)\n                    excel_datetime = date_obj.strftime(&quot;%Y-%m-%d %H:%M:%S&quot;)\n\n                    # If you&#039;re outputting to standard out\n                    #print(f&quot;{in_file_path}\\t{timestamp}\\t{source_latency}\\t{target_latency}\\t{handling_latency}\\n&quot;)\n\n                    return_array.append(&#x5B;in_file_path, timestamp, excel_datetime, source_latency, target_latency, handling_latency])\n\n    return return_array\n\n\nif __name__ == &#039;__main__&#039;:\n\n    log_folder = &quot;\/path\/to\/logfile\/dir&quot;\n    out_excel = &quot;OutLatency.xlsx&quot;\n\n    latency_data = &#x5B;]\n\n    # Loops through the files in log_folder\n    for file_name in os.listdir(log_folder):\n\n        focus_file = os.path.join(log_folder, file_name)\n\n        if os.path.isfile(focus_file):\n            filename, file_extension = os.path.splitext(focus_file)\n\n            if file_extension.upper().endswith(&quot;LOG&quot;):\n                print(f&quot;Processing file: {focus_file}&quot;)\n                return_array = process_file(focus_file)\n\n                latency_data += return_array\n\n    df = pd.DataFrame(latency_data, columns=&#x5B;&quot;File Name&quot;, &quot;Time Stamp&quot;, &quot;Excel Timestamp&quot;, &quot;Source Latency&quot;, &quot;Target Latency&quot;, &quot;Handling Latency&quot;])\n    df.info()\n\n    # Dump the frame to Excel (needs the openpyxl package); you can also write other formats such as CSV or text\n    df.to_excel(out_excel)\n<\/pre><\/div>\n\n\n<p>And voil\u00e0 &#8211; we have an Excel output from which we can quickly build latency statistics for a given task (or tasks).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What statistics to use?<\/h2>\n\n\n\n<p>Latency is something you can analyse in different ways, depending on what you&#8217;re trying to answer. It is also important to pair latency statistics with the change volume that is coming through.<\/p>\n\n\n\n<p>Has the latency jumped at a specific time because the source database is processing a daily batch? Is there a spike of latency around Christmas time, when there are more financial transactions than on a quiet day in February?<\/p>\n\n\n\n<p>Generally, when the testers are processing the latency data, they provide back:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Average target latency<\/li>\n\n\n\n<li>90<sup>th<\/sup> percentile target latency<\/li>\n\n\n\n<li>95<sup>th<\/sup> percentile target latency<\/li>\n\n\n\n<li>Min target latency<\/li>\n\n\n\n<li>Max target latency<\/li>\n<\/ul>\n\n\n\n<p>These are the figures we use to compare two runs against each other when the source load is consistent and we&#8217;re changing a Qlik Replicate task.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We have been doing a bit of &#8220;Stress and Volume&#8221; testing in Qlik Replicate over the past few days; investigating how much latency is introduced 
to&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":1659,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[18,16],"tags":[69,36],"class_list":["post-1655","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python","category-qlik-replicate","tag-python","tag-qlikreplicate"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1655","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1655"}],"version-history":[{"count":2,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1655\/revisions"}],"predecessor-version":[{"id":1658,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/posts\/1655\/revisions\/1658"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=\/wp\/v2\/media\/1659"}],"wp:attachment":[{"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1655"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1655"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/craftcookcode.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1655"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}