jonny.donker@gmail.com

The Mystery of Ker-Polik (Rough English Translation)

“The Mystery of Ker-Polik” is a French comic book 1952 by Frédéric-Antonin Breysse.

We had the Dutch version as kids and use to look over the comic; trying to work out what was going on in the story. Oma read a few pages to us; other than that the story itself was (like the title) a Mystery.

Today in my spare time; I used google translate to finally translate the comic into something comprehendible that I can read. The translation is rough and my next stage is to extract the text out and reword it to align better with the story.

Warning: Google did translate some words into swear words; so read at your own discretion

Qlik Replicate – The saga of replicating to AWS Part 1 – pg_bench

If you are new to this story; pick up the introduction from here 

50 TPS – that’s not good…

Our Stress and Volume Testing (SVT) did not turn out that great; which was an understatement. We were aiming for a minimum of 300tps, but we were only achieving 50tps.

Because we could not run the SVT process within business hours as the DB2 databases for production and development shares the same hardware; I had to write a script to simulate the changes on a MS-SQL server.

This means we could use an idle Dev MS-SQL box and run SVT testing whenever we needed it. With the new testing script, I tried various iterations of settings under Change Processing Tuning to see if there was an optimal setting. I could not get above 60tps.

Meanwhile the rest of the team contacted AWS support to see if they can see anything obvious.

AWS Support recommended using the Postgres utility “pg_bench” from various locations to work out if there is a bottle neck between On Prem and AWS RDS. 

They recommend running the benchmark for 15 minutes with different numbers of clients.

Armed with the pg_bench utility; I started performing benchmark runs against the AWS RDS database from various locations in our network, trying to work out where the bottle neck is.

Running the tests


Our network had several locations that we could run pg_bench on to assess the tps to AWS RDS.  To be more specific; to an AWS RDS Proxy.  

To start off we had to learn how to use the command first.  The benchmark database must be initialised with some tables and data.  This is done with the following command:

pgbench --host rds-database-proxy.aws.com --port 5432 --username qlikreplicate -i -s 50 dest_db

Once done; a 15min benchmark was run with the following command:

pgbench --host rds-database-proxy.aws.com --port 5432 --username qlikreplicate -l -c 2 -j 2 -T 900 dest_db

Parameters:

ParameterDescription
--hostAddress of the database/proxy
--portDatabase port
--usernameUsername to connect. Must have permissions to create objects and insert data. Since it was a Dev database I used the Qlik Replicate username
-lWrite information about each transaction to a log. not really useful and in future I would omit it
-cNumber of clients to simulate. I used 2 to replicate a couple of QR tasks
-jNumber of threads to use
-TTime in seconds to run the test for. 900 seconds = 15 minutes

Four tests were conducted:

  1. The OnPrem Qlik Replicate server to AWS RDS Proxy
  2. An EC2 machine to AWS RDS Proxy
  3. The OnPrem Qlik Replicate server to an OnPrem Postgres database.  (Actually to our Enterprise manager analytic database.  So it is not a fancy highly optimised database)
  4. An EC2 machine directly to the AWS RDS database

Why a AWS RDS Proxy?  I’m not sure of the story behind why there was a proxy between Qlik and RDS; rumours it was because the security team didn’t initially allow our On Prem services to connect directly to cloud database.  They insisted that a proxy between the services and databses.

Results

The pg_bench tests revealed several learnings to the team. But unfortunately, there were no answers to our QR latency problem.

It was a bit of grim relief it proved that QR was not the cause of our latency problem; there is a component between us and AWS RDS database that is causing the issue.

Even more unfortunate when I raise the issue to the network team thinking that there might be an IPS, QOS device that might be slowing us down; they could not find anything. I have been informed that the bandwidth between our On Prem and AWS is broad and fast and there should not be any obvious chocking the tps. Grumpily I mused that households can now stream multiple 4K videos at home without a frame dropped while we can’t send a few database changes from our datacentre to an AWS database at reasonable speeds

Interestingly the RDS proxy was a significant bottleneck when we started questioning its purpose in the pipeline. Even the AWS support team was scratching their heads of the purpose of it in the use case we had. Later on, I tried OnPrem to the direct RDS; but that made no difference to the tps. 

Lessons learnt

  • Validate poor latency through Qlik Replicate via another tool.  In this instance pg_bench took QR right out of the equation
  • Don’t use AWS RDS proxy for Qlik Replicate unless you have a critical use case.  Performance test thoroughly your design choices

Qlik Replicate – The saga of replicating to AWS (Intro)

I am exhausted from the past couple of weeks; even getting close to a month of working on a new Qlik Replicate tasks that we are getting no luck with performance.

The upside I have learned many new concepts and techniques in Qlik Replicate, Postgres, MS-SQL and networking.
The downside is that (at the time of writing) we have not solved the problem, and a lot of effort hasn’t returned anything tangible for the business to use.

This post will be an introduction to the problem and the next few posts will dive into the individual areas of QR quirks, diagnostic methods and performance turning that I used.

Background

We have a big expensive DB2 Z/OS database running on premise that have our core account data in hosted in it. A logical account will be divided to two different tables; slowly changing data (like open date, close date) in one table and constant changing data (like account balance) in another table.

Downstream users want these two tables stitched together and “Change Data Capture” implemented on this combined table. The initial thought was to create triggers in DB2 so when data is updated on the “Slowly Changing Data Table” (SCDT) or the “Constant Changing Data Table” (CCDT); the triggers act on a “Combined Table” (CT).

This solution was deemed too expensive by the business; the triggers would drag down the performance of an already stressed DB2 system and to purchase more processing power was not in the viable.

The architects thought, “OK – why don’t we replicate the SCDT and CCDT to AWS RDS Postgres and build the triggers in there? We can then scale the Postgres database cheaply.”

Sound thinking. We did a quick POC that a downstream database can be triggered by data flowing into tables from Qlik Replicate and it worked as expected.

The project jumped into build mode and soon we had development done and testing on the way.

Then it was time to do the “Stress and Volume” testing. 

That is when things started to unravel…

High Hopes but low TPS

Since the Development DB2 Z/OS database shares the same hardware as Production; we had to conduct our Stress and Volume Testing (SVT) on a weekend, so our customers were not adversely affected.

Two test runs were conducted; on SVT run without the database triggers enabled and another with them enabled.

The results were not good. Our SVT SME wrote in their summary:

We ran two load test on Dec 9th Saturday – first test with database trigger OFF, second test with trigger ON.

Each test ran for about 1 hour with total 700k+ transactions at throughput 170tps to 190tps.

  1. During first test with trigger OFF, Qlik process T007 had throughput at 50tps, well below other Qlik processes at 180tps. This suggests the performance bottleneck might not be on the database trigger.During second test with trigger ON, Qlik process T007/T008 had a slightly lower throughput at 40tps, which means the database trigger adds a little bit more latency.

    In the middle of 2nd test, Qlik Enterprise Manager became broken and did not display performance graphs properly, hence I can’t provide you graphs till Jon fixes that.

  2. After the 1-hour test stopped, it took another 3.5 hours for Qlik process T007/T008 to finish its backlog. That means T007/T008 need 4.5 hour in total to complete DB2 data produced in each hour of test. Maths tells us 40:180 = 1:4.5. In other words, the “producer” (DB2 Database) works 4.5 times faster than the “consumer” (Qlik T007/T008).
  3. I notice the memory usage of DB2 related Qlik processes kept going up during test. Not sure if that is the expected behaviour.

Since our bare baseline that we decided internally was at 300tps; this was not looking good.

What lead next was a journey of performance diagnosing with may trials, tribulations, dead ends and a final way forward.

Header Illustration Credit


Qlik Replicate – Where’s my slots gone?

Qlik Replicate is growing rapidly in our organisation.

Starting with one “Proof of concept” server in DTL; we are now up to seventeen.  A mixture of development, test, performance testing servers on OnPrem and in the AWS and GCS cloud; depending where the source database is located.

With trying to keep things neat and organised; we are constantly moving QR tasks from one server to another. 

Anyway the Juniors came to me with the following error message when they were shifting a QR task with a AWS RDS postgres source:

2024-01-31T16:39:38 [SOURCE_CAPTURE  ]E:  Can't resume task after replication slot was dropped. [1020101]  (postgres_endpoint_capture.c:444)

From my understanding (I wasn’t babysitting their process); they exported the old task out with Endpoints and then reimported it into the new server.  They re-entered the passwords for the end points and then try to resume from the stream position of the old task.

I couldn’t fault that logic.

As a test we cloned that task on the same server and made the target endpoint as a NULL.

With this test task we could

  1. Start the task from scratch
  2. Resume it from the stream position that the old task got up to

This proved that the new server had no issues contacting the database and resuming from a position on the postgres database.

In the end I guessed that there was a problem with the replication slot (or lack of) for that task on the postgres server; not sure specifically but it was looking for something that it didn’t exist.

I worked out the following steps to resolve the problem:

  1. Change the task’s target endpoint to a NULL connector.
  2. Start the task with the “Reload Target” option.
  3. Wait until the task has done the “Full load” and has entered CDC mode.
  4. Stop the task and change the target endpoint back to the original end point.
  5. Start the task again with “Advance Run Options” and resume from the “Source change position” of the original taks.

The task should now be happy

Qlik Replicate – The million row challenge with

It seems every new project; the project teams come with new and innovative ways to make my life “interesting” while managing Qlik Replicate.

The latest idea they have is to pretty much request that one of our major Qlik Replicate tasks that is sucking data our of a DB2 database and delivering data to AWS S3 be duplicated and deliver pretty much the same data to GCS.

When I got wind of this proposal it was already too far into the solution design to propose another solution; like copying the data from GCS to AWS S3.  So, it was left up to me to go and tell the owners of the DB2 system that there is another massive feed will be built of their database – basically doing the same thing.

With some persuasion I could get internal funding to research and implement Log Streaming to consolidate our reads from the DB2 system.  Hardware wise this required adding more memory and disk space to the QR servers to handle the log streaming. 

As for the Log stream component of the solution, I started looking into the settings.  Since we are drawing a lot of data from DB2 – I am concerned about disk space and what performance impact the compression setting has on the task.

To test the impact; I set up the following test scenario:

The "Million row" challenge

To test – I created a very wide table in a test MS-SQL (a database I had access to); which contained integers and varchars.

With a python script, a million rows were inserted into the table in batches of 10,000.  In the end the million rows equated to about 1.3Gb on the database.

I did ten test runs, increasing the compression in the log stream endpoint with each individual run.

 

Using Qlik Replicate’s analytics I could then get the metrics for the Log Stream’s task; and a task that is reading from the log stream and outputting the data to a NULL endpoint.

I also grabbed the file sizes that the log stream created on the server’s drive.

Here are the results:

  Log Stream File Size (Mb) LS – CPU % LS – Memory (MB) LS – Source latency (sec) LS – Target latency (sec) Target – CPU % Target – Memory (MB) Target – Source latency (sec) Target – Target latency (sec)
No compression 1,974 7 134 1 3 4 126 3 4
Level 2 718 15 169 2 4 4 105 3 3
Level 3 690 14 156 0 3 4 118 2 2
Level 4 683 16 155 1 2 6 127 2 2
Level 5 675 21 281 1 5 6 156 3 3
Level 6 665 24 224 1 3 5 176 4 4
Level 7 663 23 182 1 3 4 160 4 4
Level 8 657 38 394 1 9 4 181 9 9
Level 9 657 46 345 1 10 6 189 12 12
                   

Some notes on the results

  • The run was done on a shared server; so other processes on the server running at the same time could impact the CPU and memory metrics
  • Analytics only takes a snapshot of the metrics every 5min and the million rows are inserted into the database in less than <10min so we can only get one snapshot of the metrics for the run.  Depending where the run was up to; this could impact the metrics

Conclusions

Like most things in IT; there is no magical setting that will cover all scenarios.  Everything is a compromise, and the settings will need to be adjusted to meet one’s particular source endpoint, nature of the data being replicated, and the latency can be tolerated to disk space usage.

Here is what I learnt.

  • If you require absolute minimum latency as possible – log steaming is not your answer for a couple of reasons. 
    1. There is an inherited one second delay writing batches from the source system to the log stream.
    2. All the data is written to disk and then read by the target QR task.  This means that the changes do not reside in memory and need to be read from disk.
  • Level 2 significantly reduces disk space usage with little compromise to speed and resource usage.
  • Level 4 is the best compromise of disk space, speed, and resource usage.  Anything above level 4 starts to strain the speed and resource usage with little improvement to disk space.

Implementing Log streaming plan

For my scenario – I think Level 2 is the way to go and will stem my testing out from there. 

My implementation plan is to implement the log streaming task into production and let it run by itself for a couple of weeks.

During that time, I will monitor how the log stream task behaves with large batches and how much data it writes to disk as we cannot generate that amount data in our performance testing environments.

Once the log stream task is settled; then I will start migrating the existing Qlik Replicate tasks to read from the log stream task stream instead of directly from the database.

Easter-lily Cacti – Beauty amongst the spines

Years ago – my sister gave me an Easter Lilly cactus (Echinopsis oxygona) in a pot.  It was one of the first cactus I owned.

Then I forgot about it and neglected it over winter.

A year later I had a closer look at it; and the little neglected cactus has grown and there were heaps of cacti pups growing off the original plant.

So with tongs, welding gloves and a towel; I divided the pups off the mother plant and gave them their own individual pots.

The rest is history and after 10 years of growing and dividing cacti – I have over hundred individual cacti plants.  It has become an prickly issue with my wife as she claims I have a hording issue.

But I think there are definitely worse things to to “hoard” than cacti.

Plus they have absolutely beautiful flowers; with a very subtle yet pleasant perfume.

The bees are also on team cacti and early in the morning when the lily type flowers open; they get in early.

Caring for Easter Lilly Cacti

These cacti is pretty forgiving and simple to care for in South Eastern Australia:

  • Place the pot in a warm sunny position that get about 6+ hours a day
  • For watering:
    • In the peak of summer; water once a week with a quick shower of water from the hose.
    • In the milder months – once every two to three weeks
    • In the winter; let the natural rain rain water them; unless it is a particularly dry winter
  • Once a year in the spring; give them a feed of slow release fertiliser.
  • When the cactus starts to outgrow its pot; transplant into a larger pot using well draining soil.  You can buy cacti potting mix from nurseries. 

 

 

Pi-Hole: Metrics from a Postgres database

I love Pi-Hole.

It is one of my favourite services that I have running on my home network.

I definitely take Pi-Hole for granted at home and only really appreciate how effective it is when I travel and connect to other networks.

While the function of Pi-Hole is brilliant – my Data warehouse Analyst and reporting background craves for more power from the metrics behind Pi-Hole.

By default Pi-Hole offers a simple dashboard with a few catchy graphs and metrics.  This can be hypnotizing to watch when first installing Pi-Hole; watching shock of how many DNS queries are blocked visiting different  websites.

Behind the scenes is a sqlite database located on the server for more power users to query.

We need more power!

For me – I have a couple of problems with the Pi-Hole dashboard and crave for more functionality and power from what is on offer.

Some issues I have:

  • On my home network – I have two Pi-Hole servers working in a redundancy.  The main Pi-Hole server is sitting on a hypervisor while the backup one is on a Raspberry Pi.  The main rational of this is that if we have a power outage that knocks the hypervisor offline while I am not at home; the backup Pi-Hole server will take over and save an angry phone call from the wife on why she can’t access the internet.

    One of my wishes is to see the combined Pi-Hole metrics across both servers in one dashboard.

  • A couple of items on the network creates a large amount of noise that skews the metrics.  For instance the security cameras use to query a NTP server every second until I routed them through a NTP server at home.  My work computer when it drops off the VPN will flood particular addresses with traffic. 

    I would like to exclude these addresses from the metrics; but not exclude them from the P-Hole functionality.

  • Access to better analytics from the Pi-Hole data. 
    • How about a heat map of blocked queries on devices? 
    • Or weekly trends with a moving average? 
    • Or ability to drill into a time where there is a high amount of blocked traffic to determine the  device and application is behind it? (Answer: My kid playing mobile app games)

Napkin Solution

This is what I am working on at the moment to extract and load the data from my two Pi-Hole servers into a Postgres database to query from.

  1. A NFS share is set up to share out the pihole-FLT.db to a linux server on both Pi-Hole servers
  2. A cron job runs every x minutes to run a python script.  The python scrip will query the pihole-FTL.db to get the records from the last timestamp it ran to the current timestamp in the Pi-Hole database and land the data in to a “landing” schema on the postgres database
  3. A stored procedure will run to transfer and normalise the data into an ods schema
  4. A reporting tool will then query off the ods schema.  In this case I am using Apache Superset; just because it was freely available and the web nature means I can share dashboards to multiple devices without installing client software.

Performance of querying sqlite through a NFS share

Initially I was worried about querying the pihole-FTL.db over the NFS share; thinking it will be too slow to be practical.

I was thinking of a few clunky solutions:

  1. Copy the entire DB over to the postgres host and load it from there
  2. Write a script that runs on the Pi-Hole server to extract the incremental data from the pihole-FLT.db into files and load the files

Thankfully from the volume of traffic we produce; it is not a concern.

The initial load of copying all the data out of the Pi-Hole database does take a long time if there is a large amount of history to transfer across.

But once the initial load is done; small incremental bites of data only take a few seconds

Current state

I have a pilot running on my home network as a proof of concept.

Fundamentally it works fine and I can gleam some interesting metrics from my Pi-Hole servers.

At the moment the security is not great atrocious as I am just using a super username and password throughout the pipeline.

Once I have cleaned up the postgres sql, python code I can share the complete solution.

Qlik Replicate Expressions – Wait. That’s Illegal

A new team will be taking over for out of hours oncall for Qlik Replicate and I tasked with training up a new team to diagnose and resolve problems with QR.

I thought it was a good idea to create some practice QR tasks in our development environment; intentionally put some problems in the tasks and let the new team diagnose the problem in a safe and no pressure scenarios.

I did the obvious ones; like put an incorrect password in an end point, put a typo in a filter criteria.

Looking for another scenario I thought – “Nothing better than a divide by zero error.”

Wait. That’s Illegal – Part 1

I created a simple table; with a field of CURR_BAL = 1000 and ASSET_VALUE = 0 and added a new field call LVR with the expression of:

$CURR_BALANCE / $ASSET_VALUE

Ran a full load expecting lots of nice and nasty error messages for the new team to investigate.

Hmm – no errors.

That’s not what I expected.

All the rows were transferred across and there was no errors or warnings in the log file, no data errors or anything.

I bumped up all the logging to “Trace” and reran the full load.  Once again, no error messages.

Redirecting the output from the task to a kafka topic; I notice the records that had an intentional div/0 scenario was coming through as null.

{
  "magic": "atMSG",
  "type": "DT",
  "headers": null,
  "messageSchemaId": null,
  "messageSchema": null,
  "message": {
    "data": {
      "ACCOUNT_ID": 5,
      "ACCOUNT_NAME": "Elissa",
      "CURR_BALANCE": "10000.00",
      "ASSET_VALUE": "0.00",
      "LVR": null
    },
    "beforeData": null,
    "headers": {
      "operation": "REFRESH",
      "changeSequence": "",
      "timestamp": "",
      "streamPosition": "",
      "transactionId": "",
      "changeMask": null,
      "columnMask": null,
      "transactionEventCounter": null,
      "transactionLastEvent": null
    }
  }
}

Then I remembered – CDC expressions are built on sqlite syntax.  Looking up how sqlite handles div/0 scenarios; by default sqlite will return a null instead of failing the query on a div/0.

Quoting from Sqlite – Difference Between engines

SQLite does not raise arithmetic exceptions (eg. divide by zero, 1/0). SQLite returns a NULL value for 1/0.

Wait. That’s Illegal – Part 2

I got past my div/0 conundrum after the explanation of what was happening and had a look at a “valid” record.

{
    "magic": "atMSG",
    "type": "DT",
    "headers": null,
    "messageSchemaId": null,
    "messageSchema": null,
    "message": {
        "data": {
            "ACCOUNT_ID": 1,
            "ACCOUNT_NAME": "Alice",
            "CURR_BALANCE": "12345.00",
            "ASSET_VALUE": "500000.00",
            "LVR": 0        <------------- Huh?
        },
        "beforeData": null,
        "headers": {
            "operation": "REFRESH",
            "changeSequence": "",
            "timestamp": "",
            "streamPosition": "",
            "transactionId": "",
            "changeMask": null,
            "columnMask": null,
            "transactionEventCounter": null,
            "transactionLastEvent": null
        }
    }
}

Again our LVR value was wrong.

12345.00 / 500000.00 should be 0.02469; instead of 0

A couple of things popped to mind.

It looked like the classic integer division issue that always catches out new (and experienced) programmer.

But that didn’t make any sense since the source fields were both numeric(18,2).

I checked the destination field that I added.

It was set to numeric(18,2), which isn’t precise enough for the LVR value but we still should have got 0.2 instead 0.

OK – QR must not like the “custom” precision of numeric(18,2) for the LVR field. 

I tried forcing the LVR field data type to REAL(4).

Still got the incorrect value for LVR

Maybe for some reason QR /sqlite treats the values coming in as an integer; even though the source data type is numeric(18,2) on the database.

Let’s give this a go:

CAST($CURR_BALANCE AS NUMERIC) / $ASSET_VALUE

^^ Nope

CAST($CURR_BALANCE AS NUMERIC) / CAST($ASSET_VALUE AS NUMERIC) 

^^ Long shot but nope

CAST($CURR_BALANCE AS REAL) / $ASSET_VALUE 

^^ At last success

{
    "magic": "atMSG",
    "type": "DT",
    "headers": null,
    "messageSchemaId": null,
    "messageSchema": null,
    "message": {
        "data": {
            "ACCOUNT_ID": 1,
            "ACCOUNT_NAME": "Alice",
            "CURR_BALANCE": "12345.00",
            "ASSET_VALUE": "500000.00",
            "LVR": "0.0246"  <------------- Finally!
        },
        "beforeData": null,
        "headers": {
            "operation": "REFRESH",
            "changeSequence": "",
            "timestamp": "",
            "streamPosition": "",
            "transactionId": "",
            "changeMask": null,
            "columnMask": null,
            "transactionEventCounter": null,
            "transactionLastEvent": null
        }
    }
}

It works – but forcing a precise of numeric value to a floating point data type REAL leaves me uneasy; especially in a finance world were precision is needed.

I haven’t researched or tested whether this is a risk of generating an imprecise number through a division or not; something I will need to put on my todo list.

Conclusion

My design principle for Qlik Replicate tasks is to keep them as simple as possible.

Grab the data off the source and get it to the destination as quickly as possible.

By design principle I am against putting expressions in QR tasks:

  1. It is not as feature rich as other programming languages 
  2. It hides business rules that adds a black box and complexity to the pipeline

Point 2 was the big one for me.  I didn’t want downstream users messaging me every time there was an abnormality in the data.  With no expressions in the QR tasks – I could rightfully say, “QR just grabs what it finds on the source database – go and talk to them”

Now with the discovery of these little “features” that could come from expressions; and more importantly no error or warnings about these “features”; it gives me even more incentive not to program in exceptions.

QR in my eyes is not an “ETL” tool – it is just a “EL” tool.

If your designers want complex expressions and transformations programmed into their data pipelines – it should be the downstream developers task.

That’s why they get paid the big bucks.


SQL: Quick and dirty comparing result sets

When doing regression testing – I need to prove that the changes that I have done to two queries returned the same result sets.

If result sets are on the same database then you can do a MINUS operator. 

If not – it can be a pain to shift result sets from database to database; especially if they are large.

For quick and dirty testing; I like using the tsql function CHECKSUM and pl sql function ORA_HASH.  It can give you some quick results to test whether a column of data has a high probability of been the same as another column.

To test a result set; I have the following excel formulas saved in my “Handy formulas” notebook:

TSQL
=",SUM(CAST(CHECKSUM("&A1&") AS BIGINT)) AS "&A1
PL/SQL
=",SUM(ORA_HASH("&A1&")) AS "&A1

With a list of column name you can then make an checksum sql statement from Excel

Running this query on both result sets; differences can be found by seeing if there is a difference in the retuned values

Excel & VBA – Concatenating two fields together with formatting

A work colleague asked me to do a “Mate’s rate” job for his team yesterday.

They had a MS Word document with a huge table in it that they needed to convert to Excel to load into an application.

Paraphrased  it looked like this:

My first thought was to copy this table out and paste it into excel to see how it looks:

Two problems:

  1. The cell “Some data” is merged across three rows
  2. Line 1, Line 2, Line 3 are now split across three rows instead of being in the one cell

I wrote a macro to split out the merge cells; but when it came to combining cells for Line 1, Line 2 and Line 3 I ran into a problem.

None of the usual commands and from my quick google searching could combine the cells and retained the formatting.

So commands like:

=CONCAT(B2:B4)

=B2 & B3 & B4

Would not retain the formatting; let alone add in line feed characters to make it a multiline cell

I’m sure there is a better way – but strapped for time I programmed the following subroutine in VBA

Public Sub combineCells(cell1 As Range, cell2 As Range, output As Range)
    Dim formatIndex As Long
    Dim formatOffset As Long
    Dim hyperlink1 As String
    Dim hyperlink2 As String

    hyperlink1 = ""
    hyperlink2 = ""

    ' Check if the cell has a hyperlin; but don't have a text version of the hyperlink in the Value 2
    If cell1.Hyperlinks.Count <> 0 And InStr(cell1.Value2, "https:") < 1 Then
        hyperlink1 = " [" & cell1.Hyperlinks(1).Address & "]"
    End If

    If cell2.Hyperlinks.Count <> 0 And InStr(cell2.Value2, "https:") < 1 Then
        hyperlink2 = " [" & cell2.Hyperlinks(1).Address & "]"
    End If

 
    ' Handling if the first cell is blank.  If so we don't want a LF at the top of the cell
    If Trim(cell1.Value2) <> "" Then
        output.Value2 = cell1.Value2 & hyperlink1 & vbLf & cell2.Value2 & hyperlink2
        formatOffset = Len(cell1.Value2) + Len(hyperlink1) + 1
    Else
        output.Value2 = cell2.Value2 & hyperlink2
        formatOffset = Len(cell1.Value2)
    End If

  
    ' Copies the formatting from cell1 to the final cell
    ' You can add more options to transfer over different formatting
    For formatIndex = 1 To Len(cell1.Value2)
        output.Characters(formatIndex, 1).Font.Bold = cell1.Characters(formatIndex, 1).Font.Bold
        output.Characters(formatIndex, 1).Font.Italic = cell1.Characters(formatIndex, 1).Font.Italic
        'output.Characters(formatIndex, 1).Font.Underline = cell1.Characters(formatIndex, 1).Font.Underline
    Next

  
    ' Copies the formatting from cell2 to the final cell
    For formatIndex = 1 To Len(cell2.Value2)
        output.Characters(formatIndex + formatOffset, 1).Font.Bold = cell2.Characters(formatIndex, 1).Font.Bold
        output.Characters(formatIndex + formatOffset, 1).Font.Italic = cell2.Characters(formatIndex, 1).Font.Italic
        'output.Characters(formatIndex + formatOffset, 1).Font.Underline = cell2.Characters(formatIndex, 1).Font.Underline
    Next

End Sub

Oh boy it runs slow – to combine a couple of thousands cells took half an hour and I had the typical worries that Excel crashed because everything was locked up.

But it worked for the quick task that I was doing