
Qlik Replicate: You’re trapped in a Docker container now!

In Qlik Replicate, we have tasks that are unable to resume after nasty server failures (for instance, the CrowdStrike outage in July 2024).

The only tasks impacted are those replicating from an RDBMS to a cloud storage target like AWS S3 or GCS.

In the task log the error message takes the form of:

00002396: 2022-08-26T15:21:14 [AT_GLOBAL ]E: Json doesn't start with '{' [1003001] (at_cjson.c:1773)
00002396: 2022-08-26T15:21:14 [AT_GLOBAL ]E: Cannot parse json: [1000251] (at_protobuf.c:1420)

This error gives us problems: I can’t resume the task, as the error reappears. I can’t even start it from the stream position and must rely on restarting the QR task from a timestamp, which is extremely dangerous given the chance of missing data from that split second.

I suspect the problem is that the “staging” file on the QR server gets corrupted mid-write when the server fails, and on resume QR can’t parse it.

But trying to recreate the problem in a safe environment to diagnose it is tricky. Our DTL environment doesn’t create enough traffic to trigger the issue, and I don’t want to be abruptly turning off our DTL QR servers and interrupting other people’s testing. As for recreating the problem in production, the pain of all the red tape is not worth the effort.

I needed a safer space to work in: a space where I can pump large volumes of data through QR and kick the QR service around trying to provoke the error. Armed with my little Linux VM, Docker containers were the answer.

CentOS? Why CentOS?

My goal was to build a Docker image with Qlik Replicate and the Postgres drivers so I could use it on my Linux VM.

Under Support articles, Qlik has a guide on how to run Qlik Replicate in a Docker container.

Following the instructions, I ran into some initial problems. The first major one was the CentOS Docker image. I must use the packages in my company’s Artifactory rather than external packages, and although the company hosted the CentOS image, there were no other packages available to update and install. Since my VM cannot reach http://vault.centos.org, the CentOS image was a lame duck.

With CentOS off the cards, I had to use the Red Hat image that my company provided. With Red Hat, the Artifactory had all the packages I needed.

The second problem was that I wanted to use the 2023.11 release to match our environment. With 2023.11 there are some extra steps needed in the Dockerfile compared to 2024.05; the differences are noted in Qlik’s support article.

The Dockerfile

Here is the Dockerfile:

FROM my.companys.repo/redhat/ubi9


ENV QLIK_REPLICATE_BASE_DIR=/opt/attunity/replicate/
ENV ReplicateDataFolder=/replicate/data
ENV ReplicateAdminPassword=AB1gL0ngPa33w0rd
ENV ReplicateRestPort=3552
ENV LicenseFile=/tmp/replicate_license_exp2025-06-29_ser60038556.txt

# Copy across installation packages and licenses
ADD postgresql*.rpm /tmp/
ADD areplicate-*.rpm /tmp/
ADD systemctl /usr/sbin
ADD replicate_license_exp2025-06-29_ser60038556.txt /tmp/

# Update packages
RUN dnf -y update
RUN dnf makecache

# To get ps command
RUN dnf -y install procps-ng
RUN dnf -y install unixODBC unzip
RUN dnf -y install libicu.x86_64
RUN rm -f /etc/odbcinst.ini

# Installing Postgres packages
RUN rpm -ivh /tmp/postgresql13-libs-13.9-1PGDG.rhel9.x86_64.rpm
RUN rpm -ivh /tmp/postgresql13-odbc-13.02.0000-2PGDG.rhel9.x86_64.rpm
RUN rpm -ivh /tmp/postgresql13-13.9-1PGDG.rhel9.x86_64.rpm

ADD odbcinst.ini /etc/

# Installing Qlik Replicate
RUN systemd=no yum -y install /tmp/areplicate-2023.11.0-468.x86_64.rpm
RUN yum clean all
RUN rm -f /tmp/areplicate-*.rpm

# Make the Replicate libraries visible at runtime; a plain RUN export would not persist past this layer
ENV LD_LIBRARY_PATH=/opt/attunity/replicate/lib
RUN echo "export LD_LIBRARY_PATH=/usr/pgsql-13/lib:\$LD_LIBRARY_PATH" >> /opt/attunity/replicate/bin/site_arep_login.sh

ADD start_replicate.sh /opt/attunity/replicate/bin/start_replicate.sh
RUN chmod 775 /opt/attunity/replicate/bin/start_replicate.sh
RUN chown attunity:attunity /opt/attunity/replicate/bin/start_replicate.sh
# Have the attunity user's login shells pick up the Replicate environment
RUN echo "source $QLIK_REPLICATE_BASE_DIR/bin/arep_login.sh" >> ~attunity/.bash_profile
ENTRYPOINT /opt/attunity/replicate/bin/start_replicate.sh ${ReplicateDataFolder} ${ReplicateAdminPassword} ${ReplicateRestPort} ${LicenseFile} ; tail -f /dev/null

The postgres packages can be obtained from https://download.postgresql.org/pub/repos/yum/13/redhat/rhel-9-x86_64/
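
For example, the three RPMs referenced in the Dockerfile can be pulled down with curl; the file names below match the versions used above, but check the repository listing as newer minor versions may have replaced them:

# Download the Postgres 13 libs, ODBC driver and client RPMs used in the Dockerfile
BASE_URL=https://download.postgresql.org/pub/repos/yum/13/redhat/rhel-9-x86_64
curl -O "$BASE_URL/postgresql13-libs-13.9-1PGDG.rhel9.x86_64.rpm"
curl -O "$BASE_URL/postgresql13-odbc-13.02.0000-2PGDG.rhel9.x86_64.rpm"
curl -O "$BASE_URL/postgresql13-13.9-1PGDG.rhel9.x86_64.rpm"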

The odbcinst.ini file content is:

[PostgreSQL]
Description = ODBC for PostgreSQL
Driver      = /usr/lib/psqlodbcw.so
Setup       = /usr/lib/libodbcpsqlS.so
Driver64    = /usr/pgsql-13/lib/psqlodbcw.so
Setup64     = /usr/lib64/libodbcpsqlS.so
FileUsage   = 1
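
Once the container is up (see the Docker Compose section below), a quick sanity check that unixODBC picked up the driver is to list the registered drivers from inside the container; it should echo back the [PostgreSQL] entry:

docker exec -it replicate_2023_11 odbcinst -q -d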

The systemctl file is just a dummy stand-in, so the Replicate install scripts can call systemctl inside the container build without falling over:

# Run LS command - remove this line 
ls

And of course you need the RPM for Qlik Replicate and your license file.

Once the Dockerfile and supporting files are collated in a directory, build the image with:

docker build --no-cache -t ccc/replicate:2023.11 .

If all goes well, a Docker image will be built and ready to be used.

Docker Compose

To make running the Docker image easier, create a Docker Compose file:

version: '3.3'

services:
  replicate:
    image: docker.io/ccc/replicate:2023.11
    container_name: replicate_2023_11
    ports: 
      - "3552:3552"

    environment:
      - ReplicateRestPort=3552
      - TZ=Australia/Melbourne

    volumes:
      - /dockermount/data/replicate/data:/replicate/data

    extra_hosts:
      - host.docker.internal:host-gateway

volumes:
  replicate:

Save the docker-compose.yml in a directory and, from that directory, start the container with:

docker-compose up -d

Run the docker ps command to verify the container is up and running:

docker ps

So far, looking good. Further confirmation can be had by connecting into the container and observing the QR processes running:

docker exec -it qr_container_id bash
ps -aux

There should be two main processes, plus a process for each individual QR task that is running.

With everything confirmed, the QR console can be accessed from a browser:

https://127.0.0.1:3552/attunityreplicate/
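
Before reaching for a browser, a quick curl from the VM’s shell will confirm the console is answering (-k skips certificate verification, since the container’s certificate won’t be trusted by the VM):

curl -k -I https://127.0.0.1:3552/attunityreplicate/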

Running Postgres and pgadmin through Docker on a Dodgy Linux VM

In the organisation that I work in, I have a good old Windows 10 ThinkPad that has been my workhorse for many years.

It does the job, works happily with our on-prem apps, and I can do most of my role’s development on it.

There are areas where the workhorse cannot help me out. For instance, I needed to develop a JavaScript function on a Postgres database to split a field into different elements. I do not have access to develop on the target database, so I turned to Docker to run containerised versions of Postgres and pgadmin and have a safe area to play in.

The dreaded Linux VM

The “cool” developers have access to Macs to run their DevOps tools on. 

I have a Linux VM running Ubuntu 20.04.

It loads slowly, it runs slowly, and the supporting VM application hogs a significant amount of the available memory, leaving little for me.

But it does allow me to run Docker containers.

The first container I created broke the VM. The VM support team speculated that a port used by Postgres or pgadmin clashed with the ports the organisation’s VM software relies on. They rebuilt my VM and I tried again.

docker-compose.yml

This is my docker-compose.yml file for Postgres and pgadmin:

version: '3.3'

services:
  db:
    #image: postgres
    image: clkao/postgres-plv8
    container_name: local_pgdb
    restart: always
    ports:
      - "9432:5432"
    environment:
      - POSTGRES_PASSWORD=verystrongpassword
      - POSTGRES_USER=jonny 
      - POSTGRES_DB=work

    volumes:
      #- ~/apps/postgres:/var/lib/postgresql/data
      - ~/apps/postgres-plv8:/var/lib/postgresql/data
    
  pgadmin:
    image: dpage/pgadmin4
    container_name: pgadmin4_container
    restart: always
    ports:
      - "9888:80"
      - "9443:443"
   
    environment:
       PGADMIN_DEFAULT_EMAIL: jonny@craftcookcode.com
       PGADMIN_DEFAULT_PASSWORD: verystrongpassword
       
       # Fix for IPv6-disabled systems https://stackoverflow.com/questions/68766411/pgadmin-4-in-docker-failed-with-gunicorn-server-start-error
       PGADMIN_LISTEN_ADDRESS: 0.0.0.0

    volumes:
      - ~/apps/pg_admin/pgadmin-data:/var/lib/pgadmin

volumes:
  local_pgdata:

There are a couple of changes from the boilerplate docker-compose.yml files on the internet:

  • The ports are mapped to non-standard ports. This is to avoid any potential problems with ports conflicting with the VM software
  • I had to change the volumes to my home drive due to security settings on my VM

Errors, Problems and Issues (Oh my)

When initially running docker-compose, I got the following error and pgadmin wouldn’t start:

pgadmin4_container | ERROR  : Failed to create the directory /var/lib/pgadmin/sessions:
pgadmin4_container |            [Errno 13] Permission denied: '/var/lib/pgadmin/sessions'
pgadmin4_container | HINT   : Create the directory /var/lib/pgadmin/sessions, ensure it is writeable by
pgadmin4_container |          'pgadmin', and try again, or, create a config_local.py file
pgadmin4_container |          and override the SESSION_DB_PATH setting per
pgadmin4_container |          https://www.pgadmin.org/docs/pgadmin4/8.9/config_py.html

This issue was resolved, thanks to a Stack Overflow thread, by changing the ownership of the pgadmin data directory to user and group 5050:

sudo chown -R 5050:5050 ~/apps/pg_admin

The next error I had was a “Can’t connect to (‘::’, 80)” error in pgadmin:

pgadmin4_container | [2024-07-09 05:25:58 +0000] [1] [INFO] Starting gunicorn 22.0.0
pgadmin4_container | [2024-07-09 05:25:58 +0000] [1] [ERROR] Retrying in 1 second.
pgadmin4_container | [2024-07-09 05:25:59 +0000] [1] [ERROR] Retrying in 1 second.
pgadmin4_container | [2024-07-09 05:26:00 +0000] [1] [ERROR] Retrying in 1 second.
pgadmin4_container | [2024-07-09 05:26:01 +0000] [1] [ERROR] Retrying in 1 second.
pgadmin4_container | [2024-07-09 05:26:02 +0000] [1] [ERROR] Retrying in 1 second.
pgadmin4_container | [2024-07-09 05:26:03 +0000] [1] [ERROR] Can't connect to ('::', 80)

Again, Google and Stack Overflow came to the rescue. The issue can occur when IPv6 is disabled on the VM. I added the following line to the yml file:

PGADMIN_LISTEN_ADDRESS: 0.0.0.0

This resolved the issue and now pgadmin could start up.

Inside pgadmin

When I got inside pgadmin, for the life of me I couldn’t connect to the Postgres database.

I could see that the Postgres container was running with no errors. I could see the remapped ports. I could connect to Postgres with psql. Why couldn’t I connect to Postgres in pgadmin?

I was frustrated and tired after a long day of work, so I walked away from the computer.

When I got back after a walk around the block and a cup of tea – I could now see the problem and the solution:

Initially I was using 127.0.0.1 as the hostname and 9432 as the port, because that’s where my mind went for connecting to a Postgres database running locally.

But because pgadmin accesses Postgres from within the Docker network, it needs the internal port 5432 instead of 9432 and the container name local_pgdb instead of 127.0.0.1.
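
An easy way to prove this to yourself is to run psql inside the Postgres container, but connect over TCP using the container name and the internal port, which is exactly what pgadmin sees (enter the POSTGRES_PASSWORD from the compose file when prompted):

docker exec -it local_pgdb psql --host local_pgdb --port 5432 --username jonny -d work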

If I am connecting from outside Docker, I would use localhost and port 9432. For instance, this is how I imported some data to develop against:

psql --host localhost --port 9432 --username jonny -d work -f ~/some_postgres_data.dmp

Once I entered the right details, pgadmin connected fine and I could develop the Postgres function in a safe area.
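
For what it’s worth, this is the sort of thing the clkao/postgres-plv8 image enables: a minimal sketch of a JavaScript function that splits a delimited field. The function name, delimiter and test value are purely illustrative, not the real work function:

psql --host localhost --port 9432 --username jonny -d work <<'SQL'
CREATE EXTENSION IF NOT EXISTS plv8;

-- Illustrative only: split a pipe-delimited field into a text array
CREATE OR REPLACE FUNCTION split_field(raw text)
RETURNS text[] AS $$
  return raw ? raw.split('|') : [];
$$ LANGUAGE plv8 IMMUTABLE;

SELECT split_field('alpha|bravo|charlie');
SQL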

Pi-Hole: Metrics from a Postgres database

I love Pi-Hole.

It is one of my favourite services that I have running on my home network.

I definitely take Pi-Hole for granted at home and only really appreciate how effective it is when I travel and connect to other networks.

While the function of Pi-Hole is brilliant, my data warehouse analyst and reporting background craves more power from the metrics behind it.

By default, Pi-Hole offers a simple dashboard with a few catchy graphs and metrics. This can be hypnotizing to watch when first installing Pi-Hole, seeing with shock how many DNS queries are blocked while visiting different websites.

Behind the scenes there is a SQLite database on the server for power users to query.

We need more power!

For me, there are a couple of problems with the Pi-Hole dashboard, and I crave more functionality and power than what is on offer.

Some issues I have:

  • On my home network I have two Pi-Hole servers working in redundancy.  The main Pi-Hole server sits on a hypervisor while the backup is on a Raspberry Pi.  The main rationale is that if a power outage knocks the hypervisor offline while I am not at home, the backup Pi-Hole server will take over and save an angry phone call from the wife on why she can’t access the internet.

    One of my wishes is to see the combined Pi-Hole metrics across both servers in one dashboard.

  • A couple of items on the network create a large amount of noise that skews the metrics.  For instance, the security cameras used to query an NTP server every second until I routed them through an NTP server at home, and my work computer floods particular addresses with traffic when it drops off the VPN.

    I would like to exclude these addresses from the metrics, but not from Pi-Hole’s functionality.

  • Access to better analytics from the Pi-Hole data. 
    • How about a heat map of blocked queries on devices? 
    • Or weekly trends with a moving average? 
    • Or the ability to drill into a time when there is a high amount of blocked traffic and determine which device and application are behind it? (Answer: my kid playing mobile app games)

Napkin Solution

This is what I am working on at the moment to extract and load the data from my two Pi-Hole servers into a Postgres database to query from.

  1. An NFS share is set up on both Pi-Hole servers to share out the pihole-FTL.db to a Linux server
  2. A cron job runs every x minutes to run a Python script.  The script queries the pihole-FTL.db for the records between the last timestamp it loaded and the current timestamp, and lands the data into a “landing” schema on the Postgres database (a shell sketch of this step is shown after this list)
  3. A stored procedure runs to transfer and normalise the data into an ods schema
  4. A reporting tool then queries off the ods schema.  In this case I am using Apache Superset, just because it is freely available and its web nature means I can share dashboards to multiple devices without installing client software.
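
My actual step 2 is a Python script, but the core of it fits in a few lines of shell as a sketch: pull everything newer than the last loaded timestamp out of pihole-FTL.db with the sqlite3 CLI and pipe it into the landing table with psql. The mount path, connection details, landing schema and table name below are assumptions about my setup; the queries table and its columns are standard pihole-FTL.db.

#!/bin/bash
# Sketch of the incremental pull from pihole-FTL.db (over the NFS mount) into Postgres.
# Assumes a ~/.pgpass entry (or PGPASSWORD) so psql does not prompt when run from cron.
FTL_DB=/mnt/pihole1/etc-pihole/pihole-FTL.db
PG="psql --host localhost --port 9432 --username jonny -d work"

# High-water mark: the latest timestamp already landed (0 on the first run)
LAST_TS=$($PG -At -c "SELECT COALESCE(MAX(timestamp), 0) FROM landing.pihole_queries")

# Export the new rows as CSV and stream them straight into the landing table
sqlite3 -csv "$FTL_DB" \
  "SELECT id, timestamp, type, status, domain, client FROM queries WHERE timestamp > $LAST_TS;" \
  | $PG -c "\copy landing.pihole_queries (id, timestamp, type, status, domain, client) FROM STDIN WITH (FORMAT csv)"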

Performance of querying SQLite through an NFS share

Initially I was worried that querying the pihole-FTL.db over the NFS share would be too slow to be practical.

I was thinking of a few clunky solutions:

  1. Copy the entire DB over to the Postgres host and load it from there
  2. Write a script that runs on the Pi-Hole server to extract the incremental data from the pihole-FTL.db into files and load the files

Thankfully, with the volume of traffic we produce, it is not a concern.

The initial load of copying all the data out of the Pi-Hole database does take a long time if there is a large amount of history to transfer across.

But once the initial load is done, small incremental bites of data only take a few seconds.
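
If you want to gauge this for your own traffic before building anything, timing one representative incremental query over the NFS mount tells you most of what you need to know (the mount path is again just my setup):

# How long does pulling the last 15 minutes of queries take over NFS?
time sqlite3 /mnt/pihole1/etc-pihole/pihole-FTL.db \
  "SELECT COUNT(*) FROM queries WHERE timestamp > strftime('%s','now','-15 minutes');"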

Current state

I have a pilot running on my home network as a proof of concept.

Fundamentally it works fine and I can glean some interesting metrics from my Pi-Hole servers.

At the moment the security is atrocious, as I am just using a superuser username and password throughout the pipeline.

Once I have cleaned up the Postgres SQL and Python code, I can share the complete solution.