Python – Unpacking a COMP-3 number

New business requirement today.  We have some old mainframe files we have to run through Qlik Replicate.

Three problems we have to overcome:

  1. The files are in fixed width format
  2. The files are EBCDIC encoded, specifically code page 1047
  3. Inside the records there are COMP-3 packed fields

The mainframe team kindly provided a schema file showing how many bytes make up each field, so we could divide up each fixed width record by reading a set number of bytes per field.
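Splitting a record by those byte counts can be sketched roughly as below. The field names and widths are made up for illustration (the real ones come from the schema file), and `split_record` is a hypothetical helper, not something from a library.

```python
# Hypothetical schema: (field name, width in bytes). The real names and
# widths would come from the schema file supplied with the data.
SCHEMA = [("customer_id", 4), ("name", 10), ("balance", 3)]


def split_record(record, schema):
    """Slice one fixed width record (bytes) into a dict of raw byte fields."""
    fields = {}
    offset = 0
    for name, width in schema:
        fields[name] = record[offset:offset + width]
        offset += width
    return fields


# Made-up 17 byte record, just to show where the cuts fall
record = bytes(range(17))
fields = split_record(record, SCHEMA)
print({name: len(raw) for name, raw in fields.items()})
```

Each value is still raw bytes at this point; text fields then go through the decode step and packed fields through the COMP-3 unpacking.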

Python's built-in decode method on bytes converts each field we read into readable text:

focus_data_ascii = focus_data.decode("cp1047").rstrip()
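A small round trip shows the decode step on its own. This is only a sketch with made-up bytes. It uses cp037 rather than cp1047 as an assumption: cp037 is one of the EBCDIC codecs bundled with Python, and as far as I know the two code pages agree on letters, digits and spaces, so it is a safe stand-in for this example.

```python
# Assumption: cp037 is used because it ships with Python. Swap in "cp1047"
# if that codec is available in your environment.
CODEC = "cp037"

# "HELLO" followed by three EBCDIC spaces (0x40), as a padded text field
raw_field = b"\xc8\xc5\xd3\xd3\xd6\x40\x40\x40"

# Decode to a normal Python string and drop the trailing padding
text = raw_field.decode(CODEC).rstrip()
print(repr(text))
```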

The hard part was the COMP-3 packed fields. They are built with some bit magic, and working with bits and shifts is not my strong suit.
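For anyone else who finds the bit magic opaque: in a packed field each byte holds two decimal digits, one per half byte (nibble), and the low nibble of the last byte holds the sign (0xD for negative, typically 0xC or 0xF otherwise). The sketch below pulls a made-up three byte value apart to show the layout; `split_nibbles` is a hypothetical helper written for this illustration.

```python
def split_nibbles(packed):
    """Return the high and low half of every byte, in order."""
    nibbles = []
    for byte in packed:
        nibbles.append((byte & 0xF0) >> 4)  # high nibble
        nibbles.append(byte & 0x0F)         # low nibble
    return nibbles


# Made-up example: the bytes 0x12 0x34 0x5D
packed = bytes([0x12, 0x34, 0x5D])
nibbles = split_nibbles(packed)

# Everything except the final nibble is a digit; the final nibble is the sign
digits, sign = nibbles[:-1], nibbles[-1]
print(digits, hex(sign))
```

Here the digits come out as 1, 2, 3, 4, 5 with a sign nibble of 0xD, so the field represents -12345 before any decimal places are applied.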

I have been a bit spoilt so far working with Python, where most problems can be solved by “find the module to do the magic for you.”

But after ages of scouring for a module to handle the conversion for me, all I had was a lot of false leads and questionable code that I tested with no luck.

Eventually I stumbled upon:

zorchenhimer/cobol-packed-numbers.py

Thank goodness.

It still works with bits ‘n’ shift magic, but it works on the data that I have, and I now have readable text.

I extended the code to fulfil the business requirements:

# Source https://gist.github.com/zorchenhimer/fd4d4208312d4175d106
from array import array


def unpack_number(field, no_decimals):
    """ Unpack a COMP-3 number. """
    a = array('B', field)
    # Accumulate as an int so long fields keep every digit
    value = 0

    # For all but the last byte: two digits per byte, one in each half byte
    for focus_byte in a[:-1]:
        value = (value * 100) + (((focus_byte & 0xf0) >> 4) * 10) + (focus_byte & 0xf)

    # Last byte: the high half is the final digit, the low half is the sign
    focus_byte = a[-1]
    value = (value * 10) + ((focus_byte & 0xf0) >> 4)

    # Negative/positive check. If 0xd, it is a negative value
    if (focus_byte & 0xf) == 0xd:
        value = -value

    # If no_decimals == 0, it is just an int
    if no_decimals == 0:
        return value

    # Otherwise shift the decimal point left by no_decimals places
    return value / pow(10, no_decimals)
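One thing to watch: floats cannot represent every decimal fraction exactly, which can matter for money-like fields. A variant that returns a `decimal.Decimal` is sketched below. This is my own illustration rather than part of the gist, and the name `unpack_number_decimal` is hypothetical. It assumes Python 3, where iterating over bytes yields ints.

```python
from decimal import Decimal


def unpack_number_decimal(field, no_decimals):
    """Unpack a COMP-3 field (bytes) into an exact Decimal."""
    digits = 0

    # All but the last byte carry two digits each
    for byte in field[:-1]:
        digits = digits * 100 + (byte >> 4) * 10 + (byte & 0x0F)

    # Last byte: one digit in the high nibble, the sign in the low nibble
    last = field[-1]
    digits = digits * 10 + (last >> 4)
    if (last & 0x0F) == 0x0D:
        digits = -digits

    # Shift the decimal point left by no_decimals places without rounding
    return Decimal(digits).scaleb(-no_decimals)


# Made-up field: digits 12345, negative sign, two decimal places
print(unpack_number_decimal(bytes([0x12, 0x34, 0x5D]), 2))
```

The integer accumulator keeps every digit, and `scaleb` only moves the exponent, so no rounding happens along the way.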