Qbot Technical Analysis

Introduction

Qbot (also known as Qakbot) is a sophisticated banking Trojan that first appeared in 2007. Initially designed to steal banking credentials, it has since evolved into a highly versatile malware with capabilities such as self-propagation, remote code execution (RCE), and facilitating ransomware deployment. Typically delivered via phishing emails and exploit kits, Qakbot remains a persistent threat in modern cyber attacks.

In this technical analysis, I focus on key aspects of reversing Qakbot, including dealing with string decryption, resolving API structures, and extracting payloads using Binary Refinery.

Check out my code repository here

Sample information

SHA256: B92C0AAFB4E9B0FC2B023DBB14D7E848249F29E02B0E4CD8624CE27E55C9AC4C

Target Machine: x32

First seen: 2020-08-17 15:35:21 UTC

Sample: MalwareBazaar

Unpacking

The sample first stops at a VirtualProtect breakpoint. Returning to the user code, we can see the unpacked file is already stored in memory.

Inspecting the memory dump reveals that this file’s sections differ from those of the original packed file.

Analyze unpacked file

SHA256: D20B23F825B578B8EB266DCE9BC9D89C1AAA38EA972924E59B599FE4FAC4D3B9

Opening the file in DIE (Detect It Easy) suggests that it may still be packed or contain another stage, based on the high-entropy sections.

Examining the sections in pestudio, we find two high-entropy sections:

  • .rdata (7.643)
  • .rsrc (7.843)

String decryption

Here is the string decryption function used by Qbot.

From the Code References tab, we can see that this function is called 55 times, which is a significant number. I decided to reimplement it in Python and build a script that finds the encrypted strings and either replaces them or collects them in an enum for later use with the Binary Ninja API.

def strdec(offset):
    # Address and size of the encrypted string table and XOR key in this sample
    enc_str_addr = 0x0040b898
    enc_str_size = 0x373b
    enc_str = bv.read(enc_str_addr, enc_str_size)
    xor_key_addr = 0x00410130
    xor_key_size = 0x40
    xor_key = bv.read(xor_key_addr, xor_key_size)

    # Find the terminator: a decrypted zero byte, i.e. key byte == encrypted byte.
    # length stays 0 when the offset is out of range, so the loop below is safe.
    length = 0
    if offset < enc_str_size - 1:
        temp = offset
        while temp < enc_str_size - 1:
            if xor_key[temp & 0x3f] == enc_str[temp]:
                break
            temp += 1
        length = temp - offset

    dec = "".join(
        chr(xor_key[(offset + i) & 0x3f] ^ enc_str[offset + i]) for i in range(length)
    )
    log_info(f"Decrypted string: {dec} at offset {offset}")

    return dec
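
To sanity-check the logic outside Binary Ninja, here is a minimal, self-contained sketch of the same routine working on plain bytes. The key and the sample strings are made up for illustration, and build_blob is a hypothetical helper; only the 0x40-byte key length, the & 0x3f index mask, and the zero-byte terminator come from the function above.

```python
KEY_SIZE = 0x40  # key length used by the routine above


def xor_crypt(blob: bytes, key: bytes, offset: int) -> str:
    """Decrypt the NUL-terminated string starting at `offset` in `blob`."""
    out = bytearray()
    for pos in range(offset, len(blob)):
        byte = blob[pos] ^ key[pos & (KEY_SIZE - 1)]
        if byte == 0:  # same test as key[pos & 0x3f] == blob[pos]
            break
        out.append(byte)
    return out.decode("latin-1")


def build_blob(strings, key):
    """Hypothetical helper: encrypt strings into one blob, return (blob, offsets)."""
    plain = bytearray()
    offsets = []
    for s in strings:
        offsets.append(len(plain))
        plain += s.encode("latin-1") + b"\x00"
    blob = bytes(b ^ key[i & (KEY_SIZE - 1)] for i, b in enumerate(plain))
    return blob, offsets


# Made-up key and strings, only to exercise the routine.
demo_key = bytes(range(1, KEY_SIZE + 1))
demo_blob, demo_offsets = build_blob(["kernel32.dll", "VirtualAlloc"], demo_key)
demo_strings = [xor_crypt(demo_blob, demo_key, off) for off in demo_offsets]
```

Because the key index depends on the absolute position in the blob, a string can only be decrypted correctly from its real offset, which is why the call-site offsets matter in the next step.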

The results looked good, so I moved on to creating the enum. Using MLIL (Medium Level IL), my plan was to:

  • Locate all caller sites of the str_decrypt function.
  • Extract the string offsets and decrypt them.
  • Create an anonymous enum type whose members are the decrypted strings and their corresponding offsets.

# Run from the Binary Ninja scripting console, where bv, log_info,
# MediumLevelILOperation, Type and TypeBuilder are already available.
func = bv.get_functions_by_name("str_decrypt")[0]
# Get the call sites
caller_sites = list(func.caller_sites)

# Store the decrypted strings and offsets
decs = []

for cs in caller_sites:
    # Get the string offset
    offset = cs.mlil.params[0]
    # At some call sites the offset is a constant, at others it is a variable
    if offset.operation == MediumLevelILOperation.MLIL_CONST:
        offset = offset.constant
        decs.append((strdec(offset), offset))
    else:
        log_info(f"Failed to decrypt string at {hex(cs.address)} because offset is not a constant ({offset.operation})")

#### create an enum and store decs ###
bv.begin_undo_actions()

# Anonymous enum
enum = Type.enumeration(arch=bv.arch, members=decs, width=4)

# Enum building
builder = TypeBuilder.enumeration()
for member in enum.members:
    builder.append(member.name, member.value)

# define_user_type does not return the registered name, so look the type up by name
bv.define_user_type("dec_enum", builder.immutable_copy())
enum_type = bv.get_type_by_name("dec_enum")

bv.commit_undo_actions()

After changing the argument type to enum dec_enum str_offset, we can see that all offsets are replaced with readable strings, making the code much easier to understand.

Resolving API from struct

Qbot also uses an API struct that contains information necessary to resolve APIs from different DLLs. Here is the implementation of the mw_resolve_api_struct function.

The API struct can be represented using the following structure:

struct api_struct __packed
{
    void* api_addr;
    enum dec_enum api_name;
    enum dec_enum dll_name;
};

Based on our work with string encryption, my plan is to build a function that walks through the struct, decrypts all the API and DLL names and updates them in the enum dec_enum that we’ve already built.

Notes for the script that extracts the API struct:

  • Loop through the struct array to determine its length.
  • Apply the appropriate struct type to the array.
  • Decrypt the API and DLL names.
  • Rename the api_addr variable based on the corresponding api_name

Just like the String decryption section, I will locate all the caller sites of the mw_resolve_api_struct function, retrieve the data variable, and calculate the size of the struct array. After that, I will apply the struct array type to the corresponding address.

func = bv.get_functions_by_name("mw_resolve_api_struct")[0]
structs_array = []  # addresses of every struct array found

for cs in func.caller_sites:
    addr = cs.mlil.params[0].constant
    structs_array.append(addr)
    buffer = bv.read(addr, 0x1000)
    # count_structs function uses Struct to identify the size of the struct array
    count = count_structs(buffer)
    log_info(f"Found {count} structs at {hex(addr)}")
    # Apply the type
    bv.define_user_data_var(addr, f"api_struct [{count}]")
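
The count_structs helper is not shown in this post. A minimal sketch of how such a helper could look, assuming 32-bit little-endian fields (12 bytes per packed entry) and assuming the array ends at the first entry whose api_name or dll_name offset cannot index the encrypted string table. Both the stop condition and the ENC_STR_SIZE name are assumptions for illustration, not something taken from the sample.

```python
import struct

ENTRY = struct.Struct("<III")  # api_addr, api_name, dll_name (packed, 32-bit)
ENC_STR_SIZE = 0x373B          # size of the encrypted string table used by strdec


def count_structs(buffer: bytes, table_size: int = ENC_STR_SIZE) -> int:
    """Count consecutive entries that look like valid api_struct records."""
    count = 0
    # iter_unpack needs a buffer whose length is a multiple of the entry size
    usable = len(buffer) - len(buffer) % ENTRY.size
    for api_addr, api_name, dll_name in ENTRY.iter_unpack(buffer[:usable]):
        # Assumed stop condition: an offset that cannot index the string table
        if api_name >= table_size or dll_name >= table_size:
            break
        count += 1
    return count


# Two plausible entries followed by unrelated data.
demo = ENTRY.pack(0, 0x10, 0x20) + ENTRY.pack(0, 0x30, 0x20) + b"\xff" * 12
demo_count = count_structs(demo)
```

If the real arrays end differently (for example with an explicit count passed to the resolver), the stop condition would need to change accordingly.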

As a result, the data buffers are converted into struct arrays, but many api_name and dll_name fields still appear as raw offsets because they have not yet been added to dec_enum.

To resolve this, we need to decrypt all strings referenced by offsets in each struct and add them to the dec_enum type. Here’s the code to accomplish this.

for structs in structs_array:
    structs = bv.get_data_var_at(structs)
    for structure in structs:
        api_addr = structure['api_addr'].value
        api_name = update_member(structure['api_name'])
        dll_name = update_member(structure['dll_name'])
        bv.define_user_data_var(api_addr, "void*")
        var = bv.get_data_var_at(api_addr)
        var.name = "mw_" + api_name
        log_info(f"Decrypted API name: {api_name} at {hex(api_addr)}")
        log_info(f"Decrypted DLL name: {dll_name} at {hex(api_addr)}")

The update_member function retrieves the offset value, decrypts the corresponding string, and adds it to dec_enum if it is not already present.

def update_dec_enum(name, offset):
    enum_t = bv.types['dec_enum'].mutable_copy()
    enum_t.append(name, offset)
    bv.define_user_type("dec_enum", enum_t)

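def update_member(arg):
    try:
        # Plain integer: the offset is not in dec_enum yet, so decrypt and add it
        offset = arg.value
        decrypted = strdec(offset)
        update_dec_enum(decrypted, offset)
    except Exception:
        # Already resolved to an enum member: take the underlying offset
        offset = arg.value.value
        decrypted = strdec(offset)
    return decrypted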

Here are some struct arrays after running the code.

Gathering information

Check for AntiVirus

  • Qbot uses CreateToolhelp32Snapshot to take a process snapshot and check if any of the antivirus programs listed below are running.

Str-offset  Name
0x1f7e      ccSvcHst.exe
0x1c9b      avgcsrvx.exe;avgsvcx.exe;avgcsrva.exe
0x20e2      MsMpEng.exe
0x2ecf      mcshield.exe
0x3371      avp.exe;kavtray.exe
0x357d      egui.exe;ekrn.exe
0x3632      bdagent.exe;vsserv.exe;vsservppl.exe
0xaf3       AvastSvc.exe
0x748       coreServiceShell.exe;PccNTMon.exe;NTRTScan.exe
0x6b2       SAVAdminService.exe;SavService.exe
0x1e36      fshoster32.exe
0x2c6f      WRSA.exe
0x4e1       vkise.exe;isesrv.exe;cmdagent.exe
0x2df7      ByteFence.exe
0x85e       MBAMService.exe;mbamgui.exe
0x2a93      fmon.exe
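
The check itself boils down to comparing each running process name against the ';'-separated entries above. Here is a minimal sketch of that matching step, with the process list passed in as plain strings rather than read through CreateToolhelp32Snapshot. The case-insensitive comparison and the find_av_entries name are assumptions for illustration.

```python
AV_ENTRIES = {
    0x1F7E: "ccSvcHst.exe",
    0x1C9B: "avgcsrvx.exe;avgsvcx.exe;avgcsrva.exe",
    0x20E2: "MsMpEng.exe",
    0x3371: "avp.exe;kavtray.exe",
}  # subset of the list above, keyed by string offset


def find_av_entries(running, entries=AV_ENTRIES):
    """Return the string offsets whose process list matches a running process."""
    running_lower = {name.lower() for name in running}
    hits = []
    for offset, names in entries.items():
        # One entry can name several processes of the same product
        if any(n.lower() in running_lower for n in names.split(";")):
            hits.append(offset)
    return hits


demo_hits = find_av_entries(["explorer.exe", "msmpeng.exe", "kavtray.exe"])
```

Grouping several executables under one offset lets a single entry stand for one product, so a hit on any of its processes is enough.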

Check if the process is running as SYSTEM

  • Malware often attempts to escalate privileges or verify if it is running with SYSTEM access before performing privileged actions.
  • Qbot first uses GetTokenInformation to retrieve the token information of the current process.
  • It then uses AllocateAndInitializeSid to construct a SID whose first sub-authority is 18 (SECURITY_LOCAL_SYSTEM_RID), which identifies the LocalSystem account (S-1-5-18).
  • Next, it calls EqualSid to compare this SID with the one from the retrieved token information, determining whether the process is running under the LOCAL SYSTEM security context.
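
For reference, the SID being built is the well-known S-1-5-18 (NT AUTHORITY\SYSTEM). The sketch below builds its binary form in pure Python and compares it byte for byte, which is roughly what the EqualSid check amounts to. It does not call the Windows API, and the helper names and the example token SID are made up for illustration.

```python
import struct

SECURITY_NT_AUTHORITY = 5
SECURITY_LOCAL_SYSTEM_RID = 18


def build_sid(authority: int, *sub_authorities: int) -> bytes:
    """Serialize a SID: revision, count, 48-bit big-endian authority, LE sub-authorities."""
    header = struct.pack("BB", 1, len(sub_authorities))
    header += authority.to_bytes(6, "big")
    body = b"".join(struct.pack("<I", sa) for sa in sub_authorities)
    return header + body


def sid_to_string(sid: bytes) -> str:
    """Render a binary SID in the usual S-R-A-S... notation."""
    revision, count = sid[0], sid[1]
    authority = int.from_bytes(sid[2:8], "big")
    subs = struct.unpack_from(f"<{count}I", sid, 8)
    return "-".join(["S", str(revision), str(authority), *map(str, subs)])


system_sid = build_sid(SECURITY_NT_AUTHORITY, SECURITY_LOCAL_SYSTEM_RID)
token_user_sid = bytes.fromhex("010100000000000512000000")  # example token SID
is_system = token_user_sid == system_sid
```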

Check analysis tools

Qbot uses CreateToolhelp32Snapshot to take a process snapshot and check if any of the tools listed below are running.

Name
Fiddler.exe
samp1e.exe
sample.exe
runsample.exe
lordpe.exe
regshot.exe
Autoruns.exe
dsniff.exe
VBoxTray.exe
HashMyFiles.exe
ProcessHacker.exe
Procmon.exe
Procmon64.exe
netmon.exe
vmtoolsd.exe
vm3dservice.exe
VGAuthService.exe
pr0c3xp.exe
ProcessHacker.exe
CFF Explorer.exe
dumpcap.exe
Wireshark.exe
idaq.exe
idaq64.exe
TPAutoConnect.exe
ResourceHacker.exe
vmacthlp.exe
OLLYDBG.EXE
windbg.exe
bds-vision-agent-nai.exe
bds-vision-apis.exe
bds-vision-agent-app.exe
MultiAnalysis_v1.0.294.exe
x32dbg.exe
VBoxTray.exe
VBoxService.exe
Tcpview.exe

Check VMs

  • Qbot uses SetupDiGetDeviceRegistryPropertyA to retrieve device properties.

  • It then checks whether any of the following virtual machine or sandbox-related strings appear in them.

Name
VMware Pointing
VMware Accelerated
VMware SCSI
VMware SVGA
VMware Replay
VMware server memory
CWSandbox
Virtual HD
QEMU
Red Hat VirtIO
srootkit
VMware VMaudio
VMware Vista
VBoxVideo
VBoxGuest

  • Verifies the existence of certain services.

Name
vmxnet
vmscsi
VMAUDIO
vmdebug
vm3dmp
vmrawdsk
vmx_svga
ansfltr
sbtisht

Extract payload

Qbot stores its payload in the resource section, compressed with blzpack and then encrypted with RC4.

Inspecting the binary with Resource Hacker shows that dumped.bin contains seven resources, of which resource 307 is the largest.

After using the resolved APIs to extract the resource data, Qbot passes it to mw_call_decrypt_resource for further processing.

Decrypt RC4

The data is first decrypted using RC4, where the key is the first 20 bytes of the data. After decryption, the first 20 bytes of the decrypted data are stored as the SHA-1 checksum. Then, the rest of the data is hashed using SHA-1, and the two values are compared for verification.

The entire process can be implemented using the binary-refinery pipeline below.

  • Extract the resource.
  • Cut the first 20 bytes as k (the RC4 key).
  • Decrypt the rest with RC4 to get the checksum followed by the compressed PE.
  • Cut off the first 20 bytes of the decrypted data, which are the SHA-1 checksum.
  • Finally, hash the rest with SHA-1 and compare the result with the checksum.
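
The same steps can also be sketched in plain Python, without binary-refinery. RC4 is implemented inline to keep the example self-contained. The function names and the synthetic test blob are illustrative assumptions, while the layout (a 20-byte key, then the RC4 ciphertext of a 20-byte SHA-1 followed by the data) follows the description above.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4 (KSA + PRGA); encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def decrypt_resource(blob: bytes) -> bytes:
    """Return the verified payload, or raise ValueError on a checksum mismatch."""
    key, ciphertext = blob[:20], blob[20:]
    plain = rc4(key, ciphertext)
    checksum, payload = plain[:20], plain[20:]
    if hashlib.sha1(payload).digest() != checksum:
        raise ValueError("SHA-1 mismatch: wrong key or corrupted resource")
    return payload


# Synthetic blob built the same way, only to exercise the round trip.
demo_payload = b"compressed stage goes here"
demo_key = bytes(range(20))
demo_blob = demo_key + rc4(demo_key, hashlib.sha1(demo_payload).digest() + demo_payload)
recovered = decrypt_resource(demo_blob)
```

Raising on a mismatch mirrors the verification step: a wrong resource or key is caught before the data is handed to the decompressor.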

Decompress Blzpack

Qbot compresses important data before encrypting it with RC4. The compression and decompression functions are implementations of the blzpack library, with a minor modification to the magic number (changed from 0x626C7A1A to 0x616CD31A).
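
A small sketch of how the modified header could be recognized, assuming the 24-byte block header of six big-endian 32-bit fields (the same layout the script below unpacks with '>IIIIII'). The names parse_block_header and BlzHeader, as well as the field names, are made up for illustration.

```python
import struct
from typing import NamedTuple

BLZ_MAGIC = 0x626C7A1A   # original blzpack magic ("blz\x1a")
QBOT_MAGIC = 0x616CD31A  # modified magic seen in this sample
HEADER = struct.Struct(">IIIIII")


class BlzHeader(NamedTuple):
    magic: int
    level: int
    packed_size: int
    packed_crc: int
    depacked_size: int
    depacked_crc: int


def parse_block_header(data: bytes) -> BlzHeader:
    """Parse one 24-byte block header and reject unknown magic values."""
    header = BlzHeader(*HEADER.unpack_from(data))
    if header.magic not in (BLZ_MAGIC, QBOT_MAGIC):
        raise ValueError(f"unexpected magic {header.magic:#010x}")
    return header


demo = HEADER.pack(QBOT_MAGIC, 1, 0x100, 0, 0x400, 0) + b"\x00" * 0x100
demo_header = parse_block_header(demo)
```

Checking the magic per header, rather than rewriting the four bytes wherever they occur, avoids touching compressed data that happens to contain the same sequence.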

Here I simply reused the code from sysopfb, with a few modifications due to the magic number change.

from ctypes import *
import binascii
import zlib
import struct
import sys

brieflz = cdll.LoadLibrary('./brieflz.dll')
DEFAULT_BLOCK_SIZE = 1024 * 1024

### Code from sysopfb
### https://github.com/sysopfb/Malware_Scripts/blob/master/qakbot/blzpack.py
def decompress_data(data, blocksize=DEFAULT_BLOCK_SIZE, level=1):
    decompressed_data = b""
    max_packed_size = brieflz.blz_max_packed_size(blocksize)
    (magic,level,packedsize,crc,hdr_depackedsize,crc2) = struct.unpack_from('>IIIIII', data)
    data = data[24:]
    while magic == 0x626C7A1A and len(data) > 0:
        compressed_data = create_string_buffer(data[:packedsize])
        workdata = create_string_buffer(blocksize)
        depackedsize = brieflz.blz_depack(byref(compressed_data), byref(workdata), c_int(hdr_depackedsize))
        if depackedsize != hdr_depackedsize:
            print("[!] Decompression error")
            print("[!] DepackedSize: "+str(depackedsize) + "\nHdrVal: "+str(hdr_depackedsize))
            return None

        decompressed_data += workdata.raw[:depackedsize]
        data = data[packedsize:]

        if len(data) > 0:
            (magic,level,packedsize,crc,hdr_depackedsize,crc2) = struct.unpack_from('>IIIIII', data)
            data = data[24:]
        else:
            break

    return decompressed_data

  

def main():
    if len(sys.argv) != 2:
        print(f"[!] Usage: {sys.argv[0]} <compressed_file>")
        sys.exit(1)
  
    with open(sys.argv[1], "rb") as f:
        data = f.read()
        print(f"[+] Read data from file {sys.argv[1]} with size {len(data)} bytes")

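    # Fix the magic number
    data = data.replace(b"\x61\x6c\xd3\x1a", b"\x62\x6C\x7A\x1A")
    print("[+] Decompressing data...")

    decompressed_data = decompress_data(data)
    # decompress_data returns None when a block fails to unpack
    if decompressed_data is None:
        sys.exit("[!] Decompression failed")
    print(f"[+] Successfully decompressed data : {decompressed_data[:20]}")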

    with open("decompressed.bin", "wb") as f:
        f.write(decompressed_data)
        print("[+] Decompressed data written to decompressed.bin")

if __name__ == "__main__":
    main()

After running the code with the compressed data, we obtain the next stage, which is another PE file.

Extract IOC

As shown above, Qbot goes on to decrypt resource 308 in the next stage. Since this stage contains two resources (308 and 311), I reuse the binary-refinery pipeline to extract both.

Resource 308: Based on some excellent research blogs, I learned that this resource contains campaign information, such as the botnet ID and compilation time.

Resource 311: This resource contains the C2 IP addresses used by Qbot.

Conclusion

Qbot is an advanced piece of malware, and it felt especially so because this is my first time analyzing a real-world malware sample. There were so many new things that I often had to pause the analysis to research them. Due to time constraints, as my new semester has started, I unfortunately have to wrap up this analysis quickly. I hope you find something interesting in this post.

References

  • Aziz Farghly
  • sysopfb