Qbot Technical Analysis
Introduction
Qbot (also known as Qakbot) is a sophisticated banking Trojan that first appeared in 2007. Initially designed to steal banking credentials, it has since evolved into a highly versatile malware with capabilities such as self-propagation, remote code execution (RCE), and facilitating ransomware deployment. Typically delivered via phishing emails and exploit kits, Qakbot remains a persistent threat in modern cyber attacks.
In this technical analysis, I focus on key aspects of reversing Qakbot: decrypting its strings, resolving its API structures, and extracting payloads using Binary Refinery.
Check out my code repository here
Sample information
SHA256: B92C0AAFB4E9B0FC2B023DBB14D7E848249F29E02B0E4CD8624CE27E55C9AC4C
Target Machine: x32
First seen: 2020-08-17 15:35:21 UTC
Sample: MalwareBazaar
Unpacking
The sample first stops at a VirtualProtect breakpoint. Returning to the user code, we can see the unpacked file is already stored in memory.
Inspecting the memory dump reveals that this file’s sections differ from those of the original packed file.
Analyze unpacked file
SHA256: D20B23F825B578B8EB266DCE9BC9D89C1AAA38EA972924E59B599FE4FAC4D3B9
Opening the file in DIE suggests that it may still be packed or contain another stage, based on the high entropy sections.
Examining the sections in pestudio, we find two high-entropy sections:
- .rdata (7.643)
- .rsrc (7.843)
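For context, these figures are Shannon entropy in bits per byte, on a scale from 0 to 8, and values close to 8 usually indicate compressed or encrypted data. A minimal sketch of the calculation follows. The function name `shannon_entropy` is my own, and the tools may differ in details such as which byte ranges they measure.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Running this over the raw bytes of a dumped section gives a number comparable to the ones above.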
String decryption
Here is the string decryption function used by Qbot.
From the Code References tab, we can see that this function is called 55 times, which is a significant number. I decided to implement it in Python so that a script could find the call sites and then either replace the offsets or create an enum list for later use with the Binary Ninja API.
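# Run in the Binary Ninja Python console, where `bv` and `log_info` are available
def strdec(offset):
    # Encrypted string table and its 64-byte XOR key (addresses are specific to this sample)
    enc_str_addr = 0x0040b898
    enc_str_size = 0x373b
    enc_str = bv.read(enc_str_addr, enc_str_size)
    xor_key_addr = 0x00410130
    xor_key_size = 0x40
    xor_key = bv.read(xor_key_addr, xor_key_size)
    dec = ""
    length = 0  # stays 0 if the offset is out of range or no terminator is found
    if offset < enc_str_size - 1:
        temp = offset
        while temp < enc_str_size - 1:
            # A ciphertext byte equal to its key byte decrypts to NUL: end of string
            if xor_key[temp & 0x3f] == enc_str[temp]:
                length = temp - offset
                break
            temp += 1
    for i in range(length):
        dec += chr(xor_key[(offset + i) & 0x3f] ^ enc_str[offset + i])
    log_info(f"Decrypted string: {dec} at offset {offset}")
    return dec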
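To check the logic outside Binary Ninja, the same scheme can be sketched on plain bytes. This is only an illustration under my own assumptions: `xor_decrypt_at` and `xor_encrypt_table` are hypothetical names, the table and key are passed in rather than read from the binary, and the scan runs to the end of the table. What it mirrors from the function above is that the key is indexed by the absolute position in the table and that a decrypted NUL byte ends the string.

```python
def xor_decrypt_at(table: bytes, key: bytes, offset: int) -> str:
    """Decrypt the NUL-terminated string that starts at offset in table."""
    out = bytearray()
    for pos in range(offset, len(table)):
        plain = table[pos] ^ key[pos % len(key)]  # key index follows table position
        if plain == 0:  # ciphertext byte equals key byte: end of string
            break
        out.append(plain)
    return out.decode("latin-1")

def xor_encrypt_table(strings, key: bytes) -> bytes:
    """Hypothetical test helper: build an encrypted table of NUL-terminated strings."""
    plain = b"".join(s.encode("latin-1") + b"\x00" for s in strings)
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(plain))

# Example with a made-up 64-byte key (not the sample's key)
key = bytes(range(1, 65))
table = xor_encrypt_table(["kernel32.dll", "avp.exe"], key)
first = xor_decrypt_at(table, key, 0)
second = xor_decrypt_at(table, key, len("kernel32.dll") + 1)
```

With a 64-byte key, `pos % len(key)` is equivalent to the `& 0x3f` mask used in the script.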
The results looked good, so I moved on to the enum creation. Using MLIL, my plan was to:
- Locate all caller sites of the str_decrypt function.
- Extract string offsets and decrypt them.
- Create an anonymous enum type, where the enum members are the decrypted strings and their corresponding offsets.
# `bv` is provided by the Binary Ninja scripting console
from binaryninja import MediumLevelILOperation, Type, TypeBuilder, log_info

func = bv.get_functions_by_name("str_decrypt")[0]
# Get the callsites
caller_sites = [cs for cs in func.caller_sites]
# Store the decrypted strings and offsets
decs = []
for cs in caller_sites:
    # Get the string offset
    offset = cs.mlil.params[0]
    # At some caller sites the offset is a constant, at others it is a variable
    if offset.operation == MediumLevelILOperation.MLIL_CONST:
        offset = offset.constant
        decs.append((strdec(offset), offset))
    else:
        log_info(f"Failed to decrypt string at {hex(cs.address)} because offset is not a constant ({offset.operation})")

#### create an enum and store decs ###
bv.begin_undo_actions()
# Anonymous enum
enum = Type.enumeration(arch=bv.arch, members=decs, width=4)
# Enum building
builder = TypeBuilder.enumeration()
for member in enum.members:
    builder.append(member.name, member.value)
# Register the type under a fixed name, then look it up by that name
bv.define_user_type("dec_enum", builder.immutable_copy())
enum_type = bv.get_type_by_name("dec_enum")
bv.commit_undo_actions()
After changing the argument type to enum dec_enum str_offset, we can see that all offsets are replaced with readable strings, making the code much easier to understand.
Resolving API from struct
Qbot also uses an API struct that contains the information necessary to resolve APIs from different DLLs. Here is the implementation of the mw_resolve_api_struct function.
The API struct can be represented using the following structure:
struct api_struct __packed
{
    void* api_addr;
    enum dec_enum api_name;
    enum dec_enum dll_name;
};
Building on the string decryption work, my plan is to write a function that walks through the struct, decrypts all the API and DLL names, and adds them to the enum dec_enum that we've already built.
Notes for the script that extracts the API struct:
- Loop through the struct array to determine its length.
- Apply the appropriate struct type to the array.
- Decrypt the API and DLL names.
- Rename the api_addr variable based on the corresponding api_name.
Just like in the String decryption section, I will locate all the caller sites of the mw_resolve_api_struct function, retrieve the data variable, and calculate the size of the struct array. After that, I will apply the struct array type to the corresponding address.
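func = bv.get_functions_by_name("mw_resolve_api_struct")[0]
caller_sites = list(func.caller_sites)
# Addresses of every struct array passed to the resolver
structs_array = []
for cs in caller_sites:
    addr = cs.mlil.params[0].constant
    structs_array.append(addr)
    buffer = bv.read(addr, 0x1000)
    # count_structs function uses Struct to identify the size of the struct array
    count = count_structs(buffer)
    log_info(f"Found {count} structs at {hex(addr)}")
    # Apply the type
    bv.define_user_data_var(addr, f"api_struct [{count}]")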
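The count_structs helper is not shown in this post. As a rough idea of what such a helper could look like, here is a hypothetical stand-in named `count_api_entries`. It rests on assumptions of mine that are not confirmed by the sample: each entry is 12 bytes (a 32-bit pointer followed by two 4-byte enum values, little-endian), and the array ends at the first all-zero entry or at the first entry whose name offsets fall outside the encrypted string table (size 0x373b, taken from the decryption script).

```python
import struct

ENTRY = struct.Struct("<III")  # api_addr, api_name offset, dll_name offset
STR_TABLE_SIZE = 0x373B        # size of the encrypted string table in this sample

def count_api_entries(buffer: bytes, table_size: int = STR_TABLE_SIZE) -> int:
    """Count leading entries that look like valid api_struct records (heuristic)."""
    count = 0
    for pos in range(0, len(buffer) - ENTRY.size + 1, ENTRY.size):
        api_addr, api_name, dll_name = ENTRY.unpack_from(buffer, pos)
        if (api_addr, api_name, dll_name) == (0, 0, 0):
            break  # assumed terminator: an all-zero entry
        if api_name >= table_size or dll_name >= table_size:
            break  # offsets outside the string table cannot be names
        count += 1
    return count
```

Because this is a heuristic, it is worth checking the resulting array length against the decompiled loop in mw_resolve_api_struct before trusting it.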
As a result, the data buffers are converted into struct arrays, but many api_name and dll_name fields will still appear as raw offsets because they have not yet been added to dec_enum.
To resolve this, we need to decrypt all strings referenced by offsets in each struct and add them to the dec_enum type. Here's the code to accomplish this.
for structs in structs_array:
    structs = bv.get_data_var_at(structs)
    for structure in structs:
        api_addr = structure['api_addr'].value
        api_name = update_member(structure['api_name'])
        dll_name = update_member(structure['dll_name'])
        bv.define_user_data_var(api_addr, "void*")
        var = bv.get_data_var_at(api_addr)
        var.name = "mw_" + api_name
        log_info(f"Decrypted API name: {api_name} at {hex(api_addr)}")
        log_info(f"Decrypted DLL name: {dll_name} at {hex(api_addr)}")
The update_member function retrieves the offset value, decrypts the corresponding string, and adds it to dec_enum if it is not already present.
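def update_dec_enum(name, offset):
    # Re-register dec_enum with one additional member
    enum_t = bv.types['dec_enum'].mutable_copy()
    enum_t.append(name, offset)
    bv.define_user_type("dec_enum", enum_t)

def update_member(arg):
    try:
        # Plain integer offset: decrypt it and add it to dec_enum
        offset = arg.value
        decrypted = strdec(offset)
        update_dec_enum(decrypted, offset)
    except Exception:
        # Fallback: the value is wrapped (for example an existing enum member), so unwrap it
        offset = arg.value.value
        decrypted = strdec(offset)
    return decrypted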
Here are some struct arrays after running the code.
Gathering information
Check for AntiVirus
- Qbot uses CreateToolhelp32Snapshot to take a process snapshot and check if any of the antivirus programs listed below are running.
Str-offset | Name |
---|---|
0x1f7e | ccSvcHst.exe |
0x1c9b | avgcsrvx.exe;avgsvcx.exe;avgcsrva.exe |
0x20e2 | MsMpEng.exe |
0x2ecf | mcshield.exe |
0x3371 | avp.exe;kavtray.exe |
0x357d | egui.exe;ekrn.exe |
0x3632 | bdagent.exe;vsserv.exe;vsservppl.exe |
0xaf3 | AvastSvc.exe |
0x748 | coreServiceShell.exe;PccNTMon.exe;NTRTScan.exe |
0x6b2 | SAVAdminService.exe;SavService.exe |
0x1e36 | fshoster32.exe |
0x2c6f | WRSA.exe |
0x4e1 | vkise.exe;isesrv.exe;cmdagent.exe |
0x2df7 | ByteFence.exe |
0x85e | MBAMService.exe;mbamgui.exe |
0x2a93 | fmon.exe |
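The matching step can be sketched as follows. This is an illustration, not Qbot's code: the process list is a plain Python list instead of a Toolhelp snapshot, the names `parse_blocklist` and `find_blocklisted` are mine, and the case-insensitive comparison is an assumption.

```python
def parse_blocklist(entries):
    """Split 'a.exe;b.exe' style entries into a set of lowercase process names."""
    names = set()
    for entry in entries:
        names.update(n.strip().lower() for n in entry.split(";") if n.strip())
    return names

def find_blocklisted(running, entries):
    """Return the running process names that appear in the blocklist."""
    blocklist = parse_blocklist(entries)
    return sorted({p for p in running if p.lower() in blocklist})

# Two decrypted entries from the table above, checked against a made-up process list
hits = find_blocklisted(
    ["explorer.exe", "MsMpEng.exe", "AVP.EXE"],
    ["MsMpEng.exe", "avp.exe;kavtray.exe"],
)
```

Each table row decrypts to one string, so a row such as `avp.exe;kavtray.exe` covers several process names of the same product.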
Check if the process is running as SYSTEM
- Malware often attempts to escalate privileges or verify if it is running with SYSTEM access before performing privileged actions.
- Qbot first uses GetTokenInformation to retrieve the token information of the current process.
- It then uses AllocateAndInitializeSid to construct a SID whose first sub-authority is 18 (SECURITY_LOCAL_SYSTEM_RID). This yields S-1-5-18, the well-known SID of the LocalSystem account, which is the most privileged built-in account on a Windows system.
- Next, it calls EqualSid to compare this SID with the retrieved token information, determining whether the process is running under the LOCAL SYSTEM security context.
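To make the SID concrete, here is a small sketch of the binary SID layout in Python. The helper names `build_sid` and `sid_to_string` are mine, and comparing raw bytes is a simplification of what EqualSid does. The layout itself (revision, sub-authority count, 6-byte big-endian identifier authority, little-endian 32-bit sub-authorities) is the documented Windows SID structure.

```python
import struct

def build_sid(authority: int, *sub_authorities: int) -> bytes:
    """Pack a revision-1 SID into its binary layout."""
    header = struct.pack("BB", 1, len(sub_authorities))  # Revision, SubAuthorityCount
    header += authority.to_bytes(6, "big")               # IdentifierAuthority
    body = b"".join(struct.pack("<I", s) for s in sub_authorities)
    return header + body

def sid_to_string(sid: bytes) -> str:
    """Render a binary SID in the familiar S-R-A-S... form."""
    revision, count = sid[0], sid[1]
    authority = int.from_bytes(sid[2:8], "big")
    subs = struct.unpack_from(f"<{count}I", sid, 8)
    return "-".join(["S", str(revision), str(authority), *map(str, subs)])

# SECURITY_NT_AUTHORITY (5) with SECURITY_LOCAL_SYSTEM_RID (18)
LOCAL_SYSTEM = build_sid(5, 18)
```

A token whose user SID matches these 12 bytes belongs to a process running as LocalSystem.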
Check analysis tools
Qbot uses CreateToolhelp32Snapshot to take a process snapshot and check if any of the tools listed below are running.
Name |
---|
Fiddler.exe |
samp1e.exe |
sample.exe |
runsample.exe |
lordpe.exe |
regshot.exe |
Autoruns.exe |
dsniff.exe |
VBoxTray.exe |
HashMyFiles.exe |
ProcessHacker.exe |
Procmon.exe |
Procmon64.exe |
netmon.exe |
vmtoolsd.exe |
vm3dservice.exe |
VGAuthService.exe |
pr0c3xp.exe |
ProcessHacker.exe |
CFF Explorer.exe |
dumpcap.exe |
Wireshark.exe |
idaq.exe |
idaq64.exe |
TPAutoConnect.exe |
ResourceHacker.exe |
vmacthlp.exe |
OLLYDBG.EXE |
windbg.exe |
bds-vision-agent-nai.exe |
bds-vision-apis.exe |
bds-vision-agent-app.exe |
MultiAnalysis_v1.0.294.exe |
x32dbg.exe |
VBoxTray.exe |
VBoxService.exe |
Tcpview.exe |
Check VMs
- Qbot uses SetupDiGetDeviceRegistryPropertyA to retrieve device properties.
- It checks whether any of the virtual machine or sandbox-related strings below appear in them.
Name |
---|
VMware Pointing |
VMware Accelerated |
VMware SCSI |
VMware SVGA |
VMware Replay |
VMware server memory |
CWSandbox |
Virtual HD |
QEMU |
Red Hat VirtIO |
srootkit |
VMware VMaudio |
VMware Vista |
VBoxVideo |
VBoxGuest |
- Verifies the existence of certain services.
Name |
---|
vmxnet |
vmscsi |
VMAUDIO |
vmdebug |
vm3dmp |
vmrawdsk |
vmx_svga |
ansfltr |
sbtisht |
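The device-string check from the first table can be sketched as a substring search. This is an illustration under my own assumptions: the input is a plain list of device description strings rather than the output of the SetupAPI calls, the match is a case-insensitive substring test, and `find_vm_markers` is a hypothetical name.

```python
# Markers taken from the device-string table above
VM_MARKERS = [
    "VMware Pointing", "VMware Accelerated", "VMware SCSI", "VMware SVGA",
    "VMware Replay", "VMware server memory", "CWSandbox", "Virtual HD",
    "QEMU", "Red Hat VirtIO", "srootkit", "VMware VMaudio", "VMware Vista",
    "VBoxVideo", "VBoxGuest",
]

def find_vm_markers(device_strings, markers=VM_MARKERS):
    """Return the markers that occur in any of the given device strings."""
    hits = []
    for marker in markers:
        needle = marker.lower()
        if any(needle in s.lower() for s in device_strings):
            hits.append(marker)
    return hits

# Made-up device descriptions for demonstration
found = find_vm_markers(["VMware SVGA 3D", "Intel(R) HD Graphics"])
```

An empty result means none of the markers were seen, which is the outcome Qbot expects on a physical machine.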
Extract payload
Qbot stores its payload inside the resource section, encrypts it using RC4, and compresses it with blzpack.
By inspecting the binary with Resource Hacker, we can see that dumped.bin contains seven resources, of which resource 307 is the largest.
After using the resolved APIs to extract the resource data, Qbot passes it to mw_call_decrypt_resource for further processing.
Decrypt RC4
The data is first decrypted using RC4, where the key is the first 20 bytes of the data. After decryption, the first 20 bytes of the decrypted data are stored as the SHA-1 checksum. Then, the rest of the data is hashed using SHA-1, and the two values are compared for verification.
The entire process can be implemented using the binary-refinery pipeline below.
- Extract the resource.
- Cut the first 20 bytes as k (the RC4 key).
- Decrypt the rest with k to get the compressed PE, prefixed by its checksum.
- Cut the first 20 bytes of the decrypted data, which are the SHA-1 checksum.
- Finally, hash the rest with SHA-1 and compare the result with the checksum.
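For readers who prefer plain Python, the steps above can be sketched as follows. This is not the binary-refinery pipeline itself, only an equivalent under the layout described in this section (20-byte RC4 key, then ciphertext whose plaintext starts with a 20-byte SHA-1 checksum). The function names `rc4` and `decrypt_resource` are mine.

```python
import hashlib

def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key-scheduling algorithm (KSA)
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:  # pseudo-random generation algorithm (PRGA)
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

def decrypt_resource(blob: bytes) -> bytes:
    """Decrypt a resource blob and verify its embedded SHA-1 checksum."""
    key, encrypted = blob[:20], blob[20:]
    decrypted = rc4(key, encrypted)
    checksum, payload = decrypted[:20], decrypted[20:]
    if hashlib.sha1(payload).digest() != checksum:
        raise ValueError("SHA-1 checksum mismatch")
    return payload
```

The returned payload is still compressed at this point and has to go through the decompression step described next.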
Decompress Blzpack
Qbot compresses important data before encrypting it with RC4. The compression and decompression functions are implementations of the blzpack library, with a minor modification to the magic number (changed from 0x626C7A1A to 0x616CD31A).
Here I simply reused the code from sysopfb, with a few modifications due to the magic number change.
from ctypes import byref, c_int, cdll, create_string_buffer
import struct
import sys

# Requires brieflz.dll in the working directory
brieflz = cdll.LoadLibrary('./brieflz.dll')
DEFAULT_BLOCK_SIZE = 1024 * 1024

### Code from sysopfb
### https://github.com/sysopfb/Malware_Scripts/blob/master/qakbot/blzpack.py
def decompress_data(data, blocksize=DEFAULT_BLOCK_SIZE, level=1):
    decompressed_data = b""
    max_packed_size = brieflz.blz_max_packed_size(blocksize)
    # Each block starts with a 24-byte big-endian header
    (magic,level,packedsize,crc,hdr_depackedsize,crc2) = struct.unpack_from('>IIIIII', data)
    data = data[24:]
    while magic == 0x626C7A1A and len(data) > 0:
        compressed_data = create_string_buffer(data[:packedsize])
        workdata = create_string_buffer(blocksize)
        depackedsize = brieflz.blz_depack(byref(compressed_data), byref(workdata), c_int(hdr_depackedsize))
        if depackedsize != hdr_depackedsize:
            print("[!] Decompression error")
            print("[!] DepackedSize: "+str(depackedsize) + "\nHdrVal: "+str(hdr_depackedsize))
            return None
        decompressed_data += workdata.raw[:depackedsize]
        data = data[packedsize:]
        if len(data) > 0:
            # Read the header of the next block
            (magic,level,packedsize,crc,hdr_depackedsize,crc2) = struct.unpack_from('>IIIIII', data)
            data = data[24:]
        else:
            break
    return decompressed_data

def main():
    if len(sys.argv) != 2:
        print(f"[!] Usage: {sys.argv[0]} <compressed_file>")
        sys.exit(1)
    with open(sys.argv[1], "rb") as f:
        data = f.read()
    print(f"[+] Read data from file {sys.argv[1]} with size {len(data)} bytes")
    # Fix the magic number
    data = data.replace(b"\x61\x6c\xd3\x1a", b"\x62\x6C\x7A\x1A")
    print("[+] Decompressing data...")
    decompressed_data = decompress_data(data)
    if decompressed_data is None:
        # decompress_data already printed the error details
        sys.exit(1)
    print(f"[+] Successfully decompressed data : {decompressed_data[:20]}")
    with open("decompressed.bin", "wb") as f:
        f.write(decompressed_data)
    print("[+] Decompressed data written to decompressed.bin")

if __name__ == "__main__":
    main()
After running the code with the compressed data, we obtain the next stage, which is another PE file.
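One caveat about the global `data.replace` call in the script: it also rewrites the four magic bytes if they happen to occur inside the compressed data, which would corrupt that block. It worked for this sample, but a more careful variant could walk the block headers and patch only the magic fields. Here is a sketch of that idea. The function name `patch_block_magics` is mine, and the 24-byte header layout is taken from the `struct.unpack_from('>IIIIII', ...)` call in the script.

```python
import struct

BLZ_MAGIC = 0x626C7A1A    # original blzpack magic
QBOT_MAGIC = 0x616CD31A   # modified magic used by this sample
HEADER = struct.Struct(">IIIIII")  # magic, level, packedsize, crc, depackedsize, crc2

def patch_block_magics(data: bytes) -> bytes:
    """Rewrite the magic of each block header, leaving block payloads untouched."""
    out = bytearray(data)
    pos = 0
    while pos + HEADER.size <= len(out):
        magic, _level, packed_size, _crc, _depacked, _crc2 = HEADER.unpack_from(out, pos)
        if magic not in (QBOT_MAGIC, BLZ_MAGIC):
            break  # not a block header: stop rather than guess
        struct.pack_into(">I", out, pos, BLZ_MAGIC)
        pos += HEADER.size + packed_size  # jump to the next block header
    return bytes(out)
```

The result can be passed to `decompress_data` in place of the output of the global replace.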
Extract IOC
Qbot then goes on to decrypt resource 308 in the next stage, in the same way as shown above. Since this stage contains two resources (308 and 311), I will reuse the binary-refinery pipeline to extract both.
Resource 308: Based on some excellent research blogs, I discovered that this contains campaign information, such as the botnet ID and compilation time.
Resource 311: This resource contains the C2 IP addresses used by Qbot.
Conclusion
Qbot is an advanced piece of malware, and it felt all the more so because this was my first time analyzing a real-world malware sample. There were so many new things that I had to pause the analysis several times to research them. Due to time constraints, as my new semester has started, I unfortunately have to wrap up this analysis quickly. I hope you find something interesting in this post.