Deobfuscate with Binary Ninja API
Introduction
Recently, I’ve become very interested in learning how to use the Binary Ninja API to build a deobfuscator. So, I read all their posts and started with a simple challenge from the Grand Reverse Engineering Challenge.
This article is about rewriting the author's original code: updating it for the current API and changing the parts that did not work as expected. The new code is here
The problems
This challenge makes reversing harder by dispatching through a jump table.
It stores an index in var_390 and then jumps to the address stored in the table at 0x006eba60. This table is an array of 0x5a addresses.
So we have to find a way to automatically extract the index, get the new address, and then patch it with an unconditional jump.
The idea
Get the addresses from the table
First of all, we have to get all addresses from the table. We know they are addresses of functions, so the plan is to read through each element of the table, check if they are already functions, create functions if not, then store the addresses in a global array.
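# bv is the BinaryView provided by the Binary Ninja scripting console
from binaryninja import BinaryReader, LowLevelILOperation, log_info, log_warn

jmp_table_addr = 0x6eba60  # Start of the jump table
n = 0x5a                   # Number of entries in the table
br = BinaryReader(bv)
br.seek(jmp_table_addr)
addrs = []
for i in range(n):
    addr = br.read64le()  # Each entry is a 64-bit little-endian address
    if not bv.get_function_at(addr):
        bv.create_user_function(addr)
        log_info("Adding function at: 0x%x" % addr)
    else:
        log_info("Function already exists at: 0x%x" % addr)
    addrs.append(addr)
# Let analysis finish so the new functions are available
bv.update_analysis_and_wait()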
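To make the table layout concrete, here is a minimal stand-alone sketch, independent of Binary Ninja, of what the loop above reads: consecutive 8-byte little-endian pointers. The helper name `parse_jump_table` and the sample bytes are hypothetical and only for illustration.

```python
import struct

def parse_jump_table(raw: bytes, count: int) -> list:
    """Decode `count` consecutive 64-bit little-endian addresses from raw bytes."""
    if len(raw) < count * 8:
        raise ValueError("buffer is shorter than the requested table")
    # "<" = little-endian, "Q" = unsigned 64-bit integer
    return list(struct.unpack_from(f"<{count}Q", raw, 0))

# Hypothetical three-entry table, built by hand for illustration
sample = struct.pack("<3Q", 0x401000, 0x401234, 0x4020F0)
table = parse_jump_table(sample, 3)
```

In the real script, the raw bytes could presumably be obtained with `bv.read(jmp_table_addr, n * 8)` instead of a `BinaryReader`; the result should be the same list of addresses.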
Use a pattern to find the instruction containing the index
Based on Xusheng’s idea, we will search for the pattern mov qword [rbp-0x{constant_offset}], because it appears every time the code stores the index of the next jump address.
But that alone is not enough: many addresses match the pattern without being inside any function from the jump table.
So we record the address range of each function, and whenever the pattern matches we check that the address lies in one of those ranges before extracting the index.
index_offset = 0x388  # Stack offset where the index is stored
search_text = f"mov qword [rbp-0x{index_offset:x}],"
# is_in_ranges and valid_addr_range are defined elsewhere in the script
for res in bv.find_all_text(bv.start, bv.end, search_text):
    addr, text, line = res
    if not is_in_ranges(addr, valid_addr_range):
        log_info(f"Skipping address {addr:#x}")
        continue
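The snippet relies on `is_in_ranges` and `valid_addr_range`, which are not shown here. A minimal sketch of how they might look, assuming `valid_addr_range` is a list of half-open `(start, end)` tuples. Both the representation and the implementation are assumptions, not the original author's code.

```python
def is_in_ranges(addr, ranges):
    """Return True if addr falls inside any half-open (start, end) range."""
    return any(start <= addr < end for start, end in ranges)

# Hypothetical ranges standing in for the functions reached through the jump table
valid_addr_range = [(0x401000, 0x401080), (0x401200, 0x4012C0)]

inside = is_in_ranges(0x401010, valid_addr_range)
outside = is_in_ranges(0x401100, valid_addr_range)
```

Inside Binary Ninja, such a list could plausibly be collected from each function's `address_ranges` property, whose entries expose `start` and `end`.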
Extract the index and patch with a jump
After we find all the addresses that store the index, we have to check that the surrounding code has the expected structure.
Using the LLIL, we walk the instructions that follow until we can confirm we are on the right track, then extract the address from the table.
    # Still inside the find_all_text loop; func is assumed to be the function containing addr
    llil_func = func.llil  # LLIL instructions of the function
    llil_ins = func.get_llil_at(addr)  # LLIL instruction at the matched address
    idx = llil_ins.instr_index  # Index of that instruction in the LLIL function
    if llil_func[idx + 1].operation == LowLevelILOperation.LLIL_GOTO:
        # The next instruction is a GOTO: continue just after its destination
        llil_next = llil_func[idx + 1].dest + 1
    else:
        llil_next = idx + 1
    # Advance to the first instruction whose source contains an ADD
    while llil_func[llil_next].src.src.left.operation != LowLevelILOperation.LLIL_ADD:
        llil_next += 1
    llil_check_ins = llil_func[llil_next]
    try:
        offset_jmp_table = llil_check_ins.src.src.right.constant  # Offset of the jump table
    except AttributeError:
        continue
    if offset_jmp_table != -addr_array_stack_addr:
        continue  # Not what we want
    if llil_ins.operation == LowLevelILOperation.LLIL_STORE:
        if llil_ins.src.operation == LowLevelILOperation.LLIL_CONST:
            next_jump_offset = llil_ins.src.constant  # Index into the jump table
            log_info(f"Next jump offset: 0x{next_jump_offset:x}")
            try:
                next_jump = addrs[next_jump_offset]
            except IndexError:
                next_jump = 0x0
                log_warn("Next jump not found")
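One caveat with indexing `addrs` directly: Python accepts negative indexes, so a bogus negative constant would silently pick an entry from the end of the list instead of raising. A small sketch of an explicit bounds check follows; `resolve_target` is a hypothetical helper, not part of the original script.

```python
def resolve_target(addrs, index):
    """Return the jump-table entry for index, or None if index is out of range."""
    if 0 <= index < len(addrs):
        return addrs[index]
    return None

# Hypothetical table contents for illustration
addrs = [0x401000, 0x401234, 0x4020F0]

hit = resolve_target(addrs, 1)
miss_high = resolve_target(addrs, 3)
miss_negative = resolve_target(addrs, -1)
```

Returning `None` rather than `0x0` makes it easy to skip the patch when no target was found, instead of assembling a jump to address zero.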
Patch the program
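Now that we have all the indexes, the last step is to patch the code.
# arch is assumed to be the view's architecture (bv.arch)
log_info(f"Start patching at {addr:#x} from function name {func.name} with 'jmp {next_jump:#x}'")
code = arch.assemble(f"jmp {next_jump:#x}", addr)  # Assemble the jump as if located at addr
bv.write(addr, code)  # Overwrite the original bytes with the jump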
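For intuition about what gets written, here is a stand-alone sketch of the 5-byte x86-64 near jump: opcode `E9` followed by a signed 32-bit displacement relative to the end of the instruction. It is only an illustration. The assembler behind `arch.assemble` may pick a different encoding, such as a short jump for nearby targets, and `encode_near_jmp` is a hypothetical helper.

```python
import struct

def encode_near_jmp(src: int, dst: int) -> bytes:
    """Encode `jmp dst` placed at address src as E9 rel32."""
    rel = dst - (src + 5)  # Displacement is relative to the next instruction
    if not -(2**31) <= rel < 2**31:
        raise ValueError("target is out of rel32 range")
    return b"\xe9" + struct.pack("<i", rel)

forward = encode_near_jmp(0x1000, 0x1005)   # Target is the very next instruction
backward = encode_near_jmp(0x1000, 0x0FF0)  # Target lies 0x10 bytes before src
```

Because the displacement depends on where the instruction sits, the patch address has to be passed to the assembler as well, which is why `addr` appears as the second argument of `arch.assemble`.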
The result
- Before
- After