Advanced Binary Emulation Framework with Qiling

Welcome back to my blog.

Today I want to cover a framework called Qiling. Before we start, I assume that you have basic knowledge of the anatomy of binary files. If you are ready, let's get started.

“It is a fairly open secret that almost all systems can be hacked, somehow. It is a less spoken of secret that such hacking has actually gone quite mainstream.” Dan Kaminsky

Content

0x0  Binary Analysis
     0x01. C compilation Process
0x1. What is an emulation system 
     0x11. CHIP-8 emulator
0x2  Qiling Framework
     0x20 Structure of Qiling
     0x21 Introduction
     0x22 Windows32 API
     0x23 Capstone
     0x24 Usage of Qiling Framework
0x3  Qiling techniques
   0x3  Mapping Technique
   0x31 Hooking Address

Binary Analysis

When people come across this term, they tend to say “it is just analyzing binary files”, but what does that actually involve? What is the purpose, and what is a binary file? Let me explain briefly. Modern computers perform their computations in the binary numeral system, which expresses everything in zeros and ones. The machine code that these systems execute is called binary code. A program's binary code and its data are combined into a single self-contained file, called a binary executable or simply a binary. Binary analysis is the practice of analyzing the properties of such files. This topic could get very long, so I am going to illustrate it in an easy, beginner-friendly way. You can also get more information via this source: Binary

C (Compilation Process)

Binaries are produced from human-readable source code, written in languages such as C and C++, through compilation. I am not going to dig deep into this topic, but let me explain and demonstrate it with a small example. The steps in the C++ compilation process are similar. For C, there are four phases: 1. the preprocessing phase, 2. the compilation phase, 3. the assembly phase, 4. the linking phase.

The compiler checks the source code for syntactical or structural errors, and if the source code is error-free, it generates the object code.

 
#include <stdio.h>

int main()
{
	printf("Welcome to my Binary Blogpost");
	return 0;
}

In order to compile this code, we need to consider these steps as follows:

Preprocess

For instance, say you have written some code in a text file with the .c extension. This source file is first passed to the preprocessor, which expands it (resolving directives such as #include and #define). The expanded code is then passed on to the compiler.

Compilation Process

The code expanded by the preprocessor is now converted into assembly code: the C compiler translates the preprocessed code into assembly.

The assembly Phase

The assembly code is converted into object code by an assembler. The object file generated by the assembler has the same base name as the source file. The extension is .obj on DOS and .o on UNIX; for instance, if your file is binary.c, the object file would be ‘binary.obj’ or ‘binary.o’.

Linking Phase

Almost all programs written in C use library functions. These library functions are pre-compiled, and the object code of these library files is stored with the ‘.lib’ (or ‘.a’) extension. The main job of the linker is to combine the object code of the library files with the object code of our program.

Source code –> preprocessor –> expanded –> compiler –> assembly –> assembler –> object –> Linker –> executable code

What is an emulation system

Emulation is the process of imitation. A program normally runs by using the API calls offered by the operating system it was written for; running that program on a different operating system, without depending on those native APIs, is called emulation. In other words, an emulator lets us run an application on a platform other than the one it was originally written for. For instance, think of a game written for a console (Xbox, PlayStation, Nintendo) running on a PC. A virtual machine is a related idea; it is also referred to as a partition, guest, or container. Suppose you want to run ELF files but do not have a Linux environment: you can download a Linux ISO and boot it in a virtual machine, which lets you run more than one operating system on the same computer. Have you heard about Dolphin? Not the animal, but the emulator. Before diving into it, I will demonstrate the idea with a simpler example: CHIP-8.

CHIP-8 Emulator

I want to cover the CHIP-8 emulator because I have always been interested in this kind of embedded technology and in the design itself. CHIP-8 was created by Joseph Weisbecker. It is important to understand that CHIP-8 is an interpreted programming language: CHIP-8 programs run on a virtual machine. It was made to allow video games to be more easily programmed for the computers of its time, and versions of classic games such as Pac-Man and Tetris exist for it. Okay, we now have at least a basic understanding of what this machine is. Let's talk about its memory, flow, and structure.

Memory

The memory should be 4 kB (4 kilobytes, i.e. 4096 bytes) large. CHIP-8’s index register and program counter can only address 12 bits (conveniently), which is 4096 addresses. All the memory is RAM and should be considered to be writable. CHIP-8 games can, and do, modify themselves. The first CHIP-8 interpreter (on the COSMAC VIP computer) was also located in RAM, from address 000 to 1FF. It would expect a CHIP-8 program to be loaded into memory after it, starting at address 200 (512 in decimal). Although modern interpreters are not in the same memory space, you should do the same to be able to run the old programs; you can just leave the initial space empty, except for the font.

As you might already have understood, it is important to have good programming knowledge: writing an emulator looks easy, but that is not the case. Many people say that knowing a programming language is not required because automated tools already exist in this area, but I would recommend learning and understanding the structure of a programming language; C or C++ would be a good choice to kick off with.

To recap, CHIP-8 has direct access to up to 4 KB of RAM. It also has a stack for 16-bit addresses, which is used to call subroutines/functions and return from them.

If you are interested in writing your own emulator, go to this page –> CHIP-8 EMULATOR

To start this machine, we first need to load instructions into its memory. Here is the code that does this. By the way, you can read the whole source code in the blog post linked above.

 

#include <fstream>

const unsigned int START_ADDRESS = 0x200;

void Chip8::LoadROM(char const* filename)
{
	// Open the file as a stream of binary and move the file pointer to the end
	std::ifstream file(filename, std::ios::binary | std::ios::ate);

	if (file.is_open())
	{
		// Get size of file and allocate a buffer to hold the contents
		std::streampos size = file.tellg();
		char* buffer = new char[size];

		// Go back to the beginning of the file and fill the buffer
		file.seekg(0, std::ios::beg);
		file.read(buffer, size);
		file.close();

		// Load the ROM contents into the Chip8's memory, starting at 0x200
		for (long i = 0; i < size; ++i)
		{
			memory[START_ADDRESS + i] = buffer[i];
		}

		// Free the buffer
		delete[] buffer;
	}
}


The code above loads the ROM directly into the memory of the emulated system. For more details, you can read the original post.

When an emulator reads the instruction $7522, it would emulate the behavior of the CHIP-8 by doing something like this:

-> registers[5] += 0x22; And that’s all.
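To make the decoding step concrete, here is a minimal Python sketch of how an emulator could handle the instruction $7522. In the CHIP-8 instruction set, the pattern 7XNN means "add NN to register VX". The function name and the plain-list register file are assumptions made for illustration only.

```python
def execute_add_immediate(registers, opcode):
    """Handle a CHIP-8 7XNN opcode: add the byte NN to register VX."""
    x = (opcode & 0x0F00) >> 8      # second nibble selects the register
    nn = opcode & 0x00FF            # low byte is the immediate value
    # CHIP-8 registers are 8 bits wide, so the result wraps around at 256.
    registers[x] = (registers[x] + nn) & 0xFF
    return registers


registers = [0] * 16                # V0..VF
execute_add_immediate(registers, 0x7522)
print(hex(registers[5]))            # 0x22
```

The masks and shifts are the whole trick: every CHIP-8 opcode is 16 bits wide, and the nibbles tell the emulator what to do and with which operands.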

-> Low level programming

$200: CALL $208
$202: JMP $20E
$204: LD V1, $1
$206: RET
$208: LD V3, $3
$20A: CALL $204
$20C: RET
$20E: LD V4, $4
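The listing above can be run by a very small interpreter loop. The following Python sketch is only an illustration: it encodes the eight instructions with the standard CHIP-8 opcodes (2NNN for CALL, 1NNN for JMP, 6XNN for LD, 00EE for RET) and supports nothing else. The function name `run` and the recorded trace are assumptions added to make the control flow visible.

```python
START = 0x200

# The program from the listing, encoded as 16-bit CHIP-8 opcodes.
PROGRAM = [0x2208, 0x120E, 0x6101, 0x00EE, 0x6303, 0x2204, 0x00EE, 0x6404]


def run(program):
    memory = bytearray(4096)
    for i, op in enumerate(program):        # opcodes are stored big-endian
        memory[START + 2 * i] = op >> 8
        memory[START + 2 * i + 1] = op & 0xFF
    end = START + 2 * len(program)

    v = [0] * 16                            # registers V0..VF
    stack = []                              # return addresses
    pc = START
    trace = []                              # addresses in execution order

    while START <= pc < end:
        opcode = (memory[pc] << 8) | memory[pc + 1]   # fetch
        trace.append(pc)
        pc += 2
        kind, nnn = opcode & 0xF000, opcode & 0x0FFF  # decode
        if opcode == 0x00EE:                # RET
            pc = stack.pop()
        elif kind == 0x1000:                # JMP nnn
            pc = nnn
        elif kind == 0x2000:                # CALL nnn
            stack.append(pc)
            pc = nnn
        elif kind == 0x6000:                # LD Vx, nn
            v[(opcode >> 8) & 0xF] = opcode & 0xFF
        else:
            raise ValueError(f"unsupported opcode {opcode:#06x}")
    return v, trace


v, trace = run(PROGRAM)
print([hex(a) for a in trace])
print(v[1], v[3], v[4])
```

Following the trace shows the nested calls: execution jumps to $208, calls $204 from there, returns twice, and only then reaches the JMP at $202.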

Dolphin

Dolphin is a free and open-source video game console emulator for GameCube and Wii that runs on Windows, Linux, macOS, Android, Xbox One, Xbox Series X and Series S. It had its inaugural release in 2003 as freeware for Windows. Dolphin was the first GameCube emulator that could successfully run commercial games. Source: Dolphin-Emulator

I am now really thrilled to illustrate Qiling Framework. Take your seat!!

Unicorn

I have already mentioned that an emulator is useful for analyzing a specific binary file. Before diving into the Qiling framework, let me talk about Unicorn, the “Next Generation CPU Emulator Framework”.

Unicorn is a lightweight, multi-platform, multi-architecture CPU emulator framework based on QEMU. Its main advantage is that it emulates a physical CPU purely in software. It focuses on CPU operations only and ignores machine devices. Given your binary code, it emulates that code without needing a real CPU of that architecture, which is handy for things like cross-architecture emulators for console games. It lets you safely analyze malware code and detect virus signatures, and it helps verify code semantics when reversing. You should still be good at reading and understanding code and binaries to make use of it.

The most important thing is to understand the internals of a CPU emulator.

It works as follows:

  1. Given the input code in binary form, it decodes the binary into separate instructions.
  2. It emulates exactly what each instruction does:
     2.1 This requires consulting the instruction-set manual for each instruction.
     2.2 It handles memory access and I/O on request.
  3. It updates the CPU context (registers/memory) after each step.

To see what emulating x86 32-bit instructions looks like at a low level, consider these two examples:

50 –> push eax . load the eax register . decrease esp by 4, and update esp . copy the eax value to the new top of the stack

01D8 –> add eax, ebx . load the eax & ebx registers . add the values of eax & ebx, then copy the result to eax . update the flags OF, SF, ZF, AF, CF, PF accordingly
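The decode, emulate, and update-context cycle can be sketched in a few lines of Python. This is a toy illustration, not how Unicorn is implemented: it understands only two instructions, keeps the CPU context in a dict, and computes only four of the flags. The sketch uses the byte sequence 01 D8, which is the standard encoding of add eax, ebx. All names here (`step`, `cpu`, `mem`) are made up for the example.

```python
import struct

MASK32 = 0xFFFFFFFF


def step(cpu, mem, code, pc):
    """Decode and emulate one instruction at code[pc]; return the next pc."""
    op = code[pc]
    if op == 0x50:                                   # push eax
        cpu["esp"] = (cpu["esp"] - 4) & MASK32       # stack grows downwards
        mem[cpu["esp"]:cpu["esp"] + 4] = struct.pack("<I", cpu["eax"])
        return pc + 1
    if op == 0x01 and code[pc + 1] == 0xD8:          # add eax, ebx
        a, b = cpu["eax"], cpu["ebx"]
        full = a + b
        res = full & MASK32
        cpu["eax"] = res
        cpu["cf"] = int(full > MASK32)               # unsigned overflow
        cpu["zf"] = int(res == 0)
        cpu["sf"] = res >> 31                        # sign bit of the result
        cpu["of"] = ((a ^ res) & (b ^ res)) >> 31    # signed overflow
        return pc + 2
    raise NotImplementedError(f"opcode {op:#04x}")


cpu = {"eax": 0x7FFFFFFF, "ebx": 1, "esp": 0x100,
       "cf": 0, "zf": 0, "sf": 0, "of": 0}
mem = bytearray(0x100)                               # tiny stack area
code = bytes([0x50, 0x01, 0xD8])

pc = 0
while pc < len(code):
    pc = step(cpu, mem, code, pc)

print(hex(cpu["eax"]), hex(cpu["esp"]), cpu["of"], cpu["sf"], cpu["cf"])
```

A real emulator has to do this for every instruction of every supported architecture, which is why frameworks such as Unicorn build on an existing engine (QEMU) instead of starting from scratch.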

For more information you can check this out: Unicorn Emulator

I would say that this emulator is a good one because:

It supports multiple architectures, such as x86, ARM, ARM64, MIPS, PowerPC, and SPARC.

On top of that, it also supports multiple platforms, such as:

*nix, Windows, Android, iOS, etc.

The framework is still being updated by its developers and keeps up with the latest CPU extensions.

Good performance?

I do believe it has, thanks to its Just-in-Time (JIT) compilation technique.

Let us get back to the Qiling framework.


Qiling Framework

We talked about emulators earlier; now let’s talk about the framework.

Qiling is an advanced binary emulation framework. It uses the well-known Unicorn Engine and understands operating systems. It knows how to load libraries and executables and how to relocate shared libraries, and it handles syscalls and I/O handlers. Qiling can execute binaries without the binary's native operating system. You probably won’t use Qiling to emulate complete applications, but emulating (large) functions and pieces of code works well. Out of the box, the Qiling framework supports about 40% of the Windows API as well as Linux syscalls, and it also has some UEFI coverage. Qiling is capable of creating snapshots, hooking into syscalls and API calls, hot patching, remote debugging, and hijacking stdin and stdout. The key difference between the two is that Qiling is an OS emulator, whereas Unicorn only emulates the CPU. Qiling is built from three layers, in this order: OS -> Loader -> Arch.

To better understand Qiling, let me first explain the Windows API.

## Windows32 API

The Win32 API (Application Programming Interface) is the common programming interface for the Microsoft Windows 95 and Windows NT operating systems. The use of a common API allows applications to be developed once and deployed on multiple operating systems. Because Microsoft Windows NT can also be hosted on multiple hardware platforms, Microsoft has released optimizing compilers for each of the hardware platforms that will optimize a Win32 application for the specific target hardware. Using the Windows API, you can develop applications that run successfully on all versions of Windows while taking advantage of the features and capabilities unique to each version. (Note that this was formerly called the Win32 API. The name Windows API more accurately reflects its roots in 16-bit Windows and its support on 64-bit Windows.) –> Windows32-API

As you can see, it is important to understand and analyze Windows internals in order to work with the Qiling Framework.

Capstone

Capstone is a disassembly framework designed to provide a simple, lightweight API that transparently handles most popular instruction architectures, including x86/x86-64, ARM, and MIPS, among others. It has bindings for C/C++ and Python (plus other languages, but we’ll use C/C++ as usual) and runs on all popular platforms, including Windows, Linux, and macOS. It’s also completely free and open source. Building disassembly tools with Capstone is a straightforward process, with extremely versatile possibilities

You can easily install the Capstone bindings with pip, then import the module in Python and inspect it:

-> pip install capstone
-> import capstone
-> help(capstone)

—————- Output———————-

 NAME
    capstone - # Capstone Python bindings, by Nguyen Anh Quynnh <aquynh@gmail.com>

PACKAGE CONTENTS
    arm
    arm64
    arm64_const
    arm_const
    evm
    evm_const
    m680x
    m680x_const
    m68k
    m68k_const
    mips
    mips_const
    ppc
    ppc_const
    sparc
    sparc_const
    systemz
    sysz_const
    tms320c64x
    tms320c64x_const
    x86
    x86_const
    xcore
    xcore_const

CLASSES
    _ctypes.CFuncPtr(_ctypes._CData)
        ctypes.CFunctionType
    builtins.Exception(builtins.BaseException)
        CsError
    builtins.object
        Cs
        CsInsn

    CS_SKIPDATA_CALLBACK = class CFunctionType(_ctypes.CFuncPtr)
     |  Method resolution order:
     |      CFunctionType
     |      _ctypes.CFuncPtr
     |      _ctypes._CData
     |      builtins.object
     |
     |  Data descriptors defined here:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)
     |
     |  ----------------------------------------------------------------------
     |  Methods inherited from _ctypes.CFuncPtr:
     |
     |  __bool__(self, /)
     |      True if self else False
     |
     |  __call__(self, /, *args, **kwargs)
     |      Call self as a function.
     |
     |  __repr__(self, /)
     |      Return repr(self).
     |
     |  ----------------------------------------------------------------------
     |  Static methods inherited from _ctypes.CFuncPtr:
     |
     |  __new__(*args, **kwargs) from _ctypes.PyCFuncPtrType
     |      Create and return a new object.  See help(type) for accurate signature.
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors inherited from _ctypes.CFuncPtr:
     |
     |  argtypes
     |      specify the argument types
     |
     |  errcheck
     |      a function to check for errors
     |
     |  restype
     |      specify the result type
     |
     |  ----------------------------------------------------------------------
     |  Methods inherited from _ctypes._CData:
     |
     |  __ctypes_from_outparam__(...)
     |
     |  __hash__(self, /)
     |      Return hash(self).
     |
     |  __reduce__(...)
     |      Helper for pickle.
     |
     |  __setstate__(...)

    class Cs(builtins.object)
     |  Cs(arch, mode)
     |
     |  Methods defined here:
     |
     |  __del__(self)
     |      # destructor to be called automatically when object is destroyed.
     |
     |  __init__(self, arch, mode)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |
     |  disasm(self, code, offset, count=0)
     |      # Disassemble binary & return disassembled instructions in CsInsn objects
     |
     |  disasm_lite(self, code, offset, count=0)
     |      # Light function to disassemble binary. This is about 20% faster than disasm() because
     |      # unlike disasm(), disasm_lite() only return tuples of (address, size, mnemonic, op_str),
     |      # rather than CsInsn objects.
     |
     |  errno(self)
     |      # get the last error code
     |
     |  group_name(self, group_id, default=None)
     |      # get the group name
  

## Exploring the Capstone

You can get more information about Capstone from its written documentation. After reading it, come back and explore the Capstone API. I looked through the API docs during my research but could not find much. Fortunately, the Capstone header files are well commented and the code is not very complex. As I said, it really pays to learn a programming language that you can use here.

-> ls /usr/include/capstone/

// output

arm.h arm64.h capstone.h mips.h platform.h ppc.h sparc.h systemz.h x86.h xcore.h

As you’ve seen, capstone.h is the main Capstone header file. It contains commented definitions of all the Capstone API functions as well as the architecture-independent data structures, such as cs_insn and cs_err. This is also where all the possible values for enum types like cs_arch, cs_mode, and cs_err are defined. For instance, if you wanted to modify the linear disassembler so it supports ARM code, you would reference capstone.h to find the proper architecture (CS_ARCH_ARM) and mode (CS_MODE_ARM) parameters to pass to the cs_open function.

## Recursive Disassembly with Capstone

By default, Capstone allows you to inspect only basic information about instructions, such as the address, raw bytes, or mnemonic representation. This is fine for a linear disassembler, as you saw in the previous example. However, more advanced binary analysis tools often need to make decisions based on instruction properties, such as the registers the instruction accesses, the type and value of its operands, the type of instruction (arithmetic, control flow, and so on), or the locations targeted by control flow instructions.

I found a piece of code that will help us understand. It shows the beginning of a basic implementation of recursive disassembly.

 
 
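#include <stdio.h>
#include <queue>
#include <map>
#include <string>
#include <capstone/capstone.h>
// Binary, Section, load_binary() and unload_binary() are assumed to come
// from a separate binary-loader header, which is not shown here.

int disasm(Binary *bin);
void print_ins(cs_insn *ins);

int main(int argc, char *argv[])
{
  Binary bin;
  std::string fname;

  if(argc < 2) {
    printf("Usage: %s <binary>\n", argv[0]);
    return 1;
  }

  fname.assign(argv[1]);
  if(load_binary(fname, &bin, Binary::BIN_TYPE_AUTO) < 0) {
    return 1;
  }

  if(disasm(&bin) < 0) {
    return 1;
  }

  unload_binary(&bin);
  return 0;
}

int disasm(Binary *bin)
{
  csh dis;
  cs_insn *cs_ins;
  Section *text;
  size_t n;
  const uint8_t *pc;
  uint64_t addr, offset, target;
  std::queue<uint64_t> Q;          // addresses that still need to be disassembled
  std::map<uint64_t, bool> seen;   // addresses that were already disassembled

  text = bin->get_text_section();
  if(!text) {
    fprintf(stderr, "Nothing to disassemble\n");
    return 0;
  }

  if(cs_open(CS_ARCH_X86, CS_MODE_64, &dis) != CS_ERR_OK) {
    fprintf(stderr, "Failed to open Capstone\n");
    return -1;
  }
  // The excerpt ends here; the recursive traversal loop is not shown.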
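Since the excerpt stops before the actual traversal, here is a small Python sketch of the idea behind recursive disassembly: keep a queue of addresses to visit and a set of addresses already seen, and follow control flow instead of decoding bytes linearly. To stay self-contained it does not call Capstone; the "binary" is a hypothetical table of already-decoded toy instructions, and every name in it is an assumption for illustration.

```python
from collections import deque

# Hypothetical toy program: address -> (mnemonic, size, branch target or None)
TOY_CODE = {
    0x00: ("mov", 2, None),
    0x02: ("jz", 2, 0x08),    # conditional: may fall through or jump
    0x04: ("add", 2, None),
    0x06: ("ret", 1, None),   # ends this path
    0x07: ("db", 1, None),    # data byte, never reached by control flow
    0x08: ("sub", 2, None),
    0x0A: ("jmp", 2, 0x04),   # unconditional: no fall-through
}


def recursive_disasm(code, entry):
    queue = deque([entry])    # addresses where disassembly should (re)start
    seen = set()              # addresses that were already disassembled
    order = []
    while queue:
        addr = queue.popleft()
        while addr in code and addr not in seen:
            seen.add(addr)
            mnemonic, size, target = code[addr]
            order.append((addr, mnemonic))
            if target is not None:
                queue.append(target)       # visit the branch target later
            if mnemonic in ("ret", "jmp"):
                break                      # control flow does not continue here
            addr += size
    return order


for addr, mnemonic in recursive_disasm(TOY_CODE, 0x00):
    print(f"{addr:#04x}: {mnemonic}")
```

Note that the data byte at 0x07 is never printed. A linear disassembler would have tried to decode it as an instruction, which is exactly the kind of mistake the recursive approach avoids.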
 

The structure of Qiling Framework

Using this framework is not as hard as you might expect, and it has a lot of advantages. The documentation of Qiling is a bit outdated, but it works fine. Because it is written in Python, it is easy for anyone who wants to become familiar with it.

To get more familiar with this framework, I am going to go through these components:

  1. CPU architecture
  2. Loader
  3. OS
  4. Debugger

And of course, instrumentation (Qiling’s framework).

Let's kick off with the question: what is instrumentation?

What is Instrumentation

An instrument is a device that measures or manipulates process physical variables such as flow, temperature, level, or pressure etc. Instruments include many varied contrivances which can be as simple as valves and transmitters, and as complex as analyzers. Instruments often comprise control systems of varied processes. The control of processes is one of the main branches of applied instrumentation.

source: Instrumentation

In software the same idea applies: instrumentation means inserting points into a running program where its behavior can be measured or manipulated.

A –> B –> C –> D –> E –> F

The chain above represents one function calling the next. We are going to work with APIs, and you will better understand how exactly this framework works.

Qiling’s Instrumentation

When we talk about Qiling’s instrumentation, we should consider the following:

B –> D –> E

  1. B –> alter syscall (start)
  2. D –> alter function()
  3. E –> alter CPU register (end)

I will show you what this means…

Qiling Framework and its Interface

We should know that a loader must be set up for the executable format, such as:

ELF, PE, COM, MACHO, UEFI, etc.

What is Loader

A loader locates programs in memory and prepares them for execution. It is an important component when starting a program. It includes tasks such as reading the content of the executable file and placing the file in memory.
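The first thing any loader has to do is figure out what kind of file it has been given. As a rough illustration, the Python sketch below reads the start of an ELF header and reports the word size, endianness, and target machine. The field offsets and machine constants come from the ELF format itself; the function name `describe_elf` and the hand-built header bytes are assumptions for the example, and a real loader does far more (mapping segments, applying relocations, loading shared libraries).

```python
import struct

# A few e_machine values from the ELF specification.
EM_NAMES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes):
    """Read the identification fields at the start of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[header[4]]            # EI_CLASS
    endian = {1: "<", 2: ">"}[header[5]]        # EI_DATA
    # e_type and e_machine follow the 16-byte e_ident array.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": bits,
        "little_endian": endian == "<",
        "machine": EM_NAMES.get(e_machine, hex(e_machine)),
    }


# Hand-built first 20 bytes of a 64-bit little-endian AArch64 ELF header.
fake_header = (b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
               + struct.pack("<HH", 3, 183))
print(describe_elf(fake_header))
```

To try it on a real file, pass the first 20 bytes of the file instead of the hand-built header, for example `describe_elf(open(some_path, "rb").read(20))`.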

CPU instrumentation

Okay, let's say the loader has set up our binary. At the CPU level, Qiling has a lot of capabilities, such as:

 0x1.  Access to registers
 0x2.  Reading a register
       0x21. ql.reg.eax
 0x3.  Writing to a register
       0x31. ql.reg.eax = 0x44
 0x4.  Different hooks

#### Example

 q1 = Qiling(path, rootfs, verbose=QL_VERBOSE.OFF)
 
 def cpu(q1):
     addr = q1.os.function_arg[0]
     reg = q1.reg.read("rax")
     q1.reg.eax = reg
 
 

Operating System instrumentation

Okay, let's say the loader has set up our binary. At the operating system level, Qiling has a lot of capabilities, such as:

  0x1. Access to memory
       0x11. q1.mem.read()
       0x12. q1.mem.write()
  0x2. Search pattern from memory
       0x21. q1.mem.search()
  0x3. Stack related operation
       0x31. q1.stack_pop
       0x32. q1.stack_push
  0x4. Syscall replacement
       0x41. q1.set_syscall()
       0x42. q1.set_api()
       
  0x5. Replace library call with
       0x51. q1.set_api()

### Example

 

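from qiling import Qiling
from qiling.const import QL_VERBOSE

rootfs = "my_rootfs/arm64_linux"
binary = "/home/darkghost/qiling-challenges/qilinglab-aarch64"

def my_syscall_function(q1, *args, **kw):
    return 0  # placeholder: put the replacement behavior here

q1 = Qiling([binary], rootfs, verbose=QL_VERBOSE.OFF)

q1.set_syscall(0x05, my_syscall_function)  # replace syscall number 0x05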
 
 

## File System instrumentation

Okay, let's say the loader has set up our binary. At the file system level, Qiling has a lot of capabilities, such as:

0x1. Map a host file
       0x11. q1.add_fs_mapper()
0x2. Hijack an accessed file
       0x21. q1.add_fs_mapper(hijack_func)
0x3. Stdio replacement
       0x31. stdin
       0x32. stdout
       0x33. stderr
0x4. Patch the file's memory before execution
       0x41. q1.patch
       
       

Example

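from qiling import Qiling
from qiling.const import QL_VERBOSE
from qiling.os.mapper import QlFsMappedObject

class FakeRnd(QlFsMappedObject):

    def read(self, size):
        return b"\x01"  # every read from /dev/random returns this fixed byte


rootfs = "my_rootfs/arm64_linux"
binary = "/home/darkghost/qiling-challenges/qilinglab-aarch64"

if __name__ == "__main__":
    q1 = Qiling([binary], rootfs, verbose=QL_VERBOSE.OFF)
    # Redirect accesses to /dev/random inside the emulated program to FakeRnd.
    q1.add_fs_mapper("/dev/random", FakeRnd())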

Virtual Machine instrumentation

Okay, let's say the loader has set up our binary. At the virtual machine level, Qiling has a lot of capabilities, such as:

0x1. Save the current state
       0x11. q1.save()
0x2. Restore a saved state
       0x21. q1.restore()
0x3. Save/restore memory only
       0x31. q1.mem.save()
0x4. Save/restore registers only
       0x41. q1.reg.save()
       

Example


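import threading

from qiling import Qiling
from qiling.const import QL_VERBOSE

rootfs = "my_rootfs/arm64_linux"
# path (a list holding the binary to emulate) is not defined in this excerpt.
q1 = Qiling(path, rootfs, verbose=QL_VERBOSE.OFF)

def save(q1, *args, **kw):
    # Write the current emulator state to a snapshot file.
    q1.save(cpu=False, snapshot="Darkshot.bin")


if __name__ == "__main__":
    # listener_thread() and solution() are not shown in this excerpt; they are
    # assumed to be defined elsewhere in the same script.
    thread = threading.Thread(target=listener_thread, daemon=True)
    thread.start()
    solution("rootfs/bin/httpd", rootfs)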
  

Debugger instrumentation

Okay, let's say the loader has set up our binary. At the debugger level, Qiling has a lot of capabilities, such as:

0x1. Open API for RSP-compatible debuggers
0x2. Built-in debugger (Qdb)
       0x21. Reverse debugging
       

Example

from qiling import *
from qiling.os.mapper import QlFsMappedObject
from qiling.const import QL_VERBOSE



def debug(path,rootfs,debug):
      q1 = Qiling(path, rootfs, verbose=QL_VERBOSE.OFF)
      q1.multithread = False
      q1.debugger = "gdb"
      q1.run()







if __name__ == "__main__":
    debug(["/home/darkghost/qiling-framework/qiling/examples/rootfs/x8664_linux/bin"],"rootfs/arm_linux","debug")




Qiling technique

As the last step, I am going to use the Qiling framework to show you how it works in practice. The framework understands OS concepts (executable formats such as ELF, dynamic linkers, syscalls, I/O handlers, and so on), which makes it very convenient for quickly emulating an executable binary without emulating its entire OS. I was really interested in getting hands-on with it. There are 11 challenges to be completed, but I will solve 3 of them. If you are interested and want to learn more, you can check this –> Shielder. If you are ready, let me get started with challenge 1.

Setup

To get started with this framework, you need to have the “qiling” library installed. I am going to demonstrate step by step how it works. We will use the “qilinglab-aarch64” binary, which we need to reference in our code.

Installation using pip (stable version):

-> pip3 install qiling

To install the latest dev version using pip:

-> pip3 install --user https://github.com/qilingframework/qiling/archive/dev.zip


For this installation guide, Ubuntu desktop 18.04.3 LTS 64bit is the base example (Qiling Framework works in other Linux distributions that run Python 3.5 and above). Grab a copy of the official Ubuntu ISO images from the Ubuntu CD mirrors. Update the system and also install pip3, git and cmake:

-> sudo apt-get update
-> sudo apt-get upgrade
-> sudo apt install python3-pip git cmake

Once completed, clone a copy of the Qiling Framework source from GitHub and run setup to install it.

-> git clone https://github.com/qilingframework/qiling
-> cd qiling
-> sudo pip3 install .

Also, don’t forget to initialize the rootfs.

-> git submodule update --init --recursive (this is the most important step, because the examples will not work properly without it)


Important note on Windows DLLs and registry (from site)

Due to distribution restriction, Qiling Framework will not bundle Microsoft Windows DLL files and registry. Please copy respective DLLs and registry from Microsoft Windows System. For Windows 10 usually found in C:\Windows\system32 (64bit dll) and C:\Windows\SysWOW64 (32bits dll) and place them in $rootfs/dlls

We also included a script named dllscollector.bat. Run this on Windows, under Administrator privilege, to collect all the necessary dlls and registries. examples/scripts/dllscollector.bat

Any other dlls and registry references, as below: For 32bit Windows dlls, please refer to DLLX86.txt for Windows 32bit DLLs hashes and file version

For 64bit Windows dlls, please refer to DLLX8664.txt for Windows 64bit DLLs hashes and file version. Additional Notes: .travis.yml will be able to clearly list out dlls needed

Challenge 1 (memory mapping)

First of all, we need to provide the path of the binary and a rootfs (the root of the filesystem from the point of view of the emulated binary). If we do not provide the path of the binary, Qiling has nothing to work with.

 
 
from qiling import *

if __name__ == '__main__':

    path = ["/home/darkghost/qiling-challenges/qilinglab-aarch64"] # this path should be changed
    rootfs = "my_rootfs/arm64_linux"

    ql = Qiling(path, rootfs)
    ql.run()


As you can see above, we just imported “qiling” and set the path of the binary that will be analyzed. We are running on 64-bit Linux. This step is very important, because without it the Qiling framework will not be able to find your file.

Why is the rootfs (image) important?

A rootfs image is just a file system image, that hosts at least an init system. For instance, our getting started guide uses an EXT4 FS image with OpenRC as an init system. Note that, whichever file system you choose to use, support for it will have to be compiled into the kernel, so it can be mounted at boot time.

You can get more information from: rootfs

We need to give Qiling a rootfs which contains the right libraries for loading the ELF. If you get stuck you can install them manually, but please follow the steps below.

Qiling already provides a minimalist aarch64 Linux rootfs that we can download and use: arm64

For more information, read the docs; here is the link —> qiling-guide

Let me run this python file and see what happens.

output-Qiling

As you can see, the file ran properly with no issues. I am also going to use Ghidra to check.

Ghidra also works. Now let’s get started with Challenge 1.

Note that we are not solving a classic reverse-engineering puzzle here, because the binary is neither encrypted nor obfuscated. We should focus on the binary and analyze it.

-> Challenge 1: Store 1337 at pointer 0x1337.

We need to store 1337 at pointer 0x1337. If you review the file in Ghidra, you will see that this address is not mapped. Let’s try it out and sharpen our Qiling skills.


void challenge1(char *check) {
  if (_DAT_00001337 == 1337) {
    *check = 1;
  }
}

Here we just renamed “param” to “check” to better understand what is going on. When we run this file, the program tries to read from 0x1337, which is not mapped; hence the UC_ERR_READ_UNMAPPED we get when we run the binary. In this case, Qiling helps us map that memory and store 1337 at 0x1337. I am going to write a script that uses this Qiling technique.


from qiling import *
from qiling.os.mapper import QlFsMappedObject
from qiling.const import QL_VERBOSE

rootfs = "my_rootfs/arm64_linux"
binary = "/home/darkghost/qiling-challenges/qilinglab-aarch64"


def solution(path, rootfs):
    q1 = Qiling(path, rootfs, verbose=QL_VERBOSE.OFF)  # path is a list holding the binary, rootfs is the emulated filesystem root
    # challenge 1
    q1.mem.map(0x1337 // 4096 * 4096, 4096, info="[challenge1]")  # map the 4 KB page that contains 0x1337
    q1.mem.write(0x1337, q1.pack16(1337))  # after mapping, write 1337 as a 16-bit value
    q1.run()  # run it
    
 
if __name__ == "__main__":
    solution([binary], rootfs)
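The two arithmetic details in the script are worth a closer look: why the mapping starts at `0x1337 // 4096 * 4096`, and what bytes actually end up at 0x1337. The plain-Python sketch below reproduces both without Qiling. It assumes that the 16-bit value is packed in little-endian order, which is the byte order of this aarch64 Linux target; the helper names are made up for the example.

```python
import struct

PAGE_SIZE = 4096


def page_base(addr, page_size=PAGE_SIZE):
    """Round an address down to the start of the page that contains it."""
    return addr // page_size * page_size


def pack16_le(value):
    """Pack an integer as a 16-bit little-endian value."""
    return struct.pack("<H", value)


print(hex(page_base(0x1337)))            # 0x1000
print(hex(0x1337 - page_base(0x1337)))   # 0x337, offset inside the page
print(pack16_le(1337).hex())             # 3905
```

Memory can only be mapped in whole pages, so the script maps the page starting at 0x1000 and then writes two bytes at offset 0x337 inside it.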

Let's check that it works:

solved-Challenge1

To double-check, we can use ql.mem.show_mapinfo() to see all the areas we just mapped; for this, the verbose option needs to be set to 3 or 4.

Challenge 2 (hooking address)

The next challenge is about hooking an address. The function contains a loop whose entry condition can never be true. To pass this check, we can use the hook_address function to force our way into the loop. You can read more about this technique in Qiling's well-documented pages. Now let’s get started with our walkthrough. First of all, we will use Ghidra to analyze the binary. I suggest you take a look yourself and create your own challenges to become familiar with this tool.

void challenge4(char *check) {
    int i;
    i = 0;
    while (i < 0) {
        *check = 1;
        i = i + 1;
    }
}




As you can see, the decompiled challenge4 function has an entry condition that is impossible to satisfy. To find out where we can intervene, we need to dig deeper into the disassembly:



                                                                                                                                        
**************************************************************
  *                          FUNCTION                          *
**************************************************************
         undefined __cdecl challenge4(undefined8 check)

             
        00100fac ff 83 00 d1     sub        sp,sp,#0x20
        00100fb0 e0 07 00 f9     str        check,[sp, #local_18]
        00100fb4 ff 1b 00 b9     str        wzr,[sp, #local_8]
        00100fb8 ff 1f 00 b9     str        wzr,[sp, #local_4]
        00100fbc 07 00 00 14     b          LAB_00100fd8
                             LAB_00100fc0                                    XREF[1]:     00100fe4(j)
        00100fc0 e0 07 40 f9     ldr        check,[sp, #local_18]
        00100fc4 21 00 80 52     mov        w1,#0x1
        00100fc8 01 00 00 39     strb       w1,[check]
        00100fcc e0 1f 40 b9     ldr        check,[sp, #local_4]
        00100fd0 00 04 00 11     add        check,check,#0x1
        00100fd4 e0 1f 00 b9     str        check,[sp, #local_4]
                             LAB_00100fd8                                    XREF[1]:     00100fbc(j)
        00100fd8 e0 1b 40 b9     ldr        check,[sp, #local_8]
        00100fdc e1 1f 40 b9     ldr        w1,[sp, #local_4]
        00100fe0 3f 00 00 6b     cmp        w1,check            
        00100fe4 eb fe ff 54     b.lt       LAB_00100fc0
        00100fe8 1f 20 03 d5     nop
        00100fec ff 83 00 91     add        sp,sp,#0x20
        00100ff0 c0 03 5f d6     ret
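
As a sanity check on the listing, the two branch targets can be recomputed from the raw bytes. The sketch below is a hand-rolled decoder that only understands the two A64 branch forms seen above (B and B.cond). The helper names are made up for illustration, and in practice a real disassembler such as Capstone would do this for you.

```python
import struct


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(address, raw):
    """Return the target of an A64 B or B.cond instruction (hypothetical helper)."""
    (word,) = struct.unpack("<I", raw)      # A64 instructions are little-endian 32-bit words
    if word >> 26 == 0b000101:              # B: imm26 lives in bits [25:0]
        offset = sign_extend(word & 0x3FFFFFF, 26) * 4
    elif word >> 24 == 0b01010100:          # B.cond: imm19 lives in bits [23:5]
        offset = sign_extend((word >> 5) & 0x7FFFF, 19) * 4
    else:
        raise ValueError("not a B or B.cond instruction")
    return address + offset


# Addresses and bytes copied from the Ghidra listing above.
jump_to_check = branch_target(0x00100FBC, bytes.fromhex("07000014"))
loop_back = branch_target(0x00100FE4, bytes.fromhex("ebfeff54"))
print(hex(jump_to_check), hex(loop_back))
```

The first branch jumps straight to the comparison at LAB_00100fd8, and the conditional branch goes back to the loop body at LAB_00100fc0 only if the comparison succeeds.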




Before showing the solution to this challenge, I want to briefly cover IAT hooking.

IAT HOOKING

Overview

A Windows portable executable (PE) contains a structure called the Import Address Table (IAT). The IAT is a lookup table of function pointers for functions imported from other modules (executables or DLLs). At compile time the addresses of these functions are unknown, so the dynamic linker/loader has to fill the IAT with the real function addresses at runtime. IAT hooking relies on replacing a real function address in the IAT with an address we control. IAT hooking does not work for functions obtained from DLLs directly via LoadLibrary/GetProcAddress (but we can hook GetProcAddress itself so that it returns a different result).
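
To make the idea concrete, here is a toy model in plain Python. It is not real Windows code: the "IAT" is just a dict, and the function names (real_message_box, hooked_message_box) are invented for the example. The only point is that the program always calls through the table, so overwriting one entry redirects the call.

```python
# Toy model: the "IAT" is a dict of name -> callable, standing in for the
# table of function pointers that the loader fills in at load time.
def real_message_box(text):
    return f"MessageBox: {text}"


def hooked_message_box(text):
    # Our replacement runs first, then forwards to the original function.
    return "[hooked] " + real_message_box(text)


iat = {"MessageBoxA": real_message_box}   # filled in by the "loader"


def program():
    # The program always calls through the table, never directly.
    return iat["MessageBoxA"]("hello")


before = program()
iat["MessageBoxA"] = hooked_message_box   # the hook: overwrite one table entry
after = program()
print(before)
print(after)
```

Notice that program() itself was never modified. That is what makes this kind of hook attractive: only the table entry changes.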

PE-STRUCTURE

DataDirectory is an array of IMAGE_DATA_DIRECTORY structures:



typedef struct _IMAGE_DATA_DIRECTORY {
   DWORD VirtualAddress;     // RVA of data
   DWORD Size;               // Size of the data in bytes
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
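
Since every entry is just two DWORDs, the whole DataDirectory can be read with Python’s struct module. The following is a minimal sketch on synthetic bytes, not on a real PE file: the RVAs and sizes are made up, and parse_data_directory is a hypothetical helper. The index constants (1 for the import directory, 12 for the IAT, 16 entries in total) are the ones defined in winnt.h.

```python
import struct

IMAGE_NUMBEROF_DIRECTORY_ENTRIES = 16
IMAGE_DIRECTORY_ENTRY_IMPORT = 1   # import directory
IMAGE_DIRECTORY_ENTRY_IAT = 12     # import address table


def parse_data_directory(blob):
    """Split 16 consecutive IMAGE_DATA_DIRECTORY entries into (rva, size) pairs."""
    table = blob[:IMAGE_NUMBEROF_DIRECTORY_ENTRIES * 8]   # 8 bytes per entry
    return list(struct.iter_unpack("<II", table))


# Synthetic table: every entry is empty except the two we care about.
entries = [(0, 0)] * IMAGE_NUMBEROF_DIRECTORY_ENTRIES
entries[IMAGE_DIRECTORY_ENTRY_IMPORT] = (0x2000, 0x28)   # made-up values
entries[IMAGE_DIRECTORY_ENTRY_IAT] = (0x3000, 0x10)      # made-up values
blob = b"".join(struct.pack("<II", rva, size) for rva, size in entries)

directory = parse_data_directory(blob)
import_rva, import_size = directory[IMAGE_DIRECTORY_ENTRY_IMPORT]
iat_rva, iat_size = directory[IMAGE_DIRECTORY_ENTRY_IAT]
print(hex(import_rva), hex(iat_rva))
```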



But finding the IAT is not enough to hook an API function. The IAT contains only API addresses, and in order to replace an API function’s address we need to know which entry belongs to the function we want to hook. For this, we have to look at the IDT (Import Directory Table); the pointer to the IDT is in DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].



typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD   Characteristics;                        // 0 for terminating null import descriptor
        PIMAGE_THUNK_DATA   OriginalFirstThunk;         // The RVA of the import lookup table
    };
    DWORD   TimeDateStamp;                  // 0 if not bound,
                                            // -1 if bound, and real date\time stamp
                                            //     in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                            // O.W. date/time stamp of DLL bound to (Old BIND)

    DWORD   ForwarderChain;                 // -1 if no forwarders
    DWORD   Name;                           // address of dll name string
    PIMAGE_THUNK_DATA   FirstThunk;         // same as OriginalFirstThunk or, if bound, the RVA of the IAT. 
} IMAGE_IMPORT_DESCRIPTOR;
typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;


The IDT contains one entry for each DLL imported by the executable and is used by the loader to fill the IAT entries with real function addresses. The ILT (Import Lookup Table) lists the functions imported from a given DLL, by name or by ordinal. Entries of the ILT are IMAGE_THUNK_DATA32 structs.

resource: IAT
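
To show how the IDT is walked, here is a small sketch that reads IMAGE_IMPORT_DESCRIPTOR records until it reaches the all-zero terminator. I am assuming the 32-bit layout, where the descriptor is five DWORDs (20 bytes). The bytes are synthetic, the RVAs are made up, and walk_import_descriptors is a hypothetical helper, not part of any library.

```python
import struct

# OriginalFirstThunk, TimeDateStamp, ForwarderChain, Name, FirstThunk
DESCRIPTOR = struct.Struct("<5I")


def walk_import_descriptors(blob):
    """Yield (ilt_rva, name_rva, iat_rva) for each DLL until the null descriptor."""
    last_start = len(blob) - DESCRIPTOR.size
    for offset in range(0, last_start + 1, DESCRIPTOR.size):
        ilt_rva, _stamp, _forwarder, name_rva, iat_rva = DESCRIPTOR.unpack_from(blob, offset)
        if ilt_rva == 0 and name_rva == 0 and iat_rva == 0:
            break                       # terminating null import descriptor
        yield ilt_rva, name_rva, iat_rva


# Synthetic IDT: two DLL entries followed by the null descriptor.
blob = (
    DESCRIPTOR.pack(0x2100, 0, 0, 0x2200, 0x3000)
    + DESCRIPTOR.pack(0x2140, 0, 0, 0x2210, 0x3020)
    + DESCRIPTOR.pack(0, 0, 0, 0, 0)
)
imports = list(walk_import_descriptors(blob))
for ilt_rva, name_rva, iat_rva in imports:
    print(hex(ilt_rva), hex(name_rva), hex(iat_rva))
```

For each DLL, OriginalFirstThunk tells us where the names are (the ILT) and FirstThunk tells us where the matching addresses are (the IAT), so the index of a name in the ILT is the index of the entry to overwrite in the IAT.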

Let’s continue with our challenge. We know that we have to hook the right memory address.




import os

from qiling import Qiling
from qiling.const import QL_VERBOSE

rootfs = "my_rootfs/arm64_linux"
binary = "/home/darkghost/qiling-challenges/qilinglab-aarch64"


def hook(q1):
    # challenge 4: runs right before `cmp w1, w0`; set the loop bound to 1 so that i < 1 holds
    q1.arch.regs.w0 = 1


def solution(path, rootfs):
    q1 = Qiling(path, rootfs, verbose=QL_VERBOSE.OFF)

    # challenge 1: map the page that contains 0x1337 and store 1337 there
    q1.mem.map(0x1337 // 4096 * 4096, 4096, info="[challenge1]")
    q1.mem.write(0x1337, q1.pack16(1337))

    # challenge 4: hook the compare at offset 0xfe0 from the load base of the binary
    baseAddr = q1.mem.get_lib_base(os.path.basename(q1.path))
    q1.hook_address(hook, baseAddr + 0xfe0)

    q1.run()


if __name__ == "__main__":
    solution([binary], rootfs)
    

    




    baseAddr = q1.mem.get_lib_base(os.path.basename(q1.path))
    q1.hook_address(hook, baseAddr + 0xfe0)
    q1.run()


def hook(q1):
    q1.arch.regs.w0 = 1



Let’s focus on this piece of code. To get the base address of this ELF file we can use “q1.mem.get_lib_base(os.path.basename(q1.path))”. I also spoke with tech support and they suggested “ql.loader.images[0].base”; which one you use is totally up to you. Once we have the correct base address, we add the offset of the instruction we want to hook:

    00100fe0 3f 00 00 6b     cmp        w1,w0           <--- We hook here to get past the check.
    00100fe4 eb fe ff 54     b.lt       LAB_00100fc0
    

My base address was: 93824992231424 (0x555555554000 in hex)
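
The address arithmetic can be checked with a few lines of Python. One assumption on my side: the Ghidra listing uses an image base of 0x00100000, which is what the addresses in the listing suggest, so subtracting it gives the offset inside the file. The constant names are only for this example.

```python
GHIDRA_IMAGE_BASE = 0x00100000       # assumption: image base used in the Ghidra listing
CMP_ADDRESS_IN_GHIDRA = 0x00100FE0   # address of `cmp w1, w0` in the listing
RUNTIME_BASE = 93824992231424        # base address reported by Qiling in my run

offset = CMP_ADDRESS_IN_GHIDRA - GHIDRA_IMAGE_BASE
hook_address = RUNTIME_BASE + offset
print(hex(offset), hex(RUNTIME_BASE), hex(hook_address))
```

So the value passed to hook_address is the runtime base plus 0xfe0. Your base address may differ, which is exactly why the script asks Qiling for it instead of hardcoding it.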

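OK, as you can see, I defined hook(), which sets q1.arch.regs.w0 to 1. At the compare, w1 holds i (0) and w0 holds the loop bound (0), so overwriting w0 with 1 makes i < 1 true and the loop body runs.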
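
If you want to convince yourself without running the emulator, the loop can be modelled in plain Python. This is only a simulation of the control flow in the listing, not Qiling code: the regs dict stands in for the CPU registers, and run_challenge4 and force_w0 are names I made up for the sketch.

```python
def run_challenge4(hook=None):
    """Model the loop from the listing; `hook` may rewrite w0 before the compare."""
    check = 0
    bound = 0   # local_8, never changed by the function
    i = 0       # local_4
    while True:
        regs = {"w0": bound, "w1": i}     # the two loads at 0xfd8 and 0xfdc
        if hook is not None:
            hook(regs)                    # fires at 0xfe0, right before cmp
        if not regs["w1"] < regs["w0"]:   # cmp w1, w0 followed by b.lt
            break
        check = 1                         # strb w1, [check]
        i += 1
    return check, i


def force_w0(regs):
    regs["w0"] = 1


unhooked = run_challenge4()
hooked = run_challenge4(force_w0)
print(unhooked)   # the loop body never runs
print(hooked)     # the loop body runs exactly once
```

With the hook in place the body runs once, sets the check byte, and then i equals 1, so the next comparison fails and the function returns normally.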

With the hook function defined, hook_address attaches it at offset “0xfe0” (4064 in decimal) from the base address.

Now run the script.


Welcome to QilingLab.
Here is the list of challenges:
Challenge 1: Store 1337 at pointer 0x1337.
Challenge 2: Make the 'uname' syscall return the correct values.
Challenge 3: Make '/dev/urandom' and 'getrandom' "collide".
Challenge 4: Enter inside the "forbidden" loop.
Challenge 5: Guess every call to rand().
Challenge 6: Avoid the infinite loop.
Challenge 7: Don't waste time waiting for 'sleep'.
Challenge 8: Unpack the struct and write at the target address.
Challenge 9: Fix some string operation to make the iMpOsSiBlE come true.
Challenge 10: Fake the 'cmdline' line file to return the right content.
Challenge 11: Bypass CPUID/MIDR_EL1 checks.

Checking which challenge are solved...
Note: Some challenges will results in segfaults and infinite loops if they aren't solved.

Challenge 1: SOLVED
Challenge 2: NOT SOLVED
Challenge 3: NOT SOLVED
Challenge 4: SOLVED

As you can see above we are done with challenge 4.

Summary

First of all, I would like to thank Qiling’s developers. I really appreciate their work. I did my own research on this project, so it is possible that I have forgotten something; please forgive me for that. I will put the links below so you can do your own research and learn from these challenges.

  1. Qiling: qiling
  2. Qiling challenges: qiling-challenges
  3. telegram: Qiling-Telegram

I wish you good luck. Please do not forget to share and like our blog posts for more awesome stuff.

Thanks for reading.

Ahmet Göker malware researcher exploit researcher ~ Malwation ~