Zydis v4.0 released!
Today we are happy to announce the release of Zydis v4.0!
Simplified Disassembler API
We introduced new all-in-one functions for decoding and formatting an instruction in a single call without the need to initialize a decoder or formatter up-front. This simplifies using Zydis in cases where performance isn’t of utmost concern and you just want to decode a few instructions on the quick.
```c
// Decode and format instruction to human-readable assembly in one call.
ZydisDisassembledInstruction insn;
ZydisDisassembleIntel(
    /* machine_mode:    */ ZYDIS_MACHINE_MODE_LONG_64,
    /* runtime_address: */ 0,
    /* buffer:          */ "\x55",
    /* length:          */ 1,
    /* instruction:     */ &insn
);

assert(insn.info.mnemonic == ZYDIS_MNEMONIC_PUSH);
assert(strcmp(insn.text, "push rbp") == 0);
```
For those who prefer AT&T syntax, just swap out `Intel` with `ATT`.
Encoding Instructions
Zydis now supports encoding instructions, allowing users to generate new code on the fly or to rewrite existing code with less effort than ever before.
Generate New Instructions
```c
ZydisEncoderRequest req =
{
    .machine_mode = ZYDIS_MACHINE_MODE_LONG_64,
    .mnemonic = ZYDIS_MNEMONIC_MOV,
    .operand_count = 2,
    .operands =
    {
        {
            .type = ZYDIS_OPERAND_TYPE_REGISTER,
            .reg.value = ZYDIS_REGISTER_RAX,
        },
        {
            .type = ZYDIS_OPERAND_TYPE_IMMEDIATE,
            .imm.u = 0x1337,
        }
    }
};

ZyanU8 insn_bytes[ZYDIS_MAX_INSTRUCTION_LENGTH];
ZyanUSize insn_length = sizeof(insn_bytes);
ZydisEncoderEncodeInstruction(&req, insn_bytes, &insn_length);

assert(insn_length == 7);
assert(memcmp(insn_bytes, "\x48\xc7\xc0\x37\x13\x00\x00", 7) == 0);
```
This example code encodes a `mov` instruction from scratch, writing the instruction
bytes into `insn_bytes` and placing the instruction length into `insn_length`.
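To make the asserted byte sequence less opaque, the following hand-rolled sketch builds the same seven bytes without Zydis. It covers only the `mov r64, imm32` form (REX.W prefix, opcode `C7 /0`, sign-extended 32-bit immediate). The helper name `encode_mov_r64_imm32` and its register-number parameter are made up for illustration and are not part of the Zydis API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Zydis: encodes `mov r64, imm32`
 * (REX.W + C7 /0 id). `reg` is the hardware register number
 * (0 = rax, 1 = rcx, ..., 15 = r15). Returns the number of bytes written. */
static size_t encode_mov_r64_imm32(uint8_t reg, int32_t imm, uint8_t out[7])
{
    assert(reg < 16);
    uint32_t bits = (uint32_t)imm;

    out[0] = (uint8_t)(0x48 | (reg >> 3)); /* REX.W, plus REX.B for r8..r15 */
    out[1] = 0xC7;                         /* opcode: mov r/m64, imm32 */
    out[2] = (uint8_t)(0xC0 | (reg & 7));  /* ModRM: mod=11, reg=/0, rm=reg */
    for (int i = 0; i < 4; ++i)            /* immediate, little-endian */
        out[3 + i] = (uint8_t)(bits >> (8 * i));
    return 7;
}
```

Calling `encode_mov_r64_imm32(0, 0x1337, buf)` yields `48 c7 c0 37 13 00 00`, the same bytes the encoder example asserts. A real encoder has to choose between many such forms, which is the work Zydis takes off your hands.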
Rewriting Code
```c
// mov rcx, qword ptr ds:[0x1234]
ZyanU8 mov_bytes[] = "\x48\x8B\x0C\x25\x34\x12\x00\x00";

// Decode and print the original instruction.
ZydisDisassembledInstruction insn;
ZydisDisassembleIntel(ZYDIS_MACHINE_MODE_LONG_64, 0, mov_bytes, sizeof(mov_bytes), &insn);
assert(strcmp(insn.text, "mov rcx, [0x0000000000001234]") == 0);

// Convert the decoded instruction into an encoder request.
ZydisEncoderRequest req;
ZydisEncoderDecodedInstructionToEncoderRequest(
    &insn.info, insn.operands, insn.info.operand_count, &req);

// Change a few things about it.
req.operands[0].reg.value = ZYDIS_REGISTER_RSI;
req.operands[1].mem.displacement += 0x10000;

// Encode the changed instruction.
ZyanU8 new_insn[ZYDIS_MAX_INSTRUCTION_LENGTH];
ZyanUSize new_insn_length = sizeof(new_insn);
ZydisEncoderEncodeInstruction(&req, new_insn, &new_insn_length);

// Disassemble and print new instruction again.
ZydisDisassembleIntel(ZYDIS_MACHINE_MODE_LONG_64, 0, new_insn, new_insn_length, &insn);
assert(strcmp(insn.text, "mov rsi, [0x0000000000011234]") == 0);
```
This example decodes a `mov` instruction, converts the decoded instruction
into an encoder request, changes the register and displacement values and
then lastly encodes it back to binary code. The final disassembly step is
included merely for demonstration purposes, showing that the instruction
was indeed changed as requested.
Split Operand Decoding
Zydis allows users to not only inspect explicit operands, but also implicit
ones. Implicit operands are operands that are not printed in the human-readable
assembly generated by the formatter, but are still inspected or changed by the
CPU when the instruction is executed. A prominent example of an instruction
with many implicit operands is `pushad`, which essentially pushes all
general-purpose registers onto the stack despite having zero explicit operands.
In v3, the `ZydisDecodedInstruction` structure contained a field
```c
ZydisDecodedOperand operands[ZYDIS_MAX_OPERAND_COUNT];
```
where `ZYDIS_MAX_OPERAND_COUNT` was defined as `10`, the worst-case assumption
for the instruction with the maximum number of implicit operands. While this was
convenient, it also caused significant avoidable bloat of the instruction
structure, sometimes causing issues when it was allocated on the stack in
environments where stack space is restricted (e.g. kernel threads). Oftentimes,
users were not even interested in the visible operands, only wanting to inspect
the mnemonic and instruction length in the majority of cases.
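The size impact is easy to see with a pair of stand-in structs. The sketch below does not use the real Zydis definitions: `MockOperand`, `MockInsnV3Style`, `MockInsnV4Style` and all field sizes are invented purely to show how an embedded worst-case operand array dominates the footprint of every instruction object, whether or not the operands are ever read.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_OPERAND_COUNT 10 /* mirrors the worst case described above */

/* Invented stand-in for a decoded operand; the 64-byte payload is arbitrary. */
typedef struct MockOperand {
    uint8_t payload[64];
} MockOperand;

/* v3-style layout: operands live inside the instruction struct. */
typedef struct MockInsnV3Style {
    uint32_t mnemonic;
    uint8_t length;
    MockOperand operands[MOCK_MAX_OPERAND_COUNT];
} MockInsnV3Style;

/* v4-style layout: operands are decoded into a separate, optional array. */
typedef struct MockInsnV4Style {
    uint32_t mnemonic;
    uint8_t length;
} MockInsnV4Style;

/* Bytes saved per instruction object when the operands are not needed. */
static size_t mock_bytes_saved(void)
{
    return sizeof(MockInsnV3Style) - sizeof(MockInsnV4Style);
}
```

With these invented sizes the operand array alone occupies 640 bytes of the v3-style struct, paid for every instruction even when only the mnemonic and length are inspected.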
In Zydis v4, operand decoding is now optional:
```c
ZyanU8 jmp_bytes[] = "\xE9\xAB\x00\x00\x00";

ZydisDecoder decoder;
ZydisDecoderInit(&decoder, ZYDIS_MACHINE_MODE_LONG_64, ZYDIS_STACK_WIDTH_64);

ZydisDecodedInstruction insn;
ZydisDecoderContext ctx;
ZydisDecoderDecodeInstruction(&decoder, &ctx, jmp_bytes, sizeof(jmp_bytes), &insn);

// Only decode operands if the mnemonic is one we're actually interested in.
if (insn.mnemonic == ZYDIS_MNEMONIC_JMP)
{
    // Only decode visible operands.
    ZydisDecodedOperand operands[ZYDIS_MAX_OPERAND_COUNT_VISIBLE];
    ZydisDecoderDecodeOperands(&decoder, &ctx, &insn, operands,
        ZYDIS_MAX_OPERAND_COUNT_VISIBLE);

    assert(operands[0].imm.value.s == 0xAB);
}
```
For users that are looking for a way to achieve something close to the previous
behavior, the convenience function `ZydisDecoderDecodeFull` is offered:
```c
ZyanU8 jmp_bytes[] = "\xE9\xAB\x00\x00\x00";

ZydisDecoder decoder;
ZydisDecoderInit(&decoder, ZYDIS_MACHINE_MODE_LONG_64, ZYDIS_STACK_WIDTH_64);

ZydisDecodedInstruction insn;
ZydisDecodedOperand operands[ZYDIS_MAX_OPERAND_COUNT];
ZydisDecoderDecodeFull(&decoder, jmp_bytes, sizeof(jmp_bytes), &insn, operands);

assert(insn.mnemonic == ZYDIS_MNEMONIC_JMP);
assert(operands[0].imm.value.s == 0xAB);
```
Simplified Formatter API
The formatter API previously had an `Ex` variant of each function whose
only difference was that it had an additional `user_data` argument. This
resulted in unnecessary duplication and bloat of the public interface, so
we decided to just add the `user_data` argument to the regular functions.
Users that don’t wish to pass additional context to the formatter can simply
pass `NULL`.
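The pattern itself is independent of Zydis and can be sketched in a few lines of plain C. Everything below (`MockFormatter`, `MockHook`, `mock_format`, `prefix_hook`) is hypothetical and only illustrates why a single function with a nullable `user_data` argument can replace a regular/`Ex` pair; it is not the Zydis formatter interface.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hook type: receives the value being formatted plus whatever
 * context the caller handed to mock_format(). */
typedef int (*MockHook)(char *buf, size_t len, unsigned value, void *user_data);

typedef struct MockFormatter {
    MockHook hook; /* optional; NULL selects the default formatting */
} MockFormatter;

/* Single entry point: user_data is simply forwarded to the hook, so no
 * separate "Ex" variant is needed. Callers without context pass NULL. */
static int mock_format(const MockFormatter *fmt, char *buf, size_t len,
                       unsigned value, void *user_data)
{
    if (fmt->hook)
        return fmt->hook(buf, len, value, user_data);
    return snprintf(buf, len, "0x%X", value);
}

/* Example hook that treats user_data as an optional prefix string. */
static int prefix_hook(char *buf, size_t len, unsigned value, void *user_data)
{
    const char *prefix = user_data ? (const char *)user_data : "";
    return snprintf(buf, len, "%s0x%X", prefix, value);
}
```

A caller that needs context passes a pointer to it as the last argument; everyone else passes `NULL` and sees no difference from the old non-`Ex` call.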
Amalgamated Builds
We are now publishing amalgamated builds of the library for every version.
These builds essentially combine all header files into a single `Zydis.h`
and all source files into a single `Zydis.c`, making it very easy to link
against Zydis by just copying it into your project.
Amalgamated builds can also be created manually by running the
`assets/amalgamate.py` script in the Zydis repository.
Porting Guide
Because v4 contains a range of breaking changes to the API, we offer a
porting guide explaining the required changes to help make the migration
process less painful.
Credits
A huge thanks goes to Mappa, who contributed pretty much the entire implementation of the instruction encoder!