Technical question about __asm_rep_movsb_memcpy, __asm_rep_stosb_memset, __asm_in, and similar functions that RetDec generates when decompiling x86 PE executables
Great questions—these are common confusions when working with RetDec's decompiled output, so let's break this down clearly:
1. Are __asm_rep_stosb_memset() and __asm_rep_movsb_memcpy() identical to standard memset()/memcpy()?
Short answer: for practical purposes, yes. On non-overlapping buffers they fill and copy memory the same way the standard C library versions do, but they are direct mappings of x86 string instructions rather than calls into the runtime.
rep stosb is the x86 assembly workhorse for filling memory with a repeated value, which is exactly what memset() does. Similarly, rep movsb is the instruction for copying blocks of memory, the core of memcpy(). RetDec just wraps these instructions into pseudocode functions to make the decompiled code readable, without changing their functionality.
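To make the equivalence concrete, here is a minimal sketch that models the two instructions as plain C loops. It assumes the direction flag is clear (DF = 0, the usual state in compiled code), and the names emu_rep_stosb, emu_rep_movsb, and emu_matches_libc are hypothetical helpers, not anything RetDec ships. The argument order simply mirrors memset()/memcpy() and is an assumption about how the wrappers are called.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of `rep stosb` with DF = 0: store the low byte of
 * `value` at dst, advance dst, and repeat `count` times. */
static void *emu_rep_stosb(void *dst, int value, size_t count)
{
    unsigned char *d = dst;
    while (count--) {
        *d++ = (unsigned char)value;   /* AL -> [EDI], EDI++, ECX-- */
    }
    return dst;
}

/* Hypothetical model of `rep movsb` with DF = 0: copy one byte from src
 * to dst, advance both pointers, and repeat `count` times. */
static void *emu_rep_movsb(void *dst, const void *src, size_t count)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (count--) {
        *d++ = *s++;                   /* [ESI] -> [EDI], both ++, ECX-- */
    }
    return dst;
}

/* Returns 1 if both models agree with the C library on
 * non-overlapping buffers, 0 otherwise. */
static int emu_matches_libc(void)
{
    char a[16], b[16], src[16] = "RetDec wrappers";

    emu_rep_stosb(a, 0xAB, sizeof a);
    memset(b, 0xAB, sizeof b);
    if (memcmp(a, b, sizeof a) != 0) {
        return 0;
    }

    emu_rep_movsb(a, src, sizeof src);
    memcpy(b, src, sizeof src);
    return memcmp(a, b, sizeof a) == 0;
}
```

If you only want the decompiled output to compile, mapping the wrappers to memset()/memcpy() is likely enough; the loops above are mainly useful for reasoning about what the instruction does byte by byte.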
2. Why doesn't RetDec just use standard memset()/memcpy()?
RetDec plays it safe and prioritizes literal accuracy over making assumptions, which is why it sticks to these wrappers instead of substituting standard library calls:
- No guesswork about the original code: The decompiler can't be certain the original binary used the standard C runtime's memset() or memcpy(). The author might have written hand-coded assembly that uses rep stosb/rep movsb directly. Using the wrapper preserves the original instruction's intent without making unfounded assumptions.
- Preserves low-level details: Some optimized or custom memory functions might add extra checks (like alignment) or use different register setups than the raw rep instructions. RetDec wants to show you exactly what the assembly did, not what it could have been replaced with.
- Easier cross-referencing: If you're comparing the decompiled code to the original disassembly, seeing __asm_rep_stosb_memset() immediately tells you where a rep stosb instruction was used, something that would be hidden if it were replaced with memset().
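One concrete low-level detail that a blind memcpy() substitution could hide is overlap. memcpy() is undefined when source and destination overlap, whereas rep movsb with DF = 0 copies strictly forward, one byte at a time, so an overlapping copy has a well-defined result. Hand-written assembly sometimes relies on that. The sketch below illustrates the effect with hypothetical helper names (forward_byte_copy, replicate_first_byte); it is a model of the instruction's behavior, not RetDec output.

```c
#include <stddef.h>

/* Forward byte-by-byte copy, modelling `rep movsb` with DF = 0.
 * Hypothetical helper; not part of RetDec or the C library. */
static void forward_byte_copy(unsigned char *dst, const unsigned char *src,
                              size_t count)
{
    while (count--) {
        *dst++ = *src++;
    }
}

/* Copies the buffer onto itself shifted by one byte. Because the copy
 * runs forward, each byte just written becomes the source of the next
 * one, so buf[0] ends up replicated across the whole buffer. */
static void replicate_first_byte(unsigned char *buf, size_t len)
{
    if (len > 1) {
        forward_byte_copy(buf + 1, buf, len - 1);
    }
}
```

With memmove() the same call would shift the data by one position instead of replicating the first byte, which is why the substitution is only safe once you know the buffers cannot overlap.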
3. What do __asm_in() and other __asm_* functions mean?
These are RetDec's way of representing x86 assembly instructions that don't have a direct equivalent in standard C. They're not real functions—they're pseudocode placeholders for instructions that interact with hardware, special registers, or perform privileged operations.
- __asm_in(port) maps directly to the x86 in instruction, which reads data from a hardware I/O port. Your example __asm_in(513) corresponds to in al, dx executed with dx = 0x201 (513 = 0x201 in hex): it reads one byte from I/O port 0x201 into the al register, and the decompiled code then assigns that value to v3. The immediate form in al, imm8 can only address ports 0 to 255, so a port this high has to be passed in dx.
- Other __asm_* functions (like __asm_out() for writing to ports, __asm_cpuid() for CPU identification) follow the same rule: their names match the assembly instruction, and arguments match the instruction's operands.
Since these are placeholders, they won't be defined in any header. For reverse engineering, you have two options:
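- Implement simple wrappers using inline assembly (e.g., for GCC, a __asm_in wrapper could use inb inline assembly to read the port). Keep in mind that port I/O is privileged, so such a wrapper faults in an ordinary user-mode process unless the OS has granted port access.
- Replace them with context-specific logic once you figure out what the I/O port or instruction does (e.g., on PC-compatible hardware port 0x201 is traditionally the game (joystick) port, so you can add a comment explaining that and handle the value accordingly).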
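Both options can be sketched in one small shim. The names port_read8, asm_in8, and the USE_REAL_PORT_IO switch are hypothetical, the one-argument signature is an assumption based on the __asm_in(513) call in the question, and the canned values in the stand-in branch are placeholders rather than anything read from hardware.

```c
#include <stdint.h>

#if defined(USE_REAL_PORT_IO) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
/* Option 1: real port read via GCC inline assembly. This needs I/O
 * privilege (ring 0, or ioperm()/iopl() on Linux); without it the CPU
 * raises a fault. */
static uint8_t port_read8(uint16_t port)
{
    uint8_t value;
    __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
    return value;
}
#else
/* Option 2: stand-in for analysis that returns a canned value per port. */
static uint8_t port_read8(uint16_t port)
{
    switch (port) {
    case 0x201: return 0xF0; /* assumed sample value for port 0x201 */
    default:    return 0xFF; /* arbitrary default for unmodelled ports */
    }
}
#endif

/* Hypothetical replacement for the decompiler's __asm_in(port) call. */
static int32_t asm_in8(int32_t port)
{
    return port_read8((uint16_t)port);
}
```

To compile the decompiled source unchanged, one approach is to add a line such as #define __asm_in(port) asm_in8(port) above it. Identifiers beginning with two underscores are reserved in C, so this is a pragmatic shortcut for analysis rather than clean code.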
The question comes from Stack Exchange; asked by Andreas.




