R2Pay Under the Microscope: Breaking White-Box Crypto
Defeating White-Box Obfuscation with Differential Computation Analysis
Safety & Legal Notice: This content is provided for educational, research, and authorized security testing purposes only. Apply these techniques only to systems or applications you own or have explicit permission to assess, and only within isolated lab environments. The author disclaims any responsibility for misuse or illegal use of this material.
I. Introduction
In this article, we target the second part of the R2Pay challenge. While the first article focused on bypassing the application’s runtime protections, this one goes further, recovering the PIN code and extracting the secret AES master key from the white-box implementation embedded in the native library libnative-lib_x86_64.so.
The approach combines two complementary techniques. First, we use dynamic instrumentation with Frida and QBDI to trace the memory access patterns of the target function during execution. Second, we apply Differential Computation Analysis (DCA) to statistically recover the AES key from those memory traces without ever reversing the white-box implementation itself.
The article walks through the full attack pipeline: 1) identifying the leakage surface in the .data section, 2) collecting traces across 500 executions and formatting the dataset, and 3) running the DCA to recover the last round key and derive the master key through the reverse AES key schedule.
II. Reversing the Authentication Flow
1. PIN Code Discovery via Fuzzing
User authentication is enforced through a 4-digit PIN code. Because the PIN space contains only 10,000 possible values, exhaustive enumeration is computationally inexpensive. By hooking the validation function with Frida, each candidate PIN can be submitted programmatically while monitoring the application’s response, allowing the correct value to be recovered efficiently. The following script illustrates the instrumentation logic used for this process.
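// Excerpt from pin_fuzzer.js: METHOD_NAME, original, suffix, CTL and the helpers
// pad4(), toBytes() and hex() are defined elsewhere in the script.
MainActivity[METHOD_NAME].implementation = function (bArr, b) {
  for (let pin = 0; pin < 10000; pin++) {
    const inputStr = pad4(pin) + suffix;
    const buf = Java.array("byte", toBytes(inputStr));
    try {
      const out = original.call(this, buf, CTL);
      const b0 = out[0] & 0xff;
      // Any first byte other than 0x51 is reported as a hit
      if (b0 !== 0x51) {
        console.log(`[HIT] PIN ${pad4(pin)} -> r2c-${hex(out, 1)}`);
      }
    } catch (e) {
      // Exception tracker during sequence processing
    }
  }
  // Forward the caller's real arguments so the app keeps working
  return original.call(this, bArr, b);
};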
The execution of the pin_fuzzer.js Frida script completed successfully in approximately 1 hour and 15 minutes, resulting in the identification of the correct PIN code: 5971.
2. Reversing the PIN Verification & Salt Extraction
Once the correct PIN code was identified, the next step was to analyze the application’s authentication function, which relies on PIN-based verification, and examine its internal workflow. During the initial phase of the analysis, we focused on the most frequently executed functions within gXftm3iswpkVgBNDUp (sub_1780F0) and identified three key functions that were consistently invoked: 0x272530, 0x27d100, and 0x27e810.
By tracing memory accesses across multiple executions using different PIN codes, we observed recurring patterns that enabled us to identify the salt value used during the derivation process. This analysis also revealed that the application relies on the PBKDF2-HMAC-SHA256 key derivation algorithm for PIN verification.
The first strong indicator is the presence of the exact SHA-256 initial hash values defined in FIPS 180-4.
67 e6 09 6a → 0x6a09e667 ; ← H0
85 ae 67 bb → 0xbb67ae85 ; ← H1
72 f3 6e 3c → 0x3c6ef372 ; ← H2
3a f5 4f a5 → 0xa54ff53a ; ← H3
7f 52 0e 51 → 0x510e527f ; ← H4
8c 68 05 9b → 0x9b05688c ; ← H5
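The byte-to-constant mapping above can be checked mechanically. The following is a minimal sketch, not part of the original tooling: it decodes a buffer as little-endian 32-bit words and counts how many leading words match the SHA-256 initial hash values from FIPS 180-4. The function name match_sha256_iv is hypothetical.

```python
import struct

# SHA-256 initial hash values H0..H7 (FIPS 180-4, section 5.3.3).
SHA256_IV = (
    0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
    0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19,
)


def match_sha256_iv(buf: bytes) -> int:
    """Count how many leading little-endian 32-bit words of buf equal H0, H1, ..."""
    count = len(buf) // 4
    words = struct.unpack("<%dI" % count, buf[:4 * count])
    matched = 0
    for word, expected in zip(words, SHA256_IV):
        if word != expected:
            break
        matched += 1
    return matched


# The 24 bytes listed above, in memory order.
dump = bytes.fromhex("67e6096a85ae67bb72f36e3c3af54fa57f520e518c68059b")
print(match_sha256_iv(dump))  # 6: all six visible words match H0..H5
```

Only six of the eight constants are visible in the dump, so a count of six is the strongest match this buffer can give.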
The second strong indicator is the presence of the HMAC inner and outer padding constants. During execution, we observed buffers filled with repeated 0x36 and 0x5c values:
[+] ENTER 0x27d100
arg0 = 0x7acc868f4a40
              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
7acc868f4a40  00 00 00 00 00 00 00 00 67 e6 09 6a 85 ae 67 bb  ........g..j..g.
7acc868f4a50  72 f3 6e 3c 3a f5 4f a5 7f 52 0e 51 8c 68 05 9b  r.n<:.O..R.Q.h..
arg1 = 0x7acc868fa560
              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
7acc868fa560  03 0f 01 07 36 36 36 36 36 36 36 36 36 36 36 36  ....666666666666
7acc868fa570  36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36  6666666666666666
arg2 = 0x40
[-] LEAVE 0x27d100 ret=0x0
...................
[+] ENTER 0x27d100
arg0 = 0x7acc868f4a40
              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
7acc868f4a40  00 00 00 00 00 00 00 00 67 e6 09 6a 85 ae 67 bb  ........g..j..g.
7acc868f4a50  72 f3 6e 3c 3a f5 4f a5 7f 52 0e 51 8c 68 05 9b  r.n<:.O..R.Q.h..
arg1 = 0x7acc868fa5a0
              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
7acc868fa5a0  69 65 6b 6d 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c  iekm\\\\\\\\\\\\
7acc868fa5b0  5c 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c 5c  \\\\\\\\\\\\\\\\
arg2 = 0x40
[-] LEAVE 0x27d100 ret=0x0
The first four bytes are particularly interesting because they directly reveal the PIN code XORed with the padding constants. Reversing the XOR operation gives the PIN code we used. The same observation can be made with the outer pad:
0x03 ^ 0x36 = '5' ; Byte 0: Derived Key ASCII char
0x0f ^ 0x36 = '9' ; Byte 1: Derived Key ASCII char
0x01 ^ 0x36 = '7' ; Byte 2: Derived Key ASCII char
0x07 ^ 0x36 = '1' ; Byte 3: Derived Key ASCII char
...................
0x69 ^ 0x5C = '5' ; Byte 0: Derived Key ASCII char
0x65 ^ 0x5C = '9' ; Byte 1: Derived Key ASCII char
0x6B ^ 0x5C = '7' ; Byte 2: Derived Key ASCII char
0x6D ^ 0x5C = '1' ; Byte 3: Derived Key ASCII char
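The XOR relationship follows from the HMAC construction (RFC 2104), in which the key is zero-padded to the 64-byte block size and XORed with 0x36 (inner pad) and 0x5c (outer pad). The sketch below rebuilds both padded blocks for the key "5971" and reverses the operation on the bytes seen in the dump. The helper names hmac_pads and recover_key_prefix are illustrative only.

```python
from typing import Tuple


def hmac_pads(key: bytes, block_size: int = 64) -> Tuple[bytes, bytes]:
    """Return (key XOR ipad, key XOR opad) for a key no longer than one block."""
    if len(key) > block_size:
        raise ValueError("longer keys are hashed first; not handled in this sketch")
    padded = key.ljust(block_size, b"\x00")
    inner = bytes(b ^ 0x36 for b in padded)
    outer = bytes(b ^ 0x5C for b in padded)
    return inner, outer


def recover_key_prefix(block: bytes, pad: int, length: int) -> bytes:
    """Undo the pad XOR on the first `length` bytes of a captured block."""
    return bytes(b ^ pad for b in block[:length])


inner, outer = hmac_pads(b"5971")
print(inner[:8].hex())  # 030f010736363636
print(outer[:8].hex())  # 69656b6d5c5c5c5c

# Reversing the XOR on the first four bytes of each captured buffer:
print(recover_key_prefix(bytes.fromhex("030f0107"), 0x36, 4))  # b'5971'
print(recover_key_prefix(bytes.fromhex("69656b6d"), 0x5C, 4))  # b'5971'
```

Because the PIN is only four bytes long, the remaining 60 bytes of each block are the bare pad constant, which is exactly the run of 0x36 and 0x5c values visible in the trace.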
Another major indicator is the presence of the PBKDF2 block index: 00 00 00 01 appended immediately after the salt buffer. The salt was identified through multiple executions with different PIN codes, remaining constant across all runs, and it is located in the .rodata section at address 0x002881a0.
[+] ENTER 0x27d100
arg0 = 0x7acc868f4a40
              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
7acc868f4a40  40 00 00 00 00 00 00 00 5a a3 90 db 70 b7 6f 76  @.......Z...p.ov
7acc868f4a50  af 18 6e 91 74 64 38 69 9d 26 3f de f4 6f a0 f1  ..n.td8i.&?..o..
arg1 = 0x7acbc01961a0
              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
7acbc01961a0  4a d8 91 93 4b 99 c3 a0 44 5f 66 ad 76 ea a1 06  J...K...D_f.v...
7acbc01961b0  b7 0e 29 f6 61 f7 8d ac f5 41 78 7d f5 9b a2 25  ..).a....Ax}...%
arg2 = 0x10
[-] LEAVE 0x27d100 ret=0x0
[+] ENTER 0x27d100
arg0 = 0x7acc868f4a40
              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
7acc868f4a40  50 00 00 00 00 00 00 00 5a a3 90 db 70 b7 6f 76  P.......Z...p.ov
7acc868f4a50  af 18 6e 91 74 64 38 69 9d 26 3f de f4 6f a0 f1  ..n.td8i.&?..o..
arg1 = 0x7acbb9e54e7c
              0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
7acbb9e54e7c  00 00 00 01 a6 20 14 a9 01 00 00 00 08 8b e5 b9  ..... ..........
7acbb9e54e8c  cb 7a 00 00 80 95 19 c0 cb 7a 00 00 a6 20 14 a9  .z.......z... ..
arg2 = 0x4
[-] LEAVE 0x27d100 ret=0x0
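The two consecutive hash updates in this trace (16 bytes of salt, then the 4-byte value 00 00 00 01) match the first PBKDF2 step, U1 = HMAC(P, S || INT(1)), as specified in RFC 8018. The following sketch computes the first PBKDF2 block by hand with the standard hmac module and compares it against hashlib.pbkdf2_hmac. The iteration count of 32 is taken from the verification script later in this section; the function name pbkdf2_first_block is hypothetical.

```python
import hashlib
import hmac


def pbkdf2_first_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Compute T1 of PBKDF2-HMAC-SHA256 step by step."""
    # U1 = HMAC(password, salt || INT(1)); INT(1) is the big-endian block index
    # that shows up as 00 00 00 01 right after the salt in the trace.
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    t = u
    for _ in range(iterations - 1):
        # U_i = HMAC(password, U_{i-1}); T1 is the XOR of all U_i
        u = hmac.new(password, u, hashlib.sha256).digest()
        t = bytes(a ^ b for a, b in zip(t, u))
    return t


salt = bytes.fromhex("4ad891934b99c3a0445f66ad76eaa106")
manual = pbkdf2_first_block(b"5971", salt, 32)
library = hashlib.pbkdf2_hmac("sha256", b"5971", salt, 32, 32)
print(manual == library)
```

With a 32-byte output and SHA-256, the derived key is exactly one block, which is why a single block index is enough here.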
After generating the hash, the application compares it with a hardcoded value stored in the .rodata section at address 0x2881B0. The assembly code below shows how the calculated hash is checked against this embedded hash.
00179839 loc_179839:
00179839   mov     eax, 147C3175h
0017983E   mov     ecx, 0F69AAE47h
00179843   mov     rdx, [rbp+var_12A8]
0017984A   mov     rsi, rdx
0017984D   add     rsi, 1
00179851   mov     [rbp+var_12A8], rsi
00179858   movzx   edi, byte ptr [rdx]
0017985B   mov     rdx, [rbp+var_12B0]
00179862   mov     rsi, rdx
00179865   add     rsi, 1
00179869   mov     [rbp+var_12B0], rsi
00179870   movzx   r8d, byte ptr [rdx]
00179874   cmp     edi, r8d          ; Compare the computed hash byte with the expected one
00179877   cmovnz  eax, ecx
0017987A   mov     [rbp-12B4h], eax
00179880   jmp     loc_17A703
To verify our findings, we used the Python script below:
import hashlib

salt = bytes.fromhex("4ad891934b99c3a0445f66ad76eaa106")
expected = bytes.fromhex("b70e29f661f78dacf541787df59ba225e144628488b46b4c6047d4ced38a3af7")
pin = "5971"

# Positional arguments: 32 iterations, 32-byte derived key
key = hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 32, 32)
print("Hash result :", key.hex())
print("Match :", key == expected)
The PIN code and its associated salt are: r2con{5971:4ad891934b99c3a0445f66ad76eaa106}.
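Once the salt and the embedded hash are known, the PIN search no longer needs the running application. As a hedged illustration, assuming the parameters from the verification script above (PBKDF2-HMAC-SHA256, 32 iterations, 32-byte output), the whole 4-digit space can be enumerated offline. The function name brute_force_pin is hypothetical.

```python
import hashlib
from typing import Optional


def brute_force_pin(salt: bytes, expected: bytes, iterations: int = 32) -> Optional[str]:
    """Return the first 4-digit PIN whose PBKDF2 output equals `expected`, else None."""
    for pin in range(10_000):
        candidate = f"{pin:04d}"
        derived = hashlib.pbkdf2_hmac(
            "sha256", candidate.encode(), salt, iterations, len(expected)
        )
        if derived == expected:
            return candidate
    return None


salt = bytes.fromhex("4ad891934b99c3a0445f66ad76eaa106")
expected = bytes.fromhex(
    "b70e29f661f78dacf541787df59ba225e144628488b46b4c6047d4ced38a3af7"
)
print(brute_force_pin(salt, expected))
```

At 32 iterations per candidate, this loop is far cheaper than driving the hooked validation function 10,000 times through Frida.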
III. Breaking White-Box AES and Recovering the Master Key
After discovering the PIN code and its salt, we will now explain how to extract the master key of the White-Box AES implementation using Differential Computation Analysis (DCA). Before doing so, we first need to understand some theoretical concepts.
1. The White-Box Adversary Model
In a standard black-box model, the attacker only sees inputs and outputs. In the white-box threat model, the attacker has total control over the execution environment.
Capabilities: The attacker can run the AES binary on an untrusted host (e.g., a mobile phone or a game console), inspect memory at runtime, hook instructions, and use Dynamic Binary Instrumentation (DBI) frameworks like QBDI or Frida.
Target: The hidden AES key embedded inside obfuscated lookup tables, network-of-tables, or randomized encodings.
2. Multi-Model CPA Approach
In a typical DCA attack, a single leakage model is used to recover each key byte. The most common choice is the Hamming Weight model applied to an AES intermediate value. The attack initially performs DCA using the standard last-round AES leakage model:
L = HW(InvSbox(C ⊕ K))
where C is the ciphertext byte and K is the key-byte guess. In most cases, this model is sufficient to recover the correct byte. However, some bytes may exhibit weak leakage, resulting in very similar correlation values for multiple key candidates.
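To make the model L = HW(InvSbox(C ⊕ K)) concrete, here is a minimal, self-contained sketch that recovers one key byte from simulated data. It is not the article's dca.py: the leakage is synthetic (model value plus Gaussian noise), only a single sample column is correlated, and the key byte 0x3C, the noise level, and all function names are arbitrary assumptions. The inverse S-box is derived from the AES definition rather than hard-coded.

```python
import math
import random


def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_inv_sbox() -> list:
    """Build the AES S-box from its definition, then invert it."""
    inv_sbox = [0] * 256
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the multiplicative inverse of x
                inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):  # affine transformation
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        inv_sbox[s ^ 0x63] = x
    return inv_sbox


INV_SBOX = build_inv_sbox()


def hw(x: int) -> int:
    return bin(x).count("1")


def pearson(xs, ys) -> float:
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0


def cpa_key_byte(ct_bytes, samples):
    """Score all 256 guesses K by |corr(HW(InvSbox(C xor K)), samples)|."""
    scores = []
    for k in range(256):
        model = [hw(INV_SBOX[c ^ k]) for c in ct_bytes]
        scores.append(abs(pearson(model, samples)))
    best = max(range(256), key=scores.__getitem__)
    return best, scores


# Simulated experiment: 500 ciphertext bytes, noisy HW leakage of the true key.
rng = random.Random(1)
true_key = 0x3C  # arbitrary value for the simulation
cts = [rng.randrange(256) for _ in range(500)]
leak = [hw(INV_SBOX[c ^ true_key]) + rng.gauss(0, 1.0) for c in cts]
best, scores = cpa_key_byte(cts, leak)
print(hex(best))
```

A real attack repeats this over every sample column of the trace matrix and keeps, for each guess, the highest correlation found at any position.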
To address this limitation, we combined the Hamming Weight (HW) model with several complementary leakage models:
- HW(C ⊕ k): Models the leakage of the AddRoundKey operation by targeting the intermediate state after key mixing.
- Hamming Distance (HD): Captures transition leakage between successive internal states during computation.
- Bit-level models: Exploit individual bits of the InvSbox output when the overall HW signal is too weak or masked.
- Verified brute force: For bytes no model resolves confidently, the top candidates are tested against real plaintext/ciphertext pairs until the correct key is confirmed.
Each model targets the same secret-dependent computation from a different angle, ensuring full key recovery even when a single model fails.
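One way to decide when a model has "failed" for a byte is to compare the best correlation score with the runner-up. The sketch below illustrates that idea only; the 1.5 ratio threshold, the shortlist size, the model names, and the functions rank_candidates, is_confident and resolve_byte are all assumptions for illustration, not the logic used in dca.py.

```python
from typing import List, Optional, Sequence, Tuple


def rank_candidates(scores: Sequence[float], top: int = 4) -> List[Tuple[int, float]]:
    """Return the `top` key guesses with the highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return [(k, scores[k]) for k in order[:top]]


def is_confident(scores: Sequence[float], min_ratio: float = 1.5) -> bool:
    """A byte counts as resolved when the best score clearly beats the second best."""
    (_, best), (_, runner_up) = rank_candidates(scores, top=2)
    if runner_up <= 0:
        return best > 0
    return best / runner_up >= min_ratio


def resolve_byte(
    scores_by_model: Sequence[Tuple[str, Sequence[float]]],
    min_ratio: float = 1.5,
    top: int = 4,
) -> Tuple[Optional[str], List[int]]:
    """Try each model in order; if none is confident, return a merged shortlist."""
    shortlist: List[int] = []
    for name, scores in scores_by_model:
        if is_confident(scores, min_ratio):
            return name, [rank_candidates(scores, 1)[0][0]]
        shortlist.extend(k for k, _ in rank_candidates(scores, top))
    # No model was decisive: hand the de-duplicated candidates to brute force.
    return None, list(dict.fromkeys(shortlist))


# Synthetic scores: the first model is ambiguous, the second one is decisive.
ambiguous = [0.10] * 256
ambiguous[0x42], ambiguous[0x17] = 0.30, 0.29
decisive = [0.10] * 256
decisive[0x42] = 0.60

print(resolve_byte([("hw_invsbox", ambiguous), ("bit0_invsbox", decisive)]))
print(resolve_byte([("hw_invsbox", ambiguous)]))
```

When every model is ambiguous, the returned shortlist plays the role of the "top candidates" that the verified brute-force step checks against real plaintext/ciphertext pairs.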
3. Putting the Approach into Practice
3.1. Trace Collection
This step consists of tracing the memory access patterns of the white-box AES encryption routine. As always, our analysis entry point is the function sub_1780F0. Before instrumenting it, we first need to identify which memory region is worth tracing.
Inspecting the sections of libnative-lib.so with objdump shows that the .data section is the primary candidate: its unusually large size (565 KB) strongly suggests it contains precomputed lookup tables rather than ordinary program data.
We instrumented sub_1780F0 using Frida and QBDI, restricting the memory callback to the .data range (0x28c000 – 0x319038) and recording every read access along with its address, size, and value.
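The range restriction amounts to a simple offset check once the module's load address is known. The sketch below assumes the two bounds are offsets relative to the module base and that the upper bound is exclusive; the base address used in the example is made up, and in_data_section is a hypothetical name.

```python
DATA_START = 0x28C000  # .data start offset, from the range quoted above
DATA_END = 0x319038    # .data end offset, treated as exclusive (assumption)


def in_data_section(address: int, module_base: int) -> bool:
    """True if an absolute address falls inside the module's .data range."""
    offset = address - module_base
    return DATA_START <= offset < DATA_END


size = DATA_END - DATA_START
print(size, round(size / 1024, 1))  # 577592 bytes, about 564 KiB

# Example with a made-up load address:
base = 0x7ACBC0000000
print(in_data_section(base + 0x297000, base))  # True: inside the range
print(in_data_section(base + 0x1780F0, base))  # False: code, not .data
```

Filtering this early keeps stack and heap accesses out of the trace, which would otherwise add noise unrelated to the lookup tables.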
Three access patterns emerge immediately and tell a clear story:
- Purple bars (0x297000 – 0x2be000): 288 four-byte reads at a constant 1 KB stride across 8 instruction sites, organized in four parallel columns. This is the unmistakable signature of an AES T-table implementation, where the SubBytes, ShiftRows, and MixColumns operations are merged into a single 32-bit lookup.
- Red bars (0x2e3000 – 0x319000): 1,730 single-byte reads, exactly 32 per page, uniformly distributed across 57 pages and 51 instruction sites. This perfect regularity is the fingerprint of a byte-indexed S-box substitution layer. It is also the most active region, accounting for 85% of all reads in the trace.
Once the memory regions of interest were identified, we used memory_trace.js, a custom QBDI/Frida instrumentation script, to systematically collect memory access traces across a large set of inputs.
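Patterns like these can be surfaced with a few lines of post-processing. The following sketch groups reads by access size and 4 KB page and checks for a constant stride. It runs on synthetic (offset, size) pairs, since the real record layout is not fully specified here; summarize_reads and constant_stride are hypothetical helpers.

```python
from collections import Counter, defaultdict

PAGE = 0x1000


def summarize_reads(reads):
    """Group (offset, size) reads into {size: Counter({page_base: count})}."""
    by_size = defaultdict(Counter)
    for offset, size in reads:
        by_size[size][offset & ~(PAGE - 1)] += 1
    return by_size


def constant_stride(offsets):
    """Return the stride if sorted unique offsets are evenly spaced, else None."""
    ordered = sorted(set(offsets))
    steps = {b - a for a, b in zip(ordered, ordered[1:])}
    return steps.pop() if len(steps) == 1 else None


# Synthetic example: 4-byte reads at a 1 KB stride, plus 32 byte reads per page.
table_reads = [(0x297000 + i * 0x400, 4) for i in range(8)]
sbox_reads = [(0x2E3000 + page * PAGE + i * 7, 1) for page in range(3) for i in range(32)]
summary = summarize_reads(table_reads + sbox_reads)

print(hex(constant_stride(o for o, s in table_reads)))  # 0x400
print(sorted(summary[1].values()))                      # [32, 32, 32]
```

A constant stride among wide reads and an identical per-page count among byte reads are the two regularities the bullet points above describe.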
3.2. Processing the Collected Traces
The memory_trace.js script produces two types of output files per execution:
- plaintexts.json: a single file accumulating one entry per execution, each containing the execution index, the plaintext input, and the ciphertext returned by sub_1780F0.
- trace_XXXX.json: one file per execution, containing the list of .data memory accesses recorded during that encryption call. Each record is a 5-element array.
The field val (index [3]) is the raw memory value read from .data; this is the leakage sample. It is secret-dependent because it reflects the intermediate AES state InvSbox[ct XOR k], which depends on the unknown last-round key byte k.
After that, we used the dca_format.py script to process the full trace directory in one pass: it extracts col[3] & 0xFF from every record as the leakage signal, aligns all traces to the same length, and applies z-score normalisation to remove scheduling jitter. It then writes three files: samples.bin (the N × T leakage matrix), ciphers.bin (the N × 16 ciphertext array), and plains.bin (the N × 16 plaintext array), which are the only inputs dca.py needs to launch the DCA attack.
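The three formatting steps can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the actual dca_format.py: traces are aligned by truncating to the shortest one, normalisation is applied per sample column across traces, and every record field other than index 3 is a placeholder because only that index is described above.

```python
import math


def leakage_row(records):
    """Keep the low byte of the value field (index 3) of every record."""
    return [rec[3] & 0xFF for rec in records]


def build_matrix(traces):
    """Align all traces by truncating them to the shortest length (assumption)."""
    rows = [leakage_row(t) for t in traces]
    width = min(len(r) for r in rows)
    return [r[:width] for r in rows]


def zscore_columns(matrix):
    """Normalise each sample column to zero mean and unit variance."""
    n, width = len(matrix), len(matrix[0])
    out = [[0.0] * width for _ in range(n)]
    for j in range(width):
        col = [row[j] for row in matrix]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i, v in enumerate(col):
            # Constant columns carry no information; map them to 0.0
            out[i][j] = (v - mean) / std if std else 0.0
    return out


# Three tiny synthetic traces; only index 3 is meaningful in this sketch.
traces = [
    [[0, 0, 0, 0x1A5, 0], [0, 0, 0, 7, 0], [0, 0, 0, 9, 0], [0, 0, 0, 1, 0]],
    [[0, 0, 0, 0x10, 0], [0, 0, 0, 7, 0], [0, 0, 0, 3, 0]],
    [[0, 0, 0, 0x20, 0], [0, 0, 0, 7, 0], [0, 0, 0, 6, 0], [0, 0, 0, 2, 0], [0, 0, 0, 4, 0]],
]
matrix = build_matrix(traces)
samples = zscore_columns(matrix)
print(matrix)  # [[165, 7, 9], [16, 7, 3], [32, 7, 6]]
```

Truncation is the simplest alignment strategy; it only works well when every execution follows the same access sequence, which the regular patterns observed in the .data region suggest.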
3.3. Key Recovery via Differential Computation Analysis
At this stage, we apply dca.py to analyze the collected traces and extract the AES master key. The script implements the multi-model DCA approach presented in the core methodology section, iterating over all 16 ciphertext bytes, correlating the leakage matrix against HW(InvSbox[ct XOR k]) for each key hypothesis, and falling back to alternative models for any byte where the primary correlation is inconclusive.
After executing dca.py against the 500 collected traces, the script recovered all 16 bytes of the last round key, RK10 = 768d19e17f62a01eb0cdd39a28e1798f. The reverse key schedule then rolled RK10 back to the original master key: r2p4y1sN0wSecur3.
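The AES-128 key schedule is invertible: each word satisfies w[i] = w[i-4] XOR f(w[i-1]), so w[i-4] can be recomputed from two later words. The sketch below walks the schedule backwards from a last round key. It is an independent illustration rather than the code in dca.py, and invert_key_schedule is a hypothetical name. It can be sanity-checked with the FIPS 197 Appendix A.1 example key.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox() -> list:
    """Build the AES S-box: field inverse followed by the affine transformation."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the multiplicative inverse of x
                inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def invert_key_schedule(last_round_key: bytes) -> bytes:
    """Recover the AES-128 master key from the 16-byte round key of round 10."""
    words = {40 + j: list(last_round_key[4 * j:4 * j + 4]) for j in range(4)}
    for i in range(43, 3, -1):
        prev = words[i - 1]
        if i % 4 == 0:
            # RotWord, SubWord, then XOR the round constant into the first byte
            temp = [SBOX[b] for b in prev[1:] + prev[:1]]
            temp[0] ^= RCON[i // 4 - 1]
        else:
            temp = prev
        words[i - 4] = [a ^ b for a, b in zip(words[i], temp)]
    return bytes(b for j in range(4) for b in words[j])


# FIPS 197 Appendix A.1: last round key of the example key 2b7e1516...
fips_rk10 = bytes.fromhex("d014f9a8c9ee2589e13f0cc8b6630ca6")
print(invert_key_schedule(fips_rk10).hex())  # 2b7e151628aed2a6abf7158809cf4f3c

# Applied to the RK10 value reported above:
rk10 = bytes.fromhex("768d19e17f62a01eb0cdd39a28e1798f")
print(invert_key_schedule(rk10))
```

Iterating from word 43 downwards guarantees that w[i-1] is always available when w[i-4] is computed, so no intermediate round key needs to be guessed.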
IV. Conclusion
In this article, we demonstrated a complete DCA attack against a white-box AES implementation, recovering the master key r2p4y1sN0wSecur3 using only memory access traces: no source code, no symbols, and no disassembly required.
The attack combined Frida and QBDI instrumentation to collect 500 memory traces, a DCA engine with multiple leakage models to recover the last round key RK10, and the reverse AES key schedule to derive the final master key.
This result confirms a fundamental weakness of software white-box cryptography: no matter how the key is obfuscated within lookup tables, the memory access patterns it produces are statistically exploitable. The secret does not need to be visible; it just needs to influence what gets read from memory, and DCA does the rest.
References
- Chow, S., Eisen, P., Johnson, H., & Van Oorschot, P. C. (2002). White-Box Cryptography and an AES Implementation. Selected Areas in Cryptography (SAC 2002). Springer, LNCS 2595, pp. 250–270.
- Billet, O., Gilbert, H., & Ech-Chatbi, C. (2004). Cryptanalysis of a White Box AES Implementation. Selected Areas in Cryptography (SAC 2004). Springer, LNCS 3357, pp. 227–240.
- Bos, J. W., Hubain, C., Michiels, W., & Teuwen, P. (2016). Differential Computation Analysis: Hiding Your White-Box Designs is Not Enough. Cryptographic Hardware and Embedded Systems (CHES 2016). Springer, LNCS 9813, pp. 215–236.
- Frida Dynamic Instrumentation Toolkit. (2024). Frida — A World-Class Dynamic Instrumentation Framework. https://frida.re
- QBDI — QuarkslaB Dynamic binary Instrumentation. QBDI: A Dynamic Binary Instrumentation Framework Based on LLVM. https://qbdi.quarkslab.com
- National Institute of Standards and Technology (NIST). (2001). Advanced Encryption Standard (AES). FIPS Publication 197. U.S. Department of Commerce.
- Daemen, J., & Rijmen, V. (2002). The Design of Rijndael: AES — The Advanced Encryption Standard.
- RedFenec. (2026). R2Pay Under the Microscope: Runtime Protections. https://redfenec.com/r2pay-under-the-microscope-runtime-protections/


