By: Walker Rowe, May 01, 2017 (09:21 AM)

Wikileaks Marble Framework CIA String Obfuscator Code Explained

WikiLeaks has just published the third batch of CIA documents. While they say this is source code, it is not the actual spyware that we have been waiting for. Instead this is code the CIA uses to obfuscate their programs so they cannot be traced back to the CIA. So it’s just one piece of the larger puzzle.

The reason they obfuscate code is that the strings in a compiled program, which is in binary format, can be extracted and read. In fact, on Linux you can use the strings command to do that. People doing forensics can use it to look for messages and other text values that give a clue about where a program came from, for example if they are written in Russian or Chinese.
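To make that concrete, here is a minimal sketch of what a strings-style scan does: it walks over the raw bytes and keeps every run of printable characters that is long enough. The function name extract_strings and the default minimum length of 4 are our own choices for illustration; this is not the source of the real strings tool.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Collect every run of at least min_len printable ASCII characters.
// This is roughly what a strings-style scan reports (illustrative sketch).
std::vector<std::string> extract_strings(const std::string& data,
                                         std::size_t min_len = 4) {
    std::vector<std::string> found;
    std::string run;
    for (unsigned char byte : data) {
        if (std::isprint(byte)) {
            run.push_back(static_cast<char>(byte));  // still inside a readable run
        } else {
            if (run.size() >= min_len) found.push_back(run);
            run.clear();  // a non-printable byte ends the run
        }
    }
    if (run.size() >= min_len) found.push_back(run);  // run ending at end of data
    return found;
}
```

Fed the bytes of a compiled program, a scan like this returns any readable text left inside it, which is exactly what obfuscation is meant to prevent.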

The CIA documents say that their programs read an entire C language project and obfuscate its strings, meaning they move the bits around so that the strings become garbled noise that someone cannot read. What is interesting is that they say the programs will still compile when scrambled like that. That must mean the tool only works on strings and not on statements like for loops, arithmetic, etc.
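One way a program can still compile with its strings scrambled is to store each string as an array of byte values and restore it only when the program runs. The sketch below assumes that approach. The key 0xAA and the names kScrambled, kKey, and unscramble are hypothetical and are not taken from the leaked files.

```cpp
#include <cstddef>
#include <string>

// Hypothetical scrambled form of "secret": each byte XORed with 0xAA.
// None of these bytes is printable ASCII, so the word does not appear
// in a plain dump of readable strings.
const unsigned char kScrambled[] = {0xD9, 0xCF, 0xC9, 0xD8, 0xCF, 0xDE};
const unsigned char kKey = 0xAA;  // illustrative key, not a Marble value

// Restore the readable text at run time by applying the same XOR again.
std::string unscramble(const unsigned char* data, std::size_t len) {
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(static_cast<char>(data[i] ^ kKey));
    }
    return out;
}
```

The byte array is ordinary C and C++ syntax, so the compiler accepts it, and the loops and arithmetic around it are left untouched.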

We can take a look at the code and run one example. Download it from Wikileaks here and then unzip it.

Then we can run the CIA Python program, giving it as arguments the file of strings to read and the directory in which to put the C code it creates. It basically creates a C language subroutine that scrambles the strings using random values it picked. In the example below, it picked 6 and 0x55 for the bitwise operations, which is how it scrambles them:

python ./devutils/stringobfuscation/br_angry/ -v string.txt out

DEBUG:__main__:Using shift: 6, xor: 0x55
DEBUG:__main__:[gen_string_files] Processing header: string.txt …
DEBUG:__main__:[gen_string_files] Wrote to string_strings.h
DEBUG:__main__:[gen_string_files] Wrote to string_strings.c

That created two files:

ls out
string_strings.c string_strings.h

Below we modified the string_strings.c program so that we could paste it into this online code compiler. The CIA code is a subroutine, which means it cannot be run as a standalone program, so we wrapped it in a complete program in order to demonstrate it.

This code will be impossible to understand for people who do not know C++. But that does not matter. C and C++ are the languages hackers often use since they create executable programs (i.e., .dll, .exe, and .so files).

In the example below, it takes the input string “secret”, which we hard coded, and outputs these six bytes, shown here in hex: 89 0c 8d c9 0c 48


We cannot print the output value the CIA code calculated as text because it is not a displayable string. In other words, several of the resulting bytes fall outside the printable ASCII range (32 to 126).

The program is below. If we go back and look at the Python program the output was:

Using shift: 6, xor: 0x55

That translates into these two complicated looking lines of code:

str[i] = ((str[i] << 6) & (0xFF << 6)) | ((str[i] >> 2) & (0xFF >> 2));
str[i] = str[i]^0x55;

That code means take every letter in “secret” and perform bitwise operations on it. The first line uses the shift (<<, >>), AND (&) and OR (|) operators to rotate the eight bits of the character six places to the left, so the bits pushed off the left end wrap around to the right. The second line XORs the result with 0x55 (binary 01010101), which flips every other bit.
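To see what those two lines do to a single character, the sketch below records the bit pattern after each step. The names bits, Stages, and trace are ours and exist only for illustration.

```cpp
#include <bitset>
#include <string>

// Return the 8-bit binary text of a byte.
std::string bits(unsigned char b) { return std::bitset<8>(b).to_string(); }

// Bit patterns of one character before and after each step.
struct Stages {
    std::string input;
    std::string rotated;
    std::string xored;
};

Stages trace(unsigned char c) {
    // Step 1: rotate left by 6 (the two low bits wrap around to the top).
    unsigned char rotated = static_cast<unsigned char>((c << 6) | (c >> 2));
    // Step 2: XOR with 0x55 (01010101) flips every other bit.
    unsigned char xored = static_cast<unsigned char>(rotated ^ 0x55);
    return {bits(c), bits(rotated), bits(xored)};
}
```

For the letter 's' (0x73), trace gives 01110011 as the input, 11011100 after the rotation, and 10001001 (0x89) after the XOR.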


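#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::string str = "secret";

    // Scramble each byte: rotate left by 6 bits, then XOR with 0x55.
    for (std::size_t i = 0; i < str.length(); i++) {
        unsigned char c = static_cast<unsigned char>(str[i]);
        c = ((c << 6) & (0xFF << 6)) | ((c >> 2) & (0xFF >> 2));
        c = c ^ 0x55;
        str[i] = static_cast<char>(c);
    }

    // Format every scrambled byte as two hex digits.
    std::stringstream ss;
    for (std::size_t i = 0; i < str.length(); ++i) {
        ss << std::hex << std::setw(2) << std::setfill('0')
           << static_cast<int>(static_cast<unsigned char>(str[i]));
    }
    std::string sstr = ss.str();

    // Prints: obfuscated: 890c8dc90c48
    std::cout << "obfuscated: " << sstr << std::endl;

    return 0;
}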
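Because both steps are reversible, the original text can be recovered by undoing them in the opposite order: XOR with 0x55 again, then rotate the other way. The sketch below is our own illustration of that round trip using the shift and XOR values from the example run. The function names are hypothetical, and we have not taken deobfuscate from the leaked code.

```cpp
#include <string>

// Rotate an 8-bit value left by n bits (0 < n < 8).
unsigned char rotl8(unsigned char c, unsigned n) {
    return static_cast<unsigned char>((c << n) | (c >> (8 - n)));
}

// Same transformation as the example: rotate left by 6, then XOR with 0x55.
std::string obfuscate(std::string s) {
    for (char& ch : s) {
        unsigned char c = rotl8(static_cast<unsigned char>(ch), 6);
        ch = static_cast<char>(c ^ 0x55);
    }
    return s;
}

// Undo the steps in reverse: XOR with 0x55, then rotate left by 2,
// which is the same as rotating right by 6.
std::string deobfuscate(std::string s) {
    for (char& ch : s) {
        unsigned char c =
            static_cast<unsigned char>(static_cast<unsigned char>(ch) ^ 0x55);
        ch = static_cast<char>(rotl8(c, 2));
    }
    return s;
}
```

Calling deobfuscate(obfuscate("secret")) gives back "secret". This is why the scrambling hides text from a casual strings scan but does not stop an analyst who works out the shift and XOR values.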
