Everyone knows that encryption/decryption is very important when writing malware. That can be used for everything, starting from obfuscating strings and shellcodes and ending with encrypting traffic between malware and the C2 server in order to avoid detection.
What is Hill Cipher?
The Hill cipher is a very interesting algorithm that is based on linear algebra and uses matrices to encrypt/decrypt words. It represents characters with numbers and uses those numbers to do calculations with matrices. Let me explain it more easily. Let’s say there is a string that needs to be obfuscated: VirtualProtect. This is the name of a function that is part of WinAPI that changes protection on allocated memory. The thing is that AVs often check if there is VirtualProtect function call after VirtualAlloc. Mostly, when malware developers use those functions, they do this:
They allocate enough memory for a shellcode with VirtualAlloc and set permissions PAGE_READWRITE. This permission tells the CPU that it can write/read data in that memory region, BUT it cannot execute instructions from that memory region.
The attacker writes malicious shellcode into that memory region.
The attacker calls VirtualProtect to change the permission of that memory region, which often is PAGE_EXECUTE_READWRITE. Finally, malware creates a new thread that executes written shellcode.
This behavior is often flagged as malicious (because mostly, it is!) by AV engines, and the process gets killed and the malware deleted.
I already talked about function name obfuscation and loading functions dynamically here. That is a very good recommendation, but it has one problem: those strings are unencrypted, and even static analysis can detect them. That’s where the Hill Cipher comes in. It can be used for hiding those function names.
The Hill Cipher works this way: it takes a matrix as a key and a vector as data. It does multiplication on them, and the output vector is encrypted data. If we want to decrypt already encrypted data, we will have to have the inverted matrix of the key matrix in order to get the real value. The process is the same.
Implementing the key
First of all, I have to write a key. I’m going to use a class for that:
This is the most simple implementation of the key class. Of course, I can’t do anything here yet, but it can store 3x3 data AND it has a constructor that fills every cell of the matrix with 0.
I’ll need to add two functions: get_value and set_value. The first function will take two parameters: row and column in order to find the exact place of the number that I’ll be looking for. On the other hand, the second function will take three parameters: row, column and value. With these parameters I can tell the key where I want to store a number and what number I want to store:
Ok, time to write get_value and set_value functions. BTW, there is going to be a difference with parameters. That’s because there is only one dimension, so I’m just going to use column:
In the Hill Cipher, characters are represented by numbers. A is 0 and Z is 25. Before I add an actual character encoding, I’ll need to make little changes in the Key and Vector classes. In set_value function to be exact. I’m going to check value and if it’s not going to be [0-25], then I’m not going to write anything:
Ok, classes are fine. Time to add character encoding. First of all, I’m going to create a class which will hold enum and two functions: encode_character and decode_character. The first function will receive char value and convert it to [0-25] unsigned integer format. On the other hand, the second function will receive a number that is going to be [0-25] unsigned integer format and output a character:
classCharacterEncoder { enumCharacters : int { A = 0, B = 1, C = 2, D = 3, E = 4, F = 5, G = 6, H = 7, I = 8, J = 9, K = 10, L = 11, M = 12, N = 13, O = 14, P = 15, Q = 16, R = 17, S = 18, T = 19, U = 20, V = 21, W = 22, X = 23, Y = 24, Z = 25 };
public: CharacterEncoder() {
}
unsignedintencode_character(char character){ int index = character - 'A';
if (index < 0 || index > 25) { return0; }
Characters number = static_cast<Characters>(index);
return number; }
chardecode_character(unsignedint number){ if (number < 0 || number > 25) { return'#'; }
char ch = static_cast<char>(number + 'A');
return ch; } };
BTW, I return # symbol in decode_character if an error happens because the function HAS to return char value, so # is going to be a symbol of error and TERROR!!!
Alright. In the same class, I’ll write a function that encrypts and decrypts data. Here is encrypt_vector vector function:
There is one problem. I need to implement the calculation of the inverse matrix of the key. First of all, I’ll have to check if the key matrix is invertible, and if it is, then I can calculate the inverse matrix. I’m going to add verify_key function into Key class, that’ll verify that the key is valid(Inversible matrix). It turns out I need several functions to make this work. First of all, I need mod26 function to calculate modulo. I also need functions for calculating the determinant and calculating if inversing is possible, and finally, I’ll need a function that will inverse the key matrix:
Ok. This is Key class now. My head almost exploded three times when I was writing those calculations (finally, my mistakes were fixed by ChatGPT). I’m not a math professor, ALRIGHT???
I think the code is done here. I also have encrypt_vector and decrypt_vector functions ready. All the malware has to do now is to iterate over data and load characters slowly into a vector to do encryption or decryption. The final code is 190 lines long:
classCharacterEncoder { enumCharacters : int { A = 0, B = 1, C = 2, D = 3, E = 4, F = 5, G = 6, H = 7, I = 8, J = 9, K = 10, L = 11, M = 12, N = 13, O = 14, P = 15, Q = 16, R = 17, S = 18, T = 19, U = 20, V = 21, W = 22, X = 23, Y = 24, Z = 25 };
public: CharacterEncoder() {
}
unsignedintencode_character(char character){ int index = character - 'A';
if (index < 0 || index > 25) { return0; }
Characters number = static_cast<Characters>(index);
return number; }
chardecode_character(unsignedint number){ if (number < 0 || number > 25) { return'#'; }
Well, cryptography is fun. You can make up some really weird shit to obfuscate data. You might think, “Toko, could we use more simple encryption algos to encrypt the data?” Well, yes, but actually no. Complex mechanisms like this actually help malware to avoid AV engines because complex tasks take time and INSTRUCTIONS!!! To be honest, this will make malware more undetectable for AV engines than the simple XOR function. That’s because there is more non-violent code written, and MORE code means MORE time doing non-violent things. AV engines love that…