Authored by: Anandeshwar Unnikrishnan
Stage 1: GULoader Shellcode Deployment
In recent GULoader campaigns, we are seeing an increase in NSIS-based installers, delivered through email as malspam, that use plugin libraries to execute the GULoader shellcode on the victim system. The NSIS scriptable installer is a highly efficient software packaging utility. The installer behavior is dictated by an NSIS script, and users can extend the functionality of the packager by adding custom libraries (DLLs) known as NSIS plugins. Since its inception, adversaries have abused the utility to deliver malware.
NSIS stands for Nullsoft Scriptable Install System. NSIS installer files are self-contained archives, which lets malware authors include malicious assets along with junk data. The junk data is used as an anti-AV / AV evasion technique. The image below shows the structure of an NSIS GULoader staging executable archive.
The NSIS script, which is a file found in the archive, has the file extension “.nsi” as shown in the image above. The deployment strategy employed by the threat actor can be studied by analyzing the NSIS script commands provided in the script file. The image shown below is an oversimplified view of the whole shellcode staging process.
The file that holds the encoded GULoader shellcode is dropped onto the victim’s disk, along with other files, based on the script configuration. Junk is placed at the beginning of the file, ahead of the encoded shellcode. The encoding style varies from sample to sample, but in almost all cases it is a simple XOR encoding. Because the shellcode follows the junk data, an offset is used to retrieve the encoded GULoader shellcode. In the image, the FileSeek NSIS command is used to apply the proper offset. Some samples have unprotected GULoader shellcode appended to the junk data.
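For analysts, the offset-and-XOR recovery described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `extract_shellcode`, the repeating-key scheme, and all demo values are hypothetical, and real samples may use a different key length or encoding.

```python
from itertools import cycle


def extract_shellcode(blob: bytes, offset: int, key: bytes) -> bytes:
    """Skip the junk prefix and XOR-decode the rest with a repeating key.

    `offset` plays the role of the value given to FileSeek in the NSIS
    script; `key` is assumed to repeat over the data (an assumption, the
    exact scheme differs between samples).
    """
    if not key:
        raise ValueError("key must not be empty")
    encoded = blob[offset:]
    return bytes(b ^ k for b, k in zip(encoded, cycle(key)))


# Synthetic demonstration data, not taken from a real sample.
junk = b"\x00" * 16
plain = b"\x90\x90\xcc\xc3"
key = b"\x5a"
dropped_file = junk + bytes(b ^ key[0] for b in plain)

recovered = extract_shellcode(dropped_file, len(junk), key)
```

In practice the offset would be read from the decompiled NSIS script and the key recovered from the sample under analysis.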
A plugin used by the NSIS installer is simply a DLL that is loaded by the installer program at runtime, which then invokes functions exported by the library. Two DLL files are dropped in the user’s TEMP directory. In all analyzed samples, one DLL has the consistent name system.dll, while the name of the other one varies.
The system.dll is responsible for allocating memory for the shellcode and executing it. The following image shows how the NSIS script calls functions in plugin libraries.
The system.dll has the exports shown in the image below. The function named “Call” is used to deploy the shellcode on the victim’s system.
- The Call function exported by system.dll dynamically resolves the following functions and executes them to deploy the shellcode.
- CreateFile – To read the shellcode dumped onto disk by the installer. As part of installer setup, all the files seen in the installer archive earlier are dumped onto disk in a new directory created in the C: drive.
- VirtualAlloc – To hold the shellcode in RWX memory.
- SetFilePointer – To seek the exact position of the shellcode in the dumped file.
- ReadFile – To read the shellcode.
- EnumResourceTypesA – Execution via a callback mechanism. The second parameter is of the type ENUMRESTYPEPROCA, which is simply a pointer to a callback routine. The address where the shellcode is allocated in memory is passed as the second argument to this API, leading to execution of the shellcode. Callback function parameters are good resources for indirect execution of code.
Vectored Exception Handling in GULoader
The implementation of exception handling by the operating system provides an opportunity for the adversary to take over execution flow. Vectored Exception Handling on Windows lets the user register a custom exception handler, which is simply code logic that gets executed in the event of an exception. The interesting thing about handling exceptions is the way in which the system resumes the normal execution flow of the program after the exception. Adversaries exploit this mechanism to take ownership of the execution flow: malware can divert the flow to code under its control when the exception occurs. Generally, malware employs it to achieve the following goals:
- Hooking
- Covert code execution and anti-analysis
GULoader employs VEH mainly to obfuscate the execution flow and to slow down analysis. This section covers the internals of Vectored Exception Handling on Windows and investigates how GULoader abuses the VEH mechanism to thwart analysis efforts.
- Vectored Exception Handling (VEH) is an extension of Structured Exception Handling (SEH) with which we can add a vectored exception handler that will be called regardless of our position in a call frame. Simply put, VEH is not frame-based.
- VEH is abused by malware either to manipulate the control flow or to covertly execute user functions.
- Windows provides the AddVectoredExceptionHandler Win32 API to add custom exception handlers. The function signature is shown below.
The handler routine is of the type PVECTORED_EXCEPTION_HANDLER. Checking the documentation further, we can see that the handler function takes a pointer to the _EXCEPTION_POINTERS type as its input, as shown in the image below.
The _EXCEPTION_POINTERS type holds two important structures: PEXCEPTION_RECORD and PCONTEXT. PEXCEPTION_RECORD contains all the information related to the exception raised by the system, such as the exception code, and the PCONTEXT structure maintains the CPU register values (such as RIP/EIP and the debug registers), that is, the state of the thread captured when the exception occurred.
- This means the exception handler can access both ExceptionRecord and ContextRecord. From within the handler, one can tamper with the data stored in the ContextRecord, thus manipulating EIP/RIP to control the execution flow when the user application resumes from exception handling.
- One interesting thing about exception handling is that execution is given back to the application via the NtContinue native routine. Exception dispatch routines call the handler, and when the handler returns to the dispatcher, it passes the ContextRecord to NtContinue and execution is resumed from the EIP/RIP in the record. On a side note, this is an oversimplified explanation of the whole exception handling process.
Vectored Handler in GULoader
- GULoader registers a vectored exception handler via the RtlAddVectoredExceptionHandler native routine. The image below shows the control flow of the handler code. Interestingly, most of the code blocks present here are junk added to thwart analysis efforts.
- GULoader’s handler implementation is as follows (disregarding the junk code).
- Reads the ExceptionInfo passed to the handler by the system.
- Reads the ExceptionCode from the ExceptionRecord structure.
- Checks the value of the ExceptionCode field against the computed exception codes for STATUS_ACCESS_VIOLATION, STATUS_BREAKPOINT and STATUS_SINGLESTEP.
- Based on the exception code, the malware takes a branch and executes code that modifies the EIP.
GULoader sets the trap flag to trigger single stepping intentionally in order to detect analysis. The handler code gets executed as discussed before, and a block of code is executed based on the exception code. If the exception is a single-step exception (status code 0x80000004), the following actions take place:
- GULoader reads the ContextRecord and retrieves the EIP value of the thread.
- Increments the current EIP by 2 and reads one byte from there.
- Performs an XOR on the byte fetched in the previous step and a static value. The static value changes between samples. In our sample, the value is 0x1A.
- The XORed value is then added to the EIP fetched from the ContextRecord.
- Finally, the modified EIP value from the prior step is saved in the ContextRecord and control is returned to the system (dispatcher).
- The malware has the same logic for the access violation exception.
- When the shellcode is executed without a debugger, the INT3 instruction invokes the vectored exception handler routine with an EXCEPTION_BREAKPOINT exception. The handler computes the EIP by incrementing the EIP by 1, fetching the data from the incremented location, and XORing the fetched data with a constant, in our case 0x1A. The result is added to the current EIP value. The logic implemented for handling INT3 exceptions also scans the program code for 0xCC instructions placed by researchers. If such 0xCC bytes are found, the EIP is not calculated properly.
EIP Calculation Logic Summary
Trigger via interrupt instruction (INT3) | eip=((ReadByte(eip+1)^0x1A)+eip) |
Trigger via single stepping (PUSHFD/POPFD) | eip=((ReadByte(eip+2)^0x1A)+eip) |
*The value 0x1A changes between samples
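The two formulas in the summary can be emulated offline when following the shellcode's control flow in a disassembler. The following is a minimal sketch, assuming the shellcode bytes are available as a `bytes` object loaded at a known base address; the function name `next_eip`, the trigger labels, and the demo bytes are hypothetical.

```python
XOR_KEY = 0x1A  # value seen in the analyzed sample; differs between samples

# Offset of the encoded displacement byte relative to the faulting EIP.
DISPLACEMENT_OFFSET = {
    "breakpoint": 1,   # INT3
    "single_step": 2,  # trap flag set via PUSHFD/POPFD
}


def next_eip(code: bytes, base: int, eip: int, trigger: str,
             key: int = XOR_KEY) -> int:
    """Return the EIP the handler would store in the ContextRecord."""
    offset = DISPLACEMENT_OFFSET[trigger]
    encoded = code[eip - base + offset]      # ReadByte(eip + offset)
    return (eip + (encoded ^ key)) & 0xFFFFFFFF


# Synthetic bytes: 0xCC followed by an encoded displacement of 0x1F.
demo = bytes([0xCC, 0x1F, 0x12, 0x00, 0x00, 0x90])
after_int3 = next_eip(demo, 0x1000, 0x1000, "breakpoint")   # 0x1F ^ 0x1A = 5
after_step = next_eip(demo, 0x1000, 0x1000, "single_step")  # 0x12 ^ 0x1A = 8
```

Because the displacement is a single byte, each exception can only move execution forward by at most 255 bytes under this model.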
Detecting Abnormal Execution Flow via VEH
- The shellcode is structured in such a way that the malware can detect abnormal execution flow from the order in which exceptions occur at runtime. The pushfd/popfd instructions are followed by code that, when executed, throws STATUS_ACCESS_VIOLATION. When the program is executed normally, execution does not reach the code that follows the pushfd/popfd instruction block, so only STATUS_SINGLESTEP is raised. When the pushfd/popfd block is accidentally stepped over in a debugger, STATUS_SINGLESTEP is not delivered to the program, because the debugger, which is already single stepping through the code, suppresses it. The handler logic detects this when execution reaches the code that follows the pushfd/popfd instruction block, which throws a STATUS_ACCESS_VIOLATION. The program now runs into a nested exception situation (the access violation following the suppressed single-step exception raised via the trap flag). Because of this, whenever an access violation occurs, the handler routine checks for nested exception information in the _EXCEPTION_POINTERS structure discussed at the beginning.
The image below shows the carefully laid out code used to detect analysis.
The Egg Hunting: VEH Assisted Runtime Padding
One interesting feature seen in GULoader shellcode in the wild is runtime padding. Runtime padding is an evasive behavior used to defeat automated scanners and other security checks employed at runtime. It delays the malicious activities performed by the malware on the target system.
- The egg value in the analyzed sample is 0xAE74B61.
- The loader initiates a search for this value in the data segment of its own shellcode.
- Keep in mind that this is implemented via the VEH handler. The search alone adds about 0.3 million VEH iterations on top of the regular VEH control flow manipulation employed in the code.
- The loader ends the search when it retrieves the address of the egg value. To make sure the value has not been manipulated in any way by a researcher, it performs two additional checks to validate the egg location.
- If the check fails, the search continues. The process of retrieving the location of the egg is shown in the image below.
- As mentioned above, the validity of the egg location is checked by retrieving byte values from two offsets: one is 4 bytes away from the egg location, where the value is 0xB8, and the other is 9 bytes away from the egg location, where the value is 0xC3. This check must pass for the loader to proceed to the next stage of infection. Core malware activities are performed after this runtime padding loop.
The following images show the egg location validity checks performed by GULoader. The values 0xB8 and 0xC3 are checked by using the proper offsets from the egg location.
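The search and its two validity checks can be reproduced on a dumped shellcode buffer to locate the same position statically. This is a minimal sketch, assuming the egg is stored as a little-endian DWORD and that both offsets are measured from the first byte of the egg; `find_egg` and the demo buffer are hypothetical, and the sketch performs a plain linear scan rather than the VEH-driven loop used by the loader.

```python
import struct

EGG = 0x0AE74B61  # egg value reported for the analyzed sample


def find_egg(data: bytes, egg: int = EGG) -> int:
    """Return the offset of the first validated egg, or -1 if none is found."""
    needle = struct.pack("<I", egg)  # assumption: little-endian DWORD
    pos = data.find(needle)
    while pos != -1:
        in_bounds = pos + 9 < len(data)
        # Validity checks: 0xB8 at +4 and 0xC3 at +9 from the egg location.
        if in_bounds and data[pos + 4] == 0xB8 and data[pos + 9] == 0xC3:
            return pos
        pos = data.find(needle, pos + 1)  # check failed: keep searching
    return -1


# Synthetic buffer: a decoy egg that fails validation, then a valid one.
needle = struct.pack("<I", EGG)
decoy = needle + b"\x00" * 6
valid = needle + b"\xB8" + b"\x11\x22\x33\x44" + b"\xC3"
buffer = b"\x90" * 8 + decoy + b"\x90" * 4 + valid

egg_offset = find_egg(buffer)
```

The decoy at offset 8 is rejected because the byte at +4 is not 0xB8, so the scan continues to the valid egg.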
Stage 2: Environment Check and Code Injection
In the second stage of the infection chain, GULoader performs anti-analysis and code injection. The major anti-analysis vectors are listed below. After making sure that the shellcode is not running in a sandbox, it proceeds to inject code into a newly spawned process, where stage 3 is initiated to download and deploy the actual payload. This payload can be either a commodity stealer or a RAT.
Anti-analysis Techniques
- Employs runtime padding as discussed before.
- Scans the whole process memory for analysis-tool-specific strings.
- Uses DJB2 hashing for string checks and dynamic API address resolution.
- Strings are decoded at runtime.
- Checks whether QEMU is installed on the system by checking the installation path:
- C:\Program Files\qqa\qqa.exe
- Patches the following APIs:
- DbgUIRemoteBreakIn
- The function’s prologue is patched with an ExitProcess call.
- LdrLoadDll
- The initial bytes are patched with the instruction “mov edi, edi”.
- DbgBreakPoint
- Patched with a nop instruction.
- Clears hooks placed in ntdll.dll by security products or by researchers for analysis.
- Window enumeration via EnumWindows
- Hides the shellcode thread from the debugger via ZwSetInformationThread by passing 0x11 (ThreadHideFromDebugger)
- Device driver enumeration via EnumDeviceDrivers and GetDeviceDriverBaseNameA
- Installed software enumeration via MsiEnumProductsA and MsiGetProductInfoA
- System service enumeration via OpenSCManagerA and EnumServiceStatusA
- Checks for the use of debugging ports by passing the ProcessDebugPort (0x7) class to NtQueryInformationProcess
- Uses the CPUID and RDTSC instructions to detect virtual environments and instrumentation.
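Since the list above mentions DJB2 hashing for string checks and API resolution, the classic form of that hash is sketched here as a reference for building lookup tables of known API or tool names. This is only the textbook DJB2 (seed 5381, multiply by 33, 32-bit wrap); it is an assumption that a given sample uses it unmodified, and the helper `resolve_by_hash` is hypothetical.

```python
from typing import Iterable, Optional


def djb2(data: bytes) -> int:
    """Classic DJB2: h = h * 33 + byte, starting from 5381, kept to 32 bits."""
    h = 5381
    for byte in data:
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(target: int, names: Iterable[str]) -> Optional[str]:
    """Return the first candidate name whose hash equals `target`."""
    for name in names:
        if djb2(name.encode("ascii")) == target:
            return name
    return None


# Usage: recover a name from a hash constant seen in the disassembly.
candidates = ["Sleep", "EnumWindows", "NtResumeThread"]
wanted = djb2(b"EnumWindows")  # stands in for a constant from the shellcode
match = resolve_by_hash(wanted, candidates)
```

If the hashes computed this way do not match the constants in a sample, the seed or the mixing step has probably been altered and should be read from the hashing routine itself.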
Anti-dump Protection
Whenever GULoader invokes a Win32 API, the call is sandwiched between two XOR loops, as shown in the image below. The loop prior to the call encodes the active shellcode region where the call is taking place, to prevent the memory from being dumped by security products based on event monitoring or API calls. Following the call, the shellcode region is decoded back to normal and execution resumes. The XOR key used is a word present in the shellcode itself.
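The effect of the two XOR loops can be illustrated with a small simulation: a memory snapshot taken during the API call sees only encoded bytes, and the region is restored afterwards because XOR with the same key is its own inverse. This is a minimal model, not the loader's code; the names `xor_in_place` and `masked`, the key, and the region contents are hypothetical.

```python
from contextlib import contextmanager
from itertools import cycle


def xor_in_place(region: bytearray, key: bytes) -> None:
    """XOR every byte of `region` with a repeating key."""
    for i, k in zip(range(len(region)), cycle(key)):
        region[i] ^= k


@contextmanager
def masked(region: bytearray, key: bytes):
    """Encode the region for the duration of a call, then decode it again."""
    xor_in_place(region, key)      # loop before the API call
    try:
        yield
    finally:
        xor_in_place(region, key)  # loop after the API call


original = b"\x55\x8b\xec\xc3"     # synthetic stand-in for shellcode bytes
region = bytearray(original)
key = b"\xde\xad"                  # hypothetical key

with masked(region, key):
    snapshot = bytes(region)       # what a dump taken during the call sees
```

For analysis, this implies that memory dumps triggered on API calls need to be XORed with the sample's key before they are useful.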
String Decoding
This section covers the process undertaken by GULoader to decode strings at runtime.
- NtAllocateVirtualMemory is called to allocate a buffer to hold the encoded bytes.
- The encoded bytes are computed by performing various arithmetic and logical operations on static values embedded as operands of assembly instructions. The image below shows the recovery of encoded bytes via various mathematical and logical operations. EAX points to the memory buffer where the computed encoded values are stored.
The first byte/word is reserved to hold the size of the encoded bytes. The image below shows 12 bytes of encoded data being written to memory.
Later, the first word is replaced by the first word of the actual encoded data. The image below shows the buffer after the first word has been replaced.
The encoded data is now fully recovered, and the malware proceeds to decode it. A simple XOR is employed for decoding, and the key is present in the shellcode. The assembly routine that does the decoding is shown in the image below. Each byte in the buffer is XORed with the key.
The result of the XOR operation is written to the same memory buffer that holds the encoded data. A final view of the memory buffer with the decoded data is shown below.
The image shows the decoding of the string “psapi.dll”. This string is later used in fetching the addresses of various functions used for anti-analysis.
Stage 2 culminates in code injection. To be specific, GULoader employs a variation of the process hollowing technique, in which a benign process is spawned in a suspended state by the malware stager process, which then overwrites the original content present in the suspended process with malicious content. Later, the state of the thread in the suspended process is changed by modifying processor register values such as EIP, and finally the process resumes its execution. By controlling EIP, the malware can direct the control flow in the spawned process to a desired code location. After a successful hollowing, the malware code runs under the cover of a legitimate application.
The variation of the hollowing technique employed by GULoader does not replace the file contents, but instead injects the same shellcode and maps the memory in the suspended process. Interestingly, GULoader employs an additional technique if the hollowing attempt fails. More details are covered in the following section.
The Win32 native APIs listed below are dynamically resolved at runtime to perform the code injection.
- NtCreateSection
- ZwMapViewOfSection
- NtWriteVirtualMemory
- ZwGetContextThread
- NtSetContextThread
- NtResumeThread
Overview of Code Injection
- Initially, the image “%windir%\Microsoft.NET\Framework\version on 32-bit systems\<version>\CasPol.exe” is spawned in suspended mode via the CreateProcessInternalW native API.
- GULoader retrieves a handle to the file “C:\Windows\SysWOW64\iertutil.dll”, which is used in section creation. The section object created via NtCreateSection will be backed by iertutil.dll.
- This behavior is mainly to avoid suspicion: a section object that is not backed by any file may draw unwanted attention from security systems.
- The next phase in the code injection is mapping the view created on the section backed by iertutil.dll into the spawned CasPol.exe process. Once the view is successfully mapped to the process, the malware can inject the shellcode into the mapped memory and resume the process, thus initiating stage 3. The native API ZwMapViewOfSection is used to perform this task. Following the execution of this API, the malware checks the result of the function call against the error statuses listed below.
- C0000018 (STATUS_CONFLICTING_ADDRESSES)
- C0000220 (STATUS_MAPPED_ALIGNMENT)
- 40000003 (STATUS_IMAGE_NOT_AT_BASE)
- If the mapping is unsuccessful and the status code returned by ZwMapViewOfSection matches any of the codes mentioned above, the malware has a backup plan.
- GULoader calls NtAllocateVirtualMemory by directly invoking the system call stub, which is normally found in the ntdll.dll library, to bypass EDR/AV hooks. The memory is allocated in the remote CasPol.exe process with RWX memory protection. The following image shows the direct use of the NtAllocateVirtualMemory system call.
After memory allocation, it writes itself into the remote process via NtWriteVirtualMemory, as discussed above. GULoader shellcodes taken from the field are bigger in size; the samples taken for this analysis are all greater than 20 MB. In the samples analyzed, the buffer size allocated to hold the shellcode is 2950000 bytes. The image below shows the GULoader shellcode in memory.
Misleading Entry Point
- GULoader is highly evasive in nature. If abnormal execution flow is detected with the help of the employed anti-analysis vectors, the EIP and EBX fields of the thread context structure (of the CasPol.exe process), which are required for stage 3 of malware execution, will be overwritten with a decoy address. The location ebp+4 is used to hold the entry point regardless of whether the program is being debugged or not.
- GULoader uses the ZwGetContextThread and NtSetContextThread routines to modify the thread state. The CONTEXT structure is retrieved via ZwGetContextThread, and the value [ebp+14C] is used as the entry point address. The current EIP value held in the EIP field of the thread’s context structure will be changed to a recalculated address based on the value at ebp+4. The image below shows the RVA calculation. The base address of the executing shellcode (stage 2) is subtracted from the virtual address [ebp+4] to obtain the RVA.
The RVA is added to the base address of the newly allocated memory in the CasPol.exe process to obtain the new VA, which can be used in the remote process. The new VA is written into the EIP and EBX fields in the thread context structure of the CasPol.exe process retrieved via ZwGetContextThread. The image below shows the modified context structure and the value of EIP.
Finally, by calling ZwSetContextThread, the changes made to the CONTEXT structure are committed in the target thread of the CasPol.exe process. The thread is resumed by calling NtResumeThread. CasPol.exe resumes execution and performs stage 3 of the infection chain.
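The address translation described above is plain base-relative arithmetic and can be checked by hand when correlating addresses between the stage 2 process and CasPol.exe in a debugger. A minimal sketch follows; the function name `rebase_entry_point` and all addresses are hypothetical, and the dictionary merely stands in for the EIP and EBX fields of the CONTEXT structure.

```python
def rebase_entry_point(entry_va: int, local_base: int, remote_base: int) -> int:
    """Translate an entry point from the stage 2 mapping to the remote mapping.

    RVA    = entry_va - local_base
    new VA = remote_base + RVA
    """
    rva = entry_va - local_base
    if rva < 0:
        raise ValueError("entry point lies below the local base address")
    return (remote_base + rva) & 0xFFFFFFFF


# Hypothetical addresses, chosen only for illustration.
local_base = 0x02A50000    # base of the executing stage 2 shellcode
entry_va = 0x02A51234      # entry point address held by the loader
remote_base = 0x00C00000   # base of the memory allocated in CasPol.exe

new_va = rebase_entry_point(entry_va, local_base, remote_base)
context = {"Eip": new_va, "Ebx": new_va}  # both fields receive the new VA
```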
Stage 3: Payload Deployment
The GULoader shellcode resumes execution from within a new host process. In this report, the analyzed samples inject the shellcode either into the same process, spawned as a child process, or into CasPol.exe. Stage 3 performs all the anti-analysis checks once again to make sure this stage is not being analyzed. After all checks, GULoader proceeds to perform stage 3 activities by decoding the encoded C2 string in memory, as shown in the image below. The decoding method is the same as discussed before.
Later, the addresses of the following functions are resolved dynamically by loading wininet.dll:
- InternetSetOptionA
- InternetOpenUrlA
- InternetReadFile
- InternetCloseHandle
The image below shows the response from the content delivery network (CDN) server where the final payload is stored. In this analysis, a payload of size 0x2E640 bytes is sent to the loader. Interestingly, the first 40 bytes are ignored by the loader. The actual payload starts at offset 40, which is highlighted in the image.
The CDN server is well protected: it only serves clients that send the proper headers and cookies. If these are not present in the HTTP request, the following message is shown to the user.
Final Payload
Quasi Key Generation
The first step in decoding the downloaded final payload is generating a quasi key, which is later used to decode the actual key embedded in the GULoader shellcode. The encoded embedded key is 371 bytes in size in the analyzed sample. The process of quasi key generation is as follows:
- The 40th and 41st bytes (a word) are retrieved from the download buffer in memory.
- This word is XORed with the first word of the encoded embedded key along with a counter value.
- The process is repeated until the word taken from the downloaded data fully decodes to the value 0x4D5A (“MZ”).
- The value present in the counter when 0x4D5A is decoded is taken as the quasi key. This key is shown as “key-1” in the image below. In the analyzed sample, the value of this key is 0x5448.
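The counter search described in the steps above can be reproduced offline once the download buffer and the encoded embedded key have been extracted. This is a minimal sketch with several assumptions: words are read as little-endian (so the bytes “MZ” compare as the integer 0x5A4D), the counter is 16 bits wide, and the name `find_quasi_key` and all demo data are hypothetical. Under a different byte-order convention the resulting key value would appear byte-swapped.

```python
import struct

MZ = struct.unpack("<H", b"MZ")[0]  # bytes 4D 5A read as a little-endian word


def find_quasi_key(download: bytes, encoded_key: bytes,
                   payload_offset: int = 40) -> int:
    """Search for the counter that makes the first payload word decode to MZ."""
    dl_word = struct.unpack_from("<H", download, payload_offset)[0]
    key_word = struct.unpack_from("<H", encoded_key, 0)[0]
    for counter in range(0x10000):
        if dl_word ^ key_word ^ counter == MZ:
            return counter
    raise AssertionError("unreachable: a 16-bit solution always exists")


# Synthetic data built so that the expected quasi key is 0x5448.
true_key_word = 0x2010                                   # hypothetical
demo_encoded_key = struct.pack("<H", true_key_word ^ 0x5448) + b"\x00"
demo_download = b"\x00" * 40 + struct.pack("<H", MZ ^ true_key_word)

quasi_key = find_quasi_key(demo_download, demo_encoded_key)
```

Because XOR is invertible, the same value can also be computed directly as `dl_word ^ key_word ^ MZ`; the loop is kept only to mirror the behavior described in the list.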
Decoding Actual Key
The embedded key in the GULoader shellcode is 371 bytes in size, as discussed before. The quasi key is used to decode the embedded key, as shown in the image below.
- Each word in the embedded key is XORed with the quasi key, key-1.
- When the iteration counter exceeds the size value of 371 bytes, the loop stops and the loader proceeds to decode the downloaded payload with this new key.
The decoded 371 bytes of the embedded key are shown in the image below.
Decoding File
Byte-level decoding happens after the embedded key is decoded in memory. Each byte of the downloaded data is XORed with the key to obtain the actual data, which is a PE file. The decoded data is written over the same buffer that was used to download the data.
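The two decoding passes (embedded key first, then the payload) can be sketched as follows for offline payload recovery. The sketch assumes that the quasi key is applied as a little-endian word, that an odd trailing key byte is XORed with the low byte of the quasi key, and that the decoded key repeats cyclically over the payload starting at offset 40; these details, the function names, and the demo data are assumptions rather than confirmed behavior.

```python
from itertools import cycle


def xor_words(data: bytes, word_key: int) -> bytes:
    """XOR each 16-bit word of `data` with `word_key` (little-endian)."""
    key_bytes = word_key.to_bytes(2, "little")
    return bytes(b ^ k for b, k in zip(data, cycle(key_bytes)))


def decode_payload(download: bytes, encoded_key: bytes, quasi_key: int,
                   payload_offset: int = 40) -> bytes:
    """Decode the embedded key, then the payload that follows the header."""
    key = xor_words(encoded_key, quasi_key)
    body = download[payload_offset:]          # the first 40 bytes are ignored
    return bytes(b ^ k for b, k in zip(body, cycle(key)))


# Synthetic round trip with a short odd-length key (371 is also odd).
demo_quasi = 0x5448
demo_key = bytes(range(1, 8))                       # hypothetical decoded key
demo_embedded = xor_words(demo_key, demo_quasi)     # as stored in shellcode
demo_pe = b"MZ\x90\x00" + b"synthetic PE body"
demo_blob = b"\xAA" * 40 + bytes(
    b ^ k for b, k in zip(demo_pe, cycle(demo_key)))

decoded = decode_payload(demo_blob, demo_embedded, demo_quasi)
```

A quick sanity check on real data is that the decoded buffer starts with “MZ” and parses as a PE file.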
The final decoded PE file residing in memory is shown in the image below:
Finally, the loader loads the PE file by allocating memory with RWX permission in the stage 3 process. Based on the analysis of multiple samples, this is either the same process as in stage 2, running as a child process, or CasPol.exe. The loading involves code relocation and IAT correction, as expected in such a scenario. The final payload resumes execution from within the hollowed stage 3 process. The malware families below are commonly seen deployed by GULoader:
- Vidar (Stealer)
- Raccoon (Stealer)
- Remcos RAT
The image below shows the injected memory regions in the stage 3 process, CasPol.exe, in this report.
Conclusion
The role played by malware loaders, popularly known as “crypters”, is significant in the deployment of Remote Administration Tools and stealer malware that target consumer data. The Personally Identifiable Information (PII) exfiltrated from compromised endpoints is largely collected and funneled to various underground data-selling marketplaces. This also impacts businesses, as various critical information used for authentication purposes is leaked from users’ personal systems, leading to initial access to company networks. GULoader is heavily used in mass malware campaigns to infect users with popular stealer malware such as Raccoon, Vidar, and Redline. Commodity RATs like Remcos are also delivered in such campaign activities. On the bright side, it is not difficult to fingerprint malware specimens used in mass campaigns because of their volume and relevance; detection rules and systems can be built around this fact.
The following table summarizes all the dynamically resolved Win32 APIs:
Win32 API |
RtlAddVectoredExceptionHandler |
NtAllocateVirtualMemory |
DbgUIRemoteBreakIn |
LdrLoadDll |
DbgBreakPoint |
EnumWindows |
Nt/ZwSetInformationThread |
EnumDeviceDrivers |
GetDeviceDriverBaseNameA |
MsiEnumProductsA |
MsiGetProductInfoA |
TerminateProcess |
ExitProcess |
NtSetContextThread |
NtWriteVirtualMemory |
NtCreateSection |
NtMapViewOfSection |
NtOpenFile |
NtSetInformationProcess |
NtClose |
NtResumeThread |
NtProtectVirtualMemory |
CreateProcessInternal |
GetLongPathNameW |
Sleep |
NtCreateThreadEx |
WaitForSingleObject |
TerminateThread |
CreateFileW |
WriteFile |
CloseHandle |
GetFileSize |
ReadFile |
ShellExecuteW |
SHCreateDirectoryExW |
RegCreateKeyExA |
RegSetValueExA |
OpenSCManagerA |
EnumServiceStatusA |
CloseServiceHandle |
NtQueryInformationProcess |
InternetOpenA |
InternetSetOptionA |
InternetOpenUrlA |
InternetReadFile |
InternetCloseHandle |
IOC
889fddcb57ed66c63b0b16f2be2dbd7ec0252031cad3b15dfea5411ac245ef56
59b71cb2c5a14186a5069d7935ebe28486f49b7961bddac0a818a021373a44a3
4d9cdd7526f05343fda35aca3e0e6939abed8a037a0a871ce9ccd0e69a3741f2
c8006013fc6a90d635f394c91637eae12706f58897a6489d40e663f46996c664
c69e558e5526feeb00ab90efe764fb0b93b3a09692659d1a57c652da81f1d123
45156ac4b40b7537f4e003d9f925746b848a939b2362753f6edbcc794ea8b36a
e68ce815ac0211303d2c38ccbb5ccead144909d295230df4b7a419dfdea12782
b24b36641fef3acbf3b643967d408b10bf8abfe1fe1f99d704a9a19f1dfc77e8
569aa6697083993d9c387426b827414a7ed225a3dd2e1e3eba1b49667573fdcb
60de2308ebfeadadc3e401300172013be27af5b7d816c49696bb3dedc208c54e
23458977440cccb8ac7d0d05c238d087d90f5bf1c42157fb3a161d41b741c39d