[ Technical Teardown: Exploit & Malware in .HWP files ]

This article shows analysts how to analyse malicious JavaScript code within HWP files, and walks through the analysis of .HWP files that were used to deliver malware.

[ 1st Sample used in the analysis ]
MD5: 8EB5A3F38EB3DE734037AA463ADE7665
SHA256: D0361ADB36E81B038C752EA1A7BDC5517B1E44F82909BC2BD27B77B2652667EE
At the time of writing, the detection rate for this sample on VirusTotal (VT) is 12/54.

[ Part 1 : Understanding OLE compound file ]
We first need to understand how OLE compound files work.
Inside an OLE compound file there are folders (storages) and files (streams). We will use SSViewer (http://www.mitec.cz/ssv.html) to take a look at the interior of the malicious .hwp file.
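
Before opening a file in a viewer, a quick sanity check is to confirm that it really is an OLE compound file. The sketch below is a minimal illustration in Python, not part of the tooling used in this article. The function names are made up for this example, and it only checks the 8-byte signature at the start of the file; it does not parse storages or streams.

```python
# Signature found in the first 8 bytes of an OLE compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole(data: bytes) -> bool:
    """Return True if the buffer starts with the OLE compound file signature."""
    return data[:8] == OLE_MAGIC


def is_ole_file(path: str) -> bool:
    """Read only the first 8 bytes of a file on disk and check the signature."""
    with open(path, "rb") as fh:
        return looks_like_ole(fh.read(8))
```

A file that fails this check is not worth loading into a structured storage viewer in the first place.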

Most of the streams in .hwp files are “zlib compressed”. We can see from the image below that the structure in .HWP files differs from .doc files.

However, today we are going to focus on “DefaultJScript”. Why that one? Well, think of “DefaultJScript” as the equivalent of VBA in Office documents.



[ Part 2 : Getting Started ]
For those who want to follow along, do note that this is a MALICIOUS file, so please do the analysis in a “safe” environment.

Now, let’s start getting our hands dirty…and open the suspicious .hwp file with Cerbero Profiler.

As we can see from the image below, the data within “DefaultJScript” looks like gibberish. So how do we make sense of it?


As I’ve mentioned earlier, most of the streams within .hwp files are “zlib compressed”.
So let’s “Select All” within the “DefaultJScript” stream and press “Ctrl+T”.

Now let’s add the “Unpack Zlib” filter, remembering to check the “Raw” checkbox, as shown in the image below.


Then let’s press “Preview” and have a look.


After decompressing the raw bytes, we can start to see some readable words, but the text seems to be in Unicode (UTF-16), with a “00” byte after each character.
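
The decompression step can also be reproduced outside the GUI. The following is a minimal Python sketch, assuming that the “Raw” option means the stream is headerless DEFLATE data (no zlib header or checksum) and that the stream bytes have already been exported from the .hwp file. The function name is hypothetical.

```python
import zlib


def inflate_raw(data: bytes) -> bytes:
    """Decompress a raw DEFLATE stream (no zlib header, no checksum)."""
    # A negative wbits value tells zlib not to expect a header or trailer.
    return zlib.decompress(data, -15)
```

If the stream did carry a normal zlib header, a plain `zlib.decompress(data)` would be the call to try instead.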

Now let’s add in another filter to remove the “00” bytes.
Select “Replace”, change the mode to “Bytes” and add in “00” for the “In” value as shown below.


We should get back something like the one shown below.
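
The same clean-up can be sketched in Python. Removing every “00” byte works for plain ASCII script text, but decoding the bytes as UTF-16LE is gentler on any non-ASCII characters. This is only an illustration: the function names are made up, and it assumes the script text is UTF-16LE as the byte pattern suggests.

```python
def strip_nulls(data: bytes) -> bytes:
    """Mimic the "Replace" filter: drop every 00 byte."""
    return data.replace(b"\x00", b"")


def decode_utf16le(data: bytes) -> str:
    """Decode the bytes as UTF-16LE text, replacing anything undecodable."""
    return data.decode("utf-16-le", errors="replace")
```

For ASCII-only JavaScript both approaches give the same readable text.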


If we analyse the decoded JavaScript, we can see more interesting stuff as shown in the image below.


So it seems that the JavaScript Base64-decodes a very long string and drops the result as “msvcr.exe”.
I wrote the following Ruby script to decode the Base64 string.
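
The decoding step itself is small. Here is a Python sketch of the idea (an illustration, not the Ruby script mentioned above). It assumes the long Base64 string has been copied out of the decoded JavaScript; the function names are hypothetical, and the “MZ” check is only a quick hint that the output is a Windows executable.

```python
import base64


def decode_payload(b64_text: str) -> bytes:
    """Base64-decode the embedded string, ignoring whitespace and line breaks."""
    cleaned = "".join(b64_text.split())
    return base64.b64decode(cleaned)


def looks_like_pe(data: bytes) -> bool:
    """Windows executables start with the two bytes 'MZ'."""
    return data[:2] == b"MZ"
```

The decoded bytes should be written to disk only inside the analysis environment, and never executed outside of it.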

After Base64-decoding the string, the output file looks like this:

The SHA256 hash of this malware is 765834b1b780dacda8baa671c76328445cb8278099bad375ee22130f48920a7a.
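
To check that a decoded payload matches a published hash, the digest can be recomputed locally. A minimal Python sketch, with hypothetical function names:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of a buffer as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_hash(data: bytes, expected_hex: str) -> bool:
    """Compare against a published hash, ignoring case and stray whitespace."""
    return sha256_hex(data) == expected_hex.strip().lower()
```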
We won’t be going through this malware this time round.

[ 2nd Sample used in the analysis ]
MD5: a986a3fdf2afba98de21f0596a022b9b
SHA256: bd8fa7793f2192d4ff3979526955d5d6c965218eb0c0fe579f8ef0602357d5a9
At the time of writing, the detection rate according to VT is still pretty low: 3/53.

[ Part 3 : Getting Started on analysing Exploits in .HWP files ]
This is a .hwp file containing an exploit (most probably CVE-2013-4979 or CVE-2013-0808).
I drew a diagram like the one shown below to illustrate the general idea of how this exploit works.


For this particular exploit, the first thing we should be looking at is BinData/BIN0001.EPS as shown below.

There is an unknown error upon opening the document using hwp2010.

Nevertheless, analysis can still be done by extracting the EPS file from the document.
Let’s do a quick network check by opening the EPS file using hwp2010 while FakeNet or a similar tool is running. We can see that the exploit was indeed executed and that it connects to www.ethanpublishing[.]com/ethanpublishing/phpcms/templates/default/member/account_manage/teacup.jpg.

We suppose that “teacup.jpg” is most likely the payload. However, the jpg file is no longer available at that URL, so we cannot conduct further analysis on it.


Let’s go on to focus our analysis on the vulnerability that was exploited by the EPS file.

Opening the EPS file in a text editor, we can identify a few components of the exploit.
The green block represents a NOP sled using 0xB5.
The blue block represents a NOP sled using 0x90.
The red block represents the shellcode.


Following the shellcode is this line of PostScript:

This command performs a “heap spray”: 500 blocks of NOP sleds and shellcode are ‘sprayed’ into memory. Each block of NOP sled and shellcode is allocated as a string with a length of 65535 characters.
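
To get a feel for the size of the spray, the numbers above can be multiplied out. The sketch below assumes one byte per character in each string, which is an assumption for this estimate rather than something measured in the process.

```python
BLOCK_COUNT = 500    # number of sprayed strings, from the text above
BLOCK_SIZE = 65535   # length of each string, assumed to be one byte per character

total_bytes = BLOCK_COUNT * BLOCK_SIZE
total_mib = total_bytes / (1024 * 1024)

print(total_bytes)          # 32767500
print(round(total_mib, 2))  # 31.25
```

So roughly 31 MiB of memory ends up filled with sleds and shellcode, which is why traces of it should be easy to spot in the memory of the process that ran the spray.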

Next we want to determine which vulnerable process the exploit is targeting.
We do so by searching for traces of the NOP sleds and shellcode in the memory of the candidate process.
At first it looks like the vulnerable process is likely Hwp.exe or HimTrayIcon.exe.
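
Searching a memory dump for the sleds can be automated. The following Python sketch assumes the process memory has already been dumped to a bytes buffer by some other tool. The function name and the minimum run length of 64 bytes are arbitrary choices for this illustration.

```python
import re


def find_sleds(memory: bytes, min_len: int = 64):
    """Yield (offset, length, byte_value) for long runs of 0x90 or 0xB5."""
    pattern = re.compile(rb"\x90{%d,}|\xb5{%d,}" % (min_len, min_len))
    for match in pattern.finditer(memory):
        run = match.group()
        yield match.start(), len(run), run[0]
```

An empty result for a dump suggests that the spray did not happen in that process.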


However, we could not locate any trace of the NOP sleds or shellcode in either process.

At this point, we wondered whether other child processes could have been created by Hwp.exe. These child processes could have terminated after the execution of the shellcode.

One ‘trick’ we used was to overwrite the start of the shellcode with the bytes “EB FE” (a short JMP to itself), which is effectively an infinite loop. This keeps the process that executes the shellcode running continuously instead of terminating.
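
Why this is an infinite loop can be checked with a little arithmetic: 0xEB is a short jump whose one-byte signed displacement is added to the address of the next instruction, and 0xFE is -2. A small Python sketch, with hypothetical function names, that patches a shellcode buffer and computes the jump target:

```python
INFINITE_LOOP = b"\xEB\xFE"  # JMP rel8 with displacement -2


def patch_entry(shellcode: bytes) -> bytes:
    """Overwrite the first two bytes of the shellcode with EB FE."""
    return INFINITE_LOOP + shellcode[2:]


def short_jump_target(address: int, instruction: bytes) -> int:
    """Compute where a 2-byte 'EB xx' short jump located at `address` lands."""
    if len(instruction) != 2 or instruction[0] != 0xEB:
        raise ValueError("expected a 2-byte EB xx instruction")
    displacement = int.from_bytes(instruction[1:2], "little", signed=True)
    # The displacement is relative to the address of the next instruction.
    return address + 2 + displacement
```

The target equals the address of the jump itself, so execution never moves on. It is worth noting down the two original bytes so they can be restored in the debugger later.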


Now we can attach our debugger to the gbb.exe process, where we located the NOP sleds and shellcode.


Having located the vulnerable process, we now have to debug it to find where the vulnerable code is exploited.
We then located the code where hwp.exe creates the gbb.exe process.


We modify “CreationFlags” to CREATE_SUSPENDED. This allows us to attach a debugger at the very start of the gbb.exe process’s execution.
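
For reference, CREATE_SUSPENDED has the value 0x00000004 in the Windows headers, so this is the number to look for when patching the argument in the debugger. The Python sketch below shows the flag arithmetic only. It uses a hypothetical helper name, and OR-ing the bit into the existing flags (rather than overwriting them) is a choice made for this illustration.

```python
CREATE_SUSPENDED = 0x00000004  # process creation flag value from the Windows headers


def force_suspended(creation_flags: int) -> int:
    """Return the creation flags with the CREATE_SUSPENDED bit set."""
    return creation_flags | CREATE_SUSPENDED
```

Once the debugger is attached, the main thread of the suspended process has to be resumed for execution to continue.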


After tracing the code, we located the instructions in gsdll32.dll that executed the NOP sled “0xB5B5”, which decodes to MOV CH,B5.
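
The reason 0xB5 works as a sled byte is that opcodes B0 to B7 encode “MOV r8, imm8”, with the low three bits selecting the register. A pair of B5 bytes therefore only loads a constant into CH and falls through to the next pair. A minimal Python sketch of that decoding, with a made-up function name and covering only this one opcode family:

```python
# 8-bit registers in x86 encoding order (index = opcode - 0xB0).
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def decode_mov_r8_imm8(code: bytes) -> str:
    """Decode a 2-byte 'MOV r8, imm8' instruction (opcodes B0..B7)."""
    opcode, imm = code[0], code[1]
    if not 0xB0 <= opcode <= 0xB7:
        raise ValueError("not a MOV r8, imm8 opcode")
    return "MOV %s,%02X" % (REG8[opcode - 0xB0], imm)


print(decode_mov_r8_imm8(b"\xB5\xB5"))  # MOV CH,B5
```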


From the vulnerable instructions, we can more or less conclude that the vulnerability is indeed based on CVE-2013-0808.
For more information on CVE-2013-0808, you can read this article by CoreSecurity.

In the meantime, we hope you enjoyed reading this and we would be happy to receive your feedback!

Best Regards
Jacob Soo & peta909