Mainframes – Java – Deserialization

I was asked a week or so ago whether or not I thought z/OS would be susceptible to the types of Java deserialization attacks we’ve seen (a great primer from Fox Glove Security).   Of course!, I replied.  However, I don’t like unsubstantiated claims – so I built this POC:

java

It uses the basic ysoserial payload generator found on Github.   The SerialTestPlain.java file I use to test is from a blog here:

The source is simple:

Simple enough right?    Java on the mainframe is basically Java anywhere.  The only major gotcha (which should come as no surprise to anyone) are with issues of character translation  EBCDIC<->ASCII.   In this case, the ysoserial jarfile I built on x86 and just binary copied it to OMVS and that worked out of the box.

Other times I’ve had to use an a2e / e2a custom decoder – just depends on the implementation.  Currently working to test the JBoss exploits and modify them, if need be, in MSF for z/OS.  More as that unfolds!

NOTES:

While testing this POC first on x86, I kept running into an error like this:

The above mentioned blog helped – Basically Java 1.8u72 (since last December) needs to have the most current version of ysoserial, and use the CommonsCollections5 in order to work (and it does).   Prior versions of Java work just fine with the Release Version (0.04) of ysoserial.

Also, aside from fixes that are library based (like the Adobe Commons Collections one used here), most fixes to this bug have to happen in customized code, often written by organizations.   That makes this vulnerability particularly ugly and potentially difficult to mitigate.

Things I’ve Learned (and things to come)

I started writing a list of topics I’ve learned, some in excruciating detail, some just enough to know where to look for further details (trust me, that is no small feat).

I’m writing this not only as a way of keeping me honest on those days when nothing goes right, but also as a way to incentivize those among you who, like myself, have an insatiable desire to learn – and the tenacity to “figure it out.”

In most organizations, the below is accomplished by teams of people.   Some of the items (gen’ing a system from scratch, for instance – or setting up SMP/E, SMS, etc. from scratch – might never be a part of even a very senior mainframer’s repertoire).  I wanted to see what it would take to go it alone.

My plan is to build this page out with good links to relevant data – and/or if I get really ambitious build some how-tos on the finer points, if there is interest.  Real language how and wherefores.

THINGS I’VE LEARNED  (since I started a deep technical dive into mainframes)

  • RACF
    • password construction / algorithms
    • user profile management
    • using callable services
    • TSO commands for many common elements
    • building certificates
    • importing certificates
    • user certificates
  • Storage Management
    • Configuring SMS from scratch
    • Initializing devices
    • using DFDSS to move, backup and restore files
    • using IDCAMS for catalog and VSAM file management
    • what the eff a VSAM file is
    • how to allocate datasets
    • different access methods (qsam, bdam, etc)
    • what the hell a cylinder (or track) is
    • How big a mod 3,9,27,54 EAV are
    • Initializing volumes
    • labeling and initializing tapes
    • troubleshooting space abends (D37,B37,E37)
  • System Programming
    • SMP/E updates, installation, management
    • Building jcl from scratch
    • Configruing IPL parms, parmlibs, and startup shutdown procs from scratch
    • checking system resources
    • How apf authorization works
    • building a lnklist
    • building an lpalib
    • building a multi-tier catalog system
    • taking SVC dumps
    • Getting a trace of a component
    • reading said trace
    • using IPCS
    • troubleshooting failed ipls
  • z/OS crypto
    • keychain management
    • key management
    • password configuration
  • Assembler Programming
    • What a load module is
    • What a program module is
    • how to disassemble them
    • writing assembly
    • using 4 different debuggers
    • patching programs the hard way
    • building a ZAP
    • Using Macros
    • Compiling, Linking
    • Callable service usage
    • What the hell Language Environment is
  • SDSF
    • Jes2 job management
    • Reading a job log
    • managing output
    • managing active jobs
    • reconfiguring SDSF screens
  • JES2
    • NJE
    • Job management
    • Configuration files
    • parmlib entries
  • TCP/IP
    • TN3270 configuration
    • FTP configuration
    • FTP/S configuration
    • TN3270 + ssl configuration
    • Policy Agent configuration
    • TSO / USS tcp/ip commands
  • OMVS/USS
    • creating zfs filesystems
    • dbx debugger
    • compiling and linking with xlc
    • What the hell Language Environment is
  • z/OS operations
    • Many console commands (devices, stg)
    • How to research a WTOR
    • Vtam commands
    • tcpip commands
    • device commands
  • ISPF
    • Panel customization
    • DDLIST wizardry
    • Editor fine-tuning
    • Keylist modification
    • using the editor – line & main command sets

THINGS LEFT TO LEARN

  • VTAM
  • REXX
  • memory areas and control blocks in depth
  • So much more (work in progress)
  • SMF
  • z/OMSF
  • policy agent
  • ATTLS (in depth)
  • Coding Exits
  • Hardware configuration
  • Cross memory operations (PC, SRB, etc)
  • much more

A logical first step

The first z/OS exploit module in the Metasploit Framework, landed last Friday.

This is an exploit which takes advantage of a default or poorly configured FTP server. And, it requires valid credentials.  However, given this (and it’s a very common configuration), you will be presented with a very nice Unix shell – allowing for deeper testing of the system.

This is how it begins:  attackers look for low hanging fruit. The evolution of pentesting tools for the mainframe has to start somewhere, and this is the first concrete milestone in what has been an ongoing journey. Many x86 exploits are simply taking advantage of default configurations or poorly written code. z/OS is no different – it can suffer from neglected configurations and defaults like all other OS’s. So, that’s where I started. From here, we’ll build on default configuration exploits and work up and on through binary / code exploitation. Baby steps.

At any rate – I’m very proud of how this turned out. Thanks to those who helped in the prototyping phases (SoF & others noted within the exploit) – and as always, the super helpful folks on the MSF teams. For those of you testing mainframe systems – hopefully it’ll help red teamers with an easy win and start the conversation on securing the big iron.

There are more goodies in the queue, so stay tuned!

PR # 6834 – Authorized FTP JCL exploit for z/OS

JCL Scripting for Metasploit-Framework

# update 3/31 – added Reverse Shell JCL – this can be used by any direct-to-JES2 delivery method (e.g. FTP, NJE, etc)

PR #6737

In continuation of adding more mainframe functionality to Metasploit, I’ve built (and am in the process of incorporating) JCL (job control language)-based payloads (and exploits which use them) within the framework.

Once these updates are complete, Metasploit users with credentials (or some other type of vulnerability exploit), will be able to submit jobs directly to JES2 via ftp(/s) or NJE (hats off to Soldier of Fortran, for the python prototype of NJE).

I’ll keep a running tally of the Pull Requests here, along with demonstrations and updates.

The first PR is simply a basic JCL cmd payload, that does nothing but submit a job which always returns a code 0 (success).

PR #6717

Here’s a screenshot of one of the finished exploit modules that will be submitted for inclusion soon:ftp_exploit

More to come!

 

Topics on Encryption – SHARE 2016 in San Antonio

I’m talking about encryption on Thursday, March 3 at SHARE in San Antonio.  Here’s a preview:

Description

Enterprise security has always been concerned with password protection and storage encryption. These two areas make up a large portion of the IT risk portfolio. In this talk the speaker will discuss both of these areas of interest as they apply to z/OS and the mainframe:

  • We will review the recent updates to the RACF password hashing algorithm; comparing and contrasting between the legacy (DES) and current (KDFAES – Key Derivation Function with AES256) algorithms by looking at practical application of open source password cracking tools against RACF.
  • We will provide an in depth review of self-encrypting drive technologies as potential risk mitigation mechanisms, discussing the benefits of of using this technology as well as providing other mitigation techniques which better secure your data from prying eyes.

Attendees will come away with a richer understanding of the RACF hashing algorithms and better understand the risk and rewards of full disk encryption.

SHARE.org San Antonio

Smashing the z/OS LE “Daisy” Chain for Fun and Cease and Desist letters (GUEST POST)

The following is a cross-post from REDDIT, reposted here with permission from the author  @_Ciq (twitter)  –  Excellent write-up!!! -BeS

<Big wall of text trigger warning.>

Over the past few months I’ve been becoming increasingly interested in the CTF concept; finding (purposely built) flaws in software and exploiting them so that arbitrary code can be executed. Having an IBM mainframe background, I was wondering if a similar exercise could be made for that platform. Considering I only have access to a PL/1 compiler, I set out to see if its calling conventions can be exploited.

Introduction.

A typical security flaw for x86 systems consists of overflowing a buffer that lives on the stack, in order to overlay the piece of memory that contains the return address, which coincidentally also lives on the stack, to change the location that a subroutine will return back to. A good example of these are string buffer overflows, because the programmer carelessly copied a character array without properly checking the size of both.

Writing over this return address is trivial because a buffer will usually live close to said return address on the stack. This because of the x86 architecture revolving around a stack, and the common calling conventions used in x86 programs.

The Language Environment and its workings.

IBM mainframe programs written in compiled languages (COBOL, PL/1, C/C++, Java) are (since relatively recently) subject to being Language Environment (LE) compliant. Compilers will provide compliance automatically. The LE exists to provide a standardized calling convention, standardize tracing, and standardized debugging.

LE programs are, despite the z/Architecture not inherently being stack based, reliant on a stack structure. As a prologue to each procedure being executed, it will set up what is called a Dynamic Storage Area (DSA), which ties in with the calling conventions. It contains a couple of things;

  • A pointer to the DSA of the procedure that called the current DSA’s procedure.
  • A pointer to the DSA of the procedure that this DSA’s procedure called. (Optional, IBM compilers ignore this field.)
  • [1] A save area where all registers are saved when the current DSA’s procedure calls another procedure.
  • [2] A pointer to the next free byte on the stack. This gets set depending on how much storage the procedure needs for stack variables. It is used by any procedure called by this procedure to quickly determine where it can build its DSA.

And then some, but these things are the most important for us right now.

[1] z/Architecture has 16 general purpose registers. Unlike x86, they don’t have physically enforced functions. Everything is convention. The contents of registers can be important to a procedure. Hence upon calling another procedures, the contents of the registers must be saved.

There is no CALL assembler instruction in the z/Architecture. A call is compiled to a hard branch. There are several types of branches. A call to a procedure would typically be compiled to a branch with two operands. The register containing the address where to branch to, and the register in which to save the location of the instruction right after the current branch instruction, which serves as a return address for the called procedure (this is very important). There are two calling conventions in the mainframe world; caller saves and callee saves. In LE, the callee saves the registers of the caller, and it does this in the register save area that is in the caller’s DSA (third item in the list above). The caller will restore its register when the callee returns control to the caller.

For the sake of materializing this abstract explanation; upon a call-type branch, the address to return to will be saved in register 14.

In x86 calling conventions, the stack pointer grows higher and lower by POPing and PUSHing items to it. On z/Architecture, there is a register that is the stack pointer, but only by convention. No one is forcing anyone to use any register as a stack pointer. LE has chosen register 13 to be the stack pointer. You don’t push and pop items to the stack on z/Architecture. On a call, the LE sets the stack pointer (by checking where the next available stack byte is, which was determined at the start of the calling procedure), determines where the next stack frame should start (saved at [2]), and constructs its DSA at the currently set stack pointer. Your stack variables live right after the DSA and are accessed by addressing memory relative to the stack pointer. Typically when you compile a program, the listing will include the offsets of the variables to the beginning of the DSA for each procedure. Notice how this makes it annoying to dynamically allocate stack variables.

When setting up the DSA, a procedure (callee) will keep a pointer of the caller’s DSA, in its own (callee’s) DSA.

When a procedure wants to return to the procedure that called it, it needs an address to return to. As we discussed earlier, that address was originally saved in register 14 when the branch to the callee happened. The callee then proceeded to save all registers (including register 14) in the savearea set up in the caller’s DSA. The callee keeps a pointer to its caller’s DSA. So the callee can perfectly fetch the address it needs to return to. Take away from this that the return address lives on the stack, in proximity to stack variables.

How to exploit this?

On x86 architecture, string copies will compile to loops of byte moves. On z/Architecture, we can do this with one single assembler instruction; MVC. MVC takes the target location, a length, and a source location. Considering that strings are defined with a fixed length in mainframe programming languages (including PL/1), the compiler is aware of each string’s length at all times. As a result, if you try to assign a 10 byte string in to a 5 byte string, the compiler will do two things. Warn you that you’re doing something silly, and compile to an MVC instruction with length parameter 5, copying only the first half of the 10 byte string. As a result, exploiting string copies simply doesn’t work, because you can’t write exploitable code that way.

As a mainframe assembler programmer, I understood this fairly quickly. So I had to find another attack vector. In my limited research in to software vulnerability exploitation, I learned that a typical sign of having found a buffer overflow is when the application crashes. z/Architecture, and S/360 before it, rely heavily on ABENDs (abnormal ends). When an ABEND happens, it is accompanied by a code. One of the most common ABEND codes is S0C1. A S0C1 happens when the CPU is decoding an instruction, but finds that the OPCODE does not exist. Assuming that the IBM compilers are flawless, this almost always means that the program managed to branch to an area of memory containing data, and not instructions. For the times I was called up in the middle of the night to resolve a S0C1, it’s a welcome change that this is exactly what I wanted to happen for once.

So what programmer errors usually cause S0C1 ABENDs? Addressing an array with an index larger than it was allocated with. How does this happen? Same way x86 buffer overflows happen, no size checking.

Take a developer who knows that business rules dictate that no one can have more than 5 checking accounts. In a batch program that lists checking accounts, he has an array in which he loads the current person’s 5 checking accounts. He opens a cursor to the database, and starts fetching rows from the resultset, increasing an integer each time he writes a checking account to the array. Sometime, usually around 3:00 AM, the program will treat a person who somehow managed to open 6 checking accounts. The program tries to store the fetched row at an address that it shouldn’t, and sooner or later, the application breaks. S0C1.

Now that’s still pretty far from an exploit. But; we found that stack variables live close to return addresses on the stack, and we found that we can write to parts of the stack that we’re not supposed to write to. Good luck opening a 6th checking account which will meaningfully overlay a return address though, that’d be something.

Now, the LE stack grows upwards (unless forced through options in the LE parameters to do the opposite, which literally no one does, because why?). This is annoying! If we manage to write to an address higher than the highest address of an array, we’re just writing in to uninitialized stack space, that’s not helping us anything.

There is still an attack vector though; globally declared arrays. When a program gains control (we’ll call this level 0), it sets up its DSA as if it’s a procedure, and its stack variables go right above it. When a subroutine (level 1) is called from level 0, its DSA lives just above those globally declared stack variables. If we can get a globally declared array to write out of its bounds, it’ll eventually write in to level 1’s DSA.

Writing in to level 1’s DSA when we’re executing as level 1 doesn’t help us any though. The return address that level 1 returns to lives in the DSA of level 0 (register 14 containing the return address is saved in the caller’s DSA remember?), which lives under the globally declared variables. We can attack any level’s return address higher than level 0’s though. So if we can write out of bounds to a globally declared array from level 2 or higher, we’re golden. Note how a problem can arise as soon as more global variables live between the array we want to abuse, and the DSA we want to attack. If those are critical for a procedure to get to the point where it will return to its caller, it won’t, because they are destroyed when smashing the stack. Unless we put meaningful stuff in there, but this makes things much more complicated.

There is an exception to this. When you pass a pointer to an array (that lives somewhere between two level’s DSAs to a higher level procedure as an argument), you effectively extend the scope of that array to that procedure. This case is incredibly rare in the real world, I’d assume, since most PL/1 programmers I know don’t know about procedure arguments, and much less about pointers.

Working Proof of Concept.

So now to put this in practice.

We’re going to use a massively simple PL/1 program. Initially, I wanted to read a file line by line until it reached its end of file status, copying each line to a globally declared array, without bounds checking of course. When there are as many lines in the file as there are lines allocated for the array, we’re fine. But as soon as we add another, we’d be overwriting the DSA of the level 1 procedure. This doesn’t work, I know why (reading a file’s line is a call in itself), but not why (the read file call is a call to a closed source procedure, and I’m pretty sure it needs the complete DSA chain to be intact).

So what I do is I read all lines in to an array, then copy the elements of that array one by one in to an array with one element less allocated to it. I cheat twice in this PoC, this is the first. I’m working on a better concept with less cheating, of course.

From here on out it’s basically sitting through debugging sessions, learning offsets within the stack by comparing memory contents with listings, knowing the layout of the DSA, and knowing how a procedure gets its return address.

Turns out that in order to get a return address, a procedure will get its stack base + offset 0x04, which contains a pointer to its caller’s DSA, and then get that address + offset 0x0C, which is where register 14 (the return address, remember?) was saved during the callee’s prologue.

So we need to craft an overlay that will overwrite offset 0x0C in the DSA to change where the program will branch to when trying to return to that DSA’s procedure.

In our example, the stack looks something like this right before we write the 6th element to an array with 5 elements;

Address 192A6878 is the location of the first element in the array. It’s also our payload (more about that later). Each element is 80 bytes long, so element 2 starts at 192A68C8, etc etc. The 6th element (for which no allocation was made!) will be written to 192A6A08, which contains garbage. Between the end of the stack variables and the beginning of another DSA exists some padding, since the compiler aligns DSAs in a certain way. The DSA that lives right above this array is level 1’s DSA. It starts at address 192A6A40, I know this because I studied the offsets within a DSA (IBM documents) and the offsets of the stack variables to the stack frame base, which I get from the output listing when compiling the program (you don’t get this in real life). Offset 0x0C to the DSA base is the address that this DSA’s procedure has its callees return to.

Our payload will be 80 bytes long, because each element in the array is 80 bytes long. We get to inject our payload at 192A6A08, and the memory location we need to overlay is 192A6A4C. That’s at an offset of 68 (decimal), to the beginning of our payload. So our payload looks something like this in EBCDIC (yeah we don’t play with ASCII or UTF too well);

Hexadecimal it looks like this;

Notice how I put a valid (31 bit, it looks like 32 bits) address in the location that will be overlaid over the return address in the caller’s DSA. It’s 992A6878[*], the address of the first element of the array we’re attacking. I conveniently injected executable code on to the stack in this way. I could have put it in any element, including the 6th which overlays the return address. I could’ve also put it in common storage by using another program (z/Architecture memory layout feature), memory that is accessible by every address space.

* The address shown by the debugger is 192A6878, not 992A6878. The latter address has bit 0 flipped to 1, which indicates (in z/Architecture) that the program is running in 31 bit mode, which is the successor of the 24 bit mode before it. The debugger has problems with this, but the program is in fact running in 31 bit mode, and the address of the executable payload should have bit 0 flipped to 1.

This is also the second time I cheat by the way. I can’t easily find the address of our payload, it changes from execution to execution (more about this later). So to make it easy for myself, I print the address of the first element of the array to the spool, from the program itself, effectively telling me where I need to force the program to branch to. No program in the real world will do this for you!

So; this payload will make the program branch to our executable payload as soon as any procedure, no matter how many levels deep we are right now, tries to return to level 1.

So what’s in our payload?

It’s the machine code (hand)compiled from assembler instructions looking like this;

Supervisor call 35 wants register 1 to contain a pointer to where its parameters are in memory. The parameters are in between the branch and the supervisor call. By branching over the parameters, straight to the SVC, and saving the address of the parameter list in register 1 by doing so, we accomplish a lot in just one instruction.

Supervisor call 35 is “Write To Operator”, or WTO. It writes a message to the consoles that are configured to display this particular message. It actually displays it so that’s fun.

This is arbitrary code being executed from an LE enabled program running under z/OS on a z/Architecture machine.

Random notes and thoughts.

z/Architecture does not feature executable space protection, nor does it have the possibility to mark memory as non-executable. This makes it very easy for us to inject code. As long as we can get it in memory, we can execute it.

z/Architecture does not feature ASLR, explicitly. The address at which the array in our PoC lives changes from time to time though. I’ve been doing some documentation digging (which is all I can do because IBM is closed source when it comes to code), and found that the stack is allocated at the start of an LE program with a GETMAIN SVC call (standard stuff). This triggers the Virtual Storage Manager to find an empty piece of memory with the size that was requested, with the characteristics requested. Why, and under what circumstances the VSM decides that another virtual page should be used this execution, as opposed to any other execution, I don’t know yet. I’m afraid I’ll have to reverse engineer parts of the GETMAIN SVC and VSM for that. Or go work for IBM?


z/Architecture features memory protection in the form of protection keys. You can only edit memory that has the same protection key as the address space you’re running it from. Protection keys run from 0-16 and are set when the address space is started. This protects userland programs from fucking up system areas that live in common storage. You cannot change the protection key of your address space*. When you allocate (virtual) memory, that memory is marked with the protection key of the address space that requested it.

z/Architecture features certain instructions that are protected. You can only run them when your program is compiled to be APF-authorized, and resides in an APF-authorized (z/OS option) library. Userland programs will never be APF authorized. APF authorization allows you to switch from problem state to supervisor state. Supervisor state allows you to change your protection key.

Since we’re probably only going to find vulnerabilities in userland programs, which aren’t (I hope in all shops) APF-authorized, we won’t be able to completely own the system. But we’ll be able to shit on a lot of stuff regardless.

If a vulnerability is found in an APF authorized program, some serious escalation can occur. This was part of the exploit-chain used by Warg to shit on Logica’s (Swedish bank) mainframes. An APF authorized program can create a new SVC. Warg introduced a new SVC which allowed any program to escalate its APF authorization (APF authorization is nothing but a bit in protection key 0 storage, which you can write to from an SVC since those run in key 0, supervisor mode). So any program you run, you can call that SVC and switch it to be APF authorized. This allowed him to hold control over the machine for a very long time, since this is seriously hard to find when you’re trying to protect the system.


Considering how simple the PoC is, I hope it’s clear that in order to be able to exploit a vulnerability, the planets really need to align. The vuln needs to exist, you need to somehow figure out how to craft the payload (hard if you’re not on the system, and have access to source code, compile listings, and a debugging session), and get the payload on the system. Then, when you have arbitrary code execution, you need to know how the system is set up to do interesting stuff. Open sockets, create files, edit files, delete files, what have you.


I mentioned earlier that it would be interesting to introduce the arbitrary code that we want to execute in to the common storage area, which is accessible by every address space. Address spaces are limited in what they can do based on authorities granted to the user they are started with by RACF. Authorities are for example; what files they have access to, what commands they can execute.

If you find a vulnerable program, but fail to inject enough code, you can use another program to inject it in to common storage, get a pointer to it, and then use that pointer as a target address for your exploit. Since the code being run from common storage would run in the address space of the vulnerable program, it’d be running under the user that the vulnerable program was started with. This is potential privilege escalation, as you get to do more stuff than you were originally allowed to do. Note that you first need to be able to log on to the system in order to inject anything in to common storage, which under normal circumstances you shouldn’t be allowed to.

Conclusion.

So is it possible to “stack smash”, or “smash the daisy chain” (as I like to call it) of z/OS LE programs? Yes.

Will you be hard pressed to find a vulnerability? Yes.

Will you be even more hard pressed to somehow be allowed by business rules to inject something that is executable? Yes. My visa account number is sadly not executable machine code.

If you’re reading this because one of your managers is flipping out, I’m sorry.

Final note.

I’m really new at this exploiting stuff, and probably missing a lot of easy things that could make the process of finding and abusing vulnerabilities a lot less complicated. If someone with more experience wants to get in to this, I’d be more than happy to help them get started, because I know I’ll get my return on investment when that person surpasses me and can teach me in return.

Link to original Reddit post

A (mostly) useful debugger on z/OS

One item that has eluded me in my continuing quest to dive deeply into z Systems architecture is finding a native (IBM-supplied) assembler debugger/disassembler** on the platform that is fully functional, user-friendly and versatile.

Quickly I gave up on user-friendly and versatile and settled on trying to find one that was fully functional; simply stated – can execute and sustain integrity through any command in the POP manual. Of the 4 debuggers on the system, or as part of an add-on package, all have major out-of-the box ‘features’ that prevent them from being a worthwhile tool to anyone who has used a modern debugger (Like Immunity, Olly, or even GNU’s gdb).*

For this platform, my criteria for fully functional was this:

  1. Be able to debug programs which are link-edited APF-authorized and live in APF authorized libraries.
  2. Debug programs that switch into and out of supervisor state, and also switch PSW key mask values.
  3. Allow execution of commands that switch address space control (ASC) modes and manipulate storage in any of the 3 modes.
  4. Do the rest of the actions a basic debugger would do: set breakpoints, single step, watch variables / registers, examine and change memory, and so on.

The first 3 are functions unique to the z/Architecture platform, and I would expect at least the 3 z/OS based debuggers to handle (and possibly the one OMVS/USS based one also).   None do.  At least out of the box.  A quick list of the debuggers tried, and their limitations is at the end of this post.

Continue reading A (mostly) useful debugger on z/OS