Accessing Data Structures Located in a Randomized Address Space (ASLR)

How to Eliminate Entropy and Bring the Universe Back to the Singularity

by Matt Davis (enferex) (mattdavis9@gmail.com)

So, what is one to do when bored and needing something to stimulate the old neurons?  Why, inspect memory!  With that said, it was getting late one evening and I needed something to keep the brain stimulated, thus I decided to go poking around the memory space of a process.  You know, hunt around for golden nuggets within a Linux process to see what shiny new things I could uncover.  Now, this isn't the first time I have done this, but I noticed that evening that the GNU C Library had portions loaded into memory with write permissions enabled.  It was then that I wondered what I could do.

This led me to the writable portion of the random table in my process.  This table is used for generating random values.  Since random values are critical for security (e.g. asymmetric encryption, TCP sequence numbers, etc.), manipulating that table might let me make such values nonrandom and insecure for applications that rely on them.  An attacker can use a known value to aid their attack.  Thus, manipulating the random table to produce deterministic values can compromise the security of a protocol or application.  However, any program serious about security should not be using glibc's rand()/random() for its entropy.  Instead, something like /dev/urandom (Linux's driver for producing random values) should be favored.  But, if your program (e.g. game) relies on randomness for a non-security dependent purpose, a simple generator like that provided by glibc should be just fine.  As a note, I was not intending to manipulate such a table when exploring my process' memory, it just kinda happened.

The following is just an example of the memory space in Linux for an instance of the program cat:

> cat /proc/self/maps
00400000-0040b000 r-xp 00000000 08:01 7084978                            /usr/bin/cat
0060a000-0060b000 r--p 0000a000 08:01 7084978                            /usr/bin/cat
0060b000-0060c000 rw-p 0000b000 08:01 7084978                            /usr/bin/cat
01c18000-01c39000 rw-p 00000000 00:00 0                                  [heap]
7f533a263000-7f533a406000 r-xp 00000000 08:01 7081205                    /usr/lib/libc-2.17.so
7f533a406000-7f533a606000 ---p 001a3000 08:01 7081205                    /usr/lib/libc-2.17.so
7f533a606000-7f533a60a000 r--p 001a3000 08:01 7081205                    /usr/lib/libc-2.17.so
7f533a60a000-7f533a60c000 rw-p 001a7000 08:01 7081205                    /usr/lib/libc-2.17.so
7f533a60c000-7f533a610000 rw-p 00000000 00:00 0
7f533a610000-7f533a631000 r-xp 00000000 08:01 7081212                    /usr/lib/ld-2.17.so
7f533a67b000-7f533a804000 r--p 00000000 08:01 7113566                    /usr/lib/locale/locale-archive
7f533a804000-7f533a807000 rw-p 00000000 00:00 0
7f533a831000-7f533a832000 r--p 00021000 08:01 7081212                    /usr/lib/ld-2.17.so
7f533a832000-7f533a833000 rw-p 00022000 08:01 7081212                    /usr/lib/ld-2.17.so
7f533a833000-7f533a834000 rw-p 00000000 00:00 0
7fff8632f000-7fff86350000 rw-p 00000000 00:00 0                          [stack]
7fff8639e000-7fff863a0000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Anyway, that writable portion of glibc intrigued me.

What could possibly be in that writable segment of the glibc copy that resided in my process' memory space, and why?

Well, the "why" can be answered pretty easily.  Quite simply, a library has global variables and data that the running process is permitted to manipulate.  For glibc, this data can be manipulated via calling glibc functions.

For example, calling srand() or srandom() will manipulate a table used in generating the random values when rand() or random() are called.

To get a better idea of what was going on, I wrote a simple C program, compiled it, and then loaded it up in my debugger (GNU Debugger).  By using the features of GDB, one can quickly snoop around the memory space and see what lies within the deep depths of their processes.  Upon embarking on this sort of late night exploration, I was quickly greeted by the symbol name for one of the items located in this writable memory space, randtbl.

Now, this value is both writable and loaded at an address that is non-deterministic, thanks to my kernel randomizing the address space (more on this in a jiffy).  Since I was running in GDB, which by default disables address randomization for the programs it launches, the address of the randtbl was static and always at the same location.  Anyway, performing the following commands in GDB can give more insight about the randtbl location:

(gdb) x &randtbl
0x7ffff7dd50a0 <randtbl>: 0x00000003
(gdb) info symbol &randtbl
randtbl in section .data of /usr/lib/libc.so.6
(gdb) info address randtbl
Symbol "randtbl" is at 0x7ffff7dd50a0 in a file compiled without debugging.

As we can see from GDB, randtbl is a valid symbol whose first word has a value of 3, and it is located in the (writable) .data section of the shared library libc.  We also know that my libc has no debugging goodies, but that really does not concern us too much.  As a GDB fan, I should also mention one additional command useful for inspecting the process' memory space: info proc mappings, which is essentially the same information you would get if you read the /proc/<pid>/maps entry for the process.

Recall that when the Linux kernel loads an executable into memory, a private copy of the writable portions of the libraries that the program needs (in this case glibc) is loaded into the process' memory space.  That way the process can manipulate the data and no other process will see the changes.  This is memory that is only for the process, and lasts only the lifetime of the process.  For the non-writable portions of shared libraries (like .text, which holds the function code) multiple processes can share the same library code, eliminating the need to duplicate library instructions and reducing the amount of memory necessary for programs to run.

As a security measure, the Linux kernel can be configured to randomize the address space of a process so that loaded libraries are located at a non-deterministic location in the process' memory space.  This nifty feature prevents attackers from attacking a process at runtime by using information about known addresses in a library.  With Address Space Layout Randomization (ASLR), the addresses of loaded libraries are not known until runtime and change every execution.  Therefore, it would be pretty tricky to craft an exploit to target a specific address.

Now, back to the randtbl hackery.  So, how can I get access to the random table and manipulate it (for research purposes of course) if I do not know its address until runtime?  Possibly a linker script could allow me to alias the address, with a variable in my program.  But, nah, I don't want to do that.  I want to build my program and access the table without having to write a linker script.  Let's keep things as simple as possible.

Instead of a linker script, I browsed the glibc-2.17 source code and found that srand() makes use of this randtbl.  So, I added a call to srand() in my program and then hopped into GDB to look at the assembly.

It seems that srand() is actually wrapped by a function that passes a structure called unsafe_state to srand().  The first two members in unsafe_state are pointers into the randtbl, as the glibc source code clearly shows.

The flow of execution is simple.

My program first calls srand() (actually it's a glibc wrapper).  Next, this glibc wrapper calls the actual srand() function with the address of unsafe_state as an argument.  Recall that unsafe_state contains pointers to the randtbl.  srand() then manipulates randtbl and returns control back to the wrapper and then the wrapper returns control back to my program.

Now, this is the key piece...

The wrapper calling srand() calls a function that uses the unsafe_state as the first argument.

After this call is complete, srand() returns immediately.  srand() never clobbers the register last used to pass unsafe_state, therefore when srand() completes, the user program (the portion you write) has access to this register.

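This means that your program can access unsafe_state and all of its contents (randtbl) by just reading the rdi register.  This works because, on 64-bit x86 Linux, code compiled by GCC (v4.8.1 in my case) follows the System V AMD64 calling convention, in which the rdi register carries the first integer or pointer argument of a call, here the one passed inside the wrapper.  Nothing in that convention promises that rdi survives a call; it just happens to in this build of glibc.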

And that register (containing the address of unsafe_state structure), is never overwritten (clobbered) by srand() or its wrapper.

This means that someone can obtain access to randtbl by simply calling srand(), and then immediately looking at the rdi register, which should be the address of the unsafe_state variable that contains pointers to randtbl.

And there you have it, the ability to access a writable randtbl located within a randomized address space!

Well, the following does just that:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

static void print_rand_values(int n_values)
{
  int i;

  printf(">> Printing %d values from rand()...\n", n_values);
  for (i = 0; i < n_values; i++)
    printf("%d\n", rand());
}

int main(void)
{
  int32_t type, n_elts;
  uintptr_t unsafe_state_addr, rand_tbl_ptr;

  /* Call srand(), which leaves the address of glibc's
   * unsafe_state struct in the rdi register
   */
  srand(time(NULL));

  /* Read the address of 'unsafe_state', defeating ASLR */
  __asm__ __volatile__("mov %%rdi, %0\n":"=r"(unsafe_state_addr));

  /* Dereference the second member (an address) in the unsafe_state struct */
  rand_tbl_ptr = *(uintptr_t *) (unsafe_state_addr + sizeof(void *));

  /* The second member in 'unsafe_state' is a pointer to the second element of
   * randtbl: randtbl[1]. So we backup int32_t to get to the head of randtbl.
   */
  rand_tbl_ptr = rand_tbl_ptr - sizeof(int32_t);
  printf(">> randtbl located at %p\n", (void *) rand_tbl_ptr);
  printf(">> Before clearing random table\n");
  print_rand_values(10);

  /* The size of 'randtbl' can vary.
   * See glibc-2.17 source.
   *
   * Note that the first 32-bit word of 'randtbl' is a flag:
   * If the first word of randtbl is:
   * -- TYPE_0 (a value of 0) then the table contains 0 32bit integers
   * -- TYPE_1 (a value of 1) then the table contains 8 32bit integers
   * -- TYPE_2 (a value of 2) then the table contains 16 32bit integers
   * -- TYPE_3 (a value of 3) then the table contains 32 32bit integers
   * -- TYPE_4 (a value of 4) then the table contains 64 32bit integers
   */
  type = *(int32_t *) rand_tbl_ptr;
  n_elts = 0;
  switch (type) {
    case 0:
      n_elts = 0;
      break;
    case 1:
      n_elts = 8;
      break;
    case 2:
      n_elts = 16;
      break;
    case 3:
      n_elts = 32;
      break;
    case 4:
      n_elts = 64;
      break;
  }

  printf(">> Clearing contents of randtbl "
         "which is an array of %d int32 values...\n", n_elts);
  memset((void *) rand_tbl_ptr, 0, n_elts * sizeof(int32_t));
  print_rand_values(10);
  return 0;
}

Now that my program has access to the random table, let's see what happens if I zero the table using memset().

To see what I had done, I immediately called rand() to see what value it produced.  Muahah, it produced a non-random value of 0.  Woohoo!

I made random deterministic.  Of course, this only affects the process and any child process that the compromised process creates (via fork()).  If another executable is called (via exec()), then its address space is fresh, and it has a copy of the unmodified randtbl, thus it acts on an unmodified randtbl.

Also note that any future calls to srand() will reset randtbl and result in rand()/random() producing values as if nothing ever happened.

So what is the practicality of this being used as an exploit?

This would require some pretty clever shellcode, as the exploit would have to inject a call to srand(), perform a read to get the address of randtbl, and then zero-out the table.

Why is this important?  Well, most programs relying on secure uses of random numbers (e.g. TCP sequence numbers, asymmetric crypto, etc.) would/should be using a different source of randomness anyways (e.g. /dev/urandom).  Further, we just accessed and manipulated a single variable, that being randtbl.  Other variables in other libraries might also be accessed via this same method.

Anyway, I hope that this spiel was insightful.  Now go take what you learned and see what other data in some other library you can manipulate!

Shoutouts: The ruxcon crew, count, __ben (the villain)

Resources

glibc-2.17 source: www.gnu.org/software/libc

Code: randhack_small.c
