[pbctf 2020] Queensarah2

This post originally appeared on the flagbot site, feel free to read it there. It is presented here as a form of archival and to collect all my posts in one place.

The secret mainframe for a distributed hacker group has been discovered. We have managed to exfiltrate some of the code that it runs, but we don’t have a physical copy of their access badges. Can you still get the flag? source
Remote: nc queensarah2.chal.perfect.blue 1
Note: enter flag as pbctf{lower_case_flag_text}

Looking at the provided source, the cipher procedes in 2log2|m|2\lceil\log_2 |m|\rceil rounds to encrypt a message mm. Each round replaces each bigram through the sbox (which is unknown and the key) SS, and reorganizes the message such that all bigrams are broken up.

One observation we soon made was that there’s no key schedule present, so the slide attack might apply. However, using short messages would result in us being unable to properly recover and validate a key obtained from a collision, and using long messages would seemingly require more pairs than we can query to realistically have a collision, that’s a first idea we can’t get to work.

In hindsight: this is actually an existing pen-and-paper cipher, that’s been nicely cryptanalized by Robert Xiao. And, as the flag indicates, it does make use of the slide attack due to some observations we missed.

Next, we have the observation that encrypting a single bigram mm results in the ciphertext S(S(m))S(S(m)), as the reordering has no effect. Let’s call this relationship E(m)=S2(m)E(m) = S^2(m). From this, we try to propagate knowledge. Say we start with the assumption that S(𝚊𝚊)=𝚡𝚢S(\mathtt{aa}) = \mathtt{xy}, then knowing that E(𝚊𝚊)=αE(\mathtt{aa}) = \alpha directly implies that S(𝚡𝚢)=αS(\mathtt{xy}) = \alpha, and we have new information that can be used in the same way and give us more information. Unfortunately, this is very closely related to the cycle decomposition of SS, and we’d need to guess or assume as many bigrams as there are cycles in SS, which quickly gets prohibitively large. So that’s another idea that failed. Some code because we tried it nonetheless:

def recover(double, sbox):
    sbox = copy.copy(sbox)
    rev = {v:k for k, v in sbox.items()}
    change = True
    while change:
        change = False
        for k, v in double.items():
            if k in sbox and v in rev:
                if rev[v] != sbox[k]: return {}
                continue
            if v in rev:
                change = True
                intermediate = rev[v]
                rev[intermediate] = k
                sbox[k] = intermediate
            if k in sbox:
                change = True
                intermediate = sbox[k]
                rev[v] = intermediate
                sbox[intermediate] = v
    return sbox

Our final idea, that finally worked goes further in looking at these cycle decompositions. Recall that E=S2E = S^2, and we can obtain EE by simply encrypting all bigrams individually. So if we can enumerate all square roots of EE, we can validate each of them individually by checking that some test encryptions under that sbox (of more than a single bigram, since those will all be consistent) result in the same ciphertext the server gives us. For a long time, we believed that enumerating these square roots would be too slow. We should have looked at the actual sizes of the cycles first instead…

So how do we take a square root of a permutation? Let’s first investigate what happens to a permutation when we square it. Observe that each cycle of SS is squared independently. We can then distinguish two cases: odd length cycles and even length cycles. An odd length cycle 𝒞\mathcal{C} will square into an odd length cycle, since gcd(|𝒞|,2)=1\gcd\left(|\mathcal{C}|, 2\right) = 1. For an even length cycle 𝒞\mathcal{C}, we again see two different cases, if |𝒞|2(mod4)|\mathcal{C}| \equiv 2 \pmod 4 (so when the length is of the form 2(2k+1)2(2k + 1), twice an odd number), it squares into two odd-length cycles, otherwise into two even-length cycles. This means that we can’t simply decide that a cycle of odd length in EE corresponds to a cycle of odd length in SS. We need to consider all pairings of cycles of the same length, taking into account that for odd length, those might also originate from a single odd-length cycle each. To make the implementation a bit easier on ourselves, we simply requested a few different EE by connecting to the server twice, such that there were at most 2 cycles of each length in EE.

To make requesting these 729 encryptions of bigrams fast, and not be constrained too much by network communication, we simply send all encryption queries at once, receive all encryptions at once, and nicely parse them out.

from pwn import *
import string, itertools, copy
from math import ceil, log
import collections, json

io = remote("queensarah2.chal.perfect.blue", 1)
io.recvline()
target = io.recvline().decode().split("'")[1]
print("target =", target)
alpha = string.ascii_lowercase + "_"

def encrypt(sbox, message):
    if len(message) % 2:
        message += "_"

    message = list(message)
    rounds = int(2 * ceil(log(len(message), 2))) # The most secure amount of rounds

    for round in range(rounds):
        # Encrypt
        for i in range(0, len(message), 2):
            message[i:i+2] = sbox[''.join(message[i:i+2])]

        # Shuffle, but not in the final round
        if round < (rounds-1):
            message = [message[i] for i in range(len(message)) if i%2 == 0] + [message[i] for i in range(len(message)) if i%2 == 1]

    return ''.join(message)

cst = [''.join(random.choice(alpha) for _ in range(512)) for _ in range(10)]
io.sendline("\n".join(cst))
csv = io.clean(1).decode().splitlines()[1::2]

def consistent(sbox):
    if len(sbox) != 27**2 or len(sbox) != len(set(sbox.values())): return False
    for a, b in zip(cst, csv):
        if encrypt(sbox, a) != b: return False
    return True

def cycles(double):
    seen = set()
    res = []
    for k in double:
        if k in seen: continue
        c = []
        while k not in seen:
            seen.add(k)
            c.append(k)
            k = double[k]
        res.append(c)
    return res

io.sendline("\n".join(a + b for a, b in itertools.product(alpha, repeat=2)))
double_sbox = dict(zip(map(''.join, itertools.product(alpha, repeat=2)), io.clean(1).decode().splitlines()[1::2]))
assert len(double_sbox) == len(set(double_sbox.values())) == 27**2
print(sorted(map(len, cycles(double_sbox))))
assert sum(map(len, cycles(double_sbox))) == 27**2

def decrypt(sbox, message):
    if len(message) % 2:
        message += "_"

    message = list(message)
    rounds = int(2 * ceil(log(len(message), 2))) # The most secure amount of rounds

    for round in range(rounds):
        # Encrypt
        for i in range(0, len(message), 2):
            message[i:i+2] = sbox[''.join(message[i:i+2])]

        # Shuffle, but not in the final round
        if round < (rounds-1):
            message = sum(map(list, zip(message[:len(message)//2], message[len(message)//2:])), [])

    return ''.join(message)

def pairings(c1, c2):
    assert len(c1) == len(c2)
    for i in range(len(c1)):
        combined = sum(map(list, zip(c1[i:] + c1[:i], c2)), [])
        yield {combined[i]:combined[(i + 1) % len(combined)] for i in range(len(combined))}

def odd_cycle(cyc):
    n = len(cyc)
    return {cyc[i]:cyc[(i + (n + 1) // 2) % n] for i in range(n)}

def solve(sbox, cycs):
    if not cycs:
        assert len(sbox) == len(set(sbox)) == len(set(sbox.values()))
        for a, b in zip(cst, csv):
            if encrypt(sbox, a) != b:
                return None
        return sbox
    if len(cycs) > 1 and len(cycs[0]) == len(cycs[1]):
        for p in pairings(cycs[0], cycs[1]):
            t = solve(sbox | p, cycs[2:])
            if t is not None:
                return t
    if len(cycs[0]) % 2 == 1:
        t = solve(sbox | odd_cycle(cycs[0]), cycs[1:])
        if t is not None:
            return t
    else:
        assert False

assert collections.Counter(map(len, cycles(double))).most_common(1)[0][1]
sbox = solve({}, sorted(cycles(double), key=len))
inv = {v:k for k, v in sbox.items()}
print(decrypt(inv, target))

Flag: pbctf{slide_attack_still_relevant_for_home_rolled_crypto_systems} yes, but let’s do it with less queries anyway :)