[GCTF 2022] Cycling
Points: 201 (50 solves)
It is well known that any RSA encryption can be undone by just encrypting the ciphertext over and over again. If the RSA modulus has been chosen badly then the number of encryptions necessary to undo an encryption is small. However, if the modulus is well chosen then a cycle attack can take much longer. This property can be used for a timed release of a message. We have confirmed that it takes a whopping 2^1025-3 encryptions to decrypt the flag. Pack out your quantum computer and perform 2^1025-3 encryptions to solve this challenge. Good luck doing this in 48h.
Exploring the challenge
Let’s have a brief look at the source code we’re provided with:
e = 65537
n = ... # snip
ct = ... # snip
# Decryption via cycling:
pt = ct
for _ in range(2**1025 - 3):
pt = pow(pt, e, n)
# Assert decryption worked:
assert ct == pow(pt, e, n)
# Print flag:
print(pt.to_bytes((pt.bit_length() + 7)//8, 'big').decode())
In short, we’re faced with an RSA encryption of the flag, and one additional “fact” that’s supposed to help us in some way. The fact is that when we repeat the encrypting exponentiation times, we achieve the same as decrypting the ciphertext. Working through the math, this tells us that holds, at the very least when represents the flag.
A well-known fact when dealing with RSA, and modular exponentiation in general, is that the order of the multiplicative group mod is equal to the number of integers that are coprime to . This quantity is known as Euler’s totient .
Furthermore, from Euler’s theorem (or Lagrange’s theorem if we frame it in a group-theoretic way), we know that any number taken to the th power (mod ), results in the identity . This property is in fact vital for the correctness of RSA, as we rely on the fact that when we say that .
From all this known theory underlying the RSA cryptosystem, we can now finally make a first deduction: .
Background: Carmichael’s
Actually, that previous statement is what you would say with a basic understanding of the principles underlying RSA, but it’s in fact not entirely correct. It could very well be the case that for . Moreover, this will for the case for every when is an RSA modulus. One thing that Lagrange’s theorem will still give us, even when , is that .1
The smallest exponent such that for all is known as the Carmichael function . We know that , but we can even write down a nicer formula:2
which, when we apply this to our RSA modulus , becomes
Returning to our wrong statement from before, we now know that where , and furthermore, since doesn’t look too suspicious, nor can a readable flag be influenced all that much, we can in fact hope that .
Factors of factors of factors; and some subtractions
Now that we understand the nuances of the formula a bit better, we can think further towards solving this challenge. Remember what we wrote down earlier?
This tells us something more, since we have exactly the form of statement that lead us to introducing in the first place. We could now say that , which gives us a somewhat nice relation between the value we’d been given, and our RSA modulus .
Let’s not worry about the possibility that is only a divisor, and instead assume that it holds with equality . Then we can try to write down what we expect to be:3
We now would like to relate these values to somehow. By the above, it should be clear that any , so when we list all divisors of in turn, and add to them, we should end up with a set of candidates , such that . The value itself is not particularly easy to factor in a short amount of time, but luckily it’s not an esoteric, unknown value, but a nicely structured one. And as it often happens to nicely structured values, they show up on factordb.
Who cares if it’s not the private key? It works
With the set , what could we do? One option to consider is applying the same trick again, but using rather than the factorization of , to recover and . Annoyingly, is a rather large set, and enumerating all subsets of it takes exponential time, so we’ll have to throw that idea out.
Instead, let’s look back at our initial, more naive, understanding of the RSA cryptosystem, where we used rather than to compute the decryption exponent . Even though we didn’t have the smallest modulus possible, we still had full correctness, since — as we know by now — . We can take that to the extreme: we know/assume that all factors of are among our values , so if we simply take , it should hold that , and as such, we could use as a “replacement” for or when it comes to computing a decryption exponent.
With this, we have enough ideas and information to finally solve this challenge4.
CTF{Recycling_Is_Great}
All your factorbase are belong to us
The official intended
solution relies on similar observations, but instead of finding some
number
,
it uses those potential factors of
to directly factor the modulus
.5 To understand how this factorization
works, we look back at a well-known, simple special-purpose factoring
algorithm: Pollard’s
p - 1
algorithm.
Pollard’s algorithm allows finding a factor (say for ) under the assumption that is -powersmooth. That is, all prime power divisors are bounded by . This property enables us to find some product of prime powers less than , such that but . In turn, looking at Fermat’s little theorem, we can see that for all , it holds that , but it will often be the case that , and so looking for should allow us to recover . Traditionally, Pollard’s method computes by repeatedly taking th powers of an accumulator and testing whether the results in a factorization. The advantage of this is twofold: one doesn’t need to fully compute in its entirety6 and as long as the largest factor of differs from the largest factor of ,7 the method will still work, rather than yielding as the .
To abstract the bound away, we simply notice that all it gives us is some superset of prime (power) factors of . This is exactly what we already found by enumerating primes such that ! If we call such a set a factor base, and slightly generalize Pollard’s algorithm, we can apply it to our situation as well. And this is exactly the approach we see in the official solution script: we take some base (there called ), repeatedly exponentiate it by potential prime factors, and check if .
Once a factorization of is found, it is of course only a matter of performing regular RSA decryption to obtain the flag.
Does this always work?
Until now, we’ve made several assumptions that turned out to be correct, in order to solve this challenge. Even the official solution turns out to rely on those assumptions,8 so we’d like to have a look at how much trouble we’d be in when the assumptions would be invalidated. To restate our assumptions more explicitly:
-
- The inner corresponds to the multiplicative order of the flag,
- The outer corresponds to the multiplicative order of , modulo
- and are square-free, that is, none of their prime factors occur with multiplicity .
- All are square-free, where , but this has been confirmed by the factorization of
And let’s also introduce some counterexamples for all of these, where our assumption is invalidated, and our solution becomes broken:
- We explore the possibility for both
s:
- Let , then , but for the message , it’s already true that , rather than the expected . This in turn implies that we only need to complete decrypt this message by cycling.
- We now use , so and , but for instance only has multiplicative order modulo .
- Consider , then . would then be , and we’d never pick up as a potential factor out of any subset.
- Similar issues to the earlier point occur, except they are introduced slightly later in the process of computing .
Other than those assumptions, there’s some more things that can go wrong. We’ve relied on enumerating all subsets of factors of , but — as we remarked with an initial failed idea of repeating such an enumeration on the result of that — that takes exponential time in the number of factors. If we end up with a large amount of these factors, we might in fact already get in trouble trying to enumerate all subsets, and spend a long time waiting for that.9 On the other side of that medallion, to obtain that initial list of factors, we need to be able to factor this value . Now, if we consider for instance the worst case, where and are what we could call doubly-safe primes, i.e. , and are both safe primes, there’s only a minor difference in number of bits between and and the latter is even a new RSA modulus. By assumed security of RSA (unless you get extra information like in this challenge), factoring that would not be feasible.
Can we fix it?
Yes we can!
Sort of, at least. Unless my very vague notion from the footnote in the previous section pans out, I don’t expect we could get around the exponential time enumeration, or the factoring problems. We can however try to fix up some of the problems our assumptions brought along, and get a slightly more generic solution.
Let’s first look at the case where
.
Since for our original approach, we only care about
itself, and not about factoring, we only need to find a multiple of
,
which we still get from our powerset enumeration (under the assumption,
for now, that we get
).
This means we can still compute an effective decryption exponent that
works for
itself, and any element with an order dividing
.
Moreover, we can still use this to fully factor
too. Once we decrypt
,
we know that
,
which means we can again apply our p - 1
approach with a
factor base to find an exponent
such that
.
Note that we require the use of
as basis here, rather than an arbitrary number
.
The only condition under which this will still necessarily fail is when
divides
,
since then any exponent such that
also satisfies
.10
Next, we investigate the case where
.
Unfortunately, the solution here isn’t quite as clean. When the order of
is too small, we simply lack the information to recover enough primes.
When the order of
is only slightly too small, we should be able to salvage it with only a
constant cost to our computation time. Pick a fixed-size (multi)set of
small primes
,
and let
.
This increases the computation time with a factor
,
but as long as the “lost” factor of
is factorable over
,
our algorithm works again. Optionally, if we would like to deal with
more complex situations, we could also construct more complex ways to
add extra factors. For example, an approach comparable to the
“two-stage” variant of the p - 1
algorithm is possible,
where we take e.g. at most 4 “small” primes and 1 “medium” sized extra
prime.
Finally, how can we deal with the annoyance of prime powers? To deal
with prime power divisors of
,
it’s possible to apply a strategy similar to the p - 1
factoring algorithm, where every prime factor is simply included
multiple times. Since exponents larger than
can be detected in the factorization of
,
we recommend including each factor twice, unless evidence to the
contrary is present from the factorization of
.11 The more conservative approach
would be to include each prime
a number of times proportional to
,
which comes at an obvious computational cost. For the prime power
divisors of
,
we’ll need to do just a bit more work. If we see a prime power
when factoring
,
it should be clear from how
works that we expect to also see the factors of
appear. In that case, it’s obvious that
should be in
.
When however
,
we only see
and the factors of
appear when factoring
.
Hence, when we add a prime to
that still divides
,
we should add it with a higher multiplicity.
Talk is cheap, show me the code
We present here the code as we implemented it during the CTF. That
is, making maximal assumptions such that it still gets the flag. The
code for factorization based on Pollard’s p - 1
algorithm
can be found in the official
solution. Interested readers are encouraged to understand, implement
and share the suggested improvements of this article :)
proof.all(False) # speed up primality checking a bit
import itertools
from Crypto.Util.number import long_to_bytes
# From the itertools documentation/example
def powerset(iterable):
"powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
s = list(iterable)
return itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))
# From factordb
factors = [2, 3, 5, 17, 257, 641, 65537, 274177, 2424833, 6700417, 67280421310721, 1238926361552897, 59649589127497217, 5704689200685129054721, 7455602825647884208337395736200454918783366342657, (2^256+1)//1238926361552897, (2^512+1)//18078591766524236008555392315198157702078226558764001281]
assert 2**1025-2 == prod(factors)
C = []
for ps in powerset(factors):
v = prod(ps) + 1
if is_prime(v):
C.append(prod(ps) + 1)
Ξ = prod(C)
e = 65537
n = 0x99efa9177387907eb3f74dc09a4d7a93abf6ceb7ee102c689ecd0998975cede29f3ca951feb5adfb9282879cc666e22dcafc07d7f89d762b9ad5532042c79060cdb022703d790421a7f6a76a50cceb635ad1b5d78510adf8c6ff9645a1b179e965358e10fe3dd5f82744773360270b6fa62d972d196a810e152f1285e0b8b26f5d54991d0539a13e655d752bd71963f822affc7a03e946cea2c4ef65bf94706f20b79d672e64e8faac45172c4130bfeca9bef71ed8c0c9e2aa0a1d6d47239960f90ef25b337255bac9c452cb019a44115b0437726a9adef10a028f1e1263c97c14a1d7cd58a8994832e764ffbfcc05ec8ed3269bb0569278eea0550548b552b1
ct = 0x339be515121dab503106cd190897382149e032a76a1ca0eec74f2c8c74560b00dffc0ad65ee4df4f47b2c9810d93e8579517692268c821c6724946438a9744a2a95510d529f0e0195a2660abd057d3f6a59df3a1c9a116f76d53900e2a715dfe5525228e832c02fd07b8dac0d488cca269e0dbb74047cf7a5e64a06a443f7d580ee28c5d41d5ede3604825eba31985e96575df2bcc2fefd0c77f2033c04008be9746a0935338434c16d5a68d1338eabdcf0170ac19a27ec832bf0a353934570abd48b1fe31bc9a4bb99428d1fbab726b284aec27522efb9527ddce1106ba6a480c65f9332c5b2a3c727a2cca6d6951b09c7c28ed0474fdc6a945076524877680
d = pow(e, -1, Ξ)
print(long_to_bytes(int(pow(ct, d, n))))