CTF complete - HW bounty still ongoing - 2.337 BTC

Ledger’s first CTF (Capture The Flag) event has officially ended! We’d like to take this opportunity to thank all the individuals and teams that participated to the contest. We received more than 500 answers, and while most participants wished to remain anonymous, we believe that the techniques employed indicate many were security professionals.
If you’re interested by the solutions to the challenges, you’ll find them towards the end of this blogpost.
We would like to thank and congratulate all of the contestants who solved our challenges, and especially the top ranked.
The 3 fastest contestants to complete all 3 CTF Challenges won a Ledger Nano S CTF-2018 Winner special edition, as well as the fastest contestant to solve each individual CTF challenge.
The 7 next finishers on the general ranking will win a Nano-S special edition CTF-2018 Finisher

The fastest solution to the 3 problems came from Georges Gagnerot of the Eshard team (@eshardnews). They solved the 3 problems after only 27 hours and 25 minutes, congratulations!
The second fastest solution was from Aleksei Udovenko at the University of Luxembourg. His time of 30 hours and 40 minutes was also very quick.
Finally, the podium is completed by Team Plebe which solved the 3 problems in only 34 hours and 32 minutes.

The fastest contestants to solve were also rewarded, congratulations to:

CTF #1: Eric Dangereux
CTF #2: Georges Gagnerot (Eshard)
CTF #3: Pascal Pailler (CryptoExperts)

The winners & the finishers have won Ledger Nano S collector editions.

Here are the top 10 finishers:

The complete ranking table is available here.

Step 2: Hardware Bounty

All the finishers qualified (and a few others) to try & extract the private key from a dedicated Hardware Bounty, which they received mid april. More than 100 devices with the same key has been sent. The dedicated bounty is a simple USB device which computes a public key from a private key (a simple scalar multiplication).This public key is sent back using the USB connection. There are a few countermeasures in place to protect the private key.

This private key unlocks a Bitcoin Wallet on which 1.337 BTC are stored.
Interestingly, after almost 2 months, the 1.337 BTC reward for this bounty has not been moved so far suggesting no one has been able to circumvent the countermeasures Ledger put in place to protect the private key yet.
The contestants are security professionals, and since we’d like to raise the stakes even higher, 1 BTC has been added to the current reward, increasing it to 2.337 BTC.
We already had some feedbacks from eshard, NinjaLab, Riscure and others, and it seems this implementation is not that easy to break :).
Good luck for the researchers, please note there is also a bounty program on the Nano S.

Solutions

You can find some materials in the dedicated Github
Antoine Ferron also made a write-up of his own Solutions.
Eshard also made a dedicated Github.

* Solution CTF #1 *

The first problem was the easiest one. It’s based on the well-known properties of RSA: fixed points. These fixed points are also called unconcealed messages.For every given RSA modulus M and exponent d, there exists messages $x$ such that $x^d mod N = x$ . The encryption for this very message is the identity.
Fortunately, for RSA large modulus, the probability to have such messages is very low.
The question was to retrieve the exponents d for which the number of unconcealed messages is minimum for a given modulus. The modulus was chosen such that brute forcing all $x$ and all $e$ would take terribly long to accomplish.
A naive algorithm like:

   sol = 0
   mini = N
   for e in range(2:phi-1):
      if gcd (e,phi) == 1
         s == O
         for m in range 1:n-1:
            if m^d mod N == m
               s += 1
         if s < mini:
            mini = s
            sol = e
         if s == mini:
            sol *= e % 1337694213377816
   return sol

This naive algorithm is perfectly correct but would take a terribly long time.
Fortunately, the number of unconcealed messages for a given e can be easily computed as
$s = (1 + gcd(e-1,q-1)) * (1 + gcd(e-1,p-1));$
Most of the received solutions were based on this formula.
A mathematician (Thomas Piellard) sent me an interesting accurate proof:
“ $Z/nZ = Z/pZ \times Z/qZ$ , so I searched in $Z/pZ$ .
In $Z/pZ$ , the exponents $e$ which minimize the cardinal have only 2 fixed points: $1$ and $-1$ .
So, a criterion for such an exponent $e$ is : $gcd(e,\varphi(p))==1$ and $gcd(e-1,\varphi(p))==2$ .
Indeed, if $gcd(e-1,\varphi(p))==2$ , it means that the only fixed elements are $1$ and $-1 = a^{\varphi(p)/2}$ where $a$ is a generator in $Z/pZ$ .
In $Z/pZ \times Z/qZ$ , the same mechanism applies, the exponents e which minimizes the cardinal have only 8 (non-zero) fixed points
$(1,1),(1,0),(1,-1),(0,1),(0,-1),(-1,1),(-1,-1),(-1,0)$ .
Finally the criterion for $e$ becomes
$gcd(e,\varphi(n))==1$ and $gcd(e-1,\varphi(p))==2$ and $gcd(e-1,\varphi(q))==2$
The following one-liner Pari-gp gives the correct solution.

criterion(p,q)=
{
   r = eulerphi(p*q);
   res = Mod(1,1337694213377816);
   for(i=1,r,
      if(gcd(i+1,r)==1,
         if(gcd(i,p-1)==2 && gcd(i,q-1)==2, /*else*/
            res = res*Mod(i+1, 1337694213377816);
         );
      );
   );
   return(lift(res));
}

Finally the fastest to find the solution was Eric Dangereux based on the number of divisors of $e-1$ :
$x^e mod x == n$ means $x^{e-1} == 1 mod n$
To have a minimum of solutions, $e-1$ must have 1 common factor with $\varphi(n)$ . That can be quickly brute forced.
Something like that may work:

res = 1
for e in range(2, phi):
   if number.GCD(e, phi) == 1:
      if count_divisors(e - 1) == 1:
         res = (res * e) % 1337694213377816
print('solution:', res)

Notes:

This problem was freely inspired (directly copied) from the project euler 182 which used to be my hobby :).
A few references have accidentally been inserted in the problem statements to the leet speak (1337), to the ISO7816 and to the Security Exception APDU status word (6982) very well known in the smartcard world. Sorry for the inconvenience.

* Solution CTF #2 *

This challenge was surprisingly the most difficult.
This was a kind of white-box based on a standard AES. A few countermeasures freely inspired from the Secure Hardware Implementation were added, and more standard countermeasures were added.
The source code was not given. Only a binary compiled for Linux-64 was provided. The binary was not striped which greatly eased the reverse.
This setup led to a challenge mixing easy white-box crypto with easy Reverse Engineering.
The program flow was quite simple.
It takes 16 bytes as input, computes an AES with a fixed key which is embedded in the program and compares to a constant

exp=0x13376942133769421337694213376942

It was thus asked to find the plaintext for which the encryption equals exp
A few countermeasures were added:

1 real AES among 15 (14 dummy AES)

The program takes 16 bytes as input, and apply them 15 permutations.
The permutations are the same by pair, except the 8th which is alone and is the identity.
These 15 new plaintexts will then be encrypted and the outputs are xored.
The pairs cancels out and thus the real AES is output

int perms[NCIPHER][15] = {
   {13,14,6,8,11,12,0,9,1,10,15,7,4,2,5,3},
   {13,6,9,7,15,5,8,2,10,1,12,14,4,3,11,0},
   {7,9,3,4,13,6,1,10,11,15,0,2,12,8,5,14},
   {15,12,2,8,10,4,9,7,14,11,0,1,13,5,3,6},
   {0,2,14,13,6,5,4,12,9,10,11,8,3,7,15,1},
   {4,14,0,8,9,6,3,2,5,15,10,12,1,13,11,7},
   {8,13,3,14,12,6,5,7,11,2,9,1,0,15,10,4},
   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}, // This one contains the correct input
   {13,14,6,8,11,12,0,9,1,10,15,7,4,2,5,3},
   {13,6,9,7,15,5,8,2,10,1,12,14,4,3,11,0},
   {7,9,3,4,13,6,1,10,11,15,0,2,12,8,5,14},
   {15,12,2,8,10,4,9,7,14,11,0,1,13,5,3,6},
   {0,2,14,13,6,5,4,12,9,10,11,8,3,7,15,1},
   {4,14,0,8,9,6,3,2,5,15,10,12,1,13,11,7},
   {8,13,3,14,12,6,5,7,11,2,9,1,0,15,10,4},
};
   word8 m[NCIPHER][16];
   for(int i = 0; i < 16; i++){
      char hx[2];
      int curHx;
      hx[0] = argv[1][2*i];
      hx[1] = argv[1][2*i+1];
      curHx = strtol(hx,NULL,16);
      for(int j=0; j< NCIPHER; j++){
         m[j][perms[j][i]] = (unsigned char) curHx;
      }
   }

Random scheduling

The 15 AES are not computed one after the other but mixed. Each AES computation has 40 sub functions to complete.
A random number is generated, it decides which of the 15 AES performs one step ahead.
One step is either a masking/unmasking, Subbytes+addroundkey, Mixcolumns of shiftrow operation.

// One more step.
// Simple state machine. retrieve the current operation for the cipher number r.
// perform the next step / and increment the state for this encryption
inline void oneStep(int r){
   int op = scheduling[r];
   switch(op){
      case 0: // masking
         addroundkey(states[r], mask);
      break;
      case 1: case 5: case 9: case 13: case 17: case 21: case 25: case 29: case 33: case 37: // subbytes
         subAndAdd(states[r], mask, (int) (op/4) );
      break;
      case 2: case 6: case 10: case 14: case 18: case 22: case 26: case 30: case 34: case 38: // SR
         shiftrows(states[r]);
      break;
      case 3: case 7: case 11: case 15: case 19: case 23: case 27: case 31: case 35: // MX
         mixcolumns(states[r]);
      break;
      case 4: case 8: case 12: case 16: case 20: case 24: case 28: case 32: case 36: // masking MX
         addroundkey(states[r],mask3);
      break;
      case 39: // unmasking
         addroundkey(states[r],shiftedmask);
      break;
   }
   scheduling[r]++;
}
// random scheduling
inline void schedule(){
   bool finishJob = false;
   // Job is finished when all the 15 encryption have been performed
   while(!finishJob){
      // Generate a random deciding which of the 15 encryptions performs one step ahead
      int r = rand() % NCIPHER;
      // if the encryption is not finished yet
      while(scheduling[r] == ENDCIPHER) r = rand() % NCIPHER;
      // Check if debug is on going
      struct timespec now, tmstart;
      clock_gettime(CLOCK_REALTIME, &tmstart);
      // one step ahead for encryption number r
      oneStep(r);
      clock_gettime(CLOCK_REALTIME, &now);
      double msec = (double)((now.tv_sec+now.tv_nsec*1e-9) - (double)(tmstart.tv_sec+tmstart.tv_nsec*1e-9));
      msec *= 1000;
      // If more than 500 ms -> a fault is infected onto the whole cipher using the masks
      for(int i = 0; i < 4; i++){
         for(int j = 0; j < 4; j++){
            mask3[i][j] ^= (unsigned char) (msec / 500) * (rand() % 255);
            mask[i][j] ^= (unsigned char) (msec / 500) * (rand() % 255);
         }
      }
      // Job is finished when all the 15 encryption have been performed
      finishJob = true;
      for(int i = 0; i< NCIPHER; i++)
         finishJob &= (scheduling[i] == ENDCIPHER);
   }
}

Masking

The whole AES computation is masked using a boolean masking mechanism. The same mask is used during the whole computation allowing 2nd order attacks. This is a 128 bits mask which is randomly generated at the beginning of the encryption.
– The Subbytes operation is modified to also include the addroundkey operation.
It computes Subbytes(s xor k) in a masked way:
it takes as input (s xor m) -> Subbytes(s xor k) xor m
– Mixcolumns is a linear operation, meaning that Mixcolumns(s xor m) = Mixcolumns(s) xor Mixcolumns(m)
In order to keep the same mask all along the encryption, after the Mixcolumns operation, the state is xored with (Mixcolumns(m) xor m)

void genSboxes(word8 keys[11][4][4], word8 masks[4][4]){
    for(int r = 0; r < 11; r++)
       for(int j = 0; j < 4; j++)
          for(int k = 0; k < 4; k++)
             for(int l = 0; l < 256; l++)
                for(int m = 0; m < 256; m++){
                    if(r == 9) sboxes[r][j][k][l][m] = S[m ^ l ^ keys[r][j][k]] ^ zecrc[j*4+k] ^ m ^ keys[10][j][k];
                    else sboxes[r][j][k][l][m] = S[m ^ l ^ keys[r][j][k]] ^ zecrc[j*4+k] ^ m;
                }
}

Integrity detection

Just before triggering the encryption, the executable computes the xor of all 128-bit words. It gives a ‘crc’ value which is used as an additional input for the Sbox.
This value is used as an infective fault countermeasure. Each Sbox has been computed using a particular value of crc.
So, if the attacker modifies the executable, it will give a wrong value.
In order to have a deterministic value of this crc after the compilation, I patched the executable in order to make it the right value for the crc.

   // compute a crc-like of the launched executable
   std::ifstream input( argv[0], std::ios::binary );
   std::vector buffer((
        std::istreambuf_iterator(input)),
        (std::istreambuf_iterator()));
   for(int i = 0; i< (int) buffer.size(); i++){
      crc[i%16] ^= buffer[i];
   }

inline void subAndAdd(word8 a[4][4], word8 mask[4][4], int r) {
   int i, j;
   for(i = 0; i < 4; i++)
      for(j = 0; j < 4; j++) a[i][j] = sboxes[r][i][j][a[i][j]][mask[i][j]] ^ crc[i*4+j];
}

Debug detection

This countermeasure consists in measuring duration between 2 scheduling and dynamically change the value of the masks inducing an infective fault countermeasures.
This countermeasure has not been efficient at all, mainly because the timings were not tight enough.

Solutions of CTF#2.

Several very different solutions have been implemented. All the solutions started by a first analysis of the binary to understand the overall function. As the executable was not striped, this first analysis was easy.
After that, different approaches have been implemented.

The Hax0r revers0rs did not see what was difficult, launched IDA, wrote a few lines of GDB Python, allowing to dump the Sbox and extract the key.
Less experimented revers0rs provided a more significant effort to reverse the first round sboxes

Interesting solutions came with DFA (Differential Fault Analysis) implementations. DFA consists in injecting errors during a cryptographic operation, in order to obtain faulty results, and to compare faulted outputs with correct ones.The attackers induce faults using GDB during the computation, retrieve the faulty result and then executed their favorite AES DFA implementation. DFA on AES is a very powerful method to extract a key. The best attack allows to extract the full key inducing a single fault on the 8th round of an AES computation.

SCA A few participants implemented a Side Channel Attack. The principle of Side Channel Attacks on Hardware is to retrieve a correlation between the data manipulated by the device and a physical measurement (such as Power Consumption). Here, this is a Software implementation, so, instead of measuring power on a device, the contestants used directly the value manipulated by the software implementation. For instance, Boris Batteux implemented an interesting mixed solution. He first reversed the binary to understand properly how it works. He then wrote an emulator of the initial implementation allowing to modify any part of it without being trapped by the countermeasures. He changed the rand function to make it deterministic. Finally He traced the value of the input of each of the different operations for the correct plaintext (dismissing the dummies). From these traces, he ran CPA on each byte to retrieve the key. Finally, He only had to decrypt the output using this key.

* Solution CTF #3 *

This problem was quite an easy problem for those who know Elliptic Curve Cryptography. Elliptic Curve Cryptography relies on the difficulty of the Discrete Logarithm Problem. It means that given a scalar d and a point on an Elliptic Curve $G$ . The operation $(d,G) \rightarrow dG$ is easy to compute while $(dG, G) \rightarrow d$ is very hard to compute.
This problem is known as the Elliptic Curve Discrete Logarithm Problem. If one is able to break this problem (in a efficient way), the security of Elliptic Curve Cryptography falls down.
In the case of ECDSA, the security of a signature relies on the unpredictability of the random $k$ .
In the $camcoin Implementation, choosing a k which depends on the time of the signature is definitely NOT a good idea. Once k is retrieved, it’s easy to retrieve the private key : $s = (y.k-h)/x$
To retrieve this k, 2 different strategies have been implemented:
– Brute force. Some contestants implemented a brute force attacks targeting the value of k. As it’s a timestamp and the range of time is quite short, this is completely doable.
– Baby Step Giant Step. One contestant claimed having implemented a modified version of Baby Step Giant Step. This algorithm is an algorithm used to compute discrete logarithm. But, on real world Elliptic Curve, it’s clearly not efficient enough to solve the discrete logarithm problem. However, in this algorithm, it’s possible to specify range of search which allows to retrieve the $k$ .
In fact, when we designed the problem we inserted two signatures using the same nonce $k$ on purpose.
Finally:
– Nonce Reuse Attack.
This is a well known attack, if we have two different signatures for which the same value of k is used, we can take advantage of this :
We have:
$x_1 = x_2 = kG$
$y_1 = k^{-1} * (H_1 + s.x_1) = k^{-1} H_1 + s . k^{-1} . x$
$y_2 = k^{-1} * (H_2 + s.x_1) = k^{-1} H_2 + s . k^{-1} . x$
Thus,
$y_1 - y_2 = k^{-1} * (H_1 - H_2)$
$k = (H_1 - H_2) / (y_1 -y_2)$
Once $k$ is retrieved, it’s possible to compute the secret s. Then it’s only a matter of computing a signature as it’s requested in the problem. And that was not the easiest part :). The different encodings were a bit annoying.