Blog

John Adams March 31, 2019

I’ve never seen Hamilton, but I’ve listened to the soundtrack many times. I’ve always pictured John Adams like this:

Which I realize now is not John Adams but Silas Adams from Deadwood, which I’ve been frantically rewatching in anticipation of the upcoming film.

John Adams seems to have looked quite different. Here’s a painting of him at 29:

Writing shouldn't feel this hard. March 31, 2019

A lot has been written about procrastination, perhaps as a way for the author to distract themselves from what they really want to write about. So I’ll be brief.

For me it really comes down to a mismatch between feeling productive and actually being productive. I want to write, but first I want to make sure I have the perfect environment for writing: perfect text editor, perfect-looking website, perfect publish process, and so on.

I do enjoy each of those activities and they have some value, but the amount of time I’m willing to spend doing them is insanely disproportionate with the benefit they bring. But doing them makes me feel productive, so I keep at it. In an attempt to remove obstacles from writing, I end up creating obstacles in my mind.

Blogging. March 31, 2019

I decided to try out blogging again. This time I’ll try not to spend too much time researching static site generators and reorganizing files. At least not after the initial setup.

I’m going to try Hugo instead of Jekyll, which I used on the old blog. The reason is that Hugo is the new hotness. I could say that it’s because it supports org-mode natively, but I’m actually going with ox-hugo (exporting to Markdown) because it’s apparently more powerful.

The getting started guide on Hugo’s page is simple enough. Unfortunately the theme it recommends seems to strip out all markup, which led me to some confusion: I thought something was wrong with my system, but it’s the theme that’s wrong. I know this because I tried Minimal (which turns out not to be that minimal) and the markup was all there.

I guess I’ll eventually have to write my own theme (or more realistically, tweak a nice-looking, functioning theme). This initial experience suggests that it’ll be a bumpy road.

John Dolan's "The War Nerd Iliad" January 8, 2018

I could never get into poetry. Whenever I read a poem I wind up only paying attention to form and style and miss whatever the poem is trying to say. I’m sure there is a way to learn how to read poems, but I’ve never managed to find it.

This of course leaves me out of a lot of literature I’d like to know better, like much of Ancient Greece, The Canterbury Tales, Os Lusíadas, and Shakespeare plays (excluding the ones I had the opportunity to see performed). The closest I can generally get to those works is by reading summaries on Wikipedia.

So I was excited when I heard that John Dolan, who I’d known for his straightforward and insightful analysis of war (usually under his nickname The War Nerd), was writing a prose translation of the Iliad. After a few years of waiting I finally had the chance to read it, and I have a few thoughts.

(Disclaimer: prior to reading this I knew very little about the Iliad, Homer, or indeed Greek mythology in general, besides references in modern popular culture. So a lot of what I say might sound obvious, outrageous or just wrong.)

I was familiar with the story of the Trojan War, mostly via the 2004 movie Troy. I remember a common criticism of the movie was that the gods don’t show up at all, whereas they are prevalent in the original text. I imagined the Iliad would have gods do things like God in the Old Testament: bringing a huge flood, causing ten plagues, or sending out a bear to kill children. But in fact Greek gods are not just occasionally doing things to people, they are the main drivers of action. And they are of course extremely human. Indeed they are just like humans, just a lot more powerful and better connected. As Dolan notes in the introduction, they’re more like the Sopranos than the Christian god.

As powerful as Hektor and Akilles are, they always need help from Zeus, Apollo, Athena or Thetis. And the gods are always meddling in human affairs, often for petty reasons that are nevertheless well-established in the text (often disliking a relative or not receiving enough sacrifices). This differentiates the use of gods here from the artifice of deus ex machina that was used in the tragedies of Classical Greece.

I’m convinced that prose it’s the best way to consume this work. If the Iliad really evolved over centuries as oral tradition, as modern scholars conjecture, and was not conceived as poetry by a single person, as Classical scholar assume, then a prose translation is the most true to its origins, given that people tell stories as prose these days.

Eroom's law May 23, 2017

Eroom’s law, named after Moore’s law spelled backwards, predicts that discovering drugs is becoming more difficult with time; more specifically, the average cost of getting a drug approved by the FDA doubles every 9 years.

If Moore’s law has fueled a large part of the technological revolution of the past 50 years, as well as speculation about what technology still has to bring us, then Eroom’s law reflects a lackluster performance in pharmacology, and predicts a grimmer future for it still.

Why does Eroom’s law hold? One easy explanation that jumps to mind is that of course, discovering things gets harder with time. But that might be incorrect; Moore’s law provides a striking counterexample for that. Fortunately, the paper on Eroom’s law provides a thorough discussion of the possible issues.

Fixed vs. growth mindset and programming April 30, 2017

I recently listened to the Greater than Code episode with Felienne Hermans. A lot of interesting points are raised during the episode, but the one that really stuck with me was the question of how the way programming is perceived in society can drive people (in particular women) from it. Felienne says that the more people perceive skill as coming from innate ability, the more women are driven away from it.

This is of course related to the idea of fixed vs. growth mindset and the work of Carol Dweck. A person with fixed mindset believes that skills are innate, whereas a growth mindset postulates that people learn skills mainly through effort. I still haven’t read the research (or its criticism, for example Scott Alexander’s several long posts on the subject), but I want to offer my own view of it before I’m inevitably proven wrong by hard numbers.

I believe most people have a mixture of both mindsets, in a specific way: they believe in the growth mindset in general, but when it comes to concrete skills, fixed mindset sets in. The belief in growth mindset is exemplified by inspirational phrases and quotes such as “Genius is 10 percent inspiration and 90 percent transpiration”. These phrases come in particularly handy whenever people see others being a lot more skilled than they are at various tasks.

But where growth mindset serves as a coping mechanism, people think with a fixed mindset when choosing which skills to learn, or, more importantly, which skills not to learn. When I hear non-mathematically inclined people talking about their experience with mathematics, they most commonly say things like “I was never good at math” or “math wasn’t for me”. There’s never any mention of the fact that they at some point made an effort to learn the subject, and that at a later point they chose to stop making that effort.

Mathematics might be an extreme example of fixed mindset domination, but it really shows up whenever people decide to stop learning a skill. Math is just something everyone who went to school had an experience with, with many finding it unpalatable. The fact that the fixed mindset effectively excludes certain skills from people’s radar should be of more concern, especially if you’re (as I am) less interested in measuring how well people learn and more interested in how and why people choose different things to learn.

This ties us back to the discussion on Greater than Code. Regardless of how the effectiveness of teaching is influenced by the two mindsets, it seems clear that in order to teach people technical skills like programming, the growth mindset must be encouraged. It’s embarrassing that it’s never been this easy to start programming, and yet the programmer class is still very much a closed, homogeneous club. As Felienne notes, the way forward is removing the barriers that act as gatekeepers but aren’t really fundamental to programming. She talks a bit about the more algorithmic, CS-y aspects of it, which I agree are massively overrated; I’d also point to the emphasis on tools, which generally get a lot more attention than they deserve.

It’s no coincidence that these two aspects relate back to two other male-dominated fields, namely math and (real) engineering. Fortunately programming allows for a lot more variety than that, and I’m always excited to see what programs people come from neither of these backgrounds are writing. I particularly like the parallels Felienne draws between programming and writing (beyond the obvious ones); certainly we want everyone to be able to write, not just the ones who come from literary backgrounds.

C++ doesn't seem too bad these days November 18, 2016

I recently attended a lecture by Bjarne Stroustrup on the future of C++. As someone who hasn’t written any C++ since the days of the C++98 standard, and so still sees the STL as the main feature of the language, I was curious to see what has been developed since then. I knew that it now has lambda expressions and a primitive form of type inference, but not much beyond that. Also, as someone who was exposed to Stroustrup’s book in college, I was curious to see him speaking.

To be honest, I was expecting a cranky, old, cult-like leader figure. But now I don’t have that impression at all. While I still see some hubris there — when asked about possible design failures, he didn’t recognize anything major, besides focusing too much on multiple inheritance —, he seemed to be just a guy who wants to provide the highest possible level of low (or zero) cost abstractions.

As to the future of C++, I’m actually a bit optimistic. It seems that the standards committee has recognized the competition (and here I’m thinking of Rust) and added quite a few nice features to the language. For instance, concepts (which seem to be similar to Rust traits), error types and ranges. And it looks like ASIO is finally making it into the language. It might not seem like much, but for a language that was originally designed as an extension of C (and thus was more-or-less doomed from the beginning) and has accumulated a lot of cruft over the years, it’s understandable that things take a long time.

Stroustrup says that the next challenges for him are concurrency, and complete type and resource safety. I’m excited to see the solutions the committee will come up with, although I’ll probably have to wait a few years for that. After the talk, someone revealed his lack of enthusiasm about the new standards (which are supposed to arrive every 3 years now, according to Stroustrup) by commenting that compiler vendors for embedded devices are very slow at implementing them. Which is a bit of a downer, especially considering C++’s stated goal of being the fastest language around. But at least clang and gcc seem to be very eager to implement the new changes, so there’s no good reason not to try them out if you have access to these compilers.

Lattices and quantum complexity October 5, 2016

A lot of cryptography has been developed based on assumptions that some problem is computationally hard. For lattice-based cryptography, the hardness assumption is on that some problem on lattices—usually the problem is finding the shortest vector in the lattice (SVP).

There’s an obvious caveat in every computational assumption—what if it’s wrong?—, but often there are good reasons to believe it, at least compared to other assumptions. Under this lens, lattices look pretty good: there haven’t been any decent advances in classical algorithms for SVP in decades, whereas one cannot say the same about factoring or discrete log. (Of course, that might just be because fewer people care about lattices now.) Perhaps more importantly, nobody has succeeded at using quantum computers to improve the situation, compared to a complete break for factoring and discrete log.

Why should anyone expect quantum computers to help? As usual in complexity, there’s a lot of conjectures everywhere, but some are more plausible than others, and we can use them to guide us. The first thing is that approximating SVP (also known as GapSVP) within a polynomial factor is known not to be NP-hard, unless NP=coNP. Hence it is a candidate for being in the strange class of problems that are neither in P nor are NP-complete. Other interesting candidates are graph isomorphism as well as our friends factoring and discrete log.

Now if we combine the widely-believed conjecture that NP ≠ coNP with the also widely-believed conjecture that NP ⊄ BQP, then there’s some hope that we can find a quantum algorithm that solves GapSVP. (If those conjectures are false, then an algorithm would be a huge breakthrough in complexity theory, which is considered unlikely).

The hidden subgroup problem (HSP)

Another point is that SVP is connected to the hidden subgroup problem, which has been the source of a few interesting positive results in quantum complexity theory.

Let \(G\) be a finite group and \(H \leq G\). Assume we have oracle access to a function \(f \colon G \to \{0,1\}^* \) which is promised to be constant on each coset of \(H\) (we say that \(f\) *hides* \(H\)). HSP asks us to find \(H\).

HSP is perhaps better seen as a framework. Some problems for which quantum computers seem to help can be seen as instances of it. For example, consider Simon’s problem, where we’re given oracle access to a function \(f \colon \{0,1\}^n \to \{0,1\}^n\) which we are promised to be either 1-to-1 or 2-to-1, and we have to find out which is the case. Note that this is HSP over \(\mathbb{Z}_2^n\).

It has been shown that if you only have a classical computer, you need an exponential number of queries to \(f\), whereas with a quantum computer, you only need a linear number of queries, plus a polynomial-time computation. In other words, Simon’s problem is a case of a separation between classical and quantum query complexity.

More famously, Shor’s algorithm for factoring is very similar to the algorithm to solve Simon’s problem. This is because factoring can be seen to reduce to the problem of period finding, where we are given oracle access to \(f \colon \mathbb{Z} \to \mathbb{Z}\) and we have to find its period. Here we can also see the problem as finding a hidden subgroup, with the caveat that since \(\mathbb{Z}\) is not finite, the problem doesn’t quite fit in HSP. In any case, this is another case where quantum computers since to help, although this doesn’t prove a separation since we don’t know whether factoring is not in P.

As mentioned, the relevance of HSP for lattices is that GapSVP reduces to the dihedral coset problem, which is essentially HSP over the dihedral group \(D_n\). (I say “essentially” because the problem also asks that the solution be done by the “standard method” for HSP: namely, sampling cosets.) Another interesting reduction is from graph isomorphism to HSP over the symmetric group \(S_n\). As noted, graph isomorphism is unlikely to be NP-hard, so it might be that these specific instances of the HSP are not without reach, at least from the quantum point of view.

LWE and quantum reductions

Nevertheless, just like in the classical case, there hasn’t been a lot of progress in quantum algorithms for SVP. In light of that, some people have used the quantum hardness of the problem as an assumption. This enables a reduction from LWE to GapSVP.

The reduction, as shown by Regev, is roughhly as follows. Assume we have an oracle for the LWE problem on with error distribution \(\mathcal{N}(0, n)\). The first point is that it is sufficient to solve the discrete Gaussian sampling (DGS) problem: given a lattice \(L\) and \(r > 0\), sample a point from \(D_{L,r}\), the Gaussian distribution on \(r\), where each point is distributed according to a Gaussian distribution with variance proportional to \(r\). Here, \(r\) is chosen large enough that \(D_{L,r}\) behaves somewhat like a continuous Gaussian distribution, but not too large.

The algorithm proceeds iteratively. We start with a set of \(\text{poly}(n)\) samples from \(D_{L,R}\), where \(R\) is large (say \(R = r 2^n\)), and end with a set of \(\text{poly}(n)\) samples from \(D_{L,r}\). Because \(R\) is very large, it is possible to sample from it efficiently.

First the algorithm uses its \(\text{poly}(n)\) samples to produce a sample from \(D_{L,R'}\), where \(R' < R/2\). This is done by using the LWE oracle and a quantum algorithm. Repeating this enough times, we get \(\text{poly}(n)\) samples from \(D_{L,R'}\). Now we can repeat this procedure \(n\) times to get \(\text{poly}(n)\) samples from \(D_{L,R/(2^n)} = D_{L,r}\).

In a bit more detail, each iteration step is as follows. Assume we have the samples from \(D_{L,R}\). We use the LWE oracle to solve a CVP instance in the dual lattice \(L^*\). Then, we prepare a quantum state representing the (classical) Fourier transform of the lattice. This can be done by using the CVP solution. Finally, we perform a quantum Fourier transform to get a state corresponding to \(D_{L,R'}\).

It is interesting that there’s only a small (although repeated many times) quantum step in the reduction, which yet seems difficult to remove. As Regev notes, giving an oracle for the CVP instance seems completely useless in the classical case: in order to properly use the oracle, we’d have to use information we already know. But with a quantum computer, the oracle enables us to perform a reversible computation on a quantum superposition, which in this case gives us the desired result.

It also makes me wonder whether one could get more interesting reductions by delving more deeply into the quantum world. Of course classical reductions are in general more convincing, but to me that just reflects the fact that we are a lot more uncertain about quantum computers. But quantum computers aren’t all-powerful, and it’d be interesting to see more stuff based on quantum hardness.

Lattice-based cryptography September 2, 2016

I’ve been learning a bit about lattice-based cryptography. Specifically, constructions based on Ring-LWE. These constructions have been getting a bit of attention lately because they can be made quite efficient, and it is hoped that they are post-quantum.

By “post-quantum”, cryptographers mean cryptosystems that would answer the question “If someone built a large, universal quantum computer now, what could we still use?” Note that RSA and Diffie-Hellman are out, since factoring and discrete log are fast in a quantum computer, and the same is true for elliptic cryptography. Note also that they do not mean algorithms that will never be broken by quantum computers, as such a result would be a major breakthrough in complexity theory; one still uses assumptions. The assumption for lattice-based cryptography is that the problem of finding the shortest vector in a lattice (SVP) is hard even for quantum computers.

Ring-LWE is inspired by a problem in machine learning: given a list of input-output pairs from an unknown function $f$, where some random noise is added to the output, find $f$. Ring-LWE is essentially this problem in a ring. It is assumed to be hard because there’s a reduction from SVP to it. Also, there have been efficient implementations of cryptosystems based on the assumption that it is hard, and it looks like there’s still room to improvement. So this area seems to have the very nice qualities of being both interesting for research and close to practical.

\(\Lambda \circ \lambda\)

I was particularly delighted when I saw this paper detailing a framework for doing Ring-LWE in Haskell. I have always been a fan of Haskell, especially for math-y stuff, so I was very curious to know how it works, and, more importantly, how I can work with it.

The first obstacle is that the (very well-written) paper explains how the framework works, but not how one uses it. Which is fair: this is a research paper, not documentation. But it also meant that getting started with it would be a bit of a challenge. Especially as the actual documentation, which I suppose would be enough for a sufficiently experienced Haskell programmer, was also not enough for me.

My main issues seem to derive from the fact that the framework does some quite “advanced” use of the Haskell type system: GADTs, phantom types, kind polymorphism. These might very well be standard now—I haven’t followed the evolution of Haskell over the past few years. Back when I was learning the language, the main obstacle was understanding how monads work, and after that most code seemed more or less straightforward. But using the API requires being versed in these more advanced features. Of course, there is a good reason why they were used. After a quick look at the documentation, one will note that even things like integers are often supposed to be used as types. The purpose here is that checking things like whether an integer divides another can then be done at compile time. Which is nice, but also makes programming a bit more difficult to learn (which of course is partly the point: if you don’t know what you’re doing, your program won’t compile).

I suppose that’s one drawback of pandering to both researchers and practitioners: I am interested in the practice part, but a lot more has been written about the research. At any rate, I’m willing to help with the practice part. So I have been writing a tutorial of sorts on how to use $\Lambda \circ \lambda$ . I’ve been writing it mostly for myself, but I also want it to be useful to others, so I put it up on my GitHub page. It is pretty bare-bones as of now, but check this space in the future.

Attack on GCM in TLS August 18, 2016

An attack on TLS with AES-GCM was recently presented at BlackHat (slides, paper). The attack is relatively simple, so I thought it’d be a good idea for a first post to explain it.

AES-GCM is a block cipher introduced in TLS 1.2, the latest TLS standard. It was proposed after repeated discoveries of attacks on TLS. In light of that, researchers argued that a particularly weak link is in combining encryption and authentication. They thus set out to combine both of them in a single protocol; such protocols are commonly known by the acronym AEAD, for Authenticated Encryption with Associated Data. An AEAD encrypts only part of the data to be sent; however, all of it is authenticated.

As an AEAD, AES-GCM became a popular choice. Currently it is the most common one. It is not without weaknesses, however. During the standardization process, Antoine Joux pointed out that GCM is vulnerable to an attack if IVs are reused. This is the attack used in the BlackHat paper.

GCM

GCM follows the so-called encrypt-then-MAC paradigm: encrypt the plaintext with a block cipher, then use a MAC to authenticate the ciphertext along with the associated data, then send both the ciphertext and the authentication tag.

Say the plaintext is divided into \(n\) blocks: \(P_1,\dots,P_n\), the associated data is divided into \(m\) blocks: \(A_1,\dots,A_m\), and let \(K\) be the secret key. GCM proceeds as follows:

  1. compute \(J_i := J(\text{IV}, i)\) for \(i \in {0,\dots,n}\), where \(J\) is a counter function specified by the protocol, and IV is a value chosen by the application.
  2. compute \(C_i = \text{Enc}_K(J_i) \oplus P_i\) for \(i \in {1,\dots,n}\), where \(\text{Enc}_K\) is the encryption function for a symmetric-key algorithm (here, AES), used with key \(K\).
  3. let \(H=\text{Enc}(0^{128})\).
  4. compute \(X_0=0\), \(X_i = G_H(X_{i−1} \oplus A_i)\) for \(i \in {1,\dots,m}\), where \(G_H\) is defined as multiplication by \(H\) in the Galois field \(\text{GF}(2^{128})\).
  5. compute \(X_{i+m} = G_H(X_{i+m−1} \oplus C_i)\) for \(i \in {1,\dots,n}\).
  6. compute the authentication tag \(T = G_H(X_{m+n} \oplus L) \oplus S\) where \(L=\text{len}(A) | \text{len}(C)\) and \(S = \text{Enc}_K(J_0)\).

The attack

The attack is possible when an IV is reused, so that two authentication tags are computed using the same value \(S\).

First, note that \(T\) is given by computing the following polynomial in \(\text{GF}(2^{128})\):

$$g(X) = A_1 X^{m+n+1} + \dots + A_m X^{n+2} + C_1 X^{n+1} + \dots + C_n X^2 + L X + S$$

at the point \(X = H\).

Now assume that an IV is reused to compute two tags \(T_M,T_{M'}\) for two messages \(M, M'\). We can write \(T_M = g_M(H), T_{M'} = g_{M'}(H)\) with \(g_m(X) = f_m(X) + L_m X + S\), where \(f_m(X)\) is a polynomial on both the ciphertext \(c\) and associated data \(a\) and \(L_m=\text{len}(a) | len(c)\). Note that the attacker can see the ciphertext and associated data, so she knows \(f_M,f_{M′},L_M,L_{M′}\). It does not know \(S\), but \(S\) does *not* change with the message.

Note also that by computing \(g_M(X) + g_{M'}(X)\), we get a polynomial that depends only on values known to the attacker (since the field has characteristic 2, \(S + S = 0\)). Now let \(g'(X) = g_M(X) + g_{M'}(X) + T_M +T_{M'}\). This polynomial is also known to the attacker, and furthermore \(g'(H) = 0\). The attacker can thus factor \(g'(X)\) and try to guess \(H\). In principle \(g'(X)\) could have a large number of roots, since its degree is the number of blocks, but in practice it is usually small.

Once \(H\) is known, it is easy to forge messages. This makes it possible to do a man-in-the-middle attack on TLS. The attacker simply relays messages back and forth, recording the IVs used. Once an IV is reused, she performs the computation described above to learn \(H\).

The attacker can then redirect the client to a static page on the server. She can now easily replace the server’s response by its own messages, as follows. Since the contents of the page are known, she simply XORs the content with the server’s response and XORs the resulting string with her own message. Due to the way GCM is constructed, this will be a valid ciphertext. She then sends that along with an authentication tag (which she can generate on her own, now that she knows \(H\)) to the client.

The attack in the real world

As remarked by Joux, IVs should never be reused, so the fix is easy from the theoretical point of view: just don’t reuse IVs! Nevertheless, the authors have found a number of vulnerable devices. They note that most of their attempts to contact the owners—some of which were pretty big companies, like VISA—were unsuccessful. This feels frustrating to me—does VISA not have people working on Internet security?—, but for them it’s probably just a matter of cost.