author: David Benjamin <email@example.com>, Sun Oct 20 10:53:17 2019 -0400
committer: CQ bot account: firstname.lastname@example.org <email@example.com>, Fri Jan 03 16:41:59 2020 +0000
Replace aes_nohw with a bitsliced implementation.

aes_nohw is currently one of several variable-time table-based implementations in C or assembly (armv4, x86, and x86_64). Replace all of these with a C bitsliced implementation, with 32-bit, 64-bit, and 128-bit (SSE2) variants. This is based on the algorithms described in:

https://bearssl.org/constanttime.html#aes
https://eprint.iacr.org/2009/129.pdf
https://eprint.iacr.org/2009/191.pdf

This makes our AES implementation constant-time in all build configurations!

There were far too many benchmarks to put in the commit message. Instead, please refer to this fancy spreadsheet:

https://docs.google.com/spreadsheets/d/1wDCzfkPl7brfjWJKq55awQjwCPhOYI8O7zSQZuEc2Xg/edit?usp=sharing

Parallel modes on x86 and x86_64 do fine due to the SSE2 code. AES-GCM actually gets faster. The 64-bit (4x) bitsliced implementation is less effective at speeding parallel modes but still helps. The 32-bit (2x) bitsliced implementation even less.

Non-parallel modes, sadly, take a *dramatic* performance hit. I tried a constant-time table lookup for comparison, but bitslicing was still better. This implementation performs comparably to the table in BearSSL's documentation, which suggests I didn't do anything obviously wrong. (Note BearSSL's table for 'ct' corresponds to a 32-bit bitsliced implementation compiled for 64-bit. Compiling this implementation for 64-bit matches, but compiling it for 32-bit seems to be considerably slower.)

Assumptions that may make this palatable:

- AES-GCM is by far the most important AES mode, and we perform okay with it. Modern things aren't built out of CBC.

- A nontrivial chunk of Chrome users on Windows don't have SSSE3 and would be affected by this change. They would get the SSE2 version which performs well for AES-GCM *and* is constant-time.

- ARM devices are primarily mobile which cycles hardware much faster. Chrome for Android has required NEON for several years now, so it would not run this code. (Aside from https://crbug.com/341598.)

- aarch64 mandates NEON, so it would not run this code.

- QUIC packet number encryption does use a one-off block operation, but only once per packet.

- Arguably this is undoing a performance gain that we never earned. That said, it was a dramatic performance gain in places.

As an alternative, we could just check in the SSE2 version and drop the x86 and x86_64 table-based assembly, but this still leaves the generic code with cache-timing side channels.

Change-Id: I0f4b4467a49790509503c529d7c0940318096a00
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/39206
Commit-Queue: Adam Langley <firstname.lastname@example.org>
Reviewed-by: Adam Langley <email@example.com>
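The commit message compares two ways of avoiding secret-dependent memory accesses: a constant-time table lookup and bitslicing. The following is a minimal sketch of both ideas, not the code from this change. All names (`ct_eq_mask`, `ct_lookup`, `bitslice_pack`, `bitslice_unpack`, `bitslice_xtime`) are hypothetical, and the bitsliced operation shown is only multiplication by x in GF(2^8) with the AES reduction polynomial (0x11b), chosen because it is short; a real bitsliced AES also needs the S-box expressed as a boolean circuit. The sketch assumes the compiler does not turn the masking arithmetic back into branches, which a production implementation would need to verify.

```c
#include <stdint.h>

/* Number of independent bytes processed at once: one per bit of a uint32_t. */
#define LANES 32

/* Returns 0xff if a == b and 0x00 otherwise, without branching on the inputs. */
static uint8_t ct_eq_mask(uint32_t a, uint32_t b) {
  uint32_t x = a ^ b;                     /* zero iff a == b */
  uint32_t differ = (x | (0u - x)) >> 31; /* 1 if a != b, else 0 */
  return (uint8_t)(differ - 1u);
}

/* Constant-time table lookup: touches every entry, so the memory access
 * pattern does not depend on the secret index. Costs 256 reads per lookup. */
uint8_t ct_lookup(const uint8_t table[256], uint8_t secret_index) {
  uint8_t result = 0;
  for (uint32_t i = 0; i < 256; i++) {
    result |= (uint8_t)(table[i] & ct_eq_mask(i, secret_index));
  }
  return result;
}

/* Bitsliced layout: bit i of slices[j] holds bit j of bytes[i]. */
void bitslice_pack(uint32_t slices[8], const uint8_t bytes[LANES]) {
  for (int j = 0; j < 8; j++) {
    uint32_t w = 0;
    for (int i = 0; i < LANES; i++) {
      w |= (uint32_t)((bytes[i] >> j) & 1u) << i;
    }
    slices[j] = w;
  }
}

/* Inverse of bitslice_pack. */
void bitslice_unpack(uint8_t bytes[LANES], const uint32_t slices[8]) {
  for (int i = 0; i < LANES; i++) {
    uint8_t b = 0;
    for (int j = 0; j < 8; j++) {
      b |= (uint8_t)(((slices[j] >> i) & 1u) << j);
    }
    bytes[i] = b;
  }
}

/* Multiplies all LANES bytes by x in GF(2^8) modulo 0x11b at once.
 * Shifting left moves each bit plane up by one; the old top plane is
 * XORed into planes 0, 1, 3 and 4 (the set bits of 0x1b).
 * out and in must not alias. */
void bitslice_xtime(uint32_t out[8], const uint32_t in[8]) {
  uint32_t hi = in[7];
  out[0] = hi;
  out[1] = in[0] ^ hi;
  out[2] = in[1];
  out[3] = in[2] ^ hi;
  out[4] = in[3] ^ hi;
  out[5] = in[4];
  out[6] = in[5];
  out[7] = in[6];
}
```

In this sketch the lookup pays for 256 reads per byte, while the bitsliced routine uses a handful of XORs to process 32 bytes, but only if 32 bytes are available to process together. That is consistent with the message's observation that parallel modes fare much better than non-parallel ones.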
BoringSSL is a fork of OpenSSL that is designed to meet Google's needs.
Although BoringSSL is an open source project, it is not intended for general use, as OpenSSL is. We don't recommend that third parties depend upon it. Doing so is likely to be frustrating because there are no guarantees of API or ABI stability.
Programs ship their own copies of BoringSSL when they use it and we update everything as needed when deciding to make API changes. This allows us to mostly avoid compromises in the name of compatibility. It works for us, but it may not work for you.
BoringSSL arose because Google used OpenSSL for many years in various ways and, over time, built up a large number of patches that were maintained while tracking upstream OpenSSL. As Google's product portfolio became more complex, more copies of OpenSSL sprang up and the effort involved in maintaining all these patches in multiple places was growing steadily.
Currently BoringSSL is the SSL library in Chrome/Chromium, Android (but it's not part of the NDK) and a number of other apps/programs.
There are other files in this directory which might be helpful: