Rewrite and tighten ASN1_INTEGER encoding and decoding.

This fixes several issues around ASN1_INTEGER handling. First, invalid
INTEGERs (not allowed in BER or DER) will no longer be accepted by
d2i_ASN1_INTEGER. This aligns with upstream OpenSSL, which became strict
in 6c5b6cb035666d46495ccbe4a4f3d5e3a659cd40, part of OpenSSL 1.1.0.

In addition to matching the standard, this is needed to avoid
round-tripping issues: ASN1_INTEGER uses a sign-and-magnitude
representation, different from the DER two's complement representation.
That means we cannot represent invalid DER INTEGERs. Attempting to do so
messes up some invariants and causes values to not round-trip correctly
when re-encoded. Thanks to Tavis Ormandy for catching this.

Next, this CL tidies the story around invalid ASN1_INTEGERs (non-minimal
and negative zero). Although we will never produce them in parsing, it
is still possible to manually construct them with ASN1_STRING APIs.
Historically (CVE-2016-2108), it was possible to get them out of the
parser, due to a different bug, *and* i2d_ASN1_INTEGER had a memory
error in doing so. That different bug has since been fixed, but we
should still handle them correctly and test this. (To that end, this CL
adds a test we ought to have added importing upstream's
3661bb4e7934668bd99ca777ea8b30eedfafa871 back in
c4eec0c16b02c97a62a95b6a08656c3a9ddb6baa.)

As the two's complement invariants are subtle as it is, I've opted to
just fix the invalid values before encoding. However, invalid
ASN1_INTEGERs still do not quite work right because ASN1_INTEGER_get,
ASN1_INTEGER_cmp, and ASN1_STRING_cmp will all return surprising values
with them. I've left those alone.

Finally, that leads to the zero value. Almost every function believes
the representation of 0 is a "\0" rather than "". However, a
default-constructed INTEGER, like any other string type, is "". Those do
not compare as equal. crypto/asn1 treats ASN1_INTEGER generically as
ASN1_STRING enough that I think changing the other functions to match is
cleaner than changing default-constructed ASN1_INTEGERs. Thus this CL
removes all the special cases around zero.

Update-Note: Invalid INTEGERs will no longer parse, but they already
would not have parsed in OpenSSL. Additionally, zero is now internally
represented as "" rather than "\0".

Bug: 354
Change-Id: Id4d51a18f32afe90fd4df7455b21e0c8bdbc5389
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51632
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
diff --git a/crypto/asn1/a_int.c b/crypto/asn1/a_int.c
index 833ca7c..e75338f 100644
--- a/crypto/asn1/a_int.c
+++ b/crypto/asn1/a_int.c
@@ -59,6 +59,7 @@
 #include <string.h>
 #include <limits.h>
 
+#include <openssl/bytestring.h>
 #include <openssl/err.h>
 #include <openssl/mem.h>
 
@@ -94,111 +95,88 @@
     return ret;
 }
 
-/*
- * This converts an ASN1 INTEGER into its content encoding.
- * The internal representation is an ASN1_STRING whose data is a big endian
- * representation of the value, ignoring the sign. The sign is determined by
- * the type: V_ASN1_INTEGER for positive and V_ASN1_NEG_INTEGER for negative.
- *
- * Positive integers are no problem: they are almost the same as the DER
- * encoding, except if the first byte is >= 0x80 we need to add a zero pad.
- *
- * Negative integers are a bit trickier...
- * The DER representation of negative integers is in 2s complement form.
- * The internal form is converted by complementing each octet and finally
- * adding one to the result. This can be done less messily with a little trick.
- * If the internal form has trailing zeroes then they will become FF by the
- * complement and 0 by the add one (due to carry) so just copy as many trailing
- * zeros to the destination as there are in the source. The carry will add one
- * to the last none zero octet: so complement this octet and add one and finally
- * complement any left over until you get to the start of the string.
- *
- * Padding is a little trickier too. If the first bytes is > 0x80 then we pad
- * with 0xff. However if the first byte is 0x80 and one of the following bytes
- * is non-zero we pad with 0xff. The reason for this distinction is that 0x80
- * followed by optional zeros isn't padded.
- */
-
-int i2c_ASN1_INTEGER(const ASN1_INTEGER *a, unsigned char **pp)
+/* negate_twos_complement negates |len| bytes from |buf| in-place, interpreted
+ * as a signed, big-endian two's complement value. */
+static void negate_twos_complement(uint8_t *buf, size_t len)
 {
-    int pad = 0, ret, i, neg;
-    unsigned char *p, *n, pb = 0;
-
-    if (a == NULL)
-        return (0);
-    neg = a->type & V_ASN1_NEG;
-    if (a->length == 0)
-        ret = 1;
-    else {
-        ret = a->length;
-        i = a->data[0];
-        if (ret == 1 && i == 0)
-            neg = 0;
-        if (!neg && (i > 127)) {
-            pad = 1;
-            pb = 0;
-        } else if (neg) {
-            if (i > 128) {
-                pad = 1;
-                pb = 0xFF;
-            } else if (i == 128) {
-                /*
-                 * Special case: if any other bytes non zero we pad:
-                 * otherwise we don't.
-                 */
-                for (i = 1; i < a->length; i++)
-                    if (a->data[i]) {
-                        pad = 1;
-                        pb = 0xFF;
-                        break;
-                    }
-            }
-        }
-        ret += pad;
+    uint8_t borrow = 0;
+    for (size_t i = len - 1; i < len; i--) {
+        uint8_t t = buf[i];
+        buf[i] = 0u - borrow - t;
+        borrow |= t != 0;
     }
-    if (pp == NULL)
-        return (ret);
-    p = *pp;
-
-    if (pad)
-        *(p++) = pb;
-    if (a->length == 0)
-        *(p++) = 0;
-    else if (!neg)
-        OPENSSL_memcpy(p, a->data, (unsigned int)a->length);
-    else {
-        /* Begin at the end of the encoding */
-        n = a->data + a->length - 1;
-        p += a->length - 1;
-        i = a->length;
-        /* Copy zeros to destination as long as source is zero */
-        while (!*n && i > 1) {
-            *(p--) = 0;
-            n--;
-            i--;
-        }
-        /* Complement and increment next octet */
-        *(p--) = ((*(n--)) ^ 0xff) + 1;
-        i--;
-        /* Complement any octets left */
-        for (; i > 0; i--)
-            *(p--) = *(n--) ^ 0xff;
-    }
-
-    *pp += ret;
-    return (ret);
 }
 
-/* Convert just ASN1 INTEGER content octets to ASN1_INTEGER structure */
+static int is_all_zeros(const uint8_t *in, size_t len) {
+    for (size_t i = 0; i < len; i++) {
+        if (in[i] != 0) {
+            return 0;
+        }
+    }
+    return 1;
+}
 
-ASN1_INTEGER *c2i_ASN1_INTEGER(ASN1_INTEGER **a, const unsigned char **pp,
+int i2c_ASN1_INTEGER(const ASN1_INTEGER *in, unsigned char **outp)
+{
+    if (in == NULL) {
+        return 0;
+    }
+
+    /* |ASN1_INTEGER|s should be represented minimally, but it is possible to
+     * construct invalid ones. Skip leading zeros so this does not produce an
+     * invalid encoding or break invariants. */
+    int start = 0;
+    while (start < in->length && in->data[start] == 0) {
+        start++;
+    }
+
+    int is_negative = (in->type & V_ASN1_NEG) != 0;
+    int pad;
+    if (start >= in->length) {
+        /* Zero is represented as a single byte. */
+        is_negative = 0;
+        pad = 1;
+    } else if (is_negative) {
+        /* 0x80...01 through 0xff...ff have a two's complement of 0x7f...ff
+         * through 0x00...01 and need an extra byte to be negative.
+         * 0x01...00 through 0x80...00 have a two's complement of 0xfe...ff
+         * through 0x80...00 and can be negated as-is. */
+        pad = in->data[start] > 0x80 ||
+              (in->data[start] == 0x80 &&
+               !is_all_zeros(in->data + start + 1, in->length - start - 1));
+    } else {
+        /* If the high bit is set, the signed representation needs an extra
+         * byte to be positive. */
+        pad = (in->data[start] & 0x80) != 0;
+    }
+
+    if (in->length - start > INT_MAX - pad) {
+        OPENSSL_PUT_ERROR(ASN1, ERR_R_OVERFLOW);
+        return 0;
+    }
+    int len = pad + in->length - start;
+    assert(len > 0);
+    if (outp == NULL) {
+        return len;
+    }
+
+    if (pad) {
+        (*outp)[0] = 0;
+    }
+    OPENSSL_memcpy(*outp + pad, in->data + start, in->length - start);
+    if (is_negative) {
+        negate_twos_complement(*outp, len);
+        assert((*outp)[0] >= 0x80);
+    } else {
+        assert((*outp)[0] < 0x80);
+    }
+    *outp += len;
+    return len;
+}
+
+ASN1_INTEGER *c2i_ASN1_INTEGER(ASN1_INTEGER **out, const unsigned char **inp,
                                long len)
 {
-    ASN1_INTEGER *ret = NULL;
-    const unsigned char *p, *pend;
-    unsigned char *to, *s;
-    int i;
-
     /*
      * This function can handle lengths up to INT_MAX - 1, but the rest of the
      * legacy ASN.1 code mixes integer types, so avoid exposing it to
@@ -209,85 +187,69 @@
         return NULL;
     }
 
-    if ((a == NULL) || ((*a) == NULL)) {
-        if ((ret = ASN1_INTEGER_new()) == NULL)
-            return (NULL);
-        ret->type = V_ASN1_INTEGER;
-    } else
-        ret = (*a);
-
-    p = *pp;
-    pend = p + len;
-
-    /*
-     * We must OPENSSL_malloc stuff, even for 0 bytes otherwise it signifies
-     * a missing NULL parameter.
-     */
-    s = (unsigned char *)OPENSSL_malloc((int)len + 1);
-    if (s == NULL) {
-        i = ERR_R_MALLOC_FAILURE;
-        goto err;
+    CBS cbs;
+    CBS_init(&cbs, *inp, (size_t)len);
+    int is_negative;
+    if (!CBS_is_valid_asn1_integer(&cbs, &is_negative)) {
+        OPENSSL_PUT_ERROR(ASN1, ASN1_R_INVALID_INTEGER);
+        return NULL;
     }
-    to = s;
-    if (!len) {
-        /*
-         * Strictly speaking this is an illegal INTEGER but we tolerate it.
-         */
-        ret->type = V_ASN1_INTEGER;
-    } else if (*p & 0x80) {     /* a negative number */
-        ret->type = V_ASN1_NEG_INTEGER;
-        if ((*p == 0xff) && (len != 1)) {
-            p++;
-            len--;
-        }
-        i = len;
-        p += i - 1;
-        to += i - 1;
-        while ((!*p) && i) {
-            *(to--) = 0;
-            i--;
-            p--;
-        }
-        /*
-         * Special case: if all zeros then the number will be of the form FF
-         * followed by n zero bytes: this corresponds to 1 followed by n zero
-         * bytes. We've already written n zeros so we just append an extra
-         * one and set the first byte to a 1. This is treated separately
-         * because it is the only case where the number of bytes is larger
-         * than len.
-         */
-        if (!i) {
-            *s = 1;
-            s[len] = 0;
-            len++;
-        } else {
-            *(to--) = (*(p--) ^ 0xff) + 1;
-            i--;
-            for (; i > 0; i--)
-                *(to--) = *(p--) ^ 0xff;
+
+    ASN1_INTEGER *ret = NULL;
+    if (out == NULL || *out == NULL) {
+        ret = ASN1_INTEGER_new();
+        if (ret == NULL) {
+            return NULL;
         }
     } else {
-        ret->type = V_ASN1_INTEGER;
-        if ((*p == 0) && (len != 1)) {
-            p++;
-            len--;
-        }
-        OPENSSL_memcpy(s, p, (int)len);
+        ret = *out;
     }
 
-    if (ret->data != NULL)
-        OPENSSL_free(ret->data);
-    ret->data = s;
-    ret->length = (int)len;
-    if (a != NULL)
-        (*a) = ret;
-    *pp = pend;
-    return (ret);
+    /* Convert to |ASN1_INTEGER|'s sign-and-magnitude representation. First,
+     * determine the size needed for a minimal result. */
+    if (is_negative) {
+        /* 0xff00...01 through 0xff7f..ff have a two's complement of 0x00ff...ff
+         * through 0x000100...001 and need one leading zero removed. 0x8000...00
+         * through 0xff00...00 have a two's complement of 0x8000...00 through
+         * 0x0100...00 and will be minimally-encoded as-is. */
+        if (CBS_len(&cbs) > 0 && CBS_data(&cbs)[0] == 0xff &&
+            !is_all_zeros(CBS_data(&cbs) + 1, CBS_len(&cbs) - 1)) {
+            CBS_skip(&cbs, 1);
+        }
+    } else {
+        /* Remove the leading zero byte, if any. */
+        if (CBS_len(&cbs) > 0 && CBS_data(&cbs)[0] == 0x00) {
+            CBS_skip(&cbs, 1);
+        }
+    }
+
+    if (!ASN1_STRING_set(ret, CBS_data(&cbs), CBS_len(&cbs))) {
+        goto err;
+    }
+
+    if (is_negative) {
+        ret->type = V_ASN1_NEG_INTEGER;
+        negate_twos_complement(ret->data, ret->length);
+    } else {
+        ret->type = V_ASN1_INTEGER;
+    }
+
+    /* The value should be minimally-encoded. */
+    assert(ret->length == 0 || ret->data[0] != 0);
+    /* Zero is not negative. */
+    assert(!is_negative || ret->length > 0);
+
+    *inp += len;
+    if (out != NULL) {
+        *out = ret;
+    }
+    return ret;
+
  err:
-    OPENSSL_PUT_ERROR(ASN1, i);
-    if ((ret != NULL) && ((a == NULL) || (*a != ret)))
+    if (ret != NULL && (out == NULL || *out != ret)) {
         ASN1_INTEGER_free(ret);
-    return (NULL);
+    }
+    return NULL;
 }
 
 int ASN1_INTEGER_set(ASN1_INTEGER *a, long v)
@@ -334,11 +296,10 @@
     out->type = type;
 
     size_t leading_zeros;
-    for (leading_zeros = 0; leading_zeros < sizeof(uint64_t) - 1;
-         leading_zeros++) {
-        if (out->data[leading_zeros] != 0) {
-            break;
-        }
+    for (leading_zeros = 0; leading_zeros < sizeof(uint64_t); leading_zeros++) {
+      if (out->data[leading_zeros] != 0) {
+        break;
+      }
     }
 
     out->length = sizeof(uint64_t) - leading_zeros;
@@ -410,41 +371,34 @@
                                       int type)
 {
     ASN1_INTEGER *ret;
-    int len, j;
-
-    if (ai == NULL)
+    if (ai == NULL) {
         ret = ASN1_STRING_type_new(type);
-    else
+    } else {
         ret = ai;
+    }
     if (ret == NULL) {
         OPENSSL_PUT_ERROR(ASN1, ASN1_R_NESTED_ASN1_ERROR);
         goto err;
     }
-    if (BN_is_negative(bn) && !BN_is_zero(bn))
+
+    if (BN_is_negative(bn) && !BN_is_zero(bn)) {
         ret->type = type | V_ASN1_NEG;
-    else
+    } else {
         ret->type = type;
-    j = BN_num_bits(bn);
-    len = ((j == 0) ? 0 : ((j / 8) + 1));
-    if (ret->length < len + 4) {
-        unsigned char *new_data = OPENSSL_realloc(ret->data, len + 4);
-        if (!new_data) {
-            OPENSSL_PUT_ERROR(ASN1, ERR_R_MALLOC_FAILURE);
-            goto err;
-        }
-        ret->data = new_data;
     }
-    ret->length = BN_bn2bin(bn, ret->data);
-    /* Correct zero case */
-    if (!ret->length) {
-        ret->data[0] = 0;
-        ret->length = 1;
+
+    int len = BN_num_bytes(bn);
+    if (!ASN1_STRING_set(ret, NULL, len) ||
+        !BN_bn2bin_padded(ret->data, len, bn)) {
+        goto err;
     }
-    return (ret);
+    return ret;
+
  err:
-    if (ret != ai)
+    if (ret != ai) {
         ASN1_STRING_free(ret);
-    return (NULL);
+    }
+    return NULL;
 }
 
 ASN1_INTEGER *BN_to_ASN1_INTEGER(const BIGNUM *bn, ASN1_INTEGER *ai)
diff --git a/crypto/asn1/asn1_test.cc b/crypto/asn1/asn1_test.cc
index 5ba2342..ae3ddc2 100644
--- a/crypto/asn1/asn1_test.cc
+++ b/crypto/asn1/asn1_test.cc
@@ -217,7 +217,7 @@
       // -1
       {{0x02, 0x01, 0xff}, V_ASN1_NEG_INTEGER, {0x01}, "-1"},
       // 0
-      {{0x02, 0x01, 0x00}, V_ASN1_INTEGER, {0x00}, "0"},
+      {{0x02, 0x01, 0x00}, V_ASN1_INTEGER, {}, "0"},
       // 1
       {{0x02, 0x01, 0x01}, V_ASN1_INTEGER, {0x01}, "1"},
       // 127
@@ -360,6 +360,13 @@
       }
     }
 
+    // Default construction should return the zero |ASN1_INTEGER|.
+    if (BN_is_zero(bn.get())) {
+      bssl::UniquePtr<ASN1_INTEGER> by_default(ASN1_INTEGER_new());
+      ASSERT_TRUE(by_default);
+      objs["default"] = std::move(by_default);
+    }
+
     // Test that every |ASN1_INTEGER| constructed behaves as expected.
     for (const auto &pair : objs) {
       // The fields should be as expected.
@@ -392,6 +399,21 @@
         EXPECT_EQ(0, ASN1_STRING_cmp(obj, pair2.second.get()));
       }
     }
+
+    // Although our parsers will never output non-minimal |ASN1_INTEGER|s, it is
+    // possible to construct them manually. They should encode correctly.
+    std::vector<uint8_t> data = t.data;
+    const int kMaxExtraBytes = 5;
+    for (int i = 0; i < kMaxExtraBytes; i++) {
+      data.insert(data.begin(), 0x00);
+      SCOPED_TRACE(Bytes(data));
+
+      bssl::UniquePtr<ASN1_INTEGER> non_minimal(ASN1_STRING_type_new(t.type));
+      ASSERT_TRUE(non_minimal);
+      ASSERT_TRUE(ASN1_STRING_set(non_minimal.get(), data.data(), data.size()));
+
+      TestSerialize(non_minimal.get(), i2d_ASN1_INTEGER, t.der);
+    }
   }
 
   for (size_t i = 0; i < OPENSSL_ARRAY_SIZE(kTests); i++) {
@@ -422,12 +444,41 @@
     }
   }
 
+  std::vector<uint8_t> kInvalidTests[] = {
+      // The empty string is not an integer.
+      {0x02, 0x00},
+      // Integers must be minimally-encoded.
+      {0x02, 0x02, 0x00, 0x00},
+      {0x02, 0x02, 0x00, 0x7f},
+      {0x02, 0x02, 0xff, 0xff},
+      {0x02, 0x02, 0xff, 0x80},
+  };
+  for (const auto &invalid : kInvalidTests) {
+    SCOPED_TRACE(Bytes(invalid));
+
+    const uint8_t *ptr = invalid.data();
+    bssl::UniquePtr<ASN1_INTEGER> integer(
+        d2i_ASN1_INTEGER(nullptr, &ptr, invalid.size()));
+    EXPECT_FALSE(integer);
+  }
+
   // Callers expect |ASN1_INTEGER_get| and |ASN1_ENUMERATED_get| to return zero
   // given NULL.
   EXPECT_EQ(0, ASN1_INTEGER_get(nullptr));
   EXPECT_EQ(0, ASN1_ENUMERATED_get(nullptr));
 }
 
+// Although invalid, a negative zero should encode correctly.
+TEST(ASN1Test, NegativeZero) {
+  bssl::UniquePtr<ASN1_INTEGER> neg_zero(
+      ASN1_STRING_type_new(V_ASN1_NEG_INTEGER));
+  ASSERT_TRUE(neg_zero);
+  EXPECT_EQ(0, ASN1_INTEGER_get(neg_zero.get()));
+
+  static const uint8_t kDER[] = {0x02, 0x01, 0x00};
+  TestSerialize(neg_zero.get(), i2d_ASN1_INTEGER, kDER);
+}
+
 TEST(ASN1Test, SerializeObject) {
   static const uint8_t kDER[] = {0x06, 0x09, 0x2a, 0x86, 0x48, 0x86,
                                  0xf7, 0x0d, 0x01, 0x01, 0x01};
diff --git a/crypto/err/asn1.errordata b/crypto/err/asn1.errordata
index 9490db2..8ba7cf6 100644
--- a/crypto/err/asn1.errordata
+++ b/crypto/err/asn1.errordata
@@ -44,6 +44,7 @@
 ASN1,194,INVALID_BIT_STRING_PADDING
 ASN1,142,INVALID_BMPSTRING
 ASN1,143,INVALID_DIGIT
+ASN1,196,INVALID_INTEGER
 ASN1,144,INVALID_MODIFIER
 ASN1,145,INVALID_NUMBER
 ASN1,146,INVALID_OBJECT_ENCODING
diff --git a/include/openssl/asn1.h b/include/openssl/asn1.h
index 5554a72..d9a9ca9 100644
--- a/include/openssl/asn1.h
+++ b/include/openssl/asn1.h
@@ -605,7 +605,9 @@
 OPENSSL_EXPORT int ASN1_STRING_cmp(const ASN1_STRING *a, const ASN1_STRING *b);
 
 // ASN1_STRING_set sets the contents of |str| to a copy of |len| bytes from
-// |data|. It returns one on success and zero on error.
+// |data|. It returns one on success and zero on error. If |data| is NULL, it
+// updates the length and allocates the buffer as needed, but does not
+// initialize the contents.
 OPENSSL_EXPORT int ASN1_STRING_set(ASN1_STRING *str, const void *data, int len);
 
 // ASN1_STRING_set0 sets the contents of |str| to |len| bytes from |data|. It
@@ -1014,6 +1016,12 @@
 // |V_ASN1_INTEGER| or |V_ASN1_ENUMERATED|, while negative values have a type of
 // |V_ASN1_NEG_INTEGER| or |V_ASN1_NEG_ENUMERATED|. Note this differs from DER's
 // two's complement representation.
+//
+// The data in the |ASN1_STRING| may not have leading zeros. Note this means
+// zero is represented as the empty string. Parsing functions will never return
+// invalid representations. If an invalid input is constructed, the marshaling
+// functions will skip leading zeros, however other functions, such as
+// |ASN1_INTEGER_cmp| or |ASN1_INTEGER_get|, may not return the correct result.
 
 DEFINE_STACK_OF(ASN1_INTEGER)
 
@@ -2040,5 +2048,6 @@
 #define ASN1_R_BAD_TEMPLATE 193
 #define ASN1_R_INVALID_BIT_STRING_PADDING 194
 #define ASN1_R_WRONG_INTEGER_TYPE 195
+#define ASN1_R_INVALID_INTEGER 196
 
 #endif