Use a (mostly) constant-time multi-scalar mult for Trust Tokens.

Multi-scalar multiplication interleaves additions from several inputs, so
an intermediate addition may hit the doubling case (adding a point to
itself), and whether that happens is leaked. That is fine for Trust
Tokens, because the points are independent and the scalars are uniformly
generated and not under attacker control, so the probability of hitting a
double is negligible. (Hitting one is equivalent to accidentally finding
the discrete log of one independent point with respect to another.)
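
As a rough illustration of why sharing the doublings helps, here is a toy
sketch in C. It is not the code in this change: integers modulo a small
prime stand in for curve points, all names (TOY_MODULUS, toy_add, toy_dbl,
toy_mul_batch) are made up for the example, and it is neither windowed nor
constant-time.

```c
#include <assert.h>
#include <stdint.h>

// Toy additive group: integers modulo TOY_MODULUS stand in for curve points.
// Hypothetical illustration only; not an elliptic curve, not constant-time.
#define TOY_MODULUS 1000003u

static uint32_t toy_add(uint32_t a, uint32_t b) {
  return (uint32_t)(((uint64_t)a + b) % TOY_MODULUS);
}

static uint32_t toy_dbl(uint32_t a) { return toy_add(a, a); }

// toy_mul_batch returns s0*p0 + s1*p1 in the toy group using one shared
// chain of doublings (interleaved double-and-add, one bit per step).
// |*num_doublings| is set to the number of doublings performed.
static uint32_t toy_mul_batch(uint32_t p0, uint32_t s0, uint32_t p1,
                              uint32_t s1, unsigned *num_doublings) {
  assert(p0 < TOY_MODULUS && p1 < TOY_MODULUS);
  uint32_t r = 0;  // Identity element of the toy group.
  *num_doublings = 0;
  for (int i = 31; i >= 0; i--) {
    // One doubling per bit position, shared by both scalars.
    r = toy_dbl(r);
    (*num_doublings)++;
    if ((s0 >> i) & 1) {
      r = toy_add(r, p0);
    }
    if ((s1 >> i) & 1) {
      r = toy_add(r, p1);
    }
  }
  return r;
}
```

Calling toy_mul_batch(5, 1234, 7, 5678, &n) computes 5*1234 + 7*5678 with
32 doublings, where two separate 32-bit multiplications would need 64.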

Before:
Did 306 TrustToken-Exp0-Batch1 generate_key operations in 2000725us (152.9 ops/sec)
Did 1428 TrustToken-Exp0-Batch1 begin_issuance operations in 2080325us (686.4 ops/sec)
Did 105 TrustToken-Exp0-Batch1 issue operations in 2070658us (50.7 ops/sec)
Did 88 TrustToken-Exp0-Batch1 finish_issuance operations in 2023864us (43.5 ops/sec)
Did 12283000 TrustToken-Exp0-Batch1 begin_redemption operations in 2000063us (6141306.5 ops/sec)
Did 315 TrustToken-Exp0-Batch1 redeem operations in 2084451us (151.1 ops/sec)
Did 35000 TrustToken-Exp0-Batch1 finish_redemption operations in 2024388us (17289.2 ops/sec)
Did 315 TrustToken-Exp0-Batch10 generate_key operations in 2045481us (154.0 ops/sec)
Did 138 TrustToken-Exp0-Batch10 begin_issuance operations in 2022158us (68.2 ops/sec)
Did 10 TrustToken-Exp0-Batch10 issue operations in 2148640us (4.7 ops/sec)
Did 8 TrustToken-Exp0-Batch10 finish_issuance operations in 2047452us (3.9 ops/sec)
Did 12167000 TrustToken-Exp0-Batch10 begin_redemption operations in 2000118us (6083141.1 ops/sec)
Did 315 TrustToken-Exp0-Batch10 redeem operations in 2084853us (151.1 ops/sec)
Did 35000 TrustToken-Exp0-Batch10 finish_redemption operations in 2014997us (17369.8 ops/sec)

Did 777 TrustToken-Exp1-Batch1 generate_key operations in 2034967us (381.8 ops/sec)
Did 3612 TrustToken-Exp1-Batch1 begin_issuance operations in 2052618us (1759.7 ops/sec)
Did 264 TrustToken-Exp1-Batch1 issue operations in 2084327us (126.7 ops/sec)
Did 220 TrustToken-Exp1-Batch1 finish_issuance operations in 2024603us (108.7 ops/sec)
Did 12691000 TrustToken-Exp1-Batch1 begin_redemption operations in 2000111us (6345147.8 ops/sec)
Did 777 TrustToken-Exp1-Batch1 redeem operations in 2070867us (375.2 ops/sec)
Did 35000 TrustToken-Exp1-Batch1 finish_redemption operations in 2019118us (17334.3 ops/sec)
Did 798 TrustToken-Exp1-Batch10 generate_key operations in 2090816us (381.7 ops/sec)
Did 357 TrustToken-Exp1-Batch10 begin_issuance operations in 2032751us (175.6 ops/sec)
Did 25 TrustToken-Exp1-Batch10 issue operations in 2046353us (12.2 ops/sec)
Did 21 TrustToken-Exp1-Batch10 finish_issuance operations in 2015579us (10.4 ops/sec)
Did 12695000 TrustToken-Exp1-Batch10 begin_redemption operations in 2000126us (6347100.1 ops/sec)
Did 740 TrustToken-Exp1-Batch10 redeem operations in 2032413us (364.1 ops/sec)
Did 35000 TrustToken-Exp1-Batch10 finish_redemption operations in 2011564us (17399.4 ops/sec)

After:
Did 483 TrustToken-Exp0-Batch1 generate_key operations in 2003131us (241.1 ops/sec) [+57.7%]
Did 1449 TrustToken-Exp0-Batch1 begin_issuance operations in 2089317us (693.5 ops/sec) [+1.0%]
Did 176 TrustToken-Exp0-Batch1 issue operations in 2094210us (84.0 ops/sec) [+65.7%]
Did 147 TrustToken-Exp0-Batch1 finish_issuance operations in 2006750us (73.3 ops/sec) [+68.5%]
Did 12217000 TrustToken-Exp0-Batch1 begin_redemption operations in 2000094us (6108212.9 ops/sec) [-0.5%]
Did 483 TrustToken-Exp0-Batch1 redeem operations in 2058132us (234.7 ops/sec) [+55.3%]
Did 35000 TrustToken-Exp0-Batch1 finish_redemption operations in 2026970us (17267.2 ops/sec) [-0.1%]
Did 504 TrustToken-Exp0-Batch10 generate_key operations in 2086204us (241.6 ops/sec) [+56.9%]
Did 144 TrustToken-Exp0-Batch10 begin_issuance operations in 2084670us (69.1 ops/sec) [+1.2%]
Did 16 TrustToken-Exp0-Batch10 issue operations in 2008793us (8.0 ops/sec) [+71.1%]
Did 14 TrustToken-Exp0-Batch10 finish_issuance operations in 2033577us (6.9 ops/sec) [+76.2%]
Did 12026000 TrustToken-Exp0-Batch10 begin_redemption operations in 2000018us (6012945.9 ops/sec) [-1.2%]
Did 483 TrustToken-Exp0-Batch10 redeem operations in 2056418us (234.9 ops/sec) [+55.5%]
Did 35000 TrustToken-Exp0-Batch10 finish_redemption operations in 2046766us (17100.1 ops/sec) [-1.6%]

Did 1239 TrustToken-Exp1-Batch1 generate_key operations in 2060737us (601.2 ops/sec) [+57.5%]
Did 3675 TrustToken-Exp1-Batch1 begin_issuance operations in 2085293us (1762.3 ops/sec) [+0.1%]
Did 420 TrustToken-Exp1-Batch1 issue operations in 2008121us (209.2 ops/sec) [+65.1%]
Did 378 TrustToken-Exp1-Batch1 finish_issuance operations in 2077226us (182.0 ops/sec) [+67.5%]
Did 12783000 TrustToken-Exp1-Batch1 begin_redemption operations in 2000134us (6391071.8 ops/sec) [+0.7%]
Did 1197 TrustToken-Exp1-Batch1 redeem operations in 2056802us (582.0 ops/sec) [+55.1%]
Did 35000 TrustToken-Exp1-Batch1 finish_redemption operations in 2030955us (17233.3 ops/sec) [-0.6%]
Did 1260 TrustToken-Exp1-Batch10 generate_key operations in 2095507us (601.3 ops/sec) [+57.5%]
Did 357 TrustToken-Exp1-Batch10 begin_issuance operations in 2029693us (175.9 ops/sec) [+0.2%]
Did 42 TrustToken-Exp1-Batch10 issue operations in 2050856us (20.5 ops/sec) [+67.6%]
Did 36 TrustToken-Exp1-Batch10 finish_issuance operations in 2027488us (17.8 ops/sec) [+70.4%]
Did 12140000 TrustToken-Exp1-Batch10 begin_redemption operations in 2000070us (6069787.6 ops/sec) [-4.4%]
Did 1210 TrustToken-Exp1-Batch10 redeem operations in 2079615us (581.8 ops/sec) [+59.8%]
Did 34000 TrustToken-Exp1-Batch10 finish_redemption operations in 2052918us (16561.8 ops/sec) [-4.8%]

Change-Id: Idd51d7e1d18f3b94edc4105e68fd50b5f44d87cd
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/41104
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
diff --git a/crypto/fipsmodule/ec/ec.c b/crypto/fipsmodule/ec/ec.c
index 79dc416..5497aac 100644
--- a/crypto/fipsmodule/ec/ec.c
+++ b/crypto/fipsmodule/ec/ec.c
@@ -1048,6 +1048,28 @@
   return 1;
 }
 
+int ec_point_mul_scalar_batch(const EC_GROUP *group, EC_RAW_POINT *r,
+                              const EC_RAW_POINT *p0, const EC_SCALAR *scalar0,
+                              const EC_RAW_POINT *p1, const EC_SCALAR *scalar1,
+                              const EC_RAW_POINT *p2,
+                              const EC_SCALAR *scalar2) {
+  if (group->meth->mul_batch == NULL) {
+    OPENSSL_PUT_ERROR(EC, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED);
+    return 0;
+  }
+
+  group->meth->mul_batch(group, r, p0, scalar0, p1, scalar1, p2, scalar2);
+
+  // Check the result is on the curve to defend against fault attacks or bugs.
+  // This has negligible cost compared to the multiplication.
+  if (!ec_GFp_simple_is_on_curve(group, r)) {
+    OPENSSL_PUT_ERROR(EC, ERR_R_INTERNAL_ERROR);
+    return 0;
+  }
+
+  return 1;
+}
+
 void ec_point_select(const EC_GROUP *group, EC_RAW_POINT *out, BN_ULONG mask,
                       const EC_RAW_POINT *a, const EC_RAW_POINT *b) {
   ec_felem_select(group, &out->X, mask, &a->X, &b->X);
diff --git a/crypto/fipsmodule/ec/ec_montgomery.c b/crypto/fipsmodule/ec/ec_montgomery.c
index de515ab..8e94f30 100644
--- a/crypto/fipsmodule/ec/ec_montgomery.c
+++ b/crypto/fipsmodule/ec/ec_montgomery.c
@@ -507,6 +507,7 @@
   out->dbl = ec_GFp_mont_dbl;
   out->mul = ec_GFp_mont_mul;
   out->mul_base = ec_GFp_mont_mul_base;
+  out->mul_batch = ec_GFp_mont_mul_batch;
   out->mul_public = ec_GFp_mont_mul_public;
   out->felem_mul = ec_GFp_mont_felem_mul;
   out->felem_sqr = ec_GFp_mont_felem_sqr;
diff --git a/crypto/fipsmodule/ec/internal.h b/crypto/fipsmodule/ec/internal.h
index fed81a5..a30af1c 100644
--- a/crypto/fipsmodule/ec/internal.h
+++ b/crypto/fipsmodule/ec/internal.h
@@ -150,6 +150,9 @@
 void ec_scalar_sub(const EC_GROUP *group, EC_SCALAR *r, const EC_SCALAR *a,
                    const EC_SCALAR *b);
 
+// ec_scalar_neg sets |r| to -|a|.
+void ec_scalar_neg(const EC_GROUP *group, EC_SCALAR *r, const EC_SCALAR *a);
+
 // ec_scalar_to_montgomery sets |r| to |a| in Montgomery form.
 void ec_scalar_to_montgomery(const EC_GROUP *group, EC_SCALAR *r,
                              const EC_SCALAR *a);
@@ -308,6 +311,29 @@
 int ec_point_mul_scalar_base(const EC_GROUP *group, EC_RAW_POINT *r,
                              const EC_SCALAR *scalar);
 
+// ec_point_mul_scalar_batch sets |r| to |p0| * |scalar0| + |p1| * |scalar1| +
+// |p2| * |scalar2|. |p2| may be NULL to skip that term.
+//
+// The inputs are treated as secret. However, this function leaks information
+// about whether intermediate computations add a point to itself. Callers must
+// ensure that discrete logs between |p0|, |p1|, and |p2| are uniformly
+// distributed and independent of the scalars, which should be uniformly
+// selected and not under the attacker's control. This ensures the doubling
+// case will occur with negligible probability.
+//
+// This function is not implemented for all curves. Add implementations as
+// needed.
+//
+// TODO(davidben): This function does not use base point tables. For now, it is
+// only used with the generic |EC_GFp_mont_method| implementation which has
+// none. If generalizing to tuned curves, this may be useful. However, we still
+// must double up to the least efficient input, so precomputed tables can only
+// save table setup and allow a wider window size.
+int ec_point_mul_scalar_batch(const EC_GROUP *group, EC_RAW_POINT *r,
+                              const EC_RAW_POINT *p0, const EC_SCALAR *scalar0,
+                              const EC_RAW_POINT *p1, const EC_SCALAR *scalar1,
+                              const EC_RAW_POINT *p2, const EC_SCALAR *scalar2);
+
 // ec_point_mul_scalar_public sets |r| to
 // generator * |g_scalar| + |p| * |p_scalar|. It assumes that the inputs are
 // public so there is no concern about leaking their values through timing.
@@ -399,6 +425,11 @@
   // mul_base sets |r| to |scalar|*generator.
   void (*mul_base)(const EC_GROUP *group, EC_RAW_POINT *r,
                    const EC_SCALAR *scalar);
+  // mul_batch implements |ec_point_mul_scalar_batch|.
+  void (*mul_batch)(const EC_GROUP *group, EC_RAW_POINT *r,
+                    const EC_RAW_POINT *p0, const EC_SCALAR *scalar0,
+                    const EC_RAW_POINT *p1, const EC_SCALAR *scalar1,
+                    const EC_RAW_POINT *p2, const EC_SCALAR *scalar2);
   // mul_public sets |r| to |g_scalar|*generator + |p_scalar|*|p|. It assumes
   // that the inputs are public so there is no concern about leaking their
   // values through timing.
@@ -520,6 +551,10 @@
                      const EC_RAW_POINT *p, const EC_SCALAR *scalar);
 void ec_GFp_mont_mul_base(const EC_GROUP *group, EC_RAW_POINT *r,
                           const EC_SCALAR *scalar);
+void ec_GFp_mont_mul_batch(const EC_GROUP *group, EC_RAW_POINT *r,
+                           const EC_RAW_POINT *p0, const EC_SCALAR *scalar0,
+                           const EC_RAW_POINT *p1, const EC_SCALAR *scalar1,
+                           const EC_RAW_POINT *p2, const EC_SCALAR *scalar2);
 
 // ec_compute_wNAF writes the modified width-(w+1) Non-Adjacent Form (wNAF) of
 // |scalar| to |out|. |out| must have room for |bits| + 1 elements, each of
diff --git a/crypto/fipsmodule/ec/scalar.c b/crypto/fipsmodule/ec/scalar.c
index 3b4a7d8..e4ae9d7 100644
--- a/crypto/fipsmodule/ec/scalar.c
+++ b/crypto/fipsmodule/ec/scalar.c
@@ -106,6 +106,12 @@
   OPENSSL_cleanse(tmp, sizeof(tmp));
 }
 
+void ec_scalar_neg(const EC_GROUP *group, EC_SCALAR *r, const EC_SCALAR *a) {
+  EC_SCALAR zero;
+  OPENSSL_memset(&zero, 0, sizeof(EC_SCALAR));
+  ec_scalar_sub(group, r, &zero, a);
+}
+
 void ec_scalar_select(const EC_GROUP *group, EC_SCALAR *out, BN_ULONG mask,
                       const EC_SCALAR *a, const EC_SCALAR *b) {
   const BIGNUM *order = &group->order;
diff --git a/crypto/fipsmodule/ec/simple_mul.c b/crypto/fipsmodule/ec/simple_mul.c
index 9a43120..02063df 100644
--- a/crypto/fipsmodule/ec/simple_mul.c
+++ b/crypto/fipsmodule/ec/simple_mul.c
@@ -80,3 +80,90 @@
                           const EC_SCALAR *scalar) {
   ec_GFp_mont_mul(group, r, &group->generator->raw, scalar);
 }
+
+static void ec_GFp_mont_batch_precomp(const EC_GROUP *group, EC_RAW_POINT *out,
+                                      size_t num, const EC_RAW_POINT *p) {
+  assert(num > 1);
+  ec_GFp_simple_point_set_to_infinity(group, &out[0]);
+  ec_GFp_simple_point_copy(&out[1], p);
+  for (size_t j = 2; j < num; j++) {
+    if (j & 1) {
+      ec_GFp_mont_add(group, &out[j], &out[1], &out[j - 1]);
+    } else {
+      ec_GFp_mont_dbl(group, &out[j], &out[j / 2]);
+    }
+  }
+}
+
+static void ec_GFp_mont_batch_get_window(const EC_GROUP *group,
+                                         EC_RAW_POINT *out,
+                                         const EC_RAW_POINT precomp[17],
+                                         const EC_SCALAR *scalar, unsigned i) {
+  const size_t width = group->order.width;
+  uint8_t window = bn_is_bit_set_words(scalar->words, width, i + 4) << 5;
+  window |= bn_is_bit_set_words(scalar->words, width, i + 3) << 4;
+  window |= bn_is_bit_set_words(scalar->words, width, i + 2) << 3;
+  window |= bn_is_bit_set_words(scalar->words, width, i + 1) << 2;
+  window |= bn_is_bit_set_words(scalar->words, width, i) << 1;
+  if (i > 0) {
+    window |= bn_is_bit_set_words(scalar->words, width, i - 1);
+  }
+  uint8_t sign, digit;
+  ec_GFp_nistp_recode_scalar_bits(&sign, &digit, window);
+
+  // Select the entry in constant-time.
+  OPENSSL_memset(out, 0, sizeof(EC_RAW_POINT));
+  for (size_t j = 0; j < 17; j++) {
+    BN_ULONG mask = constant_time_eq_w(j, digit);
+    ec_point_select(group, out, mask, &precomp[j], out);
+  }
+
+  // Negate if necessary.
+  EC_FELEM neg_Y;
+  ec_felem_neg(group, &neg_Y, &out->Y);
+  BN_ULONG sign_mask = sign;
+  sign_mask = 0u - sign_mask;
+  ec_felem_select(group, &out->Y, sign_mask, &neg_Y, &out->Y);
+}
+
+void ec_GFp_mont_mul_batch(const EC_GROUP *group, EC_RAW_POINT *r,
+                           const EC_RAW_POINT *p0, const EC_SCALAR *scalar0,
+                           const EC_RAW_POINT *p1, const EC_SCALAR *scalar1,
+                           const EC_RAW_POINT *p2, const EC_SCALAR *scalar2) {
+  EC_RAW_POINT precomp[3][17];
+  ec_GFp_mont_batch_precomp(group, precomp[0], 17, p0);
+  ec_GFp_mont_batch_precomp(group, precomp[1], 17, p1);
+  if (p2 != NULL) {
+    ec_GFp_mont_batch_precomp(group, precomp[2], 17, p2);
+  }
+
+  // Divide the bits of each scalar into windows.
+  unsigned bits = BN_num_bits(&group->order);
+  int r_is_at_infinity = 1;
+  for (unsigned i = bits; i <= bits; i--) {
+    if (!r_is_at_infinity) {
+      ec_GFp_mont_dbl(group, r, r);
+    }
+    if (i % 5 == 0) {
+      EC_RAW_POINT tmp;
+      ec_GFp_mont_batch_get_window(group, &tmp, precomp[0], scalar0, i);
+      if (r_is_at_infinity) {
+        ec_GFp_simple_point_copy(r, &tmp);
+        r_is_at_infinity = 0;
+      } else {
+        ec_GFp_mont_add(group, r, r, &tmp);
+      }
+
+      ec_GFp_mont_batch_get_window(group, &tmp, precomp[1], scalar1, i);
+      ec_GFp_mont_add(group, r, r, &tmp);
+
+      if (p2 != NULL) {
+        ec_GFp_mont_batch_get_window(group, &tmp, precomp[2], scalar2, i);
+        ec_GFp_mont_add(group, r, r, &tmp);
+      }
+    }
+  }
+  if (r_is_at_infinity) {
+    ec_GFp_simple_point_set_to_infinity(group, r);
+  }
+}
diff --git a/crypto/trust_token/pmbtoken.c b/crypto/trust_token/pmbtoken.c
index f1dc39b..0407537 100644
--- a/crypto/trust_token/pmbtoken.c
+++ b/crypto/trust_token/pmbtoken.c
@@ -49,62 +49,17 @@
 
 static const uint8_t kDefaultAdditionalData[32] = {0};
 
-static int mul_twice(const EC_GROUP *group, EC_RAW_POINT *out,
-                     const EC_RAW_POINT *g, const EC_SCALAR *g_scalar,
-                     const EC_RAW_POINT *p, const EC_SCALAR *p_scalar) {
-  EC_RAW_POINT tmp1, tmp2;
-  if (!ec_point_mul_scalar(group, &tmp1, g, g_scalar) ||
-      !ec_point_mul_scalar(group, &tmp2, p, p_scalar)) {
-    return 0;
-  }
-
-  group->meth->add(group, out, &tmp1, &tmp2);
-  return 1;
-}
-
-static int mul_twice_base(const EC_GROUP *group, EC_RAW_POINT *out,
-                          const EC_SCALAR *base_scalar, const EC_RAW_POINT *p,
-                          const EC_SCALAR *p_scalar) {
-  EC_RAW_POINT tmp1, tmp2;
-  if (!ec_point_mul_scalar_base(group, &tmp1, base_scalar) ||
-      !ec_point_mul_scalar(group, &tmp2, p, p_scalar)) {
-    return 0;
-  }
-
-  group->meth->add(group, out, &tmp1, &tmp2);
-  return 1;
-}
-
-// (v0;v1) = p_scalar*(G;p1) + q_scalar*(q0;q1) - r_scalar*(r0;r1)
-static int mul_add_and_sub(const EC_GROUP *group, EC_RAW_POINT *out_v0,
-                           EC_RAW_POINT *out_v1, const EC_RAW_POINT *p1,
-                           const EC_SCALAR *p_scalar, const EC_RAW_POINT *q0,
-                           const EC_RAW_POINT *q1, const EC_SCALAR *q_scalar,
-                           const EC_RAW_POINT *r0, const EC_RAW_POINT *r1,
-                           const EC_SCALAR *r_scalar) {
-  EC_RAW_POINT tmp0, tmp1, v0, v1;
-  if (!mul_twice_base(group, &v0, p_scalar, q0, q_scalar) ||
-      !mul_twice(group, &v1, p1, p_scalar, q1, q_scalar) ||
-      !ec_point_mul_scalar(group, &tmp0, r0, r_scalar) ||
-      !ec_point_mul_scalar(group, &tmp1, r1, r_scalar)) {
-    return 0;
-  }
-  ec_GFp_simple_invert(group, &tmp0);
-  ec_GFp_simple_invert(group, &tmp1);
-  group->meth->add(group, out_v0, &v0, &tmp0);
-  group->meth->add(group, out_v1, &v1, &tmp1);
-  return 1;
-}
-
 // generate_keypair generates a keypair for the PMBTokens construction.
 // |out_x| and |out_y| are set to the secret half of the keypair, while
 // |*out_pub| is set to the public half of the keypair. It returns one on
 // success and zero on failure.
 static int generate_keypair(const PMBTOKEN_METHOD *method, EC_SCALAR *out_x,
                             EC_SCALAR *out_y, EC_RAW_POINT *out_pub) {
+  const EC_RAW_POINT *g = &method->group->generator->raw;
   if (!ec_random_nonzero_scalar(method->group, out_x, kDefaultAdditionalData) ||
       !ec_random_nonzero_scalar(method->group, out_y, kDefaultAdditionalData) ||
-      !mul_twice_base(method->group, out_pub, out_x, &method->h, out_y)) {
+      !ec_point_mul_scalar_batch(method->group, out_pub, g, out_x, &method->h,
+                                 out_y, NULL, NULL)) {
     OPENSSL_PUT_ERROR(TRUST_TOKEN, ERR_R_MALLOC_FAILURE);
     return 0;
   }
@@ -223,9 +178,13 @@
   // Recompute the public key.
   EC_RAW_POINT pub[3];
   EC_AFFINE pub_affine[3];
-  if (!mul_twice_base(group, &pub[0], &key->x0, &method->h, &key->y0) ||
-      !mul_twice_base(group, &pub[1], &key->x1, &method->h, &key->y1) ||
-      !mul_twice_base(group, &pub[2], &key->xs, &method->h, &key->ys) ||
+  const EC_RAW_POINT *g = &group->generator->raw;
+  if (!ec_point_mul_scalar_batch(group, &pub[0], g, &key->x0, &method->h,
+                                 &key->y0, NULL, NULL) ||
+      !ec_point_mul_scalar_batch(group, &pub[1], g, &key->x1, &method->h,
+                                 &key->y1, NULL, NULL) ||
+      !ec_point_mul_scalar_batch(group, &pub[2], g, &key->xs, &method->h,
+                                 &key->ys, NULL, NULL) ||
       !ec_jacobian_to_affine_batch(group, pub_affine, pub, 3)) {
     return 0;
   }
@@ -398,6 +357,7 @@
                          const EC_RAW_POINT *S, const EC_RAW_POINT *W,
                          const EC_RAW_POINT *Ws, uint8_t private_metadata) {
   const EC_GROUP *group = method->group;
+  const EC_RAW_POINT *g = &group->generator->raw;
 
   // We generate a DLEQ proof for the validity token and a DLEQOR2 proof for the
   // private metadata token. To allow amortizing Jacobian-to-affine conversions,
@@ -423,8 +383,10 @@
       !ec_random_nonzero_scalar(group, &ks0, kDefaultAdditionalData) ||
       !ec_random_nonzero_scalar(group, &ks1, kDefaultAdditionalData) ||
       // Ks = ks0*(G;T) + ks1*(H;S)
-      !mul_twice_base(group, &jacobians[idx_Ks0], &ks0, &method->h, &ks1) ||
-      !mul_twice(group, &jacobians[idx_Ks1], T, &ks0, S, &ks1)) {
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_Ks0], g, &ks0,
+                                 &method->h, &ks1, NULL, NULL) ||
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_Ks1], T, &ks0, S, &ks1,
+                                 NULL, NULL)) {
     return 0;
   }
 
@@ -440,20 +402,24 @@
   ec_affine_select(group, &pubo_affine, mask, &priv->pub0, &priv->pub1);
   ec_affine_to_jacobian(group, &pubo, &pubo_affine);
 
-  EC_SCALAR k0, k1, co, uo, vo;
+  EC_SCALAR k0, k1, minus_co, uo, vo;
   if (// k0, k1 <- Zp
       !ec_random_nonzero_scalar(group, &k0, kDefaultAdditionalData) ||
       !ec_random_nonzero_scalar(group, &k1, kDefaultAdditionalData) ||
       // Kb = k0*(G;T) + k1*(H;S)
-      !mul_twice_base(group, &jacobians[idx_Kb0], &k0, &method->h, &k1) ||
-      !mul_twice(group, &jacobians[idx_Kb1], T, &k0, S, &k1) ||
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_Kb0], g, &k0, &method->h,
+                                 &k1, NULL, NULL) ||
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_Kb1], T, &k0, S, &k1,
+                                 NULL, NULL) ||
       // co, uo, vo <- Zp
-      !ec_random_nonzero_scalar(group, &co, kDefaultAdditionalData) ||
+      !ec_random_nonzero_scalar(group, &minus_co, kDefaultAdditionalData) ||
       !ec_random_nonzero_scalar(group, &uo, kDefaultAdditionalData) ||
       !ec_random_nonzero_scalar(group, &vo, kDefaultAdditionalData) ||
       // Ko = uo*(G;T) + vo*(H;S) - co*(pubo;W)
-      !mul_add_and_sub(group, &jacobians[idx_Ko0], &jacobians[idx_Ko1], T, &uo,
-                       &method->h, S, &vo, &pubo, W, &co)) {
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_Ko0], g, &uo, &method->h,
+                                 &vo, &pubo, &minus_co) ||
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_Ko1], T, &uo, S, &vo, W,
+                                 &minus_co)) {
     return 0;
   }
 
@@ -509,7 +475,7 @@
 
   // cb = c - co
   EC_SCALAR cb, ub, vb;
-  ec_scalar_sub(group, &cb, &c, &co);
+  ec_scalar_add(group, &cb, &c, &minus_co);
 
   EC_SCALAR cb_mont;
   ec_scalar_to_montgomery(group, &cb_mont, &cb);
@@ -523,7 +489,8 @@
   ec_scalar_add(group, &vb, &k1, &vb);
 
   // Select c, u, v in constant-time.
-  EC_SCALAR c0, c1, u0, u1, v0, v1;
+  EC_SCALAR co, c0, c1, u0, u1, v0, v1;
+  ec_scalar_neg(group, &co, &minus_co);
   ec_scalar_select(group, &c0, mask, &co, &cb);
   ec_scalar_select(group, &u0, mask, &uo, &ub);
   ec_scalar_select(group, &v0, mask, &vo, &vb);
@@ -550,6 +517,7 @@
                        const EC_RAW_POINT *S, const EC_RAW_POINT *W,
                        const EC_RAW_POINT *Ws) {
   const EC_GROUP *group = method->group;
+  const EC_RAW_POINT *g = &group->generator->raw;
 
   // We verify a DLEQ proof for the validity token and a DLEQOR2 proof for the
   // private metadata token. To allow amortizing Jacobian-to-affine conversions,
@@ -579,10 +547,17 @@
   }
 
   // Ks = us*(G;T) + vs*(H;S) - cs*(pubs;Ws)
+  //
+  // TODO(davidben): The multiplications in this function are public and can be
+  // switched to a public batch multiplication function if we add one.
   EC_RAW_POINT pubs;
   ec_affine_to_jacobian(group, &pubs, &pub->pubs);
-  if (!mul_add_and_sub(group, &jacobians[idx_Ks0], &jacobians[idx_Ks1], T, &us,
-                       &method->h, S, &vs, &pubs, Ws, &cs)) {
+  EC_SCALAR minus_cs;
+  ec_scalar_neg(group, &minus_cs, &cs);
+  if (!ec_point_mul_scalar_batch(group, &jacobians[idx_Ks0], g, &us, &method->h,
+                                 &vs, &pubs, &minus_cs) ||
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_Ks1], T, &us, S, &vs, Ws,
+                                 &minus_cs)) {
     return 0;
   }
 
@@ -601,12 +576,19 @@
   EC_RAW_POINT pub0, pub1;
   ec_affine_to_jacobian(group, &pub0, &pub->pub0);
   ec_affine_to_jacobian(group, &pub1, &pub->pub1);
+  EC_SCALAR minus_c0, minus_c1;
+  ec_scalar_neg(group, &minus_c0, &c0);
+  ec_scalar_neg(group, &minus_c1, &c1);
   if (// K0 = u0*(G;T) + v0*(H;S) - c0*(pub0;W)
-      !mul_add_and_sub(group, &jacobians[idx_K00], &jacobians[idx_K01], T, &u0,
-                       &method->h, S, &v0, &pub0, W, &c0) ||
-      // K1 = u1*(G;T) + v1*(H;S) - c1*(pub1;Ws)
-      !mul_add_and_sub(group, &jacobians[idx_K10], &jacobians[idx_K11], T, &u1,
-                       &method->h, S, &v1, &pub1, W, &c1)) {
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_K00], g, &u0, &method->h,
+                                 &v0, &pub0, &minus_c0) ||
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_K01], T, &u0, S, &v0, W,
+                                 &minus_c0) ||
+      // K1 = u1*(G;T) + v1*(H;S) - c1*(pub1;W)
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_K10], g, &u1, &method->h,
+                                 &v1, &pub1, &minus_c1) ||
+      !ec_point_mul_scalar_batch(group, &jacobians[idx_K11], T, &u1, S, &v1, W,
+                                 &minus_c1)) {
     return 0;
   }
 
@@ -682,8 +664,10 @@
     EC_AFFINE W_affine[2];
     CBB child;
     if (!method->hash_s(group, &Sp, &Tp_affine, s) ||
-        !mul_twice(group, &W[0], &Tp, &xb, &Sp, &yb) ||
-        !mul_twice(group, &W[1], &Tp, &key->xs, &Sp, &key->ys) ||
+        !ec_point_mul_scalar_batch(group, &W[0], &Tp, &xb, &Sp, &yb, NULL,
+                                   NULL) ||
+        !ec_point_mul_scalar_batch(group, &W[1], &Tp, &key->xs, &Sp, &key->ys,
+                                   NULL, NULL) ||
         // This call to |ec_jacobian_to_affine_batch| could be merged with the
         // one in |dleq_generate|, but we expect to implement the batched DLEQOR
         // proofs (see figure 15 of the PMBTokens paper), which would require a
@@ -842,15 +826,18 @@
   EC_RAW_POINT S_jacobian, calculated;
   // Check the validity of the token.
   ec_affine_to_jacobian(group, &S_jacobian, &S);
-  if (!mul_twice(group, &calculated, &T, &key->xs, &S_jacobian, &key->ys) ||
+  if (!ec_point_mul_scalar_batch(group, &calculated, &T, &key->xs, &S_jacobian,
+                                 &key->ys, NULL, NULL) ||
       !ec_affine_jacobian_equal(group, &Ws, &calculated)) {
     OPENSSL_PUT_ERROR(TRUST_TOKEN, TRUST_TOKEN_R_BAD_VALIDITY_CHECK);
     return 0;
   }
 
   EC_RAW_POINT W0, W1;
-  if (!mul_twice(group, &W0, &T, &key->x0, &S_jacobian, &key->y0) ||
-      !mul_twice(group, &W1, &T, &key->x1, &S_jacobian, &key->y1)) {
+  if (!ec_point_mul_scalar_batch(group, &W0, &T, &key->x0, &S_jacobian,
+                                 &key->y0, NULL, NULL) ||
+      !ec_point_mul_scalar_batch(group, &W1, &T, &key->x1, &S_jacobian,
+                                 &key->y1, NULL, NULL)) {
     return 0;
   }