- djm@cvs.openbsd.org 2013/11/21 00:45:44
     [Makefile.in PROTOCOL PROTOCOL.chacha20poly1305 authfile.c chacha.c]
     [chacha.h cipher-chachapoly.c cipher-chachapoly.h cipher.c cipher.h]
     [dh.c myproposal.h packet.c poly1305.c poly1305.h servconf.c ssh.1]
     [ssh.c ssh_config.5 sshd_config.5] Add a new protocol 2 transport
     cipher "chacha20-poly1305@openssh.com" that combines Daniel
     Bernstein's ChaCha20 stream cipher and Poly1305 MAC to build an
     authenticated encryption mode.

     Inspired by and similar to Adam Langley's proposal for TLS:
     http://tools.ietf.org/html/draft-agl-tls-chacha20poly1305-03
     but differs in layout used for the MAC calculation and the use of a
     second ChaCha20 instance to separately encrypt packet lengths.
     Details are in the PROTOCOL.chacha20poly1305 file.

     Feedback markus@, naddy@; manpage bits Loganden Velvindron @ AfriNIC
     ok markus@ naddy@
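
     A minimal sketch of the per-packet key and counter schedule
     described in PROTOCOL.chacha20poly1305, assuming a plain reference
     ChaCha20 with a 64 bit nonce and 64 bit block counter. The names
     sketch_block, sketch_xor and sketch_seal are hypothetical and are
     not part of this commit; the Poly1305 tag calculation is left out,
     and the code is for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))
#define QR(a, b, c, d) \
	a += b; d ^= a; d = ROTL32(d, 16); \
	c += d; b ^= c; b = ROTL32(b, 12); \
	a += b; d ^= a; d = ROTL32(d, 8); \
	c += d; b ^= c; b = ROTL32(b, 7);

static uint32_t
le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* One 64-byte ChaCha20 block: 256-bit key, 64-bit counter and nonce. */
static void
sketch_block(const uint8_t key[32], uint64_t ctr, const uint8_t nonce[8],
    uint8_t out[64])
{
	static const char sigma[] = "expand 32-byte k";
	uint32_t in[16], x[16], v;
	int i;

	for (i = 0; i < 4; i++)
		in[i] = le32((const uint8_t *)sigma + 4 * i);
	for (i = 0; i < 8; i++)
		in[4 + i] = le32(key + 4 * i);
	in[12] = (uint32_t)ctr;		/* counter, low word first */
	in[13] = (uint32_t)(ctr >> 32);
	in[14] = le32(nonce);
	in[15] = le32(nonce + 4);
	memcpy(x, in, sizeof(x));
	for (i = 0; i < 10; i++) {	/* 10 double rounds = 20 rounds */
		QR(x[0], x[4], x[8], x[12])
		QR(x[1], x[5], x[9], x[13])
		QR(x[2], x[6], x[10], x[14])
		QR(x[3], x[7], x[11], x[15])
		QR(x[0], x[5], x[10], x[15])
		QR(x[1], x[6], x[11], x[12])
		QR(x[2], x[7], x[8], x[13])
		QR(x[3], x[4], x[9], x[14])
	}
	for (i = 0; i < 16; i++) {
		v = x[i] + in[i];
		out[4 * i] = (uint8_t)v;
		out[4 * i + 1] = (uint8_t)(v >> 8);
		out[4 * i + 2] = (uint8_t)(v >> 16);
		out[4 * i + 3] = (uint8_t)(v >> 24);
	}
}

/* XOR len bytes with the (key, seqnr) keystream, starting at block ctr. */
void
sketch_xor(const uint8_t key[32], uint32_t seqnr, uint64_t ctr,
    const uint8_t *in, uint8_t *out, size_t len)
{
	uint8_t nonce[8] = { 0 }, block[64] = { 0 };
	size_t i;

	assert(in != NULL && out != NULL);
	/* Nonce: sequence number as a big-endian (SSH wire format) uint64. */
	nonce[4] = (uint8_t)(seqnr >> 24);
	nonce[5] = (uint8_t)(seqnr >> 16);
	nonce[6] = (uint8_t)(seqnr >> 8);
	nonce[7] = (uint8_t)seqnr;
	for (i = 0; i < len; i++) {
		if (i % 64 == 0)
			sketch_block(key, ctr + i / 64, nonce, block);
		out[i] = in[i] ^ block[i % 64];
	}
}

/* Per-packet schedule; pkt is the 4 byte length followed by the payload. */
void
sketch_seal(const uint8_t k1[32], const uint8_t k2[32], uint32_t seqnr,
    const uint8_t *pkt, size_t payload_len, uint8_t *out,
    uint8_t poly_key[32])
{
	static const uint8_t zero[32];

	sketch_xor(k2, seqnr, 0, zero, poly_key, 32);	/* Poly1305 key */
	sketch_xor(k1, seqnr, 0, pkt, out, 4);		/* packet length */
	sketch_xor(k2, seqnr, 1, pkt + 4, out + 4, payload_len);
	/* A real sender would now append Poly1305(poly_key, out). */
}
```

     In this sketch the length is encrypted under K_1 at block counter
     0 and the payload under K_2 from block counter 1, so block 0 of
     the K_2 keystream is only ever used for the Poly1305 key.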
diff --git a/ChangeLog b/ChangeLog
index cb4dae3..28186e8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -19,6 +19,23 @@
      [canohost.c clientloop.c match.c readconf.c sftp.c]
      unsigned casts for ctype macros where neccessary
      ok guenther millert markus
+   - djm@cvs.openbsd.org 2013/11/21 00:45:44
+     [Makefile.in PROTOCOL PROTOCOL.chacha20poly1305 authfile.c chacha.c]
+     [chacha.h cipher-chachapoly.c cipher-chachapoly.h cipher.c cipher.h]
+     [dh.c myproposal.h packet.c poly1305.c poly1305.h servconf.c ssh.1]
+     [ssh.c ssh_config.5 sshd_config.5] Add a new protocol 2 transport
+     cipher "chacha20-poly1305@openssh.com" that combines Daniel
+     Bernstein's ChaCha20 stream cipher and Poly1305 MAC to build an
+     authenticated encryption mode.
+     
+     Inspired by and similar to Adam Langley's proposal for TLS:
+     http://tools.ietf.org/html/draft-agl-tls-chacha20poly1305-03
+     but differs in layout used for the MAC calculation and the use of a
+     second ChaCha20 instance to separately encrypt packet lengths.
+     Details are in the PROTOCOL.chacha20poly1305 file.
+     
+     Feedback markus@, naddy@; manpage bits Loganden Velvindron @ AfriNIC
+     ok markus@ naddy@
 
 20131110
  - (dtucker) [regress/keytype.sh] Populate ECDSA key types to be tested by
diff --git a/Makefile.in b/Makefile.in
index e1c68c0..91f39d4 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -1,4 +1,4 @@
-# $Id: Makefile.in,v 1.344 2013/11/08 13:17:41 dtucker Exp $
+# $Id: Makefile.in,v 1.345 2013/11/21 03:12:23 djm Exp $
 
 # uncomment if you run a non bourne compatable shell. Ie. csh
 #SHELL = @SH@
@@ -74,7 +74,7 @@
 	kexdh.o kexgex.o kexdhc.o kexgexc.o bufec.o kexecdh.o kexecdhc.o \
 	msg.o progressmeter.o dns.o entropy.o gss-genr.o umac.o umac128.o \
 	jpake.o schnorr.o ssh-pkcs11.o krl.o smult_curve25519_ref.o \
-	kexc25519.o kexc25519c.o
+	kexc25519.o kexc25519c.o poly1305.o chacha.o cipher-chachapoly.o
 
 SSHOBJS= ssh.o readconf.o clientloop.o sshtty.o \
 	sshconnect.o sshconnect1.o sshconnect2.o mux.o \
diff --git a/PROTOCOL b/PROTOCOL
index 0363314..cace97f 100644
--- a/PROTOCOL
+++ b/PROTOCOL
@@ -91,6 +91,11 @@
 the exchanged MAC algorithms are ignored and there doesn't have to be
 a matching MAC.
 
+1.7 transport: chacha20-poly1305@openssh.com authenticated encryption
+
+OpenSSH supports authenticated encryption using ChaCha20 and Poly1305
+as described in PROTOCOL.chacha20poly1305.
+
 2. Connection protocol changes
 
 2.1. connection: Channel write close extension "eow@openssh.com"
@@ -345,4 +350,4 @@
 This extension is advertised in the SSH_FXP_VERSION hello with version
 "1".
 
-$OpenBSD: PROTOCOL,v 1.21 2013/10/17 00:30:13 djm Exp $
+$OpenBSD: PROTOCOL,v 1.22 2013/11/21 00:45:43 djm Exp $
diff --git a/PROTOCOL.chacha20poly1305 b/PROTOCOL.chacha20poly1305
new file mode 100644
index 0000000..c4b723a
--- /dev/null
+++ b/PROTOCOL.chacha20poly1305
@@ -0,0 +1,105 @@
+This document describes the chacha20-poly1305@openssh.com authenticated
+encryption cipher supported by OpenSSH.
+
+Background
+----------
+
+ChaCha20 is a stream cipher designed by Daniel Bernstein and described
+in [1]. It operates by permuting 128 fixed bits, 128 or 256 bits of key,
+a 64 bit nonce and a 64 bit counter into 64 bytes of output. This output
+is used as a keystream, with any unused bytes simply discarded.
+
+Poly1305[2], also by Daniel Bernstein, is a one-time Carter-Wegman MAC
+that computes a 128 bit integrity tag given a message and a single-use
+256 bit secret key.
+
+The chacha20-poly1305@openssh.com cipher combines these two primitives
+into an authenticated encryption mode. The construction used is based
+on that proposed for TLS by Adam Langley in [3], but differs in the
+layout of data passed to the MAC and in the addition of encryption of
+the packet lengths.
+
+Negotiation
+-----------
+
+The chacha20-poly1305@openssh.com cipher offers both encryption and
+authentication. As such, no separate MAC is required. If the
+chacha20-poly1305@openssh.com cipher is selected in key exchange,
+the offered MAC algorithms are ignored and no MAC is required to be
+negotiated.
+
+Detailed Construction
+---------------------
+
+The chacha20-poly1305@openssh.com cipher requires 512 bits of key
+material as output from the SSH key exchange. The first 256 bits form
+K_2 and the last 256 bits form K_1, used by two instances of chacha20.
+
+The instance keyed by K_1 is a stream cipher that is used only
+to encrypt the 4 byte packet length field. The second instance,
+keyed by K_2, is used in conjunction with poly1305 to build an AEAD
+(Authenticated Encryption with Associated Data) that is used to encrypt
+and authenticate the entire packet.
+
+Two separate cipher instances are used here so as to keep the packet
+lengths confidential but not create an oracle for the packet payload
+cipher by decrypting and using the packet length prior to checking
+the MAC. By using an independently-keyed cipher instance to encrypt the
+length, an active attacker seeking to exploit the packet input handling
+as a decryption oracle can learn nothing about the payload contents or
+its MAC (assuming key derivation, ChaCha20 and Poly1305 are secure).
+
+The AEAD is constructed as follows: for each packet, generate a Poly1305
+key by taking the first 256 bits of ChaCha20 stream output generated
+using K_2, an IV consisting of the packet sequence number encoded as a
+uint64 under the SSH wire encoding rules and a ChaCha20 block counter of
+zero. The K_2 ChaCha20 block counter is then set to the little-endian
+encoding of 1 (i.e. {1, 0, 0, 0, 0, 0, 0, 0}) and this instance is used
+for encryption of the packet payload.
+
+Packet Handling
+---------------
+
+When receiving a packet, the length must be decrypted first. When 4
+bytes of ciphertext length have been received, they may be decrypted
+using the K_1 key, a nonce consisting of the packet sequence number
+encoded as a uint64 under the usual SSH wire encoding and a zero block
+counter to obtain the plaintext length.
+
+Once the entire packet has been received, the MAC MUST be checked
+before decryption. A per-packet Poly1305 key is generated as described
+above and the MAC tag calculated using Poly1305 with this key over the
+ciphertext of the packet length and the payload together. The calculated
+MAC is then compared in constant time with the one appended to the
+packet and the packet decrypted using ChaCha20 as described above (with
+K_2, the packet sequence number as nonce and a starting block counter of
+1).
+
+To send a packet, first encode the 4 byte length and encrypt it using
+K_1. Encrypt the packet payload (using K_2) and append it to the
+encrypted length. Finally, calculate a MAC tag and append it.
+
+Rekeying
+--------
+
+ChaCha20 must never reuse a {key, nonce} for encryption nor may it be
+used to encrypt more than 2^70 bytes under the same {key, nonce}. The
+SSH Transport protocol (RFC4253) recommends a far more conservative
+rekeying every 1GB of data sent or received. If this recommendation
+is followed, then chacha20-poly1305@openssh.com requires no special
+handling in this area.
+
+References
+----------
+
+[1] "ChaCha, a variant of Salsa20", Daniel Bernstein
+    http://cr.yp.to/chacha/chacha-20080128.pdf
+
+[2] "The Poly1305-AES message-authentication code", Daniel Bernstein
+    http://cr.yp.to/mac/poly1305-20050329.pdf
+
+[3] "ChaCha20 and Poly1305 based Cipher Suites for TLS", Adam Langley
+    http://tools.ietf.org/html/draft-agl-tls-chacha20poly1305-03
+
+$OpenBSD: PROTOCOL.chacha20poly1305,v 1.1 2013/11/21 00:45:43 djm Exp $
+
diff --git a/authfile.c b/authfile.c
index 63ae16b..d0c1089 100644
--- a/authfile.c
+++ b/authfile.c
@@ -1,4 +1,4 @@
-/* $OpenBSD: authfile.c,v 1.97 2013/05/17 00:13:13 djm Exp $ */
+/* $OpenBSD: authfile.c,v 1.98 2013/11/21 00:45:43 djm Exp $ */
 /*
  * Author: Tatu Ylonen <ylo@cs.hut.fi>
  * Copyright (c) 1995 Tatu Ylonen <ylo@cs.hut.fi>, Espoo, Finland
@@ -149,7 +149,7 @@
 
 	cipher_set_key_string(&ciphercontext, cipher, passphrase,
 	    CIPHER_ENCRYPT);
-	cipher_crypt(&ciphercontext, cp,
+	cipher_crypt(&ciphercontext, 0, cp,
 	    buffer_ptr(&buffer), buffer_len(&buffer), 0, 0);
 	cipher_cleanup(&ciphercontext);
 	memset(&ciphercontext, 0, sizeof(ciphercontext));
@@ -473,7 +473,7 @@
 	/* Rest of the buffer is encrypted.  Decrypt it using the passphrase. */
 	cipher_set_key_string(&ciphercontext, cipher, passphrase,
 	    CIPHER_DECRYPT);
-	cipher_crypt(&ciphercontext, cp,
+	cipher_crypt(&ciphercontext, 0, cp,
 	    buffer_ptr(&copy), buffer_len(&copy), 0, 0);
 	cipher_cleanup(&ciphercontext);
 	memset(&ciphercontext, 0, sizeof(ciphercontext));
diff --git a/chacha.c b/chacha.c
new file mode 100644
index 0000000..a84c25e
--- /dev/null
+++ b/chacha.c
@@ -0,0 +1,219 @@
+/*
+chacha-merged.c version 20080118
+D. J. Bernstein
+Public domain.
+*/
+
+#include "includes.h"
+
+#include "chacha.h"
+
+/* $OpenBSD: chacha.c,v 1.1 2013/11/21 00:45:44 djm Exp $ */
+
+typedef unsigned char u8;
+typedef unsigned int u32;
+
+typedef struct chacha_ctx chacha_ctx;
+
+#define U8C(v) (v##U)
+#define U32C(v) (v##U)
+
+#define U8V(v) ((u8)(v) & U8C(0xFF))
+#define U32V(v) ((u32)(v) & U32C(0xFFFFFFFF))
+
+#define ROTL32(v, n) \
+  (U32V((v) << (n)) | ((v) >> (32 - (n))))
+
+#define U8TO32_LITTLE(p) \
+  (((u32)((p)[0])      ) | \
+   ((u32)((p)[1]) <<  8) | \
+   ((u32)((p)[2]) << 16) | \
+   ((u32)((p)[3]) << 24))
+
+#define U32TO8_LITTLE(p, v) \
+  do { \
+    (p)[0] = U8V((v)      ); \
+    (p)[1] = U8V((v) >>  8); \
+    (p)[2] = U8V((v) >> 16); \
+    (p)[3] = U8V((v) >> 24); \
+  } while (0)
+
+#define ROTATE(v,c) (ROTL32(v,c))
+#define XOR(v,w) ((v) ^ (w))
+#define PLUS(v,w) (U32V((v) + (w)))
+#define PLUSONE(v) (PLUS((v),1))
+
+#define QUARTERROUND(a,b,c,d) \
+  a = PLUS(a,b); d = ROTATE(XOR(d,a),16); \
+  c = PLUS(c,d); b = ROTATE(XOR(b,c),12); \
+  a = PLUS(a,b); d = ROTATE(XOR(d,a), 8); \
+  c = PLUS(c,d); b = ROTATE(XOR(b,c), 7);
+
+static const char sigma[16] = "expand 32-byte k";
+static const char tau[16] = "expand 16-byte k";
+
+void
+chacha_keysetup(chacha_ctx *x,const u8 *k,u32 kbits)
+{
+  const char *constants;
+
+  x->input[4] = U8TO32_LITTLE(k + 0);
+  x->input[5] = U8TO32_LITTLE(k + 4);
+  x->input[6] = U8TO32_LITTLE(k + 8);
+  x->input[7] = U8TO32_LITTLE(k + 12);
+  if (kbits == 256) { /* recommended */
+    k += 16;
+    constants = sigma;
+  } else { /* kbits == 128 */
+    constants = tau;
+  }
+  x->input[8] = U8TO32_LITTLE(k + 0);
+  x->input[9] = U8TO32_LITTLE(k + 4);
+  x->input[10] = U8TO32_LITTLE(k + 8);
+  x->input[11] = U8TO32_LITTLE(k + 12);
+  x->input[0] = U8TO32_LITTLE(constants + 0);
+  x->input[1] = U8TO32_LITTLE(constants + 4);
+  x->input[2] = U8TO32_LITTLE(constants + 8);
+  x->input[3] = U8TO32_LITTLE(constants + 12);
+}
+
+void
+chacha_ivsetup(chacha_ctx *x, const u8 *iv, const u8 *counter)
+{
+  x->input[12] = counter == NULL ? 0 : U8TO32_LITTLE(counter + 0);
+  x->input[13] = counter == NULL ? 0 : U8TO32_LITTLE(counter + 4);
+  x->input[14] = U8TO32_LITTLE(iv + 0);
+  x->input[15] = U8TO32_LITTLE(iv + 4);
+}
+
+void
+chacha_encrypt_bytes(chacha_ctx *x,const u8 *m,u8 *c,u32 bytes)
+{
+  u32 x0, x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15;
+  u32 j0, j1, j2, j3, j4, j5, j6, j7, j8, j9, j10, j11, j12, j13, j14, j15;
+  u8 *ctarget = NULL;
+  u8 tmp[64];
+  u_int i;
+
+  if (!bytes) return;
+
+  j0 = x->input[0];
+  j1 = x->input[1];
+  j2 = x->input[2];
+  j3 = x->input[3];
+  j4 = x->input[4];
+  j5 = x->input[5];
+  j6 = x->input[6];
+  j7 = x->input[7];
+  j8 = x->input[8];
+  j9 = x->input[9];
+  j10 = x->input[10];
+  j11 = x->input[11];
+  j12 = x->input[12];
+  j13 = x->input[13];
+  j14 = x->input[14];
+  j15 = x->input[15];
+
+  for (;;) {
+    if (bytes < 64) {
+      for (i = 0;i < bytes;++i) tmp[i] = m[i];
+      m = tmp;
+      ctarget = c;
+      c = tmp;
+    }
+    x0 = j0;
+    x1 = j1;
+    x2 = j2;
+    x3 = j3;
+    x4 = j4;
+    x5 = j5;
+    x6 = j6;
+    x7 = j7;
+    x8 = j8;
+    x9 = j9;
+    x10 = j10;
+    x11 = j11;
+    x12 = j12;
+    x13 = j13;
+    x14 = j14;
+    x15 = j15;
+    for (i = 20;i > 0;i -= 2) {
+      QUARTERROUND( x0, x4, x8,x12)
+      QUARTERROUND( x1, x5, x9,x13)
+      QUARTERROUND( x2, x6,x10,x14)
+      QUARTERROUND( x3, x7,x11,x15)
+      QUARTERROUND( x0, x5,x10,x15)
+      QUARTERROUND( x1, x6,x11,x12)
+      QUARTERROUND( x2, x7, x8,x13)
+      QUARTERROUND( x3, x4, x9,x14)
+    }
+    x0 = PLUS(x0,j0);
+    x1 = PLUS(x1,j1);
+    x2 = PLUS(x2,j2);
+    x3 = PLUS(x3,j3);
+    x4 = PLUS(x4,j4);
+    x5 = PLUS(x5,j5);
+    x6 = PLUS(x6,j6);
+    x7 = PLUS(x7,j7);
+    x8 = PLUS(x8,j8);
+    x9 = PLUS(x9,j9);
+    x10 = PLUS(x10,j10);
+    x11 = PLUS(x11,j11);
+    x12 = PLUS(x12,j12);
+    x13 = PLUS(x13,j13);
+    x14 = PLUS(x14,j14);
+    x15 = PLUS(x15,j15);
+
+    x0 = XOR(x0,U8TO32_LITTLE(m + 0));
+    x1 = XOR(x1,U8TO32_LITTLE(m + 4));
+    x2 = XOR(x2,U8TO32_LITTLE(m + 8));
+    x3 = XOR(x3,U8TO32_LITTLE(m + 12));
+    x4 = XOR(x4,U8TO32_LITTLE(m + 16));
+    x5 = XOR(x5,U8TO32_LITTLE(m + 20));
+    x6 = XOR(x6,U8TO32_LITTLE(m + 24));
+    x7 = XOR(x7,U8TO32_LITTLE(m + 28));
+    x8 = XOR(x8,U8TO32_LITTLE(m + 32));
+    x9 = XOR(x9,U8TO32_LITTLE(m + 36));
+    x10 = XOR(x10,U8TO32_LITTLE(m + 40));
+    x11 = XOR(x11,U8TO32_LITTLE(m + 44));
+    x12 = XOR(x12,U8TO32_LITTLE(m + 48));
+    x13 = XOR(x13,U8TO32_LITTLE(m + 52));
+    x14 = XOR(x14,U8TO32_LITTLE(m + 56));
+    x15 = XOR(x15,U8TO32_LITTLE(m + 60));
+
+    j12 = PLUSONE(j12);
+    if (!j12) {
+      j13 = PLUSONE(j13);
+      /* stopping at 2^70 bytes per nonce is user's responsibility */
+    }
+
+    U32TO8_LITTLE(c + 0,x0);
+    U32TO8_LITTLE(c + 4,x1);
+    U32TO8_LITTLE(c + 8,x2);
+    U32TO8_LITTLE(c + 12,x3);
+    U32TO8_LITTLE(c + 16,x4);
+    U32TO8_LITTLE(c + 20,x5);
+    U32TO8_LITTLE(c + 24,x6);
+    U32TO8_LITTLE(c + 28,x7);
+    U32TO8_LITTLE(c + 32,x8);
+    U32TO8_LITTLE(c + 36,x9);
+    U32TO8_LITTLE(c + 40,x10);
+    U32TO8_LITTLE(c + 44,x11);
+    U32TO8_LITTLE(c + 48,x12);
+    U32TO8_LITTLE(c + 52,x13);
+    U32TO8_LITTLE(c + 56,x14);
+    U32TO8_LITTLE(c + 60,x15);
+
+    if (bytes <= 64) {
+      if (bytes < 64) {
+        for (i = 0;i < bytes;++i) ctarget[i] = c[i];
+      }
+      x->input[12] = j12;
+      x->input[13] = j13;
+      return;
+    }
+    bytes -= 64;
+    c += 64;
+    m += 64;
+  }
+}
diff --git a/chacha.h b/chacha.h
new file mode 100644
index 0000000..4ef42cc
--- /dev/null
+++ b/chacha.h
@@ -0,0 +1,35 @@
+/* $OpenBSD: chacha.h,v 1.1 2013/11/21 00:45:44 djm Exp $ */
+
+/*
+chacha-merged.c version 20080118
+D. J. Bernstein
+Public domain.
+*/
+
+#ifndef CHACHA_H
+#define CHACHA_H
+
+#include <sys/types.h>
+
+struct chacha_ctx {
+	u_int input[16];
+};
+
+#define CHACHA_MINKEYLEN 	16
+#define CHACHA_NONCELEN		8
+#define CHACHA_CTRLEN		8
+#define CHACHA_STATELEN		(CHACHA_NONCELEN+CHACHA_CTRLEN)
+#define CHACHA_BLOCKLEN		64
+
+void chacha_keysetup(struct chacha_ctx *x, const u_char *k, u_int kbits)
+    __attribute__((__bounded__(__minbytes__, 2, CHACHA_MINKEYLEN)));
+void chacha_ivsetup(struct chacha_ctx *x, const u_char *iv, const u_char *ctr)
+    __attribute__((__bounded__(__minbytes__, 2, CHACHA_NONCELEN)))
+    __attribute__((__bounded__(__minbytes__, 3, CHACHA_CTRLEN)));
+void chacha_encrypt_bytes(struct chacha_ctx *x, const u_char *m,
+    u_char *c, u_int bytes)
+    __attribute__((__bounded__(__buffer__, 2, 4)))
+    __attribute__((__bounded__(__buffer__, 3, 4)));
+
+#endif	/* CHACHA_H */
+
diff --git a/cipher-chachapoly.c b/cipher-chachapoly.c
new file mode 100644
index 0000000..20628ab
--- /dev/null
+++ b/cipher-chachapoly.c
@@ -0,0 +1,114 @@
+/*
+ * Copyright (c) 2013 Damien Miller <djm@mindrot.org>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+/* $OpenBSD: cipher-chachapoly.c,v 1.2 2013/11/21 02:50:00 djm Exp $ */
+
+#include "includes.h"
+
+#include <sys/types.h>
+#include <stdarg.h> /* needed for log.h */
+#include <string.h>
+#include <stdio.h>  /* needed for misc.h */
+
+#include "log.h"
+#include "misc.h"
+#include "cipher-chachapoly.h"
+
+void chachapoly_init(struct chachapoly_ctx *ctx,
+    const u_char *key, u_int keylen)
+{
+	if (keylen != (32 + 32)) /* 2 x 256 bit keys */
+		fatal("%s: invalid keylen %u", __func__, keylen);
+	chacha_keysetup(&ctx->main_ctx, key, 256);
+	chacha_keysetup(&ctx->header_ctx, key + 32, 256);
+}
+
+/*
+ * chachapoly_crypt() operates as follows:
+ * En/decrypt 'aadlen' bytes from 'src' to 'dest' using the header key.
+ * These bytes are treated as additional authenticated data.
+ * En/Decrypt 'len' bytes at offset 'aadlen' from 'src' to 'dest'.
+ * Use POLY1305_TAGLEN bytes at offset 'len'+'aadlen' as the
+ * authentication tag.
+ * This tag is written on encryption and verified on decryption.
+ * Both 'aadlen' and 'authlen' can be set to 0.
+ */
+int
+chachapoly_crypt(struct chachapoly_ctx *ctx, u_int seqnr, u_char *dest,
+    const u_char *src, u_int len, u_int aadlen, u_int authlen, int do_encrypt)
+{
+	u_char seqbuf[8];
+	u_char one[8] = { 1, 0, 0, 0, 0, 0, 0, 0 }; /* NB. little-endian */
+	u_char expected_tag[POLY1305_TAGLEN], poly_key[POLY1305_KEYLEN];
+	int r = -1;
+
+	/*
+	 * Run ChaCha20 once to generate the Poly1305 key. The IV is the
+	 * packet sequence number.
+	 */
+	bzero(poly_key, sizeof(poly_key));
+	put_u64(seqbuf, seqnr);
+	chacha_ivsetup(&ctx->main_ctx, seqbuf, NULL);
+	chacha_encrypt_bytes(&ctx->main_ctx,
+	    poly_key, poly_key, sizeof(poly_key));
+	/* Set Chacha's block counter to 1 */
+	chacha_ivsetup(&ctx->main_ctx, seqbuf, one);
+
+	/* If decrypting, check tag before anything else */
+	if (!do_encrypt) {
+		const u_char *tag = src + aadlen + len;
+
+		poly1305_auth(expected_tag, src, aadlen + len, poly_key);
+		if (timingsafe_bcmp(expected_tag, tag, POLY1305_TAGLEN) != 0)
+			goto out;
+	}
+	/* Crypt additional data */
+	if (aadlen) {
+		chacha_ivsetup(&ctx->header_ctx, seqbuf, NULL);
+		chacha_encrypt_bytes(&ctx->header_ctx, src, dest, aadlen);
+	}
+	chacha_encrypt_bytes(&ctx->main_ctx, src + aadlen,
+	    dest + aadlen, len);
+
+	/* If encrypting, calculate and append tag */
+	if (do_encrypt) {
+		poly1305_auth(dest + aadlen + len, dest, aadlen + len,
+		    poly_key);
+	}
+	r = 0;
+
+ out:
+	bzero(expected_tag, sizeof(expected_tag));
+	bzero(seqbuf, sizeof(seqbuf));
+	bzero(poly_key, sizeof(poly_key));
+	return r;
+}
+
+int
+chachapoly_get_length(struct chachapoly_ctx *ctx,
+    u_int *plenp, u_int seqnr, const u_char *cp, u_int len)
+{
+	u_char buf[4], seqbuf[8];
+
+	if (len < 4)
+		return -1; /* Insufficient length */
+	put_u64(seqbuf, seqnr);
+	chacha_ivsetup(&ctx->header_ctx, seqbuf, NULL);
+	chacha_encrypt_bytes(&ctx->header_ctx, cp, buf, 4);
+	*plenp = get_u32(buf);
+	return 0;
+}
+
diff --git a/cipher-chachapoly.h b/cipher-chachapoly.h
new file mode 100644
index 0000000..1628693
--- /dev/null
+++ b/cipher-chachapoly.h
@@ -0,0 +1,41 @@
+/* $OpenBSD: cipher-chachapoly.h,v 1.1 2013/11/21 00:45:44 djm Exp $ */
+
+/*
+ * Copyright (c) Damien Miller 2013 <djm@mindrot.org>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+#ifndef CHACHA_POLY_AEAD_H
+#define CHACHA_POLY_AEAD_H
+
+#include <sys/types.h>
+#include "chacha.h"
+#include "poly1305.h"
+
+#define CHACHA_KEYLEN	32 /* Only 256 bit keys used here */
+
+struct chachapoly_ctx {
+	struct chacha_ctx main_ctx, header_ctx;
+};
+
+void	chachapoly_init(struct chachapoly_ctx *cpctx,
+    const u_char *key, u_int keylen)
+    __attribute__((__bounded__(__buffer__, 2, 3)));
+int	chachapoly_crypt(struct chachapoly_ctx *cpctx, u_int seqnr,
+    u_char *dest, const u_char *src, u_int len, u_int aadlen, u_int authlen,
+    int do_encrypt);
+int	chachapoly_get_length(struct chachapoly_ctx *cpctx,
+    u_int *plenp, u_int seqnr, const u_char *cp, u_int len)
+    __attribute__((__bounded__(__buffer__, 4, 5)));
+
+#endif /* CHACHA_POLY_AEAD_H */
diff --git a/cipher.c b/cipher.c
index 54315f4..c4aec39 100644
--- a/cipher.c
+++ b/cipher.c
@@ -1,4 +1,4 @@
-/* $OpenBSD: cipher.c,v 1.90 2013/11/07 11:58:27 dtucker Exp $ */
+/* $OpenBSD: cipher.c,v 1.91 2013/11/21 00:45:44 djm Exp $ */
 /*
  * Author: Tatu Ylonen <ylo@cs.hut.fi>
  * Copyright (c) 1995 Tatu Ylonen <ylo@cs.hut.fi>, Espoo, Finland
@@ -43,9 +43,11 @@
 
 #include <string.h>
 #include <stdarg.h>
+#include <stdio.h>
 
 #include "xmalloc.h"
 #include "log.h"
+#include "misc.h"
 #include "cipher.h"
 
 /* compatibility with old or broken OpenSSL versions */
@@ -63,7 +65,9 @@
 	u_int	iv_len;		/* defaults to block_size */
 	u_int	auth_len;
 	u_int	discard_len;
-	u_int	cbc_mode;
+	u_int	flags;
+#define CFLAG_CBC		(1<<0)
+#define CFLAG_CHACHAPOLY	(1<<1)
 	const EVP_CIPHER	*(*evptype)(void);
 };
 
@@ -95,6 +99,8 @@
 	{ "aes256-gcm@openssh.com",
 			SSH_CIPHER_SSH2, 16, 32, 12, 16, 0, 0, EVP_aes_256_gcm },
 #endif
+	{ "chacha20-poly1305@openssh.com",
+			SSH_CIPHER_SSH2, 8, 64, 0, 16, 0, CFLAG_CHACHAPOLY, NULL },
 	{ NULL,		SSH_CIPHER_INVALID, 0, 0, 0, 0, 0, 0, NULL }
 };
 
@@ -102,7 +108,7 @@
 
 /* Returns a list of supported ciphers separated by the specified char. */
 char *
-cipher_alg_list(char sep)
+cipher_alg_list(char sep, int auth_only)
 {
 	char *ret = NULL;
 	size_t nlen, rlen = 0;
@@ -111,6 +117,8 @@
 	for (c = ciphers; c->name != NULL; c++) {
 		if (c->number != SSH_CIPHER_SSH2)
 			continue;
+		if (auth_only && c->auth_len == 0)
+			continue;
 		if (ret != NULL)
 			ret[rlen++] = sep;
 		nlen = strlen(c->name);
@@ -142,7 +150,12 @@
 u_int
 cipher_ivlen(const Cipher *c)
 {
-	return (c->iv_len ? c->iv_len : c->block_size);
+	/*
+	 * Default is cipher block size, except for chacha20+poly1305 that
+	 * needs no IV. XXX make iv_len == -1 default?
+	 */
+	return (c->iv_len != 0 || (c->flags & CFLAG_CHACHAPOLY) != 0) ?
+	    c->iv_len : c->block_size;
 }
 
 u_int
@@ -154,7 +167,7 @@
 u_int
 cipher_is_cbc(const Cipher *c)
 {
-	return (c->cbc_mode);
+	return (c->flags & CFLAG_CBC) != 0;
 }
 
 u_int
@@ -274,8 +287,11 @@
 		    ivlen, cipher->name);
 	cc->cipher = cipher;
 
+	if ((cc->cipher->flags & CFLAG_CHACHAPOLY) != 0) {
+		chachapoly_init(&cc->cp_ctx, key, keylen);
+		return;
+	}
 	type = (*cipher->evptype)();
-
 	EVP_CIPHER_CTX_init(&cc->evp);
 #ifdef SSH_OLD_EVP
 	if (type->key_len > 0 && type->key_len != keylen) {
@@ -330,9 +346,15 @@
  * Both 'aadlen' and 'authlen' can be set to 0.
  */
 void
-cipher_crypt(CipherContext *cc, u_char *dest, const u_char *src,
+cipher_crypt(CipherContext *cc, u_int seqnr, u_char *dest, const u_char *src,
     u_int len, u_int aadlen, u_int authlen)
 {
+	if ((cc->cipher->flags & CFLAG_CHACHAPOLY) != 0) {
+		if (chachapoly_crypt(&cc->cp_ctx, seqnr, dest, src, len, aadlen,
+		    authlen, cc->encrypt) != 0)
+			fatal("Decryption integrity check failed");
+		return;
+	}
 	if (authlen) {
 		u_char lastiv[1];
 
@@ -374,10 +396,26 @@
 	}
 }
 
+/* Extract the packet length, including any decryption necessary beforehand */
+int
+cipher_get_length(CipherContext *cc, u_int *plenp, u_int seqnr,
+    const u_char *cp, u_int len)
+{
+	if ((cc->cipher->flags & CFLAG_CHACHAPOLY) != 0)
+		return chachapoly_get_length(&cc->cp_ctx, plenp, seqnr,
+		    cp, len);
+	if (len < 4)
+		return -1;
+	*plenp = get_u32(cp);
+	return 0;
+}
+
 void
 cipher_cleanup(CipherContext *cc)
 {
-	if (EVP_CIPHER_CTX_cleanup(&cc->evp) == 0)
+	if ((cc->cipher->flags & CFLAG_CHACHAPOLY) != 0)
+		bzero(&cc->cp_ctx, sizeof(cc->cp_ctx));
+	else if (EVP_CIPHER_CTX_cleanup(&cc->evp) == 0)
 		error("cipher_cleanup: EVP_CIPHER_CTX_cleanup failed");
 }
 
@@ -417,6 +455,8 @@
 
 	if (c->number == SSH_CIPHER_3DES)
 		ivlen = 24;
+	else if ((cc->cipher->flags & CFLAG_CHACHAPOLY) != 0)
+		ivlen = 0;
 	else
 		ivlen = EVP_CIPHER_CTX_iv_length(&cc->evp);
 	return (ivlen);
@@ -428,6 +468,12 @@
 	const Cipher *c = cc->cipher;
 	int evplen;
 
+	if ((cc->cipher->flags & CFLAG_CHACHAPOLY) != 0) {
+		if (len != 0)
+			fatal("%s: wrong iv length %d != %d", __func__, len, 0);
+		return;
+	}
+
 	switch (c->number) {
 	case SSH_CIPHER_SSH2:
 	case SSH_CIPHER_DES:
@@ -464,6 +510,9 @@
 	const Cipher *c = cc->cipher;
 	int evplen = 0;
 
+	if ((cc->cipher->flags & CFLAG_CHACHAPOLY) != 0)
+		return;
+
 	switch (c->number) {
 	case SSH_CIPHER_SSH2:
 	case SSH_CIPHER_DES:
diff --git a/cipher.h b/cipher.h
index 4650234..4e837a7 100644
--- a/cipher.h
+++ b/cipher.h
@@ -1,4 +1,4 @@
-/* $OpenBSD: cipher.h,v 1.41 2013/11/07 11:58:27 dtucker Exp $ */
+/* $OpenBSD: cipher.h,v 1.42 2013/11/21 00:45:44 djm Exp $ */
 
 /*
  * Author: Tatu Ylonen <ylo@cs.hut.fi>
@@ -38,6 +38,8 @@
 #define CIPHER_H
 
 #include <openssl/evp.h>
+#include "cipher-chachapoly.h"
+
 /*
  * Cipher types for SSH-1.  New types can be added, but old types should not
  * be removed for compatibility.  The maximum allowed value is 31.
@@ -66,6 +68,7 @@
 	int	plaintext;
 	int	encrypt;
 	EVP_CIPHER_CTX evp;
+	struct chachapoly_ctx cp_ctx; /* XXX union with evp? */
 	const Cipher *cipher;
 };
 
@@ -75,11 +78,13 @@
 int	 cipher_number(const char *);
 char	*cipher_name(int);
 int	 ciphers_valid(const char *);
-char	*cipher_alg_list(char);
+char	*cipher_alg_list(char, int);
 void	 cipher_init(CipherContext *, const Cipher *, const u_char *, u_int,
     const u_char *, u_int, int);
-void	 cipher_crypt(CipherContext *, u_char *, const u_char *,
+void	 cipher_crypt(CipherContext *, u_int, u_char *, const u_char *,
     u_int, u_int, u_int);
+int	 cipher_get_length(CipherContext *, u_int *, u_int,
+    const u_char *, u_int);
 void	 cipher_cleanup(CipherContext *);
 void	 cipher_set_key_string(CipherContext *, const Cipher *, const char *, int);
 u_int	 cipher_blocksize(const Cipher *);
diff --git a/dh.c b/dh.c
index d33af1f..3331cda 100644
--- a/dh.c
+++ b/dh.c
@@ -1,4 +1,4 @@
-/* $OpenBSD: dh.c,v 1.52 2013/10/08 11:42:13 dtucker Exp $ */
+/* $OpenBSD: dh.c,v 1.53 2013/11/21 00:45:44 djm Exp $ */
 /*
  * Copyright (c) 2000 Niels Provos.  All rights reserved.
  *
@@ -254,33 +254,19 @@
 void
 dh_gen_key(DH *dh, int need)
 {
-	int i, bits_set, tries = 0;
+	int pbits;
 
-	if (need < 0)
-		fatal("dh_gen_key: need < 0");
+	if (need <= 0)
+		fatal("%s: need <= 0", __func__);
 	if (dh->p == NULL)
-		fatal("dh_gen_key: dh->p == NULL");
-	if (need > INT_MAX / 2 || 2 * need >= BN_num_bits(dh->p))
-		fatal("dh_gen_key: group too small: %d (2*need %d)",
-		    BN_num_bits(dh->p), 2*need);
-	do {
-		if (dh->priv_key != NULL)
-			BN_clear_free(dh->priv_key);
-		if ((dh->priv_key = BN_new()) == NULL)
-			fatal("dh_gen_key: BN_new failed");
-		/* generate a 2*need bits random private exponent */
-		if (!BN_rand(dh->priv_key, 2*need, 0, 0))
-			fatal("dh_gen_key: BN_rand failed");
-		if (DH_generate_key(dh) == 0)
-			fatal("DH_generate_key");
-		for (i = 0, bits_set = 0; i <= BN_num_bits(dh->priv_key); i++)
-			if (BN_is_bit_set(dh->priv_key, i))
-				bits_set++;
-		debug2("dh_gen_key: priv key bits set: %d/%d",
-		    bits_set, BN_num_bits(dh->priv_key));
-		if (tries++ > 10)
-			fatal("dh_gen_key: too many bad keys: giving up");
-	} while (!dh_pub_is_valid(dh, dh->pub_key));
+		fatal("%s: dh->p == NULL", __func__);
+	if ((pbits = BN_num_bits(dh->p)) <= 0)
+		fatal("%s: bits(p) <= 0", __func__);
+	dh->length = MIN(need * 2, pbits - 1);
+	if (DH_generate_key(dh) == 0)
+		fatal("%s: key generation failed", __func__);
+	if (!dh_pub_is_valid(dh, dh->pub_key))
+		fatal("%s: generated invalid key", __func__);
 }
 
 DH *
diff --git a/myproposal.h b/myproposal.h
index 8da2ac9..71dbc99 100644
--- a/myproposal.h
+++ b/myproposal.h
@@ -1,4 +1,4 @@
-/* $OpenBSD: myproposal.h,v 1.33 2013/11/02 21:59:15 markus Exp $ */
+/* $OpenBSD: myproposal.h,v 1.34 2013/11/21 00:45:44 djm Exp $ */
 
 /*
  * Copyright (c) 2000 Markus Friedl.  All rights reserved.
@@ -104,6 +104,7 @@
 	"aes128-ctr,aes192-ctr,aes256-ctr," \
 	"arcfour256,arcfour128," \
 	AESGCM_CIPHER_MODES \
+	"chacha20-poly1305@openssh.com," \
 	"aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc," \
 	"aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se"
 
diff --git a/packet.c b/packet.c
index 90db33b..029bb4c 100644
--- a/packet.c
+++ b/packet.c
@@ -1,4 +1,4 @@
-/* $OpenBSD: packet.c,v 1.189 2013/11/08 00:39:15 djm Exp $ */
+/* $OpenBSD: packet.c,v 1.190 2013/11/21 00:45:44 djm Exp $ */
 /*
  * Author: Tatu Ylonen <ylo@cs.hut.fi>
  * Copyright (c) 1995 Tatu Ylonen <ylo@cs.hut.fi>, Espoo, Finland
@@ -713,7 +713,7 @@
 	buffer_append(&active_state->output, buf, 4);
 	cp = buffer_append_space(&active_state->output,
 	    buffer_len(&active_state->outgoing_packet));
-	cipher_crypt(&active_state->send_context, cp,
+	cipher_crypt(&active_state->send_context, 0, cp,
 	    buffer_ptr(&active_state->outgoing_packet),
 	    buffer_len(&active_state->outgoing_packet), 0, 0);
 
@@ -946,8 +946,8 @@
 	}
 	/* encrypt packet and append to output buffer. */
 	cp = buffer_append_space(&active_state->output, len + authlen);
-	cipher_crypt(&active_state->send_context, cp,
-	    buffer_ptr(&active_state->outgoing_packet),
+	cipher_crypt(&active_state->send_context, active_state->p_send.seqnr,
+	    cp, buffer_ptr(&active_state->outgoing_packet),
 	    len - aadlen, aadlen, authlen);
 	/* append unencrypted MAC */
 	if (mac && mac->enabled) {
@@ -1208,7 +1208,7 @@
 	/* Decrypt data to incoming_packet. */
 	buffer_clear(&active_state->incoming_packet);
 	cp = buffer_append_space(&active_state->incoming_packet, padded_len);
-	cipher_crypt(&active_state->receive_context, cp,
+	cipher_crypt(&active_state->receive_context, 0, cp,
 	    buffer_ptr(&active_state->input), padded_len, 0, 0);
 
 	buffer_consume(&active_state->input, padded_len);
@@ -1279,10 +1279,12 @@
 	aadlen = (mac && mac->enabled && mac->etm) || authlen ? 4 : 0;
 
 	if (aadlen && active_state->packlen == 0) {
-		if (buffer_len(&active_state->input) < 4)
+		if (cipher_get_length(&active_state->receive_context,
+		    &active_state->packlen,
+		    active_state->p_read.seqnr,
+		    buffer_ptr(&active_state->input),
+		    buffer_len(&active_state->input)) != 0)
 			return SSH_MSG_NONE;
-		cp = buffer_ptr(&active_state->input);
-		active_state->packlen = get_u32(cp);
 		if (active_state->packlen < 1 + 4 ||
 		    active_state->packlen > PACKET_MAX_SIZE) {
 #ifdef PACKET_DEBUG
@@ -1302,7 +1304,8 @@
 		buffer_clear(&active_state->incoming_packet);
 		cp = buffer_append_space(&active_state->incoming_packet,
 		    block_size);
-		cipher_crypt(&active_state->receive_context, cp,
+		cipher_crypt(&active_state->receive_context,
+		    active_state->p_read.seqnr, cp,
 		    buffer_ptr(&active_state->input), block_size, 0, 0);
 		cp = buffer_ptr(&active_state->incoming_packet);
 		active_state->packlen = get_u32(cp);
@@ -1357,7 +1360,8 @@
 		macbuf = mac_compute(mac, active_state->p_read.seqnr,
 		    buffer_ptr(&active_state->input), aadlen + need);
 	cp = buffer_append_space(&active_state->incoming_packet, aadlen + need);
-	cipher_crypt(&active_state->receive_context, cp,
+	cipher_crypt(&active_state->receive_context,
+	    active_state->p_read.seqnr, cp,
 	    buffer_ptr(&active_state->input), need, aadlen, authlen);
 	buffer_consume(&active_state->input, aadlen + need + authlen);
 	/*
diff --git a/poly1305.c b/poly1305.c
new file mode 100644
index 0000000..059cc60
--- /dev/null
+++ b/poly1305.c
@@ -0,0 +1,158 @@
+/* 
+ * Public Domain poly1305 from Andrew M.
+ * poly1305-donna-unrolled.c from https://github.com/floodyberry/poly1305-donna
+ */
+
+/* $OpenBSD: poly1305.c,v 1.2 2013/11/21 02:50:00 djm Exp $ */
+
+#include "includes.h"
+
+#include <sys/types.h>
+#include <stdint.h>
+
+#include "poly1305.h"
+
+#define mul32x32_64(a,b) ((uint64_t)(a) * (b))
+
+#define U8TO32_LE(p) \
+	(((uint32_t)((p)[0])) | \
+	 ((uint32_t)((p)[1]) <<  8) | \
+	 ((uint32_t)((p)[2]) << 16) | \
+	 ((uint32_t)((p)[3]) << 24))
+
+#define U32TO8_LE(p, v) \
+	do { \
+		(p)[0] = (uint8_t)((v)); \
+		(p)[1] = (uint8_t)((v) >>  8); \
+		(p)[2] = (uint8_t)((v) >> 16); \
+		(p)[3] = (uint8_t)((v) >> 24); \
+	} while (0)
+
+void
+poly1305_auth(unsigned char out[POLY1305_TAGLEN], const unsigned char *m, size_t inlen, const unsigned char key[POLY1305_KEYLEN]) {
+	uint32_t t0,t1,t2,t3;
+	uint32_t h0,h1,h2,h3,h4;
+	uint32_t r0,r1,r2,r3,r4;
+	uint32_t s1,s2,s3,s4;
+	uint32_t b, nb;
+	size_t j;
+	uint64_t t[5];
+	uint64_t f0,f1,f2,f3;
+	uint32_t g0,g1,g2,g3,g4;
+	uint64_t c;
+	unsigned char mp[16];
+
+	/* clamp key */
+	t0 = U8TO32_LE(key+0);
+	t1 = U8TO32_LE(key+4);
+	t2 = U8TO32_LE(key+8);
+	t3 = U8TO32_LE(key+12);
+
+	/* precompute multipliers */
+	r0 = t0 & 0x3ffffff; t0 >>= 26; t0 |= t1 << 6;
+	r1 = t0 & 0x3ffff03; t1 >>= 20; t1 |= t2 << 12;
+	r2 = t1 & 0x3ffc0ff; t2 >>= 14; t2 |= t3 << 18;
+	r3 = t2 & 0x3f03fff; t3 >>= 8;
+	r4 = t3 & 0x00fffff;
+
+	s1 = r1 * 5;
+	s2 = r2 * 5;
+	s3 = r3 * 5;
+	s4 = r4 * 5;
+
+	/* init state */
+	h0 = 0;
+	h1 = 0;
+	h2 = 0;
+	h3 = 0;
+	h4 = 0;
+
+	/* full blocks */
+	if (inlen < 16) goto poly1305_donna_atmost15bytes;
+poly1305_donna_16bytes:
+	m += 16;
+	inlen -= 16;
+
+	t0 = U8TO32_LE(m-16);
+	t1 = U8TO32_LE(m-12);
+	t2 = U8TO32_LE(m-8);
+	t3 = U8TO32_LE(m-4);
+
+	h0 += t0 & 0x3ffffff;
+	h1 += ((((uint64_t)t1 << 32) | t0) >> 26) & 0x3ffffff;
+	h2 += ((((uint64_t)t2 << 32) | t1) >> 20) & 0x3ffffff;
+	h3 += ((((uint64_t)t3 << 32) | t2) >> 14) & 0x3ffffff;
+	h4 += (t3 >> 8) | (1 << 24);
+
+
+poly1305_donna_mul:
+	t[0]  = mul32x32_64(h0,r0) + mul32x32_64(h1,s4) + mul32x32_64(h2,s3) + mul32x32_64(h3,s2) + mul32x32_64(h4,s1);
+	t[1]  = mul32x32_64(h0,r1) + mul32x32_64(h1,r0) + mul32x32_64(h2,s4) + mul32x32_64(h3,s3) + mul32x32_64(h4,s2);
+	t[2]  = mul32x32_64(h0,r2) + mul32x32_64(h1,r1) + mul32x32_64(h2,r0) + mul32x32_64(h3,s4) + mul32x32_64(h4,s3);
+	t[3]  = mul32x32_64(h0,r3) + mul32x32_64(h1,r2) + mul32x32_64(h2,r1) + mul32x32_64(h3,r0) + mul32x32_64(h4,s4);
+	t[4]  = mul32x32_64(h0,r4) + mul32x32_64(h1,r3) + mul32x32_64(h2,r2) + mul32x32_64(h3,r1) + mul32x32_64(h4,r0);
+
+	                h0 = (uint32_t)t[0] & 0x3ffffff; c =           (t[0] >> 26);
+	t[1] += c;      h1 = (uint32_t)t[1] & 0x3ffffff; b = (uint32_t)(t[1] >> 26);
+	t[2] += b;      h2 = (uint32_t)t[2] & 0x3ffffff; b = (uint32_t)(t[2] >> 26);
+	t[3] += b;      h3 = (uint32_t)t[3] & 0x3ffffff; b = (uint32_t)(t[3] >> 26);
+	t[4] += b;      h4 = (uint32_t)t[4] & 0x3ffffff; b = (uint32_t)(t[4] >> 26);
+	h0 += b * 5;
+
+	if (inlen >= 16) goto poly1305_donna_16bytes;
+
+	/* final bytes */
+poly1305_donna_atmost15bytes:
+	if (!inlen) goto poly1305_donna_finish;
+
+	for (j = 0; j < inlen; j++) mp[j] = m[j];
+	mp[j++] = 1;
+	for (; j < 16; j++)	mp[j] = 0;
+	inlen = 0;
+
+	t0 = U8TO32_LE(mp+0);
+	t1 = U8TO32_LE(mp+4);
+	t2 = U8TO32_LE(mp+8);
+	t3 = U8TO32_LE(mp+12);
+
+	h0 += t0 & 0x3ffffff;
+	h1 += ((((uint64_t)t1 << 32) | t0) >> 26) & 0x3ffffff;
+	h2 += ((((uint64_t)t2 << 32) | t1) >> 20) & 0x3ffffff;
+	h3 += ((((uint64_t)t3 << 32) | t2) >> 14) & 0x3ffffff;
+	h4 += (t3 >> 8);
+
+	goto poly1305_donna_mul;
+
+poly1305_donna_finish:
+	             b = h0 >> 26; h0 = h0 & 0x3ffffff;
+	h1 +=     b; b = h1 >> 26; h1 = h1 & 0x3ffffff;
+	h2 +=     b; b = h2 >> 26; h2 = h2 & 0x3ffffff;
+	h3 +=     b; b = h3 >> 26; h3 = h3 & 0x3ffffff;
+	h4 +=     b; b = h4 >> 26; h4 = h4 & 0x3ffffff;
+	h0 += b * 5; b = h0 >> 26; h0 = h0 & 0x3ffffff;
+	h1 +=     b;
+
+	g0 = h0 + 5; b = g0 >> 26; g0 &= 0x3ffffff;
+	g1 = h1 + b; b = g1 >> 26; g1 &= 0x3ffffff;
+	g2 = h2 + b; b = g2 >> 26; g2 &= 0x3ffffff;
+	g3 = h3 + b; b = g3 >> 26; g3 &= 0x3ffffff;
+	g4 = h4 + b - (1 << 26);
+
+	b = (g4 >> 31) - 1;
+	nb = ~b;
+	h0 = (h0 & nb) | (g0 & b);
+	h1 = (h1 & nb) | (g1 & b);
+	h2 = (h2 & nb) | (g2 & b);
+	h3 = (h3 & nb) | (g3 & b);
+	h4 = (h4 & nb) | (g4 & b);
+
+	f0 = ((h0      ) | (h1 << 26)) + (uint64_t)U8TO32_LE(&key[16]);
+	f1 = ((h1 >>  6) | (h2 << 20)) + (uint64_t)U8TO32_LE(&key[20]);
+	f2 = ((h2 >> 12) | (h3 << 14)) + (uint64_t)U8TO32_LE(&key[24]);
+	f3 = ((h3 >> 18) | (h4 <<  8)) + (uint64_t)U8TO32_LE(&key[28]);
+
+	U32TO8_LE(&out[ 0], f0); f1 += (f0 >> 32);
+	U32TO8_LE(&out[ 4], f1); f2 += (f1 >> 32);
+	U32TO8_LE(&out[ 8], f2); f3 += (f2 >> 32);
+	U32TO8_LE(&out[12], f3);
+}
diff --git a/poly1305.h b/poly1305.h
new file mode 100644
index 0000000..a31fb74
--- /dev/null
+++ b/poly1305.h
@@ -0,0 +1,22 @@
+/* $OpenBSD: poly1305.h,v 1.1 2013/11/21 00:45:44 djm Exp $ */
+
+/* 
+ * Public Domain poly1305 from Andrew M.
+ * poly1305-donna-unrolled.c from https://github.com/floodyberry/poly1305-donna
+ */
+
+#ifndef POLY1305_H
+#define POLY1305_H
+
+#include <sys/types.h>
+
+#define POLY1305_KEYLEN		32
+#define POLY1305_TAGLEN		16
+
+void poly1305_auth(u_char out[POLY1305_TAGLEN], const u_char *m, size_t inlen,
+    const u_char key[POLY1305_KEYLEN])
+    __attribute__((__bounded__(__minbytes__, 1, POLY1305_TAGLEN)))
+    __attribute__((__bounded__(__buffer__, 2, 3)))
+    __attribute__((__bounded__(__minbytes__, 4, POLY1305_KEYLEN)));
+
+#endif	/* POLY1305_H */
diff --git a/servconf.c b/servconf.c
index 3593223..cb21bd2 100644
--- a/servconf.c
+++ b/servconf.c
@@ -1,5 +1,5 @@
 
-/* $OpenBSD: servconf.c,v 1.245 2013/11/07 11:58:27 dtucker Exp $ */
+/* $OpenBSD: servconf.c,v 1.246 2013/11/21 00:45:44 djm Exp $ */
 /*
  * Copyright (c) 1995 Tatu Ylonen <ylo@cs.hut.fi>, Espoo, Finland
  *                    All rights reserved
@@ -2038,7 +2038,7 @@
 	dump_cfg_string(sPidFile, o->pid_file);
 	dump_cfg_string(sXAuthLocation, o->xauth_location);
 	dump_cfg_string(sCiphers, o->ciphers ? o->ciphers :
-	    cipher_alg_list(','));
+	    cipher_alg_list(',', 0));
 	dump_cfg_string(sMacs, o->macs ? o->macs : mac_alg_list(','));
 	dump_cfg_string(sBanner, o->banner);
 	dump_cfg_string(sForceCommand, o->adm_forced_command);
diff --git a/ssh.1 b/ssh.1
index 6369fc2..73e2086 100644
--- a/ssh.1
+++ b/ssh.1
@@ -33,8 +33,8 @@
 .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 .\"
-.\" $OpenBSD: ssh.1,v 1.339 2013/10/16 22:49:38 djm Exp $
-.Dd $Mdocdate: October 16 2013 $
+.\" $OpenBSD: ssh.1,v 1.340 2013/11/21 00:45:44 djm Exp $
+.Dd $Mdocdate: November 21 2013 $
 .Dt SSH 1
 .Os
 .Sh NAME
@@ -504,6 +504,8 @@
 The queriable features are:
 .Dq cipher
 (supported symmetric ciphers),
+.Dq cipher-auth
+(supported symmetric ciphers that support authenticated encryption),
 .Dq MAC
 (supported message integrity codes),
 .Dq KEX
diff --git a/ssh.c b/ssh.c
index e2c4363..58becd7 100644
--- a/ssh.c
+++ b/ssh.c
@@ -1,4 +1,4 @@
-/* $OpenBSD: ssh.c,v 1.392 2013/11/07 11:58:27 dtucker Exp $ */
+/* $OpenBSD: ssh.c,v 1.393 2013/11/21 00:45:44 djm Exp $ */
 /*
  * Author: Tatu Ylonen <ylo@cs.hut.fi>
  * Copyright (c) 1995 Tatu Ylonen <ylo@cs.hut.fi>, Espoo, Finland
@@ -520,7 +520,9 @@
 		case 'Q':	/* deprecated */
 			cp = NULL;
 			if (strcasecmp(optarg, "cipher") == 0)
-				cp = cipher_alg_list('\n');
+				cp = cipher_alg_list('\n', 0);
+			else if (strcasecmp(optarg, "cipher-auth") == 0)
+				cp = cipher_alg_list('\n', 1);
 			else if (strcasecmp(optarg, "mac") == 0)
 				cp = mac_alg_list('\n');
 			else if (strcasecmp(optarg, "kex") == 0)
diff --git a/ssh_config.5 b/ssh_config.5
index 8809568..9dbc76c 100644
--- a/ssh_config.5
+++ b/ssh_config.5
@@ -33,8 +33,8 @@
 .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 .\"
-.\" $OpenBSD: ssh_config.5,v 1.179 2013/11/02 22:39:19 markus Exp $
-.Dd $Mdocdate: November 2 2013 $
+.\" $OpenBSD: ssh_config.5,v 1.180 2013/11/21 00:45:44 djm Exp $
+.Dd $Mdocdate: November 21 2013 $
 .Dt SSH_CONFIG 5
 .Os
 .Sh NAME
@@ -334,7 +334,8 @@
 Specifies the ciphers allowed for protocol version 2
 in order of preference.
 Multiple ciphers must be comma-separated.
-The supported ciphers are
+The supported ciphers are:
+.Pp
 .Dq 3des-cbc ,
 .Dq aes128-cbc ,
 .Dq aes192-cbc ,
@@ -348,15 +349,24 @@
 .Dq arcfour256 ,
 .Dq arcfour ,
 .Dq blowfish-cbc ,
+.Dq cast128-cbc ,
 and
-.Dq cast128-cbc .
+.Dq chacha20-poly1305@openssh.com .
+.Pp
 The default is:
+.Pp
 .Bd -literal -offset 3n
 aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,
 aes128-gcm@openssh.com,aes256-gcm@openssh.com,
+chacha20-poly1305@openssh.com,
 aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,
 aes256-cbc,arcfour
 .Ed
+.Pp
+The list of available ciphers may also be obtained using the
+.Fl Q
+option of
+.Xr ssh 1 .
 .It Cm ClearAllForwardings
 Specifies that all local, remote, and dynamic port forwardings
 specified in the configuration files or on the command line be
diff --git a/sshd_config.5 b/sshd_config.5
index 02c45a7..b9864ff 100644
--- a/sshd_config.5
+++ b/sshd_config.5
@@ -33,8 +33,8 @@
 .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 .\"
-.\" $OpenBSD: sshd_config.5,v 1.166 2013/11/02 22:39:19 markus Exp $
-.Dd $Mdocdate: November 2 2013 $
+.\" $OpenBSD: sshd_config.5,v 1.167 2013/11/21 00:45:44 djm Exp $
+.Dd $Mdocdate: November 21 2013 $
 .Dt SSHD_CONFIG 5
 .Os
 .Sh NAME
@@ -335,7 +335,8 @@
 .It Cm Ciphers
 Specifies the ciphers allowed for protocol version 2.
 Multiple ciphers must be comma-separated.
-The supported ciphers are
+The supported ciphers are:
+.Pp
 .Dq 3des-cbc ,
 .Dq aes128-cbc ,
 .Dq aes192-cbc ,
@@ -349,15 +350,24 @@
 .Dq arcfour256 ,
 .Dq arcfour ,
 .Dq blowfish-cbc ,
+.Dq cast128-cbc ,
 and
-.Dq cast128-cbc .
+.Dq chacha20-poly1305@openssh.com .
+.Pp
 The default is:
+.Pp
 .Bd -literal -offset 3n
 aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,
 aes128-gcm@openssh.com,aes256-gcm@openssh.com,
+chacha20-poly1305@openssh.com,
 aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,
 aes256-cbc,arcfour
 .Ed
+.Pp
+The list of available ciphers may also be obtained using the
+.Fl Q
+option of
+.Xr ssh 1 .
 .It Cm ClientAliveCountMax
 Sets the number of client alive messages (see below) which may be
 sent without