Merge pull request #76 from pyca/wantwritetest-37+67

Only write one byte at a time and try to write many more bytes overall.

On some platforms this does a better job of completely filling the send buffer, which is the goal.
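
A minimal sketch of the idea, not the code from this pull request. It assumes a plain non-blocking socket pair from the Python standard library instead of an SSL connection, and the helper name fill_send_buffer and the max_bytes cap are hypothetical, chosen only for illustration:

```python
import socket


def fill_send_buffer(sock, max_bytes=16 * 1024 * 1024):
    """Send one byte at a time until the send would block.

    Returns the number of bytes accepted before the buffer filled up.
    max_bytes is an arbitrary (assumed) upper bound so the loop cannot
    run forever if the buffer never fills.
    """
    sock.setblocking(False)
    sent = 0
    while sent < max_bytes:
        try:
            # One-byte writes: per the commit message, on some platforms
            # this fills the send buffer more completely.
            sent += sock.send(b"x")
        except BlockingIOError:
            # The kernel refused more data: the send buffer is full.
            return sent
    raise RuntimeError("send buffer did not fill within max_bytes")


if __name__ == "__main__":
    left, right = socket.socketpair()
    try:
        # The peer never reads, so the writer's buffer eventually fills.
        print("bytes accepted before blocking:", fill_send_buffer(left))
    finally:
        left.close()
        right.close()
```

The byte count printed depends on the platform's buffer sizes, so no particular value should be expected. With an SSL connection the "buffer full" condition would surface as a want-write error rather than BlockingIOError.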