console: Fix pre-console flushing via cfb_console being very slow

On my A10 OlinuxIno Lime I noticed a huge (5+ seconds) delay coming from
console_init_r. This turns out to be caused by the preconsole buffer flushing
to the cfb_console. The Lime only has a 16 bit memory bus and that is already
heavy used to scan out the 1920x1080 framebuffer.

The problem is that print_pre_console_buffer() was printing the buffer once
character at a time and the cfb_console code then ends up doing a cache-flush
for touched display lines for each character.

This commit fixes this by first building a 0 terminated buffer and then
printing it in one puts() call, avoiding unnecessary cache flushes.

This changes the time for the flush from 5+ seconds to not noticable.

The downside of this approach is that the pre-console buffer needs to fit
on the stack, this is not that much to ask since we are talking about plain
text here. This commit also adjusts the sunxi CONFIG_PRE_CON_BUF_SZ to
actually fit on the stack. Sunxi currently is the only user of the pre-console
code so no other boards need to be adjusted.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Tom Rini <trini@konsulko.com>
diff --git a/README b/README
index 1ea397a..2e81ccc 100644
--- a/README
+++ b/README
@@ -948,6 +948,9 @@
 		bytes are output before the console is initialised, the
 		earlier bytes are discarded.
 
+		Note that when printing the buffer a copy is made on the
+		stack so CONFIG_PRE_CON_BUF_SZ must fit on the stack.
+
 		'Sane' compilers will generate smaller code if
 		CONFIG_PRE_CON_BUF_SZ is a power of 2