sparc32: Kill off software 32-bit multiply/divide routines.

For the explicit calls to .udiv/.umul in assembler, I made a
mechanical (read as: safe) transformation.  I didn't attempt
to make any simplifications.

In particular, __ndelay and __udelay can be simplified significantly.
Some of the %y reads are unnecessary and these routines have no need
any longer for allocating a register window, they can be leaf
functions.

Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/arch/sparc/lib/muldi3.S b/arch/sparc/lib/muldi3.S
index 7f17872..9794939 100644
--- a/arch/sparc/lib/muldi3.S
+++ b/arch/sparc/lib/muldi3.S
@@ -63,12 +63,12 @@
 	rd  %y, %o1
 	mov  %o1, %l3
 	mov  %i1, %o0
-	call  .umul
 	mov  %i2, %o1
+	umul %o0, %o1, %o0
 	mov  %o0, %l0
 	mov  %i0, %o0
-	call  .umul
 	mov  %i3, %o1
+	umul %o0, %o1, %o0
 	add  %l0, %o0, %l0
 	mov  %l2, %i0
 	add  %l2, %l0, %i0