dmaengine: edma: Optimize memcpy operation

If the transfer is shorted then 64K we can complete it with one ACNT burst
by configuring ACNT to the length of the copy, this require one paRAM slot.
Otherwise we use two paRAM slots for the copy:
slot1: will copy (length / 32767) number of 32767 byte long blocks
slot2: will be configured to copy the remaining data.

According to tests this patch increases the throughput of memcpy from
~3MB/s to 15MB/s

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
1 file changed