AHCI: Optimize single IRQ interrupt processing

Split interrupt service routine into hardware context handler
and threaded context handler. That allows to protect ports with
individual locks rather than with a single host-wide lock and
move port interrupts handling out of the hardware interrupt
context.

Testing was done by transferring 8GB on two hard drives in
parallel using command 'dd if=/dev/sd{a,b} of=/dev/null'. With
lock_stat statistics I measured access times to ata_host::lock
spinlock (since interrupt handler code is fully embraced with
this lock). The average lock's holdtime decreased eight times
while average waittime decreased two times.

Both before and after the change the transfer time is the same,
while 'perf record -e cycles:k ...' shows 1%-4% CPU time spent
in ahci_single_irq_intr() routine before the update and not even
sampled/shown ahci_single_irq_intr() after the update.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: linux-ide@vger.kernel.org
1 file changed