ART: Implement FP packed reduce for x86

This patch implements FP packed reduce correctly, as a vector
reduction by index. The previous implementation instead performed a
packed add reduction.
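
For reviewers, a minimal scalar sketch of the difference between the two
operations. It assumes "reduction by index" means extracting the single
lane at a given index, and that a packed add reduction sums all lanes.
The helper names and the 4-lane float layout are hypothetical and are not
taken from the ART sources.

```c
#include <stddef.h>

/* Hypothetical helper: reduce by index, read here as "return one lane".
 * Assumes index < 4; no bounds check is performed. */
static float reduce_by_index(const float lanes[4], size_t index) {
    return lanes[index];
}

/* Hypothetical helper: packed add reduction, i.e. the sum of every lane. */
static float add_reduce(const float lanes[4]) {
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Under these assumptions, for lanes {1, 2, 3, 4} and index 2 the first
helper yields 3 while the second yields 10, so the two are not
interchangeable.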

Change-Id: I02a9bcb8e8945937ba7a511b723f23ec30667d34
1 file changed