[media] vivi: Optimize precalculate_line()
precalculate_line() is not very high on profile, but it calls expensive
gen_twopix(), so let's polish it too:
call gen_twopix() only once for every color bar and then distribute
the result.
before:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
#
# Samples: 46K of event 'cycles'
# Event count (approx.): 15574200568
#
# Overhead Command Shared Object
# ........ ............... ....................
#
27.99% rawv libc-2.13.so [.] __memcpy_ssse3
23.29% vivi-* [kernel.kallsyms] [k] memcpy
10.30% Xorg [unknown] [.] 0xa75c98f8
5.34% vivi-* [vivi] [k] gen_text.constprop.6
4.61% rawv [vivi] [k] gen_twopix
2.64% rawv [vivi] [k] precalculate_line
1.37% swapper [kernel.kallsyms] [k] read_hpet
after:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
#
# Samples: 45K of event 'cycles'
# Event count (approx.): 15561769214
#
# Overhead Command Shared Object
# ........ ............... ....................
#
30.73% rawv libc-2.13.so [.] __memcpy_ssse3
26.78% vivi-* [kernel.kallsyms] [k] memcpy
10.68% Xorg [unknown] [.] 0xa73015e9
5.55% vivi-* [vivi] [k] gen_text.constprop.6
1.36% swapper [kernel.kallsyms] [k] read_hpet
0.96% Xorg [kernel.kallsyms] [k] read_hpet
...
0.16% rawv [vivi] [k] precalculate_line
...
0.14% rawv [vivi] [k] gen_twopix
(i.e. gen_twopix and precalculate_line overheads are almost gone)
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
diff --git a/drivers/media/platform/vivi.c b/drivers/media/platform/vivi.c
index ec832e9..7abb046 100644
--- a/drivers/media/platform/vivi.c
+++ b/drivers/media/platform/vivi.c
@@ -512,12 +512,22 @@
static void precalculate_line(struct vivi_dev *dev)
{
- int w;
+ unsigned pixsize = dev->pixelsize;
+ unsigned pixsize2 = 2*pixsize;
+ int colorpos;
+ u8 *pos;
- for (w = 0; w < dev->width * 2; w++) {
- int colorpos = w / (dev->width / 8) % 8;
+ for (colorpos = 0; colorpos < 16; ++colorpos) {
+ u8 pix[8];
+ int wstart = colorpos * dev->width / 8;
+ int wend = (colorpos+1) * dev->width / 8;
+ int w;
- gen_twopix(dev, dev->line + w * dev->pixelsize, colorpos, w & 1);
+ gen_twopix(dev, &pix[0], colorpos % 8, 0);
+ gen_twopix(dev, &pix[pixsize], colorpos % 8, 1);
+
+ for (w = wstart/2*2, pos = dev->line + w*pixsize; w < wend; w += 2, pos += pixsize2)
+ memcpy(pos, pix, pixsize2);
}
}