Issues #26289 and #26315: Optimize floor/modulo div for single-digit longs

Microbenchmarks show 2-2.5x improvement.  Built-in 'divmod' function
is now also ~10% faster.

-m timeit -s "x=22331" "x//2;x//-3;x//4;x//5;x//-6;x//7;x//8;x//-99;x//100;"
with patch: 0.321          without patch: 0.633

-m timeit -s "x=22331" "x%2;x%3;x%-4;x%5;x%6;x%-7;x%8;x%99;x%-100;"
with patch: 0.224          without patch: 0.66

Big thanks to Serhiy Storchaka, Mark Dickinson and Victor Stinner for
thorow code reviews and algorithms improvements.
3 files changed