  1. 1921442 Snap for 7216111 from 033350c4252004f65e85e8d547b473ae28ebd158 to sc-v2-release by android-build-team Robot · 3 years, 2 months ago int/12/fp4
  2. 033350c [LSC] Add LOCAL_LICENSE_KINDS to external/ruy am: 0c00310b85 am: 6332d3e777 am: 99eba2baf6 by Bob Badour · 3 years, 2 months ago
  3. 99eba2b [LSC] Add LOCAL_LICENSE_KINDS to external/ruy am: 0c00310b85 am: 6332d3e777 by Bob Badour · 3 years, 2 months ago
  4. 6332d3e [LSC] Add LOCAL_LICENSE_KINDS to external/ruy am: 0c00310b85 by Bob Badour · 3 years, 2 months ago
  5. 0c00310 [LSC] Add LOCAL_LICENSE_KINDS to external/ruy by Bob Badour · 3 years, 2 months ago
  6. 6bd0f3c Snap for 7205268 from bf8748dc2397c58b711ef661bc5607966aa1bad5 to sc-v2-release by android-build-team Robot · 3 years, 2 months ago
  7. bf8748d Add Android.bp to ruy project am: 35891dbbfa am: 8653006f72 am: 72e32be3a8 by Lev Proleev · 3 years, 2 months ago
  8. 998a6df Merge remote-tracking branch 'aosp/upstream-master' into tflite-rebase-feb-2021 am: 713d254ecf am: dd1f1778c2 am: 039d972c6e by Lev Proleev · 3 years, 2 months ago
  9. 72e32be Add Android.bp to ruy project am: 35891dbbfa am: 8653006f72 by Lev Proleev · 3 years, 2 months ago
  10. 039d972 Merge remote-tracking branch 'aosp/upstream-master' into tflite-rebase-feb-2021 am: 713d254ecf am: dd1f1778c2 by Lev Proleev · 3 years, 2 months ago
  11. 8653006 Add Android.bp to ruy project am: 35891dbbfa by Lev Proleev · 3 years, 2 months ago
  12. dd1f177 Merge remote-tracking branch 'aosp/upstream-master' into tflite-rebase-feb-2021 am: 713d254ecf by Lev Proleev · 3 years, 2 months ago
  13. 35891db Add Android.bp to ruy project by Lev Proleev · 3 years, 3 months ago
  14. 713d254 Merge remote-tracking branch 'aosp/upstream-master' into tflite-rebase-feb-2021 by Lev Proleev · 3 years, 3 months ago
  15. d23d538 Initial empty repository by Inna Palant · 3 years, 3 months ago
  16. be760b6 Simplify quantized multiplier by Georgios Pinitas · 3 years, 3 months ago
  17. 287015c Update test tolerance ahead of merging PR #227 by bjacob · 3 years, 3 months ago
  18. 2887692 Allow late definitions of cpuinfo but only when ruy is a subdir. (#250) by bjacob · 3 years, 4 months ago
  19. 09827c8 Disable tests by default when ruy is a subproject. by bjacob · 3 years, 4 months ago
  20. 58e3051 Change the default MulParams multiplier values to multiply by 1, not 0. by bjacob · 3 years, 4 months ago
  21. fad5a10 Add basic gitignore (#246) by Geoffrey Martin-Noble · 3 years, 4 months ago
  22. 4bdd13c Simplify cpuinfo build overlay (#247) by Geoffrey Martin-Noble · 3 years, 4 months ago
  23. d65bcd7 Fixes for builds in open source projects with cpuinfo and googletest deps. by Benoit Jacob · 3 years, 4 months ago
  24. c200f59 Update depgraph by bjacob · 3 years, 4 months ago
  25. 45a876f Revert "Revert "Add CMake support with a converter from Bazel"" by bjacob · 3 years, 4 months ago
  26. 25df4d3 Corrected macro for detecting ppc platform (#83) by Nishidha · 3 years, 4 months ago
  27. 20b5eb0 Add a tracing framework (really just logging). by Benoit Jacob · 3 years, 4 months ago
  28. 4ed6216 Revert "Add CMake support with a converter from Bazel (#233)" (#243) by bjacob · 3 years, 4 months ago
  29. b87d6d2 Add CMake support with a converter from Bazel (#233) by bjacob · 3 years, 4 months ago
  30. fb9174e Corrected macro for detecting ppc platform. (#83) by Nishidha · 3 years, 4 months ago
  31. cb106ed Move submodules to where they belong. (#240) by bjacob · 3 years, 4 months ago
  32. f6f4475 Add git submodules: googletest and cpuinfo (#235) by bjacob · 3 years, 4 months ago
  33. f4af2f7 Bazel submodules (#236) by bjacob · 3 years, 4 months ago
  34. 9c9fdbc Fix doc paths in README by Benoit Jacob · 3 years, 4 months ago
  35. d7fb861 Add a trimmed dependency graph and its generator, for doc purposes. by Benoit Jacob · 3 years, 4 months ago
  36. 2cbb179 Drop unneeded dependency from :context. by Benoit Jacob · 3 years, 4 months ago
  37. 3f655fa Cosmetics: class-ify TrMulTask, in particular put the trailing _ where they belong. by Benoit Jacob · 3 years, 4 months ago
  38. 8782836 Fix the new raw accumulators example - being raw accumulators, it's not 'per channel', as there is no multiplier here. by Benoit Jacob · 3 years, 5 months ago
  39. c162e5d Relax test tolerance against Eigen, adapting to a recent Eigen change between Eigen commits by Benoit Jacob · 3 years, 5 months ago
  40. 3fc7ae2 fix gcc warnings by Benoit Jacob · 3 years, 5 months ago
  41. 177062d Move the example out of the ruy/ruy directory, and add an example returning raw by Benoit Jacob · 3 years, 5 months ago
  42. 4790797 Fixing warnings on MSVC (comparing a bool with >). by Ben Vanik · 3 years, 6 months ago
  43. 7a6a38e Enforce x86 bit exactness by T.J. Alumbaugh · 3 years, 7 months ago
  44. d79362c MSVC fixes: by Benoit Jacob · 3 years, 7 months ago
  45. 7e1d379 Zero point checking disabled for uint8 x uint8 GEMMs by T.J. Alumbaugh · 3 years, 7 months ago
  46. dd1102a Update AVX, AVX2, AVX512 Rescale operations with Rounding Right Shift by T.J. Alumbaugh · 3 years, 7 months ago
  47. a28320a move example.cc into one directory by Leslie-Fang · 3 years, 7 months ago
  48. 034c0e2 Use movi NEON instruction to zero out registers by Lukas Geiger · 3 years, 7 months ago
  49. e59c55d It's _MSC_VER not __MSC_VER. by Benoit Jacob · 3 years, 7 months ago
  50. 503dd78 Enable x86 SIMD code paths on MSVC 2019 and similarly-versioned Clang-CL. by Benoit Jacob · 3 years, 7 months ago
  51. 3c363dc Add a few PMU counters. by Benoit Jacob · 3 years, 7 months ago
  52. 14569d2 Additional optimizations for AVX 8bit quantized kernel. by T.J. Alumbaugh · 3 years, 8 months ago
  53. fad2140 Optimize AVX/AVX2 quantized path by T.J. Alumbaugh · 3 years, 8 months ago
  54. d13c696 Fix buffer overrun on asan for AVX512 float. by T.J. Alumbaugh · 3 years, 8 months ago
  55. 8f08903 Optimize AVX512 float path by T.J. Alumbaugh · 3 years, 8 months ago
  56. be065e4 Optimize AVX/AVX2+FMA float path by T.J. Alumbaugh · 3 years, 8 months ago
  57. d7b739e AVX 8bit row major/col major packing code by T.J. Alumbaugh · 3 years, 9 months ago
  58. 74bfa70 AVX Pack inherits from StandardCpp by T.J. Alumbaugh · 3 years, 9 months ago
  59. 9e63749 AVX 8bit kernel. Forked from AVX2+FMA version by T.J. Alumbaugh · 3 years, 9 months ago
  60. 29a155b Update README.md by Benoit Jacob · 3 years, 9 months ago
  61. ce0e559 Changes are excluded via Copybara by Ruy Contributors · 3 years, 9 months ago
  62. 4b1972b Changes are excluded via Copybara by Ruy Contributors · 3 years, 9 months ago
  63. 59c2de8 Rename kOutOfOrder -> kGeneric, kInOrder -> kA55ish, by Benoit Jacob · 3 years, 9 months ago
  64. 4f6a37b Reimplement :tune on top of :cpuinfo. by Benoit Jacob · 3 years, 9 months ago
  65. f99b42b Add bzl_library rules for .bzl files without one. by Ruy Contributors · 3 years, 9 months ago
  66. 2b24016 Adds AVX float packing code. by T.J. Alumbaugh · 3 years, 10 months ago
  67. 70d32d6 Adds AVX path and AVX float kernel. by T.J. Alumbaugh · 3 years, 10 months ago
  68. d4822f4 Adds AVX path and AVX float kernel. by T.J. Alumbaugh · 3 years, 10 months ago
  69. 18e34fa Adds AVX path and AVX float kernel. by T.J. Alumbaugh · 3 years, 10 months ago
  70. d7bd2a1 Print extra information in case of disagreeing TestResults. by T.J. Alumbaugh · 3 years, 10 months ago
  71. 5bb02fb check_macros improvements: promote operands before comparisons (avoids -Wsign-compare errors with GCC in cases like RUY_CHECK_NE(unsigned_bitmask_expression, 0)) and move all of the implementation to an inline function instead of having half of it in the macro. by Benoit Jacob · 3 years, 10 months ago
  72. f876353 Add missing #include of <cstring>. by Benoit Jacob · 3 years, 10 months ago
  73. bfe6e0d Simplify bias-loading code now that bias buffers are always rounded up to multiple of kernel size. by Benoit Jacob · 3 years, 10 months ago
  74. b53312b Use lambdas to shorten source code like we did in the avx512 kernel. by Benoit Jacob · 3 years, 10 months ago
  75. f611892 Handle per-column multipliers in the avx512 kernel without transposing the 16x16 accumulator block. by Benoit Jacob · 3 years, 10 months ago
  76. 1efd970 Optimized packing code path for row-major 8bit inputs for the x86 paths. by Benoit Jacob · 3 years, 10 months ago
  77. 257a0fc Optimized packing code path for row-major 8bit inputs for the kNeon path. Written in intrinsics to handle 3 cases at once: by Benoit Jacob · 3 years, 10 months ago
  78. 550655f Use lambdas to shorten Kernel8bitAvx512's source code, and to split the resulting non-opt binary code into smaller functions. This makes no difference in opt builds, but for non-opt builds this reduces the stack frame of this function from 60k down to 24k. This avoids stack overflows in some toolchains. by Benoit Jacob · 3 years, 10 months ago
  79. ec99c70 Optimized packing code path for row-major float inputs. by Benoit Jacob · 3 years, 10 months ago
  80. bebf022 Optimized packing code path for row-major 8bit inputs for the kNeonDotprod path. by Benoit Jacob · 3 years, 10 months ago
  81. d492ac8 Fix the build on some toolchains - a missing #include<cstring> and some avx512 intrinsic synonyms. by Benoit Jacob · 3 years, 10 months ago
  82. 90f7274 Rename packing code implementation functions now that they are explicitly about one specific source matrix storage order. by Benoit Jacob · 3 years, 10 months ago
  83. cd375d3 Templatize packing code paths on the source order, so that we support any combination source order, with the worst case being a fall back to the standard c++ packing code, which readily supports any storage order. by Benoit Jacob · 3 years, 10 months ago
  84. 5210e3e Simplification of FallBackToStandardCpp now that we are past the incremental steps toward supporting any channel_dimension. by Benoit Jacob · 3 years, 10 months ago
  85. 6d218c3 Efficient support for any channel_dimension for quantized kernels on AVX-512, part 2: handling of per-channel multipliers. by Benoit Jacob · 3 years, 10 months ago
  86. c1d5b4f Efficient support for any channel_dimension for quantized kernels on AVX-512, part 1: non-per-channel-multiplier case, so we only have to deal with bias vectors for now. by Benoit Jacob · 3 years, 10 months ago
  87. bb9349c Efficient support for any channel_dimension for quantized kernels on AVX2. by Benoit Jacob · 3 years, 10 months ago
  88. bd21e0c Simplify x86 kernels by using the fact that there always is a per-channel buffer to read from, even in the non-perchannel case (in that case, its size is just the kernel's width and one must use 0 as offset). by Benoit Jacob · 3 years, 10 months ago
  89. 98c5213 Simplify x86 kernels thanks to the new fact that perchannel buffers are rounded to next multiple of kernel width. by Benoit Jacob · 3 years, 10 months ago
  90. a776b5d Fix runtime detection of support for our AVX2+FMA code path: we were only checking for AVX2, which happens to imply FMA on Intel CPUs. by Benoit Jacob · 3 years, 10 months ago
  91. 7784e18 FMA is technically a separate ISA extension from AVX2. by Benoit Jacob · 3 years, 10 months ago
  92. 27d16d0 Efficient support for any channel_dimension for float kernels on AVX-512. by Benoit Jacob · 3 years, 10 months ago
  93. 592d30c Efficient support for any channel_dimension for float kernels on AVX2. by Benoit Jacob · 3 years, 10 months ago
  94. f88e08e Allow the user to specify that they have allocated a slightly larger capacity for the per-channel buffers, so that ruy can then avoid reallocating and copying these buffers. by Benoit Jacob · 3 years, 10 months ago
  95. 388ffd2 Fix ARM32 packing code reading past the end of the source matrix, and finishing enabling the use of SeparateMappingVector in StorageMatrix in the test code to guard against that (It had discovered this issue). by Benoit Jacob · 3 years, 10 months ago
  96. 856f0fd Add comments and some minor simplifications to packing code. by Benoit Jacob · 3 years, 10 months ago
  97. e600a4d Avoid overrunning per-channel buffers, whose size is that of the corresponding user-facing matrix dimension, but which assembly kernels tend to address as if they had the same size as the corresponding packed matrix dimension. AddressSanitizer can't see what asm kernels do. by Benoit Jacob · 3 years, 10 months ago
  98. f5b52f9 Minor optimization of in-order arm64 kernels, interleave the dup's used in the channels-are-columns case with other instructions. by Benoit Jacob · 3 years, 10 months ago
  99. 62aa923 Minor simplification of arm32 assembly: the add instruction itself can be conditional. by Benoit Jacob · 3 years, 10 months ago
  100. ec970ca Efficient support for any channel_dimension for quantized kernels on ARM32. by Benoit Jacob · 3 years, 10 months ago