1. fd803fb Add a channel_dimension member to MulParams, bringing the last piece to make Ruy's API fully LHS<->RHS symmetric, allowing the implementation to transpose the whole Mul to reduce to column major destination matrices. by Benoit Jacob · 4 years, 3 months ago
  2. d2509b7 Make FixedKernelLayout internal by Benoit Jacob · 4 years, 3 months ago
  3. 66961ae Fix up templates specialization for change by Ruy Contributors · 4 years, 3 months ago
  4. 19b09a4 Make FixedKernelLayout internal by Robert David · 4 years, 3 months ago
  5. c9f5f9c Clean up #includes and deps among kernel* and pack*. by Benoit Jacob · 4 years, 3 months ago
  6. ae6e0ed trim down common.h, keeping only the macros. by Benoit Jacob · 4 years, 3 months ago
  7. 5efd3eb Make FixedKernelLayout internal by Benoit Jacob · 4 years, 3 months ago
  8. 43680a7 Detemplatize on MulParamsType, part 2. by Benoit Jacob · 4 years, 3 months ago
  9. 1acc6f5 Avoid templatizing on MulParamsType, instead templatize on AccumScalar/DstScalar, as the only MulParamsType is MulParams<AccumScalar,DstScalar> (part 1). by Benoit Jacob · 4 years, 3 months ago
  10. 412e17e Finish cleaning up mul_params.h: remove ZeroPointSupport and LayoutSupport enums, and other now-unused things. Mark MulParams as final. by Benoit Jacob · 4 years, 3 months ago
  11. 5111a55 Remove the LoopStructure enum. by Benoit Jacob · 4 years, 3 months ago
  12. 5c28dfe Delete test_special_mul_params and de-templatize the test code on a MulParamsType, restricting it to non-subclassed MulParams. This is a temporary regression in testing coverage, but in the next commit in this series we will recover the testing of special StandardCpp kernel layouts thanks to the new Paths, while removing the other features that subclassing MulParams offered. by Benoit Jacob · 4 years, 3 months ago
  13. bf0c1c4 Introduce new internal-only Paths that are variants of kStandardCpp exercising internal corners of ruy. by Benoit Jacob · 4 years, 3 months ago
  14. f6363d0 Delete stale file, forgot to remove it in cl/317146687. by Benoit Jacob · 4 years, 3 months ago
  15. fb8fa3b the example code was still teaching people to use <ruy::kAllPaths>, which most users now don't need or want to. by Benoit Jacob · 4 years, 3 months ago
  16. 1014033 Shuffle Path values a bit. kStandardCpp=1, other values < 0x10 will be used for kStandardCpp variants for internal testing purposes, SIMD paths start at 0x10. by Benoit Jacob · 4 years, 3 months ago
  17. 8dd9136 Remove SSE4.2 and VNNI placeholder code for now. by Benoit Jacob · 4 years, 3 months ago
  18. 4d8ad9f The word 'packed' is being used for too many things, so rename to make it more specific in each case. by Benoit Jacob · 4 years, 3 months ago
  19. e6603bf Rename Other to OtherSide for readability at call sites, and use it in one more place. by Benoit Jacob · 4 years, 3 months ago
  20. b7649fa Refactoring of the front-end code. by Benoit Jacob · 4 years, 3 months ago
  21. 0b64129 Check that the actually used kernel code path matches the path we think we're taking, at least when it should match, i.e. in standard cases that fast code paths are supposed to handle. by Benoit Jacob · 4 years, 3 months ago
  22. c45f194 Fix a recent regression (from cl/316525635): when the LHS/RHS scalar type was uint8 (not int8), we had disabled all NEON paths on ARM 32bit (not on ARM 64bit)! by Benoit Jacob · 4 years, 4 months ago
  23. 072976c Restructure pack*.h headers so that just pack_common.h does not provide any code path, only common helpers, so that one can't accidentally #include pack_common.h instead of pack.h and silently fall back to slow code. by Benoit Jacob · 4 years, 4 months ago
  24. e7b27d6 Restructure kernel*.h headers so that just kernel_common.h does not provide any code path, only common helpers, so that one can't accidentally #include kernel_common.h instead of kernel.h and silently fall back to slow code. by Benoit Jacob · 4 years, 4 months ago
  25. b896b0c Support --cpu=armeabi, used in TensorFlow Raspberry Pi builds like here: by Benoit Jacob · 4 years, 4 months ago
  26. 9ad26c7 Complete the rollback by deleting files that were added by that CL and not deleted by the rollback. by Benoit Jacob · 4 years, 4 months ago
  27. 34ea9f4 Rollback refactoring. by Ruy Contributors · 4 years, 4 months ago
  28. 3281c7c Rename Other to OtherSide for readability at call sites, and use it in one more place. by Ruy Contributors · 4 years, 4 months ago
  29. b786fbd Rollback refactoring. by Ruy Contributors · 4 years, 4 months ago
  30. 93fdb9e The word 'packed' is being used for too many things, so rename to make it more specific in each case. by Benoit Jacob · 4 years, 4 months ago
  31. db28e82 Rename Other to OtherSide for readability at call sites, and use it in one more place. by Benoit Jacob · 4 years, 4 months ago
  32. 40394f7 Update our arm32 detection logic to support the case of cpu=='armv7a' as opposed to cpu=='armeabi-v7a' as we have on Android. Use naming that's more explicit as to our intent to just assume NEON support. by Benoit Jacob · 4 years, 4 months ago
  33. c03298c Import the fix from XNNPACK's cpuinfo.BUILD to support the case where cpu=="armv7a". by Benoit Jacob · 4 years, 4 months ago
  34. 55cb53a Refactoring of the front-end code. by Benoit Jacob · 4 years, 4 months ago
  35. 921b9fe Better comments in trmul.cc. by Benoit Jacob · 4 years, 4 months ago
  36. fb94f05 Trim the dependencies and #includes in common.h, and fix trmul_params.h that was relying on common.h to #include path.h. by Benoit Jacob · 4 years, 4 months ago
  37. 9df1b07 Ruy takes runtime enabled paths from env var. by T.J. Alumbaugh · 4 years, 4 months ago
  38. c347b02 Fix the opensource build, need `defines` to be [] not True. by Benoit Jacob · 4 years, 4 months ago
  39. d42b66b Avoid linkstatic on macOS, see https://github.com/bazelbuild/bazel/issues/11552. by Benoit Jacob · 4 years, 4 months ago
  40. a37cc4d Disable cpuinfo in other build systems than Bazel unless they explicitly opt in by defining this RUY_HAVE_CPUINFO token. That requires first porting the cpuinfo BUILD. by Benoit Jacob · 4 years, 4 months ago
  41. d4ddc05 Consistently avoid include path stripping to remove some of the dimensions in debugging build errors with TensorFlow Makefiles. by Benoit Jacob · 4 years, 4 months ago
  42. 9a8d8f9 Do not link to cpuinfo on macOS to avoid link errors when building with Bazel. by Benoit Jacob · 4 years, 4 months ago
  43. 8047dba Avoid categorizing Apple watchOS as macOS. by Benoit Jacob · 4 years, 4 months ago
  44. 2ccc5d5 Do not link to cpuinfo on macOS to avoid link errors when building with Bazel. by Benoit Jacob · 4 years, 4 months ago
  45. 20bd869 Use cpuinfo also for cpu cache size detection. by Benoit Jacob · 4 years, 4 months ago
  46. 53b778a x86_64 config setting is --cpu=k8 or --cpu=haswell by T.J. Alumbaugh · 4 years, 4 months ago
  47. b68dcd8 Use individual -mavx512* feature flags instead of -march=skylake-avx512 in the hope that it will be better supported on some older toolchains. by Benoit Jacob · 4 years, 4 months ago
  48. 1a8b7ea Fix the opensource build. Remove the :windows config_setting. It did not work because the bazel implementation using @bazel_tools//src/conditions was wrongly assuming that that was a flag_value that could be used in a config_setting, which it's not. by Benoit Jacob · 4 years, 4 months ago
  49. f745edc Add a :windows config_setting and some comments. by Benoit Jacob · 4 years, 4 months ago
  50. 8de40aa Remove incompatible warnings with Windows by Marin Baron · 4 years, 4 months ago
  51. f8c0144 Enable the NEON dotprod path outside of Linux. by Benoit Jacob · 4 years, 4 months ago
  52. 51b518e Fix #68: Missing clear of q7 leading to wrong computations (#69) by lissyx · 4 years, 4 months ago
  53. 6f14203 Use the cpuinfo library instead of our own code for CPU feature detection. by Benoit Jacob · 4 years, 4 months ago
  54. 20ed917 Use the cpuinfo library instead of our own code for CPU feature detection. by Ruy Contributors · 4 years, 4 months ago
  55. 74b7491 Use the cpuinfo library instead of our own code for CPU feature detection. by Benoit Jacob · 4 years, 4 months ago
  56. 7b75a8b Change the RUY_OPT* syntax to look shorter at call sites: by Benoit Jacob · 4 years, 4 months ago
  57. a0ca5e6 Add -Wundef to ruy_copts, and remove RUY_PLATFORM(X). Call site simplification: by Benoit Jacob · 4 years, 4 months ago
  58. 736429b Refactoring of {Get,Set}RuntimeSupportedPaths: by Benoit Jacob · 4 years, 4 months ago
  59. a216c87 Renaming last_selected_path to last_used_path, i.e. make the name reflect the user's perspective not the implementation's. by Benoit Jacob · 4 years, 4 months ago
  60. f05ec59 Restrict DetectDotprod to Linux again. We are going to abandon it soon in favor of the `cpuinfo` library. This change removes some platform.h code that we had recently added for it. by Benoit Jacob · 4 years, 4 months ago
  61. baa8601 Rollback due to an internal regression test failure. by Shashi Shekhar · 4 years, 4 months ago
  62. 4adf261 Temporarily disable dotprod detection on apple. by Benoit Jacob · 4 years, 4 months ago
  63. 85909ed Allow to control the spin-wait timeout. by Benoit Jacob · 4 years, 4 months ago
  64. 1f56782 Context/Path improvements: by Benoit Jacob · 4 years, 4 months ago
  65. 9fcfdc0 Enable dotprod on all Unix (including Apple) on ARM64. by Benoit Jacob · 4 years, 4 months ago
  66. 4f11e59 size_util and detect_arm do not need public visibility anymore. by Benoit Jacob · 4 years, 5 months ago
  67. bec99d6 Enable AVX512 in the open-source build, on the same compilers as other by Benoit Jacob · 4 years, 5 months ago
  68. b991b7a Fix, and enable, the AVX2 path on GCC >= 9: by Benoit Jacob · 4 years, 5 months ago
  69. d9d2a2f Build with -fno-lax-vector-conversions. by Benoit Jacob · 4 years, 5 months ago
  70. 808ff74 Set is_prepacked flag for matrices returned from cache. by T.J. Alumbaugh · 4 years, 5 months ago
  71. 5825d58 Remove a bad 'static' keyword. by Benoit Jacob · 4 years, 5 months ago
  72. 33b0425 build_defs changes: by Benoit Jacob · 4 years, 5 months ago
  73. ee83d0d Remove tracing. by Benoit Jacob · 4 years, 5 months ago
  74. 36ff18c Remove the legacy kReference which was just an alias for kStandardCpp while by Benoit Jacob · 4 years, 5 months ago
  75. ad04c73 Remove unnecessary #includes from ctx.h and prevent them from coming back by Benoit Jacob · 4 years, 5 months ago
  76. 210c986 Move SystemAligned{Alloc,Free} functions to their own library as they are used independently of Allocator. by Benoit Jacob · 4 years, 5 months ago
  77. 1b31368 Remove Path::kReference. Instead, ReferenceMul becomes a separate library. by Benoit Jacob · 4 years, 5 months ago
  78. eda1b5b Comment on the hash function. by Benoit Jacob · 4 years, 5 months ago
  79. 57e64b4 Remove the advanced API. by Benoit Jacob · 4 years, 5 months ago
  80. 3210565 Remove the internal allocator in PrepackedCache. Just use the system aligned allocator functions, and free buffers in the ~PrepackedCache() destructor. And limit the allocation of the `sums` buffer to the integer quantized case. by Benoit Jacob · 4 years, 5 months ago
  81. 534717b PrepackedCache improvements: by Benoit Jacob · 4 years, 5 months ago
  82. a5fa886 Fix build on certain Windows toolchains by Ruy Contributors · 4 years, 5 months ago
  83. 20e2af8 Use PEMat instead of PrepackedMatrix in PrepackedCache. by Benoit Jacob · 4 years, 5 months ago
  84. 58ee522 Erase duplicate comments from when these files were copied from pack.h. by Benoit Jacob · 4 years, 5 months ago
  85. 2ce8c48 Rename DataSize->DataBytes, SumsSize->SumsBytes, to stress which functions return a number of bytes, instead of a number of potentially multi-byte elements as FlatSize does. by Benoit Jacob · 4 years, 5 months ago
  86. a4f3160 Remove the old API for the constant packed matrix cache. by Benoit Jacob · 4 years, 5 months ago
  87. 4bdb31a Move cache_policy from the Context class to the Matrix class and change the set of available enum values in CachePolicy. by Benoit Jacob · 4 years, 5 months ago
  88. 0ad580f ruy_advanced API touchups: MulWithPrepacked does not need prepacked operands to be mutable, and PrepackedMatrix does not need accessor methods. by Benoit Jacob · 4 years, 5 months ago
  89. 6b1171e Fix the build by Benoit Jacob · 4 years, 5 months ago
  90. 6039ccc Make context.h minimal, not #including other ruy headers. by Benoit Jacob · 4 years, 5 months ago
  91. 970304d finish c++ifying Context by Benoit Jacob · 4 years, 5 months ago
  92. e866a68 finish c++ifying MulParams by Benoit Jacob · 4 years, 5 months ago
  93. 2bfeb07 finish c++ifying Matrix by Benoit Jacob · 4 years, 5 months ago
  94. de0b1b6 finish c++ifying Layout by Benoit Jacob · 4 years, 5 months ago
  95. 145aecd Rename: by Benoit Jacob · 4 years, 5 months ago
  96. f3c69a7 1. Introduce InternalLayout, a private counterpart of Layout, to be used by internal_matrix.h classes. by Benoit Jacob · 4 years, 5 months ago
  97. 5b0e99d Emulate _BitScanReverse64 on 32-bit MSVC targets by Marat Dukhan · 4 years, 6 months ago
  98. 7a9da95 Increase visibility of size_util by T.J. Alumbaugh · 4 years, 6 months ago
  99. 439c1ac Refactor ruy's predefined Path set constants, introduce a new kDefaultPaths that compiles fewer paths than kAllPaths, and have ruy::Mul(...) use it (overload not taking an explicit Path parameter). by Benoit Jacob · 4 years, 6 months ago
  100. 9f53ba4 Rename :spec to :mul_params. by Benoit Jacob · 4 years, 6 months ago