commit	d50509637394306f9d075ed03556671c5e7df138
author	Artem Serov <artem.serov@linaro.org>	Fri Aug 21 13:34:10 2020 +0100
committer	Artem Serov <artem.serov@linaro.org>	Fri Aug 21 13:34:36 2020 +0100
tree	7e4de53f4c2bde2c5ea2e7e98fe726d8ca705a72
parent	54f58fb13dade10ffabdc72415cee5d4a746c017
parent	e66b0909876e61b8a83d2f43c2961e1358ca17e2

Merge remote-tracking branch 'aosp/upstream-master' into master

Enables support for the ARM Scalable Vector Extension.

e66b0909 Optimise single handler tables in the decoder.
8ed83527 [sve] Disallow dup with shift on byte-sized lanes
32a7cd97 Fix infinite loops in some native tests.
b8da04db Remove undefined behaviour in add/sub immediate
f84b4643 Revert optimisation for add/sub immediates
aaf02c54 Fix initialisation order for ID register fields.
f3f5d246 Fix add/sub immediate for min-int case
f48172ba Add missing aliases for SVE 0.0 moves.
b9616b36 Fix and enable CanTakeSVEMovprfx.
8c4ceb6a Support more than 64 CPU features.
caa40eec Fix CPUFeature iterator behaviour.
28ff5975 Add an example that dumps CPU feature information.
31d432b2 Add support for AT_HWCAP2.
3d8d3942 Add CPUFeatures up to Armv8.6.
960606b6 Emit pairs of add/sub for larger immediates
4635261c Use segments in SVE indexed fmul simulation
3eb24e95 Fix numerous issues related to CAS* instructions.
102e7a5e Make assembler more strict about SVE prefetch arguments
ebc3b8f5 Use PgLow8 rather than Pg<12, 10>.
7b5819c3 Always assert that 'pg' does not have a lane size.
ecca4b1c Disallow x31/xzr for SVE prefetch scalar offset register
4606adc3 Fix simulation of FCMNE.
5a5e71f3 Require an immediate (0.0) for compare-with-zero instructions.
a8461cf9 Prefer to use 'rd' as a scratch.
32f8fe13 Fix CPURegister::GetArchitecturalName().
dfb93b5e Fix simulation of FTSMUL.
8caa873b Fix the `sve_fmla_fmls` test.
a3d6110c Fix simulation of BRKNS.
3980b742 Make the 'sve_punpk' test VL-agnostic.
7c8c1f0f Update FPCR test.
df01bce3 Merge branch 'sve'
75892bd1 [sve] Restore LaneSize to predicate logical operations.
9927c4f6 [sve] Improve disasm substitution for sign-extending loads
f67b1af0 [sve] Remove generated comments from the disassembler
5e2df59a [sve] Remove extra spaces from load/store register lists.
29936957 [sve] Remove redundant 'USE' macros.
89820250 [sve] Ternary substitution for disassembler
1f1ab9b9 Merge branch 'master' into sve
15d78439 [sve] Make modifiers lower case in disassembly
5f3928c0 [sve] Implement 32-bit scatter store (scalar plus vector mode).
fa098bcf [sve] Implement 64-bit scatter store (scalar plus vector mode).
a5112344 [sve] Complete remaining gather loads.
cd3f6c5e [sve] Fix the index specifier decoding error in the gather load helper.
cb0cfc31 Remove some unnecessary casts in `LoadStoreMemOperand`.
50ef1718 [sve] Relax the lane size restriction of register in MacroAssembler.
7b9a5f12 Use Register for macro assembler ldpsw
4fc4bec4 [sve] Add a strlen example using ldff1b.
6ebcc8a2 Ensure stable build directory name under Python 3
5f9b3800 [sve] Implement ContiguousNonFaultLoad
d154a44f Don't template 'Rx' on the register type.
0d754e91 [sve] Trace writes to FFR.
7d3a329d [sve] Remove a bad assertion.
1c45cfeb [sve] Rename IsScalar to IsPlainScalar.
7db8210d [sve] Implement fmov aliases.
ae3902af [sve] Implement logical immediate aliases.
83ebf7cb Remove redundant tests.
e2de6070 [sve] Implement aliases for mov immediate
9ccc4d23 [sve] Implement aliases for mov from register
a24d95ca [sve] Implement predicate logical instruction aliases
b56cf22f [sve] Implement scatter str, vector plus immediate form
a3c11462 Remove stray assembler method declarations
fd0fc206 Merge branch 'master' into sve
2b66cd62 Fix the 'sh' field for ADD/SUB immediate.
fe7cb100 Compatibility fixes for scons using Python 3
1af34f12 [sve] Implement gather load first-fault data to 64-bit vector (vector index).
6537a9a7 [sve] Implement gather load first-fault data to 32-bit vector (vector index).
823509b1 Merge branch 'master' into sve
504d5e9e Fix clang-format errors.
3db2c498 [sve] Implement prefetch instructions.
8667956f Remove undefined casts to PrefetchOperation.
113d9199 [sve] Implement gather load data to 32-bit vector (vector index).
1a5dcd21 Merge "Merge branch 'master' into sve" into sve
d9859c0e Merge branch 'master' into sve
85e1510e [sve] Implement load and broadcast data to vector.
b944bff2 [sve] Assert destination register is X for count-like instructions
991ee198 [simulator] Remove instruction instrumentation support.
2fe55ec6 Update clang tools to 4.0.
fa3f6bf6 [sve] Implement indexed sdot and udot.
3e2fb505 [sve] Implement ContiguousNonTemporalStore
72765d1b [sve] Implement ContiguousNonTemporalLoad
452ad8b9 [sve] Implement LoadAndBroadcastQuadword
b2a19fae Remove bitfields from the CPURegister API.
8d4bbd20 Merge branch 'master' into sve
d1463742 Document that the DecoderVisitor interface is unstable.
5b0e0044 Merge branch 'master' into sve
c750151e [sve] Fix indexed floating point multiply simulation
48522f53 [sve] Implement AddressGeneration
e4886e50 [sve] Implement FPComplexMulAddIndex
75f1c436 [sve] Implement FPComplexMulAdd
818379d9 Document security considerations.
b4a25f6e [sve] Implement fsqrt.
f60f6dc8 [sve] Implement frecpx.
2cb1b611 [sve] Implement fcvt.
0b1afa89 [sve] Implement FPComplexAddition
f804b606 [sve] Implement PermuteVectorPredicated
fd660b58 [sve] Add token to disassembler for predicate register
83e86612 [sve] Implement predicated shifts by immediate
76c094af [sve] Implement BitwiseShiftByVector and WideElements.
147b0ba7 [sve] Fix immediate shifts for LSL
3bf2d166 [sve] Fix premature truncation of shift amount
acd32aad Consistently use snake_case for variable names.
70d62899 Show output of --list when it fails.
e377513a [sve] Implement SVEFPCompareWithZero instructions.
47c2684c [sve] Implement SVEFPCompareVectors instructions.
f07b8cee [sve] Implement SVEFPUnaryOpPredicated instructions (FRINT).
a2c1bb70 [sve] Implement SVEFPMulAddIndex instructions.
f8d29f13 [sve] Implement SVEFPMulAdd instructions.
364a068f Test fewer targets.
5f9905bf [sve] Support unary instructions in MovprfxHelperScope.
efd9dc76 [sve] Implement remaining FPArithmeticUnpredicated instructions
13050cae [sve] Implement FPUnaryOpUnpredicated
894962f4 [sve] Implement FPFastReduction
afd9335e [sve] Fix insr disassembly
a2fadc26 [sve] Implement predicated FP arithmetic immediate instructions
37f28184 [sve] Implement remaining predicated FP arithmetic instructions
ac07af1f [sve] Implement SVEPermuteVectorExtract
dcdbd757 [sve] Implement vector-plus-immediate loads.
85a9c104 [sve] Add support for ldff1*.
36e6c56f Remove the dependency on 'sed'.
4a9829fb [sve] Implement FADDA
191e757e Move to a C++14 baseline.
5fb2ad66 [sve] Implement FTMAD
2e954295 [sve] Fix unpack instructions when src aliases dst
afe21a8b [sve] Add SVE tests to cpufeatures tests.
31cd6a0f [sve] Implement part of SVEFPUnaryOpPredicated instructions.
db7437cc [sve] Implement part of SVEFPUnaryOpPredicated instructions.
50e9f552 [sve] Implement floating point multiply by index
15f8901d [sve] Implement SVEPermuteVectorInterleaving
662db65f [sve] Fix cpplint complaint about non-const reference
5d87229d [sve] Implement SVEPartitionBreak instructions.
7fd6fd53 [sve] Implement SVEPermutePredicate instructions
03bfc30d [sve] Improve predicate register substitution.
e0eb40b5 Merge branch 'master' into sve
843495bd Merge branch 'master' into sve
ee1108d4 Define AA64PRF0::kRAS.
67c969d4 Fix access to AA64MMFR2 on Arm8.0 targets.
0442b3db [sve] Restructure form selection for structured accesses.
7eb3e219 [sve] Update simulator tracing for SVE.
423e5428 [sve] Make pre-SVE register tracing consistent.
4378263d [sve] Implement SVEIntMiscUnpredicated
77b6d986 [sve] Implement SVEReverseWithinElements
38303d9a [sve] Implement SVEPropagateBreak instructions.
728e5653 Merge branch 'master' into sve
a3e8b176 [sve] implement the predicated rdffr and rdffrs.
4023d7aa [sve] Implement setffr, wrffr and rdffr(unpredicated).
d47d6c41 [sve] Fix decoding bitwise shift (wide) instructions in disassembler.
d0dbe586 Fix DEFINE_ASM_FUNC for namespacing.
ac79f620 Remove non-zero register asserts from Csinc/Csinv/Csneg
8188ddfd [sve] Implement saturating inc/dec vector by element count instructions
91d5ba34 [sve] Implement saturating inc/dec register by element count
e5ab0fe5 [sve] Implement ld2, ld3 and ld4 (immediate).
e483ce56 [sve] Implement ld2, ld3 and ld4 (scalar).
d4dd9c2a [sve] Implement st2, st3 and st4 (immediate).
bc4a54f0 [sve] Implement st2, st3 and st4 (scalar).
29a0c43e [sve] Implement SVEBitwiseShiftUnpredicated instructions.
7a0d3678 [sve] Implement predicated fmax and fmin.
7e8d7838 [sve] Fix decoding tsz in disassembler
17b2e540 Add missing ID register feature detection.
579c92d0 [sve] Implement inc/dec register by element count instructions
74f84f6c [sve] Implement count elements instructions
aae2cf01 [sve] Remove some duplicate cases in the disassembler
f3fae203 [sve] Tidy up register disassembly
378fc895 [sve] Clean up SVESdotUdotHelper.
6ebbba62 [sve] Fix the behaviour of SVE_MUL_VL.
889984cb [sve] Add missing DISASM tests for 'adr'.
0f62eab5 Implement CPY and FCPY (immediate).
9cc3f148 [sve] Predicated select for z registers
d316c5e2 [sve] Implement predicated fdiv and fdivr.
fe536047 [sve] Implement unpredicated fadd, fsub and fmul.
d255bdb3 New decoder with smaller instruction classes.
4d2a4e97 [sve] Implement SVEIntMulAddUnpredicated instructions.
b2d8d1fd [sve] Implement the SVEIntReduction group.
06feb1d6 Merge branch 'master' into sve
33c99f91 Give EqualMemory a zero_offset.
e4983d43 [sve] Fix the insr test.
b40aa695 [sve] Fix and extend the cterm test.
bcc97931 [sve] Fix the encoding of sdiv/udiv.
0093bb9f [sve] Implement CPY_z_r and CPY_z_v.
6b245ba8 [sve] Fix disassembly of DUP_z_r.
b28f6178 [sve] Fix handling of overflow in {sq,uq}{add,sub}.
d9f929c5 [sve] Convert 'Add(..., -1)' to 'Sub(..., 1)'.
6f111bc8 [sve] Implement andv, eorv and orv.
61a271ea Merge branch 'master' into sve
c0066278 [sve] Add an unpredicated 'Neg' macro.
6995bfd2 [sve] Implement the SVEIntWideImmUnpredicated group.
6205eb45 [sve] Implement contiguous ld1 loads.
bc21a0d1 [sve] Implement the IntUnaryArithmetic group.
e668b200 [sve] Implement contiguous st1 stores.
199339db [sve] Implement simple Z and P loads and stores.
4f28df7f [sve] Implement permute vector unpredicated instructions group.
1314c46c [sve] Implement 'Adr(Register, SVEMemOperand)'.
9e5da2af [sve] Implement the SVEStackAllocation group.
66e6671d Add SVEMemOperand.
845246bf [sve] Add support for the SVEIntArithmeticUnpredicated group.
13634762 [sve] Add support for the SVEIntBinaryArithmeticPredicated group.
d4e0b1be Merge branch 'master' into sve
f5659ff6 Enable P register logging.
e8289200 Test various vector lengths.
d961a0c3 [sve] Implement `cntp`.
0ce75844 [sve] Implement most 'SVEPredicateMisc' instructions.
4d6c680d Use macro lists for repetitive visitors.
d9002964 Merge branch 'master' into sve
302729ce [sve] Add support for the SVEIntCompare{Signed|Unsigned}Imm group.
c844bb27 [sve] Add support for the SVEIntCompareScalars group.
935b15be Fix ClearForWrite with Z registers.
cd8148c2 [sve] Add support for `index`.
d1686cbb [sve] Add support for INCP and DECP.
6069fd45 Introduce IntegerOperand.
96713fe6 [sve] Add support for the SVEIntCompareVectors group.
fad4dffa Enable Z register tracing.
4b6167bd Merge branch 'master' into sve
9d0f264b Make CPURegister::X() return a Register.
a1885a51 Implement unpredicated bitwise operations with immediate.
e87e11e1 Merge branch 'master' into sve
afd17aa5 [sve] Add SVE condition aliases.
f4fa8226 [sve] Implement SVEPredicateLogical instruction group.
ae2fc3bf [sve] Add support for movprfx.
9ec0cbee [sve] Add 128-bit vector element type.
cfb94218 [sve] Implement bitwise logical unpredicated instructions.
72d2e56a [sve] Basic SVE Z register tracing.
61a75f97 [sve] Make PRegister* types derive from PRegister.
fbdd3b77 [sve] Merge ZRegisterNoLaneSize into ZRegister.
22023dfa [sve] Add support for the IntMulAddPredicated group.
f9658283 Merge branch 'master' into sve
2a249921 [sve] Fix a comment in the test infrastructure.
9d06c4d6 [sve] Enable simulation of the test framework.
e546c4a5 [sve] Enable simulation of RegisterDump::Dump().
81878560 [sve] Implement AcquireGoverningP().
2eaecf15 [sve] P register test utilities.
ee9123c8 [sve] Add P register support to UseScratchRegisterList.
9a570dd7 [sve] Add tests for some UseScratchRegisterScope tools.
0e90ead1 [sve] Rework CPURegister and related classes.
22430a4d Merge branch 'master' into sve
03c0b515 [sve] Z register tools for assembler tests.
dc47bded [sve] Support ZRegister in UseScratchRegisterScope.
119bd21a Merge branch 'master' into sve
99aa1859 Merge branch 'master' into sve
f20f247d [sve] Add predicate condition testing functions in the simulator.
cf730b6c [sve] Get rid of DEBUG-only NORETURN visitors.
e0590ccf [sve] Add P register infrastructure for simulation.
bbb13c6b Merge branch 'master' into sve
e3d059b7 [sve] Add Z register infrastructure for simulation.
d77a8e42 [sve] Add Z and P register support to RegisterDump.
44777d49 [sve] Remove MacroAssembler::Dupm.
bdd38cb1 [sve] Add macro assembler skeleton
b4f387ab Add disassembler test skeleton
63db3c15 Fix types of asm governing predicates
d7b90956 Update .gitreview for the 'sve' branch.
25197201 Split assembler tests
00aa898a [sve] Implement basic register support.
e91d1ec0 Add SVE simulator skeleton
4fbccad0 Merge branch 'master' into sve
c7406fab Split disassembler tests
a9abe595 Add SVE disassembler skeleton.
6847c2fc Add SVE assembler skeleton and dummy [ZP]Register classes
aaba1a48 SVE instruction constants
b545d6ca Implement SVE decoder and skeleton components

Test: mma test-art-host-vixl
Test: test.py --host --optimizing --jit --gtest
Test: test.py --target --optimizing --jit
Test: run-gtests.sh

Change-Id: Ic10af84a026fe83d788f587b6d1fc2240be915fb


README.md

VIXL: ARMv8 Runtime Code Generation Library, Development Version

Contents:

Overview
Licence
Requirements
Known limitations
Usage

Overview

VIXL contains three components.

Programmatic assemblers to generate A64, A32 or T32 code at runtime. The assemblers abstract some of the constraints of each ISA; for example, most instructions support any immediate.
Disassemblers that can print any instruction emitted by the assemblers.
A simulator that can simulate any instruction emitted by the A64 assembler. The simulator allows generated code to be run on another architecture without the need for a full ISA model.

The VIXL git repository can be found on 'https://git.linaro.org'.

Changes from previous versions of VIXL can be found in the Changelog.

Licence

This software is covered by the licence described in the LICENCE file.

Requirements

To build VIXL the following software is required:

Python 2.7
SCons 2.0
GCC 4.8+ or Clang 4.0+

A 64-bit host machine implementing an LP64 data model is required. VIXL has been tested using GCC on AArch64 Debian, and using GCC and Clang on amd64 Ubuntu systems.

To run the linter and code formatting stages of the tests, the following software is also required:

Git
Google's cpplint.py
clang-format-4.0
clang-tidy-4.0

Refer to the 'Usage' section for details.

Note that on Ubuntu 18.04, clang-tidy-4.0 will only work if the clang-4.0 package is also installed.

Known Limitations for AArch64 code generation

VIXL was developed for JavaScript engines, so a number of features from A64 were deemed unnecessary:

Limited rounding mode support for floating point.
Limited support for synchronisation instructions.
Limited support for system instructions.
A few miscellaneous integer and floating point instructions are missing.

The VIXL simulator supports only those instructions that the VIXL assembler can generate. The doc directory contains a list of supported A64 instructions.

The VIXL simulator was developed to run on 64-bit amd64 platforms. Whilst it builds and mostly works for 32-bit x86 platforms, there are a number of floating-point operations which do not work correctly, and a number of tests fail as a result.

VIXL may not build using Clang 3.7, due to a compiler warning. A workaround is to disable conversion of warnings to errors, or to delete the offending return statement that the compiler reports and then rebuild. This problem will be fixed in the next release.

Debug Builds

Your project's build system must define VIXL_DEBUG (e.g. -DVIXL_DEBUG) when using a VIXL library that has been built with debug enabled.

Some classes defined in VIXL header files contain fields that are only present in debug builds, so if VIXL_DEBUG is defined when the library is built, but not defined for the header files included in your project, you will see runtime failures.
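The layout mismatch described above can be illustrated with a minimal, self-contained sketch. BufferLibraryView and BufferClientView are hypothetical stand-ins, not VIXL classes: they model one class declaration as seen by a library built with VIXL_DEBUG and by client code compiled without it. The field names are assumptions chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins, not real VIXL classes. Both structs model the same
// class declaration, preprocessed with and without VIXL_DEBUG defined.

// What the library sees when it was built with -DVIXL_DEBUG.
struct BufferLibraryView {
  uint64_t cursor;
  uint64_t debug_checks;  // Field that exists only in debug builds.
  uint64_t capacity;
};

// What client code sees if it includes the same header without VIXL_DEBUG.
struct BufferClientView {
  uint64_t cursor;
  uint64_t capacity;
};

// Returns true if the two views agree on the object's size and on where the
// `capacity` field lives. Here they do not, so a client reading `capacity`
// would actually read the library's `debug_checks` field.
constexpr bool LayoutsAgree() {
  return sizeof(BufferLibraryView) == sizeof(BufferClientView) &&
         offsetof(BufferLibraryView, capacity) ==
             offsetof(BufferClientView, capacity);
}
```

Defining VIXL_DEBUG consistently for both the library and the code that includes its headers makes both sides see the same layout.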

Exclusive-Access Instructions

All exclusive-access instructions are supported, but the simulator cannot accurately simulate their behaviour as described in the ARMv8 Architecture Reference Manual.

A local monitor is simulated, so simulated exclusive loads and stores execute as expected in a single-threaded environment.
The global monitor is simulated by occasionally causing exclusive-access instructions to fail regardless of the local monitor state.
Load-acquire and store-release semantics are approximated by issuing a host memory barrier after loads or before stores. The built-in __sync_synchronize() is used for this purpose.
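The barrier placement described in the last point can be sketched as follows. This is not the simulator's actual code: the function names are hypothetical, and the sketch only shows where a full host barrier would sit relative to the memory access. It assumes GCC or Clang, which provide the __sync_synchronize() built-in.

```cpp
#include <cstdint>

// Hypothetical helper: approximates a load-acquire by performing the load
// first and then issuing a full host memory barrier.
inline uint64_t SimulatedLoadAcquire(const volatile uint64_t* address) {
  uint64_t value = *address;
  __sync_synchronize();  // Barrier after the load.
  return value;
}

// Hypothetical helper: approximates a store-release by issuing a full host
// memory barrier and then performing the store.
inline void SimulatedStoreRelease(volatile uint64_t* address, uint64_t value) {
  __sync_synchronize();  // Barrier before the store.
  *address = value;
}
```

A full barrier is stronger than acquire or release ordering requires, which is why this is an approximation and not an exact model.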

The simulator tries to be strict, and implements the following restrictions that the ARMv8 ARM allows:

A pair of load-/store-exclusive instructions will only succeed if they have the same address and access size.
Most of the time, cache-maintenance operations or explicit memory accesses will clear the exclusive monitor.
- To ensure that simulated code does not depend on this behaviour, the exclusive monitor will sometimes be left intact after these instructions.

Instructions affected by these limitations: stxrb, stxrh, stxr, ldxrb, ldxrh, ldxr, stxp, ldxp, stlxrb, stlxrh, stlxr, ldaxrb, ldaxrh, ldaxr, stlxp, ldaxp, stlrb, stlrh, stlr, ldarb, ldarh, ldar, clrex.
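The monitor rules above can be sketched with a toy model. ToyLocalMonitor and ToyGlobalMonitor are hypothetical classes written for this illustration and do not reproduce the simulator's implementation. In particular, the README does not say how "occasionally" is decided, so the fixed failure period below is purely an assumption, and the sketch does not model the monitor sometimes being left intact after a clearing event.

```cpp
#include <cstddef>
#include <cstdint>

// Toy model of the local exclusive monitor (hypothetical, not VIXL's class).
class ToyLocalMonitor {
 public:
  // A simulated load-exclusive records the address and access size.
  void MarkExclusive(uint64_t address, size_t size) {
    address_ = address;
    size_ = size;
    armed_ = true;
  }

  // Models clrex, or a cache-maintenance operation or explicit memory access
  // that clears the monitor.
  void Clear() { armed_ = false; }

  // A simulated store-exclusive succeeds only if the monitor is armed with
  // the same address and access size. The monitor is cleared either way.
  bool TryStoreExclusive(uint64_t address, size_t size) {
    bool matches = armed_ && (address == address_) && (size == size_);
    armed_ = false;
    return matches;
  }

 private:
  uint64_t address_ = 0;
  size_t size_ = 0;
  bool armed_ = false;
};

// Toy model of the global monitor: it forces an occasional failure regardless
// of the local monitor. The fixed period is an assumption for illustration.
class ToyGlobalMonitor {
 public:
  explicit ToyGlobalMonitor(unsigned failure_period)
      : failure_period_(failure_period) {}

  // Returns false on every `failure_period`-th call (never if it is zero).
  bool AllowsStore() {
    if (failure_period_ == 0) return true;
    ++calls_;
    return (calls_ % failure_period_) != 0;
  }

 private:
  unsigned failure_period_;
  unsigned calls_ = 0;
};
```

In this model a store-exclusive would succeed only when both TryStoreExclusive() and AllowsStore() return true, so code that assumes exclusive stores always succeed is exposed by the forced failures.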

Security Considerations

VIXL allows callers to generate any code they want. The generated code is arbitrary, and can therefore call back into any other component in the process. As with any self-modifying code, vulnerabilities in the client or in VIXL itself could lead to arbitrary code generation.

For performance reasons, VIXL's Assembler only performs debug-mode checking of instruction operands (such as immediate field encodability). This can minimise code-generation overheads for advanced compilers that already model instructions accurately and might consider the Assembler's checks redundant. The Assembler should only be used directly where encodability is independently checked, and where fine control over all generated code is required.
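As an illustration of what "independently checked" can mean, the sketch below tests whether a value fits the A64 ADD/SUB (immediate) encoding, which takes an unsigned 12-bit immediate optionally shifted left by 12 bits. All names here are hypothetical and none of this is VIXL's API; EmitAddImmUnchecked merely stands in for a raw emission routine whose own check disappears in release builds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of VIXL's API: returns true if `imm` fits the
// A64 ADD/SUB (immediate) encoding, i.e. an unsigned 12-bit value, optionally
// shifted left by 12 bits.
constexpr bool IsEncodableAddSubImm(uint64_t imm) {
  return ((imm & ~UINT64_C(0xfff)) == 0) ||
         ((imm & ~(UINT64_C(0xfff) << 12)) == 0);
}

// Stand-in for a raw emission routine that, like the Assembler described
// above, checks its operand only when assertions are enabled.
inline uint64_t EmitAddImmUnchecked(uint64_t imm) {
  assert(IsEncodableAddSubImm(imm));  // Compiled out when NDEBUG is defined.
  return imm;  // A real assembler would encode an instruction here.
}

// The caller checks encodability itself, in every build mode, before using
// the unchecked routine. Returns false if the immediate must be handled some
// other way (for example by a macro that emits several instructions).
inline bool TryEmitAddImm(uint64_t imm, uint64_t* emitted) {
  if (!IsEncodableAddSubImm(imm)) return false;
  *emitted = EmitAddImmUnchecked(imm);
  return true;
}
```

The point of the design is that the release-mode safety comes from the caller's own check, not from the assertion inside the emission routine.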

The MacroAssembler synthesises multiple-instruction sequences to support some unencodable operand combinations. The MacroAssembler can provide a useful safety check in cases where the Assembler's precision is not required; an unexpected unencodable operand should result in a macro with the correct behaviour, rather than an invalid instruction.

In general, the MacroAssembler handles operands which are likely to vary with user-supplied data, but does not usually handle inputs which are likely to be easily covered by tests. For example, move-immediate arguments are likely to be data-dependent, but register types (e.g. x vs w) are not.
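To make the move-immediate case concrete, here is a naive sketch of how an arbitrary 64-bit value can be materialised with a movz followed by movk instructions, one per non-zero 16-bit chunk. SynthesiseMov is a hypothetical function written for this illustration. It is not VIXL's algorithm, and a real macro assembler may well choose shorter sequences using other instructions.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical sketch: returns textual A64 instructions that load `imm` into
// x0, using movz for the first non-zero 16-bit chunk and movk for the rest.
std::vector<std::string> SynthesiseMov(uint64_t imm) {
  std::vector<std::string> sequence;
  for (int shift = 0; shift < 64; shift += 16) {
    unsigned chunk = static_cast<unsigned>((imm >> shift) & 0xffff);
    if (chunk == 0) continue;  // movz already zeroes the other chunks.
    char text[64];
    std::snprintf(text, sizeof(text), "%s x0, #0x%x, lsl #%d",
                  sequence.empty() ? "movz" : "movk", chunk, shift);
    sequence.push_back(text);
  }
  // Zero has no non-zero chunks, so a single movz is enough.
  if (sequence.empty()) sequence.push_back("movz x0, #0x0");
  return sequence;
}
```

The number of instructions depends on the value, which is exactly the kind of data-dependent variation that is hard to cover exhaustively in tests and that a macro layer is meant to absorb.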

We recommend that all users use the MacroAssembler, using ExactAssemblyScope to invoke the Assembler when specific instruction sequences are required. This approach is recommended even in cases where a compiler can model the instructions precisely, because, subject to the limitations described above, it offers an additional layer of protection against logic bugs in instruction selection.

Usage

Running all Tests

The helper script tools/test.py will build and run every test that is provided with VIXL, in both release and debug mode. It is a useful script for verifying that all of VIXL's dependencies are in place and that VIXL is working as it should.

By default, the tools/test.py script runs a linter to check that the source code conforms with the code style guide, and to detect several common errors that the compiler may not warn about. This is most useful for VIXL developers. The linter has the following dependencies:

Git must be installed, and the VIXL project must be in a valid Git repository, such as one produced using git clone.
cpplint.py, as provided by Google, must be available (and executable) on the PATH.

It is possible to tell tools/test.py to skip the linter stage by passing --nolint. This removes the dependency on cpplint.py and Git. The --nolint option is implied if the VIXL project is a snapshot (with no .git directory).

Additionally, tools/test.py tests code formatting using clang-format-4.0, and performs static analysis using clang-tidy-4.0. If you don't have these tools, disable the corresponding test using --noclang-format or --noclang-tidy, respectively.

Also note that the tests for the tracing features depend upon external diff and sed tools. If these tools are not available in PATH, these tests will fail.

Getting Started

We provide separate getting-started guides for AArch32 and AArch64, depending on which architecture you are targeting. Example source code is provided in the examples directory. You can build the examples with either scons aarch32_examples or scons aarch64_examples from the root directory, or use scons --help to get a detailed list of available build targets.