76 Commits

Author SHA1 Message Date
Wunkolo
2c0be5e18c emit_x64_vector: AVX512 Implementation of EmitVectorNarrow{32,64}
Includes a new test case with the XTN instruction to verify
the implementation
2021-05-16 10:02:49 +01:00
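A quick sketch of what the AVX-512 lowering amounts to (assuming, as the XTN reference suggests, that VectorNarrow{32,64} truncate each element to its low half and pack the results into the lower part of the vector; the commit's actual register handling will differ):

```cpp
#include <immintrin.h>

// With AVX512F+AVX512VL, a truncating narrow is a single vpmov instruction.
__m128i narrow_64_to_32(__m128i v) { return _mm_cvtepi64_epi32(v); }  // vpmovqd
__m128i narrow_32_to_16(__m128i v) { return _mm_cvtepi32_epi16(v); }  // vpmovdw
```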
Wunkolo
105b464bc1 backend/x64: Implement HostFeature 2021-05-14 21:20:21 +01:00
MerryMage
5ebe11c329 reg_alloc: Inform RegAlloc about rsp changes 2021-05-07 12:47:55 +01:00
MerryMage
e19f898aa2 ir: Reorganize to new top level folder 2021-04-21 22:22:07 +01:00
Merry
c28f13af97 emit_x64_vector: Bugfix for EmitVectorReverseBits on AVX-512: Do not reverse bytes within the vector 2021-03-27 21:32:43 +00:00
Merry
4d33feb1fa emit_x64_vector: Bugfix for EmitVectorLogicalShiftRight8: shift_amount can be >= 8 2021-03-27 21:32:07 +00:00
Merry
91337788ee emit_x64_vector: Bugfix for EmitVectorLogicalShiftLeft8: shift_amount can be >= 8 2021-03-27 21:31:51 +00:00
Merry
dc37fe6e28 emit_x64_vector: Bugfix for ArithmeticShiftRightByte: shift_amount can be >= 8 2021-03-27 21:31:22 +00:00
Wunk
3e932ca55d
emit_x64_vector: Fix ArithmeticShiftRightByte zero_extend constant
Should be shifting in _bytes_ of `0x80`. Not bits.
2020-11-09 09:47:51 -08:00
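To make the bytes-versus-bits distinction concrete (an illustrative check, not the commit's diff): filling the vacated rows of the shift matrix requires one 0x80 byte per shifted position, which a byte-granular shift of an all-0x80 constant yields, while a bit-granular shift does not:

```cpp
#include <cstdint>

// Two low *bytes* of 0x80 (one sign-replication row per shifted bit position)...
static_assert((UINT64_C(0x8080808080808080) >> (64 - 8 * 2)) == 0x8080, "shift by bytes");
// ...whereas shifting by two *bits* yields something else entirely.
static_assert((UINT64_C(0x8080808080808080) >> (64 - 2)) == 0x2, "shift by bits");
```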
Wunkolo
ec52922dae emit_x64_vector: Use explicit 64-bit mask constant
Exchange `~0ull` with `0xFFFFFFFFFFFFFFFF` when generating
the `zero_extend` constant.
2020-11-07 15:29:12 +00:00
Wunkolo
490160ef43 emit_x64_vector: GFNI implementation of ArithmeticShiftRightByte
The bit-matrix is generated up-front and added to the constant pool.
I'm using an embedded 64-bit broadcast here (m64bcst), which is the particular
EVEX-encoded version of the instruction available with AVX512VL+GFNI.

If it ever really matters, then we would ideally detect specific host
features like bare GFNI and specific subsets of AVX512 and emit the
assembly based on those rather than on the entire Icelake uarch.
2020-11-07 15:29:12 +00:00
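Working from the documented gf2p8affineqb semantics, the arithmetic-shift matrix can be built by shifting the identity matrix by whole rows and filling the vacated rows with 0x80 so the shifted-in bits copy the sign bit. A hedged intrinsics sketch of that idea (the commit's constant-pool and m64bcst plumbing is omitted):

```cpp
#include <immintrin.h>
#include <cstdint>

// Per-byte arithmetic shift right via GFNI, valid for 1 <= n <= 7
// (n == 0 and n >= 8 need separate handling).
__m128i asr_bytes_gfni(__m128i v, unsigned n) {
    const uint64_t matrix = (UINT64_C(0x0102040810204080) << (8 * n))        // shifted identity rows
                          | (UINT64_C(0x8080808080808080) >> (64 - 8 * n));  // sign-replication rows
    return _mm_gf2p8affine_epi64_epi8(v, _mm_set1_epi64x(static_cast<long long>(matrix)), 0);
}
```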
Wunkolo
7df235aefb emit_x64_vector: GFNI implementation of EmitVectorLogicalShiftLeft8
Same principle as EmitVectorLogicalShiftRight8. An 8x8 Galois identity
matrix is bit-shifted to allow for arbitrary 8-bit-lane shifts.
2020-11-07 15:29:12 +00:00
Wunkolo
5cc646ffed emit_x64_vector: GFNI implementation of EmitVectorLogicalShiftRight8
Bitshifts of the GFNI identity matrix generate a new matrix that
applies lane-wise bitshifts as well. This allows for a fast
single-instruction implementation of a byte-lane bitshift.
2020-11-07 15:29:12 +00:00
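Spelling the principle out (a hedged sketch derived from the gf2p8affineqb semantics, not necessarily the commit's exact constants): the identity matrix is 0x0102040810204080, and shifting that constant by whole bytes moves the matrix rows, turning it into a per-byte shift matrix.

```cpp
#include <immintrin.h>
#include <cstdint>

// Per-byte logical shifts via GFNI, valid for 1 <= n <= 7
// (n >= 8 must produce all zeros and is handled separately).
__m128i lsr_bytes_gfni(__m128i v, unsigned n) {
    const uint64_t matrix = UINT64_C(0x0102040810204080) << (8 * n);
    return _mm_gf2p8affine_epi64_epi8(v, _mm_set1_epi64x(static_cast<long long>(matrix)), 0);
}

__m128i lsl_bytes_gfni(__m128i v, unsigned n) {
    const uint64_t matrix = UINT64_C(0x0102040810204080) >> (8 * n);
    return _mm_gf2p8affine_epi64_epi8(v, _mm_set1_epi64x(static_cast<long long>(matrix)), 0);
}
```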
Wunkolo
6bb49726f4 emit_x64_vector: GFNI+SSSE3 implementation of EmitVectorReverseBits
Performs a full 128-bit bit-reversal using only two instructions.

First by reversing all the bits of each byte using a Galois matrix
multiplication (vgf2p8affineqb, Icelake), and then by reversing the bytes
themselves (pshufb, SSSE3).
2020-10-02 05:56:59 +01:00
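A hedged sketch of the two-instruction sequence described above, with the constants derived from the gf2p8affineqb semantics rather than copied from the commit:

```cpp
#include <immintrin.h>

__m128i reverse_bits_128(__m128i v) {
    // Reverse the bits within every byte: multiply each byte by the GF(2)
    // bit-reversal matrix 0x8040201008040201 (vgf2p8affineqb).
    v = _mm_gf2p8affine_epi64_epi8(v, _mm_set1_epi64x(static_cast<long long>(0x8040201008040201ULL)), 0);
    // Then reverse the order of the 16 bytes (pshufb).
    const __m128i byte_reverse = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
    return _mm_shuffle_epi8(v, byte_reverse);
}
```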
MerryMage
f35aaa017c IR: Add VectorDeinterleave{Even,Odd}Lower 2020-07-04 11:04:10 +01:00
Merry
b1ff971a92 backend/x64: Temporarily avoid use of DefineValue(Argument&)
Works around issues with inappropriate values in the upper bits of values
2020-06-27 10:52:59 +01:00
MerryMage
7d1e103ff5 IR: Implement VectorTranspose 2020-06-21 12:14:13 +01:00
MerryMage
8bbc9fdbb6 A32: Implement ASIMD VTBX 2020-06-20 22:35:31 +01:00
MerryMage
87f6e412d0 emit_x64_vector: SSE4.1 implementation of EmitVectorPolynomialMultiply{Long}8 2020-06-18 18:44:00 +01:00
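For reference, this is the per-lane operation being vectorised (a scalar sketch of the semantics, not the SSE4.1 code itself):

```cpp
#include <cstdint>

// Polynomial (carry-less) multiply of two 8-bit values: partial products
// are combined with XOR rather than ADD, so no carries propagate.
uint16_t poly_mul_8(uint8_t a, uint8_t b) {
    uint16_t result = 0;
    for (int i = 0; i < 8; ++i) {
        if (b & (1u << i)) {
            result ^= static_cast<uint16_t>(a) << i;
        }
    }
    return result;
}
```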
MerryMage
f5b41aabc6 emit_x64_vector: Implement EmitVectorPolynomialMultiplyLong64 in terms of pclmulqdq 2020-06-18 18:04:23 +01:00
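The mapping here is direct: pclmulqdq performs a 64x64 -> 128-bit carry-less multiply, which is exactly the PolynomialMultiplyLong64 operation. A minimal sketch (illustrative, not the emitter code):

```cpp
#include <wmmintrin.h>  // PCLMULQDQ intrinsics

// Carry-less multiply of the low 64-bit lanes of a and b, producing a 128-bit result.
__m128i poly_mul_long_64(__m128i a, __m128i b) {
    return _mm_clmulepi64_si128(a, b, 0x00);
}
```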
MerryMage
f495018f53 block_of_code: Encapsulate CPU feature detection code 2020-06-09 21:25:57 +01:00
MerryMage
b47adaee1d emit_x64_vector: SSSE3 implementation of EmitVectorExtract 2020-06-01 15:41:36 +01:00
MerryMage
5c0bb5cc63 Remove unreachable code (MSVC warnings) 2020-04-23 16:36:34 +01:00
MerryMage
a8a712c801 Relicense to 0BSD 2020-04-23 15:45:57 +01:00
MerryMage
325808949f backend/x64: Rename namespace BackendX64 -> Backend::X64 2020-04-22 21:06:17 +01:00
MerryMage
bd88286b21 cast_util: Add FptrCast
Reduce unnecessary type duplication when casting a lambda to a function pointer.
2020-04-22 21:06:17 +01:00
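The idea, sketched in minimal form (illustrative, not the actual dynarmic helper): deduce the function type from the lambda's call operator so the call site never restates the signature.

```cpp
// Deduce a plain function type from a capture-less lambda's operator().
template <class F> struct signature_of;
template <class C, class R, class... A>
struct signature_of<R (C::*)(A...) const> { using type = R(A...); };

// Convert the lambda to a function pointer without spelling the signature twice.
template <class Lambda>
auto FptrCastSketch(Lambda f) {
    using Fn = typename signature_of<decltype(&Lambda::operator())>::type;
    return static_cast<Fn*>(f);  // uses the lambda's implicit conversion to Fn*
}

// Usage: auto* fn = FptrCastSketch([](int x) { return x + 1; });  // int (*)(int)
```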
MerryMage
81fcb4e537 mp: Migrate to shared version of mp library 2020-04-22 21:06:17 +01:00
Merry
f252a62c1b Merge pull request #502 from lioncash/header
General: Remove unnecessary includes
2020-04-22 21:04:22 +01:00
Lioncash
349d4b577a General: Remove unnecessary includes
Removes unnecessary header dependencies that have accumulated over time
as changes have been made. Lessens the number of files that need to be
rebuilt when the headers change.
2020-04-22 21:04:22 +01:00
Lioncash
cba9351b82 backend/x64/emit_*: Apply const where applicable 2020-04-22 21:04:22 +01:00
Lioncash
87083af733 general: Remove trailing spaces
General code-related cleanup. Gets rid of trailing spaces in the
codebase.
2020-04-22 21:04:21 +01:00
Lioncash
675f67e41d emit_x64_vector: Use const on locals where applicable
Normalizes the use of const in the source file.
2020-04-22 21:02:47 +01:00
Lioncash
a4cadf1cd9 frontend/ir_emitter: Add opcodes for signed saturated left shifts with unsigned saturation 2020-04-22 21:01:44 +01:00
Lioncash
b37279f65c backend/x64/emit_x64_vector: Prevent undefined behavior within VectorSignedSaturatedShiftLeft
Avoids undefined behavior caused by potentially left-shifting a signed negative
value.
2020-04-22 21:00:47 +01:00
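For context, this is the general C++ rule the commit works around (a generic sketch, not the commit's diff): before C++20, left-shifting a negative signed value is undefined behaviour, so the shift is done on the unsigned representation instead.

```cpp
#include <cstdint>

// Shift in the unsigned domain, then convert back to signed.
int64_t shl_signed(int64_t value, int amount) {
    return static_cast<int64_t>(static_cast<uint64_t>(value) << amount);
}
```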
MerryMage
f0920c0ded Fix VShift terminology
An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift.

- Rename VectorLogicalVShiftS* -> VectorArithmeticVShift*
- Rename VectorLogicalVShiftU* -> VectorLogicalVShift*
2020-04-22 20:55:50 +01:00
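A one-line illustration of the distinction (general C++, not project code):

```cpp
#include <cstdint>

// Arithmetic (signed) shift propagates the sign bit; logical (unsigned) shift fills with zeros.
const int32_t  arithmetic = INT32_C(-64) >> 2;          // -16 (sign bit replicated)
const uint32_t logical    = UINT32_C(0xFFFFFFC0) >> 2;  // 0x3FFFFFF0 (zeros shifted in)
```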
MerryMage
b51dae790d emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS16 2020-04-22 20:55:50 +01:00
MerryMage
bd47f2ca8f emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS64 2020-04-22 20:55:50 +01:00
MerryMage
3bf183d7e8 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftS32 2020-04-22 20:55:50 +01:00
MerryMage
94f9d402eb emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftU16() 2020-04-22 20:55:50 +01:00
MerryMage
6d9639e3b0 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU64() 2020-04-22 20:55:50 +01:00
MerryMage
bbc066a266 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU32() 2020-04-22 20:55:50 +01:00
Lioncash
da2e7fad87 emit_x64_vector: SSSE3 variant of EmitVectorCountLeadingZeros8()
pshufb lyfe
2020-04-22 20:55:50 +01:00
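One way the pshufb-as-nibble-LUT idea can be used for a per-byte count-leading-zeros (a hedged sketch; the commit's exact instruction sequence may differ):

```cpp
#include <immintrin.h>

// Count leading zeros of every byte using pshufb as a 16-entry nibble LUT (SSSE3).
__m128i clz8_sketch(__m128i v) {
    const __m128i nibble_mask = _mm_set1_epi8(0x0F);
    // 4 + clz of the low nibble (covers the case where the high nibble is zero).
    const __m128i lut_lo = _mm_setr_epi8(8, 7, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4);
    // clz of the high nibble, with a large sentinel for zero so it loses the min below.
    const __m128i lut_hi = _mm_setr_epi8(16, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0);
    const __m128i lo = _mm_and_si128(v, nibble_mask);
    const __m128i hi = _mm_and_si128(_mm_srli_epi16(v, 4), nibble_mask);
    return _mm_min_epu8(_mm_shuffle_epi8(lut_hi, hi), _mm_shuffle_epi8(lut_lo, lo));
}
```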
MerryMage
b8fde48732 emit_x64_vector: AVX implementation for EmitVectorCountLeadingZeros8 2020-04-22 20:55:50 +01:00
MerryMage
fd37b637aa emit_x64_vector: SSE implementation of EmitVectorCountLeadingZeros16 2020-04-22 20:55:50 +01:00
Lioncash
d426dfe942 ir: Add opcodes for unsigned saturating left shifts 2020-04-22 20:55:06 +01:00
Lioncash
b14eaaec46 ir: Add opcodes for left signed saturated shifts 2020-04-22 20:55:06 +01:00
Lioncash
a2cd643525 emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked
Given this is just an internal helper function, it can be marked static.
2020-04-22 20:55:06 +01:00
MerryMage
12243692f5 A64: Implement SQRDMULH (vector), vector variant 2020-04-22 20:55:06 +01:00
MerryMage
3e447614c6 IR: Add VectorSignedSaturatedDoublingMultiplyLong 2020-04-22 20:55:06 +01:00
MerryMage
06b31448aa emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, so apply sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
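The sign-correction mentioned in the second bullet rests on a simple identity; here it is in scalar form (a sketch of the idea, not the emitted SSE2 sequence):

```cpp
#include <cstdint>

// Recover a signed 32x32 -> 64 product from an unsigned multiply (what pmuludq gives on SSE2):
// subtract (other operand << 32) for each operand whose sign bit was set.
int64_t signed_mul_via_unsigned(int32_t a, int32_t b) {
    const uint64_t ua = static_cast<uint32_t>(a);
    const uint64_t ub = static_cast<uint32_t>(b);
    uint64_t product = ua * ub;      // unsigned product
    if (a < 0) product -= ub << 32;  // correct for a's sign
    if (b < 0) product -= ua << 32;  // correct for b's sign
    return static_cast<int64_t>(product);
}
```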