199 Commits

Author SHA1 Message Date
SachinVin
048da372e9 block_of_code.cpp: remove redundant align() 2021-07-17 22:12:31 +01:00
Wunkolo
5971361160 IR: Add AndNot{32,64} IR instruction
Also includes BMI1-acceleration for x64, when available
2021-07-02 22:27:29 +01:00
Wunkolo
49d00634f9 IR: Add VectorAndNot IR instruction
And(a, Not(b)) is a common enough operation that this can
be fused into a single `AndNot` operation. On x64 this is also
a single `pandn` instruction rather than two.
2021-07-02 22:27:29 +01:00
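To illustrate the fusion described in the commit above, here is a minimal scalar sketch of the identity involved. The function names are hypothetical and model one 32-bit lane only; they are not the project's IR emitter API. Note that x86 `pandn` inverts its destination (first) operand, so a backend has to place `b` there; how the real emitter arranges its operands is not shown in the log.

```cpp
#include <cstdint>

// Unfused form: two separate operations, And(a, Not(b)).
constexpr std::uint32_t AndThenNot(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t not_b = ~b;  // Not(b)
    return a & not_b;                // And(a, Not(b))
}

// Fused form: one AndNot(a, b) operation computing a & ~b.
// (Hypothetical helper; models a single lane of the vector op.)
constexpr std::uint32_t AndNot(std::uint32_t a, std::uint32_t b) {
    return a & ~b;
}

// Both forms agree, which is what makes the fusion safe.
static_assert(AndThenNot(0xFFu, 0x0Fu) == AndNot(0xFFu, 0x0Fu), "forms must agree");
```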
Wunkolo
253713baf1 opcodes.inc: Disable clang format 2021-07-02 22:27:29 +01:00
Wunkolo
1fc96fd0c2 emit_x64{_vector}_floating_point: Unsafe AVX512 implementation of Emit{RSqrt,Recip}Estimate
This implementation exists within the unsafe optimization paths and
utilizes the 14-bit-precision `vrsqrt14*` and `vrcp14p*`
instructions provided by AVX512F+VL. These are _more_ accurate than
the fallback path and the current `rsqrt`-based unsafe code-path
but still fall in line with what is expected of the
`Unsafe_ReducedErrorFP` optimization flag.

With AVX512 available, these functions have 14 bits of precision.
Without AVX512, they have 11 bits of precision.
2021-06-27 11:18:58 +01:00
MerryMage
ea02a7d05d conditional_state: Break from translation when invalid NV instruction is hit 2021-06-25 22:09:39 +01:00
Lioncash
9bb464a203 externals: Update fmt to 8.0.0 2021-06-23 05:04:53 -04:00
Wunkolo
c6125082ea emit_x64_floating_point: AVX512 implementation of EmitFPMinMaxNumeric 2021-06-20 10:12:27 +01:00
SachinVin
a626a2ec63 ir_emitter: Remove 32-bit-only SubWithCarry 2021-06-11 17:27:34 +01:00
Wunkolo
776208742b emit_x64_{vector_}floating_point: Centralize implementation of FP{Vector}{Abs,Neg}
Removes dependency on the constants at the top of some files
such as `f16_negative_zero` and `f32_non_sign_mask` in favor
of the `FPInfo` trait-type.

Also avoids bypass delays by selecting between instructions
such as `pand`, `andps`, or `andpd` depending on the type,
which keeps each operation in its respective uop domain.

See https://www.agner.org/optimize/instruction_tables.pdf for
more info on bypass delays.
2021-06-10 00:04:57 +01:00
Wunkolo
58ffde23f9 bit_util: Make Replicate constexpr 2021-06-10 00:04:57 +01:00
SachinVin
ccf27f9c8c ir_emitter: Remove 32-bit-only AddWithCarry 2021-06-09 01:54:03 +01:00
Wunkolo
5385edcc66 emit_x64_vector_floating_point: AVX512 implementation of EmitFPVector{Min,Max}{32,64} 2021-06-08 17:50:28 +01:00
Wunkolo
0c67b913fe backend/x64: Add vcmp constants 2021-06-08 17:50:28 +01:00
Wunkolo
8fde505943 backend/x64: Add vfpclass constants
Bit-wise constants for use with the `vfpclass` instruction.
2021-06-08 17:50:28 +01:00
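As a rough sketch of what bit-wise `vfpclass` constants can look like: the 8-bit immediate of `vfpclass` selects which value categories to test, one bit per category. The bit positions below follow the documented instruction encoding; the constant names are hypothetical and may not match the ones added by this commit.

```cpp
#include <cstdint>

// One bit per category in the vfpclass imm8 (names are illustrative).
namespace fpclass {
constexpr std::uint8_t QNaN      = 1u << 0;  // quiet NaN
constexpr std::uint8_t ZeroPos   = 1u << 1;  // +0
constexpr std::uint8_t ZeroNeg   = 1u << 2;  // -0
constexpr std::uint8_t InfPos    = 1u << 3;  // +infinity
constexpr std::uint8_t InfNeg    = 1u << 4;  // -infinity
constexpr std::uint8_t Denormal  = 1u << 5;  // denormal
constexpr std::uint8_t NegFinite = 1u << 6;  // negative finite value
constexpr std::uint8_t SNaN      = 1u << 7;  // signalling NaN

// Categories are combined with bitwise OR to test several at once.
constexpr std::uint8_t AnyNaN  = QNaN | SNaN;
constexpr std::uint8_t AnyZero = ZeroPos | ZeroNeg;
}  // namespace fpclass
```

An immediate such as `fpclass::AnyNaN` would then make a single `vfpclass` test match both NaN kinds.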
Wunkolo
c82e29ed82 backend/x64: Add vrange constants
Adds compile-time `FpRangeLUT` for generating the 8-bit
immediate LUT value for the `vrange*` instruction
2021-06-08 17:50:28 +01:00
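A minimal sketch of how such a compile-time helper could build the `vrange*` immediate, assuming the documented layout where bits [1:0] select the operation and bits [3:2] select the sign control. The enum and enumerator names are hypothetical; only `FpRangeLUT` is named in the commit message, and its real signature is not shown in the log.

```cpp
#include <cstdint>

// imm8[1:0]: which value to select (illustrative names).
enum class RangeSelect : std::uint8_t {
    Min    = 0b00,  // smaller value
    Max    = 0b01,  // larger value
    AbsMin = 0b10,  // smaller absolute value
    AbsMax = 0b11,  // larger absolute value
};

// imm8[3:2]: how the sign of the result is chosen (illustrative names).
enum class RangeSign : std::uint8_t {
    FromFirst  = 0b00,  // copy sign of the first source
    FromResult = 0b01,  // keep sign of the selected value
    Clear      = 0b10,  // force positive
    Set        = 0b11,  // force negative
};

// Packs both fields into the immediate at compile time.
constexpr std::uint8_t FpRangeLUT(RangeSelect select, RangeSign sign) {
    return static_cast<std::uint8_t>(
        static_cast<unsigned>(select) | (static_cast<unsigned>(sign) << 2));
}
```

For example, `FpRangeLUT(RangeSelect::Max, RangeSign::FromResult)` yields `0b0101`.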
MerryMage
c1d5a7977e Add Unsafe_IgnoreStandardFPCRValue optimization 2021-06-08 17:26:45 +01:00
Wunkolo
c157dfcc4c emit_x64_vector: Reduce gf2p8affineqb requirement to GFNI
Currently, every usage of `gf2p8affineqb` is guarded by the
`AVX512F + AVX512VL + GFNI` requirement, when really
we only need `GFNI` on its own.

This will allow `GFNI`-only chips to emit GFNI instructions without
needing to have AVX512 as well.
There _are_ chips in existence currently that ship with GFNI but
have no implementation of AVX1/AVX2/AVX512 (and thus no VEX/EVEX
encoding), such as Tremont (Lakefield) chips.
2021-06-08 14:00:00 +01:00
Wunkolo
e47d0d11c3 emit_x64_vector: AVX512 implementation of EmitVectorNot
Single in-place ternary logic instruction.
2021-06-08 03:11:38 +01:00
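To show why one ternary logic instruction is enough for a NOT, here is a small software model of `vpternlog` on a 64-bit value. For each bit position, the three input bits form a 3-bit index into the 8-bit immediate, which acts as a truth table. The `tern` constants and the `TernaryLogic` function are hypothetical names for illustration; the immediate the real emitter uses is not shown in the log.

```cpp
#include <cstdint>

// Truth-table columns for the three inputs; combining them with C++
// bitwise operators yields the immediate for the same boolean function.
namespace tern {
constexpr std::uint8_t a = 0xF0;
constexpr std::uint8_t b = 0xCC;
constexpr std::uint8_t c = 0xAA;
}  // namespace tern

// Bit-by-bit model of vpternlog: index = (a_bit << 2) | (b_bit << 1) | c_bit,
// and the result bit is imm8[index].
constexpr std::uint64_t TernaryLogic(std::uint64_t a, std::uint64_t b,
                                     std::uint64_t c, std::uint8_t imm8) {
    std::uint64_t result = 0;
    for (int bit = 0; bit < 64; ++bit) {
        const std::uint64_t index =
            (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1);
        result |= static_cast<std::uint64_t>((imm8 >> index) & 1) << bit;
    }
    return result;
}

// Immediate for "NOT c": invert the c column of the truth table.
constexpr std::uint8_t NotC = static_cast<std::uint8_t>(~tern::c);
```

Passing the same value for all three inputs with `NotC` inverts it in place, which is the single-instruction NOT the commit describes.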
Markus Wick
0c12614d1a A64/config.h: Split fastmem and page_table options.
We might want to allocate different sizes for each of them,
e.g. for the unsafe fastmem approach without bounds checking,
or for using the full 48-bit address range (with mirrors) by allocating our real arena as close to 1<<47 as possible.
2021-06-06 17:25:51 +01:00
MerryMage
828959caed IR: Implement FPVector{To,From}Half32
Implement ASIMD VCVT (half) in terms of this instruction.
Correct handling of ASIMDStandardValue.
2021-06-05 03:39:48 +01:00
Wunkolo
9a23c09c3b emit_x64_floating_point: AVX implementation of ZeroIfNaN 2021-05-31 13:41:05 +01:00
Wunkolo
e9c5c01eda emit_x64{_vector}_floating_point: AVX512 implementation of ZeroIfNaN
Uses a single `vfixupimm` to turn `QNaN`/`SNaN` into `+0`.
2021-05-31 13:39:56 +01:00
Wunkolo
fe5abdb3e1 backend/x64: Add vfixup constants
Adds compile-time `FixupLUT` function for generating the 32-bit
LUT of src->dst mappings
2021-05-31 13:39:56 +01:00
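The 32-bit `vfixupimm` table holds eight 4-bit responses, one per input class, so a compile-time builder is mostly shifting and OR-ing. The sketch below assumes the documented class order (QNaN, SNaN, zero, +1, -Inf, +Inf, negative, positive) and lists only a subset of the response codes. The `Fixup` enum and the parameter names are hypothetical; only the name `FixupLUT` comes from the commit message.

```cpp
#include <cstdint>

// A subset of the 4-bit response codes (illustrative names).
enum class Fixup : std::uint8_t {
    Dest    = 0b0000,  // leave the destination unchanged
    Src     = 0b0001,  // pass the source through
    NegInf  = 0b0100,
    PosInf  = 0b0101,
    NegZero = 0b0111,
    PosZero = 0b1000,
};

// Packs one response per input class, 4 bits each, lowest class first.
constexpr std::uint32_t FixupLUT(Fixup src_qnan = Fixup::Dest,
                                 Fixup src_snan = Fixup::Dest,
                                 Fixup src_zero = Fixup::Dest,
                                 Fixup src_pos_one = Fixup::Dest,
                                 Fixup src_neg_inf = Fixup::Dest,
                                 Fixup src_pos_inf = Fixup::Dest,
                                 Fixup src_neg = Fixup::Dest,
                                 Fixup src_pos = Fixup::Dest) {
    const Fixup responses[8] = {src_qnan, src_snan, src_zero, src_pos_one,
                                src_neg_inf, src_pos_inf, src_neg, src_pos};
    std::uint32_t lut = 0;
    for (int i = 0; i < 8; ++i) {
        lut |= static_cast<std::uint32_t>(responses[i]) << (i * 4);
    }
    return lut;
}
```

Under these assumptions, a table that maps both NaN classes to `+0` and leaves everything else alone would be `FixupLUT(Fixup::PosZero, Fixup::PosZero)`, i.e. `0x00000088`.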
MerryMage
8235de9829 {a32,a64}_emit_x64: Fix fast_dispatch_table_lookup call in Unpatch on W^X systems
fast_dispatch_table_lookup is in JITted code, and thus execution must be enabled before it can be called.
2021-05-30 22:30:51 +01:00
MerryMage
0a98e5d3d7 exception_handler_*: Simplify message for case when exception is not our fault 2021-05-30 22:22:02 +01:00
MerryMage
9815502fee emit_x64_data_processing: operand in EmitExtractRegister is not modified 2021-05-30 22:18:21 +01:00
Markus Wick
36c3b289a0 fixup! a64/fastmem: Implement fastmem on 128 bit memory access. 2021-05-28 22:14:09 +01:00
Markus Wick
e82685223a a64/fastmem: Implement fastmem on 128 bit memory access. 2021-05-28 18:49:31 +01:00
Markus Wick
ff01b1c6f9 a64/fastmem: Only generate abort handler if needed.
If fastmem fails, we call the callback from the signal handler, so this callback proxy in slowmem will never be used.
2021-05-28 18:49:31 +01:00
MerryMage
709773dcf1 a64_emit_x64: Implement fastmem for A64 frontend for 8-64 bit reads/writes 2021-05-28 18:49:31 +01:00
Merry
bbffae2f96 emit_x64_vector_saturation: AVX implementation of EmitVectorSignedSaturated 2021-05-28 15:34:49 +01:00
Merry
56e3bf57d2 emit_x64_vector_saturated: Consolidate unsigned operations into EmitVectorUnsignedSaturated 2021-05-28 15:34:49 +01:00
Merry
a76e8c8827 emit_x64_vector_saturation: Reduce esize noise in EmitVectorSignedSaturated 2021-05-28 15:34:49 +01:00
Merry
de31caca49 emit_x64_vector_saturation: AVX implementation of EmitVectorUnsignedSaturatedSub32 2021-05-28 15:34:49 +01:00
Merry
b46e6a24dc emit_x64_vector_saturation: AVX implementation of EmitVectorUnsignedSaturatedAdd32 2021-05-28 15:34:49 +01:00
Merry
d087ef42b9 emit_x64_vector_saturation: AVX implementation of EmitVectorUnsignedSaturatedSub32 2021-05-28 15:34:49 +01:00
Merry
0a232a6fbf emit_x64_vector_saturation: AVX2 implementation of EmitVectorUnsignedSaturatedAdd64 2021-05-28 15:34:49 +01:00
Wunkolo
57601f064b emit_x64_vector_saturation: AVX512 implementation of EmitVectorSignedSaturated 2021-05-28 15:34:49 +01:00
Wunkolo
332c26d432 emit_x64_vector_saturation: AVX512 implementation of VectorUnsignedSaturated{Add,Sub}{32,64} 2021-05-28 15:34:49 +01:00
Wunkolo
fa8cc1ac36 backend/x64: Add constants
Used to redefine x86 assembly-constants without
including platform-dependent headers such as `immintrin.h`.

Currently includes vpcmp constants as well as ternary logic
utility-terms.

Removes `immintrin.h` requirement from emit_x64_vector_saturation
and updates our usage of `vpcmp` and `vpternlog` with the new constants
2021-05-28 14:13:11 +01:00
MerryMage
f6f8024fb5 a32_emit_x64: Dump x64 disassembly upon fastmem patch failure 2021-05-25 21:57:29 +01:00
MerryMage
4256d21481 common: Add x64_disassemble 2021-05-25 21:56:59 +01:00
MerryMage
17ae7f9ce1 IR: Implement IR instruction CallHostFunction 2021-05-23 15:44:57 +01:00
Wunkolo
3c693f2576 emit_x64_vector: AVX512VBMI implementation of EmitVectorTableLookup128
Also adds AVX512VBMI detection to host_feature
2021-05-22 22:48:31 +01:00
Wunkolo
37b24ee29e emit_x64_vector: AVX512{VL+BW} implementation of EmitVectorTableLookup128
Based off of the SSE41 implementation but utilizing
embedded broadcasting, mask registers, and
the special zero-mask to default-initialize out-of-bound
indices to zero in the `is_defaults_zero` case.
2021-05-22 22:47:21 +01:00
MerryMage
53493b2024 Add .clang-format file
Using clang-format version 12.0.0
2021-05-22 15:07:02 +01:00
MerryMage
51b155df92 A32: Introduce PreCodeTranslationHook 2021-05-22 14:16:10 +01:00
Merry
714216fd0e Consolidate all source files into src/ directory 2021-05-19 17:41:59 +01:00