dynarmic

mirror of https://github.com/azahar-emu/dynarmic synced 2025-11-13 10:30:07 +01:00

Author	SHA1	Message	Date
Merry	08b123feb5	IR: Modify VectorSignedSaturatedShiftLeftUnsigned to only accept immediate shift amounts	2022-10-18 15:04:30 +01:00
Merry	9313f5ea88	IR: Remove VectorShuffleHighHalfwords and VectorShuffleLowHalfwords	2022-10-18 15:04:30 +01:00
Merry	a97105c296	IR: Split VectorSignedSaturatedDoublingMultiply into VectorSignedSaturatedDoublingMultiply{High,HighRounding}	2022-10-18 15:04:30 +01:00
Merry	61d509dda2	IR: Add VectorMultiply{Signed,Unsigned}Widen instructions. Polyfill for x86-64 backend.	2022-10-18 15:04:30 +01:00
Merry	babfb7d7b8	IR/saturation: Revamp saturated add/sub IR instructions	2022-10-18 15:04:30 +01:00
Merry	cd537dc711	IR: Rename PackedAbsDiffSumS8 to PackedAbsDiffSumU8	2022-10-18 15:04:30 +01:00
Merry	8b41755db0	ir_emitter: Remove unused ResultAndCarryAndOverflow structure	2022-10-18 15:04:30 +01:00
Merry	a2b3199adf	Convert NZCV to C flag where able	2022-07-23 11:46:07 +01:00
Merry	72c87d11e4	a32_get_set_elimination_pass: Correct insertion point	2022-07-20 16:53:48 +01:00
Merry	78b4ba10c9	Migrate to mcl	2022-04-19 18:05:04 +01:00
merry	879f211686	ir/value: Add AccType to Value	2022-03-26 15:38:10 +00:00
merry	98cff8dd0d	IR: Implement SHA256MessageSchedule{0,1}	2022-03-20 13:59:18 +00:00
merry	f0a4bf1f6a	IR: Implement SHA256Hash	2022-03-20 13:59:18 +00:00
Wunkolo	5e7d2afe0f	IR: Introduce `VectorReduceAdd{8,16,32,64}` opcode. Adds all elements of vector and puts the result into the lowest element. Accelerates the `addv` instruction into a vectorized implementation rather than a serial one.	2021-09-27 19:54:11 +01:00
Wunkolo	1e94acff66	ir: Add VectorBroadcastElement{Lower} IR instruction. The lane-splatting variant of `FMUL` and `FMLA` is very common in instruction streams when implementing things like matrix multiplication. When used, they are used very densely. See https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/coding-for-neon---part-3-matrix-multiplication. The way this is currently implemented is by grabbing the particular lane into a general purpose register and then broadcasting it into a simd register through `VectorGetElement` and `VectorBroadcast`: `const IR::U128 operand2 = v.ir.VectorBroadcast(esize, v.ir.VectorGetElement(esize, v.V(idxdsize, Vm), index));`. What could be done instead is to keep it within the vector-register and use a permute/shuffle to "splat" the particular lane across all other lanes, removing the GPR-round-trip. This is implemented as the new IR instruction `VectorBroadcastElement`: `const IR::U128 operand2 = v.ir.VectorBroadcastElement(esize, v.V(idxdsize, Vm), index);`	2021-08-07 23:03:57 +01:00
Wunkolo	5971361160	IR: Add AndNot{32,64} IR instruction. Also includes BMI1-acceleration for x64, when available.	2021-07-02 22:27:29 +01:00
Wunkolo	49d00634f9	IR: Add VectorAndNot IR instruction. And(a, Not(b)) is a common enough operation that this can be fused into a single `AndNot` operation. On x64 this is also a single `pandn` instruction rather than two.	2021-07-02 22:27:29 +01:00
SachinVin	a626a2ec63	ir_emitter: Remove 32-bit-only `SubWithCarry`	2021-06-11 17:27:34 +01:00
SachinVin	ccf27f9c8c	ir_emitter: Remove 32-bit-only `AddWithCarry`	2021-06-09 01:54:03 +01:00
MerryMage	828959caed	IR: Implement FPVector{To,From}Half32. Implement ASIMD VCVT (half) in terms of this instruction. Correct handling of ASIMDStandardValue.	2021-06-05 03:39:48 +01:00
MerryMage	17ae7f9ce1	IR: Implement IR instruction CallHostFunction	2021-05-23 15:44:57 +01:00
MerryMage	53493b2024	Add .clang-format file. Using clang-format version 12.0.0.	2021-05-22 15:07:02 +01:00
Merry	714216fd0e	Consolidate all source files into src/ directory	2021-05-19 17:41:59 +01:00

23 Commits
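
Several of the commit messages above describe what a new IR instruction computes (`AndNot`, `VectorReduceAdd`, `VectorBroadcastElement`). The sketch below is a plain scalar model of those descriptions, not dynarmic code: the `Vec4x32` type and the free functions are hypothetical names chosen for illustration, the vector is fixed at four 32-bit lanes, and zeroing the upper lanes in the reduce model is an assumption, since the log only says the result goes into the lowest element.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model type: a 128-bit vector viewed as four 32-bit lanes.
using Vec4x32 = std::array<std::uint32_t, 4>;

// Models AndNot32 as the log describes it: And(a, Not(b)).
inline std::uint32_t AndNot32(std::uint32_t a, std::uint32_t b) {
    return a & ~b;
}

// Models VectorReduceAdd32: sum every lane and place the sum in lane 0.
// Assumption: the remaining lanes are zeroed (the log does not say).
inline Vec4x32 VectorReduceAdd32(const Vec4x32& v) {
    std::uint32_t sum = 0;
    for (std::uint32_t lane : v) {
        sum += lane;  // unsigned addition wraps modulo 2^32
    }
    return Vec4x32{sum, 0, 0, 0};
}

// Models VectorBroadcastElement32: copy ("splat") one lane into all lanes.
// Precondition: index < 4.
inline Vec4x32 VectorBroadcastElement32(const Vec4x32& v, std::size_t index) {
    const std::uint32_t lane = v[index];
    return Vec4x32{lane, lane, lane, lane};
}
```

In this model, `VectorBroadcastElement32(v, i)` yields the same lanes as extracting lane `i` and broadcasting it, which is the equivalence the `VectorBroadcastElement` commit relies on when it removes the general-purpose-register round trip.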