Sunday, February 5, 2023

Curious Undisclosed Skylake Bug

CPUs based on Intel’s Skylake-X microarchitecture have a curious bug that I haven’t seen mentioned anywhere: the AVX-512 compression instructions have a false dependency on their destination, even when zero-masking is used. In other words, the following two instructions have identical performance characteristics:

vcompressps  X{k}, Y
vcompressps  X{k}{z}, Y

Whereas we would expect the latter (zero-masked) form to depend only on k and Y, it also depends on X. The problem seems to have been fixed in Ice Lake. Surprisingly, while it affects all compression operations, it does not affect any of the expansion operations. Presumably, this is related to the odd behaviour of compression with a memory destination; expansion can’t target memory.
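To make the difference between the two forms concrete, here is a minimal scalar sketch in C of the architectural result of vcompressps with a register destination, following the documented merge-masking and zero-masking behaviour. The names compress_model, zeroing_ignores_destination, self_check and LANES are made up for illustration. The sketch models only the values produced, not the timing, so it cannot detect the bug; it only shows why the zero-masked form has no real need to read X.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 16 /* a 512-bit register holds 16 single-precision lanes */

/* Hypothetical scalar model of vcompressps with a register destination.
   zeroing == 0 models "vcompressps X{k}, Y"    (merge-masking),
   zeroing != 0 models "vcompressps X{k}{z}, Y" (zero-masking). */
void compress_model(float x[LANES], uint16_t k, const float y[LANES], int zeroing)
{
    int n = 0;
    /* Pack the lanes of Y selected by k into the low lanes of X. */
    for (int i = 0; i < LANES; i++)
        if ((k >> i) & 1u)
            x[n++] = y[i];
    /* Merge-masking keeps the old upper lanes of X; zero-masking clears them. */
    if (zeroing)
        for (; n < LANES; n++)
            x[n] = 0.0f;
}

/* Returns 1 if the zero-masked result is the same for two different
   initial values of X, i.e. the result is a function of k and Y only. */
int zeroing_ignores_destination(uint16_t k)
{
    float y[LANES], a[LANES], b[LANES];
    for (int i = 0; i < LANES; i++) {
        y[i] = (float)(i + 1);
        a[i] = -1.0f; /* one arbitrary old value of X */
        b[i] = 99.0f; /* a different old value of X   */
    }
    compress_model(a, k, y, 1);
    compress_model(b, k, y, 1);
    return memcmp(a, b, sizeof a) == 0;
}

/* Small self-check of the model for one mask (bits 0, 2, 5 and 7 set). */
void self_check(void)
{
    float y[LANES], x[LANES];
    for (int i = 0; i < LANES; i++) {
        y[i] = (float)(i + 1);
        x[i] = -1.0f;
    }
    compress_model(x, 0x00A5, y, 0);          /* merge-masking */
    assert(x[0] == 1.0f && x[1] == 3.0f && x[2] == 6.0f && x[3] == 8.0f);
    assert(x[4] == -1.0f);                    /* old X survives, a true dependency */
    compress_model(x, 0x00A5, y, 1);          /* zero-masking */
    assert(x[4] == 0.0f);                     /* old X is gone */
    assert(zeroing_ignores_destination(0x00A5));
}
```

In the merge-masked case the old upper lanes of X end up in the result, so the dependency on X is real. In the zero-masked case every lane of the result is determined by k and Y, which is why a dependency on X there can only be a false one.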

One thing I have never understood is why compression and expansion operations pun on the operation mask, instead of using it for its ordinary purpose and using a separate mask to specify the lanes to be compressed from or expanded to. There is certainly space in the instruction encoding for it. And these operations are slow enough that it seems unlikely that adding a mask would have slowed them down further. The semantics would have to be figured out, but they can be figured out (I would just AND the two masks together), and the result would be strictly more expressive.

My benchmarking code is here. On a system which has the bug, it should print two large numbers and two small numbers; on a system which does not have the bug, it should print one large number and three small numbers.


