cpu-miner/sha256
Davide Grilli b6af554b0c perf(sha256): eliminate double bswap between SHA256d pass1 and pass2
Add sha256_transform_armv8_2way_pass2 which reads the pass1 output state
words directly into MSG0/MSG1 without byte serialization. Previously:
  sha256_state_to_digest() → native uint32 → BE bytes (8x write_u32_be)
  sha256_transform load   → BE bytes → vrev32q_u8 → native uint32 (4x)
These two conversions cancel out. The new path skips both, saving ~52
shift/store/load/vrev ops per 4-nonce group. Also eliminates the two
128-byte block2 stack buffers from sha256d80_hash_4way_armv8_2way.
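
The cancellation described above can be checked in portable C. This is a minimal sketch, not the repository's code: `write_u32_be` is named in the commit message but its body here is assumed, while `read_u32_be`, `msg_words_via_bytes`, `msg_words_direct`, and `paths_agree` are hypothetical names introduced for illustration. `read_u32_be` stands in for what a byte load followed by `vrev32q_u8` yields per 32-bit lane on a little-endian target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Serialize one native word as 4 big-endian bytes (digest direction).
   Body is an assumption; only the name appears in the commit message. */
static void write_u32_be(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Hypothetical helper: rebuild a native word from 4 big-endian bytes,
   the scalar equivalent of a byte load plus a per-lane byte reversal
   on a little-endian machine. */
static uint32_t read_u32_be(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Old path: pass1 state -> BE digest bytes -> pass2 message words. */
static void msg_words_via_bytes(const uint32_t state[8], uint32_t msg[8])
{
    uint8_t digest[32];
    for (int i = 0; i < 8; i++)
        write_u32_be(digest + 4 * i, state[i]);
    for (int i = 0; i < 8; i++)
        msg[i] = read_u32_be(digest + 4 * i);
}

/* New path: pass1 state words are used as pass2 message words directly. */
static void msg_words_direct(const uint32_t state[8], uint32_t msg[8])
{
    memcpy(msg, state, 8 * sizeof(uint32_t));
}

/* Returns 1 when both paths produce identical message words. */
static int paths_agree(const uint32_t state[8])
{
    uint32_t a[8], b[8];
    msg_words_via_bytes(state, a);
    msg_words_direct(state, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling `assert(paths_agree(state))` on any eight-word state should hold, since serializing to big-endian bytes and parsing them back is the identity on the word values. That is why the intermediate byte buffer can be dropped without changing the hash.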
2026-03-30 11:13:59 +02:00