My two-day 'coding sprint' is over, as I have a complex systems conference for the next two days and this weekend will be busy. The sprint was to further evolve the design of my async hash engine, which needs to spot when hash op rounds can be coalesced into SIMD 4-Hash rounds, or whether to throw (hyper)threads at the work queue instead. It must do this very quickly, and on a data structure simultaneously in use by other threads, so the atomic ordering and locking semantics are proving quite tricky. It must also handle arbitrary digest lengths and hash termination. An earlier design simply ran a wall clock and synchronised all workers after each hash round, which hugely eases the scheduling, but it couldn't be expected to scale well once the memory prefetchers (one per SIMD stream) start tripping over one another as main memory is maxed out. This much looser design ought to scale much better, if I can get it free of race conditions. Hopefully I'll grab a few more hours before Thanksgiving.
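To make the coalescing decision a little more concrete, here is a minimal C++ sketch of one way a worker might claim pending hash ops from a shared queue without taking a lock. It is not the engine's actual design: `HashWorkQueue`, `Claim` and `RoundKind` are hypothetical names invented for illustration, and it assumes a fixed count of pending ops, ignoring digest lengths and hash termination entirely. A worker that manages to claim four ops at once would run them as one SIMD 4-Hash round; a smaller remainder falls back to a scalar round.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; none of these names come from the real engine.
enum class RoundKind { None, Scalar, Simd4 };

struct Claim {
    std::size_t first;  // index of the first op claimed
    std::size_t count;  // number of ops claimed (0 to 4)
    RoundKind kind;     // how the claimed ops would be executed
};

class HashWorkQueue {
public:
    explicit HashWorkQueue(std::size_t pending) : pending_(pending), next_(0) {}

    // Atomically claim up to four pending ops. Safe to call from many
    // threads at once: the compare-exchange loop guarantees that no two
    // callers ever receive overlapping index ranges.
    Claim claim() {
        std::size_t cur = next_.load(std::memory_order_relaxed);
        for (;;) {
            if (cur >= pending_) {
                return {cur, 0, RoundKind::None};  // nothing left to do
            }
            const std::size_t n = std::min<std::size_t>(4, pending_ - cur);
            assert(n >= 1 && n <= 4);
            // On failure, cur is reloaded with the current value and we retry.
            if (next_.compare_exchange_weak(cur, cur + n,
                                            std::memory_order_acq_rel,
                                            std::memory_order_relaxed)) {
                return {cur, n, n == 4 ? RoundKind::Simd4 : RoundKind::Scalar};
            }
        }
    }

private:
    const std::size_t pending_;      // total ops in this batch (assumed fixed)
    std::atomic<std::size_t> next_;  // index of the next unclaimed op
};
```

With ten pending ops, successive calls to `claim()` would hand out two SIMD rounds of four, then a scalar round of two, then nothing. The hard part described above, ops arriving and terminating while other threads are mid-round, is exactly what this sketch leaves out.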