this post was submitted on 08 Jun 2023
20 points (100.0% liked)
Programming
13636 readers
2 users here now
All things programming and coding related. Subcommunity of Technology.
This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
I'll bet it just ends up being the limitations of memory bandwidth to stuff things into registers for the optimized algorithm. Or, something like Mojo's autotuning finds the best way to partition the work for the hardware.