Rendered at 11:31:42 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.
egl2020 7 hours ago [-]
Article title should be "Efficient C++ Programming for Modern 64-bit CPUs...".
adrian_b 29 minutes ago [-]
Many cost relationships from TFA have already been more or less true for the 32-bit CPUs launched after 1990 and they all became true for the 32-bit high-end CPUs launched after 2000 (like Intel Pentium 4 and AMD Athlon XP), when the difference between the CPU clock frequency and the DRAM latency became almost as high as today.
Only for the 32-bit CPUs used in microcontrollers, which may have clock frequencies under 100 MHz and which may lack a cache hierarchy, the cost differences between many kinds of operations may collapse.
For instance even for not too old 32-bit CPUs it is right to classify the instructions in the following groups, based on their cost in clock cycles:
1. Simple integer operations with operands in registers
2. Loads from the L1 cache memory and simple floating-point operations, like addition and multiplication
3. Loads from the L2 cache memory, division (integer or floating-point), square root and mispredicted branches
4. Loads from the L3 cache memory and atomic read-modify-write operations (like atomic exchange, atomic fetch-and-add, atomic compare-and-swap)
5. Loads from the main memory
This classification matches the chart from TFA.
Nevermark 3 hours ago [-]
A CPU implementing C++ as a microarchitecture…? Finally, uncontrovertible proof of the prophesy. We really are living in a Cthulhu nightmare.
Simulation theory is dead.
sukuva 2 hours ago [-]
do we have "modern" 32-bit CPUs?
reactordev 1 hours ago [-]
Yes. Yes we do. A lot of them.
avadodin 3 hours ago [-]
That title got me:
Modern C++ CPUs as in LISP CPUs or as in Verilog CPUs?
zombot 7 hours ago [-]
Came here to say exactly that.
notorandit 36 minutes ago [-]
C++ CPUs?
reinitctxoffset 5 hours ago [-]
If people are interested in this stuff, this is the house style guide that I've ended up with in mid 2026, its great-great-great grandparents were at Google, which informed Greg Badros and Mark Rabkin and Andrei Alexandrescu when they did the one at FB, which informed a bunch of trading work, which informed a bunch of GPU work.
It links itself to some things that really seem to have existed, like a straylight project linked to the ESA, and an old domain b7r6.net linked to another HN account. There are a lot of buzzwords there, but in aggregate it is nonsense. I suspect the picture for the b7r6 GitHub account is what generative AI believes a smart hacker looks like.
Is this the internet now?
tom_ 2 hours ago [-]
Slightly struck by the concept of hand-writing the config parsing but not, apparently, the documentation...
rramadass 4 hours ago [-]
This is Excellent! Thanks for sharing.
First off, i highly suggest that you expand this into a full-blown book. This could become a successor to a combination of {Adrian & Piotr's "Software Architecture with C++" + Fedor Pikus' "The Art of Writing Efficient Programs"} for the Agentic era.
I really like that you are using Lean4 for parts of code generation, tips for Agentic coding etc. which are all needed today. I myself have been thinking on these lines i.e. using formal methods for specification and verification so that agent-generated code can be "correct-by-construction" and efficient. Your write-up is the first i have seen which tries to provide the overall picture.
Tony_Delco 2 hours ago [-]
[dead]
zombot 7 hours ago [-]
This looks like something that every serious C++ programmer should be reading.
Only for the 32-bit CPUs used in microcontrollers, which may have clock frequencies under 100 MHz and which may lack a cache hierarchy, the cost differences between many kinds of operations may collapse.
For instance even for not too old 32-bit CPUs it is right to classify the instructions in the following groups, based on their cost in clock cycles:
1. Simple integer operations with operands in registers
2. Loads from the L1 cache memory and simple floating-point operations, like addition and multiplication
3. Loads from the L2 cache memory, division (integer or floating-point), square root and mispredicted branches
4. Loads from the L3 cache memory and atomic read-modify-write operations (like atomic exchange, atomic fetch-and-add, atomic compare-and-swap)
5. Loads from the main memory
This classification matches the chart from TFA.
Simulation theory is dead.
Modern C++ CPUs as in LISP CPUs or as in Verilog CPUs?
It's opinionated but it has served me well.
https://gist.github.com/b7r6/5dde648f5dc1dea1e9039f2211f5d40...
It links itself to some things that really seem to have existed, like a straylight project linked to the ESA, and an old domain b7r6.net linked to another HN account. There are a lot of buzzwords there, but in aggregate it is nonsense. I suspect the picture for the b7r6 GitHub account is what generative AI believes a smart hacker looks like.
Is this the internet now?
First off, i highly suggest that you expand this into a full-blown book. This could become a successor to a combination of {Adrian & Piotr's "Software Architecture with C++" + Fedor Pikus' "The Art of Writing Efficient Programs"} for the Agentic era.
I really like that you are using Lean4 for parts of code generation, tips for Agentic coding etc. which are all needed today. I myself have been thinking on these lines i.e. using formal methods for specification and verification so that agent-generated code can be "correct-by-construction" and efficient. Your write-up is the first i have seen which tries to provide the overall picture.