I'm not the guy you responded to, nor am I a kernel expert, but I have a few suggestions:
- Sites like Phoronix and LWN will go into pretty low-level kernel details like this from time to time. You could consider subscribing to their RSS feeds or something like that.
- Review a few open university courses on either Operating Systems or Computer Architecture. Short of that, you can also just browse Wikipedia for articles on these kinds of topics. I find it enjoyable to read them from time to time.
- Subscribe to the LKML (which is probably a lot more information than any single person can process, but sites like LWN and Phoronix highlight/summarize it from time to time).
I would also say that there are a lot of people out there who have made contributions to the Linux kernel, including this specific portion of it. The person you're responding to may even do it as part of their day job (and their post certainly reads that way). It's not that uncommon.
And the last thing to keep in mind is that picking up knowledge like this doesn't happen overnight. You learn far more by absorbing small things over several years than by trying to cram a lot into a short time. Don't make it a goal to learn things like this - instead, try to make it something you enjoy doing, so you keep at it over the years and keep picking up small bits of knowledge. Eventually, all the different pieces start fitting together and you too could mash out an excellent post like GP's!
Exactly! Imagine you have two services in a data center. If they have to communicate a lot with each other, then you would prefer them as close to each other as possible. Why? Well, it's because of the difference between sending a request over a network vs. just sending it to another process on the same host. Staying on the same host is much more efficient in terms of latency and bandwidth. There are, of course, downsides and other costs (like the fact that the cores handling the requests are themselves much less powerful), so you have to tailor your hardware allocation to your workloads. In general, if you're CPU-bound, you would want more powerful CPUs (necessitating fewer cores per host for power reasons), and if you're I/O-bound, you want to reduce network latency as much as possible.
Now imagine you have thousands of services. The network I/O can get pretty extreme. Plus, occasionally, you have requirements such as encrypting any data that travels from one host to another. So if you can keep as many services as possible on a single host, you avoid a lot of that overhead as well.
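To put some (completely made-up) numbers on that trade-off, here's a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the `Placement` class, the `request_time_us` helper, the latency figures, and the idea that a low-power core runs at half the speed of a big one. It's a toy model, not a benchmark.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    name: str
    hop_latency_us: float       # assumed cost of one call to the peer service
    encrypt_overhead_us: float  # assumed extra cost per call if traffic must be encrypted
    core_speed: float           # relative per-core speed (1.0 = "big core" baseline)


def request_time_us(p: Placement, compute_us: float, peer_calls: int) -> float:
    """Estimate time for one request: local compute plus calls to the peer service."""
    compute = compute_us / p.core_speed
    communication = peer_calls * (p.hop_latency_us + p.encrypt_overhead_us)
    return compute + communication


# All numbers below are illustrative assumptions, not measurements.
cross_host = Placement("few fast cores, peer on another host", 500.0, 50.0, 1.0)
same_host = Placement("many slow cores, peer on same host", 20.0, 0.0, 0.5)

# (compute time per request in microseconds, number of calls to the peer)
workloads = {"I/O-bound": (100.0, 20), "CPU-bound": (20000.0, 2)}

for label, (compute_us, peer_calls) in workloads.items():
    for p in (cross_host, same_host):
        t = request_time_us(p, compute_us, peer_calls)
        print(f"{label:10s} {p.name:38s} {t:8.0f} us")
```

With these assumed numbers, the chatty I/O-bound workload comes out far ahead on the many-slow-cores host, while the CPU-bound one is better off on the fast cores even though it pays the network and encryption cost. Change the numbers and the answer changes, which is really the whole point.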
tl;dr: everything comes down to trade-offs and understanding the needs of your workloads, but in general, running 300 low-power cores is probably indicative of an I/O-bound application, and for that kind of workload it could hypothetically be much more efficient and cost-effective.