this post was submitted on 05 Aug 2023

100 points (97.2% liked)

What would 128 bits computing look like? (lemmy.ml)

submitted 1 year ago by [email protected] to c/[email protected]

[–] [email protected] 80 points 1 year ago (3 children)

The PS3 had a 128-bit CPU. Sort of. "Altivec" vector processing could split each 128-bit word into several values and operate on them simultaneously. So for example if you wanted to do 3D transformations using 32-bit numbers, you could do four of them at once, as easily as one. It doesn't make doing one any faster.
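
A rough way to picture that lane splitting, as a sketch only: the helper names below are made up for illustration, and a Python integer stands in for the 128-bit register. Four 32-bit values share one word, and one "instruction" adds all four lanes without carries leaking between them.

```python
MASK32 = 0xFFFFFFFF  # one 32-bit lane

def pack4(values):
    """Pack four unsigned 32-bit values into one 128-bit word (lane 0 is lowest)."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & MASK32) << (32 * lane)
    return word

def unpack4(word):
    """Split a 128-bit word back into its four 32-bit lanes."""
    return [(word >> (32 * lane)) & MASK32 for lane in range(4)]

def add4(a, b):
    """Lane-wise add: each lane wraps on its own, carries never cross lanes."""
    return pack4([(x + y) & MASK32 for x, y in zip(unpack4(a), unpack4(b))])

a = pack4([1, 2, 3, 0xFFFFFFFF])
b = pack4([10, 20, 30, 1])
print(unpack4(add4(a, b)))
```

Real hardware does the four additions at the same time; the loop here only models the result, not the speed.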

Vector processing is present in nearly every modern CPU, though. Intel's had it since the late 90s with MMX and SSE. Those just had to load registers 32 bits at a time before performing each single-instruction-multiple-data operation.

The benefit of increasing bit depth is that you can move that data in parallel.

The downside of increasing bit depth is that you have to move that data in parallel.

To move a 32-bit number between places in a single clock cycle, you need 32 wires between two places. And you need them between any two places that will directly move a number. Routing all those wires takes up precious space inside a microchip. Indirect movement can simplify that diagram, but then each step requires a separate clock cycle. Which is fine - this is a tradeoff every CPU has made for thirty-plus years, as "pipelining." Instead of doing a whole operation all-at-once, or holding back the program while each instruction is being cranked out over several cycles, instructions get broken down into stages according to which internal components they need. The processor becomes a chain of steps: decode instruction, fetch data, do math, write result. CPUs can often "retire" one instruction per cycle, even if instructions take many cycles from beginning to end.
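
The "one instruction retired per cycle, even though each takes many cycles" point can be sketched with a little arithmetic. This assumes an idealized pipeline with one cycle per stage and no stalls, which real CPUs do not guarantee; the function names are illustrative.

```python
def unpipelined_cycles(n_instructions, n_stages):
    """Each instruction runs all its stages before the next one starts."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions, n_stages):
    """Ideal pipeline: fill it once, then one instruction finishes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1

# Four stages: decode instruction, fetch data, do math, write result.
for n in (1, 10, 1000):
    print(n, unpipelined_cycles(n, 4), pipelined_cycles(n, 4))
```

A single instruction still takes four cycles either way; only the throughput changes.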

To move a 128-bit number between places in a single clock cycle, you need an obscene amount of space. Each lane is four times as wide and still has to go between all the same places. This is why 1990s consoles and graphics cards might advertise 256-bit interconnects between specific components, even for mundane 32-bit machines. They were speeding up one particular spot where a whole bunch of data went a very short distance between a few specific places.

Modern video cards no doubt have similar shortcuts, but that's no longer the primary way they perform ridiculous quantities of work. Mostly they wait.

CPUs are linear. CPU design has sunk eleventeen hojillion dollars into getting instructions into and out of the processor, as soon as possible. They'll pre-emptively read from slow memory into layers of progressively faster memory deeper inside the microchip. Having to fetch some random address means delaying things for agonizing microseconds with nothing to do. That focus on straight-line speed was synonymous with performance, long after clock rates hit the gigahertz barrier. There's this Computer Science 101 concept called Amdahl's Law that was taught wrong as a result of this - people insisted 'more processors won't work faster,' when what it said was, 'more processors do more work.'

Video cards wait better. They have wide lanes where they can afford to, especially in one fat pipe to the processor, but to my knowledge they're fairly conservative on the inside. They don't have hideously-complex processors with layers of exotic cache memory. If they need something that'll take an entire millionth of a second to go fetch, they'll start that, and then do something else. When another task stalls, they'll get back to the other one, and hey look the fetch completed. 3D rendering is fast because it barely matters what order things happen in. Each pixel tends to be independent, at least within groups of a couple hundred to a couple million, for any part of a scene. So instead of one ultra-wide high-speed data-shredder, ready to handle one continuous thread of whatever the hell a program needs next, there's a bunch of mundane grinders being fed by hoppers full of largely-similar tasks. It'll all get done eventually. Adding more hardware won't do any single thing faster, but it'll distribute the workload.

Video cards have recently been pushing the ability to go back to 16-bit operations. It lets them do more things per second. Parallelism has finally won, and increased bit depth is mostly an obstacle to that.

So what 128-bit computing would look like is probably one core on a many-core chip. Like how Intel does mobile designs, with one fat full-featured dual-thread linear shredder, and a whole bunch of dinky little power-efficient task-grinders. Or... like a Sony console with a boring PowerPC chip glued to some wild multi-phase vector processor. A CPU that they advertised as a private supercomputer. A machine I wrote code for during a college course on machine vision. And it also plays Uncharted.

The PS3 was originally intended to ship without a GPU. That's part of its infamous launch price. They wanted a software-rendering beast, built on the Altivec unit's impressive-sounding parallelism. This would have been a great idea back when TVs were all 480p and games came out on one platform. As HDTVs and middleware engines took off... it probably would have killed the PlayStation brand. But in context, it was a goofy path toward exactly what we're doing now - with video cards you can program to work however you like. They're just parallel devices pretending to act linear, rather than the other way around.

[–] [email protected] 21 points 1 year ago (1 children)

There’s this Computer Science 101 concept called Amdahl’s Law that was taught wrong as a result of this - people insisted ‘more processors won’t work faster,’ when what it said was, ‘more processors do more work.’

You massacred my boy there. It doesn't say that at all. Amdahl's law is actually a formula for how much speedup you can get by using more cores. Which boils down to: How many parts of your program can't be run in parallel? You can throw a billion cores at something, but if you have a step in your algorithm that can't run in parallel, that's going to be the part everything waits on.

Or copied:

Amdahl's law is a principle that states that the maximum potential improvement to the performance of a system is limited by the portion of the system that cannot be improved. In other words, the performance improvement of a system as a whole is limited by its bottlenecks.

[–] [email protected] 7 points 1 year ago (2 children)

Gene Amdahl himself was arguing hardware. It was never about writing better software - that's the lesson we've clawed out of it, after generations of reinforcing harmful biases against parallelism.

Telling people a billion cores won't solve their problem is bad, actually.

Human beings by default think going faster means making each step faster. How you explain that's wrong is so much more important than explaining that it's wrong. This approach inevitably leads to saying 'see, parallelism is a bottleneck.' If all they hear is that another ten slow cores won't help but one faster core would - they're lost.

That's how we got needless decades of doggedly linear hardware and software. Operating systems that struggled to count to two whole cores. Games that monopolized one core, did audio on another, and left your other six untouched. We still lionize cycle-juggling maniacs like John Carmack and every Atari programmer. The trap people fall into is seeing a modern GPU and wondering how they can sort their flat-shaded triangles sooner.

What you need to teach them, what they need to learn, is that the purpose of having a billion cores isn't to do one thing faster, it's to do everything at once. Talking about the linear speed of the whole program is the whole problem.

[–] [email protected] 6 points 1 year ago (1 children)

You still don't get it. This is about algorithmic complexity.

Say you have an algorithm that has 90% that can be done in parallel, but you have 10% that can't. No matter how many cores you throw at it, be it 4, 10, or a billion, the 10% will be the slowest part that you can't optimize with more cores. So even with an unlimited amount of cores, your algorithm is still having to wait on the last 10% that runs on a single core.

Amdahl's law is simply about those 10% you can't speed up, no matter how many cores you have. It's a bottleneck.
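
For the 90/10 example, the usual statement of Amdahl's law is speedup = 1 / (serial + parallel / cores). A minimal sketch (the function name is just for illustration):

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup of a fixed-size job when only parallel_fraction of it scales with cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# 90% parallel, 10% serial: the curve flattens out just below 10x.
for cores in (1, 4, 10, 1_000_000_000):
    print(cores, round(amdahl_speedup(0.9, cores), 3))
```

However large the core count gets, the result stays under 1 / 0.1 = 10.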

There are algorithms you can't run in parallel, simply because the results depend on each other. For example in a cipher where you first calculate block A, then to calculate block B you rely on block A. You can't do block A and B at the same time, it's not possible. Yes, you can use multi-threading to calculate A, then do it again to calculate B, but overall you still have waiting times while you wait for each result, which means no matter how fast you get, you always have a minimum time that you'll need.
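
A sketch of that kind of dependency chain. It uses SHA-256 chaining as a stand-in rather than a real cipher mode, and the function name and zero-filled starting value are assumptions for illustration. Each output needs the previous output as input, so the loop cannot be split across cores.

```python
import hashlib

def chained_digests(blocks, iv=b"\x00" * 32):
    """Output i depends on output i-1, so the blocks must be processed in order."""
    previous = iv
    outputs = []
    for block in blocks:
        previous = hashlib.sha256(previous + block).digest()
        outputs.append(previous)
    return outputs

original = chained_digests([b"A", b"B", b"C"])
tampered = chained_digests([b"X", b"B", b"C"])
# Changing block A changes every later result, which is why B cannot start early.
print([o != t for o, t in zip(original, tampered)])
```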

Throwing more hardware at this won't help, that's the entire point. It helps to a certain degree, but at some point the parts you can't run in parallel will hold you back. This obviously doesn't count for workloads that can be done 100% in parallel (like rendering where you can split the workload up without issues), Amdahl's law doesn't apply there as the amount of single-core work would be zero in the equation.

The whole thing is used in software development (I heard of Amdahl's law in my university class) to decide if it makes sense to multi-thread part of the application. If the work you do is too sequential then multi-threading won't give you much of a benefit (or makes it run worse, as you have to spin up threads and synchronize results).

[–] [email protected] 1 points 1 year ago (1 children)

I am a computer engineer. I get the math.

This is not about the math.

Speeding up a linear program means you've already failed. That's not what parallelism is for. That's the opposite of how it works.

Parallel design has to be there from the start. But if you tell people adding more cores doesn't help, unless!, they're not hearing "unless." They're hearing "doesn't." So they build shitty programs and bemoan poor performance and turn to parallelism to hurry things up - and wow look at that, it doesn't help.

I am describing a bias.

I am describing how a bias is reinforced.

That's not even a corruption of Amdahl's law, because again, the actual dude named Amdahl was talking to people who wanted to build parallel machines to speed up their shitty linear code. He wasn't telling them to code better. He was telling them to build different machines.

Building different machines is what we did for thirty or forty years after that. Did we also teach people to make parallelism-friendly programs? Did we fuck. We're still telling students about "linear portions" as if programs still get entered on a teletype and eventually halt. What should be a 300-level class about optimization is instead thrown at people barely past Hello World.

We tell them a billion processors might get them a 10% speedup. I know what it means. You know what it means. They fucking don't.

Every student's introduction to parallelism should be a case where parallelism works. Something graphical, why not. An edge-detect filter that crawls on a monster CPU and flies on a toy GPU. Not some archaic exercise in frustration. Not some how-to for turning two whole cores into a processor and a half. People should be thinking in workloads before they learn what a goddamn pointer is. We betray them, by using a framing of technology that's older than disco. Amdahl's law as she is taught is a relic of the mainframe era.

Telling kids about the limits of parallelism before they've started relying on it has been an excellent way to ensure they won't.

[–] [email protected] 4 points 1 year ago (1 children)

At this point you're just arguing to argue. Of course this is about the math.

This is Amdahl's law, it's always about the math:

https://upload.wikimedia.org/wikipedia/commons/thumb/e/ea/AmdahlsLaw.svg/1024px-AmdahlsLaw.svg.png

No one is telling students to use or not use parallelism, it depends on the workload. If your workload is highly sequential, multi-threading won't help you much, no matter how many cores you have. So you might be able to switch out the algorithm and go with a different one that accomplishes the same job. Or you re-order tasks and rethink how you're using the data you have available.

Practical example: The game Factorio. It has thousands of conveyor belts that have to move items in a deterministic way. So as not to mess things up, this part of the game ran on a single thread to calculate where everything landed (as belts can intersect, items can block each other and so on). With some clever tricks they rebuilt how it works, which allowed them to safely spread the workload over several cores (at least for groups of belts). Bit of a write-up here (under "Multithreaded belts").

Teaching software development involves teaching the theory. Without that you would have a difficult time deciding what can and what can't benefit from multi-threading. Absolutely no one says "never multi-thread!" or "always multi-thread!"; if you had a teacher like that then they sucked.

Learning about Amdahl's law was a tiny part of my university course. A much bigger part was actually multi-threading programs, working around deadlocks, doing performance testing and so on. You're acting as if the teacher shows you Amdahl's law and then says "Obviously this means multi-threading isn't worth it, let's move on to the next topic".

[–] [email protected] 2 points 1 year ago (1 children)

"The way we teach this relationship causes harm."

"Well you don't understand this relationship."

"I do, and I'm saying: people plainly aren't getting it, because of how we teach it."

"Well lemme explain the relationship again--"

Nobody has to tell people not to use parallelism. They just... won't. In part because of how people tend to think, by default, and in part because of how we teach them to think.

We would have to tell students to use parallelism, if we expect graduates to choose it freely. It's hard and it's weird and you can't just slap it on at the end. It should become what they do first.

I am telling you in some detail how focusing on linear performance, using the language of the nineteen goddamn seventies, doesn't need to say multi-threading isn't worth it, to leave people thinking multi-threading isn't worth it.

Jesus, even calling it "multi-threading" is an obstacle. It makes parallelism sound like some fancy added feature. It's the version of parallelism that shows up in late-version changelogs, when for some reason performance has become an obstacle.

[–] [email protected] 2 points 1 year ago (1 children)

Multi-threading is difficult, you can't just slap it on everything and call it a day.

There are languages where it's easier (Go, Rust, ...) but parallelism is an advanced feature. Do it wrong and you get race conditions or deadlocks. There is a reason you learn about this later in programming, but you do learn about it (and get to use it).

When we're being honest most programmers work on CRUD applications, which are highly sequential, usually waiting on IO and not CPU cycles and so on. Saving 2ms on some operations doesn't matter if you wait 50ms on the database (and sometimes using more threads is actually slower due to orchestration). If you're working with highly efficient algorithms or with GPUs then parallelism has a much higher priority. But it always depends on what you're working with.

Depending on your tech stack you might not even have the option to properly use parallelism, for example with JavaScript (if you don't jump through hoops).

[–] [email protected] 1 points 1 year ago

"Here's all the ways we tell people not to use parallelism."

I'm sorry, that's not fair. It's only a fraction of the ways we tell people not to use parallelism.

Multi-threading is difficult, which is why I said it's a fucking obstacle. It's the wrong model. The fact you'd try to "slap it on" is WHAT I AM TALKING ABOUT. You CANNOT just apply more cores to existing linear code. You MUST actively train people to write parallel-friendly code, even if it won't necessarily run in parallel.

Javascript is a terrible language I work with regularly, and most of the things that should be parallel aren't - and yet - it has abundant features that should be parallel. It has absorbed elements of functional programming that are excellent practice, even if for some goddamn reason they're actually executed in-order.

Fetches are single-threaded, in Javascript. I don't even know how they did that. Grabbing a webpage and then responding to an event using an inline function is somehow more rigidly linear than pre-emptive multitasking in Windows 95. But you should still write the damn things as though they're going to happen in parallel. You have no control over the order they happen in. That and some caching get you halfway around most locks.

Javascript, loathsome relic, also has vector processing. The kind insisted upon by that pedant in the other subthread, who thinks the 512-bit vector units in a modern Intel chip don't qualify, but the DSP on a Super Nintendo does. Array.forEach and Array.map really fucking ought to be parallelisable. Google could use its digital imperialism to force millions of devs to adopt better standards, just by following the spec and not processing keys in a rigid order. Bad code treating it like a simplified for-loop would break. Good code... wouldn't.

We want people to write that kind of code.

Not necessarily code that will run in parallel. Just code that could.
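
A sketch of "code that could" in Python rather than Javascript, with made-up function names. The per-element function is pure, so a plain loop and a thread pool give the same answer no matter which element gets processed first. In CPython the pool will not make this particular loop faster; the point is only that nothing depends on execution order.

```python
from concurrent.futures import ThreadPoolExecutor

def brighten(pixel):
    """Pure function: the output depends only on the input, no shared state."""
    return min(pixel + 40, 255)

def run_sequential(pixels):
    return [brighten(p) for p in pixels]

def run_parallel(pixels, workers=4):
    # pool.map may execute calls in any order, but returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(brighten, pixels))

pixels = list(range(0, 256, 5))
print(run_sequential(pixels) == run_parallel(pixels))
```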

Workload-centric thinking is the only thing that's going to stop "let's add a little parallelism, as a treat" from producing months of needless agony. Anything else has to be dissected, warped beyond recognition, and stitched back together, with each step taking more effort than starting over from scratch, and the end result still being slow and unreadable and fragile.

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago)

Amdahl's isn't the only scaling law in the books.

Gustafson's scaling law looks at how the hypothetical maximum work a computer could perform scales with parallelism—idea being for certain tasks like simulations (or, to your point, even consumer devices to some extent) which can scale to fully utilize, this is a real improvement.

Amdahl's takes a fixed program, considers what portion is parallelizable, and tells you the speed up from additional parallelism in your hardware.

One tells you how much a processor might do, the other tells you how fast a program might run. Neither is wrong, but both are an incomplete picture of the colloquial "performance" of a modern device.

Amdahl's is the one you find emphasized by a Comp Arch 101 course, because it corrects the intuitive error of assuming you can double the cores and get half the runtime. I only encountered Gustafson's law in a high performance architecture course, and it really only holds for certain types of workloads.
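
The two laws side by side, as a sketch with illustrative function names. Note the fractions are not measured the same way: Amdahl's parallel fraction is a share of the single-core runtime, while Gustafson's is a share of the runtime on the parallel machine, so feeding both the same 0.9 is a simplification.

```python
def fixed_size_speedup(parallel_fraction, cores):
    """Amdahl: same job, more cores. How much sooner does it finish?"""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

def scaled_speedup(parallel_fraction, cores):
    """Gustafson: same wall-clock time, more cores. How much more work gets done?"""
    return (1.0 - parallel_fraction) + parallel_fraction * cores

for cores in (1, 10, 100):
    print(cores,
          round(fixed_size_speedup(0.9, cores), 2),
          round(scaled_speedup(0.9, cores), 2))
```

With 100 cores the fixed-size job tops out near 9x, while the scaled workload is about 90 times larger.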

[–] [email protected] 6 points 1 year ago (1 children)

slight correction. vector processing is available on almost no common architectures. What most architectures have is SIMD instructions. Which means that code that was written for sse2 cannot and will not ever make use of the wider AVX-512 registers.

The risc-v isa is going towards the vector processing route. The same code works on machines with wide vector registers, or ones with no real parallel ability, but will simply loop in hardware.

Simd code running on a newer cpu with better simd capabilities will not run any faster. Unmodified vector code on a better vector processor, will run faster
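
A toy model of that difference, loosely inspired by how vector-length-agnostic loops are usually described; it is not RISC-V code, and `hardware_lanes` is a hypothetical parameter standing in for whatever the machine reports. The loop body never hard-codes a width, so the same code takes fewer passes on wider hardware.

```python
def vector_add(a, b, hardware_lanes):
    """Add two equal-length lists in chunks. Returns (result, passes)."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    result = []
    passes = 0
    i = 0
    while i < len(a):
        # Ask the "hardware" how many elements it can take this pass.
        vl = min(hardware_lanes, len(a) - i)
        result.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
        passes += 1
    return result, passes

a, b = list(range(10)), [1] * 10
print(vector_add(a, b, hardware_lanes=1)[1])   # no real parallel ability
print(vector_add(a, b, hardware_lanes=4)[1])
print(vector_add(a, b, hardware_lanes=16)[1])
```

Fixed-width SIMD code is the equivalent of writing `vl = 4` into the source: it keeps taking the same number of passes on a machine that could do 16.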

[–] [email protected] 3 points 1 year ago (6 children)

Fancier tech co-opting an existing term doesn't make the original use wrong.

Any parallel array operation in hardware is vector processing.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

I am unsure about the historical reasons for moving from 32-bit to 64-bit, but wouldn't the address space be a significantly larger factor? Like you said, CPUs have had vectoring instructions for a long time, and we wouldn't move to 128-bit architectures just to be able to compute with numbers of that size. Memory bandwidth is, also as you say, limited by the bus widths and not the processor architecture. IMO, the most important reason we transitioned to 64-bit is the larger address space, without having to use stupidly complex memory mapping schemes. There are also some types of numbers like timestamps and counters that profit from 64-bit, but even here I am not sure if the complex architecture would yield a net slowdown or speedup.

To answer the original question: 128 bits would have no helpful benefit for the address space (already massive) and probably just slow everyday calculations down.

[–] [email protected] 2 points 1 year ago (1 children)

8-bit machines didn't stop dead at 256 bytes of memory. Address length and bus width are completely independent. 1970s machines were often built with bit-slice memory, with however many bits of addressing, and one-bit output. If you wanted 8-bit memory then you'd wire eight chips in parallel - with the same address lines. Each chip would deliver a different part of the same logical byte.
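
A small model of that wiring, with hypothetical class names: eight one-bit-wide chips all see the same address, and each one stores a different bit of the byte. The address width (16 bits here, chosen arbitrarily) has nothing to do with the 8-bit data width.

```python
class OneBitChip:
    """A memory chip with many addresses but a single data pin."""
    def __init__(self, address_bits):
        self.cells = [0] * (1 << address_bits)

    def write(self, address, bit):
        self.cells[address] = bit & 1

    def read(self, address):
        return self.cells[address]

class ByteWideMemory:
    """Eight chips in parallel, sharing address lines, one bit of the byte each."""
    def __init__(self, address_bits, width=8):
        self.chips = [OneBitChip(address_bits) for _ in range(width)]

    def write(self, address, value):
        for bit_index, chip in enumerate(self.chips):
            chip.write(address, (value >> bit_index) & 1)

    def read(self, address):
        return sum(chip.read(address) << i for i, chip in enumerate(self.chips))

memory = ByteWideMemory(address_bits=16)   # 64 KiB of 8-bit memory
memory.write(0xBEEF, 0xA5)
print(hex(memory.read(0xBEEF)))
```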

64-bit math doesn't need 64-bit hardware, either. Turing completeness says any computer can run the same code - memory and time allowing. As an object example, Javascript exclusively used 64-bit double floats, even when it was defined in the late 1990s, and ran exclusively on 32-bit machines.
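
For the integer case, here is a sketch of a 64-bit add built only from 32-bit pieces plus a carry (the function name is made up). It works, and it also shows the cost: several narrow operations where a 64-bit machine needs one.

```python
MASK32 = 0xFFFFFFFF

def add64_with_32bit_ops(a, b):
    """Add two unsigned 64-bit values using 32-bit halves and an explicit carry."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32              # 1 if the low halves overflowed

    hi = (a_hi + b_hi + carry) & MASK32   # wraps like a 64-bit register would
    return (hi << 32) | lo

print(hex(add64_with_32bit_ops(0xFFFFFFFF, 1)))
```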

[–] [email protected] 1 points 1 year ago (2 children)

Clearly you can address more bytes than your data bus width. But then why all the "hacks" on 32-bit architectures? Like the 36-bit address bus via memory mapping on SPARCv8 instead of using paired index registers (or ARMv7 with LPAE). From a performance perspective using an address width that is not the native register width / internal data bus width is an issue. For a significant subset of operations multiple instructions are required instead of one.

Also is your comment about turing completeness to be taken seriously? We are talking about performance and practicality. Go ahead and crunch on some 64-bit floats using purely 8-bit arithmetic operations (or even using vector registers). Of course you can, but the point is that a suitable word size is more effective for certain computational tasks. For operations that are done frequently, they should ideally be done at native data-bus width. Vectored operations will also cost performance.

[–] [email protected] 50 points 1 year ago (2 children)

They would look the same really. The word size being 128 instead of 64 doesn't really change anything about the architecture. It just means the proc's registers are 128 bits in size, the system bus is 128, each RAM address and data is 128, etc. The only difference would be that crunching ridiculously large numbers gets significantly cheaper. So really not much benefit. I expect 64 to be the standard for quite a long time, maybe forever, because we have much bigger bottlenecks to worry about.

[–] [email protected] 16 points 1 year ago

There are already special instruction sets that deal with 128 and up bits. Many SIMD. AVX-512 for example deals with 512 bits at a time.

At this point the advantage is parallelization and specialization of operations. AVX can be used for video encoding/decoding for example, or crypto, ...

[–] [email protected] 6 points 1 year ago (2 children)

maybe forever, because we have much bigger bottlenecks to worry about.

Well now I'm wondering what bottlenecks you have in mind. What do you believe to be the biggest bottlenecks for PCs in the near future?

[–] [email protected] 19 points 1 year ago

We're getting to the point where we can't really make transistors much smaller, for one

[–] [email protected] 12 points 1 year ago

Mostly heat. Every gate destroys information, which is kinda the definition of entropy, so it necessarily generates heat. There's goofy plans for "reversible computing" that swap bits - so true is 10 and false is 01 - and those should only produce heat through the resistance in the wires. (I personally suspect you'd have to shuttle data elsewhere and destroy it anyway. That'd be off-chip, so it could be arbitrarily large, instead of concentrating hundreds of watts in a thumbnail of silicon. But you'd still have a motherboard with a north bridge, a south bridge, and a woodshed.)

The other change that'd make wider lanes less egregious is 3D chip design. We're pretty far from 2D, already. There's dozens of layers of stuff going on in any complex microchip. AMD's even stacking a couple naked dies on top of one another for higher memory bandwidth. But what'd be transformative is the ability to fold any square layout into a cube, with as much fine detail vertically as it has horizontally. 256-bit data paths could be 16 traces wide and tall. Some could have no presence at all, because the destination is simply atop the source, and connected by a bunch of 10nm diagonals.

But aside from the design and manufacturing complexity of that added dimension, current technology would briefly turn that cube into an incandescent lightbulb. The magic smoke would escape with unprecedented efficiency.

[–] [email protected] 41 points 1 year ago (1 children)

Exactly the same as 64-bit computing, except pointers now take up twice as much RAM, and therefore you need more baseline memory throughput and more cache, for pretty much no practical benefit, because we aren't close to fully using up a 64-bit address space.
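
Back-of-the-envelope numbers for the pointer overhead. The node layout (8 bytes of payload plus two pointers, no padding) and the 64-byte cache line are assumptions picked for illustration, not measurements of any real system.

```python
def node_bytes(pointer_bytes, payload_bytes=8, pointers_per_node=2):
    """Size of a hypothetical doubly linked list node, ignoring alignment padding."""
    return payload_bytes + pointers_per_node * pointer_bytes

def nodes_per_cache_line(pointer_bytes, line_bytes=64):
    """How many whole nodes fit in one cache line."""
    return line_bytes // node_bytes(pointer_bytes)

for bits in (64, 128):
    pointer_bytes = bits // 8
    print(bits, node_bytes(pointer_bytes), nodes_per_cache_line(pointer_bytes))
```

Same data, but the 128-bit version fits half as many nodes per cache line, so the same traversal touches more memory.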

[–] [email protected] 8 points 1 year ago (1 children)

Our modern 64 bit processors do use 128 bits for certain vector operations though, don't they? So there is another aspect apart from address space.

[–] [email protected] 5 points 1 year ago

Yes, up to 512 bits since Skylake. But there are very few real-world tasks that can make use of such wide data paths. One example is media processing, where a 512-bit register could be used to pack 8 64-bit operands and act on all of them simultaneously, because there is usually a steady stream of data to be processed using similar operations. In other tasks, where processing patterns can't make use of such batched approaches, the extra bits would essentially be wasted.

[–] [email protected] 17 points 1 year ago (1 children)

It wouldn't be much different. Was it noticeably different when you went from a 32 bit to 64 bit computer?

[–] [email protected] 3 points 1 year ago (1 children)

For me it was, actually. Maybe because I was late to the party so people stopped developing shit for 32 bits, and when I did the transition was like "Finally, I can install shit" also my computer was newer and the OS worked better.

[–] [email protected] 13 points 1 year ago

So your PC was old (thus the new one faster) and its HW no longer supported by some software developers (because it was outdated and not enough users were on it anymore). The same can hold true if you have a 5 year old PC now. You didn't notice this due to going 64bit, you noticed it due to going away from a heavily outdated system.

[–] [email protected] 15 points 1 year ago (1 children)

The big shortcoming of 32 bit hardware was that it limits the amount of RAM in the computer to 4 GB. 64 bit is not inherently faster (for most things) but it enables up to 16 exabytes of RAM, an incomprehensible amount. Going to 128 bit would only be helpful if 16 exabytes wasn't enough.
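
The arithmetic behind those figures, assuming one byte per address and binary units (GiB and EiB); the helper name is illustrative.

```python
GIB = 1 << 30   # 2**30 bytes
EIB = 1 << 60   # 2**60 bytes

def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address of this width can name."""
    return 1 << address_bits

print(addressable_bytes(32) // GIB, "GiB")
print(addressable_bytes(64) // EIB, "EiB")
```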

[–] [email protected] 7 points 1 year ago

Slightly off topic, but the number of bits doesn’t necessarily describe the size of memory. For example most eight bit processors had 16bit address busses and address registers.

Some processors that were 32 bits internally have 24bit memory addressing.

[–] [email protected] 13 points 1 year ago (5 children)

We have 128 bit stuff in some places where it's advantageous, but in most cases there's not really a need. 64 bits already provides a maximum integer value of (+/-)9,223,372,036,854,775,807. Double it if you don't need negatives and drop the sign. There's little need in most cases for a bigger number, and cases that do either get 128 bit hardware, or can be handled by big number libraries.
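
Those limits are easy to check. Python's built-in integers are arbitrary precision, so they serve here as a stand-in for the "big number library" case; the helper names are made up.

```python
def signed_max(bits):
    """Largest value of a two's-complement signed integer of this width."""
    return (1 << (bits - 1)) - 1

def unsigned_max(bits):
    """Largest value of an unsigned integer of this width."""
    return (1 << bits) - 1

print(f"{signed_max(64):,}")
print(f"{unsigned_max(64):,}")

# Past 64 bits, software just keeps going, only slower than native hardware would.
print(unsigned_max(64) * unsigned_max(64) > unsigned_max(64))
```

Strictly, dropping the sign gives twice the signed maximum plus one.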

[–] [email protected] 10 points 1 year ago

Similar to a modern 64 bit computer, my computer actually has a 512 bit wide ALU for SIMD, basically it lets you do the same operation on multiple numbers simultaneously.

[–] [email protected] 7 points 1 year ago

RISC-V has a 128-bit instruction set under proposal.

Nobody will ever use it, they know they'll never use it, it's stupid and impractical.

But maybe one day we'll want to connect every computer in the world together. Then we could probably use 128-bit addresses, so my computer could work on data in your computer's memory.

There are ways to do this now, but maybe they make it simpler later.

[–] [email protected] 6 points 1 year ago

You'd be able to access more memory than we'd be able to physically build in the world.

[–] [email protected] 5 points 1 year ago (2 children)

[–] [email protected] 12 points 1 year ago

Contrary to some misconceptions, these SIMD capabilities did not amount to the processor being "128-bit", as neither the memory addresses nor the integers themselves were 128-bit, only the shared SIMD/integer registers. For comparison, 128-bit wide registers and SIMD instructions had been present in the 32-bit x86 architecture since 1999, with the introduction of SSE. However the internal data paths were 128bit wide, and its processors were capable of operating on 4x32bit quantities in parallel in single registers.

Source

[–] [email protected] 5 points 1 year ago (1 children)

What?

[–] [email protected] 3 points 1 year ago (1 children)

I would guess they think a PS2 is an example of 128 bit computing.

[–] [email protected] 3 points 1 year ago (1 children)

The PS2 had a full 128-bit DMA bus and full 128-bit registers. IIRC the Dreamcast did too.

[–] [email protected] 3 points 1 year ago (1 children)

Contrary to some misconceptions, these SIMD capabilities did not amount to the processor being "128-bit", as neither the memory addresses nor the integers themselves were 128-bit, only the shared SIMD/integer registers

[–] [email protected] 5 points 1 year ago (1 children)

OP’s question is very vague. I would argue that the PS2 was indeed capable of “128-bit computing”, even if it isn’t technically a 128-bit computer.

I’m also pretty sure the comment was tongue in cheek.

[–] [email protected] 2 points 1 year ago

I also believe the initial reply was a bit cheeky.

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago) (1 children)

It’s hard to picture “128-bit computing” in a general sense as ever being a thing. It’s just so far beyond anything we can realistically use now, plus it would be inefficient/wasteful on most ordinary tasks.

Put this together with the physical limits to Moore’s law and current approaches to at least mobile computing ……

I picture more use of multi-core, specialty cores, and system on a chip. Some loads, like video, benefit from wide lanes, huge bandwidth, and addressing many things at once, and we have video cores with architectures more suited for that. Most loads can be done with a standard compute core, and it is unnecessary, maybe counterproductive, to move up to 128-bit. If we want efficiency cores, like some mobile chips already have, 128-bit is wrong/bad/inefficient. We’ll certainly have more AI cores, but I have no idea what they need.

If you can forgive the Apple boosterism and take this as a general trend, see the focus on fast interconnections to many specialty cores. Each core has a different architecture and different needs

— https://www.apple.com/newsroom/2023/06/apple-introduces-m2-ultra/

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

Not even an Apple thing, isn't this just how SoCs work in general? Definitely something Intel and AMD should be doing though (if they aren't already, I honestly don't know), especially with hardware decoders and ML cores and whatnot.

[–] [email protected] 2 points 1 year ago

Yes, this is how SoCs can work. I think it is a great description of one specific company emphasizing a balance of different cores to do different jobs, rather than trying to make many general cores attempt to do everything. However, don’t get distracted by all the marketing language or by the fact that this is a company that people love to hate.

[–] [email protected] 4 points 1 year ago

In fact, your computer is already capable of processing more than 64 bits at once using SIMD instructions. Many applications, including games and things you wouldn't suspect, may already be using them.
