this post was submitted on 19 Nov 2024
22 points (95.8% liked)

Godot

I have heard many times that if statements in shaders slow down the GPU massively. But I have also heard that texture samples are very expensive.

Which one is more tolerable? Which one has less impact?

I am asking because I need to decide whether to multiply a value by 0 or use an if statement.

top 9 comments
[–] Feyter 12 points 1 month ago* (last edited 1 month ago)

I think there is no obvious way of telling this, because it depends on how your if statement is constructed and, in the end, on what machine code is generated from your code.

So the best thing would probably be to implement both and measure the results. I would argue that's how performance optimisation works. Don't trust what a forum post tells you.

However chances are high that both will have similar performance in a range that doesn't matter for your use case... Without knowing your use case :)

[–] [email protected] 5 points 1 month ago* (last edited 1 month ago)

The impact of if statements depends on how you use them. GPUs are massively parallel and sacrifice complexity to fit more parallel compute. Threads aren't fully independent: they run in groups in lockstep, so when the threads in a group disagree on the condition, the whole group usually has to step through both branches.

Pixels that take the then-branch idle while the others take the else-branch, and vice versa. That's precious GPU time wasted doing nothing. Adding more cases makes this worse, because the group has to wait for every case that any of its pixels takes.
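To make the lockstep idea above concrete, here is a minimal toy cost model in Python. It is not a GPU simulator: the function name `group_cost`, the group size of 32, and the branch costs are all made-up assumptions for illustration, and real hardware and compilers behave in more nuanced ways.

```python
# Toy cost model, not a GPU simulator: threads in one group run in lockstep,
# so the group pays for every branch that at least one of its threads takes.
# group_cost, GROUP_SIZE and the cost numbers are hypothetical.

def group_cost(conditions, then_cost, else_cost):
    """Cost (arbitrary units) for one lockstep group of threads.

    conditions: one boolean per thread, the value of the `if` condition.
    """
    cost = 0
    if any(conditions):        # at least one thread needs the then-branch
        cost += then_cost
    if not all(conditions):    # at least one thread needs the else-branch
        cost += else_cost
    return cost


GROUP_SIZE = 32
uniform = [True] * GROUP_SIZE                         # same result for every thread
divergent = [i % 2 == 0 for i in range(GROUP_SIZE)]   # result differs per thread

uniform_cost = group_cost(uniform, then_cost=10, else_cost=4)
divergent_cost = group_cost(divergent, then_cost=10, else_cost=4)
print(uniform_cost, divergent_cost)  # prints "10 14"
```

In this model a group whose threads all agree only pays for the branch it takes, while a group that disagrees pays for both. Whether that extra cost matters next to a texture sample is still something to measure.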

Can't say if it's slower than your other expensive job, though. Try it out and measure.

[–] [email protected] 4 points 1 month ago (2 children)

@Smorty Link doesn't load for me and I don't know the answer in general, but one thing I can say is that _sometimes_ if statements aren't an issue at all, namely when the condition evaluates to the same thing for all pixels/fragments. E.g. an "if sin(TIME) < 0.0" costs you almost nothing, whereas "if COLOR.r > 0.5" causes execution to branch and slows you down. But I can't say how that case compares to a texture lookup; I assume it depends on many things.

[–] PoolloverNathan 2 points 1 month ago (1 children)

I've heard that using mix() instead (or whatever GDShader calls that GLSL function) can be more performant, since it doesn't branch. Is that true?

[–] [email protected] 2 points 1 month ago

@PoolloverNathan Afaik that is true, yes! mix is the same instruction for all fragments, so if you can replace a branching if with a mix that should be an improvement
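As a rough sketch of what replacing a branching if with a mix looks like, here is the arithmetic in plain Python. The `mix()` and `step()` functions below are hand-written stand-ins that follow the GLSL definitions, and `shade_branching` / `shade_branchless` are hypothetical names; this only shows that both forms select the same value, not how fast either runs on a GPU.

```python
# CPU-side sketch of the arithmetic only. mix() and step() are plain Python
# re-implementations of the GLSL built-ins, written for illustration.

def mix(a, b, t):
    """Linear blend as GLSL defines it: a * (1 - t) + b * t."""
    return a * (1.0 - t) + b * t


def step(edge, x):
    """GLSL-style step: 0.0 if x < edge, else 1.0.

    (On a GPU this is a built-in; the Python conditional here is only
    a stand-in for it.)
    """
    return 0.0 if x < edge else 1.0


def shade_branching(red, base, highlight):
    # Per-pixel condition: this is the form that can diverge on a GPU.
    if red >= 0.5:
        return highlight
    return base


def shade_branchless(red, base, highlight):
    # Same selection expressed as arithmetic: the mask is 0.0 or 1.0,
    # so one of the two inputs is effectively multiplied by zero.
    mask = step(0.5, red)
    return mix(base, highlight, mask)
```

Note that the branchless form always evaluates both inputs. If one of them is an expensive texture sample, it is paid for on every pixel, which is exactly the trade-off the original question asks about.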

[–] Kissaki 2 points 1 month ago (1 children)

Link doesn’t load for me

This post has no link. It's a text post.

[–] [email protected] 1 points 1 month ago

@Kissaki I saw this post on Mastodon, where it only contains the title and a link to the Lemmy post (which didn't work for me). I didn't realize I was actually commenting on a Lemmy post by replying on Mastodon haha

[–] [email protected] 3 points 1 month ago

@Smorty I don't know for sure, but experience tells me to go with multiplying by zero.

[–] [email protected] 1 points 1 month ago

@Smorty Because GPUs can't feasibly do speculative execution, branching is more expensive than a lookup, which can be done in parallel and cached. But of course, it depends on what you're testing and what you're sampling.
Testing a single equality is not the same as testing the result of a complex function call, and sampling a small texture is not the same as sampling a big one, with or without mipmap levels, aggregation, etc.