this post was submitted on 09 Sep 2023
70 points (92.7% liked)

Having a fast and responsive app is orthogonal to “knowing your big Os”. Unfortunately, most tech companies over-emphasize algorithms in interviews and downplay systems knowledge, and I believe that’s one reason behind sluggish apps and bloated systems.

I’ve seen this play out repeatedly. Interviewers ask a LeetCode-style coding question, which is then followed by the ritual of discussing time and memory complexity. Candidates ace the answers. But then… their “real” code suffers from subtle yet impactful performance problems.

all 13 comments
[–] atheken 13 points 1 year ago (2 children)

Sure, you can make it 10x faster by optimizing the complexity, but I can improve throughput by 1000x by putting it on 1000 machines.

In practice, it’s a bit of both. Paying attention to Big-O and understanding the gross relative cost of various operations goes a long way, but being able to run an arbitrarily large number of them can get you out of a lot of jams.

[–] [email protected] 16 points 1 year ago

I can improve throughput by 1000x by putting it on 1000 machines

My app is installed on 10000 phones! See how much better it runs!

[–] [email protected] 3 points 1 year ago

And sometimes a completely different approach is better than focusing on the runtime of a single piece. We had a system of SQL code that ran many functions that scaled horribly as data was added to the system. Each function was as fast as could be measured, but running them as separate, repetitive functions ended up taking over 24 hours in some cases.

Redoing them as a single stored procedure and optimizing that reduced it to a few minutes. This approach was much more complex than the individual functions and took about a year to replace what was originally written in 3 months.

Many times the effort to make something more efficient takes far longer than getting it to work in the first place, and many processes are slow because they did not allocate time to properly optimize or include staff that are able to optimize.

[–] [email protected] 11 points 1 year ago

The vast majority of wall time for most uses is I/O. You need someone on your team to care about big O, but for most teams, it's not the problem.

[–] [email protected] 5 points 1 year ago (1 children)

Article ends with "I’m not a performance expert at all, but I do like systems and enjoy thinking about their holistic behavior."

That's the only takeaway. The author isn't a subject matter expert but is going to flap on about it anyway.

[–] [email protected] 29 points 1 year ago (1 children)

I think this is the author being humble. jmmv is a long-time NetBSD and FreeBSD contributor (tmpfs, ATF, pkg_comp), has worked as an SRE at Google, and has been a developer on projects such as Bazel (build infrastructure). They probably know a thing or two about performance.

Regarding the overall point of the blog, I agree with jmmv. Big O is a measure of efficiency at scale, not a measure of performance.

As someone who teaches Data Structures and Systems Programming courses, I demonstrate this to students early on by showing them multiple solutions to a problem such as how to detect duplicates in a stream of input. After analyzing the time and space complexities of the different solutions, we run the programs and measure the time. It turns out that the O(n log n) version using sorting can beat the O(n) version due to cache locality and how memory actually works.
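That classroom experiment can be sketched in Python (a hypothetical illustration, not the actual course material). Both functions are correct; which one wins on a given run is a property of the data, the runtime, and the machine, not just the big O:

```python
import random
import time

def has_duplicates_set(items):
    # O(n) expected time: hash every item into a set
    seen = set()
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False

def has_duplicates_sort(items):
    # O(n log n): sort, then scan adjacent pairs (sequential, cache-friendly)
    items = sorted(items)
    return any(a == b for a, b in zip(items, items[1:]))

# unique values, so both functions must scan everything (their worst case)
data = random.sample(range(10_000_000), 200_000)

for fn in (has_duplicates_set, has_duplicates_sort):
    start = time.perf_counter()
    result = fn(data)
    print(f"{fn.__name__}: {result} in {time.perf_counter() - start:.3f}s")
```

Depending on the input and hardware, the sorting version can come out ahead despite the worse asymptotic bound, which is exactly the point being made.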

Big O is a useful tool, but it doesn't directly translate to performance. Understanding how systems work is a lot more useful and important if you really care about optimization and performance.

[–] [email protected] 11 points 1 year ago

Big O is a useful tool, but it doesn't directly translate to performance. Understanding how systems work is a lot more useful and important if you really care about optimization and performance.

This is something I learned pretty early on as a professional developer. I got a computer science degree and was taught data structures, algorithms, and big O. In my first job, I came across a small piece of Java code that was being run a lot that had a small list being searched each time. I figured converting to a HashSet would be faster, but in testing it was actually slower. I forget the exact details, but I learned to test my assumptions about performance and not to just blindly change things.
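The same "measure, don't assume" experiment is easy to reproduce (here in Python rather than the Java from the anecdote, and with made-up sizes). Which side wins depends on element types and the runtime; the point is that you won't know until you time it:

```python
import timeit

# a handful of elements, like the small list in the anecdote
small_list = list(range(8))
small_set = set(small_list)
probe = 7

# time a membership check on the tiny list vs the tiny hash set
t_list = timeit.timeit(lambda: probe in small_list, number=1_000_000)
t_set = timeit.timeit(lambda: probe in small_set, number=1_000_000)
print(f"list scan: {t_list:.3f}s  hash set: {t_set:.3f}s")
```

For very small collections, the constant factors (hashing, memory layout) can easily outweigh the asymptotic advantage of the hash-based structure.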

My degree, for the most part, did not prepare me for working with large, interconnected systems. The only course that came close was an elective called "concurrent programming" which was really just "algorithms, but parallel".

[–] [email protected] -1 points 1 year ago (2 children)

Besides an amazing anime, the heck is a big O?

[–] [email protected] 12 points 1 year ago* (last edited 1 year ago) (1 children)

It's a generalized method/notation for measuring how long a piece of code takes to run for a given input.

O(n^2^) means that as the input n grows, it takes exponential time to process. Like if you were trying to find items that matched in an array by looping over every item in the array and then using a nested loop in the first one to compare against every other item in the array. You'd be doing (# of items) * (# of items) comparisons, or (# of items)^2^ comparisons. n^2^ comparisons.
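That nested-loop pattern, with a counter added to make the n^2^ comparisons visible, might look like this (a hypothetical sketch):

```python
def find_matches(items):
    # compare every item against every other item with nested loops
    comparisons = 0
    matches = 0
    for a in items:
        for b in items:
            comparisons += 1
            if a == b:
                matches += 1
    return matches, comparisons

matches, comparisons = find_matches([1, 2, 3, 2])
print(matches, comparisons)  # 4 items -> 4 * 4 = 16 comparisons
```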

There are some rules about simplifying the result down to the most significant portion, so O(n^2^ + n) would be simplified to O(n^2^), but that's the basics.

It's a pretty important concept as you get into making larger projects.

[–] [email protected] 17 points 1 year ago

O(n^2^) means that as the input n grows, it takes exponential time to process.

this is really pedantic, but O(n^2^) is quadratic, not exponential. the exponential runtimes are things like O(2^n^). when n gets even modestly big (say n=100), you're looking at a difference of 2^100^ ≈ 1.26×10^30^ vs 100^2^ = 10,000. this is just to say that exponential runtime is really in a class of its own.
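The gap is easy to see by just evaluating both functions at a few sizes:

```python
# quadratic vs exponential growth at a few input sizes
for n in (10, 20, 50, 100):
    print(f"n={n:>3}: n^2 = {n**2:>6}, 2^n = {2**n}")

# at n=100, 2^n is roughly 26 orders of magnitude larger than n^2
ratio = 2**100 / 100**2
```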

but otherwise i think this was a pretty good explanation of the concept

[–] [email protected] 2 points 1 year ago

It's when you find the clitoris for the first time.

Joking aside, it's a description of the runtime of a thing for a given size of data set. It's expressed as a function. So for example an exponential function's runtime would grow faster and faster as your data set grows, linear time grows roughly in proportion to the size of the data set, and log(n) would see runtime increase very little as data set size increases.