this post was submitted on 05 Jun 2024
96 points (98.0% liked)

[–] [email protected] 16 points 5 months ago* (last edited 5 months ago) (3 children)

So, the word here is parallelism. It's not something specific to Python; asyncio is just Python's implementation of asynchronous execution, which allows for parallelism.

Imagine a pizza restaurant that has one cook. This is your typical non-async, non-threading Python script - single-threaded.
The cook checks for new orders, picks up the first one and starts making the pizza one instruction at a time - fetching the dough, waiting for the ham slicer to finish slicing, ... eventually putting the unbaked pizza into the oven and sitting there waiting for the pizza to bake.
The cook is rather inefficient here: instead of waiting for the ham slicer and the oven to finish their jobs, he could be picking up new orders, starting new pizzas and fetching/making other ingredients.

This is where asynchronicity comes in as a solution. The cook is your single thread, and the machines are mechanisms that have to be started but don't have to be waited on - usually various sockets and file buffers (notice these are things your OS can handle for you on the side, hence async IO).
So, the cook configures the ham slicer (puts a block of ham in) and starts it - but does not wait for each ham slice to fall out so he can put it on the pizza. Instead he picks up a new order and goes through the motions until the ham slicer is done (or until he needs the slicer to cut a different ingredient, in which case he would have to wait for the ham task to finish first, put... cheese there, and switch to finishing the first order with ham).

With proper asynchronicity your cook can now handle a lot more pizza orders, simply because less of his time is spent waiting.
Making a single pizza is not faster, but in total the cook can make more of them in the same time. This is the important bit.
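
To make the cook-and-oven picture concrete, here is a minimal sketch. Everything in it is made up for illustration: make_pizza, run_kitchen and ORDERS are hypothetical names, and asyncio.sleep() stands in for the oven doing its work on the side. The log shows the single cook starting all three pizzas before any of them is done.

```python
import asyncio


async def make_pizza(name, bake_seconds, log):
    """One order: start it, hand it to the 'oven', finish it when the oven is done."""
    log.append(f"start {name}")
    # The cook does not stand in front of the oven; while this pizza bakes,
    # the event loop is free to continue other orders.
    await asyncio.sleep(bake_seconds)
    log.append(f"done {name}")
    return name


async def run_kitchen(orders):
    """Work on all orders on a single thread and return the event log."""
    log = []
    await asyncio.gather(*(make_pizza(name, secs, log) for name, secs in orders))
    return log


# Hypothetical orders with different baking times (in seconds).
ORDERS = [("ham", 0.15), ("cheese", 0.05), ("veggie", 0.10)]

if __name__ == "__main__":
    for line in asyncio.run(run_kitchen(ORDERS)):
        print(line)
```

Each pizza still takes its full baking time, but the whole batch finishes in roughly the time of the slowest pizza rather than the sum of all three.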


Coming back to why an async REPL is useful: it simply comes down to how Python implements async - with special ("colored") functions:

async def prepare_and_bake(pizza):
    # await: a context switch can occur here, and Python will check whether
    # other asynchronous tasks can be continued or finalized. So instead of
    # blocking while waiting for the oven to be empty, the cook looks for
    # other tasks to do.
    await oven.is_empty()
    await oven.bake(pizza)
    ...

The function prepare_and_bake() is an asynchronous function (async def), which makes it special. I would have to dive into event loops to fully explain why an async REPL is useful, but in short: calling an async function directly does not execute it (the call only returns a coroutine object) - you have to schedule it on an event loop.
The async REPL is here to help with that, allowing you to do await prepare_and_bake() directly in the REPL.
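
A small sketch of that difference, assuming a simplified prepare_and_bake() that just returns a string (the body and the demo() helper are hypothetical stand-ins, not the real kitchen). In a normal script you need something like asyncio.run() to actually run the coroutine:

```python
import asyncio
import inspect


async def prepare_and_bake(pizza):
    # Stand-in for waiting on the oven; yields control to the event loop once.
    await asyncio.sleep(0)
    return f"{pizza} baked"


def demo():
    # Calling the async function does NOT run its body;
    # it only creates a coroutine object.
    coro = prepare_and_bake("margherita")
    is_coroutine = inspect.iscoroutine(coro)

    # To actually execute it, hand it to an event loop.
    result = asyncio.run(coro)
    return is_coroutine, result


if __name__ == "__main__":
    print(demo())
```

In an async-aware REPL (for example the one started with python -m asyncio, available in recent Python versions) an event loop is already running for you, so typing await prepare_and_bake("margherita") at the prompt works without the asyncio.run() wrapper.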


And to give you an example where async does not help, you can't speed up cutting up onions with a knife, or grating cheese.
Now, if every ordered pizza required a lot of cheese you might want to employ a secondary cook to preemptively do these tasks (and "buffer" the processed ingredients in a bowl so that your primary cook does not have to always wait for the other cook to start and finish).

This is called concurrency: multiple tasks that require direct work and can't be delegated to a machine (the OS; to be precise, tasks that can't just be started and awaited) are done at the same time.
In a real example, if something requires a lot of computation (calculating something - like getting the nth Fibonacci number, applying a function to a list with a lot of entries, ...) you would want to employ secondary threads or processes so that your main thread does not get blocked.
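
Here is a rough sketch of the "second cook" idea, using asyncio.to_thread() (Python 3.9+) to push a computation onto a worker thread so the event loop stays responsive. The names fib and main are hypothetical, and the tick counter only exists to show that the main thread keeps doing things while the computation runs. One caveat, stated as an assumption about standard CPython: threads there share the GIL, so this keeps the main thread unblocked but does not make the computation itself faster; for real speed-ups on heavy computation you would reach for processes (for example concurrent.futures.ProcessPoolExecutor) instead.

```python
import asyncio


def fib(n):
    """Plain, blocking computation: the nth Fibonacci number (fib(0) == 0)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


async def main(n):
    ticks = 0
    # Hand the computation to a worker thread (the "second cook").
    job = asyncio.create_task(asyncio.to_thread(fib, n))
    # Meanwhile the main thread's event loop stays free for other work.
    while not job.done():
        ticks += 1
        await asyncio.sleep(0.001)
    return job.result(), ticks


if __name__ == "__main__":
    value, ticks = asyncio.run(main(30))
    print(value, ticks)
```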

To summarize, async/parallelism helps in cases where you can delegate (IO) processing to the OS (usually reading from or writing to a buffer). It does not make anything go faster by itself; it is just more efficient because you don't spend so much time waiting, which is often a problem in single-threaded applications.

Hopefully this was a somewhat understandable explanation haha. Here is some recommended reading: https://realpython.com/async-io-python/

Final EDIT: Reading it myself a few times, a pizza bakery example is not optimal. A better example would have been something where one has to talk with other people who don't respond immediately - to better drive home that this is mainly used for input/output tasks.

[–] [email protected] 6 points 5 months ago

Final EDIT: Reading it myself a few times, a pizza bakery example is not optimal

Yeah, I kept getting hung up on the notion of toppings being sliced or grated to-order, rather than bought from Sysco pre-prepped in big plastic bags like any normal pizzeria would do. Your analogy be fancy!

[–] [email protected] 6 points 5 months ago

Thank you very much!! That was very informative <3

[–] [email protected] 1 points 5 months ago (1 children)

Have you got concurrency and parallelism swapped around?

[–] [email protected] 1 points 5 months ago* (last edited 5 months ago)

In what part exactly?
The example is not perfect, I can see that myself. If I read into it too much there could be an overlap with concurrency, e.g. the (IO) tasks awaited & delegated to the OS could be considered a form of concurrency, but other than that I do think it's close to describing how async usually works.