this post was submitted on 23 Sep 2023
22 points (95.8% liked)

I know what I am asking is rather niche, but it has been bugging me for quite a while. Suppose I have the following function:

def foo(return_more: bool):
    ...
    if return_more:
        return data, more_data
    return data

You can imagine it is a function that may return more data if given a flag.

How should I typehint this function? When I use the function in both ways

data = foo(False)

data, more_data = foo(True)

the type checker flags either the first or the second statement, saying the return value cannot be assigned because the tuple has the wrong size.

Is having a variable return signature an anti-pattern? Or is Python's type-hinting mechanism not powerful enough, so that I am forced to ignore this error?

Edit: Thanks for all the suggestions. One of them pointed me to the existence of overload, and this solution fits my requirements perfectly:

from typing import overload, Literal

@overload
def foo(return_more: Literal[False]) -> Data: ...

@overload
def foo(return_more: Literal[True]) -> tuple[Data, OtherData]: ...

def foo(return_more: bool) -> Data | tuple[Data, OtherData]:
    ...
    if return_more:
        return data, more_data
    return data

a = foo(False)
a,b = foo(True)
a,b = foo(False) # correctly identified as illegal
UndercoverUlrikHD 13 points 1 year ago (1 children)

from typing import Union is probably what you're looking for, but yes, I'd argue you should try to avoid that kind of pattern, even if it's convenient.

Sorry for the triple(?) notifications. Trying out the beta version of the boost app and it's still a bit buggy.

[email protected] 1 points 1 year ago (1 children)

I thought about it, but it isn't as expressive as I wished.

Meaning if I do

a = foo(return_more=True)
or
a, b = foo(return_more=False)

it doesn't catch these errors for me.

In comparison, the other suggested solution does catch these.
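To illustrate the point, here is a minimal sketch (the `str` and `int` return types and the values are placeholders, not the real function): with a plain Union return type, a checker cannot link the flag to the shape of the result, so every caller has to narrow it by hand.

```python
from typing import Union


def foo(return_more: bool) -> Union[str, tuple[str, int]]:
    # Placeholder values standing in for the real data.
    data, more_data = "data", 42
    if return_more:
        return data, more_data
    return data


result = foo(return_more=True)
# The checker only knows "str or tuple", so the caller must narrow manually.
if isinstance(result, tuple):
    data, more_data = result
else:
    data, more_data = result, None
```

The `isinstance` check is the price of the union: it is needed at every call site, even where the flag is a constant.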

UndercoverUlrikHD 1 points 1 year ago

Yeah, good point, the linked answer seems better suited (even if I would still recommend not having a variable return). I appreciate the feedback!

[email protected] 10 points 1 year ago (1 children)

Always return both values; then you can write

data, _ = foo(False)

data, more_data = foo(True)

and document clearly in the function why the amount of returned data differs.

A boolean toggle should influence the process, but not change the signature. Maybe two functions are better?

getfoo() and getmorefoo()?

[email protected] 4 points 1 year ago (1 children)

Agreed. I avoid having the shape of the return type be determined by arguments. Having the return type be generic is one thing. But this is different, as here you are talking about returning 1 object or 2 objects.

[email protected] 1 points 1 year ago* (last edited 1 year ago) (2 children)

But from a practical perspective, let's say you retrieve an object and your API can return either of two subsets of its fields. Doesn't it make sense to reuse the same function, but change which fields are returned?

I'm specifically talking about the narrow use case where your API returns either field set A or field set B, and won't be extended in the future.

The alternative is to either duplicate the function or extract the shared part, which seems a bit overkill if there are only two variants of the function.

[email protected] 5 points 1 year ago

In my opinion, it doesn't. I'd rather have foo() and detailed_foo() over foo(detailed: bool = False).

Designing APIs can be hard at times. You have to shift your view to the person who will be using the code instead of the person implementing it. There is also a potential downside to returning either a tuple or a single thing if the single thing shares some of the same API as a tuple. Say the return type is Union[str, tuple[str, str]]. Now result[0] can be either the first string or the first character of the returned string, depending on how the function was called. This could lead to the failure happening farther away from where the bug is, which makes debugging harder. That being said, if you do want to proceed this way, overload with Literal[True] is the correct way to type this, as mentioned in other comments.
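A small sketch of that indexing pitfall, assuming a hypothetical `get_name` function (the name and values are made up for illustration):

```python
from typing import Union


def get_name(detailed: bool) -> Union[str, tuple[str, str]]:
    # Hypothetical function: returns a first name, or (first, last) if detailed.
    if detailed:
        return "Ada", "Lovelace"
    return "Ada"


# Both calls run without error, but index 0 means different things.
first_from_tuple = get_name(True)[0]  # first element of the tuple
first_from_str = get_name(False)[0]   # first character of the string
```

Neither line raises, so a wrong flag shows up only later, wherever the one-character string is finally used.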

I also don't think it's overkill to extract functionality just for 2 functions. I often do that even when it is only used in one function. Maybe the number of lines to implement the block starts to make the primary function too long. Or the logic is a bit complicated, so it is easier to follow once it has a clear name.

[email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

In general, I'd say what you're trying to do is poor form; primarily because it's "just weird."

When you're writing code that will be interacted with later as a sort of API ... the #1 thing is how that API feels to use. Is it consistent? Does it follow normal rules? Are you likely to be surprised by how it behaves? Does it compose well (i.e. how well can it be used in other code)?

You're shoving two functions together and using a boolean flag to determine where to go. That's really weird. Data shouldn't drive the program in this way.

You've basically spelled:

def do_x(): ...
def do_y(): ...

do_x()

As:

def do_(char): ...

do_('x')

The program:

def bar(k):
  x = do_(k)

Is never going to be valid. I'd never accept a code review with this code in it without an extremely strong justification of why it has to be this way.

Remember, extra lines in your program are cheap. Bugs from being clever to reduce the number of lines aren't.

[email protected] 2 points 1 year ago (1 children)

I think there's a spectrum here, and I'll clarify the stances.

The spectrum ranges from "Data shouldn't cause the function to do (something wildly) different" to "It should be allowed, even to the point of variable returns"

I think you stand on the former while I stand on the latter. Correct me if I'm wrong though, but that's the vibe I'm getting from the tone in your example.

Data shouldn't drive the program in this way.

Suppose we have a function that calculates a price of an object. I feel it is agreeable for us to have compute_price(with_discount: bool), over compute_price_with_discount() + compute_price_without_discount()

You've basically spelled:

I feel the point you're making in the example is a bit exaggerated. Again, coming back to my above example, I don't think we would construe it as compute_price('with_discount').

Maybe this is bandwagoning, but one of the reasons for my stance is that there are quite a few examples of variable returns.

eg:

  • getattr may return a different type based on the key given
  • quite a few functions in numpy return different things based on flags. SVD will return S if compute_uv=False and U, S, Vh otherwise
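
A small sketch of the getattr case (the `Point` class is made up for illustration): the type of the result depends entirely on the string passed in.

```python
class Point:
    # Hypothetical class used only for this illustration.
    def __init__(self) -> None:
        self.x = 1.5
        self.label = "origin"


p = Point()
x = getattr(p, "x")          # a float
label = getattr(p, "label")  # a str
```
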
[email protected] 1 points 1 year ago

I think you stand on the former

Absolutely.

Suppose we have a function that calculates a price of an object. I feel it is agreeable for us to have compute_price(with_discount: bool), over compute_price_with_discount() + compute_price_without_discount()

Well, presumably you'd also actually have some other inputs to a price compute function. In which case, I'd suggest bundling all that information into an Invoice type or something that includes whether or not discounts are applied...
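
A rough sketch of what that bundling could look like. The `Invoice` fields here are assumptions, since nothing above says what a price actually depends on:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    # Hypothetical fields; the real inputs to a price computation are not specified.
    base_price: float
    discount_rate: float = 0.0  # 0.0 means no discount is applied


def compute_price(invoice: Invoice) -> float:
    # The return type is always a float, whatever the discount setting.
    return invoice.base_price * (1.0 - invoice.discount_rate)


full_price = compute_price(Invoice(base_price=100.0))
discounted = compute_price(Invoice(base_price=100.0, discount_rate=0.25))
```

The discount decision travels with the data, and the function signature never changes.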

Maybe this is bandwagoning, but one of the reasons for my stance is that there are quite a few examples of variable returns.

getattr is really special, it's basically a reflection operator, it shouldn't be a model for how a normal function should behave.

I'm not familiar with numpy. The linked function though looks like a true case of generic behavior where an input changes an output in a specified way for any number of values that meet its requirements. A boolean flag is never generic.

__init__ 8 points 1 year ago (3 children)

You may be able to achieve this using typing.overload with typing.Literal for your argument. Check out this post about overload: https://adamj.eu/tech/2021/05/29/python-type-hints-how-to-use-overload/

[email protected] 4 points 1 year ago

Nice! It looks like the best solution out there.

[email protected] 2 points 1 year ago* (last edited 1 year ago)

yea, this is pretty close to what I'm looking for.

The only missing piece is the ability to define the overload methods on the bool

something like

@overload
def foo(return_more: True) -> (Data, Data)

@overload
def foo(return_more: False) -> Data

But I don't think such constructs are possible? I know it is possible in Typescript to define the types using constants, but I don't suppose Python allows for this?

EDIT: At first, when I tried the above, the typechecker said Literal[True] was not expected and I thought it was not possible. But after experimenting some, I figured out that it is actually possible. Added my solution to the OP

Thanks for the tip!

eternacht 1 points 1 year ago

This is the real answer, overloads are meant for exactly this purpose.

It'll be something like this:

from typing import Literal, overload

@overload
def foo(return_more: Literal[False] = False) -> Data: ...
@overload
def foo(return_more: Literal[True]) -> tuple[Data, Data]: ...
def foo(return_more: bool = False) -> Data | tuple[Data, Data]:
    ...
    if return_more:
        return data, more_data
    return data
[email protected] 3 points 1 year ago* (last edited 1 year ago) (2 children)

def foo(return_more: bool) -> Union[Type1, tuple[Type2,Type3]]:

[email protected] 4 points 1 year ago* (last edited 1 year ago)

Python >= 3.10 version:

def foo(return_more: bool) -> DataType | tuple[DataType, MoreDataType]: ...

But I would definitely avoid doing that if possible. I would maybe do something like this instead:

def foo(return_more: bool) -> tuple[DataType, MoreDataType | None]:
    ...
    if return_more:
        return data, more_data
    return data, None
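
From the caller's side this could look as follows. This is a sketch with placeholder `str` and `int` types standing in for DataType and MoreDataType:

```python
from typing import Optional


def foo(return_more: bool) -> tuple[str, Optional[int]]:
    # Placeholder values; the real types would be DataType and MoreDataType.
    data, more_data = "data", 42
    if return_more:
        return data, more_data
    return data, None


# Unpacking works the same way for both flag values.
data, more_data = foo(True)
data_only, nothing = foo(False)
if more_data is not None:
    doubled = more_data * 2  # the checker knows more_data is an int here
```

The tuple always has two elements, so only the second one needs a None check.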

Or if data is a dict, just update it with more_data:

from typing import Any

def foo(return_more: bool) -> dict[str, Any]:
    ...
    if return_more:
        # dict.update() mutates in place and returns None, so build a merged dict instead
        return data | more_data
    return data
[email protected] 1 points 1 year ago* (last edited 1 year ago)

You can also consider the new union syntax that was introduced with Python 3.10; check PEP 604 for details:

def foo(return_more: bool) -> Type1 | tuple[Type2,Type3]:

[email protected] 1 points 1 year ago
def foo_return_more():
    ...
    return data, more_data

def foo():
    return foo_return_more()[0]