clawlor

joined 2 years ago
 

Pydantic, Inc. is seeking feedback on their development roadmap. Features currently proposed:

  • Python Analytics/Observability — a logging and metrics platform with tight Python and Pydantic integration, designed to make the data flowing through your application more readily usable for both engineering and business analytics.
  • Data Gateway for object stores — adds validation, transformation, and cataloguing in front of object stores like S3, with a schema defined in Pydantic models and then validated by our Rust service.
  • Data Gateway for data warehouses — the same service as above, but integrated with your existing data warehouse.
  • Schema Catalog — for many, Pydantic already holds the highest fidelity representation of their data schemas. Our Schema Catalog will take this to the next level, serving as an organization-wide single source of truth for those schemas, tracking their changes, and integrating with our other tools and your wider platform.
  • Dashboards and UI powered by Pydantic models — a managed platform to deploy and control dashboards, auxiliary apps and internal tools where everything from UI components (like forms and tables) to database schema would be defined in Python using Pydantic models.
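
For illustration, a minimal sketch of the kind of Pydantic model all of these features would build on (the model below is just an example, not taken from the roadmap):

from datetime import date
from pydantic import BaseModel

class Customer(BaseModel):
    id: int
    name: str
    signed_up: date
    active: bool = True

# In Pydantic v2, Customer.model_json_schema() already yields a machine-readable
# JSON Schema that a catalog, gateway, or UI generator could consume.
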
[–] clawlor 4 points 2 years ago

create_form(locals()) is so elegant! Each API endpoint is almost entirely described by the method signature alone, about as DRY as can be. Very clever (the good kind!)
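
In case anyone hasn't clicked through, here's a toy stand-in for the idea (this create_form is something I made up to show the shape of the pattern, not the article's actual implementation):

def create_form(fields: dict) -> dict:
    # A real version would build widgets and validators; here we just map
    # field names to their current values.
    return dict(fields)

def signup(username: str, email: str, remember_me: bool = False):
    # Every parameter in the signature becomes a form field, so the endpoint
    # is described almost entirely by its own signature.
    return create_form(locals())

print(signup("ada", "ada@example.com"))
# {'username': 'ada', 'email': 'ada@example.com', 'remember_me': False}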

[–] clawlor 2 points 2 years ago

Conversely: "Our incompetence, intransigence, sunk cost fallacy, decision making, and hiring idiocy have prevented you from ever possibly working with us. Sorry for the convenience."

RIP Mitch.

 

Hey all! I just submitted c/Python to sub.rehab, which is currently on the front page of HN. There seems to be some manual review of submissions, but we meet the minimum requirements, so hopefully we'll show up in the search results soon.

[–] clawlor 2 points 2 years ago

Great post! I suspect that PYTHONPATH hack might be useful in reorganizing a particular repo at $JOB that contains a few different deployable packages and a common library.

The thing I like most about using Makefiles in this way is that they can provide a consistent dev experience across many repos in a team setting, as long as each repo defines the same set of make targets: make setup, make test, make build, etc. Then I don't have to care too much whether a given project is using pip vs. poetry, or pytest vs. unittest.
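
For concreteness, a rough sketch of the kind of Makefile I mean (the target names are the ones above; the PYTHONPATH line and the specific commands are just illustrative, not taken from the post):

.PHONY: setup test build

# Make the shared library importable without installing it; something along
# these lines is where a PYTHONPATH override could live (the path is made up).
export PYTHONPATH := $(CURDIR)/common

setup:  # install dependencies, whichever tool the repo uses
	pip install -r requirements.txt

test:  # run the test suite
	pytest

build:  # produce a distributable artifact
	python -m build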

[–] clawlor 3 points 2 years ago

I've been using Copilot. I don't really use the prompt feature at all, but the autocomplete is very good. I find that it tends to follow the style of the code that's already in the module I'm working on, so it's more of a speed boost than a "figure out how to do this for me" tool. It's also great at finishing comments and test cases.

[–] clawlor 2 points 2 years ago (7 children)

+1, exactly this.

As an aside, "stop the world" GC pauses can affect web server performance in interesting ways. Some web application servers have a perf profile where throughput drops off a cliff as the server approaches max memory load. This is fine, so long as you know what's happening, and can tune your auto scaling to spin up new servers before you start to hit that threshold. This likely wouldn't be a reason to not use a particular lang / server, except at the most massive scales.

[–] clawlor 5 points 2 years ago

You've got the right idea with your SQL example; that's pretty much exactly what N+1 would look like in your query logs.

This can happen when using an ORM, if you're not careful to avoid it. Many ORMs will query the database on attribute access, in a way that is not particularly obvious:


class User:
    id: int
    username: str

class Post:
    id: int

class Comment:
    id: int
    post_id: int    # FK to Post.id
    author_id: int  # FK to User.id

Given this simple Python-ish example, many ORMs will let you do something like this:


post = Post.objects.get(id=11)

for comment in post.comments:  # SELECT * FROM comment WHERE post_id=11
    author = comment.author    # uh oh! SELECT * FROM user WHERE id=<comment.author_id>, once per comment

Although comment.author looks like a simple attribute access, the ORM has to issue a DB query behind the scenes. As a dev, especially one learning a new tool, it's not particularly obvious that this is happening, unless you've got query logging set up somewhere you're likely to notice it during development.

A couple of fixes are possible here. Some ORMs provide a way to fetch the comments eagerly as part of the initial query, e.g. post = Post.objects.select_related("comments").get(id=11) instead of just post = Post.objects.get(id=11) (note the eager-loading call has to come before .get(), since .get() returns a model instance rather than a queryset). Alternatively, you could fetch the Post, then do one more query to grab all the comments. In this toy example the former would almost certainly be faster, but in a more complex case where you're JOINing across multiple tables, you might try breaking the query up in different ways if you're really trying to squeeze out the last drop of performance.
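
To make that concrete in Django terms (a sketch that assumes the classes above were real Django models, with Comment.post declared as ForeignKey(Post, related_name="comments") and Comment.author as ForeignKey(User)):

from django.db.models import Prefetch

# Fix 1: JOIN the author in via the forward FK, so comment.author inside
# the loop never triggers its own query.
post = Post.objects.get(id=11)
for comment in post.comments.all().select_related("author"):
    author = comment.author  # already loaded, no extra query

# Fix 2: load the post, its comments, and their authors up front,
# two queries total no matter how many comments there are.
post = Post.objects.prefetch_related(
    Prefetch("comments", queryset=Comment.objects.select_related("author"))
).get(id=11)

Either way you end up with a fixed number of queries instead of one per comment.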

In general, DB query planners are very good at retrieving data efficiently, given a reasonable query and appropriate indexes.