this post was submitted on 06 Mar 2025
25 points (90.3% liked)
Programming
18679 readers
116 users here now
Welcome to the main community in programming.dev! Feel free to post anything relating to programming here!
Cross posting is strongly encouraged in the instance. If you feel your post or another person's post makes sense in another community cross post into it.
Hope you enjoy the instance!
Rules
Rules
- Follow the programming.dev instance rules
- Keep content related to programming in some way
- If you're posting long videos try to add in some form of tldr for those who don't want to watch videos
Wormhole
Follow the wormhole through a path of communities [email protected]
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
It all depends on how it's represented on disk though and how the query is executed. Sqlite only supports numbers and strings, and if you keep using a
VARCHAR
, a read of those rows are going to have materialize a string into memory inside the sqlite library. DuckDB has more types, but if you're using varchars everywhere, something has to read that string into memory unless you can push down logic into a query that doesn't actually have to read the actual value, such as one that can use indices.The best way is to change the representation on disk, such as converting low-cardinality columns like the
station
into a numeric id. A standardint
being four bytes is a lot more efficient than an n-byte string + a header and it can be compared by value.This is where file formats, like Parquet, shine. They're oriented more towards parsing by systems. JSON is geared towards human parsing.