this post was submitted on 08 Aug 2023

Data Engineering

A community for discussion about data engineering

This article helped define the “data engineer” role, so I’d say it belongs here!

Although some time has passed, I find it very relevant: SQL is used more than ever, graphical ETL tools that don’t output code are rare, and vendors are still trying to convince executives to entrust all their data to proprietary data warehouses.

The author, Maxime Beauchemin, also created Airflow and Superset, so they have some experience worth listening to.

[–] ndotb 2 points 1 year ago (2 children)

Man, SSIS really stunk. You'd end up having to write your own components anyways and had the extra layer of making them look like pricey RAD toolkit bits to satisfy empty suits. And then you'd have to write SSIS packages that wrote SSIS packages to deal with fluid schemas from multiple teams deploying all of the time.

[–] jim 2 points 1 year ago (1 children)

I've said this before to other people, but over time, those tools eventually became what Airflow and other orchestration tools are: defining DAGs and running scripts.

When I was using SSIS, eventually, every task was a C# or PowerShell executor instead of using the built-in functionality. So glad for Airflow and other modern tools today.
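To make "defining DAGs and running scripts" concrete, here is a rough sketch using only the Python standard library. The step names are made up, and `run_step` is a hypothetical stand-in for launching a real script; this is not how Airflow or SSIS work internally, just the bare idea of ordering steps by their dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a step, each value is the set of steps it depends on.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_step(name, log):
    # Stand-in for launching a script (for example via subprocess);
    # here it only records that the step ran.
    log.append(name)


def run_dag(dag):
    # Run every step after all of its dependencies have run.
    log = []
    for step in TopologicalSorter(dag).static_order():
        run_step(step, log)
    return log


order = run_dag(dag)
print(order)
```

A real orchestrator adds scheduling, retries, and logging on top, but the core loop is roughly this.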

[–] Reader9 1 points 1 year ago

> those tools eventually became what Airflow and other orchestration tools are: defining DAGs and running scripts

Definitely. It is much more pleasant to work with better tools for the same functionality.

Airflow got a lot of things right. For example, in Luigi a runnable “task” is a Python class whose methods get executed implicitly by the framework, whereas in Airflow tasks can be made from plain functions (for example with the @task decorator) that get called in a more straightforward, imperative manner. This makes DAGs much easier to read and write in Airflow.
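A library-free sketch of the two styles, to show the readability difference. Neither half uses the real Luigi or Airflow API: the `Task` base class, `build`, and the task names are hypothetical, only loosely modeled on Luigi's requires/run pattern and on function-style Airflow tasks.

```python
# Style 1: class-based tasks. Dependencies are declared in requires(),
# and a separate runner decides when run() gets called.
class Task:
    def requires(self):
        return []

    def run(self, inputs):
        raise NotImplementedError


class Extract(Task):
    def run(self, inputs):
        return [1, 2, 3]


class Double(Task):
    def requires(self):
        return [Extract()]

    def run(self, inputs):
        (rows,) = inputs
        return [r * 2 for r in rows]


def build(task):
    # Toy runner: resolves dependencies recursively, then calls run().
    # Nothing inside Double ever calls Extract directly.
    inputs = [build(dep) for dep in task.requires()]
    return task.run(inputs)


# Style 2: function-based tasks. The data flow is visible in the call itself.
def extract():
    return [1, 2, 3]


def double(rows):
    return [r * 2 for r in rows]


class_result = build(Double())
function_result = double(extract())
print(class_result, function_result)
```

Both produce the same result, but in the second style the dependency between the two steps can be read straight off the line `double(extract())`, while in the first it is spread across `requires()` and the runner.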