Data Engineering


News and discussion on Data Engineering topics

founded 1 year ago
 
 
Unified Star Schema (towardsdatascience.com)
submitted 1 year ago by [email protected] to c/[email protected]
 
 

Hi all,

I was recently reading about the Unified Star Schema and the Puppini Bridge. I’m curious whether anyone here has experience with it and what their thoughts are.

TIA

 

I plan to run a few tests to determine whether Kafka is suitable for a certain use case I have in mind.

My idea is to run a local cluster of Kafka servers (either VMs or containers), produce and consume a series of messages, and observe a range of metrics (via Prometheus & Grafana) along with custom business-logic outcomes.

What are some good tools to record and visualise the internals of a Kafka cluster?

I'm looking for things like consumer lag, topic replication, possibly tracing messages, ...
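For anyone unsure what the consumer lag mentioned above measures: per partition, it is the gap between the log end offset and the offset the consumer group has committed. Below is a minimal sketch in plain Python, assuming the offsets have already been fetched by some other means. The function name `consumer_lag`, the sample topic `orders`, and the choice to treat a partition without a committed offset as fully unread are all assumptions made for illustration, not part of any Kafka client API.

```python
from typing import Dict, Optional, Tuple

# A (topic, partition) pair, e.g. ("orders", 0). Hypothetical alias for this sketch.
TopicPartition = Tuple[str, int]


def consumer_lag(
    end_offsets: Dict[TopicPartition, int],
    committed_offsets: Dict[TopicPartition, Optional[int]],
) -> Dict[TopicPartition, int]:
    """Return the lag per partition: log end offset minus committed offset."""
    lag = {}
    for tp, end in end_offsets.items():
        committed = committed_offsets.get(tp)
        # Assumption: a partition with no committed offset counts as fully unread.
        position = committed if committed is not None else 0
        # Clamp at zero in case the two offsets were sampled at different times.
        lag[tp] = max(end - position, 0)
    return lag


# Made-up offsets for a two-partition topic.
end_offsets = {("orders", 0): 120, ("orders", 1): 95}
committed = {("orders", 0): 100}

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
print(lag, total_lag)
```

A dashboard would typically plot this value per partition over time; a lag that keeps growing suggests the consumers are not keeping up with the producers.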

Originally posted on https://mastodon.social/@bahmanm/110662538718523380

 

Hi fellow data engineers,

I'm currently restructuring a pipeline written in PySpark on Databricks. It involves a lot of transformations, which results in an extensive DAG, but we can afford to spend some extra processing resources to build a standard dimensional model (on top of the necessary transformations).

I was wondering what real benefits you have seen from a star schema design compared with the “one big table” approach that I could pitch to my team. (My main goal is to end up with a smaller Power BI model.)

And as a side question, what tools do you use to create a dimensional model such as a star schema with code?

Thanks a lot!
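To make the star schema versus “one big table” comparison concrete, here is a minimal sketch of the core step: pulling repeated descriptive columns out of wide rows into a dimension with a surrogate key, and leaving only the key and the measures in the fact table. It uses plain Python rather than PySpark so that it runs without a cluster. The function `split_star`, the column names, and the sample rows are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_star(
    rows: List[Row],
    dim_cols: List[str],
    measure_cols: List[str],
    key_name: str,
) -> Tuple[List[Row], List[Row]]:
    """Split wide rows into one dimension table and one fact table."""
    surrogate_keys: Dict[tuple, int] = {}  # attribute tuple -> surrogate key
    dimension: List[Row] = []
    fact: List[Row] = []
    for row in rows:
        attrs = tuple(row[c] for c in dim_cols)
        if attrs not in surrogate_keys:
            # First time this combination is seen: assign the next key.
            surrogate_keys[attrs] = len(surrogate_keys) + 1
            dimension.append({key_name: surrogate_keys[attrs], **dict(zip(dim_cols, attrs))})
        # The fact row keeps only the surrogate key and the measures.
        fact.append({key_name: surrogate_keys[attrs], **{c: row[c] for c in measure_cols}})
    return dimension, fact


# Made-up "one big table" rows: customer attributes repeat on every sale.
wide = [
    {"customer": "Ada", "country": "UK", "amount": 10.0},
    {"customer": "Ada", "country": "UK", "amount": 5.0},
    {"customer": "Bo", "country": "SE", "amount": 7.5},
]

dim_customer, fact_sales = split_star(wide, ["customer", "country"], ["amount"], "customer_key")
print(dim_customer)
print(fact_sales)
```

The descriptive strings are stored once per customer instead of once per sale, which is where the smaller model size tends to come from when the dimension attributes are wide and the fact table is long.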

 

Thought I’d share this link. I’m not affiliated in any way.

 

Hey there community!

Does anyone have any resources they could share relating to Data Vault 2.0, specifically the joining of SAL and PIT tables? The two main books on the architecture are very sparse on this area, which I would have thought would be a fairly key component for any mid-to-large organisation.
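As a starting point for the PIT half of the question, here is a minimal sketch of how a point-in-time table is commonly built: for each hub key and snapshot date, record the latest load date in each satellite that is not after the snapshot. It does not cover the same-as link (SAL) side. The function `build_pit`, the satellite names, and the use of `None` where no satellite row exists yet (many implementations point at a ghost record instead) are assumptions for illustration.

```python
from bisect import bisect_right
from typing import Dict, List, Optional

# satellite name -> hub key -> load dates, sorted ascending (ISO strings sort correctly)
Satellites = Dict[str, Dict[str, List[str]]]


def build_pit(
    hub_keys: List[str],
    satellites: Satellites,
    snapshot_dates: List[str],
) -> List[Dict[str, Optional[str]]]:
    """Build one PIT row per (hub key, snapshot date)."""
    pit = []
    for key in hub_keys:
        for snap in snapshot_dates:
            row: Dict[str, Optional[str]] = {"hub_key": key, "snapshot_date": snap}
            for sat_name, loads_by_key in satellites.items():
                loads = loads_by_key.get(key, [])
                # Index of the first load date strictly after the snapshot.
                idx = bisect_right(loads, snap)
                # Assumption: None marks "no satellite row loaded yet".
                row[sat_name + "_load_date"] = loads[idx - 1] if idx else None
            pit.append(row)
    return pit


# Made-up satellites for a single customer hub key.
satellites = {
    "sat_name": {"C1": ["2023-01-01", "2023-03-01"]},
    "sat_address": {"C1": ["2023-02-01"]},
}

pit = build_pit(["C1"], satellites, ["2023-01-15", "2023-03-15"])
for pit_row in pit:
    print(pit_row)
```

Each PIT row can then be equi-joined to every satellite on the hub key and the recorded load date, which avoids repeating the "latest row as of" logic in each downstream query.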

 

What needs to be added for 2023?

 

Fellow data engineers, we look forward to your contributions to and participation in the community. If you want to help manage the community, get in touch to join the team.