I have a fact table: FactVoyage at a "Voyage" level of granularity. Each "Voyage" involves visits to multiple "Voyage Ports" so the FactVoyagePort table will be at Port level of granularity. Voyage to VoyagePort = 1 to many.
I'm trying to use the Dimensional Model for datawarehouse, so my understanding is that each FactTable should connect to many Dimension tables but not to other Fact tables as this would effectively become a relational model. However many of the dimensions we want to include in FactVoyagePort table are the same as those in FactVoyage, so I'm tempted to join FactVoyagePort to FactVoyage using VoyageID column in both tables, and depend on the dimensions in FactVoyage instead of repeating them in FactVoyagePort . This means in order to create the cube for VoyagePorts I will need to include both Fact tables in the DSV.
If my understanding of Dimensional model is correct I might regret this later on, but if I fully denormalize here I'm concerned about errors arising from generating the same data twice at the ETL level.
Any comments ?
Thanks
Richard
Not sure where the concern about generating data twice comes in. I would recommend against joining the facts. However, in the ETL, you might use one fact to retrieve the dimensions for the other fact, since you will have already performed any necessary lookups.|||Thanks! I'll try using one fact to populate the other fact at ETL stage as you suggested.
Richard
No comments:
Post a Comment