Category Archives: Kimball

Adding many-to-many relationships to a data warehouse

Many to many relationships frequently occur in OLTP databases.  For instance, consider a retail system which stores the details of products in one table. Each product  could belong to  one or more product types; equally, each product type could apply to more than one product.  This is often represented in an OLTP database as:


However, in a data warehouse this structure is likely to cause performance problems.

Assuming Products has it’s own dimension in the DW.  Here are some alternatives for storing the associated Product Type data:

Proposed structure Issues to condsider
Add a bitmap column to the Product dimension
  • Not easily readable (for humans)
  • Good for binary indicators
  • Good for values which are quite static
  • Very good for performance
  • Needs to be documented well so that all using it clearly understand what each bit indicates
Add an xml column to the Product dimension
  • Quite readable – good for free text values like category names
  • Need to pay attention to the performance implications
  • May need to add an XML index to resolve performance issues
  • Extends easily – it is XML after all!
Add a bridging table to a dimension
  • Does have performance implications due to the joins that will need to be made to connect the tables together
  • How many connections is the dimension likely to have to the bridging table? If most don’t have connections then the performance hit may be acceptable
  • Consider the dimension type – the fewer rows of a type I dimension will have much less of a performance hit than a type II dimension
  • Requires more ETL work than previous options
Add a bridging table to a fact table
  • Consider which type of fact table it is – has the potential to add a lot of redundant information to the DW if not relevant to reporting requirements
  • Good for capturing frequently changing values that need to be reported on
  • Requires more ETL work than previous options