Hitachi Vantara Pentaho Community Forums
Results 1 to 5 of 5

Thread: Data Warehouse design

  1. #1

    Default Data Warehouse design


    I am trying to design a DWH for a reporting system and would like to ask the profis about their opinion concerning the best design.

    It is more complex but to keep it simple, suppose the following scenario:

    • Each customer has “group of cars”
    • There are “set of To-dos” to be checked with each customer, and
    • There might be “calls” related to each customer

    In our DB, the data for each subject (Customer – Cars – To-dos – Calls) are stored in a Table. The Customer table has a one-to-many relationship to (cars – To-dos - Calls) tables, as shown in the pic

    Name:  Db.PNG
Views: 78
Size:  11.8 KB

    I was thinking for the reporting system to set the following dimensions and fact tables:

    • DIM_Customer
    • DIM_Car
    • DIM_To-do
    • DIM_Call
    • Fact_ALL

    As I mentioned it is more complex. Anyway, I am stuck now since it is not clear how to aggregate my fact table --> since there are a group of Cars AND a group of To-dos AND multiple Calls

    DIM_Customer_ID DIM_Car_ID DIM_To-do_ID DIM_Call_ID Cars_Count To-dos_Count Calls_Count

    • Q1: Using a bridge table between each (car, to-do, call) dimensions and the fact table --> its grain is „per customer per group-of-cars per group-of-To-dos per group of calls, right? Is there any other easier or better solution? How to solve such a cardinality?

    • Q2: in the Db each Customer has a unique ID --> is it possible to use this customer_ID to connect the dimension tables and the fact table, as shown in the pic? Or this is not recommended since the dim and fact tables are now not (PK-FK) connected (customer_ID is only as a PK in the DIM_customer but as FK in the other DIMs) … I am just trying to make it as simple as possible

    • Q3: Is there any better or more positional DWH design

    This is my first DWH design, so excuse me if there are any Silly ideas

    Thank you
    Attached Images Attached Images  
    Last edited by PenBI; 02-27-2017 at 11:39 AM.

  2. #2
    Join Date
    May 2014


    Can you give us an example of a report? What would you want to see as one row on a report, what is the granularity, how should the data be grouped? This would influence the data mart design.

  3. #3


    yes I can, which force me to throw my next question

    an example can:
    - all users-name along with a list of all-cars Num that belongs to them and the list of the to-dos that has been check with him --> this is a general one
    - number and name of the cars that has been used by a user that has a certain type (can be also with the calls text)

    - here comes my second question:

    PLZ consider that the user is valid only for on time --> user x has a contact_date (01.01.2016) --> user x can NOT has any future contact dates or even info and if so it is a new user that has nothing to do with the user x

    The table Customer has a LOT of time/date stamps (tel_date, contract_date, endtel_date, meeting_date, ...etc).
    Moreover, the table car has also the date stamps of each car used by the customer has time/date stamps also (not_date, go_date, come_date, ...etc)

    a report might be: for a user x --> number and name of all used cars along with each time stamp & some time stamps differences or,
    between hour 13 and 14 a list of all used cars

    here i might/must have a Date dimension, right? but i was wandering if that is really necessary since all the needed time-stamps are already aggregated in either the Customer or the Car table

    hope there are no silly ideas here

  4. #4
    Join Date
    May 2014


    In your first diagram it looks like a "To-do" is for a customer, not a specific car, is this correct? To me, from that model, it looks like the Customer is a dimension and the rest are three fact tables.

  5. #5
    Join Date
    Jan 2015


    I've answered your question on StackOverflow, though it would have been helpful to post all the additional info you gave here over there as well.

    From your description here, it seems you have a pool of cars that can be assigned (rented) to customers, but only once to each customer (is your business a funeral parlor?)

    It seems a bad fit for a dimensional model, to be honest, with very little repeated data except possibly the cars. I would go with a close approximation of the source data model, but put user-friendly labels on the fields and values.
    Last edited by Isha Lamboo; 03-01-2017 at 08:16 AM.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.