Star Modeler Vs Schema Workbench? One or the other or both?
I'm a newb to the forums/community and slowly making my way through the Pentaho Tech Stack to build my own EDW for BI after reading Kimball's Data Warehouse Tool Kit book.
I have everything installed and working on Ubuntu/Postgres (ie: PDI, BI, Schema Workbench, etc.).
I have written lots of custom data migration code and now have all my data out of our OLTP system and into Dimension Staging tables in Postgres. Now I am at the point I need to build my Star Schema and/or cubes and have been playing with both the PDI Star Modeler along with the Schema Workbench. Unfortunately due to minimal documentation on both of these tools I am not sure how to proceed and whether I should use both of these tools or just one or the other?
So I have 3 primary questions right now, one very high level and the other two much lower level.
1) Can anyone tell me which of these tools I really should be using for they seem to overlap some? Although Star Modeler appears to have more functionality it seems like "Schema Workbench" may be the correct one I need to build the XML Schema that BI needs as opposed to Star Modeler for it appears not all functionality appears to work in Star Modeler. If I should use Schema Workbench then what exactly is the objective of the "Star Modeler"? Is it supposed to do the same (and more) as Schema Workbench yet Star Modeler is just buggy and unusable right now?
2) So what I did is I started with 'Star Modeler' and built the Dim and Fact tables for the schema I am trying to build for BI. I then ran the SQL that created the objects in my target DB. I expected that I could then click on the "Mondrian Schema" button in the Star Modeler expecting it to generate the "XML Schema File" I need for BI yet nothing happens when I click on that button. Nothing happens when I click on the "Physical Model" button either in Star Modeler. Is this a bug that these buttons do not do anythin or is the Star Modeler actually writing the XML Schema file someplace yet just not popping up a window or anything so it appears to be doing nothing?
(btw I run PDI/Star Modeler local on my Mac yet I have Postgres and BI installed running on Ubuntu)
3) So after nothing happened when pressing the "Mondrian Schema" button on Star Modeler I finally opted to scrap the PDI Star Modeler and check out the "Schema Workbench" and that is where my final question arises from.
When in Schema Workbench the source tables I used to build the dimensions were the Dimension tables that Star Modeler built that have both a "Technical Key" and a "Natural Key". So my question is: "If I am building Dimensions in Schema Workbench based on Dimension created by Star Modeler, which key would I need to use for the Foreign Key for the dimension in question in the Schema Workbench? Would I use the Technical (surrogate) key that Star Modeler created, or would I use the Natural Key?
My other thought is that maybe using Dimension tables created by Star Modeler as source tables is the wrong approach? Should I just use my Staging tables with their "Natural keys" and not use the tables the Star Modeler created with their Technical Keys at all?
Any help/suggestions would be greatly appreciated! I apologize for such a long first post . . .
Last edited by brig; 08-15-2012 at 06:29 PM.
I heard from the folks on the PDI team that yes the "Star Modeler" is not finished yet so that answers my primary question to this forum so it looks like at point in time the Schema Workbench is the only option until they finish the crucial "Mondrian Schema" functionality in the Star Modeler.
So after thinking over night on my other question I am guessing I have come to some conclusions on that too yet wanted to try to get some confirmation on my conclusion from someone savvy with the Schema Workbench.
I asked whether you should use the "Technical/Surrogate" key in Schema Workbench as opposed to the "Natural Key" yet after thinking it over it seems the only thing you can do with Schema Workbench would be to use a Technical/Surrogate Key for it does not appear to support a "Compound Key" which is what my underlying data sources have.
In my OLTP system each of our "Product Offerings" essentially uses a compound Primary Key that is assembled from 4 fields in a table so with this in mind it does not appear that Schema Workbench could handle this and thus I will really need to have the underlying tables in a Star Schema already with Technical/Surrogate keys.
Can anyone comment on this? Is my conclusion correct that Schema Workbench cannot handle a foreign Key that is a compound key so the only way to use Schema Workbench is with data that has already been put into a Star Schema where the Surrogate keys are available?
I just wanted to follow up on this old post now that I have spent the last month using Pentaho. When I posted this question I was a newbie to the DW world and OLAP Schemas and could not find any real basic tutorial that said "Start Here . . ." which would walk you through the process of defining an OLAP/Mondrian schema for usage in Pentaho BI.
After researching it for many many hours I came to the conclusion that I probably needed to use either "Schema Workbench" or the PDI "Star Modeler" so I played with both until I realized "Star Modeler" was not yet a finished product so I then concentrated on learning to use Schema Builder.
Well after a lot of trial & error, and then actual use of the Pentaho User Console (PUC), as far as I can tell you really don't need either Schema Workbench nor Star Modeler for when you create a Data Source in the PUC you have the option to configure it for analysis too which then automatically creates your Mondrian Schema file and makes it available for Pentaho or Saiku Analytics.
Anyway maybe I am wrong and missing something yet if not it would have saved me a LOT of time getting Pentaho off the ground had I known I really did not even need to mess with Schema Workbench nor Star Modeler.
Anyway so advice for any Pentaho/DW newbies who are traveling down the same road as I was, as far as I can tell just to get going you really do not need to mess with any of these standalone Schema Builder tools and all you really need to do to setup your Mondrian OLAP schema (cube) is to login to the PUC and create a new Data Source and then define the keys that link the tables together and then you are done!
I probably could have saved myself 2-3 days of work had I known all of this before hand . . .
Tags for this Thread