PDA

View Full Version : ETL required or not for dashboard?



aronen
11-17-2006, 05:45 PM
I imagine it's not but I just want to confirm.

Is it a requirement that the application data be extracted/transformed/loaded from the application's database to a separate "data mart" database in order to be able to utilize a dashboard? Because of a Pentaho specific database schema requirements for utilizing a dashboard for example. Or is the ETL component and "data mart" offered simply as an option to offload the application database from load generated by the dashboard? Where can I find more information about this ETL and "data mart" mechanism?

I see in the examples that action sequences defined in XML files can contain arbitrary SQL leading me to believe I can dip into any database schema.

What if I have more complex logic involving multiple queries and possibly some logic. Is there a mechanism to link the action sequence in the XML file to a Java module that produces the data?

mbatchelor
11-17-2006, 08:41 PM
Hi there, and welcome to Pentaho!



Is it a requirement that the application data be extracted/transformed/loaded from the application's database to a separate "data mart" database in order to be able to utilize a dashboard?


Short answer: No. We don't require a data warehouse or a data mart for reporting. But, we strongly advice it.

Long answer: Absolutely not required, but definitely recommended. The primary purpose of a transactional system is just that - for high-performance transactions (fast updates). This is directly in opposition to reporting where the primary goal is to answer questions and provide guidance on improving process rapidly. Databases optimized for quick updates are completely inappropriate for reporting. If you want further detail on this point, just ask. Suffice it to say that reporting and analysis of data should be done against a database optimized for reporting and analysis, not one optimized for transactions.





What if I have more complex logic involving multiple queries and possibly some logic. Is there a mechanism to link the action sequence in the XML file to a Java module that produces the data?


Answer: Yes. The Pentaho platform has a very easy way to inject your own handling for data (or anything else). You can create your own Component that you reference in the action sequence document that does a bunch of stuff, or you could define a Kettle transformation, and use that kettle transformation as a live datasource - we define that as in-line ETL (extract, transform, load).

I hope that helps,

Marc