View Full Version : ETL vs Reporting : more chatting

05-21-2007, 12:10 PM
More chatting with a self-acclaimed “Noob”. (not a real nickname) The soup was provided by yours truly.
I’m posting this as it might be interesting for others as well.

(16:56:19) Noob: Hey man
(16:56:26) Noob: i wanted to ask about DWH
(16:56:31) soup-nazi: k
(16:56:53) Noob: after alot of reading and such am i right in assuming that most BI solutions are static as such
(16:57:09) Noob: consdering we have to decide on DWH dimensions and fact tables before we start
(16:57:46) soup-nazi: No, you are not right, in fact…
(16:58:19) soup-nazi: You do need to decide up-front, but the fact that people change and requirements change is absolute
(16:59:11) soup-nazi: So what you need is a water-fall project model: Gather requirements - Design model - make technical design - implement ETL - make reports / testing - handover / docs
(16:59:31) soup-nazi: Each time you go through the cycle your model is updated and the ETL changes.
(16:59:50) soup-nazi: A star schema is designed to allow this to happen. It copes very well with changes to the requirements.
(17:00:27) soup-nazi: Your DWH can take a punch when you design it correctly and use technical (or surrogate) keys everywhere.
(17:01:09) soup-nazi: k?
(17:01:34) Noob: Thats interesting. so would you consider a customer as an ongoing project and if the customer required changes such as new reports you would have to add to your original development
(17:02:07) soup-nazi: Yes, absolutely.
(17:02:51) soup-nazi: Here is the unconvenient truth: users change their mind. And they should be able to change their minds.
(17:03:09) soup-nazi: Business requirements change, everything changes.
(17:03:26) soup-nazi: In some organisations things change faster than in others, but changes are happening everywhere.
(17:03:59) Noob: thats very true. have you ever come up against a customer that wanted more control to devlop thier own reports as things change?
(17:04:22) Noob: as learning the ins and outs of the Pentaho BI platform isn’t an easy job
(17:05:16) soup-nazi: Yes, that’s why we (I actually) developed Pentaho metadata and the Web Based Ad-hoc Querying and Reporting (WAQR) solution
(17:05:39) soup-nazi: That allows plain simple users to build their own reports.
(17:05:47) soup-nazi: online
(17:05:49) soup-nazi: on the web
(17:05:54) soup-nazi: available now
(17:06:00) soup-nazi: in version 1.5M3 or later
(17:06:08) Noob: actually build thier own reports? not just edit ETL transformations?
(17:06:41) soup-nazi: yes, one has nothing to do with the other.
(17:06:48) Noob: you will have to forgive me if i ask stupid questions about the pentaho products. ive been thrown in the deep end here http://www.ibridge.be/wp-includes/images/smilies/icon_smile.gif
(17:06:54) soup-nazi: ETL (Kettle) is not meant for reporting.
(17:07:46) Noob: Okay, well i have this little dedicated linux box sitting next to me and the plan is to use it to build a BI solution
(17:07:55) Noob: ive started with what i know best, Kettle
(17:08:14) Noob: and transformed my data into a DWH for some basic reporting
(17:08:39) Noob: from then on i would use the other Pentaho tools such as report designer and the .xaction editor to create the rest?
(17:09:00) soup-nazi: Yes.
(17:09:21) soup-nazi: Most of these tools start out with a SQL query that defines your data.
(17:09:45) soup-nazi: The SQL query is usually simple because it reads from a star schema DWH.
(17:10:08) soup-nazi: With Pentaho metadata you can even eleminate that SQL writing by designing your data model;
(17:10:21) soup-nazi: WAQR then writes the SQL for you based on a user-selection.
(17:10:35) Noob: ahhhhh, very nice =)
(17:10:54) Noob: WAQR you say. is there a release? maybe ive already seen it
(17:11:47) soup-nazi: Download Pentaho 1.5 milestone 3, it’s in there. (Create report from the first page after login)
(17:12:01) soup-nazi: It’s some AJAX like frontend
(17:12:12) Noob: AJAX, i like! =)
(17:12:25) soup-nazi: 1.5M3 is available from the Pentaho frontpage at the moment
(17:14:11) Noob: Okay, im downloading it just now to have a look.
(17:14:32) soup-nazi: Pentaho metadata 1.5M3 is also available for download BTW.
(17:14:40) Noob: but in all, we start with Kettle and place all our OLTP data into a DWH
(17:14:59) Noob: from there we do everything else. reporting, dashboards, charts ect ect
(17:15:17) soup-nazi: yes.
(17:15:23) Noob: all the data for these reports is taken from the DWH
(17:15:28) soup-nazi: yes
(17:15:37) Noob: okay. starting to get a better understanding
(17:15:57) soup-nazi: In fact, once you put the lowest level of data, the transactions, in the DWH, you never go back to the source system.
(17:16:19) soup-nazi: You can build what we call aggregates or analytical fact tables that are derived from the lowest level fact tables.
(17:16:50) soup-nazi: For example, if you want to know how long ago it was that each individual customer bought something, you need to do lookups.
(17:17:05) soup-nazi: Those lookups are easier in ETL than they are in reporting tools. (virtually impossible).
(17:17:12) soup-nazi: So you build an aggregate, etc…
(17:17:38) soup-nazi: You refine the data, you lift it until at the very end you get a highly top-level fact table to drive your dash-boards.
(17:18:36) Noob: wow, its alot to take in. just aswel im gonna keep a copy of this conversation. Also i appreciate your helping this *Noob* =)
(17:18:55) soup-nazi: np

More... (http://www.ibridge.be/?p=50)