View Full Version : B.View with columns from multiple B.Tables - how is it joined?

12-22-2008, 10:34 AM

Could somebody please point me to the documentation that explains how a Business View consisting of columns from several Business Tables is handled by Pentaho and specifically by WAQR? The docs reads that you can put columns from different tables into a view but it doesn't explain how the tables are joined at runtime to produce the view.

To be specific, I've 3 tables similar to those below:

CarPart(partSupplierId, partSupplierFactoryId, ...).
PartSupplier(id, name).
PartSupplierFactory(id, partSupplierId, name; Composed primary key(id, partSupplierId)) .

I want to create a View for users to create reports about CarParts and I want it to have the attributes partSupplierName and partSupplierFactoryName instead of those IDs. The question is how to achieve that?

I've defined the relationships CarPart(partSupplierId) -> PartSupplier(id), CarPart(partSupplierId, partSupplierFactoryId) -> PartSupplierFactory(partSupplierId, id) and PartSupplierFactory(partSupplierId) -> PartSupplier(id). Notice that PartSupplierFactory.id isn't unique - only (partSupplierId, id) is unique. I've also created a CarPartView with columns from those 3 tables (PartSupplier.name as partSupplierName, PartSupplierFactory.name as partSupplierFactoryName, ...). Is this enough? When creating a report, will WAQR/PMD generate the proper SQL, something like this:

SELECT supplier.name as partSupplierName, factory.name as partSupplierFactoryName, ...
FROM CarPart part JOIN PartSupplier supplier ON part.partSupplierId=supplier.id
JOIN PartSupplierFactory factory ON (part.partSupplierId = factory.partSupplierId AND part.partSupplierFactoryId = factory.id)

Thank you!

12-22-2008, 05:55 PM
Yes, PMD knows which tables are involved in your selection and will include all required tables in the generated SQL.
We know on the business column level to which business table (physical table+alias) it belongs to and as such we can do this, even for recursive expressions. (business columns used in calculations etc)

12-23-2008, 06:35 AM
Thanks Matt for your reply. I know what I need but still I'm curious and would like to learn more about the way PMD derives the final MQL/SQL from a view, is it described somewhere (aside of the code :) )? E.g. what would happen if I didn't specify the relationships or only some of them, i.e. what relationships must be specified for PMD to derive the correct SQL?

Thank you and wish you Merry Christmas.

12-23-2008, 03:08 PM
It will be easier if you look at it from the other way around. You are exposing the users to a number of business columns with names, descriptions, etc.
It's your task as a metadata designer to make sure that whatever any user drags onto a report generates valid SQL. You do this by linking business tables to each other in a business view in the Pentaho Metadata Editor.
You are free to do with that join condition what you want to obtain that goal. INNER, LEFT-,RIGHT,FULL-OUTER JOIN, all is OK.

My only advice to all you designers out there is indeed to keep it as simple as possible to obtain the goal stated above: make sure that the end user can't make a mistake.
That means: stay away from loops in a single business view. Rather, create multiple simplistic business views in the same business model.


12-24-2008, 05:02 AM
If you dont specify the relationships, but allow the end user to select values from 2 unrelated tables, then you'll get an error in the log saying that metadata can't find a path between those tables. Although this is presented as a blank page to the end user.

You should ensure only one relationship is defined, and you dont have any circular routes in your metadata model as Matt says. This can cause craziness :)

If you do that then it all works pretty well. ( as well as the usual stuff like hiding implementation details from the end user, and making sure you use business not technical names for the fields.