View Full Version : Thin Model for Metadata discussion

04-22-2009, 06:02 PM
Handling comments on this model in email is impractical, so I'm moving the discussion to the forums. The following refers to the first-stab diagram representing the impending thin model for the metadata system, located here: http://wiki.pentaho.com/download/attachments/12387542/thinModel.png

Here I am responding to feedback from James...

> View:
> Do we need to link to the LogicalSchema? Or can we just derive it from the LogicalColumns?
Agreed. I like the idea of derived relationships to eliminate redundancy.

> LogicalRelationship:
> This needs to allow multiple column-column mappings.
The way it is depicted, a LogicalSchema can have many LogicalRelationships (each LR maps one column to one column). Are you saying a single LR must handle more than just one fromColumn and one toColumn?
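To make the question concrete, here is a rough sketch of what a single LR with several column pairs might look like, for example a join on a composite key. The class and method names (ColumnMapping, addMapping, toJoinCondition) are hypothetical and not taken from the diagram or any existing code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: names are illustrative, not the actual metadata API.
final class LogicalColumn {
    final String table;
    final String name;

    LogicalColumn(String table, String name) {
        this.table = table;
        this.name = name;
    }
}

// One fromColumn/toColumn pair within a relationship.
final class ColumnMapping {
    final LogicalColumn from;
    final LogicalColumn to;

    ColumnMapping(LogicalColumn from, LogicalColumn to) {
        this.from = from;
        this.to = to;
    }
}

// A relationship that can hold several column pairs (e.g. a composite-key join).
final class LogicalRelationship {
    private final List<ColumnMapping> mappings = new ArrayList<>();

    LogicalRelationship addMapping(LogicalColumn from, LogicalColumn to) {
        mappings.add(new ColumnMapping(from, to));
        return this;
    }

    List<ColumnMapping> getMappings() {
        return Collections.unmodifiableList(mappings);
    }

    // Renders the pairs as "a.x = b.x AND a.y = b.y".
    String toJoinCondition() {
        StringBuilder sb = new StringBuilder();
        for (ColumnMapping m : mappings) {
            if (sb.length() > 0) {
                sb.append(" AND ");
            }
            sb.append(m.from.table).append('.').append(m.from.name)
              .append(" = ")
              .append(m.to.table).append('.').append(m.to.name);
        }
        return sb.toString();
    }
}
```

With this shape, the one-column-to-one-column case is just a relationship with a single mapping.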

> Query Model:
> We need to include axes so that the generic model supports both relational and OLAP, or have two query models.
Acknowledged, we can make that decision later. I'll note it in the drawing.
> We need to add parameters to the query model
You mean so we can persist the query and insert values at runtime? How do you suppose this would manifest in the query model?
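If that is the intent, one possible shape (purely a sketch, nothing here is decided) would be for the query to declare named parameters with defaults and have the caller supply values at execution time. ParameterizedQuery, declare and resolve are made-up names for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in the metadata project.
final class QueryParameter {
    final String name;
    final Object defaultValue;

    QueryParameter(String name, Object defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }
}

final class ParameterizedQuery {
    // Declared parameters, kept in declaration order so they persist predictably.
    private final Map<String, QueryParameter> parameters = new LinkedHashMap<>();

    ParameterizedQuery declare(String name, Object defaultValue) {
        parameters.put(name, new QueryParameter(name, defaultValue));
        return this;
    }

    // Starts from the declared defaults, then overlays the runtime values.
    // Values for undeclared parameter names are rejected.
    Map<String, Object> resolve(Map<String, Object> runtimeValues) {
        Map<String, Object> resolved = new LinkedHashMap<>();
        for (QueryParameter p : parameters.values()) {
            resolved.put(p.name, p.defaultValue);
        }
        for (Map.Entry<String, Object> e : runtimeValues.entrySet()) {
            if (!parameters.containsKey(e.getKey())) {
                throw new IllegalArgumentException("Unknown parameter: " + e.getKey());
            }
            resolved.put(e.getKey(), e.getValue());
        }
        return resolved;
    }
}
```

The persisted query would then carry only the declarations, and constraints would refer to parameters by name.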

> Can we add LogicalHierarchies, LogicalCubes? Should the hierarchies be in the Logical layer, and the cubes in the View layer, or both in the Logical layer?
Good questions. Do you have suggestions? Do you think we should attempt to model OLAP in this first pass?

A version 2 is posted with the modifications that I can make right away: http://wiki.pentaho.com/download/attachments/12387542/thinModelv2.png


04-22-2009, 06:15 PM
(responding to comments from Will)

> I think the Domain object needs direct references to the Physical and
> Logical schemas, the Domain object replaces the SchemaMeta in the
> original model.
So you are not in favor of deriving the relationships? I'm sure we could have a way of easily determining the Logical and Physical schemas by traversing the View. In any case this is more or less a trivial change whichever way we decide to go.
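As a rough illustration of the derived approach, the traversal could look something like the sketch below. It assumes each LogicalColumn knows its LogicalSchema and each LogicalSchema knows its PhysicalSchema; those links and all method names are my assumptions, not something fixed in the diagram.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical classes for illustration; the links between them are assumptions.
final class PhysicalSchema {
    final String name;

    PhysicalSchema(String name) {
        this.name = name;
    }
}

final class LogicalSchema {
    final String name;
    final PhysicalSchema physicalSchema;

    LogicalSchema(String name, PhysicalSchema physicalSchema) {
        this.name = name;
        this.physicalSchema = physicalSchema;
    }
}

final class LogicalColumn {
    final String name;
    final LogicalSchema schema;

    LogicalColumn(String name, LogicalSchema schema) {
        this.name = name;
        this.schema = schema;
    }
}

final class View {
    private final List<LogicalColumn> columns = new ArrayList<>();

    View addColumn(LogicalColumn column) {
        columns.add(column);
        return this;
    }

    // Collects each distinct LogicalSchema reachable from the view's columns.
    Set<LogicalSchema> deriveLogicalSchemas() {
        Set<LogicalSchema> schemas = new LinkedHashSet<>();
        for (LogicalColumn c : columns) {
            schemas.add(c.schema);
        }
        return schemas;
    }

    // Follows the derived logical schemas one step further to the physical layer.
    Set<PhysicalSchema> derivePhysicalSchemas() {
        Set<PhysicalSchema> schemas = new LinkedHashSet<>();
        for (LogicalSchema s : deriveLogicalSchemas()) {
            schemas.add(s.physicalSchema);
        }
        return schemas;
    }
}
```

The trade-off is that a view with no columns yields no schemas, which is one argument for the direct references Will describes.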

> Also, I don't think the View Column is necessary at this point. The
> current model doesn't have that, we can just link Categories to Logical
> columns. If it does exist, it shouldn't have a name or data type
> property.
Ok, we did discuss the ViewColumn the other day as being a wanted feature. James, are you ok with leaving it off for now? I can't imagine it would be required for this sprint at least.

> Also, View should be renamed to Category. I'm not sure if we want to
> support nested categories at this time.
What's the story on the term Category? To me "view" is much more intuitive and has a counterpart in db-speak. Is there a function of the view/category that I am missing that makes "category" a more appropriate term?

> In the original metadata model, Business Models had a list of
> categories, and our GUI's are designed to show the business model and
> then categories to end users (waqr and mqleditor). I'll chat with
> Brett / Jake about the end user experience a bit, that will drive how we
> model the view.
Ok, yep, let's talk with the UI guys. What does "business model" really mean? There is a "domain", which is a set of views/categories. Not sure where business model fits in?

04-23-2009, 11:49 AM
We (the pentahoanalysistool project team) are working on a lightweight Query Model for OLAP cubes, since we need to connect our GWT application to the Java (olap4j) backend. All of the following data/methods are serializable and used in RPC calls in GWT.

The data we get from the backend is being structured in:
- Rowheaders [][]
- Columnheaders [][]
- CellData [][]
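A minimal sketch of how such a result could be held in a serializable object is below. This is not PAT's actual class: the name QueryResult, the String cell type, and the indexing (each header array indexed by position first, cells by row then column) are all assumptions for illustration, and GWT-specific serialization requirements are ignored.

```java
import java.io.Serializable;

// Hypothetical container for the three arrays; not PAT's real class.
final class QueryResult implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String[][] rowHeaders;    // assumed: [row][header level]
    private final String[][] columnHeaders; // assumed: [column][header level]
    private final String[][] cellData;      // assumed: [row][column]

    QueryResult(String[][] rowHeaders, String[][] columnHeaders, String[][] cellData) {
        this.rowHeaders = rowHeaders;
        this.columnHeaders = columnHeaders;
        this.cellData = cellData;
    }

    int rowCount() {
        return cellData.length;
    }

    int columnCount() {
        return cellData.length == 0 ? 0 : cellData[0].length;
    }

    String cell(int row, int column) {
        return cellData[row][column];
    }

    // Returns a copy so callers cannot modify the stored headers.
    String[] rowHeader(int row) {
        return rowHeaders[row].clone();
    }

    String[] columnHeader(int column) {
        return columnHeaders[column].clone();
    }
}
```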

We also have a discovery service that can:
- get List of available Cubes
- get List of available Dimensions in a Cube on a specific Axis
- get TreeList of available Members of a specific Dimension

In order to build the query we support the following methods right now (in chronological order of usage):
- Create new Query
- Move a dimension to a different axis (Parameter: Axis, Dimension )
- Create Selection (Parameter: Dimension, List<String> of Members)
- Execute Query (returns data in the structure described above)
- Clear Selection (clear query, ready for new one)
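The sequence above could be sketched roughly as follows. This is an in-memory illustration only, with hypothetical names (OlapQuerySketch, Axis, dimensionsOn, selectionFor); it is not the PAT API, and the execute step is left out because it needs the olap4j backend.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical axis names for illustration.
enum Axis { ROWS, COLUMNS, FILTER }

// In-memory sketch of the query-building steps; not the real PAT service.
final class OlapQuerySketch {
    private final Map<String, Axis> axisByDimension = new LinkedHashMap<>();
    private final Map<String, List<String>> selectionByDimension = new LinkedHashMap<>();

    // "Move a dimension to a different axis": a dimension sits on one axis at a time.
    void moveDimension(Axis axis, String dimension) {
        axisByDimension.put(dimension, axis);
    }

    // "Create Selection": remembers the chosen members for a dimension.
    void createSelection(String dimension, List<String> members) {
        selectionByDimension.put(dimension, new ArrayList<>(members));
    }

    List<String> dimensionsOn(Axis axis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Axis> e : axisByDimension.entrySet()) {
            if (e.getValue() == axis) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    List<String> selectionFor(String dimension) {
        List<String> members = selectionByDimension.get(dimension);
        return members == null ? new ArrayList<String>() : new ArrayList<>(members);
    }

    // "Clear Selection": resets the query so a new one can be built.
    void clear() {
        axisByDimension.clear();
        selectionByDimension.clear();
    }
}
```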

At the moment we only support RPC as an interface, but we're planning on supporting JSON/WebService at some point.

I have no idea if my explanation of our (olap4j) Query Model actually makes sense in this context, but I do think that there could be synergies between your project and PAT.


P.S.: PAT = Pentaho Analysis Tool (http://code.google.com/p/pentahoanalysistool/)

04-23-2009, 01:48 PM
Hi Paul,

The OLAP Query model and OLAP support for Data Access is on our backlog, after we deliver SQL Queries and CSV / flat file metadata support, so we're definitely thinking about those scenarios. The query model you guys are working with on PAT is what we hope to use.