# Thread: Dimensional modeling problem - multivalued dimensions and avg aggregation function

1. Junior Member
Join Date
Dec 2016
Posts
19
Hello underscores,

thank you for your reply! That for sure does overcome my knowledge too!

I've already thought about the virtual cubes. It indeed solves the problem of looking only for the grades per student, or look for the grades when considering the hobbies. But I guess the main problem is, when viewing the grades per student per hobby, if a select only the grades associated with two or more hobbies, it will inevitably count the same grade more than once, if the student has more than one of the hobbies required as the filter.

I'll keep searching for a solution, of course, but I suspect, as you said, that maybe it can't be done using Mondrian queries and the Saiku Plugin.

Best regards,
Bruno

2. Junior Member
Join Date
Feb 2017
Posts
18
Originally Posted by amaralbf
Hello underscores,

thank you for your reply! That for sure does overcome my knowledge too!

I've already thought about the virtual cubes. It indeed solves the problem of looking only for the grades per student, or look for the grades when considering the hobbies. But I guess the main problem is, when viewing the grades per student per hobby, if a select only the grades associated with two or more hobbies, it will inevitably count the same grade more than once, if the student has more than one of the hobbies required as the filter.

I'll keep searching for a solution, of course, but I suspect, as you said, that maybe it can't be done using Mondrian queries and the Saiku Plugin.

Best regards,
Bruno
Hello Bruno,

I'm really sorry that I could not properly help you. A Virtual Cube is something very simple. You would create two measurements one for the Students and Hobbies and one for the Students alone. You could than use these measures in your virtual cube. Right now I do an internship to create a simple OLAP Solution based on Mondrian for a small company as part of a mandatory university internship and suffer from similar problems as you do. However in my opinion to receive a proper result with Mondrian you really have to ensure the 1:n relationship or use a Closure Table. However the last solution requires you to create a transitive, reflexive closure. Another way would be to denormalize the student-hobby information and create a hobby group for each Student. However that would not satisfy the requirement to see what hobbies have the best average grades (a somewhat funny thought however to relate hobbies to grades. but again here a calculation over student and hobbies in two cubes might work). In my current work I had 3 processes hidden in a single source database. However the source database design used a "fake-key" to create an entry for every process. So while non of these 3 processes had any actual relationship, different number of rows their "keys" were all collected in a single "center-table". During my work I found that these processes create different value-IDs and therefor need different dimensions and have therefor a different granularity. I created 3 cubes with their respective measures and combined them in a virtual cube over their shared dimension (customer, machine, date). This however created large problems when trying to drill-down to the source data. It is simply impossible. As I found out by now this is due to the indexing that got handled incorrectly. You see I struggle a bit too (however I'm also a bad student and probably mentally very plain). You might be able to achieve your goal with one of these tricks however. You should remember: If the result you see is correct, no one cares what you did to achieve that. And in most companies you will be one of the very few to understand OLAP, Big Data to the extend you already have, believe me.
I really hope someone else will comment here and give you better help and that you achieve your goal.

With kind regards,

underscores

3. Junior Member
Join Date
Feb 2017
Posts
18
I wanted to add something to the virtual cube.
Cube StudentxHobby:
dimension:
Top -> Student -> Hobby
Cube: Hobby:
Dimension:
Top -> Hobby

Virtual Cube: Student And Hobby
Virtual Measure: Average Grade By Hobby
Virtual Measure: Average Grade By Student
Dimensions: Shared Dimensions (Time/Date, etc.)
Dimensions: StudentxHobby
Dimensions: Hobby
Than you can select the Dimensions you need with the Measure you need in Saiku and get both results properly. That would solve all your problems through a backdoor easily and wont reduce effiency too much. If this is an assignment you might just justify your approach like this:
Mondrian does not allow muilti-valued dimensions that are not reflexive. So you decided to calculate two measures for each to get legal results and combine them through a Virtual Cube which is a common way to combine different granularity, detail grade, dimensionality of values. Since both are different busines measures with different goals (a hobby does not study but someone with a hobby might study).

4. Junior Member
Join Date
Dec 2016
Posts
19
Hi underscores,

This approach is valid for getting the correct results when selecting by student as well as when selecting by hobby (and combining them with other dimensions). This is the result I already have, using the approach described in the posts #4 and #5 of this thread.

Said that, I don't see how this approach can solve the problem of calculating the following query: average of grades per course, filtering by 2 of 3 hobbies. It is the same query I mention on the post #5.

Thanks and best regards,

Bruno

5. Junior Member
Join Date
Dec 2016
Posts
19
Just an update, I've discussed the problem with Diethard Steiner, and this discussion led to the elaboration of a very good and informative post on his blog. The URL to the post is: https://diethardsteiner.github.io/mo...Attribute.html

I recommend checking his blog out, as I've been following his blog for a long time and have found lots of useful information regarding Business Intelligence and the use of Mondrian and Pentaho Community.

Best regards,
Bruno