Hitachi Vantara Pentaho Community Forums
Results 1 to 5 of 5

Thread: Parent-child hierarchy with multivalued dimension: trouble with measures

  1. #1
    Join Date
    Oct 2007
    Posts
    255

    Default Parent-child hierarchy with multivalued dimension: trouble with measures

    I'm having a few problems here.

    First off, I've created a dimension that takes on multiple values in the context of individual facts (multivalued dimension). Additionally, it's a parent-child hierarchy. There are two main problems I'm up against:

    :- The measures for the 'All' level all have [seemingly] correct values, but their children are null
    :- If I add the group label as the ordinalColumn of the parent-child hierarchy, it no longer recognizes one of the child groups as a child of one of the top level groups, and consequently that group is never accessible.

    The multivalued dimension part is constructed as follows:

    :- Fact table has a 'groupsetid' field that's been pre-populated
    :- Groupset table has multiple entries for each groupsetid, one for each group
    :- Groupset table also has a weight attribute (1/n, n=# of groups included in that groupset)
    :- Fact table used in the cube is a view created in mysql as:
    Code:
    create or replace view eventfacts_bridged as select ef.*,gs.groupkey_fk,gs.weight from eventfacts ef,groupset gs
    The parent-child hierarchy part is constructed as follows:

    :- groupdimension contains 4 values: groupid, grp_parentid, groupkey_tk, groupkey_parent_fk.
    :- there is a 1:1 correlation between groupid, groupkey_tk, and same for the parent stuff
    :- the closure table has 3 columns: the distance, groupkey_fk, groupkey_parent_fk

    The schema itself uses eventfacts_bridged for the fact table, and groupdimension as the dimension table, with a <Closure> element in the only existing level within the only hierarchy of the groupdimension, modeled as:
    Code:
    <Level name="Groups" column="groupkey_tk" nameColumn="grp_label" parentColumn="groupkey_parent_fk" nullParentValue="-1" tyep="Numeric" uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
      <Closure parentColumn="groupkey_parent_fk" childColumn="groupkey_fk">
        <Table name="groupdimension_closure" />
      </Closure>
    </Level>
    Lastly, measures are the sum of the weights that are gained from the join to the groupset table.

    Originally, the closure table was using groupid, and grp_parentid, and the level table was using groupkey_tk, but that didn't work. I didn't pick up on it at the time, but the problem was that it was trying to compare groupid from the closure table to groupkey_tk in the groupdimension table. To make sure things were working I changed the level table to use groupid and grp_parentid instead and things were working, but I ran into some problems later on that made me realize I needed to fix what the level/closure tables were using as keys now instead of putting it off.

    So, I did that, and now it's broken. To clarify, when it was working before using the groupid/grp_parentid instead of the real table keys, I could explore my cube in jrubik or jpivot just fine. Counts showed up, all was well. Unfortunately, I didn't save that schema, because I figured the only changes would be ETL stuff, and changing names of keys in the schema. Now when I browse the cube I get correct totals for the "All" level of the group dimension hierarchy, but when I expand groups I get null values for each of the individual measures. Another oddity: when i leave out the ordinalColumn attribute in the level definition, the parent-child hierarchy stuff works fine, but when I add the group label as the ordinal column the only group in my dataset that is contained in a top level group doesn't show up in the resulting query as a child of its parent (in fact, it's not in the resultset at all)

    I decided to investigate by looking at the generated sql. I copied it all out of the mysql log, did necessary search/replace to make the queries functional (what with the backticks & all), then ran them. The first gets column names and IDs, the 2nd gets counts and IDs, the rest give empty resultsets, and are all of the form:
    Code:
    select groupdimension.groupkey_tk as c0, groupdimension.grp_label as c1
      from groupdimension as groupdimension
      where groupkey_parent_fk = 9
      group by groupdimension.groupkey_tk,
            groupdimension.grp_label
      order by ISNULL(groupdimension.grp_label),
            groupdimension.grp_label ASC;
    If you've seen anything like this before, let me know. If you know how to fix it, that would be even better, but barring that, at the very least I'd like to know if I'm alone on this one or not.

    -Brian

  2. #2
    Join Date
    Oct 2007
    Posts
    255

    Default df

    As a followup, i just checked all the queries giving empty resultsets, there isn't one for "groupkey_parent_fk = 5", where 5 is the group "Test", which should have a child of 4.

    My suspicion is I did something wrong during ETL, but I'm not sure what.

    -Brian

  3. #3
    Join Date
    Oct 2007
    Posts
    255

    Default asdf

    Alright, have another update to this saga. I sort of solved the problem with things not being hierarchized properly. There's two groups named Test (I didn't realize that earlier). I changed one to Test1, the other to Test2, and now the 'Test' hierarchy does show up properly, so it looks like there may be a bug with parent-child hierarchies if the label and/or ordinal columns contain the same values, it eliminates one (I switched things back, and sure enough, only one 'Test' group shows up, but if I rename them they both show up).

    -Brian

  4. #4
    Join Date
    Oct 2007
    Posts
    255

    Default asdf

    Alright, I finally have time to work on this again, this time I have a cleaned copy of my schema to attach to this issue: cleaned.xml. I've reduced it down to some of the basic dimensions that exist in the schema.

    To sum up:

    :- Group dimension is both multivalued and a parent-child hierarchy
    :- The multivalued portion is being handled behind the scenes as a view in mysql because mondrian + mysql = no subselects (eg, no <SQL>, <Join>, etc, in mondrian for the fact table)
    :- When doing the following query I get null results:
    Code:
    select {[measures].[event count]} on columns,
    {[Group].[All Groups].children} on rows
    from [events]
    The view/fact table structure is just what you'd expect from the schema:

    serverdatekey_fk, servertimekey_fk, userkey_fk,
    groupkey_fk,groupkey_parent_fk, score, weight

    ... among other things.

    -Brian
    Last edited by Phantal; 04-04-2008 at 01:55 PM. Reason: attachment missing

  5. #5
    wgorman Guest

    Default

    I don't see anything wrong with the schema, could you provide a debug log file when running this? What version of Mondrian are you using? (pre 3.0 use turn on tracing, 3.0 or later you'll need to configure log4j appropriately) Make sure your parent child table and closure table have -1 for the root parent ID, the problem might be in the data itself vs. the schema.

    Will

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.