PDA

View Full Version : Performance problems with large dimensions



phgrenet
10-02-2005, 10:14 AM
Hello again,
I am running into performance problems with Mondrian 1.1.5 with a cube that has a couple of very large dimensions. One dimension for instance has currently 1 million rows and I expect it to be 10-20 times bigger when all the data is loaded in the database.
My query looks like this:
select { [Product_Category].[all_categories].children } on rows
from sales
where ([Customer].[GOOGLE INC])
I have set the following system properties:
mondrian.trace.level=2
mondrian.rolap.LargeDimensionThreshold=20000
mondrian.result.limit=0
Mondrian logs this:
------------------------------------------------------
SqlMemberSource.getMemberChildren: executing sql [select "CUSTOMER"."CUSTOMER_NAME" as "c0" from "CUSTOMER" "CUSTOMER" group by "CUSTOMER"."CUSTOMER_NAME" order by "CUSTOMER"."CUSTOMER_NAME"], 34 ms
SqlMemberSource.getMemberChildren: executing sql [select "PRODUCT_CATEGORY"."CATEGORY" as "c0" from "PRODUCT_CATEGORY" "PRODUCT_CATEGORY" group by "PRODUCT_CATEGORY"."CATEGORY" order by "PRODUCT_CATEGORY"."CATEGORY"], 2 ms
Segment.load: executing sql [select "CUSTOMER"."CUSTOMER_NAME" as "c0", "PRODUCT_CATEGORY"."CATEGORY" as "c1", sum("SALES"."AMOUNT") as "m0" from "CUSTOMER" "CUSTOMER", "SALES" "SALES", "PRODUCT_CATEGORY" "PRODUCT_CATEGORY" where "SALES"."CUSTOMER_ID" = "CUSTOMER"."CUSTOMER_ID" and "CUSTOMER"."CUSTOMER_NAME" = 'GOOGLE INC' and "SALES"."PRODUCT_CATEGORY_ID" = "PRODUCT_CATEGORY"."CATEGORY_ID" group by "CUSTOMER"."CUSTOMER_NAME", "PRODUCT_CATEGORY"."CATEGORY"], 366 ms
[INFO] Batch.loadAggregation 376
[INFO] loadAggregation 376
------------------------------------------------------
In the first and second SQL queries, Mondrian loads all members in the cache. I verified this by adding a counter in SqlMemberSource.getMemberChildren(RolapMember parentMember, List children, Connection jdbcConnection). This is a problem for the Customer dimension as the result set is 1 million rows. It takes about a minute to do so. In theory there is no need for Mondrian to cache all members...
How can can I prevent Mondrian from caching the whole table? Do I need to define my own member reader? Or am I missing something?
Thanks for your help.
-- Philippe Grenet

phgrenet
10-02-2005, 11:22 AM
Looks like the only method that calls MondrianProperties.getLargeDimensionThreshold() is RolapSchema.createMemberReader(final RolapHierarchy hierarchy, final String memberReaderClass). In this method, the call to SqlMemberSource.getMemberCount() is disabled and the member count is set to Integer.MAX_VALUE, forcing Mondrian to use SmartMemberReader. Could that be related to the problem?