Hitachi Vantara Pentaho Community Forums
Results 1 to 5 of 5

Thread: Result limit on SQL statements

  1. #1
    Join Date
    Feb 2013

    Default Result limit on SQL statements

    I've been having performance issues with an MDX query and after some analysis I believe this is due to the way SQL statemets are generated, in particular, to the number of rows they return.
    No matter the size of the set I run the query on, the SQL queries always return around 1300 rows.
    How does Mondrian determine this limit? Is there any way to change it?
    Last edited by exeba; 02-08-2013 at 05:28 AM.

  2. #2
    Join Date
    Feb 2013


    I'd like to add one more detail:
    In mondrian 3.3 this issue is not present: The data for the query is gathered through much bigger queries (one or two), with a positive effect on the execution time.

  3. #3


    There is a property mondrian.result.limit that will cause an exception to be thrown if the number of rows returned from a sql query exceeds the limit. I don't think that's what you are looking for though. Can you post an MDX query and the resulting SQL queries?

  4. #4
    Join Date
    Feb 2013


    Thanks for your interest!
    Unfortunately mondrian.result.limit is not the issue.
    I'll try to describe more precisely the problem, this is the simplest example I could came up with:


    NON EMPTY CrossJoin(
    {[Measures].[Temperatura Massima]}) ON COLUMNS,
    NON EMPTY CrossJoin([Filiali].[Filiale].Members, {[Dati].[Temperature].[Mandata].[Murale Latticini 1, Modulo 2]}) ON ROWS
    FROM [Cube]

    The columns axis contains the time dimension, in particular, the first X days of January 2012, hour by hour.
    The rows axis contains a particular variable ([Dati].[Temperature].[Mandata].[Murale Latticini 1, Modulo 2]) measured in several branches ([Filiali].[Filiale].Members)

    When I run the MDX query, a series of SQL queries {QUERY(i,j)} is generated, and they look like this:

    QUERY(i,j) = select `variables`.`id` as `c0`, `branches`.`ID` as `c1`, `datetimes`.`id` as `c2`, max(`measures_temperature`.`value`) as `m0` from `variables` as `variables`, `measures_temperature` as `measures_temperature`, `branches` as `branches`, `datetimes` as `datetimes` where `measures_temperature`.`variable_id` = `variables`.`id` and `variables`.`id` = 1411 and `measures_temperature`.`branch_id` = `branches`.`id` and `branches`.`ID` in ($BRANCHES_SUBSET_i) and `measures_temperature`.`datetime_id` = `datetimes`.`id` and `datetimes`.`id` in ($DATETIMES_SUBSET_j) group by `variables`.`id`, `branches`.`ID`, `datetimes`.`id`

    $BRANCHES_SUBSET_i and $DATETIMES_SUBSET_j are partitioning of [Filiali] and [Periodo] dimensions, respectively.
    As I increase the number of days included, the number of sql queries grows, each one of them roughly returns around 5k rows (checked by manyally executing the sql statements)

    If I had to guess, I'd say that this feature has been included in order to allow Mondrian to process the results without having to wait for the execution of a big query.
    However, the way queries are split up is far from optimal in my case.

  5. #5
    Join Date
    Jul 2012


    try the following query:

    SET xj AS NonEmptyCrossJoin({[Dati].[Temperature].[Mandata].[Murale Latticini 1, Modulo 2]}
    , NonEmptyCrossJoin([Filiali].[Filiale].Members
    , {
        Extract(xj, Periodo) on 0,
        Extract(xj, Filiali, Dati) on 1
    FROM [Cube]
    WHERE [Measures].[Temperatura Massima]
    with the following stuff set in your
    mondrian.rolap.cellBatchSize=ReallyBigNumber (i use 2147483647)
    i've found that using NonEmptyCrossJoin() in the WITH section of the query drastically and profoundly improves the performance of non-empty type stuff.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2017 Pentaho Corporation. All Rights Reserved.