PDA

View Full Version : Group by problem using CASE functions



impulseBE
02-03-2010, 06:06 AM
When specifying the formula of a business column as an addition/subtraction of two other business columns which both use CASE formulas to calculate their value, the SQL generator also adds it to the group by statement, which shouldn't happen?



SELECT
BT_AGGR_SF_B.KLNR41 AS COL0
,BT_KLADIM.CUSTNAME AS COL1
, CASE WHEN SUM( BT_AGGR_SF_B.OMZET_ACY ) <> 0 THEN ( SUM( BT_AGGR_SF_B.BW_ACY ) / SUM( BT_AGGR_SF_B.OMZET_ACY ) * 100) ELSE CASE WHEN SUM( BT_AGGR_SF_B.BW_ACY ) = 0 THEN 0 ELSE 100 END END AS COL2
, CASE WHEN SUM( BT_AGGR_SF_B.OMZET_YTD_ALY ) <> 0 THEN ( SUM( BT_AGGR_SF_B.BW_YTD_ALY ) / SUM( BT_AGGR_SF_B.OMZET_YTD_ALY ) * 100) ELSE CASE WHEN SUM( BT_AGGR_SF_B.BW_YTD_ALY ) = 0 THEN 0 ELSE 100 END END AS COL3
, CASE WHEN SUM( BT_AGGR_SF_B.OMZET_ACY ) <> 0 THEN ( SUM( BT_AGGR_SF_B.BW_ACY ) / SUM( BT_AGGR_SF_B.OMZET_ACY ) * 100) ELSE CASE WHEN SUM( BT_AGGR_SF_B.BW_ACY ) = 0 THEN 0 ELSE 100 END END - CASE WHEN SUM( BT_AGGR_SF_B.OMZET_YTD_ALY ) <> 0 THEN ( SUM( BT_AGGR_SF_B.BW_YTD_ALY ) / SUM( BT_AGGR_SF_B.OMZET_YTD_ALY ) * 100) ELSE CASE WHEN SUM( BT_AGGR_SF_B.BW_YTD_ALY ) = 0 THEN 0 ELSE 100 END END AS COL4
FROM
PENTAHO_DW.AGGR_SF_B BT_AGGR_SF_B
,PENTAHO_DW.KLADIM BT_KLADIM
WHERE
( BT_AGGR_SF_B.KLNR41 = BT_KLADIM.CUSTNR AND
BT_AGGR_SF_B.ADNR41 = BT_KLADIM.ADDRESSNR )
AND
(
(
BT_AGGR_SF_B.KLNR41 = 19716
)
)
GROUP BY
BT_AGGR_SF_B.KLNR41
,BT_KLADIM.CUSTNAME
, CASE WHEN SUM( BT_AGGR_SF_B.OMZET_ACY ) <> 0 THEN ( SUM( BT_AGGR_SF_B.BW_ACY ) / SUM( BT_AGGR_SF_B.OMZET_ACY ) * 100) ELSE CASE WHEN SUM( BT_AGGR_SF_B.BW_ACY ) = 0 THEN 0 ELSE 100 END END - CASE WHEN SUM( BT_AGGR_SF_B.OMZET_YTD_ALY ) <> 0 THEN ( SUM( BT_AGGR_SF_B.BW_YTD_ALY ) / SUM( BT_AGGR_SF_B.OMZET_YTD_ALY ) * 100) ELSE CASE WHEN SUM( BT_AGGR_SF_B.BW_YTD_ALY ) = 0 THEN 0 ELSE 100 END END



When i delete the last line (in red) from the GROUP BY statement, the query executes fine.

Business columns BC_AGGR_SF_B_GM_%_YTD and BC_AGGR_SF_B_GM_%_Y-1_YTD calculate ratios using a formula like:


CASE(
[BT_AGGR_SF_B.BC_AGGR_SF_B_TO_YTD]<>0;
[BT_AGGR_SF_B.BC_AGGR_SF_B_GM_YTD]/[BT_AGGR_SF_B.BC_AGGR_SF_B_TO_YTD]*100;
CASE([BT_AGGR_SF_B.BC_AGGR_SF_B_GM_YTD]=0;0;100)
)
Calculating the difference between the two for business column BC_AGGR_SF_B_DIFF_GM_%_YTD is where it goes wrong

[BT_AGGR_SF_B.BC_AGGR_SF_B_GM_%_YTD]-[BT_AGGR_SF_B.BC_AGGR_SF_B_GM_%_Y-1_YTD]
For the moment, I'm just putting both CASE functions in one formula as a workaround.

EDIT: related to http://jira.pentaho.com/browse/PMD-521 i guess

ak@proventa.de
02-10-2010, 07:17 AM
Hey,

me expierence with this problem:

Choose a aggregation for this field and it will no appear in the 'goup by' section.

Greetings
Chris

impulseBE
02-15-2010, 09:27 AM
I don't want to calculate an aggregation on a single field or calculated field, but the difference between aggregations on 2 different fields (i already pre-calculated those aggregations in 2 other business columns and would like to re-use them).

It's basically a case of wanting SUM(A) - SUM(B) instead of SUM(A-B), but then using CASE functions and not having it show up in the group by statement.

codek
02-15-2010, 10:06 AM
I hit this issue too - at the time there wasnt a workaround, so if you find one then let me know!

Actually looking at it, that jira is mine! ha. Hopefully now others are hitting it we'll get some traction.

impulseBE
02-15-2010, 11:04 AM
Well actually it seems like you can fix the formula in your JIRA by pre-calculating sum(num_responses_gt_1) and sum(num_responses) in 2 newly created logical business columns, using aggregation:sum and exact formula.

Because in my case, i'm using
CASE([CAT_AGGR_SF_B.BC_AGGR_SF_B_TO_YTD]<>0;[CAT_AGGR_SF_B.BC_AGGR_SF_B_GM_YTD]/[CAT_AGGR_SF_B.BC_AGGR_SF_B_TO_YTD]*100;CASE([CAT_AGGR_SF_B.BC_AGGR_SF_B_GM_YTD]=0;0;100))
-
CASE([CAT_AGGR_SF_B.BC_AGGR_SF_B_TO_Y-1_YTD]<>0;[CAT_AGGR_SF_B.BC_AGGR_SF_B_GM_Y-1_YTD]/[CAT_AGGR_SF_B.BC_AGGR_SF_B_TO_Y-1_YTD]*100;CASE([CAT_AGGR_SF_B.BC_AGGR_SF_B_GM_Y-1_YTD]=0;0;100))
as a workaround.

For instance, I defined CAT_AGGR_SF_B.BC_AGGR_SF_B_TO_YTD using aggregation:sum and [CAT_AGGR_SF_B.BC_AGGR_SF_B_OMZET_ACY] as the exact formula.

But it's ugly :p (as i already calculated these 2 CASE functions before, i would like to re-use them => [CAT_AGGR_SF_B.BC_AGGR_SF_B_GM_%_YTD] - [CAT_AGGR_SF_B.BC_AGGR_SF_B_GM_%_Y-1_YTD]).

I think if you only use aggregate business columns in a formula, it does get picked up by the SQL generator and the formula won't get added to the group by statement. But from the moment you start referring to non-aggregate business columns, it all goes wrong, if that makes sense.