PDA

View Full Version : Join rows



ghost
12-21-2005, 05:04 AM
Attachment: TestCase.zip (http://forums.pentaho.org/archived_att/kettle/TestCase.zip) Hi,

i've got something new. :)

The case:
From my contact data, imported via "Excel input", i need to parallel generate rows that respectively references their appropriate contact record (via generated id). This represents a 1-to-Many cardinality between a contact record (one DB table) and its generated records (another DB table).
The creation of all generated records is done parallel. Therefore i need to join them together and renaming their fields conform to the DB table.
And here occurs something wrong. Sometimes a field contains wrong value (from an other field). But this behaviour follows no recognizeable logic. When i run a preview 5 times, i may reach 5 different results. (!?)

It seems that especially the "Java Scrip value" elements (dynData1-4) are responsible for the described problem - The finial "dynValue" field is it, that may have wrong value. Furthermore occurs sometimes errors on these script elements, like:
"dynData_2.0 - ERROR: Unexpected error in 'toObject row values' : java.lang.IndexOutOfBoundsException: Index: 5, Size: 5"

Without these scripts it seems to run stable.

Any ideas?

Ok - i attachted my test case with example source data file.

Cheers,
Gunther.

MattCasters
12-21-2005, 05:26 AM
Hi Gunther,



I can't read data from the test-file you sent me as there are conflicting date-types in it:



<code>
1000001;0;1000000;1000019;1000002;1000000;Test1;;100;2005/12/21 13:06:39.080;100;2005/12/21 13:06:39.080

1000001;0;1000000;1000019;1000002;1000000;Test1;;100;2005/12/21 13:06:39.080;100;2005/12/21 13:06:39.080

1000001;0;1000002;1000019;1000002;1000002;Test1;;100;21.12.2005 13:06;100;21.12.2005 13:06

1000001;0;1000003;1000019;1000002;1000003;Test1;;100;21.12.2005 13:06;100;21.12.2005 13:06

1000001;0;1000004;1000019;1000002;1000004;Test1;;100;21.12.2005 13:06;100;21.12.2005 13:06

</code>



It's hard to test without good test data. ;-)



Furthermore, it seems like the effects you are having come from having rows with varying number of fields.

This is in theory supported, but it can cause so many problems that I'm thinking of giving errors at runtime for it.



For efficiency, Kettle steps often cache field indexes in the rows, therefor, it's important that when you're joining multiple streams, the order in which the fields occur are also the same.



Cheers,



Matt

MattCasters
12-21-2005, 05:52 AM
In stead of using javascript you can also use a calculator:

To concatenate 2 strings (A+B)
To assign default values (Set field to constant value A)

Matt

ghost
12-21-2005, 05:53 AM
Sorry - wrong setting one more time ;)

Ok - the date fields of the test data are never used in this test case. You can delete them if you like. I tested this case without the "Java Script value" elements - works perfectly. Because i principial need this scripts to construct dynamical Strings i wrapped them together in only one script after join. And also this works perfect :)
The script looks like:

var dynValue = "";

if(C_ValidCombination_ID >= 1100000 && C_ValidCombination_ID < 1200000) {
dynValue = Name.getString()+"-1";
}
else if(C_ValidCombination_ID >= 1200000 && C_ValidCombination_ID < 1300000) {
dynValue = Name.getString()+"-2";
}
else if(C_ValidCombination_ID >= 1300000 && C_ValidCombination_ID < 1400000) {
dynValue = Name.getString()+"-3";
}
else if(C_ValidCombination_ID >= 1400000 && C_ValidCombination_ID < 1500000) {
dynValue = Name.getString()+"-4";
}

So i have "my solution" ;) - but i think there is a basic problem especially with the "Java Script value" element, because the test case should principially work, or?

Best wishes,
Gunther

ghost
12-21-2005, 05:57 AM
Ah, ok - i didn&#39;t knew about this step - i&#39;ll try this.

Thanks!

Best wishes,
Gunther

(... with correct settings this time ;) )

MattCasters
12-21-2005, 06:10 AM
Congratulations, you did find a bug in ScriptValues.
However, it will require some puzzling to find this one :-(
Perhaps I better start by upgrading the Rhino library to a more recent version. (js.jar)
The bug just seems too weird.

Thanks,

Matt

MattCasters
12-21-2005, 06:37 AM
Ok - the date fields of the test data are never used in this test case. You can delete them if you like. I tested this case without the "Java Script value" elements - works perfectly. Because i principial need this scripts to construct dynamical Strings i wrapped them together in only one script after join. And also this works perfect :)

The script looks like:

var dynValue = "";

if(C_ValidCombination_ID >= 1100000 && C_ValidCombination_ID < 1200000) {

dynValue = Name.getString()+"-1";
}
else if(C_ValidCombination_ID >= 1200000 && C_ValidCombination_ID < 1300000) {

dynValue = Name.getString()+"-2";

}
else if(C_ValidCombination_ID >= 1300000 && C_ValidCombination_ID < 1400000) {

dynValue = Name.getString()+"-3";
}
else if(C_ValidCombination_ID >= 1400000 && C_ValidCombination_ID < 1500000) {

dynValue = Name.getString()+"-4";
}

So i have "my solution" ;) - but i think there is a basic problem especially with the "Java Script value" element, because the test case should principially work, or?

Best wishes,
Gunther