PDA

View Full Version : String length limit



t_christensen
03-13-2006, 12:34 AM
Attachment: String.GIF (http://forums.pentaho.org/archived_att/kettle/String.GIF) In a transformation I have, I am reading text data from a table ... then transforming it with a javascript (see attached picture) ... and then finally writing it to a MySQL database.

The input field "TOASTX" is defined as a string with the length of 40 ... the output field for the same data, is "IHTTXT" defined as a string with the length of 30 ...

One should think that the data in the output stream, has been cut down to the length of 30 ... but it is not ... it still has the original length of 40 ...

Why is the length not cut down to 30 ..?

This is a problem because the "Output to MySQL" step makes an error when attempting to write data to the database ...

jbleuel
03-13-2006, 12:55 AM
Hi,

there is no automatic cutting to the target size. You have to define it explicit. I think this is good for getting an error if the source changes unexpectedly.

Nevertheless I tried to limit the size with the select values step and thought this should work but it didnÂÂ't (Matt what do you think?).

As a circumvention (you already use javascript): use substr(int from, int to).

Cheers,
Jens

t_christensen
03-13-2006, 01:49 AM
Should the error you are mentioning, not come from Kettle's step-check and not from when MySQL fails to insert data into the database ..?

---

[ var IHTTXT = TOASTX.substr(0,30).getString(); ]

Would the javascript-solution also work if the input field only contains 20 chars in some rows ..?

You could use a condition on the length of the string, and only use "substr" when the length is larger than 30 ... but what if I do not know where the data will be of a wrong length ... to put conditions and "substr" on all fields would make the javascript-code too complex, for such a small task, IMO ...

MattCasters
03-13-2006, 03:59 AM
Hi All,

It would be costly performance-wise to detect this at runtime in every step.
Also, this thing annoys the heck out of me in other ETL tools. Data type conversions should be as automatic as possible.
That being said, there should be a check in the verify function that verifies that the data is too long for the target field.

However, in javascript you can do just about anything so it is impossible to see if it is correct.
The substring function Jens showed worked (even without the getString() methods)

All the best,
Matt

jbleuel
03-13-2006, 04:10 AM
Another circumvention:
- Before the Java script step add a select value step and on the "Meta-Data" tab rename the field TOASTX to IHTTXT and change the length to 30.
- in Java script: IHTTXT = IHTTXT.substr(0,IHTTXT.getLength()); (original value will be overwritten by this)

The substr() works also on strings having less than 30 characters.

One could consider to add a change request for cutting strings to the target size (but only by option). This could happen at the select values step and in Java script. But then we would change the behavior between strings and other data types. On other numerical data types this would be a mess.

Good luck,

Jens