How to parse a complex file (using regular expressions?)



lebuche
05-25-2007, 11:17 AM
Hi.
I have an input file with a complex structure.
For example:

01;key1;name1;givename1;birthday1
02;key1;phonenumber1
03;key1;mobilephonenumber1
01;key2;name2;givename2;birthday2
03;key2;mobilephonenumber2
01;key3;name3;givename3;birthday3
02;key3;phonenumber3
02;key4;phonenumber4

(Warning: this is not a plain CSV file, because the number of fields depends on the record type given in the first field: "01", "02" or "03".)

I would like to transform this file into the following, with one line per key:

key1;name1;givename1;birthday1;phonenumber1;mobilephonenumber1
key2;name2;givename2;birthday2;;mobilephonenumber2
key3;name3;givename3;birthday3;phonenumber3;
key4;;;;phonenumber4;
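
As an illustration of the pivot described above, here is a minimal Python sketch written outside Kettle. It is not a Kettle transformation, and the function name `merge_records` is hypothetical. It assumes that record type "01" carries name, given name and birthday, "02" the phone number and "03" the mobile phone number, as in the sample data.

```python
def merge_records(lines):
    """Pivot typed records (01/02/03) into one output line per key."""
    # key -> [name, givename, birthday, phone, mobile]; dicts keep insertion order
    rows = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec_type, key, *values = line.split(";")
        row = rows.setdefault(key, [""] * 5)
        if rec_type == "01":
            # identity record: fill the first three columns
            for i, value in enumerate(values[:3]):
                row[i] = value
        elif rec_type == "02" and values:
            row[3] = values[0]  # phone number
        elif rec_type == "03" and values:
            row[4] = values[0]  # mobile phone number
    return [";".join([key] + row) for key, row in rows.items()]


sample = [
    "01;key1;name1;givename1;birthday1",
    "02;key1;phonenumber1",
    "03;key1;mobilephonenumber1",
    "02;key4;phonenumber4",
]
for out_line in merge_records(sample):
    print(out_line)
```

Keys appear in the output in the order they are first seen, and a key with no "01" record (such as key4) simply gets empty identity columns.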

1/
To do this transformation, I would like to parse my input file with regular expressions. Is that possible with Kettle?
2/
Or is there perhaps another way to do it?

Thank you for your help

Jerome

prec
05-25-2007, 02:27 PM
Hi Jerome,
I made an example; see the attached zip.

prec.

lebuche
05-27-2007, 04:00 PM
Prec,

Are you French? (In one of the fields, you wrote "clé"...)
Thank you for your quick and clear answer.
It fits my need well.
Do you know, though, whether Kettle supports regular expressions (RegEx), as Talend, another open-source ETL tool, does?
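
To make concrete what "RegEx" means for this file, here is a small sketch using Python's standard `re` module. It says nothing about whether Kettle offers such a feature; the pattern and the name `parse_record` are assumptions chosen for illustration. The pattern splits each line into record type, key and remaining fields.

```python
import re

# type is 01, 02 or 03; key is everything up to the next ';'; rest is the remaining fields
RECORD = re.compile(r"^(?P<type>0[123]);(?P<key>[^;]+);(?P<rest>.*)$")


def parse_record(line):
    """Return (type, key, fields) for a well-formed line, or None otherwise."""
    match = RECORD.match(line.strip())
    if match is None:
        return None
    return match.group("type"), match.group("key"), match.group("rest").split(";")


print(parse_record("01;key1;name1;givename1;birthday1"))
print(parse_record("02;key4;phonenumber4"))
```

A line whose first field is not one of the three known record types does not match and yields `None`, which is one way to detect malformed input.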

Bye
Jerome

prec
05-28-2007, 08:10 AM
Hi Lebuche,
Yes, I'm French...
As for RegEx (regular expressions?), I don't know; you would have to explain the concept a bit.
I suppose it means doing a mathematical evaluation that involves one or more fields. If so, use the Calculator step, which generates a new column, then filter on that column, and finally remove the column with a select-values step.

prec.

shassan2
05-28-2007, 10:31 AM
Hello,

It would be a good idea to file a feature request for it :-)

http://javaforge.com/proj/tracker/browseTracker.do?tracker_id=1274

Samatar