
improving performance of script



kettle_anonymous
02-06-2006, 03:06 PM
good luck on that, Jeet. he he.

MattCasters
02-07-2006, 11:32 AM
(original message by anyoldlogin, set format to Plain)

I have a very large fixed-width text file. Each line has a position that indicates the line type, and each line type loads into a separate database table. Currently I split this process into two steps: 1) create four new files, one for each of the four line types, using text file in, filter and text file out; 2) load each of the four files, translating the data to the proper columns for the database table, using select to filter out unwanted information and finally using table load to load the database.

This process works but is obviously slow due to the disk reads and writes.

Is there any way to grab single lines from fixed-width files and send them down separate paths where their data can be extracted based upon the line type? Sorry this message is so long.

Thanks.

MattCasters
02-07-2006, 11:36 AM
Usually these kinds of files come from Cobol programs with redefine sections.
At the moment, you need to apply a filter in Kettle and read the file multiple times: one time per record layout.

Of course, this too is a sub-optimal solution, and I've had this question before, so I'm thinking of adding a variable number of output streams to TextFileInput so that you can have N targets for N different record layouts in the text file(s).

There are only 2 main issues at the moment with this:
- the dialog will become (too) complex
- backward compatibility
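
To make the "N targets for N record layouts" idea concrete, here is a minimal sketch in plain Python of reading the file once and routing each line by its type position. It is not Kettle code and does not reflect how TextFileInput works or would work; the record-type column, the type codes "A" and "B", the field names and the column offsets are all made up for illustration.

```python
import io
from collections import defaultdict

# Hypothetical layouts: record-type code -> list of (field name, start, end).
# Offsets are 0-based slice positions; none of this comes from a real file spec.
LAYOUTS = {
    "A": [("id", 1, 6), ("name", 6, 16)],
    "B": [("id", 1, 6), ("amount", 6, 14)],
}
TYPE_POS = 0  # assumed column that holds the one-character record type


def route_records(lines, layouts=LAYOUTS, type_pos=TYPE_POS):
    """Read the input once and group the parsed rows by record type."""
    targets = defaultdict(list)
    for line in lines:
        line = line.rstrip("\r\n")
        if len(line) <= type_pos:
            continue  # blank or too-short line: skipped in this sketch
        record_type = line[type_pos]
        layout = layouts.get(record_type)
        if layout is None:
            continue  # unknown record type: ignored in this sketch
        # Cut the fixed-width fields for this layout and trim the padding.
        row = {name: line[start:end].strip() for name, start, end in layout}
        targets[record_type].append(row)
    return targets


# Small in-memory stand-in for the large fixed-width file.
sample = io.StringIO(
    "A00001Alice     \n"
    "B0000100012.50\n"
    "A00002Bob       \n"
)
routed = route_records(sample)
```

Each list in `routed` stands in for one output stream (one per target table), so the file is read a single time and no intermediate files are written.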

Hope this clears some stuff up.

Please do open a feature request for this as I don't have the time at the moment. ;-)

Cheers,
Matt