PDA

View Full Version : Database lookup of null (blank) fields when loading from XLS



MattCasters
02-07-2006, 02:15 PM
(posted by Anonymous, edited for readeablility)
I am loading a database table with the contents of an XLS. In most cases all the fields I need are already specified in the XLS, but some records are missing fields (blank columns). When these fields are missing, they can be looked up (database lookup) using a key field which is always specified. How can I take the results of a database lookup and use the looked up values to populate fields on the record which drives the lookup? The denormalization step did not seem to meet this need unless I am missing something.

Example:

If the XLS has this data:

CUSTID CUSTNAME Transaction Amt
1 Some Customer 1 Sell 10
2 Some Customer 2 Buy 55
1 Some Customer 1 Sell 5
7 Buy 100
2 Some Customer 2 Buy 55

I would want to load the 4th record as follows:

CUSTID CUSTNAME Transaction Amt
7 Some Customer 7 Buy 100

This requires looking up in a customer table, using the CUSTID field, the customer name at load time because it is not specified in the XLS.

Please note that normalizing this information (i.e., removing the CUSTNAME) from this table is not an option -- at least not by me at this point.

Thanks

MattCasters
02-07-2006, 02:27 PM
How does the 4th data line look like?

(1) 7 Buy 100
(2) 7 <empty> Buy 100

If (1) is the answer: change the XLS sheet, Kettle only supports streams of rows with the same layout.

If (2) is the answer I have another question: is there another row with the following line in the XLS file?

7 Some Customer 7 Buy 100

If this is the case, you can read the file twice, send the data to a stream lookup file and enrich the data like that. It does so in memory, but I guess the amounts of rows are not that big if it comes from an XLS file ;-)

Please note that Kettle can also fill in blanks by taking the value from the previous row in the Excel sheet. I&#39;m not sure that this helps in your case though.

Hope this helps,

Matt

ark0n3
07-19-2010, 01:18 PM
A small tips for everyone (thnx Matt!):
in the Excel input you have to set "Repeat" to "Y" in the column you want to be filled in with the previous row content (in case it's empty).