View Full Version : Dimension update/lookup always inserts a null row

02-13-2006, 01:08 PM

I have a transformation importing data from an excel file. The problem I am experiencing is that I always have a null row inserted into the resulting table when making use of the "Dimension update/lookup". I have tried inserting different steps to try and eliminate this from occurring. The steps I have are as follows ...

Excel Input -> Sort Rows -> Unique Rows -> Dimension update/lookup.

I have tried inserting a Filter rows with a criteria of "if not null" do update else Dummy(do nothing) just before the last step, but to no avail. If however, I change the last step to a Table output ( and remove the Filter rows step ) there is no null row inserted into the table. This leads me to believe that either I am clueless ( very likely ) or there is possibly a bug in Dimension update/lookup.

BTW, thanks for a great tool. Your effort is greatly appreciated.


02-14-2006, 02:18 AM

An "unknown" row is inserted by design with technical key 0. (or 1 on certain databases)
I thought of making it optional, but if you don't want this, then you're no longer updating a slowly changing dimension and you should indeed find another way of doing the dimension update.
The point is that if you don't have the "unknown" key in the dimension, you can't enforce the 1:N relationship between a dimension and a fact table.

All the best,


02-14-2006, 04:12 AM
Hi Matt,

ahh, didn't realise it was the "unknown" row. Is it possible to allow the "unknown" value to be configurable instead of null, i.e, integer(0), null or the string value "unknown"? The situation I have is that the key field used for lookup in the dimension is a string ... I'm not getting original key from the source so I cannot create an alternatekey field. Everytime the transformation is run it creates another null / "unknown" row.

I've been experimenting with the MS AdventureWorks DB and here I have the alternatekey as an integer. In this case, the key field is set to null and on each execution of the transformation no additional "unknown" rows are inserted other than the original "unknown" row.

Could there be an error when the dimension field key used for comparison is a string?

I have a ton of other "feature" requests but I'll hold off on them - going to start poking around in the source first.


02-14-2006, 05:30 AM
Hi Greg,

I'm asuming you're talking about the natural key right? You can put whatever value in there.
Kettle only checks for the existance of technical/surrogate key 0. (or 1 on certain databases)
It doesn't do an update of any of the fields in the unknown row.
So my suggestion is to just do a manual update of the unknown row.

If it tries to insert a technical key with value NULL, then we do have a bug on our hands ;-)
In that case, you should create a new bug tracker on this very site and specify the database type, access method, versions etc, you know what I mean...

Also, if you have any suggestions on making Dimension Lookup/Update better, please also file a tracker for an enhancement.

At the moment I'm taking a short break from Kettle because of the birth of my daughter, but in a couple of weeks I'll be back to full speed hacking.

Hope this helps,