One common task when you are building a fact table is the Foreign Key lookup in the dimension tables.



When you have to fill FK for a very big Fact Table the “Database Lookup” object could be very time consuming. It don’t know how exactly this object is programmed but I assume that it have to run something like :
SELECT dimtable.pk
FROM dimtable
WHERE facttable.LogicalKey = dimtable.LogicalKey
on the Dimension Table for each records from the Fact Table.

There are 2 atlternatives for this issue .
- First one and the fastest, is to work with Stream lookup. The principle is the same as Database Lookup but Stream lookup make it possible to let all Dimension Records in memory (takes care to use only the fields needed for the lookup. So you have perhaps to remove some fields). The limitation of this solution is when you have a very big dimension or if you have to lookup many FK in many dimensions. You could enconter memory limitation. Or you system will begin to swap with temp files causing poor performances.
- Second one is to work with flat files and then to merge (merge join) the fact file/stream with dimension file/stream in order to get the Primary/Technical Key from the dimension and write it in the FK field of the fact. Of course, the fact and the dimension need to be sorted on the logical key (included begin/end date if you have a type 2 dimension)

Here is a comparaison of the 3 methods for a Fact with 152.000 Records and a Dimension (type 1) with 18.000 Records.

Stream Lookup 269 sec.
Merge Join 524 sec.
Database Lookup 843 sec. (I have use an Ms Access DB on a network drive for this exercice.)
Kind Regards