Table Output creating duplicate keys in empty table
I'm trying to copy a table from one database to another using Kettle (3.0.4). The source is DB2 on AS/400 and the target is MySQL (5.x). The source and target tables have the same primary key constraints, yet I'm getting duplicate key errors in my Table Output step. I suspect the problem is that the source table is being actively modified while the ETL is running. My guess is that Kettle reads rows from the source table in batches, and that when it goes back for a second batch it finds a row it has already copied because that row was modified in the meantime. I've been trying to force it to read the whole table in one batch, but I can't tell whether I've succeeded. I told it to show progress every 5000 rows, and it looks like it's reading and writing batch by batch rather than reading the whole table first and then writing it.
My source query is a very basic select of all the fields from the table, with no "WHERE" or "ORDER BY" clause. The target table is always empty because I create it as part of the transformation. (I'm replicating data by copying it into a temp table created like the actual target and then renaming that temp table when I'm done.)
It doesn't seem like I should have to use Insert/Update on an empty table, but I may end up trying that. Just now, I tried telling it to ignore insert errors, but I had to hand-edit the .ktr to do that because I couldn't figure out how to get Spoon to un-grey that checkbox (or the Use Batch one), even with Commit = 0. I'm letting it run for a while to see if that helps, but we're on a tight deadline, so any other ideas would be appreciated.
It appears that my problem was caused by the source table itself: it would periodically contain duplicate keys because of a bug in the process that updates it. You can disregard my question.
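For anyone who hits the same symptom, checking the source for duplicate keys before blaming the Table Output step may save some time. Below is a minimal sketch of that check, not what I actually ran. The table name `orders_src`, the key column `order_id`, and the helper `find_duplicate_keys` are all hypothetical, and an in-memory SQLite database stands in for the DB2 source so the example runs anywhere. The `GROUP BY ... HAVING COUNT(*) > 1` query itself is standard SQL and should carry over to DB2 with the real table and key column names.

```python
import sqlite3


def find_duplicate_keys(conn, table, key_columns):
    """Return one row per key that occurs more than once: (key values..., count).

    `table` and `key_columns` are interpolated into the SQL text, so they must
    come from trusted input (identifiers cannot be bound as parameters).
    """
    cols = ", ".join(key_columns)
    sql = (
        f"SELECT {cols}, COUNT(*) AS n "
        f"FROM {table} "
        f"GROUP BY {cols} "
        f"HAVING COUNT(*) > 1 "
        f"ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()


# Hypothetical stand-in for the source table. No primary key is declared here
# (an assumption for the demo) so that duplicate keys can be inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_src (order_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders_src VALUES (?, ?)",
    [(1, "new"), (2, "new"), (2, "shipped"), (3, "new")],
)

duplicates = find_duplicate_keys(conn, "orders_src", ["order_id"])
print(duplicates)  # [(2, 2)] -> key 2 appears twice
```

If the query returns any rows, the duplicate key errors on the target are coming from the data itself rather than from how the rows are read or written in batches.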