PDA

View Full Version : Transaction in Insert/Update step?



jettm
03-23-2007, 01:08 PM
I very well could be missing something, but it appears from the behavior that the Insert/Update step is not wrapped in a transaction. Details:

Using SQL Server 2000 database, 2.4.1-M2 as downloaded today.

I have several Insert/Updates in parallel. I'm basically transforming rows into columns. A record comes in, gets several copies made, and each after filtering is pushed into an Insert/Update on the same table. The first Insert/Update step to get one of the copies should insert the record, the remainder of the steps to get a copy of the record should update it. However, I get primary key violations.

It looks like IU step 1 gets a copy, does the select to find it not there. IU step 2 gets a copy, does the select to find it not there. IU step 1 inserts the record since it wasn't there when it looked. IU step 2 now tries to insert the record....

I've attached the transform XML for reference, but it seems to be that the Insert/Update is not wrapping the SELECT and Update/Insert block into a transaction, which results in dirty transactions.

Any ideas on how to resolve?

thx,

Mike

MattCasters
03-23-2007, 05:13 PM
It's not a bug, it's a feature: every step runs in a different thread and every thread has it's own connection to the database.
You can change that behaviour in the Transformation settings, using the "Unique connections" checkbox (Misc tab).
One rollback in the transformation rolls back everything. It's not yet a perfect transactional implementation. It uses one unique connection per defined database.
However, to be perfect it should wait until the complete transformation is finished before doing a commit or rollback.
Unfortunately, this is not yet the case, but we'll keep working on that aspect as well.

jettm
03-26-2007, 11:55 AM
I like the behavior you're describing, that's how I expect it to behave. My question is centered on the actions of a single Insert/Update step. From what I understand, this is what it does:

1 - SELECT to see if the record exists
2 - IF it does, UPDATE. IF NOT, INSERT.

The problem is when you have several of these in your transform operating on the same table. The SELECT happens, and there's not a record. I prepare and do the INSERT, but in the meantime another IU step somewhere else in the transform INSERTed a record, so my INSERT causes a primary key violation. What I think needs to happen for this step is:

1 - BEGIN TRANSACTION
2 - SELECT to see if the record exists
3 - IF it does, UPDATE. IF NOT, INSERT.
4 - COMMIT

As far as the transform as a whole, I agree that how you are doing it is the right way. But the Insert/Update needs to be atomic, or the faster you run with more IU steps the higher your odds of a primary key violation...

I hope I'm understanding what you said correctly, if not just tell me I'm dense. Everyone else does... :)

And thanks for a great product, btw...

jettm
03-27-2007, 05:20 PM
I don't know how this impacts my first question and reply, but welcome to SQL Server. I replaced the Insert/Update step with a SQL Script step that did explicitly what I needed. Like I suspected, wrapping it in a transaction resolved the issue, but only after I bumped up the transaction isolation level. In retrospect, this makes sense.

But after bumping up the transaction isolation level to SERIALIZABLE and holding the lock from the SELECT statement forward through the transaction, I started running into deadlocks. With SQL Server, this also makes sense.

Long story short, it was easier to work around the problem than deal with it like I started. Where before I had multiple Insert/Update steps in parallel, I now have one I/U step then update only steps in parallel. This works, which is what I was originally after...

If anyone has any questions let me know.

MattCasters
03-28-2007, 03:24 AM
I'm just curious, but did you give the unique connections feature a try?
That one can solve your porblem in a lot of these cases.