Hitachi Vantara Pentaho Community Forums

Thread: Batch processing in Table Output - a little suggestion

  1. #1
    Join Date
    Dec 2008
    Posts
    11

    Default Batch processing in Table Output - a little suggestion

    There is a problem with batch processing in the Oracle JDBC driver: the driver is not fully compliant with the JDBC specification.
    Let's assume we have six insert PreparedStatements in a batch and the fifth one is wrong (e.g. it violates a constraint). After the batch is executed, the first four inserts are applied, and the getUpdateCounts() method of BatchUpdateException returns an array with five elements, each with status -3 (Statement.EXECUTE_FAILED). Kettle then commits and sends all the statements from the batch (because they all have status -3) to the error handling step. This creates a discrepancy: the error handling step receives the first four rows even though they have already been inserted into the table. In my opinion, Oracle shouldn't insert rows and then report that those statements were not executed properly; that is not consistent with the JDBC specification.

    So, I would like to propose adding a checkbox to the Table Output step by which the user can force Kettle to perform a rollback after a failed batch execution. This is the only way to work around this bad behaviour of the Oracle JDBC driver.
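    The proposed workaround might look roughly like the sketch below. This is only an illustration, not Kettle's actual code: the shouldRollback() helper and the simulated update-counts array are invented names for the example, and in the real step the decision would wrap Connection.rollback() around the executeBatch() call.

    ```java
    import java.sql.Statement;
    import java.util.Arrays;

    public class BatchRollbackSketch {

        // In a real step the surrounding logic would look roughly like:
        //   try {
        //       preparedStatement.executeBatch();
        //       connection.commit();
        //   } catch (BatchUpdateException e) {
        //       if (shouldRollback(e.getUpdateCounts())) {
        //           connection.rollback();   // instead of committing
        //       }
        //   }

        // Roll back when the driver could not tell which operation failed,
        // i.e. when any update count is Statement.EXECUTE_FAILED (-3).
        static boolean shouldRollback(int[] updateCounts) {
            return Arrays.stream(updateCounts)
                         .anyMatch(c -> c == Statement.EXECUTE_FAILED);
        }

        public static void main(String[] args) {
            // Simulated BatchUpdateException.getUpdateCounts() from the
            // six-insert scenario above: the Oracle driver reports -3 for
            // every element, even though four rows were already inserted.
            int[] counts = {Statement.EXECUTE_FAILED, Statement.EXECUTE_FAILED,
                            Statement.EXECUTE_FAILED, Statement.EXECUTE_FAILED,
                            Statement.EXECUTE_FAILED};
            System.out.println(shouldRollback(counts)); // prints "true"
        }
    }
    ```

    Since the driver marks every batched operation as failed, there is no way to commit only the good rows; rolling back the whole batch is the only consistent choice.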

    Oracle also recommends rolling back the transaction after a failed batch execution:

    Oracle Database
    JDBC Developer's Guide and Reference
    10g Release 1 (10.1)
    Part No. B10979-02

    chapter 22
    "Error Handling in the Oracle Implementation of Standard Batching":
    Code:
    For a prepared statement batch, it is not possible to know which operation failed.
    The array has one element for each operation in the batch, and each element has a
    value of -3. According to the JDBC 2.0 specification, a value of -3 indicates that an
    operation did not complete successfully. In this case, it was presumably just one
    operation that actually failed, but because the JDBC driver does not know which
    operation that was, it labels all the batched operations as failures.
    You should always perform a ROLLBACK operation in this situation.


    What do you think?

    Regards,
    Daniel Polaczanski aka ZreZur

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    Thanks for the suggestion. It's in the wrong place, but I guess Oracle drivers are even worse :-)

    If what you say is true, then the option is not needed, since we're forced to do a rollback anyway to get correct results. I don't doubt reports of stupidity from Oracle, but then again I recall testing the batch processing option on Oracle earlier.

    In short : File a JIRA case with a test to reproduce the Oracle bug.

  3. #3
    Join Date
    Dec 2008
    Posts
    11

    Default

    No problem, I've created the JIRA case: http://jira.pentaho.com/browse/PDI-2356

    I changed the behaviour of this step in my "temporary implementation of Table Output" in two functions (dispose() and writeToTable()) to perform a rollback instead of a commit.

    Daniel Polaczanski aka ZreZur


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.