Hitachi Vantara Pentaho Community Forums

Thread: Transformation Error Handling

  1. #1
    Join Date
    Mar 2011
    Posts
    139

    Transformation Error Handling

    I am working on a relatively simple transformation where I have a table containing a number of HL7 messages. The script extracts each HL7 message from the table and parses it into discrete elements. This works fine except when it encounters an improperly formatted message.

    Prior to adding the Abort step to the script, the transformation failed with the following error:

    2013/11/19 07:49:21 - HL7 Input.0 - ERROR (version 5.0.0.1,build 1 from 2013-09-11_16-51-19 by buildguy) : Unexpected error
    2013/11/19 07:49:21 - HL7 Input.0 - ERROR (version 5.0.0.1,build 1 from 2013-09-11_16-51-19 by buildguy) : org.pentaho.di.core.exception.KettleException:
    2013/11/19 07:49:21 - HL7 Input.0 - Error parsing message
    2013/11/19 07:49:21 - HL7 Input.0 - at java.lang.Thread.run (null:-1)
    2013/11/19 07:49:21 - HL7 Input.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:60)
    2013/11/19 07:49:21 - HL7 Input.0 - at org.pentaho.di.trans.steps.hl7input.HL7Input.processRow(HL7Input.java:77)
    2013/11/19 07:49:21 - HL7 Input.0 - at ca.uhn.hl7v2.parser.Parser.parse (Parser.java:158)
    2013/11/19 07:49:21 - HL7 Input.0 - at ca.uhn.hl7v2.parser.GenericParser.getEncoding(GenericParser.java:190)
    2013/11/19 07:49:21 - HL7 Input.0 - at ca.uhn.hl7v2.parser.PipeParser.getEncoding(PipeParser.java:102)
    2013/11/19 07:49:21 - HL7 Input.0 -
    2013/11/19 07:49:21 - HL7 Input.0 - at org.pentaho.di.trans.steps.hl7input.HL7Input.processRow(HL7Input.java:98)
    2013/11/19 07:49:21 - HL7 Input.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:60)
    2013/11/19 07:49:21 - HL7 Input.0 - at java.lang.Thread.run(Unknown Source)
    2013/11/19 07:49:21 - HL7 Input.0 - Caused by: java.lang.NullPointerException
    2013/11/19 07:49:21 - HL7 Input.0 - at ca.uhn.hl7v2.parser.PipeParser.getEncoding(PipeParser.java:102)
    2013/11/19 07:49:21 - HL7 Input.0 - at ca.uhn.hl7v2.parser.GenericParser.getEncoding(GenericParser.java:190)
    2013/11/19 07:49:21 - HL7 Input.0 - at ca.uhn.hl7v2.parser.Parser.parse(Parser.java:158)
    2013/11/19 07:49:21 - HL7 Input.0 - at org.pentaho.di.trans.steps.hl7input.HL7Input.processRow(HL7Input.java:77)
    2013/11/19 07:49:21 - HL7 Input.0 - ... 2 more

    After I added the Abort step, the error became more informative:
    2013/11/19 07:50:31 - HL7 Error.0 - ERROR (version 5.0.0.1,build 1 from 2013-09-11_16-51-19 by buildguy) : Row nr 1 causing abort :[419077], [000075FE7B2818A185257BD9005C71D0], {HL7 Message here – cannot display due to privacy},[1], [Field Separator], [1.1.1.1], [ST], [single value], [|]
    2013/11/19 07:50:31 - HL7 Error.0 - ERROR (version 5.0.0.1,build 1 from 2013-09-11_16-51-19 by buildguy) : Aborting after having seen 1rows.

    Here is my question: the HL7 step does not have error handling. How can I add some level of error handling so that bad records are identified but the script can continue?

    I am also trying to understand error handling in transformations more generally: how to handle an error during ETL, how to skip a bad record and continue with the next one, and how to stop the entire ETL and throw an error.

    Thank you
    Ray

  2. #2
    Join Date
    Nov 2008
    Posts
    777

    I'm not too well-versed on HL7 but I'm willing to try to help in order to learn a little about it...

    First, a question: Is your table data in HL7 Version 2.x (ASCII) format or Version 3 (XML) format?

    Second, a suggestion: You should submit a JIRA that pleads for the addition of error handling to the HL7 Input step.
    pdi-ce-4.4.0-stable
    Java 1.7 (64 bit)
    MySQL 5.6 (64 bit)
    Windows 7 (64 bit)

  3. #3
    Join Date
    Mar 2011
    Posts
    139

    Darrell,

    The data is in ASCII format, which the HL7 step parses without an issue. The step fails only when a message's format does not follow the established standards. My question may be more about "how to handle errors in a transformation?" I have come across similar error handling issues in a transformation where I did not know how to handle the errors properly and keep better control of the application flow.

    Thanks
    Ray

  4. #4
    Join Date
    Mar 2011
    Posts
    139

    Darrell,

    Also, good point about submitting a JIRA to enhance the HL7 step!

    Thanks
    Ray

  5. #5
    Join Date
    Nov 2008
    Posts
    777

    Originally posted by raymueller:
        The data is in ASCII format which the HL7 step parses without an issue. The problem is that the step only fails when the format is incorrect based on established standards. My question may be more focused on "how to handle errors in a transformation?"

        I have come across similar error handling issues in a transformation where I do not know how to properly handle the errors and have better control of the application flow.

    I'm not sure I follow your response, so I have lots of questions:
    1. Didn't your original post say that the HL7 Input step causes an unforgiving Java stack dump when the data isn't formatted correctly? Don't you need to catch the errors before you handle them? IMHO a stack dump is not an example of a catch.
    2. What type of data problems are causing the error? Can you give an example of the good and bad rows?
    3. Is there a way to detect an improperly formatted record (i.e., data validation) before the HL7 Input step?


    As for the flow, would the typical approach be sufficient for your application: splitting the flow (after a data validation step or an upgraded HL7 step) so that one branch carries the good records and another carries the bad ones?
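
    To make the good/bad split concrete, here is a minimal sketch in plain Java. It is not the PDI step API: the class RowSplitter, its Result holder, and the split method are hypothetical names, and Integer::parseInt only stands in for a real parser. The idea is that a row whose parse throws goes to a "bad" branch along with the error text, while the loop carries on with the next row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Illustrative only: plain Java, not the PDI step API. All names are hypothetical. */
final class RowSplitter {

    /** Holds the two output branches: parsed results and rejected inputs. */
    static final class Result<T> {
        final List<T> good = new ArrayList<>();
        final List<String> bad = new ArrayList<>();     // raw rows that failed to parse
        final List<String> errors = new ArrayList<>();  // one description per bad row
    }

    /**
     * Applies the parser to every row. A row whose parse throws is routed to
     * the "bad" branch with the exception text, and processing continues.
     */
    static <T> Result<T> split(List<String> rows, Function<String, T> parser) {
        Result<T> result = new Result<>();
        for (String row : rows) {
            try {
                result.good.add(parser.apply(row));
            } catch (RuntimeException e) {
                result.bad.add(row);
                result.errors.add(e.toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in parser: Integer.parseInt throws on non-numeric input.
        Result<Integer> r = split(Arrays.asList("1", "x", "3"), Integer::parseInt);
        System.out.println(r.good); // prints [1, 3]
        System.out.println(r.bad);  // prints [x]
    }
}
```

    Inside a transformation the same shape would mean two output hops, one for parsed rows and one for rejected rows, so the bad records can be written to a table or file for review instead of aborting the run.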

  6. #6
    Join Date
    Mar 2011
    Posts
    139

    Darrell,

    The step that generates the error is the HL7 Input step; however, the error occurs because the message is not formatted according to established standards. The cause could be a character or delimiter in the wrong position. I am not aware of any method within PDI to verify the HL7 structure. There are other tools that can, but I can't check every message by hand.

    Setting aside the fact that I am using an HL7 step in this scenario, let's say that I am using any step that does not have error handling capability: how can I trap/capture/detect an error caused by a step and handle it gracefully?

    Thanks
    Ray

  7. #7
    Join Date
    Nov 2008
    Posts
    777

    Certainly, an upgraded HL7 Input step is the most desirable choice. It would most likely catch all the errors so you can handle them gracefully. That's not going to happen very soon, though, so you need a workaround in the meantime. Here are some possibilities:
    1. Write a regex to filter out your offending rows. Missing delimiters might be easy to catch in this way. This is why I asked for some examples of good and bad rows.
    2. Find an open source Java library that can parse HL7 input and orchestrate it in a JavaScript step. http://hl7api.sourceforge.net/
    3. Write a UDJC (User Defined Java Class) that analyzes rows for common HL7 formatting errors.
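
    As a rough illustration of options 1 and 3, the sketch below uses a regex to check only the start of an HL7 v2 message: the literal "MSH", a field separator, four (or, in newer versions, five) encoding characters, and the field separator again. The class Hl7PreCheck and its methods are hypothetical names, the pattern is an assumption about what "well formed" means here, and the null check is only a guess prompted by the NullPointerException in the stack trace. It is plain Java rather than HAPI or the PDI API, and it is nowhere near a full validator.

```java
import java.util.regex.Pattern;

/** Illustrative pre-check only; it is not a full HL7 validator. Names are hypothetical. */
final class Hl7PreCheck {

    // Assumption: a pipe-delimited HL7 v2 message starts with "MSH", then the
    // field separator (MSH-1), then 4 or 5 encoding characters (MSH-2) that
    // differ from the separator, then the separator again.
    private static final Pattern MSH_HEADER = Pattern.compile(
            "^MSH([^A-Za-z0-9\\s])(?:(?!\\1)[^\\r\\n]){4,5}\\1");

    /** Returns a short reason why the message looks unparseable, or null if it passes. */
    static String problem(String message) {
        if (message == null) {
            return "message is null";
        }
        if (message.trim().isEmpty()) {
            return "message is empty";
        }
        if (!MSH_HEADER.matcher(message).find()) {
            return "missing or malformed MSH header";
        }
        return null;
    }

    /** Convenience wrapper for use as a filter condition. */
    static boolean looksParseable(String message) {
        return problem(message) == null;
    }

    public static void main(String[] args) {
        System.out.println(problem("MSH|^~\\&|SENDER|FAC\rPID|1")); // prints null
        System.out.println(problem("PID|1||12345")); // prints missing or malformed MSH header
    }
}
```

    A check like this could run before the HL7 Input step so that rows failing it are diverted to an error branch; messages that pass may still fail later in the parser, since only the header is inspected.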
    Last edited by darrell.nelson; 11-19-2013 at 02:33 PM. Reason: Added link

  8. #8
    Join Date
    Mar 2011
    Posts
    139

    Darrell,

    Thank you. HL7 messages are difficult to share unless you de-identify them first (which is a pain) because they contain clinical patient information. Regex would be a good start for spotting patterns and should be effective even on the very long messages. I have been looking for an excuse to write a UDJC, so this will be a good project to try it on.

    I appreciate the help and insight.

    Thank you
    Ray

  9. #9
    Join Date
    Nov 2008
    Posts
    777

    Take a look at this page http://hl7api.sourceforge.net/conformance.html under the "Runtime Message Validator" heading. Might be helpful in your UDJC experience...
    Last edited by darrell.nelson; 11-19-2013 at 03:50 PM.

  10. #10
    Join Date
    Mar 2011
    Posts
    139

    Darrell,

    Thank you for pointing me to the HAPI site. I came across this site some time ago, so it was a good refresher. I should be able to adapt this.

    Thanks
    Ray

  11. #11
    Join Date
    Nov 2008
    Posts
    777

    You may also want to follow the stack trace you are getting back into the source code at the following link. I see Kettle is using the same HAPI library I suggested so perhaps you can figure out why it is crashing on you.

    https://github.com/pentaho/pentaho-k.../HL7Input.java
