Hitachi Vantara Pentaho Community Forums

Thread: Error handling with "Get Data From XML" component

  1. #1
    Join Date
    Apr 2012
    Posts
    253

    Error handling with "Get Data From XML" component

    I'm having issues with this step: it aborts the transformation when it hits an error.

    I have error handling configured for the step, but apparently it is throwing an error
    that the error handling does not understand or has no catch configured for.

    2014/05/20 17:31:50 - Get data from XML.0 - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : Unexpected Error : org.pentaho.di.core.exception.KettleException:
    2014/05/20 17:31:50 - Get data from XML.0 - org.dom4j.DocumentException: Error on line 40 of document http://wkbn.com/ : The entity "raquo" was referenced, but not declared. Nested exception: The entity "raquo" was referenced, but not declared.
    2014/05/20 17:31:50 - Get data from XML.0 - Error on line 40 of document http://wkbn.com/ : The entity "raquo" was referenced, but not declared. Nested exception: The entity "raquo" was referenced, but not declared.
    2014/05/20 17:31:50 - Get data from XML.0 - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : org.pentaho.di.core.exception.KettleException:
    2014/05/20 17:31:50 - Get data from XML.0 - org.dom4j.DocumentException: Error on line 40 of document http://wkbn.com/ : The entity "raquo" was referenced, but not declared. Nested exception: The entity "raquo" was referenced, but not declared.
    2014/05/20 17:31:50 - Get data from XML.0 - Error on line 40 of document http://wkbn.com/ : The entity "raquo" was referenced, but not declared. Nested exception: The entity "raquo" was referenced, but not declared.
    2014/05/20 17:31:50 - Get data from XML.0 -
    2014/05/20 17:31:50 - Get data from XML.0 - at org.pentaho.di.trans.steps.getxmldata.GetXMLData.setDocument(GetXMLData.java:184)
    2014/05/20 17:31:50 - Get data from XML.0 - at org.pentaho.di.trans.steps.getxmldata.GetXMLData.ReadNextString(GetXMLData.java:409)
    2014/05/20 17:31:50 - Get data from XML.0 - at org.pentaho.di.trans.steps.getxmldata.GetXMLData.getXMLRowPutRowWithErrorhandling(GetXMLData.java:712)
    2014/05/20 17:31:50 - Get data from XML.0 - at org.pentaho.di.trans.steps.getxmldata.GetXMLData.getXMLRow(GetXMLData.java:698)
    2014/05/20 17:31:50 - Get data from XML.0 - at org.pentaho.di.trans.steps.getxmldata.GetXMLData.processRow(GetXMLData.java:655)
    2014/05/20 17:31:50 - Get data from XML.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:60)
    2014/05/20 17:31:50 - Get data from XML.0 - at java.lang.Thread.run(Thread.java:662)
    2014/05/20 17:31:50 - Get data from XML.0 - Caused by: org.dom4j.DocumentException: Error on line 40 of document http://wkbn.com/ : The entity "raquo" was referenced, but not declared. Nested exception: The entity "raquo" was referenced, but not declared.
    2014/05/20 17:31:50 - Get data from XML.0 - at org.dom4j.io.SAXReader.read(SAXReader.java:482)
    2014/05/20 17:31:50 - Get data from XML.0 - at org.dom4j.io.SAXReader.read(SAXReader.java:291)
    2014/05/20 17:31:50 - Get data from XML.0 - at org.pentaho.di.trans.steps.getxmldata.GetXMLData.setDocument(GetXMLData.java:162)
    2014/05/20 17:31:50 - Get data from XML.0 - ... 6 more


    Aside from embedding this in a job and passing an identifier for the current row being processed, is there another way
    to stop this from aborting the transformation execution? You'd think a parse error would be something it would expect. And yes,
    I'm using this in a strange way. And no, I don't want to check template conformance.

  2. #2
    Join Date
    Apr 2012
    Posts
    253

    I'm probably just going to implement my own Java class that uses SAX.
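
    A minimal sketch of what such a class could look like, assuming the goal is only to keep one bad document from stopping the whole run. The class name TolerantSaxCheck and the method parseError are hypothetical and not part of PDI; only the JDK's built-in SAX parser is used. A parse failure is returned as a value, so the caller can route that row to an error stream instead of aborting.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of PDI: parses one document and reports
// a parse failure as a value instead of letting the exception propagate.
class TolerantSaxCheck {

    // Returns null if the text is well-formed XML, otherwise a short error description.
    static String parseError(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            // DefaultHandler ignores content events and rethrows fatal errors,
            // so a malformed document ends up in the catch blocks below.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // Well-formedness problems (such as an undeclared entity) land here.
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O problems.
            return e.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseError("<a>ok</a>"));
        System.out.println(parseError("<a>next &raquo;</a>"));
    }
}
```

    The first call in main should report no error and the second should describe the undeclared entity. A real implementation would replace DefaultHandler with a handler that collects the fields of interest.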

  3. #3
    Join Date
    Nov 1999
    Posts
    9,729

    PDI has a StAX step as well; you might find it easy to work with.

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    The problem here is that GDFX (Get Data From XML) and StAX both need XML input, so first of all we must convert the HTML page to XML.
    XML predefines only five entities (lt, gt, amp, apos and quot), so no XML parser knows HTML entities such as raquo unless they are declared.
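
    To illustrate the point about declarations, here is a small sketch using the JDK's DOM parser. The class name EntityDeclarationDemo and the method rootText are made up for this example. The same text that fails on its own parses once raquo is declared in an internal DTD subset. This only demonstrates the parser behaviour; it is not a way to make an arbitrary HTML page parse as XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: shows that an entity must be declared before use.
class EntityDeclarationDemo {

    // Parses the text and returns the text content of the root element.
    // Any parse failure is rethrown as IllegalArgumentException.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("not parseable as XML: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // The internal DTD subset declares raquo as character U+00BB.
        String declared = "<!DOCTYPE a [<!ENTITY raquo \"&#187;\">]><a>next &raquo;</a>";
        System.out.println(rootText(declared)); // prints "next »"

        try {
            rootText("<a>next &raquo;</a>"); // no declaration, so parsing fails
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```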

  5. #5
    Join Date
    Nov 1999
    Posts
    9,729

    There are some tools out there to do HTML to XML conversion.
    Parsing HTML with an XML parser, on the other hand, is not a good idea, since HTML is not XML.
