Hitachi Vantara Pentaho Community Forums

Thread: parse xml-date to date field

  1. #1
    Join Date
    Oct 2010
    Posts
    13

    parse xml-date to date field

    When I process a date from an XML file, I use the "Get data from XML" step.
    I create a date field with the format yyyy-MM-dd'T'HH:mm:ss'+'HH:mm to read a value such as 2010-04-09T22:55:45+02:00.

    But when I try to read a value such as 2010-04-09T22:48:09.127+02:00 with the same format, I get an error:
    Code:
    2011/03/04 14:43:47 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : _AA_LDC String : couldn't convert string [2010-03-12T10:50:01.22+01:00] to a date using format [yyyy-MM-dd'T'HH:mm:ss'+'HH:mm]
    2011/03/04 14:43:47 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unparseable date: "2010-03-12T10:50:01.22+01:00"
    2011/03/04 14:43:47 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unexpected error : 
    2011/03/04 14:43:47 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.core.exception.KettleException: 
    2011/03/04 14:43:47 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unable to read row from XML file
    The number of digits after the dot varies.
    Does anyone know how to handle this kind of date? And how should I handle the +01:00 at the end?

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    You have a timezone indicator at the end.
    Date format is:

    Code:
    yyyy-MM-dd'T'HH:mm:ss.SSZ

  3. #3
    Join Date
    Oct 2010
    Posts
    13

    Does this also handle 3 digits after the dot, or should I add an extra "S"?
    My guess is that [yyyy-MM-dd'T'HH:mm:ss.SSSZ] should handle 1, 2 or 3 digits after the dot.

    When I use your format I get the following error.
    Code:
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.core.exception.KettleValueException: 
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : _AA_LDC String : couldn't convert string [2010-03-12T10:50:01.22+01:00] to a date using format [yyyy-MM-dd'T'HH:mm:ss.SSZ]
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unparseable date: "2010-03-12T10:50:01.22+01:00"
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unexpected error : 
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.core.exception.KettleException: 
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unable to read row from XML file
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : 
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.core.exception.KettleValueException: 
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : _AA_LDC String : couldn't convert string [2010-03-12T10:50:01.22+01:00] to a date using format [yyyy-MM-dd'T'HH:mm:ss.SSZ]
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unparseable date: "2010-03-12T10:50:01.22+01:00"
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : 
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : 
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.trans.steps.getxmldata.GetXMLData.getXMLRowPutRowWithErrorhandling(GetXMLData.java:720)
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.trans.steps.getxmldata.GetXMLData.getXMLRow(GetXMLData.java:691)
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.trans.steps.getxmldata.GetXMLData.processRow(GetXMLData.java:648)
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.trans.step.RunThread.run(RunThread.java:40)
    2011/03/07 08:21:02 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : java.lang.Thread.run(Thread.java:662)
    2011/03/07 08:21:02 - Get data from XML.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=1)
    The other day I ran into a problem where the parser in the "Text file input" step couldn't convert a typical date format from text to a date. Could this be related?
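
On the question of 1, 2 or 3 digits after the dot, here is a minimal sketch. It assumes Kettle hands the format string to java.text.SimpleDateFormat, which the "Unparseable date" message suggests but which is not confirmed in this thread. The class and method names are hypothetical. It shows that the S field is read as a plain integer number of milliseconds, so ".22" becomes 22 ms, not 220 ms, however many S letters the pattern has.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of Kettle.
final class FractionDigits {

    // Parses text with the given pattern and returns the millisecond part of the result.
    static int millisOf(String pattern, String text) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            // Dates after 1970 have a positive epoch value, so % 1000 is the millisecond field.
            return (int) (fmt.parse(text).getTime() % 1000);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd'T'HH:mm:ss.SSS";
        System.out.println(millisOf(pattern, "2010-04-09T22:48:09.127")); // prints 127
        System.out.println(millisOf(pattern, "2010-03-12T10:50:01.22"));  // prints 22
    }
}
```

If the source writes true decimal fractions of a second, the shorter values would need to be padded to three digits before parsing to keep the milliseconds accurate.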

  4. #4
    Join Date
    Oct 2010
    Posts
    13

    Without the Z, the XML file now parses. But adding the Z produces the error shown in the previous post.

  5. #5
    Join Date
    Nov 1999
    Posts
    9,729

    The problem is that Java can parse +0100 or -0100 as a timezone offset, but not +01:00 or -01:00.
    You could always do a string replace (:00$ --> 00) and then retry the date conversion in a "Select Values" step.

    http://download.oracle.com/javase/1....ateFormat.html
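
The string-replace workaround can be sketched in plain Java as follows. This is only an illustration under assumptions: the class and method names are hypothetical, the regular expression is widened from the `:00$` suggested above to any hh:mm offset, and in Kettle itself the replace would be done in a transformation step rather than in Java code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo class, not part of Kettle.
final class OffsetColonFix {

    // Turns a trailing "+01:00" or "-01:00" into "+0100" or "-0100".
    static String stripOffsetColon(String text) {
        return text.replaceFirst("([+-]\\d{2}):(\\d{2})$", "$1$2");
    }

    // Removes the colon from the offset, then parses with the RFC 822 style Z letter.
    static Date parse(String xmlDate) {
        SimpleDateFormat fmt =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ", Locale.ROOT);
        try {
            return fmt.parse(stripOffsetColon(xmlDate));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + xmlDate, e);
        }
    }

    public static void main(String[] args) {
        String raw = "2010-04-09T22:48:09.127+02:00";
        System.out.println(stripOffsetColon(raw)); // prints 2010-04-09T22:48:09.127+0200
        System.out.println(parse(raw).getTime());  // epoch milliseconds of that instant
    }
}
```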

  6. #6
    Join Date
    Oct 2010
    Posts
    13

    The URL you refer to also mentions the lowercase z as an option for this. That would be my first choice.
    Unfortunately...

    Code:
    2011/03/07 11:18:57 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.core.exception.KettleValueException: 
    2011/03/07 11:18:57 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : _AA_LDC String : couldn't convert string [2010-03-12T10:50:01.22+01:00] to a date using format [yyyy-MM-dd'T'HH:mm:ss.SSz]
    2011/03/07 11:18:57 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unparseable date: "2010-03-12T10:50:01.22+01:00"
    2011/03/07 11:18:57 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unexpected error : 
    2011/03/07 11:18:57 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : org.pentaho.di.core.exception.KettleException: 
    2011/03/07 11:18:57 - Get data from XML.0 - ERROR (version 4.1.2-GA, build 14760 from 2011-01-26 13.31.23 by buildguy) : Unable to read row from XML file
    And you are right: when I remove the ":" and put the Z in the format, the date is parsed.

  7. #7
    Join Date
    Oct 2010
    Posts
    13

    I am sorry, I should have read the text more closely. The z option covers names or abbreviations of the timezone.

    Thank you for your help.
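
For later readers: from Java 7 onwards, java.text.SimpleDateFormat also has the pattern letter X for ISO 8601 offsets, and XXX accepts the +01:00 form with the colon. The sketch below assumes the Java runtime under Kettle is version 7 or later, which is not stated in this thread, and that Kettle passes the pattern through unchanged. The class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo class, not part of Kettle. Needs Java 7 or later for the X letter.
final class Iso8601Offset {

    // Parses an XML-style timestamp whose offset is written with a colon, or as "Z".
    static Date parse(String xmlDate) {
        SimpleDateFormat fmt =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", Locale.ROOT);
        try {
            return fmt.parse(xmlDate);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + xmlDate, e);
        }
    }

    public static void main(String[] args) {
        Date local = parse("2010-04-09T22:48:09.127+02:00");
        Date utc = parse("2010-04-09T20:48:09.127Z");
        System.out.println(local.equals(utc)); // prints true
    }
}
```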
