Hitachi Vantara Pentaho Community Forums

Thread: Dynamically parsing complicated XML timeseries with values as attributes

  1. #1
    Join Date
    Aug 2017
    Posts
    5

    I have a very complicated XML that I am trying to parse.
    It is an hourly timeseries of the form:

    <Base>
    <OtherInfo>info1</OtherInfo>
    <Month>3</Month>
    <Data>
    <Date H1=val1 H2=val2...H23=val23>1</Date>
    <Date H1=val1 H2=val2...H23=val23>2</Date>
    <Date H1=val1 H2=val2...H23=val23>...</Date>
    <Date H1=val1 H2=val2...H23=val23>28</Date>
    </Data>
    </Base>
    <Base>
    <OtherInfo>info2</OtherInfo>
    <Month>4</Month>
    <Data>
    <Date H1=val1 H2=val2 ... H24=val24>1</Date>
    <Date H1=val1 H2=val2 ... H24=val24>2</Date>
    <Date H1=val1 H2=val2 ... H24=val24>...</Date>
    <Date H1=val1 H2=val2 ... H24=val24>30</Date>
    </Data>
    </Base>
    <Base>
    <OtherInfo>info3</OtherInfo>
    <Month>5</Month>
    <Data>
    <Date H1=val1 H2=val2 ... H24=val24>1</Date>
    <Date H1=val1 H2=val2 ... H24=val24>2</Date>
    <Date H1=val1 H2=val2 ... H24=val24>...</Date>
    <Date H1=val1 H2=val2 ... H24=val24>31</Date>
    </Data>
    </Base>

    I would like to get it in this form:

    OtherInfo Month Date Hour Value
    info1 3 1 1 val-3-1-1
    info1 3 1 2 val-3-1-2
    info1 3 ... ... ...
    info1 3 28 23 val-3-28-23
    info2 4 1 1 val-4-1-1
    info2 4 ... ... ...
    info2 4 30 24 val-4-30-24
    info3 5 1 1 val-5-1-1
    info3 5 ... ... ...
    info3 5 31 24 val-5-31-24

    What would be a dynamic way to read all of the values? The number of months and dates
    varies, as does the set of hour attributes (in March and October the number of hours changes because of DST).
    Is there a way to loop over all of the attributes dynamically, or must they all be hardcoded,
    which would then also cause problems in the DST months?

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    You don't have to worry about the varying dates and months; XPath will handle that for you.
    If you're willing to specify the hour attributes explicitly, it's as easy as putting a Row Normaliser step behind Get data from XML.
    Missing values will be null.
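
    Outside PDI, the same idea (a fixed list of hour attributes, unpivoted into one row per hour, with absent hours coming out as null) can be sketched in Python. This is only an illustration, not a transformation: the sample values, the <Root> wrapper, the HOURS list and the normalise function are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: values are quoted and a <Root> wrapper is added
# so that the fragment parses as a single well-formed document.
SAMPLE = """<Root>
<Base>
<OtherInfo>info1</OtherInfo>
<Month>3</Month>
<Data>
<Date H1="10" H2="11" H3="12">1</Date>
<Date H1="20" H2="21">2</Date>
</Data>
</Base>
</Root>"""

# Explicit attribute list, kept short here; a real feed would list every
# hour attribute that can occur (assumption).
HOURS = [f"H{i}" for i in range(1, 4)]


def normalise(xml_text, hours=HOURS):
    """Return one (OtherInfo, Month, Date, Hour, Value) tuple per listed hour."""
    rows = []
    root = ET.fromstring(xml_text)
    for base in root.iter("Base"):
        info = base.findtext("OtherInfo")
        month = int(base.findtext("Month"))
        for date in base.iterfind("Data/Date"):
            day = int(date.text)
            for name in hours:
                # .get() returns None when the attribute is absent,
                # which plays the role of the null value.
                rows.append((info, month, day, int(name[1:]), date.get(name)))
    return rows


rows = normalise(SAMPLE)
for row in rows:
    print(*row)
```

    In this sample the second date has no H3 attribute, so its third row carries None as the value.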
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Nov 2013
    Posts
    382

    You can start with something as simple as the attached example.

    I had to add double quotes to the values.
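
    For readers without the attachment, the fully dynamic variant (looping over whatever hour attributes each Date element carries) can be sketched in Python. This is not the attached transformation, only an illustration under assumptions: the sample values, the <Root> wrapper and the unpivot function are made up for the sketch. It also shows why the quotes matter: a strict XML parser rejects unquoted attribute values.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: values are quoted and a <Root> wrapper is added
# so that the fragment parses as a single well-formed document.
SAMPLE = """<Root>
<Base>
<OtherInfo>info1</OtherInfo>
<Month>3</Month>
<Data>
<Date H1="10" H2="11" H3="12">1</Date>
<Date H1="20" H2="21">2</Date>
</Data>
</Base>
</Root>"""


def unpivot(xml_text):
    """Return one (OtherInfo, Month, Date, Hour, Value) tuple per attribute found."""
    rows = []
    for base in ET.fromstring(xml_text).iter("Base"):
        info = base.findtext("OtherInfo")
        month = int(base.findtext("Month"))
        for date in base.iterfind("Data/Date"):
            day = int(date.text)
            # date.attrib holds only the attributes actually present, so a
            # shorter or longer DST day needs no special handling.
            hours = sorted(
                (int(name[1:]), value)
                for name, value in date.attrib.items()
                if name.startswith("H") and name[1:].isdigit()
            )
            for hour, value in hours:
                rows.append((info, month, day, hour, value))
    return rows


rows = unpivot(SAMPLE)
for row in rows:
    print(*row)
```

    Unlike an explicit attribute list, this produces no null rows: an hour that is absent from the XML is simply absent from the output.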
    Attached Files

  4. #4
    Join Date
    Aug 2017
    Posts
    5

    Thank you, marabu. Yes, I guess in the end this is similar to the last question you helped me with!

    And thank you, DepButi; your test solution works perfectly, as advertised.


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.