Hitachi Vantara Pentaho Community Forums
Results 1 to 7 of 7

Thread: Get Data from XML only reads 1 element if multiple elements with same name exist

  1. #1
    Join Date
    Apr 2015
    Posts
    6

    I need to read an XML catalog in which each document contains several elements with the same name (the <dept> values in the snippet below).
    With the "Get Data from XML" step, only one of those elements is read and passed on to the next step.
    I need to consolidate the multi-valued elements into a Java ArrayList, but I can't do that if only 1 of n elements is read.
    The XML validates.
    Is this a gap in functionality?

    Catalog snippet
    Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <catalog>
        <document name="http://www.somesite.com/6730005355">
            <Title>Cross Strap Tinfoil</Title>
            <ORIGINAL_ID>67300053</ORIGINAL_ID>
            <Stock_Quantity>972</Stock_Quantity>
            <Sizes>38|39|40|41</Sizes>
            <dept>All New In Maine</dept>
            <dept>Loved By Hollie</dept>
            <dept>Shoes</dept>
            <dept>xmlfeed</dept>
            <dept>Accessories</dept>
            <dept/>
            <CURRENT_PRICE_US>97.00</CURRENT_PRICE_US>
            <ORIGINAL_PRICE_US>97.00</ORIGINAL_PRICE_US>
        </document>
        <document name="http://www.somesite.com/3710065417">
            <Title>Bramble Bee</Title>
            <ORIGINAL_ID>37123465417</ORIGINAL_ID>
            <Stock_Quantity>46</Stock_Quantity>
            <Sizes>8|12|16</Sizes>
            <dept>All New In</dept>
            <dept>Tops</dept>
            <dept>Clothing</dept>
            <dept>xmlfeed</dept>
            <dept/>
            <dept/>
            <dept/>
            <dept/>
            <dept/>
            <CURRENT_PRICE_US>55.00</CURRENT_PRICE_US>
            <ORIGINAL_PRICE_US>55.00</ORIGINAL_PRICE_US>
        </document>
    </catalog>

  2. #2
    Join Date
    Apr 2008
    Posts
    4,689

    You can set the repeating element (the Loop XPath) to /catalog/document/dept and reference the other fields with XPath expressions relative to it.


    Code:
    Name        XPath                 Element    Result type  Type    Format  Length  Precision  Currency  Decimal  Group  Trim type  Repeat
    DocName     ../@name              Attribute  Value of     String                                                       none       N
    Title       ../Title              Node       Value of     String                                                       none       N
    OriginalID  ../ORIGINAL_ID        Node       Value of     String                                                       none       N
    Stock       ../Stock_Quantity     Node       Value of     String                                                       none       N
    Sizes       ../Sizes              Node       Value of     String                                                       none       N
    Dept        .                     Node       Value of     String                                                       none       N
    Current     ../CURRENT_PRICE_US   Node       Value of     String                                                       none       N
    Orig        ../ORIGINAL_PRICE_US  Node       Value of     String                                                       none       N
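
    With the loop set on dept, the step should emit one row per <dept> element, with the parent document's fields repeated on every row. The following is only a rough sketch of that row set, emulated outside PDI with Python's standard xml.etree.ElementTree. The function name rows_per_dept and the trimmed-down sample are illustrative assumptions, not part of Pentaho.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the catalog from the first post (assumption:
# only the fields needed to show the row layout are kept).
SAMPLE = """<catalog>
    <document name="http://www.somesite.com/6730005355">
        <Title>Cross Strap Tinfoil</Title>
        <dept>Shoes</dept>
        <dept>Accessories</dept>
        <dept/>
    </document>
    <document name="http://www.somesite.com/3710065417">
        <Title>Bramble Bee</Title>
        <dept>Tops</dept>
        <dept/>
    </document>
</catalog>"""


def rows_per_dept(xml_text):
    """Emulate looping on /catalog/document/dept with ../ references."""
    rows = []
    root = ET.fromstring(xml_text)
    for doc in root.findall("document"):
        for dept in doc.findall("dept"):
            rows.append({
                "DocName": doc.get("name"),         # ../@name
                "Title": doc.findtext("Title"),     # ../Title
                "Dept": (dept.text or "").strip(),  # .
            })
    return rows


ROWS = rows_per_dept(SAMPLE)
for row in ROWS:
    print(row)
```

    Note that the empty <dept/> elements also produce rows (with an empty Dept value), so they would probably need to be filtered out in a later step.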

  3. #3
    Join Date
    Apr 2015
    Posts
    6

    This doesn't seem to work when more than one element is multi-valued. Is there another method to read this type of XML schema?

  4. #4
    Join Date
    Apr 2008
    Posts
    4,689

    For this particular structure, the only other option I can think of is to use the XML Input Stream (StAX) step.

    However, if your structure were a little bit different:

    Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <catalog>
        <document name="http://www.somesite.com/6730005355">
            <Title>Cross Strap Tinfoil</Title>
            <ORIGINAL_ID>67300053</ORIGINAL_ID>
            <Stock_Quantity>972</Stock_Quantity>
            <Sizes>38|39|40|41</Sizes>
            <depts>
                <dept>All New In Maine</dept>
                <dept>Loved By Hollie</dept>
                <dept>Shoes</dept>
                <dept>xmlfeed</dept>
                <dept>Accessories</dept>
            </depts>
            <others>
                <other>All New In Maine</other>
                <other>Loved By Hollie</other>
                <other>Shoes</other>
                <other>xmlfeed</other>
                <other>Accessories</other>
            </others>
            <CURRENT_PRICE_US>97.00</CURRENT_PRICE_US>
            <ORIGINAL_PRICE_US>97.00</ORIGINAL_PRICE_US>
        </document>
        <document name="http://www.somesite.com/3710065417">
            <Title>Bramble Bee</Title>
            <ORIGINAL_ID>37123465417</ORIGINAL_ID>
            <Stock_Quantity>46</Stock_Quantity>
            <Sizes>8|12|16</Sizes>
            <depts>
                <dept>All New In</dept>
                <dept>Tops</dept>
                <dept>Clothing</dept>
                <dept>xmlfeed</dept>
            </depts>
            <others>
                <other>All New In</other>
                <other>Tops</other>
                <other>Clothing</other>
                <other>xmlfeed</other>
            </others>
            <CURRENT_PRICE_US>55.00</CURRENT_PRICE_US>
            <ORIGINAL_PRICE_US>55.00</ORIGINAL_PRICE_US>
        </document>
    </catalog>
    Then you could pass the <depts> and <others> fragments through to subsequent XML Input steps.
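
    The two-pass idea for the grouped structure can be sketched outside PDI. The following is a minimal illustration in Python's standard xml.etree.ElementTree, assuming the restructured XML above. The function names first_pass and values_from_fragment, the field names depts_xml and others_xml, and the shortened sample are hypothetical and only stand in for the two XML steps.

```python
import xml.etree.ElementTree as ET

# Shortened version of the restructured catalog (assumption: one document,
# fewer values, just enough to show the two passes).
SAMPLE = """<catalog>
    <document name="http://www.somesite.com/6730005355">
        <Title>Cross Strap Tinfoil</Title>
        <depts>
            <dept>Shoes</dept>
            <dept>Accessories</dept>
        </depts>
        <others>
            <other>xmlfeed</other>
        </others>
    </document>
</catalog>"""


def first_pass(xml_text):
    """One row per document; the grouped elements stay serialized XML."""
    rows = []
    for doc in ET.fromstring(xml_text).findall("document"):
        rows.append({
            "DocName": doc.get("name"),
            "depts_xml": ET.tostring(doc.find("depts"), encoding="unicode"),
            "others_xml": ET.tostring(doc.find("others"), encoding="unicode"),
        })
    return rows


def values_from_fragment(fragment, tag):
    """Second pass: parse one fragment and return its child values."""
    return [el.text for el in ET.fromstring(fragment).findall(tag)]


ROW = first_pass(SAMPLE)[0]
DEPTS = values_from_fragment(ROW["depts_xml"], "dept")
OTHERS = values_from_fragment(ROW["others_xml"], "other")
print(DEPTS, OTHERS)
```

    Because each group is parsed on its own, the dept and other values never multiply against each other, which is the problem with looping on a single repeating element.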

  5. #5
    Join Date
    Apr 2015
    Posts
    6

    Unfortunately I don't have the luxury of modifying the incoming XML schema.
    Are there file handlers in the Modified Java Script Value/Rhino module that can be used to manually parse an XML file?

  6. #6
    Join Date
    Apr 2008
    Posts
    4,689

    Then you are probably going to have to investigate the StAX step.
    There might be ways of doing it with positional filters, if you know that there are always 9 dept values, but I'm not 100% certain how to do that.
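
    One way positional filters could work is to loop on /catalog/document and address each dept by its index, e.g. dept[1], dept[2], and so on. A minimal sketch of that idea, written in Python's standard xml.etree.ElementTree rather than in PDI, and assuming a known upper bound on the number of dept elements (3 here for brevity, where the thread mentions 9). The function name one_row_per_document and the dept_N field names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened catalog (assumption: fewer dept elements than in the thread).
SAMPLE = """<catalog>
    <document name="http://www.somesite.com/6730005355">
        <dept>Shoes</dept>
        <dept>Accessories</dept>
        <dept/>
    </document>
    <document name="http://www.somesite.com/3710065417">
        <dept>Tops</dept>
        <dept/>
    </document>
</catalog>"""

MAX_DEPTS = 3  # assumed upper bound on dept elements per document


def one_row_per_document(xml_text, max_depts):
    """Emit one row per document with one column per dept position."""
    rows = []
    for doc in ET.fromstring(xml_text).findall("document"):
        row = {"DocName": doc.get("name")}
        for i in range(1, max_depts + 1):
            # dept[i] is a 1-based XPath positional predicate.
            # findtext returns "" for an empty element and None if absent.
            row[f"dept_{i}"] = doc.findtext(f"dept[{i}]")
        rows.append(row)
    return rows


ROWS = one_row_per_document(SAMPLE, MAX_DEPTS)
for row in ROWS:
    print(row)
```

    This keeps one row per document, but it only works while the number of dept elements stays within the assumed bound. In the sample from the first post the two documents carry 6 and 9 dept elements respectively.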

  7. #7
    Join Date
    Jun 2012
    Posts
    5,534

    The XML sample is "flat XML", i.e. there are no element groups below the /catalog/document node.
    In this particular case we can convert each document to a key-value list, which we can then easily process further.
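
    The attached transformation is not reproduced here. As a rough sketch of the key-value idea only, written in Python's standard xml.etree.ElementTree rather than in PDI: every child element of a document becomes one (document, key, value) row, and the rows are then grouped so that repeated keys such as dept end up in a single list. The function names key_value_rows and group_values and the shortened sample are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Shortened catalog (assumption: one document, a few fields).
SAMPLE = """<catalog>
    <document name="http://www.somesite.com/6730005355">
        <Title>Cross Strap Tinfoil</Title>
        <Sizes>38|39|40|41</Sizes>
        <dept>Shoes</dept>
        <dept>Accessories</dept>
        <dept/>
    </document>
</catalog>"""


def key_value_rows(xml_text):
    """One (document, key, value) row per child element of each document."""
    rows = []
    for doc in ET.fromstring(xml_text).findall("document"):
        for child in doc:
            value = (child.text or "").strip()
            rows.append((doc.get("name"), child.tag, value))
    return rows


def group_values(rows):
    """Collect the non-empty values into one list per (document, key)."""
    grouped = defaultdict(list)
    for doc_name, key, value in rows:
        if value:  # skip empty elements such as <dept/>
            grouped[(doc_name, key)].append(value)
    return dict(grouped)


KV_ROWS = key_value_rows(SAMPLE)
GROUPED = group_values(KV_ROWS)
DOC = "http://www.somesite.com/6730005355"
print(GROUPED[(DOC, "dept")])
```

    The grouped lists are the equivalent of the ArrayList asked for in the first post, and the approach does not depend on how many different element names repeat.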


    PS:
    I'm so embarrassed: I attached a GDFX (Get Data from XML) step without configuration. Silly me.


    On the other hand, I wonder what the two people who had already downloaded it thought.
    Last edited by marabu; 04-11-2017 at 02:18 PM. Reason: replacement of attached file
    So long, and thanks for all the fish.


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.