Hitachi Vantara Pentaho Community Forums

Thread: xpath select does not appear to exclude elements

  1. #1
    Join Date
    Oct 2016
    Posts
    5

    xpath select does not appear to exclude elements

    Hello,

    I am having trouble getting XPath to work in Pentaho 4 running on Windows 7.
    My goal is to select each F node but exclude the child F nodes. What I am trying
    to get to work in Pentaho works on the XPath tester website below.

    I want to be able to select the first F node and all elements except for child F nodes.
    In my real XML I have child nodes that have the same element name as parent nodes.
    I expect to get back f1 and f1a for the top level F. For the second F I expect to get back f2
    and the final F to get back f3.

    My structure looks like
    Code:
    Top   
         Next
         F
          f1
          f1a
               F
                f2
                 F
                   f3

    I have tried a LOT of different variations of XPath 1.0 syntax, but none of them does what I want.
    I can only get it to pull the first element below F. In other words, it returns the 'f1' element
    but nothing else. However, the website below handles the sample XML below correctly.
    But when I add the same thing in Pentaho I just get back 'f1'.
    I expect to get back f1, f1a, and gotit for the top level F.

    Pentaho step looks like this with the xml below as input
    Loop Xpath: /Top
    Fields: Name->MyElement, Xpath-> //Next/F/*[not(descendant::F)]
    Element: Node
    Result type: Single Node

    Thank you for any guidance,
    Oring

    -------
    http://chris.photobooks.com/xml/default.htm (or use the xpath tester of your choice)

    //Top/Next/F/*[not(descendant::F)]


    Code:
    <Top>
        <Next>
           <F>
               <f1>f1</f1>
               <f1a>f1a</f1a>
               <F>
                   <f2>f2</f2>
                   <F>
                      <f3>f3</f3>
                   </F>
               </F>
               <gotit>gotit</gotit>
           </F>
        </Next>
    </Top>
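
A minimal sketch of what the expression above selects on this sample. It uses Python's standard `xml.etree.ElementTree`, not Pentaho's XPath engine. ElementTree's XPath subset has no `not()` function or `descendant::` axis, so the predicate is emulated in plain Python. The helper name `children_without_f_descendant` is made up for illustration. The last line assumes, without confirmation from this thread, that a field with result type Single Node keeps only the first match per loop row. If that holds, it would explain seeing only 'f1'.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Top>
    <Next>
       <F>
           <f1>f1</f1>
           <f1a>f1a</f1a>
           <F>
               <f2>f2</f2>
               <F>
                  <f3>f3</f3>
               </F>
           </F>
           <gotit>gotit</gotit>
       </F>
    </Next>
</Top>"""


def children_without_f_descendant(parent):
    """Emulate parent/*[not(descendant::F)]: child elements with no F below them."""
    return [child for child in parent if child.find(".//F") is None]


root = ET.fromstring(SAMPLE)       # root is the <Top> element
top_f = root.find("./Next/F")      # the outermost F, i.e. /Top/Next/F

# All nodes the expression matches for the outermost F.
matches = [el.tag for el in children_without_f_descendant(top_f)]
print(matches)

# Assumption: a single-value field would keep only the first match.
first_only = matches[:1]
print(first_only)
```

The nested F is dropped here only because it contains another F, not because it is itself named F.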

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    I don't know how others do it, but I read a text sequentially.

    What a ride:

    Originally posted by oring:
        My goal is to select each F node but exclude the child F nodes.
    Poor choice of an input sample, then: There's only one F node qualifying.

    Originally posted by oring:
        I want to be able to select the first F node and all elements except for child F nodes.
    Try not to confuse your readers - first or each, what is it now?

    Originally posted by oring:
        I expect to get back f1 and f1a for the top level F.
    You should expect gotit, too.

    Originally posted by oring:
        For the second F I expect to get back f2 and the final F to get back f3.
    Earlier you said you want to ignore F child nodes. Strange.

    Originally posted by oring:
        But when I add the same thing in Pentaho I just get back 'f1'.
    You seem to be making a mistake somewhere, because PDI 4.4 gives me the same results as the online tester you use.

    Originally posted by oring:
        I expect to get back f1, f1a, and gotit for the top level F.
    Ah, now you expect gotit ...

    Originally posted by oring:
        Pentaho step looks like this
    Just zip and attach, saves everybody's time.

    Originally posted by oring:
        //Top/Next/F/*[not(descendant::F)]
    I did what you should have done: Build and attach a test case.

    You're welcome.
    Attached Files
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Oct 2016
    Posts
    5

    Hi marabu,

    Thank you for your helpful and courteous reply.
    I would love to attach a transform but our startup company does not allow us to upload files to websites.

    The only thing I think I should have made clearer was that I am NOT trying to get the first F only.
    I want a "single" step that extracts each F node (except for child F nodes).


    So the xpath in a "single" step in this case might look like:
    //Next/F/*[not(descendant::F)]
    //Next/F/F/*[not(descendant::F)]
    //Next/F/F/F

    The first xpath should return
    Code:
          <f1>f1</f1>
          <f1a>f1a</f1a>
          <gotit>gotit</gotit>
    The second xpath should return
    Code:
          <f2>f2</f2>
    The third xpath should return
    Code:
         <f3>f3</f3>
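
One detail about the second expression, sketched with Python's standard `xml.etree.ElementTree` as a stand-in for an XPath 1.0 engine, with the predicates emulated in plain Python because ElementTree does not support them. The predicate `not(descendant::F)` tests what lies below a child, not the child's own name. The innermost F has no F below it, so it passes the test and comes back alongside f2. A predicate such as `not(self::F)` tests the name instead. The variable names here are illustrative only.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<Top><Next><F><f1>f1</f1><f1a>f1a</f1a>"
    "<F><f2>f2</f2><F><f3>f3</f3></F></F>"
    "<gotit>gotit</gotit></F></Next></Top>"
)

root = ET.fromstring(SAMPLE)
second_f = root.find("./Next/F/F")   # the second-level F

# Emulates //Next/F/F/*[not(descendant::F)]: children with no F below them.
no_f_below = [c.tag for c in second_f if c.find(".//F") is None]

# Emulates //Next/F/F/*[not(self::F)]: children that are not themselves named F.
not_f_itself = [c.tag for c in second_f if c.tag != "F"]

print(no_f_below)
print(not_f_itself)
```

On this sample the first list contains both f2 and the innermost F, while the second contains only f2.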

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    So it's something like //F/*[name()!="F"] that you want to use as your Loop XPath?
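
A rough sketch of what that expression would select on the sample from post #1, again using Python's standard `xml.etree.ElementTree` rather than Pentaho. ElementTree has no `name()` function, so the name test is written in Python. The grouped list shows the per-F result the original poster describes. The flat list shows the same nodes in document order, which is how XPath 1.0 implementations typically return a node-set. That flat form does not record which F each node came from.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<Top><Next><F><f1>f1</f1><f1a>f1a</f1a>"
    "<F><f2>f2</f2><F><f3>f3</f3></F></F>"
    "<gotit>gotit</gotit></F></Next></Top>"
)

root = ET.fromstring(SAMPLE)

# Grouped view: for every F at any depth, keep the child elements not named F.
per_f = [[child.tag for child in f if child.tag != "F"] for f in root.iter("F")]

# Flat view in document order: every non-F element whose parent is an F.
parent_of = {child: parent for parent in root.iter() for child in parent}
flat = [
    el.tag
    for el in root.iter()
    if el.tag != "F" and el in parent_of and parent_of[el].tag == "F"
]

print(per_f)
print(flat)
```

Used as a loop expression, a selection like this would be expected to yield one row per element in the flat list, so gotit would arrive after f3 rather than next to f1a.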

  5. #5
    Join Date
    Oct 2016
    Posts
    5

    Originally posted by marabu:
        So it's something like //F/*[name()!="F"] you want to use as your Loop XPath ?
    This is a follow-up. I tried your suggestion, moving the expression into the Loop XPath section instead of the Fields section.
    I tried your version and a few others and still had no luck.

    So for now we will just have to parse each level separately.

    Thanks for the suggestions.

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.