Hitachi Vantara Pentaho Community Forums

Thread: Reading XML from nested folders

  1. #1
    Join Date
    Oct 2007

    Reading XML from nested folders


    I have just started using Kettle, and I primarily need to import XML into an RDBMS. The problem is that I have a huge number of XML files saved in a deeply nested folder structure.

    For example

    Drive C:
      Folder A1
        Folder B1
          Folder C1
          Folder C2
          Folder C3
        Folder B2
          Folder C1
          Folder C2
          Folder C3
        Folder B3
          Folder C1
          Folder C2
          Folder C3

    Ideally, I would like to pass C:\A1 as the file directory and read all the XML files beneath it. So far, however, I have only managed to read the XML files that sit directly under A1, which means I can't read all my files at once. Specifying the path to each file is not practical because I am working with hundreds of files. Is there a way to read all the XML files in the subdirectories with Kettle? I know that I could eventually write some code to merge them into one file, but it would be much better if I could do the same within Kettle. I would appreciate it if someone could give me some guidelines on this.

    Also, I have the Spoon guidelines, but is there an example that shows how to import XML into an RDBMS step by step? It would really help a lot.



  2. #2
    Join Date
    May 2006


    PDI is not that good with deeply nested XML (especially the optional parts). The XML Input Path plugin should offer some relief, but in the end you are still restricted by PDI's requirement that all output data you extract from an XML file has to be of the same format.

    There are no specific guidelines for moving XML to Oracle; it works just like any other component.

    For your directory problem, you can probably get away with shelling out: make a list of the files and process those.
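
    As a rough illustration of the "make a list of files" idea, here is a minimal Python sketch that walks a root folder such as C:\A1 and collects every .xml file at any depth. The function names (list_xml_files, write_file_list) and the output file name are made up for this example and are not part of Kettle. How the resulting list is fed into a transformation depends on your PDI version and steps, so that part is not shown.

```python
import os


def list_xml_files(root):
    """Return the sorted absolute paths of every .xml file under root, at any depth."""
    matches = []
    # os.walk visits root and all of its subdirectories recursively.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Case-insensitive match so that .XML files are picked up too.
            if name.lower().endswith(".xml"):
                matches.append(os.path.abspath(os.path.join(dirpath, name)))
    return sorted(matches)


def write_file_list(root, list_path):
    """Write one file path per line to list_path and return how many were written."""
    paths = list_xml_files(root)
    with open(list_path, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(path + "\n")
    return len(paths)


if __name__ == "__main__":
    # Hypothetical paths; adjust them to your own folder layout.
    count = write_file_list(r"C:\A1", "xml_files.txt")
    print(f"Wrote {count} file paths to xml_files.txt")
```

    The output is a plain text file with one path per line, which keeps the directory traversal outside the transformation itself.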

