Hitachi Vantara Pentaho Community Forums
Results 1 to 4 of 4

Thread: How to load multiple xmls in oracle database using pentaho.

  1. #1
    Join Date
    Nov 2015
    Posts
    3

    Default How to load multiple xmls in oracle database using pentaho.

    We are using xml input stream as source & table output as destination to load single xml to oracle database. But we wanted to load multiple xmls at a time. Can some one help ? How to loop the transformation to load multiple xmls?
    Or is there any alternate way any job etc to do the bulk loading of xmls in database?

  2. #2
    Join Date
    Dec 2009
    Posts
    332

    Default

    Is the Oracle data stored as LOB, XMLType or are you parsing it?

  3. #3
    Join Date
    Nov 2015
    Posts
    3

    Default

    Hi Khelms,

    The oracle data stored is not LOB.The XML stream input source parse the XML & The table out put load it in oracle table.

  4. #4
    Join Date
    Dec 2009
    Posts
    332

    Default

    We perform the same process with tolerable performance on XML files that average 40,000 nodes per file - all of which are part of a single transaction and have a complicated hierarchy. This is accomplished in two phases. The first finds all the xml files in the appropriate location using Get File Names (converts the file name somewhat); looks the file up to see if we have already parsed it or not to avoid duplicates; parses out much of the crucial data using xpath Getdata from XML; then outputs the results to Oracle as shown in this screen shot:
    Name:  XmlInput.jpg
Views: 101
Size:  8.3 KB
    This process will work just fine for any simple XML and it may be all you need if your XML is relatively flat. It is also spectacularly fast and processed 300,000 files a second for us when we tested it by itself.

    Our XML data however is freakishly messy - so we take the next step by moving the pre-processed files to a folder mounted in a defined Oracle Directory and calling a database package to parse the file using XMLType cursors directly from the directory. These cursors look something like this:
    CURSOR XML_CURS (C_FILE_NAME XML_FILES.FILENAME%TYPE) IS
    SELECT FND_NODES.*
    FROM XMLTABLE('/RootXMLNode'
    PASSING XMLTYPE(BFILENAME('ORACLEXMLDIR', C_FILE_NAME), NLS_CHARSET_ID('AL32UTF8'))
    COLUMNS
    HEADER_XML XMLTYPE PATH 'Header',
    BODY_XML XMLTYPE PATH 'Body',
    SPEC_VERSION VARCHAR2(100) PATH '@specVersion'
    ) FND_NODES
    This method parses roughly 600 files a second (with minimum of 40,000 nodes each).

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.