Hitachi Vantara Pentaho Community Forums

Thread: Insert XML to MySQL (same attributes)

  1. #1
    Join Date
    Oct 2015
    Posts
    3


    Hello, good evening!

    I am trying to build a simple recommender system on the dblp (dblp.uni-trier.de) database, using the article titles and authors.

    My XML file looks something like this:
    [Attached image: xml.jpg]

    I want to insert the data from my XML file into my MySQL database.
    Note that the <author> element may repeat inside an <article> node. I am loading the data into MySQL with the Pentaho tool:

    [Attached image: pentaho.jpg]

    The result looks like this:

    [Attached image: dblpp.jpg]

    You can see that only one <author> is ever loaded (the first one is selected).
    If possible, I want to duplicate the row for each additional <author>. Articles can have 1, 2, 3 or more authors, so I want 1, 2, 3 or more rows for the same article, each with a different <author> name.

    Another option would be to merge the <author> (input) names into a single <author> (output) field, something like (author1, author2, author3).

    Please help me with that,
    Thank you in advance.

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534


    Don't aim at the article element with your chosen XPath Loop Expression.
    If you want to extract rows based on the repeatable author element, use something like //article/author to loop over the document, and adjust the Field XPath Expressions accordingly.
    The author XPath then becomes . and the title XPath becomes ../title
    If you need further help, consider attaching sample data rather than images.
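
    To illustrate the row-per-author idea outside of Pentaho, here is a minimal Python sketch. It is not the Pentaho step itself: it uses the standard library's ElementTree, a trimmed copy of two articles from the sample posted later in this thread, and a hypothetical helper name, rows_per_author. ElementTree cannot step from a matched author back up to its parent the way ../title does, so the sketch walks each article and then its authors, which yields the same rows.

```python
import xml.etree.ElementTree as ET

# Trimmed sample (assumption: only author and title are kept for brevity).
SAMPLE = """<dblp>
<article mdate="2011-01-11" key="journals/acta/Saxena96">
<author>Sanjeev Saxena</author>
<title>Parallel Integer Sorting and Simulation Amongst CRCW Models.</title>
</article>
<article mdate="2011-01-11" key="journals/acta/GoodmanS83">
<author>Nathan Goodman</author>
<author>Oded Shmueli</author>
<title>NP-complete Problems Simplified on Tree Schemas.</title>
</article>
</dblp>"""


def rows_per_author(xml_text):
    """Return one (author, title) tuple per <author> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for article in root.iter("article"):
        # The title is shared by every author row of this article.
        title = article.findtext("title")
        for author in article.findall("author"):
            rows.append((author.text, title))
    return rows


rows = rows_per_author(SAMPLE)
for row in rows:
    print(row)
```

    The article with two authors produces two rows with the same title, which is the effect of looping on //article/author instead of on the article element.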
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Oct 2015
    Posts
    3


    Hello Marabu, first, thanks for your answer, and sorry for the late reply.

    I tried to use //article/author, but the Get Fields button doesn't return anything.

    Here is a sample of the XML file:

    Code:
    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE dblp SYSTEM "dblp.dtd">
    <dblp>
    <article mdate="2011-01-11" key="journals/acta/Saxena96">
    <author>Sanjeev Saxena</author>
    <title>Parallel Integer Sorting and Simulation Amongst CRCW Models.</title>
    <pages>607-619</pages>
    <year>1996</year>
    <volume>33</volume>
    <journal>Acta Inf.</journal>
    <number>7</number>
    <url>db/journals/acta/acta33.html#Saxena96</url>
    <ee>http://dx.doi.org/10.1007/BF03036466</ee>
    </article>
    <article mdate="2011-01-11" key="journals/acta/Simon83">
    <author>Hans-Ulrich Simon</author>
    <title>Pattern Matching in Trees and Nets.</title>
    <pages>227-248</pages>
    <year>1983</year>
    <volume>20</volume>
    <journal>Acta Inf.</journal>
    <url>db/journals/acta/acta20.html#Simon83</url>
    <ee>http://dx.doi.org/10.1007/BF01257084</ee>
    </article>
    <article mdate="2011-01-11" key="journals/acta/GoodmanS83">
    <author>Nathan Goodman</author>
    <author>Oded Shmueli</author>
    <title>NP-complete Problems Simplified on Tree Schemas.</title>
    <pages>171-178</pages>
    <year>1983</year>
    <volume>20</volume>
    <journal>Acta Inf.</journal>
    <url>db/journals/acta/acta20.html#GoodmanS83</url>
    <ee>http://dx.doi.org/10.1007/BF00289414</ee>
    </article>
    <article mdate="2011-01-11" key="journals/acta/Blum82">
    <author>Norbert Blum</author>
    <title>On the Power of Chain Rules in Context Free Grammars.</title>
    <pages>425-433</pages>
    <year>1982</year>
    <volume>17</volume>
    <journal>Acta Inf.</journal>
    <url>db/journals/acta/acta17.html#Blum82</url>
    <ee>http://dx.doi.org/10.1007/BF00264161</ee>
    </article>
    <article mdate="2013-11-28" key="journals/acta/Schonhage77"> 
    <author>Arnold Sch&ouml;nhage</author>
    <title>Schnelle Multiplikation von Polynomen &uuml;ber K&ouml;rpern der Charakteristik 2.</title>
    <journal>Acta Inf.</journal>
    <volume>7</volume> 
    <year>1977</year> 
    <pages>395-398</pages>
    <url>db/journals/acta/acta7.html#Schonhage77</url>
    <ee>http://dx.doi.org/10.1007/BF00289470</ee>
    </article>
    </dblp>
    I have also attached the Kettle file, DataIntegration.ktr.

    What do you think I can do to fix it?

    Thanks in advance.

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534


    The "Get Fields" action isn't appropriate here, since the author element doesn't contain any child elements. Enter the field XPath expressions by hand instead.
    Attached Files
    So long, and thanks for all the fish.
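
    A quick way to see the point about child elements, sketched in Python with the standard library's ElementTree rather than in Pentaho. The assumption here, following the reply above, is that field discovery relies on the child nodes of the looped element. The sample is a trimmed copy of one article from this thread.

```python
import xml.etree.ElementTree as ET

# Trimmed sample (assumption: only author and title are kept for brevity).
SAMPLE = """<dblp>
<article mdate="2011-01-11" key="journals/acta/GoodmanS83">
<author>Nathan Goodman</author>
<author>Oded Shmueli</author>
<title>NP-complete Problems Simplified on Tree Schemas.</title>
</article>
</dblp>"""

root = ET.fromstring(SAMPLE)
article = root.find("article")
author = article.find("author")

# Looping on article: the node has child elements to discover.
article_children = [child.tag for child in article]
# Looping on author: the node holds only text, so there are no children.
author_children = [child.tag for child in author]

print(article_children)
print(author_children)
```

    With the loop on author, the value sits in the node's own text (XPath .) and the title sits one level up (XPath ../title), so both expressions have to be typed in manually.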

  5. #5
    Join Date
    Oct 2015
    Posts
    3


    Oh great! I didn't know about writing the XPath manually using . and ../ and so on. Now it works perfectly, thanks for the help! <3
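
    For the alternative raised in the first post, merging all author names of an article into one field such as (author1, author2, author3), a minimal Python sketch could look like the following. It is an illustration with the standard library's ElementTree, not a Pentaho transformation; the helper name merged_authors and the ", " separator are assumptions, and the sample is trimmed from the XML posted in this thread.

```python
import xml.etree.ElementTree as ET

# Trimmed sample (assumption: only author and title are kept for brevity).
SAMPLE = """<dblp>
<article mdate="2011-01-11" key="journals/acta/Saxena96">
<author>Sanjeev Saxena</author>
<title>Parallel Integer Sorting and Simulation Amongst CRCW Models.</title>
</article>
<article mdate="2011-01-11" key="journals/acta/GoodmanS83">
<author>Nathan Goodman</author>
<author>Oded Shmueli</author>
<title>NP-complete Problems Simplified on Tree Schemas.</title>
</article>
</dblp>"""


def merged_authors(xml_text, sep=", "):
    """Return one (title, joined_authors) tuple per <article> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for article in root.iter("article"):
        # Collect every author name of this article, in document order.
        names = [a.text for a in article.findall("author")]
        rows.append((article.findtext("title"), sep.join(names)))
    return rows


merged = merged_authors(SAMPLE)
for row in merged:
    print(row)
```

    This keeps one row per article. The trade-off is that a joined author field is harder to query per author in MySQL than the one-row-per-author layout.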


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.