Hitachi Vantara Pentaho Community Forums

Thread: XSLT Transformation

  1. #1
    Join Date
    Sep 2005
    Posts
    1,403

    XSLT Transformation

    I need to transform one complex XML file into another using an XSLT transformation. Could you tell me how I can do this in Kettle?

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    RE: XSLT Transformation

    I can't really tell you as I don't know what the XSLT is doing and I don't know a lot about XSLT Transformations anyway.

    Kettle might or might not be useful for such a thing as "complex XML" can mean just about anything.

    It seems that a lot of people think that referring to the term XML somehow makes everything clear. In fact, XML just means that information is structured according to a certain standard. It doesn't say anything about the content or the structure itself. In essence, XML is just an interface, nothing more, nothing less.

    All the best,

    Matt

    XML Quote: :-)
    ------------------
    There was this fellow in Hell
    Said, things are going too well
    I'll put a stop
    to work in that shop
    They'll worship their new XML

  3. #3
    Join Date
    Jul 2007
    Posts
    2,498

    Sorry to revive this old thread;

    Just wanted to know if something has been done regarding this issue. I have to do lots of XML transformations, and an action that would allow an XSLT transformation would be nice.

    I have a lot of projects (EDI) where different XML formats need to be parsed into DB tables (and the opposite). Correct me if I'm wrong, but it's still a bit difficult to do this kind of mapping (multilevel XML <-> RDBMS) in Kettle, am I correct? If not, where can I look for further info?

    Thanks
    Pedro Alves
    Meet us on ##pentaho, a FreeNode irc channel

  4. #4

    Hi,
    I don't know if it will help you, but you can use the XSLT job entry in PDI.
    It outputs a file from an XML file and an XSL file.
    I am working on the equivalent step (which produces a result stream).

    Rgd

    Samatar

  5. #5
    Join Date
    Jul 2007
    Posts
    2,498

    Ok, so let me see if I got it (I'm new to Kettle)

    I want to do 2 things: A -> read files of the type order-1234.xml from a dir and save them to an RDBMS, let's say into order_header and order_lines tables, and B -> read invoices from that same RDBMS (invoice_header and invoice_lines) and generate invoice-1234.xml

    What would be my approach in kettle? After reading some docs, looking at some examples, I would guess something like:

    A -

    1 -> define a transformation that reads order.xml and stores it in the RDBMS
    2 -> define a job that watches the incoming dir for all order XML files and deletes each source file after a successful transformation

    I guess that's all I need...
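
    For anyone following along, plan A can be sketched outside Kettle to show the mapping itself. This is only a minimal Python sketch, not a Kettle transformation: the order layout (an `id` attribute, a `customer` element, `line` elements with `sku` and `qty`) and the table columns are assumptions made up for illustration, and SQLite stands in for the real database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order layout; a real order-1234.xml will differ.
ORDER_XML = """
<order id="1234">
  <customer>ACME</customer>
  <lines>
    <line sku="A1" qty="2"/>
    <line sku="B7" qty="5"/>
  </lines>
</order>
"""

def load_order(conn, xml_text):
    """Insert one order document into order_header and order_lines."""
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    # One transaction per file: commit on success, roll back on error,
    # so a half-loaded order never stays in the tables.
    with conn:
        conn.execute(
            "INSERT INTO order_header (order_id, customer) VALUES (?, ?)",
            (order_id, root.findtext("customer")),
        )
        conn.executemany(
            "INSERT INTO order_lines (order_id, sku, qty) VALUES (?, ?, ?)",
            [(order_id, ln.get("sku"), int(ln.get("qty"))) for ln in root.iter("line")],
        )
    return order_id

# In-memory database with assumed column names, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_header (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_lines (order_id INTEGER, sku TEXT, qty INTEGER)")
load_order(conn, ORDER_XML)
```

    Deleting the source file only after `load_order` returns without raising would correspond to the "delete after successful transformation" part of the job.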


    for B -

    1 - define a transformation that uses a query returning all the info I need from the tables invoice_header and invoice_lines, based on an invoice id
    2 - write it to XML as is
    3 - define a job that finds all unprocessed invoice ids
    4 - run the transformation for all those ids
    5 - for each output, apply an XSLT transformation

    I don't know if there is any way for step 2 to output info in the form <header>....</header><lines><line>...</line></lines>, just to make XSL development easier, but I would guess some kind of grouping step. XML handling samples are really few.
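
    To make the target shape of step 2 concrete, here is a rough Python sketch of the header/lines nesting, assuming the header row and the line rows have already been fetched as plain dictionaries. The field names (`invoice_id`, `customer`, `sku`, `qty`) and the `invoice` root element are hypothetical; this shows the grouping idea only, not how a Kettle step would do it.

```python
import xml.etree.ElementTree as ET

def invoice_to_xml(header, lines):
    """Build <invoice><header>...</header><lines><line>...</line></lines></invoice>."""
    root = ET.Element("invoice")
    head = ET.SubElement(root, "header")
    for name, value in header.items():
        ET.SubElement(head, name).text = str(value)
    lines_el = ET.SubElement(root, "lines")
    for line in lines:
        line_el = ET.SubElement(lines_el, "line")
        for name, value in line.items():
            ET.SubElement(line_el, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

# Made-up sample rows standing in for invoice_header and invoice_lines results.
header = {"invoice_id": 1234, "customer": "ACME"}
lines = [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 5}]
xml_text = invoice_to_xml(header, lines)
```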


    Thanks a lot for the help, every tip appreciated
    Pedro Alves
    Meet us on ##pentaho, a FreeNode irc channel

  6. #6
    Join Date
    May 2006
    Posts
    4,882

    For the second part you can look at one of the XML examples (under the sample directory) "XML Add - creating multi-level XML files.ktr" where multi-level XML is being created... but it's not for the faint-hearted and I would consider it very high-maintenance ;-)

    ETL tools are better for row-based stuff. Think of it this way: rows arrive at the XML step one by one, and it needs to decide what to do based on the current row (or possibly a history of past rows). Building multi-level XML that way is hard to do. None of the ETL tools I have worked with has a good solution for it; all of them limit the output possibilities to something like what Kettle offers.
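
    The row-by-row problem can be illustrated with a small Python sketch (not Kettle code). It assumes rows arrive sorted by a hypothetical `invoice_id` field and has to remember the previous key to know when one document ends and the next begins, which is exactly the kind of state a row-based step needs to carry.

```python
from xml.sax.saxutils import escape

def stream_invoices(rows):
    """Yield one XML string per invoice from rows sorted by invoice_id."""
    current_id = None  # the "history": key of the invoice being built
    parts = []
    for row in rows:
        if row["invoice_id"] != current_id:
            # Key changed: close the previous document before starting a new one.
            if current_id is not None:
                parts.append("</lines></invoice>")
                yield "".join(parts)
            current_id = row["invoice_id"]
            parts = [f'<invoice id="{current_id}"><lines>']
        parts.append(f'<line><sku>{escape(row["sku"])}</sku><qty>{row["qty"]}</qty></line>')
    # The last document is only known to be complete once the input ends.
    if current_id is not None:
        parts.append("</lines></invoice>")
        yield "".join(parts)

# Made-up rows, as they might come out of a join of header and lines.
rows = [
    {"invoice_id": 1, "sku": "A1", "qty": 2},
    {"invoice_id": 1, "sku": "B7", "qty": 5},
    {"invoice_id": 2, "sku": "C3", "qty": 1},
]
docs = list(stream_invoices(rows))
```

    Every extra nesting level needs another piece of remembered state like `current_id`, which is why this gets high-maintenance quickly.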

    Regards,
    Sven
    Last edited by sboden; 08-13-2007 at 03:18 AM.

  7. #7
    Join Date
    Jul 2007
    Posts
    2,498

    I opened that example and after a quick look I closed it even faster.

    Well, I just remembered that for the second part I can use a string concatenation of the 2 outputs and get something of the form:
    <result><table1>...output...</table1><table2>...output...</table2><table2>...</table2></result>

    It's very easy to get from this to a multilevel XML like the one in the example with XSLT. I think it must be a cleaner approach.

    The thing I don't quite get is how to guarantee that I will process each invoice only once... maybe by checking the files I've generated?
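
    One common alternative to checking generated files is a status flag in the database. The following is only a sketch under assumptions: the `exported` column does not exist in the schema discussed here and is invented for illustration, `write_invoice` is a placeholder for whatever produces invoice-1234.xml, and SQLite stands in for the real database.

```python
import sqlite3

def export_pending(conn, write_invoice):
    """Export every invoice not yet flagged, then flag it; return the ids handled."""
    pending = [
        row[0]
        for row in conn.execute(
            "SELECT invoice_id FROM invoice_header WHERE exported = 0 ORDER BY invoice_id"
        )
    ]
    for invoice_id in pending:
        write_invoice(invoice_id)  # placeholder for the XML generation
        # Flag only after a successful write, so a failure gets retried next run.
        with conn:
            conn.execute(
                "UPDATE invoice_header SET exported = 1 WHERE invoice_id = ?",
                (invoice_id,),
            )
    return pending

# Demonstration with an in-memory table and a hypothetical exported column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_header (invoice_id INTEGER PRIMARY KEY, exported INTEGER DEFAULT 0)"
)
conn.executemany("INSERT INTO invoice_header (invoice_id) VALUES (?)", [(1,), (2,)])
written = []
first_run = export_pending(conn, written.append)
second_run = export_pending(conn, written.append)
```

    The second run finds nothing to do, since both invoices were flagged by the first.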
    Pedro Alves
    Meet us on ##pentaho, a FreeNode irc channel

  8. #8
    Join Date
    May 2006
    Posts
    4,882

    "The thing I don't quite get is how to guarantee that I will only process one invoice only one time... maybe by checking the files I've generated?"
    I don't know what you mean by this.

    Something other people have also used successfully is E4X in a JavaScript step to generate XML snippets. But that requires upgrading the js.jar in your installation and adding some other jars.

    Regards,
    Sven


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.