Hitachi Vantara Pentaho Community Forums

Thread: PDI HTML to CSV conversion

  1. #1

    PDI HTML to CSV conversion

    Hello everyone,

    I have about 300 jobs in total. They call multiple transformations as well as other jobs, which in turn contain many transformations.
    I have already created HTML documentation for these jobs, which shows information such as source/target and database connections. Is there any way to convert this HTML to Excel? The Excel output should be in a proper tabular layout.

    Note: I have already tried some online converters, but they do not give me properly formatted output.

    Thanks in advance.

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    Why would you convert HTML to CSV?
    Of course, it's easy to convert tabular data between different formats (HTML/TABLE, CSV, XLS, ...), but why don't you create CSV files in the first place?
    So long, and thanks for all the fish.

  3. #3

    Thanks for your reply, marabu.
    I used the auto-documentation described in this reference, http://www.kjube.be/presentations/PC...landBouman.pdf, which generates documentation for all transformations and jobs in HTML format.
    Now I need to get this output into an Excel tabular format.
    How can I achieve this?

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    First, make sure the generated HTML is well-formed XML.
    Then use one of the XML parser steps that Kettle offers: Get data from XML or XML Input Stream (StAX).
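
For readers who want to try the same idea outside of Kettle: if the generated HTML really is well-formed XML, any XML parser can walk its table rows. The Python sketch below uses only the standard library. The sample markup, the function names and the cell values are made up for illustration; the real auto-documentation output will be laid out differently, and HTML-only entities such as `&nbsp;` would make the parse fail.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1]


def html_table_rows(xhtml_text):
    """Yield one list of cell strings per <tr> of a well-formed (X)HTML document."""
    # ET.fromstring raises ParseError when the input is not well-formed XML.
    root = ET.fromstring(xhtml_text)
    for elem in root.iter():
        if local_name(elem.tag) != "tr":
            continue
        cells = [
            "".join(cell.itertext()).strip()
            for cell in elem
            if local_name(cell.tag) in ("th", "td")
        ]
        if cells:
            yield cells


def rows_to_csv(rows):
    """Serialize rows to CSV text; Excel can open the result directly."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical sample input, not actual auto-documentation output.
SAMPLE = """<html><body><table>
<tr><th>Step</th><th>Connection</th></tr>
<tr><td>Table input</td><td>dwh_source</td></tr>
</table></body></html>"""

csv_text = rows_to_csv(html_table_rows(SAMPLE))
print(csv_text)
```

If the document holds several tables with different columns, the rows would need to be grouped per table before writing, which this sketch does not attempt.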
    So long, and thanks for all the fish.

  5. #5
    Join Date
    Apr 2008
    Posts
    4,696

    You do know that KTR files are themselves well-formed XML, right?
    So instead of going KTR -> HTML -> Excel, it would make a lot more sense to go straight from KTR -> Excel.
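
A minimal sketch of the direct KTR route, again in plain Python rather than as a Kettle transformation. It assumes the usual KTR layout: a `<transformation>` root with the name under `<info>`, and one `<step>` element per step carrying `<name>`, `<type>` and, for database steps, `<connection>`. The sample document, the function names and the chosen columns are illustrative only; check them against your own files.

```python
import csv
import io
import xml.etree.ElementTree as ET

HEADER = ["transformation", "step", "type", "connection"]


def ktr_step_rows(ktr_text):
    """Yield one row per <step> element of a KTR document."""
    root = ET.fromstring(ktr_text)
    trans_name = root.findtext("info/name", default="")
    for step in root.findall("step"):
        yield [
            trans_name,
            step.findtext("name", default=""),
            step.findtext("type", default=""),
            # Steps that do not use a database have no <connection> child.
            step.findtext("connection", default=""),
        ]


def ktr_to_csv(ktr_text):
    """Return CSV text with a header row followed by one line per step."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(ktr_step_rows(ktr_text))
    return buf.getvalue()


# Hypothetical, heavily trimmed KTR content for illustration.
SAMPLE_KTR = """<transformation>
  <info><name>load_customers</name></info>
  <connection><name>dwh_source</name><type>MYSQL</type></connection>
  <step><name>Read customers</name><type>TableInput</type><connection>dwh_source</connection></step>
  <step><name>Write file</name><type>TextFileOutput</type></step>
</transformation>"""

steps_csv = ktr_to_csv(SAMPLE_KTR)
print(steps_csv)
```

To cover a whole project, the same function could be run over every `.ktr` file and the rows appended to a single CSV, which Excel opens as a table.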

  6. #6
    Join Date
    May 2016
    Posts
    282

    Thanks a lot for pointing out the wonderful auto-documentation project by Roland Bouman. I wasn't aware of it, and it is much more elaborate than the auto-documentation step in Pentaho and the sample provided with it. When I saw the document generated from my own project I almost cried... It is such an easy way to keep your documentation up to date: just fill in the descriptions while you are developing the transformations and jobs, and when you commit your changes to your version control system, regenerate the documentation for the whole project. And a lot of so-called professional (and very expensive) ETL tools lack this...

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.