Hitachi Vantara Pentaho Community Forums
Results 1 to 9 of 9

Thread: Creating Excel-output from multiple template files

  1. #1
    Join Date
    Nov 2014
    Posts
    4

    Default Creating Excel-output from multiple template files

    Hi,

    I am trying to develop a job that is able to generate 1 Excel-file with multiple tabs. Basically, what I want to achieve is:

    • tab 1 is based on template_1.xlsx
    • tab 2 is based on template_2.xlsx
    • the output file (say: output.xlsx) contains tab 1, 2


    Transformation 1 has 1 Excel Writer step with the following settings:
    - Extension: xlsx 2007
    - If output file exists: replace with new output file
    - Make this the active sheet: Y
    - If sheet exists in output file: write to existing sheet
    - Use template file when creating new files: Y
    - Templates file: /path/to/template/file/1

    Transformation 2 has 1 Excel Writer step with the following settings:
    - Extension: xlsx 2007
    - If output file exists: use existing file for writing
    - Wait for first row before creating file: Y
    - If sheet exists in output file: replace with new sheet
    -Use template when creating new sheets: Y
    - Template sheet: /path/to/template/file/2

    The jobs runs fine until up until Transformation 1. In Transformation 2, I get the following error:
    ERROR 22-11 15:38:00,416 - Microsoft Excel Writer - Error opening new file
    ERROR 22-11 15:38:00,416 - Microsoft Excel Writer - org.pentaho.di.core.exception.KettleException:
    Tempate Sheet: /........./template_2.xlsx not found, abortingI am absolutely sure that the path is correct and I tried different ways of writing the file path, but do not understand why the template file can't be found. I am developing this for Data Integration 4.4-SNAPSHOT. I have also done tests with 6.1, but I'm facing the same issues.

    Any suggestions on how to get this working? Or might it be that what I want isn't possible at all? I would also be fine if I needed to create to Excel output files that could be merged somehow, but as far as I know, PDI does not support this.

    Thanks for your suggestions!

  2. #2
    Join Date
    Aug 2011
    Posts
    360

    Default

    Hi, when using a template sheet, you need to provide the name of the sheet to use in the file your are writing to,
    not a filename.
    So how to do it:
    Create only one template file, with two sheets:
    One sheet coresponding to you template 1, and the other sheet
    corresponding to your template 2.
    Then in transformstion 1, just write to sheet 1, using template file.
    In transformation 2, just write to existing sheet 2, no need for a template anymore.

    If you need to write multiple sheets with template sheet 1 and multiple with template 2,
    Just crrat two sheets in the template file, name them Style1, Style2
    then use Style1 or Style2 as your template sheet when creating new sheets.
    You just have to mask the sheets in the template so then dont show up for the end users

  3. #3
    Join Date
    Nov 2014
    Posts
    4

    Default

    Hi Mathias,

    Unfortunately, this is not a solution that suits my needs.

    What I am developing is a standard job for generating Excel-reports with customized content. The idea is basically to have one 'master job' that pulls data from some reporting catalog. The job then generates an Excel file with a standard cover sheet and a few other sheets with data as configured in the reporting catalog. Finally, the job would include one report-specific transformation that does the custom report stuff and adds the resulting sheets to the output file. Hence, I need one template for the generic stuff such as the cover sheet, and one template for the report-specific stuff.

  4. #4
    Join Date
    Aug 2011
    Posts
    360

    Default

    Quote Originally Posted by fantastisch View Post
    Hi Mathias,

    Unfortunately, this is not a solution that suits my needs.

    What I am developing is a standard job for generating Excel-reports with customized content. The idea is basically to have one 'master job' that pulls data from some reporting catalog. The job then generates an Excel file with a standard cover sheet and a few other sheets with data as configured in the reporting catalog. Finally, the job would include one report-specific transformation that does the custom report stuff and adds the resulting sheets to the output file. Hence, I need one template for the generic stuff such as the cover sheet, and one template for the report-specific stuff.
    Hi, ok that's a hard one.
    Maybe you can try that, dont know if it works:
    Do like you did, but for the template 2 sheet, write
    [Path to template2 excel file]![name of template 2 sheet]
    (Without the brackets)
    This is the syntax in excel to reference external sheets, but i dont know
    if the pdi library will understand that and get the sheet from external file.

    If it dont work, the other solutions I know is:
    - open a JIRA case on jira.pentaho.com aksing evolution of excel writer such that template sheet
    can be specified as a sheet in another file (i will vote for it, this might be really handy)
    - modify by yourself the step and patch your install...
    - use Pentaho Reporting instead to create your report.

  5. #5
    Join Date
    Nov 2014
    Posts
    4

    Default

    Hi Matthias,

    Thanks for your suggestion! Unfortunately, the syntax template2.xlsx!sheet2 gives me the same error.

    One more thing I tried is to use xls (97-2003) instead of xlsx (2007-), but this also doens't make a difference.

    I'm not a Java developer, so modifying the step myself is not a possibility for me, so I will open a JIRA case.

  6. #6
    Join Date
    Apr 2016
    Posts
    156

    Default

    Quote Originally Posted by fantastisch View Post
    Unfortunately, this is not a solution that suits my needs.

    What I am developing is a standard job for generating Excel-reports with customized content. The idea is basically to have one 'master job' that pulls data from some reporting catalog. The job then generates an Excel file with a standard cover sheet and a few other sheets with data as configured in the reporting catalog.
    PDI is meant as a data integration tool, not a reporting tool. I realize it's very easy to to consider PDI to be a reporting tool since functions like the Excel Template output make it so easy.

    If you wish to keep your requirement (standard cover + tabs from reporting catalog)... why not use PDI for what it is good for (single-template files) and other tools/scripts to add them.

    For example: use PDI to generate single-page Excel files using way mentioned above (and grabbing the template from your 'report catalog' for each page), then pass the output filenames to a script in some scripting language that will combine the files into one output file that functions as your report?
    My runtime environment: MacOS, JDK 1.8u121, PDI 7.0

  7. #7
    Join Date
    Nov 2014
    Posts
    4

    Default

    Hi adamnyc,

    I completely agree that PDI is intended as an ETL tool and not for generating 'fancy' Excel-reports. However, the company that I'm currently working has quite some 'legacy' Excel-reports, and it's a huge task to rebuild them with data visualisation tooling. This is a lengthy process, so I'm trying to develop something that could make out lives easier in the meantime :-) I'll have a look at your suggestion to use some tool that could merge Excel files. If a reliable tool exists, calling it with a Execute Process step might as well do the trick for me!

  8. #8
    Join Date
    Apr 2016
    Posts
    156

    Default

    In the past I've had good success in Windows environment with this tool from http://www.oaultimate.com/ (not related to the company; just sharing what I've used).

    URL: http://www.oaultimate.com/office/com...s-and-csv.html Scroll to bottom for download.

    Browsing that page again I see it can take a text file input and process the merge... didn't realize that in past! Looks like that format can relatively easily be put together using standard PDI steps. :-)


    Edit: to clarify, that tool was used for data copying; no formulas/formatting... if your fancy-pants Excel spreadsheets use custom formatting, that tool may not copy them correctly.
    Last edited by adamnyc; 11-23-2016 at 11:56 AM. Reason: Updated experience w/ custom formatting
    My runtime environment: MacOS, JDK 1.8u121, PDI 7.0

  9. #9
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Quote Originally Posted by fantastisch View Post
    I completely agree that PDI is intended as an ETL tool and not for generating 'fancy' Excel-reports. However, the company that I'm currently working has quite some 'legacy' Excel-reports, and it's a huge task to rebuild them with data visualisation tooling.
    You could rebuild one or two of them in PRD, and then run them (especially if you use the same report over and over and over as many reports are...) through PDI.

    Really strange looping:
    Use PDI to run repeated executions of PRD reports
    Each PRD report that is run can execute multiple PDI/KTR to collect and prepare data

    So you can have PDI running PRD running PDI.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.