Hitachi Vantara Pentaho Community Forums
Results 1 to 6 of 6

Thread: Workaround for Excel Writer NPE when no rows exist

  1. #1
    Join Date
    Jun 2013
    Posts
    24

    Default Workaround for Excel Writer NPE when no rows exist

    Hi everyone,
    I have a question about a known bug for which I really need some form of a workaround.
    I have a transformation that gets data from two tables and writes it to a single spreadsheet. Each table's data goes to a different sheet in the workbook. We recently started running into issues where the job would fail with a Null Pointer Exception. After some troubleshooting, I found the JIRA above. Given that we still need this, I attempted to use the Detect empty stream step, as pictured below, as a workaround. However, if the second table input has no results, the Excel Writer corrupts the Excel file, and there is no data at all in it.
    Name:  NPETrans.jpg
Views: 144
Size:  12.3 KB

    Any idea how I can still get the Excel file written without corruption issues, even if there are no resulting rows? We are unlikely to get any form of version 6 in our Production environment, to fix this bug, so even when that fix is released, I don't know that it will help me here.

    Thanks.

  2. #2
    Join Date
    Apr 2008
    Posts
    1,771

    Default

    Hi.
    Quite difficult to read that image, but why do you have 2 hops from Network, one to Excel Writer and one to Detect empty stream?
    If you write data to 2 sheets within the same Excel file, you should either:
    1. split stream in 2 and use a Block Step, so that sheet 2 is not done until sheet 1 has been populated
    2. split the transformation in 2 and use a job to sequence the transformations

    I am surprised that it was working when you tried to write data to 2 sheets of the same file at the same time.
    -- Mick --

  3. #3
    Join Date
    Jun 2013
    Posts
    24

    Default

    Thanks for the reply, Mick
    I tried several options, but when I upload the image, it is being resized from 1212x274 -> 600x136. I don't know of a way to address this - ideas?

    In any case, I do actually have your option 1 implemented. Here is an outline of what I have, in text form:
    • Table Input -> Excel Writer -> Blocking step (Wait for the previous Excel Writer)
    • Second Table Input -> Excel Writer
    • I have a second hop from the Second Table Input to a Detect empty row step, which outputs to the Excel Writer, as well


    Some additional info on the configuration for my Excel Writers:
    • The first Excel writer is set to create the file right away (I've not checked the "Wait for first row before creating file" option), so that a file is always created (the first Table Input step will always return data)
    • I have checked the option to "wait for first row before creating file" for the second Excel Writer, so it won't interfere with the first one
    • Both steps have their own sheet names, which they are configured to create or overwrite, and to use the existing file


    The second table input is where I could have no rows returned (Which, as the JIRA I linked to explains, results in an NPE for the Excel Writer). My hope was that using the Detect empty row step would address this (since it makes all output null). But this actually overwrites the whole file with a corrupt file.

    I did have the paths split up before, and even had two transformations, but the NPE error occurs regardless, so none of those seem to matter.

    I've also tried experimenting around with using a Union in my Second Table Input to ensure I always return a row, and also putting an extra Table Input after the Detect Empty Row that will return only empty columns. But the second sheet just never gets written if I only have an empty row. Also note, that I don't get any write errors, so there isn't a blocking situation going on - both Excel Writer steps can access the file just fine.

    I've attached another image, that is in JPG format when I uploaded it - maybe it will work better?
    Attachment 16366
    Attached Images Attached Images  

  4. #4
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    My read of the Jira entry is that if you don't specify the fields, you'll get the NPE.
    So would the work around not be as simple as specify the fields in the Excel Writer step?

  5. #5
    Join Date
    Jun 2013
    Posts
    24

    Default

    That would be awesome, if it were! The bug that I experienced is that, if your input has no data, the Excel Writer doesn't keep/access the Row Metadata properly, so it fails with an NPE.

    I will anonymize some stuff and post the transformation, if that helps.

  6. #6
    Join Date
    Jun 2013
    Posts
    24

    Default

    After a night to think through it, I will actually re-write this so that I get all of the data I need up front, and use some sort of column to flag what data needs to go on sheet 1 versus sheet 2. Making this back into 2 transformations kind of brings other complications into play, because I will need to duplicate metrics, etc. for both.

    Thanks for your time looking into this. Looking through JIRA, it seems that the NPE is a recurring issue for the Excel Writer that has cropped up several times, so I should probably do well to ensure that the step always has something to write (if even a blank row).

    Thanks again!

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.