Hitachi Vantara Pentaho Community Forums
Results 1 to 10 of 10

Thread: Something broken on Excel Writer output step?

  1. #1
    Join Date
    Feb 2011
    Posts
    840

    Default Something broken on Excel Writer output step?

    I think this should be directly to Slawo, but maybe I'm just doing something wrong...

    I have a job. And inside it, there are 2 transformations: one writes data on an excel report using Excel Writer, template and all since it's a report for the boss. The next one gets the data from this report and updates it on SQL.

    Problem is here: when I check SQL, there's a lot of data missing.

    if I open the report on Excel or OpenOffice's Calc, everything shows fine - but when I close it asks for me to save, like if I had changed something. And if I change the filename on the Excel input, including a static file path so I can preview - it shows just like on SQL, with missing data, different from when I open the spreadsheet on Excel or Calc.

    I'm attaching below screenshot of the preview on the file created by the first transformation - along with the screenshot of the transformation. Everything is setup like it should on the Excel Writer steps: wait for first row, force formula recalc and leave styles.
    Name:  2011-03-16 16h03_52.png
Views: 530
Size:  47.3 KB
    Name:  2011-03-16 16h03_06.jpg
Views: 502
Size:  17.6 KB
    See there's only 3 values different from null or 0,0? Same thing shows up on SQL. When I open the file on Excel/Calc, most of them are populated with data.

    -=edit=- oh yeah, if I open the spreadsheet, choose to save, and preview again, it shows the information correctly. I think "force recalc" on Slawo's step only forces recalc on values updated by Kettle, not on the whole file - and the rows/fields giving problem are ones that have formulas involving values added via Kettle, but aren't modified by Kettle.
    Last edited by joao.ciocca; 03-16-2011 at 03:34 PM.

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    Since all steps in a transformation run in parallel I'm having a hard time figuring out how you can write to the same file in parallel. Since it's technically impossible I'm just wondering.

  3. #3
    Join Date
    Feb 2011
    Posts
    840

    Default

    The writing itself is doing fine, the "block until" step took care of it...

    As I've said, when I open the resulting file on Excel/Calc, all values show up as they should - and when I try to close the file, the program asks if I want to save. If I save as asked, when I preview the file on Kettle it shows as it should. But if I don't save - or don't even open it on Excel/Calc - and try to preview it in Kettle, it shows as pictured on the first post.

    -=edit=- BTW Matt, it's not only possible, but repeatable, as Ashish could do the same on that other thread...

  4. #4
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    I'm just saying that the "Block" steps serve little to no purpose since it offers no guarantee for the order of execution vs. the execution of the other output steps.
    However, since all is perfectly OK I'll refrain from further comments :-)

  5. #5
    Join Date
    Feb 2011
    Posts
    840

    Default

    yeah... I've been meaning to ask you more about the block steps...

    both the block and the block until step finish. They do made me think from the first time I've seen them that they'd serve as flow-control steps, as I've asked on another topic before, and you've said that it wasn't like that.

    But I've got it working in several cases like this. Here I can show you another example where it works - as the one above, using the Block until & Excel Writer...
    Name:  2011-03-16 19h08_10.png
Views: 495
Size:  15.0 KB
    Block 2 blocks until "DLO Contas" is finished, Block 3 blocks until "DLO Detalhamento" is finished, in this example.

    that's the reason I've said that I think the problem is in Excel Writer not forcing recalc of formulas already present on the template...

  6. #6
    Join Date
    Sep 2009
    Posts
    810

    Default

    Hi there,

    would you be so kind and attach a minimal sample that shows the issue?

    I'll have a look and maybe I can provide a fix or workaround.

    btw: the blocking works because the excel step can be configured to wait for the first line before doing anything (especially opening a file), and it signals "finished" only after having written and closed the file it is working on. Therefore you can configure them to reliably wait for each other.

  7. #7
    Join Date
    Sep 2009
    Posts
    810

    Default

    May I also suggest to consider a different approach to updating your DB? It sounds like you're relying on the formulas you got in the excel sheet to generate relevant data for you. I'd consider using a more reliable source for calculated data, like an SQL db or Mondrian. If you'd need strong guarantees on the results you're getting from an Office document you'd have to use MS libraries on a windows machine. Some people report success executing scripts through the windows scripting facilities (I.e. executing a VB script that reads/manipulates documents on the command line and letting it generate data in a more accessible format)

    Cheers

    Slawo
    Last edited by slawomir.chodnicki; 03-17-2011 at 04:21 AM. Reason: typo

  8. #8
    Join Date
    Feb 2011
    Posts
    840

    Default

    I'm not so much into scripting things - what I like most about Kettle is being able to do a lot of things without having to script =p

    I'm attaching the "DLO template.xls" that I use. The first sheet, "Cálculo Lim. Compatib. PR-PRE" has some formulas that, from what I've seen, aren't updated by Excel Writer output. Specifically, cels B6, B7, B9, B10, B41, B43, B44 and B46.

    As I told before, when I run the transformation and open the file on Excel/Calc, the formulas are showing the correct value, already calculated. But if before opening/saving on Excel/Calc I preview the sheet on Kettle, it shows those zeros.

    Isn't there anyway to make it so the "force recalc" feature force the recalc of pre-existing formulas? Or even, a new feature on excel writer, to rewrite formulas as their values - like the "Paste as value" on Excel?

    Also... seems like Excel Input won't read formula values correctly. When it tries to read the cel B4 (which contains formula =B5-B6), I'm getting this error:
    Code:
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Error processing row from Excel file [F:\CAPITAL\DLO\2011\2015-06\DLO 2015-06.xls] : org.pentaho.di.trans.steps.excelinput.KettleCellValueException: Cannot convert field "VR_CONTA" from cell on sheet 1, row 3, column 2: 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.pentaho.di.core.exception.KettleValueException: 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Unexpected conversion error while converting value [v String] to a Number
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : v String : couldn't convert String to number 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Unparseable number: "VALOR"
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Unexpected conversion error while converting value [v String] to a Number
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : v String : couldn't convert String to number 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Unparseable number: "VALOR"
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.pentaho.di.trans.steps.excelinput.KettleCellValueException: Cannot convert field "VR_CONTA" from cell on sheet 1, row 3, column 2: 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.pentaho.di.core.exception.KettleValueException: 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Unexpected conversion error while converting value [v String] to a Number
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : v String : couldn't convert String to number 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Unparseable number: "VALOR"
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Unexpected conversion error while converting value [v String] to a Number
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : v String : couldn't convert String to number 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Unparseable number: "VALOR"
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : 
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.pentaho.di.trans.steps.excelinput.ExcelInput.fillRow(ExcelInput.java:231)
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.pentaho.di.trans.steps.excelinput.ExcelInput.getRowFromWorkbooks(ExcelInput.java:644)
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.pentaho.di.trans.steps.excelinput.ExcelInput.processRow(ExcelInput.java:450)
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : org.pentaho.di.trans.step.RunThread.run(RunThread.java:40)
    2011/03/17 15:25:31 - Compatibilização PR/PRE.0 - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : java.lang.Thread.run(Unknown Source)
    Which is just weird, since I told Excel input to start at row 3 - and "VALOR" is in row 2, not row 3 (or am I understanding wrong? from what I get, Kettle's row 0 = Excel's row 1 and so on...)

    I'm attaching the transformation with the problem.

    Also, after I try to run the transformation for the first time, I am able to use the Preview button on the Input step that's giving that error, and it shows an error on the log, but shows up the preview correctly, with the same data that's on the spreadsheet - starting from row 3, and not row 2...
    Code:
    2011/03/17 15:25:32 - 2041_Atualiza_Outras - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Erros detectados!
    2011/03/17 15:25:32 - 2041_Atualiza_Outras - ERROR (version 4.1.0-stable, build 14410 from 2010-11-16 16.43.28 by buildguy) : Erros detectados!
    "Erros detectados!" = "Detected errors!" in portuguese...
    Attached Files Attached Files
    Last edited by joao.ciocca; 03-17-2011 at 02:35 PM. Reason: included full log on the first code box.

  9. #9
    Join Date
    Sep 2009
    Posts
    810

    Default

    Hi there,

    I've taken a look at the code again and for XLS files the Excel writer sets a "dirty" flag on all formulas of all sheets when "force recalc" is turned on. Excel will recalculate the values when the file is opened. This explains your experience of not being able to read the values unless the file was opened by Excel.
    For XLSX Files all formulas are actually recalculated because the XLSX format does not seem to support a "dirty" flag. At least the POI API does not.

    Having "force recalc" and a separate "set dirty flag on formula cells" option could be implemented. I suggest you'd open a jira case for that.

    To "fix" your issue you may try to work with an XLSX output file, it may help.

    As to the other issues, they seem to be related to *reading* XLS files, and they probably deserve a new thread.

    Cheers

    Slawo
    Last edited by slawomir.chodnicki; 03-17-2011 at 03:56 PM.

  10. #10
    Join Date
    Feb 2011
    Posts
    840

    Default

    Jira case for Excel Writer is open: http://jira.pentaho.com/browse/PDI-5488

    I'm adding a new thread on the Excel Input step then.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.