Hitachi Vantara Pentaho Community Forums

Thread: File specifications not checked

  1. #1
    Join Date
    Sep 2005
    Posts
    1,403

    Default File specifications not checked

    Attachment: kettle_repo.ktr

    I'm trying to create a very simple transformation in Spoon 2.2.2 (from the binary Kettle download) on Blackdown Java 1.4.2 on Ubuntu 5.10: I'm reading from a PostgreSQL table and writing to a text file.

    The transformation runs and the file is created, but it contains no data.

    When I right-click the text file output step and choose "Check Selected Steps", there is a remark that says "File specifications not checked".

    Does anyone know what "file specifications" these are? I've filled in the file name and extension, and have tried most of the other settings to see if I could get that message to go away.

    Does anyone know if this message is the reason the output file stays at 0 bytes?
    Or is the cause something completely unrelated?

    Thanx!

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    Default RE: File specifications not checked

    Well, it probably just means that somehow you're not sending rows to the step.
    How many rows are in the table?

    Matt

  3. #3
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: File specifications not checked

    According to count(*) run from psql, 1965367 rows.

    The command I used is:
    select count(*) from idisc;
    (it's a PostgreSQL version of the freedb, CDDB alternative, created using the tools at this site: http://asmith.id.au/freedb-better.html )

    But I guess that the next step would be to see what PostgreSQL does with the select statement from the input step. When running, this step takes a suspiciously short time: only around half a minute.

    Thanx!

    - Steinar

  4. #4
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: File specifications not checked

    When I try to preview the input step, I get a dialog saying:
    "Sorry, no rows can be found to preview."

    Nothing interesting can be found in the Log View. At least nothing that I can see:
    2006/03/18 20:45:47 - Spoon - Logging goes to /tmp/spoon.32515.log
    2006/03/18 20:45:48 - DBCache - Loading database cache from file: [/home/username/.kettle/db.cache]
    2006/03/18 20:45:50 - Spoon - Main window is created.
    2006/03/18 20:45:50 - Spoon - Asking for repository
    2006/03/18 20:45:50 - Kettle - Reading repositories XML file: /home/username/.kettle/repositories.xml
    2006/03/18 20:46:16 - be.ibridge.kettle.trans.Trans - Transformation is in preview mode...
    2006/03/18 20:46:16 - be.ibridge.kettle.trans.Trans - Dispatching started for filename [null]
    2006/03/18 20:46:16 - disc.0 - Starting to run...
    2006/03/18 20:46:16 - dummy.0 - Starting to run...
    2006/03/18 20:46:37 - disc.0 - Finished reading query, closing connection.
    2006/03/18 20:46:37 - disc.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=0)
    2006/03/18 20:46:37 - dummy.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=0)

    I tried removing the db.cache file, but that didn't change the behaviour.


    - Steinar

  5. #5
    Join Date
    Nov 1999
    Posts
    9,729

    Default RE: File specifications not checked

    Hi Steinar,

    Try the database explorer, browse to the "idisc" table and try to get a few thousand rows in a preview window.
    Perhaps we're looking at a JDBC driver issue or something.
    Also, it would be nice if you could at least post the transformation. (File : Export to xml)

    Thanks,
    Matt

  6. #6
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: File specifications not checked

    Attempting to view the first 2000 rows of idisc gives me the following error message:

    java.lang.OutOfMemoryError

    java.lang.reflect.InvocationTargetException
    at org.eclipse.jface.operation.ModalContext.run(ModalContext.java:327)
    at org.eclipse.jface.dialogs.ProgressMonitorDialog.run(ProgressMonitorDialog.java:447)
    at be.ibridge.kettle.core.dialog.GetPreviewTableProgressDialog.open(Unknown Source)
    at be.ibridge.kettle.core.dialog.DatabaseExplorerDialog.previewTable(Unknown Source)
    at be.ibridge.kettle.core.dialog.DatabaseExplorerDialog$8.widgetSelected(Unknown Source)
    at org.eclipse.swt.widgets.TypedListener.handleEvent(TypedListener.java:90)
    at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:66)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1021)
    at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:2867)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:2572)
    at be.ibridge.kettle.core.dialog.DatabaseExplorerDialog.open(Unknown Source)
    at be.ibridge.kettle.spoon.Spoon.exploreDB(Unknown Source)
    at be.ibridge.kettle.spoon.Spoon$57.handleEvent(Unknown Source)
    at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:66)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1021)
    at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:2867)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:2572)
    at be.ibridge.kettle.spoon.Spoon.readAndDispatch(Unknown Source)
    at be.ibridge.kettle.spoon.Spoon.main(Unknown Source)
    Caused by: java.lang.OutOfMemoryError

    So I guess it's not all that mysterious. I need to up the heap size of the spoon process, right?

    The exported transform is attached, but I guess it isn't all that interesting.
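
A quick way to confirm that a larger -Xmx value actually reaches the JVM is to print the limits the runtime reports. The class below is only a minimal sketch: `HeapReport` is a hypothetical name, not part of Kettle, and it assumes it is started with the same `java` binary and options that Spoon is started with.

```java
// Hypothetical helper, not part of Kettle: reports the heap limits of the JVM it runs in.
final class HeapReport {

    /** Converts a byte count to whole mebibytes (rounding down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the ceiling the heap may grow to (driven by -Xmx).
        System.out.println("max heap  (MiB): " + toMiB(rt.maxMemory()));
        // totalMemory() is what the JVM has reserved from the OS so far.
        System.out.println("reserved  (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory() is the unused part of the reserved heap.
        System.out.println("free      (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

Started with `-Xmx1024m`, the first line should report a value near 1024, although the exact figure depends on the JVM.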

  7. #7
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: File specifications not checked

    Upped the maximum heap size from the default 256m to 1024m (the machine I'm running on has 2GB of physical memory). That made me run into a different problem:
    <pre>Couldn't get row from result set
    Invalid character data was found. This is most likely caused by stored data containing characters that are invalid for the character set the database was created in. The most common example of this is storing 8bit data in a SQL_ASCII database.


    java.lang.reflect.InvocationTargetException: Couldn't find any rows because of an error :be.ibridge.kettle.core.exception.KettleDatabaseException:
    Couldn't get row from result set
    Invalid character data was found. This is most likely caused by stored data containing characters that are invalid for the character set the database was created in. The most common example of this is storing 8bit data in a SQL_ASCII database.

    at be.ibridge.kettle.core.dialog.GetPreviewTableProgressDialog$1.run(Unknown Source)
    at org.eclipse.jface.operation.ModalContext$ModalContextThread.run(ModalContext.java:113)
    Caused by: be.ibridge.kettle.core.exception.KettleDatabaseException:
    Couldn't get row from result set
    Invalid character data was found. This is most likely caused by stored data containing characters that are invalid for the character set the database was created in. The most common example of this is storing 8bit data in a SQL_ASCII database.

    at be.ibridge.kettle.core.database.Database.getRow(Unknown Source)
    at be.ibridge.kettle.core.database.Database.getRows(Unknown Source)
    at be.ibridge.kettle.core.database.Database.getFirstRows(Unknown Source)
    ... 2 more
    Caused by: org.postgresql.util.PSQLException: Invalid character data was found. This is most likely caused by stored data containing characters that are invalid for the character set the database was created in. The most common example of this is storing 8bit data in a SQL_ASCII database.
    at org.postgresql.core.Encoding.decodeUTF8(Encoding.java:287)
    at org.postgresql.core.Encoding.decode(Encoding.java:182)
    at org.postgresql.core.Encoding.decode(Encoding.java:198)
    at org.postgresql.jdbc1.AbstractJdbc1ResultSet.getString(AbstractJdbc1ResultSet.java:206)
    ... 5 more
    </pre>

    The raw data is supposed to be a mix of latin1, ascii and utf8, but I guess I have to convert everything to utf8 as outlined in http://asmith.id.au/freedb-unicode.html and store it in a PostgreSQL database created with -E UNICODE.
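
One way to approximate that kind of conversion per value is to treat the bytes as UTF-8 when they decode cleanly and to assume Latin-1 otherwise. The following is a sketch under that assumption, not the method from the linked freedb page. `Utf8Normalizer` is a hypothetical name, and `StandardCharsets` needs Java 7 or later, not the 1.4.2 JVM mentioned earlier in the thread.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decides per value whether raw bytes are UTF-8 or Latin-1.
final class Utf8Normalizer {

    /** Returns true when the bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] raw) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Decodes as UTF-8 when possible; otherwise assumes Latin-1 (ISO-8859-1). */
    static String decode(byte[] raw) {
        return isValidUtf8(raw)
                ? new String(raw, StandardCharsets.UTF_8)
                : new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "Bj\u00f6rk" stored as Latin-1: the single byte 0xF6 is not valid UTF-8 here.
        byte[] latin1 = {0x42, 0x6A, (byte) 0xF6, 0x72, 0x6B};
        byte[] utf8 = "Bj\u00f6rk".getBytes(StandardCharsets.UTF_8);
        System.out.println(isValidUtf8(latin1));
        System.out.println(isValidUtf8(utf8));
        System.out.println(decode(latin1).equals(decode(utf8)));
    }
}
```

Pure ASCII passes the UTF-8 check unchanged. The heuristic can misread Latin-1 text that happens to be valid UTF-8, which should be rare for ordinary words, so a spot check of the converted rows still seems worthwhile.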

  8. #8
    Join Date
    Nov 1999
    Posts
    9,729

    Default RE: File specifications not checked

    Hi Steinar,

    Concerning the heap size: I guess you can consider it a PostgreSQL JDBC bug to consume that much memory.
    Normally, the driver only has to keep (in Kettle's case) 5000 rows in memory, which would never max out the 256M.
    MySQL has a similar "feature" that still has not been resolved after many years:

    Select * from tab1

    causes an out-of-memory exception with very large tables.
    The worst thing about it is that if you have, let's say, 10 billion rows in tab1, there is no solution at the moment...

    Matt
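
For reference, the PostgreSQL JDBC documentation describes a cursor-based mode in which rows arrive in batches and the whole result is not loaded at once. As far as I know it requires autocommit to be off, a forward-only result set, and a positive fetch size. The sketch below shows that pattern with plain JDBC calls. `CursorFetchSketch` and `countRows` are hypothetical names. Whether the driver bundled with Kettle 2.2.2 supports this mode, and whether Kettle sets these options itself, are assumptions that have not been checked here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch: read a large table in batches instead of all at once.
final class CursorFetchSketch {

    /** Counts the rows of a query while asking the driver for fetchSize rows per round trip. */
    static long countRows(Connection conn, String sql, int fetchSize) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        // Assumption: the driver only uses a server-side cursor inside a transaction.
        conn.setAutoCommit(false);
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(fetchSize); // a hint to the driver, not a guarantee
            long rows = 0;
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    rows++; // a real caller would process the row here
                }
            }
            conn.commit();
            return rows;
        } finally {
            // Restore the caller's setting whether or not the query succeeded.
            conn.setAutoCommit(previousAutoCommit);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.out.println("usage: java CursorFetchSketch <jdbc-url>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0])) {
            System.out.println(countRows(conn, "select * from idisc", 5000));
        }
    }
}
```

The fetch size of 5000 mirrors the row count mentioned above and is otherwise arbitrary. If memory use still climbs to the heap limit with these settings, the driver is probably not honouring the hint.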

  9. #9
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: File specifications not checked

    Hm... now I've done a cleanup of the charset issues (it's all normalized into UTF-8 in the PostgreSQL database).

    I've also upped the maximum heap size to 1GB.

    But I still get the out of memory exception when exploring the idisc table, even when I limit the preview to the first 100 rows of the table.

    The actual cause of the out of memory exception seems to be lost. Looks like it's caught by a catch-all clause at the top...? Is there a way to determine the actual cause of the exception?

    Thanx!


    - Steinar
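
On the question of finding the underlying cause: wrapped exceptions such as InvocationTargetException keep the original throwable reachable through `getCause()`, so one generic approach is to walk that chain to its end and print the deepest stack trace. The class below is a minimal sketch. `RootCause` is a hypothetical name, and it only helps where the calling code still has the exception object in hand and has not discarded it in a catch block.

```java
import java.lang.reflect.InvocationTargetException;

// Hypothetical helper: follows getCause() links to the innermost throwable.
final class RootCause {

    /** Returns the last throwable in the cause chain of t (t itself if it has no cause). */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // The second condition guards against a throwable that names itself as cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Simulated failure, wrapped twice in the same style as the trace above.
        Throwable original = new OutOfMemoryError("simulated");
        Throwable wrapped = new InvocationTargetException(
                new RuntimeException("wrapper", original));

        Throwable root = of(wrapped);
        System.out.println(root.getClass().getName());
        // The root's own stack trace shows where the failure was raised.
        root.printStackTrace();
    }
}
```

For an OutOfMemoryError the stack trace only shows where the last allocation failed, which is not necessarily where most of the memory went, so a heap profiler may still be the more informative tool.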

  10. #10
    Join Date
    Sep 2005
    Posts
    1,403

    Default RE: File specifications not checked

    I started with a fresh Spoon process and tried previewing the first 100 rows of the idisc table, while watching the memory usage with top.

    The memory usage for the Spoon process skyrocketed straight up to 1 GB, and then I got the caught exception.

    It sure would be interesting to see what caused this. Maybe I should try using some sort of memory profiler on Spoon while browsing this table...?

    - Steinar

    PS: On a side note, I'm not deliberately posting anonymously. I'm waiting for a confirmation email.


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.