Hitachi Vantara Pentaho Community Forums
Results 1 to 18 of 18

Thread: Apparent Bug in KettleComponent.java

  1. #1
    Join Date
    Oct 2007
    Posts
    18

    Default Apparent Bug in KettleComponent.java

    I think I have stumbled across a bug in KettleComponent.java. I made the
    attached Kettle transformation with Spoon and it runs successfully there.
    However, when I run it under the Pentaho planform from an xaction file,
    then the Table Input step does not run correctly.

    The problem seems to involve the Set Variables step before the Table Input.
    In line 428 of KettleComponent.java, the list of fields expected in
    the result set from the SQL query is obtained by calling method
    RowMeta.getFieldNames. However, the obtained list includes the two
    variables from the Set Variables step. This means that instead of
    expecting the 3 fields from the query, KettleComponent expects 5 fields.
    Later, in rowWrittenEvent line 771, this causes an array bounds exception
    when the code attempts to access the fourth (non-existent) field in
    the result set.

    I made a simple workaround by changing line 428 of KettleComponent to
    call a new method instead of getFieldNames. The new method returns only
    the fields that have are originated in the Table Input step. That fixe
    the array bounds exception.

    Having done that, I ran across one other minor issue. The transformation
    file uses &apos.; for single-quotes in the query, instead of just literal
    single quotes. When I run the transformation from an xaction file,
    this prevents variable substitution within the quotes. However,
    to literal single quotes fixes that problem. Not sure if that's what
    is intended or not, but when running the transformation under Spoon
    the &apos.; doesn't prevent variable substitution.

    I'm running Kettle 3.0.0 GA with Pentaho platform from tag 1.7.0.931.
    The KettleComponent.java I'm using is downloaded from:

    http://wiki.pentaho.org/display/EAI/...ntaho+platform

    Let me know if you need any other info.

    Regards,
    Larry
    Attached Files Attached Files

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    Thanks Larry, I created a JIRA case for that: http://jira.pentaho.org/browse/PDI-574

  3. #3
    Join Date
    May 2006
    Posts
    4,882

    Default

    The problem with your set variables step is not a bug, just wrong usage. The set variables step will add the fields to the output stream. You can remove them afterwards with a select values step.

    AND you can not set a variable in the same transformation where you will use it... sometimes it will work, sometimes it wont

    For the single quotes raise a JIRA in the appropriate department.

    Regards,
    Sven

  4. #4
    Join Date
    Oct 2007
    Posts
    18

    Default

    I'm confused ... the reason I have set up the variables is so that
    the query in my Table Input step can use them. Once the
    variables are present in the stream, then it causes the Table
    Input to crash with the array bounds exception. So how can
    I set up variables and then use them in a query?

    I've tried using the ? approach, but these don't seem to get
    evaluated inside single quotes. (I'm trying to pass date
    parameters and I need to convert these using Postgres
    to_date which involves single quotes.)

    Regards,
    Larry

  5. #5
    Join Date
    May 2006
    Posts
    4,882

    Default

    Still have to debug it, but I guess the table output step requires some extra code hardening... you could use select values to remove the extra "variable" values and it would probably work without exceptions.

    For the non-use of a variable in the same transformation as where it's being set.... have a look at the "process all tables" example in the samples directory.

    If you only need to set 1 row of variables (in spoon, not so sure about xactions):
    - Make a job
    - start -> transformation 1 -> transformation 2
    - Transformation 1 does system info and just sets variables
    - Transformation 2 does the same system info and links to the table input. In the table input you can then use the variables set in transformation 1.

    Regards,
    Sven

  6. #6
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    There is no need to be confused. As mentioned all over this forum and in the FAQ, the rule is very simple:

    Thou shall not use and set a variable in the same transformation!
    As Sven mentions, it is because all steps run in parallel. (it's a feature, not a bug, see the FAQ)

    On the subject of your question, an example that very very very much applies to your situation (get input from a "Get System Info" step into a "Table Input" step) is shown in the Spoon documentation.
    I suggest you check it out.

    Matt

  7. #7
    Join Date
    May 2006
    Posts
    4,882

    Default

    + there's a difference between parameters and variables.

    Parameters never use 's around them

    Regards,
    Sven

  8. #8
    Join Date
    Oct 2007
    Posts
    18

    Default

    I originally tried using Get System Info and then feeding that into a
    Table Input which used the parameters. But I ran into problems with single
    quotes.

    Specifically what I'm trying to do is:

    - In the action sequence, I have three (Pentaho) parameters from the user which
    are a year, month and day, e.g. '2007', '12', '01'.

    - I intend to get these into KettleComponent as parameters using Get System
    Info.

    - Then, the Postgres query run by the transformation would be like this:

    select firstname, lastname from customer
    where orderdate > to_date('???', 'YYYYMMDD')

    And hopefully the parameters would be substituted to become:

    select firstname, lastname from customer
    where orderdate > to_date('20071201', 'YYYYMMDD')

    Postgres wants the text to be within single quotes. But the problem I have
    is that the parameters don't seem to get resolved when they are within
    single quotes, I guess because parameters work that way. Hence my interest
    in variables which seem to work even within single quotes.

    So I guess the best approach is a job with multiple transformations, setting
    up the variables in one and using them in the second, as suggested by Sven?

    Regards,
    Larry

  9. #9
    Join Date
    May 2006
    Posts
    4,882

    Default

    parameters don't use 's ... in Oracle you would need to use something as ? || ? || ? without any quotes.

    Regards,
    Sven

  10. #10
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    Larry, just look at the example in the documentation. (Table Input step)

  11. #11
    Join Date
    Oct 2007
    Posts
    18

    Default

    Matt, Sven, thanks for your prompt feedback, which has helped a lot. I've
    now managed to do what I want in Spoon using parameters.

    However, when I deploy the transform and run it under KettleComponent, I
    encounter a problem. As an example of the problem, the attached transform
    gets one parameter (transform name) from a Get System Info and uses it
    in a Table Input. Under Spoon this works fine. Under KettleComponent,
    I get the following error:

    Error: Kettle.ERROR_0012 - Could not set up row listener - org.pentaho.di.core.exception.KettleStepException: Unable to get queryfields for SQL: select ? as f1, companyname as f2 from customer Couldn't get field info from [select ? as f1, companyname as f2 from customer] Location: exec query ERROR: syntax error at or near "as" (org.pentaho.plugin.kettle.KettleComponent)

    Debug: Starting execute of visionoss/reports/test.xaction (org.pentaho.core.solution.SolutionEngine)
    Debug: Getting runtime context and data (org.pentaho.core.solution.SolutionEngine)
    Debug: Loading action sequence definition file (org.pentaho.core.solution.SolutionEngine)
    Debug: SolutionRepository.DEBUG_FILE_PATH - getFile path=/usr/local/visionoss_reporting/pentaho-solutions/visionoss/reports/test.xaction (org.pentaho.repository.filebased.solution.SolutionRepository)
    Debug: Adding variable source request to parameter outputType (org.pentaho.repository.filebased.solution.SolutionRepository)
    Debug: Adding variable source request to parameter outputType (org.pentaho.repository.filebased.solution.SolutionRepository)
    Debug: Adding default value of "pdf" to parameter outputType (org.pentaho.repository.filebased.solution.SolutionRepository)
    Debug: Adding variable source runtime to parameter outputTypeList (org.pentaho.repository.filebased.solution.SolutionRepository)
    Debug: Adding default value of "[{report-output-desc=PDF, report-output-type-id=pdf}, {report-output-desc=Excel, report-output-type-id=xls}, {report-output-desc=Word, report-output-type-id=rtf}, {report-output-desc=Web Page, report-output-type-id=html}, {report-output-desc=Comma Separated Value, report-output-type-id=csv}]" to parameter outputTypeList (org.pentaho.repository.filebased.solution.SolutionRepository)
    Debug: Adding variable source response to parameter content (org.pentaho.repository.filebased.solution.SolutionRepository)
    Debug: audit: instanceId=f71dfc08-a974-11dc-8448-192dd0f83f6e, objectId=org.pentaho.core.runtime.RuntimeContext, messageType=action_sequence_start (org.pentaho.core.runtime.RuntimeContext)
    Debug: SolutionRepository.DEBUG_FILE_PATH - getFile path=/usr/local/visionoss_reporting/pentaho-solutions (org.pentaho.repository.filebased.solution.SolutionRepository)
    Debug: SolutionRepository.DEBUG_FILE_PATH - getFile path=/usr/local/visionoss_reporting/pentaho-solutions (org.pentaho.repository.filebased.solution.SolutionRepository)
    Error: Kettle.ERROR_0012 - Could not set up row listener - org.pentaho.di.core.exception.KettleStepException: Unable to get queryfields for SQL: select ? as f1, companyname as f2 from customer Couldn't get field info from [select ? as f1, companyname as f2 from customer] Location: exec query ERROR: syntax error at or near "as" (org.pentaho.plugin.kettle.KettleComponent)
    Error: RuntimeContext.ERROR_0012 - ActionDefinition for KettleComponent did not execute successfully (org.pentaho.core.runtime.RuntimeContext)
    Error: SolutionEngine.ERROR_0007 - Action sequence execution failed (org.pentaho.core.solution.SolutionEngine)
    Also, under Spoon, if I do a Show Output Fields on the Table Input, I get
    this error message:

    Unable to get queryfields for SQL:
    select ? as f1, companyname as f2 from customer

    Couldn't get field info from [select ? as f1, companyname as f2 from customer]
    Location: exec query
    ERROR: syntax error at or near "as"
    Any idea what the problem is?

    Regards,
    Larry
    Attached Files Attached Files

  12. #12
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    This is a syntax error in your SQL.
    Check your SQL reference or the community if you are using an open source database.

  13. #13
    Join Date
    Oct 2007
    Posts
    18

    Default

    Matt, thanks for your response.

    The query runs successfully under Spoon, where the ? is treated as a
    parameter marker. I've also tested the query manually, and there doesn't
    seem to be anything wrong with the syntax.

    The problem happens with KettleComponent, which does not appear to
    treat the ? as a parameter marker. So the query gets passed to Postgres
    with the ?, causing the syntax error.

    Shouldn't KettleComponent treat the ? as a parameter marker, as Spoon
    does?

    Regards,
    Larry

  14. #14
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    There should be no difference, both KettleComponent and Spoon use exactly the same code.
    If it runs in Spoon, it will run on the Pentaho platform.

    The only thing that can be different is the JDBC driver somewhere.
    So verify that the used libs are the same.

    All the best,

    Matt

  15. #15
    Join Date
    Oct 2007
    Posts
    18

    Default

    Matt, thanks for your response. I copied across the JDBC drivers used in Spoon
    to my Pentaho server but the problem remains.

    The issue seems to be getting the fields for the Table Input step. Although the
    transform runs correctly under Spoon, I can get the same error under Spoon
    if I do a Show Output Fields on the Table Input step.

    Looking into the code, TableInputMeta.getFields calls db.getQueryFields in a
    way that causes db.getQueryFields to treat the query as a non-prepared
    statement. (Parameter 'inform' is null.) So the query gets run directly and the
    ? causes the syntax error.

    It seems to me there is a problem in the code that gets the query's fields,
    regarding its treatment of prepared statement parameters.

    Am I missing something?

    Regards,
    Larry

  16. #16
    Join Date
    Oct 2007
    Posts
    18

    Default

    Matt, Sven,

    Any thoughts on this issue?

    I've looked into the code some more, and the problem definitely seems to be
    that TableInputMeta.getFields is not checking whether the query has
    parameters. So it calls Database.getQueryFields with the 'inform' and 'data'
    parameters as null, which means the query is not treated as a prepared
    statement. Then the transform crashes because of the resulting syntax
    error in the query caused by the ?s.

    You should be able to reproduce this problem by doing a Show Output Fields
    in Spoon for any Table Input step that uses parameters in the query.

    I think TableInputMeta.getFields needs to be fixed to handle parameters.

    I've had a little attempt at doing that fix, but it would require calling some
    methods on the TableInput object, and in getFields there is a TableInputMeta
    object instead. Is there an easy way to get the TableInput object from
    the TableInputMeta object there?

    Also, should I open a Jira case for this issue?

    Regards,
    Larry

  17. #17
    Join Date
    May 2006
    Posts
    4,882

    Default

    Raise a JIRA and it won't be forgotten.

    Regards,
    Sven

  18. #18
    Join Date
    Oct 2007
    Posts
    18

    Default Raised Jira case

    OK, I've raised Jira case PDI-585.

    Also, for future reference, attached is a transform that runs with the
    sample database in the Pentaho pre-configured installation, and which
    illustrates the problem. Just do a Show Output Fields on the Table
    Input step.

    Regards,
    Larry
    Attached Files Attached Files

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.