Hitachi Vantara Pentaho Community Forums
Results 1 to 18 of 18

Thread: How can i set variables and iterate between them based on a file with those values?

  1. #1
    Join Date
    Oct 2013
    Posts
    23

    Question How can i set variables and iterate between them based on a file with those values?

    Hello,

    I´ve created one ETL on PDI that has 4 transformations.
    On the first transformation i use the set variables step to set two variables (A and B).
    Those variables are used on transformations 2, 3 and 4. I use them on the select query on table inputs.
    The thing is that i have 80 pairs of (A,B) values that i have to run on this ETL.
    I could run one by one setting the A and B variables value on the default field. So i would need to run the ETL 80 times, setting each time both variables values manually.

    I´ve tried putting on the first transformation an excel input (with 2 columns, A and B, and the 80 pairs of values). And i used the output of this excel file as input to Set Variable step.
    When i ran the ETL i got the follow error:

    2016/11/14 16:19:26 - Set Variables.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : org.pentaho.di.core.exception.KettleStepException:
    2016/11/14 16:19:26 - Set Variables.0 - Only 1 input row was expected to set the variables and at least 2 were received.

    Is there a way to do this? So i could run only one time the ETL and it would iterate through the 80 pairs of variables values.

    Example:

    A | B
    1 test
    2 beta
    3 book

    Based on this file/excel input kettle would ran the transformations 2, 3 and 4 one time for (1,test) ; another time for (2,beta); and one last time for (3, book).

    Maybe there is a way to do this on the job level.

    Thank you very much.

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Terminate the dataflow in your first transformation with Copy-Rows-To-Result.
    Now wrap the remaining transformations in a subjob and let the subjob create variables from the fields in the result stream, row by row.
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Oct 2013
    Posts
    23

    Default

    Quote Originally Posted by marabu View Post
    Terminate the dataflow in your first transformation with Copy-Rows-To-Result.
    Now wrap the remaining transformations in a subjob and let the subjob create variables from the fields in the result stream, row by row.
    Marabu,

    I´ve set the main job with this flow: start -> transformation1 -> subjob -> finish.

    Transformation 1: excel input -> copy rows to result.

    Subjob: Transformation2 -> transformation3 -> Transformation4.

    I don´t understand how can i "let subjob create variables from the fields/values in the result stream, row by row".

    Could you please detail it for me?

    Thank you for your time and attention,

    Guilherme.

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Define your variables in the subjob properties (Ctrl-T in subjob) as Parameters and check the advanced settings (double-click on the job symbol).


    Name:  214978.png
Views: 438
Size:  27.3 KB
    So long, and thanks for all the fish.

  5. #5
    Join Date
    Oct 2013
    Posts
    23

    Default

    Quote Originally Posted by marabu View Post
    Define your variables in the subjob properties (Ctrl-T in subjob) as Parameters and check the advanced settings (double-click on the job symbol).


    Name:  214978.png
Views: 438
Size:  27.3 KB
    Thanks,Marabu.
    I´ve double clicked the subjob, on parameters i´ve checked "pass all parameter values down to the sub-job", and i´ve put 2 entries:
    1)identifier_old, identifier_old, null;
    2)parent_id_novo, parent_id_novo, null.

    identifier_old and parent_id_novo are the two columns name on my excel file (the name of my former variables).

    On advanced option of the sub-job propertie i´ve checked "Copy previous results to parameters" and "execute for every input row".

    For example, on my input table query on transformation 1 i had this sql:

    SELECT
    id
    , name
    , description
    , homepage
    , 0 as is_public
    ,${parent_id_novo} as parent_id
    , NOW() as created_on
    , updated_on
    ,CONCAT(SUBSTR(identifier,1,123),'-po2017') as identifier
    , status
    , lft
    , rgt
    ,1 as inherit_members
    , default_version_id
    FROM projects
    where parent_id =
    (SELECT id FROM projects where identifier = ${identifier_old}) AND status=1
    Where i´ve checked the option "replace variables in script".
    But when i ran it giving me syntax error...
    Do i have to call the two parameters on a diferent way than i was calling the variables?
    I called variables on the query using ${VariableName}.

    Thank you.

  6. #6
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Parameters are variables which can be set from the outside, so you refer to them in the same way when it comes to textual substitution.

    Are you not told the exact position of the syntax error?
    ${identifier_old} must be single quoted if identifier is a string.
    So long, and thanks for all the fish.

  7. #7
    Join Date
    Oct 2013
    Posts
    23

    Default

    Quote Originally Posted by marabu View Post
    Parameters are variables which can be set from the outside, so you refer to them in the same way when it comes to textual substitution.

    Are you not told the exact position of the syntax error?
    ${identifier_old} must be single quoted if identifier is a string.
    I got the follow error:

    2016/11/18 15:34:46 - input projects.0 - An error occurred executing SQL:
    2016/11/18 15:34:46 - input projects.0 - SELECT
    id
    , name
    , description
    , homepage
    , 0 as is_public
    ,${parent_id_novo} as parent_id
    , NOW() as created_on
    , updated_on
    2016/11/18 15:34:46 - input projects.0 - ,CONCAT(SUBSTR(identifier,1,123),'-po2017') as identifier #para garantir que não estoure 130 caracteres
    , status
    , lft
    , rgt
    ,1 as inherit_members
    , default_version_id
    FROM projects
    where parent_id =
    2016/11/18 15:34:46 - input projects.0 - (SELECT id FROM projects where identifier = ${identifier_old}) AND status=1
    2016/11/18 15:34:46 - input projects.0 -


    2016/11/18 15:34:46 - input projects.0 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '{parent_id_novo} as parent_id
    , NOW() as created_on
    , updated_on
    2016/11/18 15:34:46 - input projects.0 - ,CONCAT' at line 7
    The identifier is string value indeed.
    When i was using the set variables step i´ve put the default value with single quotes (example: 'test').
    But now that i´m trying to use parameters base on excel file values i don´t know where should i put the single quotes.
    You mean that i should put it on the excel values of identifier_old ?
    Like (1,'test') and (2,'beta') ?
    Or i have to put on the query when i call the parameter? Like ${'identifier_old'}

    When i was using set variable this query was runing ok, no syntax problem.

  8. #8
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Previously you said, when using Set-Variables it didn't work at all because of multiple rows instead of a single row ...

    With variable references it's a simple textual substitution, so you should place single quotes in the SQL statement like '${identifier_old}'

    Do you know that variable names are case sensitive?
    Try to insert the variable reference by positioning the cursor and pressing Ctrl-Space - see if your variable (parameter) is listed.
    So long, and thanks for all the fish.

  9. #9
    Join Date
    Oct 2013
    Posts
    23

    Default

    Yes, i know that the variable names are case sensitive, i indeed defined the name on lower case.
    When i press ctrl-space on the SQL of table input on transformation 1 i don´t see any of the two parameters on the list..
    Actually i have one main job: start -> transformation0 -> subjob.
    Inside transformation 0 i have excel input (with the two parameters values) -> copy rows to result. Do i have to set the parameters on transformation 0 properties too?
    On subjob properties i´ve set the two parameters.
    Inside the subjob i have: start -> transformation1 -> transformation2 -> transformation3 -> sucess.
    In transformation 1 i´ve quoted when i use the identifier_old parameter..

    I´m still getting the same error output...
    Something is missing on this configuration of using excel values as parameters and passing it to the transformations in the subjob.
    If you know what is possible missing please tell me.

    Thank you.

  10. #10
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Your main problem seems to be parameter usage.
    Look at the attached demo and find out what you did differently.
    No need for Excel or SQL as it's all about parameter usage.
    Attached Files Attached Files
    So long, and thanks for all the fish.

  11. #11
    Join Date
    Oct 2013
    Posts
    23

    Default

    Marabu,

    Thanks for your example.
    My ETL was set like yours, but the error continues on the table input sql where i use both parameters values on the follow query (the error persists when i quote the identifier_old parameter):

    SELECT
    id
    , name
    , description
    , homepage
    , 0 as is_public
    , ${param_parent_id_novo} as parent_id
    , NOW() as created_on
    , updated_on
    , CONCAT(SUBSTR(identifier,1,123),'-po2017') as identifier
    , status
    , lft
    , rgt
    , 1 as inherit_members
    , default_version_id
    FROM projects
    where parent_id =
    (SELECT id FROM projects where identifier = ${param_identifier_old}) AND status=1
    If i replace both variables by it´s values directly (like the following query) it works, showing that the error is due to the variable substitution.

    SELECT
    id
    , name
    , description
    , homepage
    , 0 as is_public
    , 7381 as parent_id
    , NOW() as created_on
    , updated_on
    , CONCAT(SUBSTR(identifier,1,123),'-po2017') as identifier
    , status
    , lft
    , rgt
    , 1 as inherit_members
    , default_version_id
    FROM projects
    where parent_id =
    (SELECT id FROM projects where identifier = 'test') AND status=1

  12. #12
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Please, run your job with loglevel=Detailed and show me the log lines containing the SQL statement.
    So long, and thanks for all the fish.

  13. #13
    Join Date
    Oct 2013
    Posts
    23

    Default

    Running with logLevel=detailed:

    2016/11/22 12:51:37 - input projects.0 - Iniciando execução...
    2016/11/22 12:51:37 - input projects.0 - SQL query : SELECT
    id
    , name
    , description
    , homepage
    , 0 as is_public
    , ${param_parent_id_novo} as parent_id
    , NOW() as created_on
    , updated_on
    2016/11/22 12:51:37 - input projects.0 - , CONCAT(SUBSTR(identifier,1,123),'-po2017') as identifier
    , status
    , lft
    , rgt
    , 1 as inherit_members
    , default_version_id
    FROM projects
    where parent_id =
    2016/11/22 12:51:37 - input projects.0 - (SELECT id FROM projects where identifier = ${param_identifier_old}) AND status=1
    2016/11/22 12:51:37 - input projects.0 -
    2016/11/22 12:51:37 - projects output.0 - Iniciando execução...
    2016/11/22 12:51:37 - projects_from_to table output.0 - Iniciando execução...
    2016/11/22 12:51:37 - transf1 - Transforma��o alocou 3 threads e 2 rowsets.
    2016/11/22 12:51:37 - input projects.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Erro inesperado
    2016/11/22 12:51:37 - input projects.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : org.pentaho.di.core.exception.KettleDatabaseException:
    2016/11/22 12:51:37 - input projects.0 - An error occurred executing SQL:
    2016/11/22 12:51:37 - input projects.0 - SELECT
    id
    , name
    , description
    , homepage
    , 0 as is_public
    , ${param_parent_id_novo} as parent_id
    , NOW() as created_on
    , updated_on
    2016/11/22 12:51:37 - input projects.0 - , CONCAT(SUBSTR(identifier,1,123),'-po2017') as identifier
    , status
    , lft
    , rgt
    , 1 as inherit_members
    , default_version_id
    FROM projects
    where parent_id =
    2016/11/22 12:51:37 - input projects.0 - (SELECT id FROM projects where identifier = ${param_identifier_old}) AND status=1
    2016/11/22 12:51:37 - input projects.0 -


    2016/11/22 12:51:37 - input projects.0 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '{param_parent_id_novo} as parent_id
    , NOW() as created_on
    , updated_on
    2016/11/22 12:51:37 - input projects.0 - ,' at line 7
    2016/11/22 12:51:37 - input projects.0 -
    2016/11/22 12:51:37 - input projects.0 - at org.pentaho.di.core.database.Database.openQuery(Database.java:1716)
    2016/11/22 12:51:37 - input projects.0 - at org.pentaho.di.trans.steps.tableinput.TableInput.doQuery(TableInput.java:224)
    2016/11/22 12:51:37 - input projects.0 - at org.pentaho.di.trans.steps.tableinput.TableInput.processRow(TableInput.java:138)
    2016/11/22 12:51:37 - input projects.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
    2016/11/22 12:51:37 - input projects.0 - at java.lang.Thread.run(Unknown Source)
    2016/11/22 12:51:37 - input projects.0 - Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '{param_parent_id_novo} as parent_id
    , NOW() as created_on
    , updated_on
    2016/11/22 12:51:37 - input projects.0 - ,' at line 7
    2016/11/22 12:51:37 - input projects.0 - at sun.reflect.GeneratedConstructorAccessor159.newInstance(Unknown Source)
    2016/11/22 12:51:37 - input projects.0 - at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    2016/11/22 12:51:37 - input projects.0 - at java.lang.reflect.Constructor.newInstance(Unknown Source)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.Util.getInstance(Util.java:387)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2545)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2503)
    2016/11/22 12:51:37 - input projects.0 - at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1369)
    2016/11/22 12:51:37 - input projects.0 - at org.pentaho.di.core.database.Database.openQuery(Database.java:1705)
    2016/11/22 12:51:37 - input projects.0 - ... 4 more
    2016/11/22 12:51:37 - input projects.0 - Finished reading query, closing connection.
    2016/11/22 12:51:37 - transf1 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Erros detectados!
    2016/11/22 12:51:37 - redminemig - Connection to database closed!

  14. #14
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    We can see in the log that variables are not substituted, so either option "Replace variables in script" isn't enabled or the variables have the wrong names - tertium non datur.
    I attached a modified example using Table-Input, that shows a substituted variable in an SQL script when run with loglevel = Detailed.
    If you still struggle with this, consider to attach something for inspection.
    Attached Files Attached Files
    So long, and thanks for all the fish.

  15. #15
    Join Date
    Oct 2013
    Posts
    23

    Default

    Well, i could not find what´s wrong in mine ETL...
    It didnt work attach the zip here on the forum, so i uploaded on my dropbox my files so you can take a look please.
    Thank you very much.

    Zip with my files:
    https://dl.dropboxusercontent.com/u/27053/ETL%20PDI.zip

  16. #16
    Join Date
    Jun 2012
    Posts
    5,534

    Default Only 30 days until Christmas

    Some comments in no particular order:

    • If you press "Reply to Thread" you must switch to the advanced editor to see an upload symbol (a paperclip) in the action bar.
    • Your use of the Success job entry is futile - it only makes sense when the previous result was FALSE and you want the job to be TRUE regardless.
    • Initially, when editing subjob the parameter mapping is empty. If you press "Get Parameters" and nothing happens, you forgot to define parameters for subjob. It's done by opening subjob and pressing Ctrl-T for its properties dialog to show which has a tab parameters. Only after you defined parameters (param_identifier_old and param_parent_id_novo) there, you can map fields to them.
    So long, and thanks for all the fish.

  17. #17
    Join Date
    Oct 2013
    Posts
    23

    Default

    Marabu,

    I´ve tried indeed to attach using the upload symbol. But i don´t know what was happening when i clicked to add/upload any file of my computer it was not working.
    Thanks, i will not use the sucess step anymore on cases like this one.
    And, finally, thank you very much for opening my ETL and telling me what was missing.
    The big confusion that i had made was that i dind´t understand when you were telling me to press CTRL+T on the subjob to map the parameters. On my pentaho it´s ctrl+J the shortcut (so i´ve pressed ctrl+T and nothing happened). So i thought that window you were telling me was the double clicking on the subjob step into the main job step and inserting values into the parameters tab.

    You were great and showed extreme patience and attention helping me solving my problem and teaching me how to iterate on variables values.
    Certainly this knowledge will be very useful in the future.

    ps: sorry for my not so fluent english.

  18. #18
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Oh, my bad, it's Ctrl-T for transformations
    Also, there's nothing wrong with your English.
    It's certainly better than my Portuguese


    PS: What's funny, though, is the fact that Spoon tells me Ctrl-T is the hotkey but it isn't...


    Name:  214978.png
Views: 85
Size:  39.9 KB
    Last edited by marabu; 11-24-2016 at 12:30 PM.
    So long, and thanks for all the fish.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.