Hitachi Vantara Pentaho Community Forums

Thread: Execute R Script

  1. #1
    Join Date
    May 2014
    Posts
    8

    Execute R Script

    Hi all!

    I am new to Pentaho and I want to send data from MySQL to R (3.0.2 on 32-bit Windows 7), process it, and get it back using Spoon. I am trying to use the "Execute R Script" step (RScriptPlugin-0.0.4) in Spoon 5.0.1 stable and followed the steps at http://wiki.pentaho.com/display/EAI/R+script+executor, but I cannot open the test transformations it includes (even though I installed the suggested plug-ins from the Marketplace and I can see the icon under "Scripting"). A "Detect empty stream" step placed after the "R script executor" detects an empty flow, and it makes no difference whether "Path to R Script" is empty, invalid, or points to a non-existent file. When I edit the step I only get one tab (with the sections "Path to R Script", "Input Variables for R script" and "Output variables form R script" under Properties). What may be happening and how can I solve it? Is there another way to do it?

    Thanks a lot!

  2. #2
    Join Date
    Apr 2008
    Posts
    4,685

    I don't have that plug-in installed, so I can't be a lot of help...

    But I'm betting that you don't have an input step in your Transformation.
    Steps that are not in the Input folder generally won't run unless they have input coming to them.

  3. #3
    Join Date
    May 2014
    Posts
    8

    Originally posted by gutlez:
        I don't have that plug-in installed, so I can't be a lot of help...

        But I'm betting that you don't have an input step in your Transformation.
        Steps that are not in the Input folder generally won't run unless they have input coming to them.

    I have tried with both Table input and CSV input, and even by inserting "Generate rows".

  4. #4
    Join Date
    Apr 2008
    Posts
    4,685

    And you started with SpoonR.bat, rather than spoon.bat?

  5. #5
    Join Date
    May 2014
    Posts
    8

    Yes, SpoonR.bat

  6. #6
    Join Date
    Apr 2008
    Posts
    4,685

    As mentioned at the beginning of the thread - I don't have that plugin...
    I've given you all the basic troubleshooting that I can think of.

  7. #7
    Join Date
    May 2014
    Posts
    8

    Thanks for being willing to help, I appreciate that.

  8. #8

    What are your inputs? Did you test receiving the fields and rows as a simple data frame and printing it out, to check that the basic I/O is working?

    I had to fight with it last week and finally managed to get it to work. Just ask if you need help.

  9. #9
    Join Date
    May 2014
    Posts
    8

    Hi, I couldn't manage to upload the input files, so here is their content:

    file.csv:
    -----------------
    a;b
    2;1
    2;2
    2;3

    -----------------

    Rfile.R
    -----------------
    c <- a + b


    OUTPUT <- list("c"=c)

    -----------------


    I only used a "Detect Empty Stream" step. Between the CSV input and the "Execute R Script" step, 4 lines were read (R=4) and 3 were written (W=3). The R step picked up the a and b fields as Input, but I had to type in c myself as the Output field. I tried printing with another script; as soon as I can test printing with this one (to a file, I guess) I will report the results.

    Thanks!!!

  10. #10

    Originally posted by Mim:
        Hi, I couldn't manage to upload the input files, so here is their content:

        file.csv:
        -----------------
        a;b
        2;1
        2;2
        2;3

        -----------------

        Rfile.R
        -----------------
        c <- a + b


        OUTPUT <- list("c"=c)

        -----------------


        I only used a "Detect Empty Stream" step. Between the CSV input and the "Execute R Script" step, 4 lines were read (R=4) and 3 were written (W=3). The R step picked up the a and b fields as Input, but I had to type in c myself as the Output field. I tried printing with another script; as soon as I can test printing with this one (to a file, I guess) I will report the results.

        Thanks!!!

    I am not sure I understand you correctly.
    I'd run a test (sorry if this is what you are already doing).
    Use a standard step to read the CSV and connect that step to the R script step. In the input tab of the R script step, define your CSV stream as dataframe1, and in the script pane simply write the line "dataframe1". Make sure you uncheck "output as String".

    If you need more precise debugging, you can check "output as String" and use print(whatever) instead.
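
    To rule out problems in the script itself, the same check can be run in plain R outside Spoon. This is only a sketch: it assumes the step hands the incoming rows to R as a data frame called dataframe1 (the name from the test above), and it builds that data frame by hand from the file.csv content posted earlier, which is not necessarily how the plug-in does it.

```r
# Stand-in for the rows Spoon would pass in. Building it with read.csv is an
# assumption for testing outside Pentaho; the name dataframe1 is the one
# typed into the step's input tab.
dataframe1 <- read.csv(text = "a;b\n2;1\n2;2\n2;3", sep = ";")

# Basic I/O check: show what arrived and which types the columns have.
print(dataframe1)
str(dataframe1)

# Only once the input looks right, add the actual logic (c = a + b per row).
dataframe1$c <- dataframe1$a + dataframe1$b
print(dataframe1$c)  # 3 4 5
```

    If this runs cleanly in an R console but the step still returns nothing, the problem is more likely in the step configuration than in the script.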

    When you run the test, the step checks the input fields of your stream; don't worry if it uses mock values. That lets it define your output fields automatically.

    Then connect the R script output to a Dummy or Write to log step to check that these fields are passing through.

    I was asking because I am not sure whether you are reading your CSV through a module from within the R script itself. If so, I recommend designing it as I explained, so you can better check when and what is passing through the flow. It works for me.

    When that works, you can start implementing the logic. Be careful, because I got stuck here. Since the R script test function uses mock data to define the output types, you can run into errors depending on the function you are using, because that function may not logically accept such data (binomial, etc.). In that case you'll need to play around a bit: define the output fields manually, then test by running the transformation directly to get the expected behaviour.

    Once that is working, you may still have problems with casts and with the fields of the resulting data frame (stringsAsFactors), but first make sure the basic I/O is working; you may not need any additional effort.
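
    The cast problem mentioned above looks roughly like this in plain R. It is a sketch with made-up values (the id column is hypothetical), and it forces stringsAsFactors = TRUE so that it behaves the same on every R version; whether the plug-in converts strings to factors on its own is something to check with str().

```r
# A text column that arrives as a factor (hypothetical example values).
df <- data.frame(id = c("10", "20", "30"), stringsAsFactors = TRUE)
is.factor(df$id)  # TRUE

# Casting a factor straight to numeric returns the internal level codes.
wrong <- as.numeric(df$id)                 # 1 2 3

# Going through character first recovers the real values.
right <- as.numeric(as.character(df$id))   # 10 20 30

print(wrong)
print(right)
```

    Passing stringsAsFactors = FALSE when building a data frame inside the script avoids the conversion in the first place.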
