Hitachi Vantara Pentaho Community Forums

Thread: Pentaho MapReduce

  1. #1

    Pentaho MapReduce

    Hi, I am using Pentaho MapReduce to parse weblog data in MapR, following the sample document on wiki.pentaho.com. When I run the job, it keeps running without showing any error, and the Logging tab shows "Pentaho-Mapreduce- Setup complete:0.0 Mapper Completion:0.0 Reducer completion:0.0". The same message is repeated several times and the job keeps running without any result. When I stop it manually, it ends with "Pentaho Mapreduce failed", still without any error. Please help. I am using Hadoop 0.20.0 and the Pentaho 4.3 trial version. I have attached the .ktr and .kjb files. Thank you.
    Attached Files
    Mr. Manasa Ranjan Panda
    Hyderabad,INDIA
    E.Mail-:manasaranjan.panda@gmail.com
    M.No-+91 9392923252

  2. #2
    Join Date
    Nov 2011
    Posts
    18

    Manasa,

    A few things to check:

    1. Are you able to run the sample MapReduce job that ships with MapR from the command line?
    hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/hadoop-0.20.2-dev-examples.jar wordcount /myvolume/in /myvolume/out
    If this also hangs, the problem is with your MapR cluster, not Pentaho, and you need to verify that the cluster is running properly.
    2. Have you followed the Hadoop Node Configuration steps detailed here: http://wiki.pentaho.com/display/BAD/...ntaho+for+MapR
    3. Are there any errors in your Spoon log that are not appearing in the Logging tab? On Windows the Spoon log is in C:\Users\<username>\AppData\Local\Temp and is named spoon*.log. On Linux the Spoon log is usually in /tmp. You may have to close Spoon before these logs are written.
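
    For step 3, a minimal sketch of how the Spoon logs could be located and scanned. It assumes the logs match spoon*.log in the system temp directory, as described above. The function names and the "ERROR" / "Exception" markers are illustrative choices, not anything defined by Pentaho.

```python
import glob
import os
import tempfile


def find_spoon_logs(temp_dir=None):
    """Return paths matching spoon*.log in temp_dir, newest first.

    temp_dir defaults to the system temp directory (assumption: this is
    where Spoon writes its logs on this machine).
    """
    temp_dir = temp_dir or tempfile.gettempdir()
    paths = glob.glob(os.path.join(temp_dir, "spoon*.log"))
    return sorted(paths, key=os.path.getmtime, reverse=True)


def error_lines(path, markers=("ERROR", "Exception")):
    """Return the lines of one log file that contain any of the markers."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [
            line.rstrip("\n")
            for line in handle
            if any(marker in line for marker in markers)
        ]


if __name__ == "__main__":
    for log_path in find_spoon_logs():
        print(log_path)
        for line in error_lines(log_path):
            print("    " + line)
```

    Run it after closing Spoon, since the logs may not be fully written before then.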

    Hope this helps,
    Chris

  3. #3

    PentahoMapReduce

    Thank you for the suggestions.

    1. I ran the hadoop-examples jar from the command prompt. There was no error and it ran fine, which means the MapR cluster is running fine.

    2. I have installed hadoop-0.20.2 on Windows 7 using Cygwin, and I tried the "configure Pentaho for MapR" steps from wiki.pentaho.com. During configuration I updated launcher.properties as described there, deleted hadoop-0.20.2.jar, and copied hadoop-0.20.2-dev-core.jar into $PDI_HOME/libext. After this step I could not find maprfs-0.1.jar. The wiki's Hadoop configuration also says to add "/opt/pentaho/pentaho-mapreduce/lib/*" to HADOOP_CLASSPATH in hadoop-env.sh. However, since I am running Hadoop on Windows through Cygwin and have only extracted the Pentaho 4.3 trial version, I cannot find any folder called pentaho-mapreduce. Which path should I add to HADOOP_CLASSPATH in hadoop-env.sh? I also have not done the next step, updating "/conf/mapred-site.xml", because, as I said, I do not have the "pentaho/pentaho-mapreduce" path.

    3. I tried again to run the attached Pentaho MapReduce job, and now it gives me this error:
    error: java.lang.ClassNotFoundException: org.pentaho.di.core.exception.KettleException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    2012/04/19 12:19:42 - at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    2012/04/19 12:19:42 - at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    2012/04/19 12:19:42 - at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    2012/04/19 12:19:42 - at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    2012/04/19 12:19:42 - at java.lang.Class.forName0(Native Method)
    2012/04/19 12:19:42 - at java.lang.Class.forName(Class.java:247)
    2012/04/19 12:19:42 - at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
    2012/04/19 12:19:42 - at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
    2012/04/19 12:19:42 - at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:833)
    2012/04/19 12:19:42 - at org.apache.hadoop.mapred.JobConf.getMapRunnerClass(JobConf.java:790)
    2012/04/19 12:19:42 - at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    2012/04/19 12:19:42 - at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    2012/04/19 12:19:42 - at org.apache.hadoop.mapred.Child.main(Child.java:170)


    Please help.

  4. #4
    Join Date
    Nov 2011
    Posts
    18

    Manasa,

    I incorrectly thought you said you were using MapR Hadoop. If you are using Apache Hadoop 0.20.2 you are not using MapR and should instead follow the instructions here: http://wiki.pentaho.com/display/BAD/...adoop+Versions. You do need to follow the Hadoop Node Configuration steps. Note the PHD component these steps tell you to install is a separate download and install from the PDI 4.3 Preview release. This install will create the Pentaho MapReduce folder. The link to the download is in the instructions. I have never used Hadoop with Cygwin so I am not sure if there are any quirks to the install.

    When PDI 4.3 goes GA the PHD install will no longer be necessary, but it is required for the 4.3 Preview release.

    You should also follow the Hadoop How-Tos instead of the MapR ones: http://wiki.pentaho.com/display/BAD/Hadoop
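
    A ClassNotFoundException raised inside the map task generally means the class is not on the task JVM's classpath. As a rough check, the sketch below lists which jars in a directory contain a given class. The helper name is made up for illustration, and the directory in the usage section is the path quoted earlier in this thread, so adjust it to wherever the Pentaho MapReduce libraries actually live on your node.

```python
import glob
import os
import zipfile


def jars_containing(class_name, lib_dir):
    """Return the jars directly inside lib_dir that contain class_name.

    class_name is a fully qualified Java class name, for example
    "org.pentaho.di.core.exception.KettleException".
    """
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar_path in sorted(glob.glob(os.path.join(lib_dir, "*.jar"))):
        try:
            with zipfile.ZipFile(jar_path) as jar:
                if entry in jar.namelist():
                    matches.append(jar_path)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives.
            continue
    return matches


if __name__ == "__main__":
    # Assumption: this is the lib directory named earlier in the thread.
    lib_dir = "/opt/pentaho/pentaho-mapreduce/lib"
    wanted = "org.pentaho.di.core.exception.KettleException"
    found = jars_containing(wanted, lib_dir)
    print(found if found else "no jar in " + lib_dir + " contains " + wanted)
```

    If no jar is reported, the directory on HADOOP_CLASSPATH is probably missing the Kettle libraries or points at the wrong place.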

    Chris

  5. #5

    Hi, I'm trying the same thing as Manasa and I hit exactly the same problem. I followed the instructions you suggested but still get the same exception.
    Also, as I understood from these guidelines, PHD is no longer necessary. I just installed Apache Hadoop 0.20.2 to run locally as a single node.
    Does anyone have any insight on this? Manasa, did you manage to solve it?
    Thanks
    Last edited by kepha; 07-26-2012 at 09:22 PM.

  6. #6

    Does anyone have any insight into this problem? I still have the same issue.
    I run hadoop-0.20.2 locally, and I followed the instructions for this example:
    http://wiki.pentaho.com/display/BAD/...se+Weblog+Data.
    *In some parts the guidelines differ from the images. I suppose this happened when the text was changed. Just to let you know.

    Since I run hadoop-0.20.2, I assume no other Kettle configuration is needed, as stated here:
    http://wiki.pentaho.com/display/BAD/...adoop+Versions

    I tested Hadoop independently and it works fine.

    I do not see what the problem could be. I also tried very simple mapper transformations and the problem is still the same.
    Here is Spoon's log:

    2012/07/27 11:16:31 - Pentaho MapReduce 2 - Setup Complete: 0.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2012/07/27 11:16:36 - Pentaho MapReduce 2 - Setup Complete: 0.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2012/07/27 11:16:41 - Pentaho MapReduce 2 - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : [FAILED] -- Task: 0 Attempt: 0 Event: 1
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : Error: java.lang.ClassNotFoundException: org.pentaho.di.core.exception.KettleException
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.security.AccessController.doPrivileged(Native Method)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.lang.Class.forName0(Native Method)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at java.lang.Class.forName(Class.java:264)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:833)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.apache.hadoop.mapred.JobConf.getMapRunnerClass(JobConf.java:790)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR (version 4.3.0-stable, build 16786 from 2012-04-24 14.11.32 by buildguy) : at org.apache.hadoop.mapred.Child.main(Child.java:170)
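
    For long logs like this one, a small sketch can pull out the last reported progress and the names of any missing classes. The two patterns are based only on the line formats shown in this thread, so other Pentaho versions may log differently. The function name is illustrative.

```python
import re

# Patterns taken from the log lines quoted in this thread (assumption:
# other versions may format these messages differently).
PROGRESS = re.compile(
    r"Setup Complete: (?P<setup>[\d.]+) "
    r"Mapper Completion: (?P<mapper>[\d.]+) "
    r"Reducer Completion: (?P<reducer>[\d.]+)"
)
MISSING_CLASS = re.compile(
    r"java\.lang\.ClassNotFoundException: (?P<name>[\w.$]+)"
)


def summarize(log_text):
    """Return (last_progress, missing_classes) for a Spoon log.

    last_progress is a (setup, mapper, reducer) tuple of floats, or None
    if no progress line was found. missing_classes is a sorted list.
    """
    last_progress = None
    missing = set()
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            last_progress = (
                float(match["setup"]),
                float(match["mapper"]),
                float(match["reducer"]),
            )
        match = MISSING_CLASS.search(line)
        if match:
            missing.add(match["name"])
    return last_progress, sorted(missing)


if __name__ == "__main__":
    sample = (
        "2012/07/27 11:16:46 - Pentaho MapReduce 2 - Setup Complete: 100.0 "
        "Mapper Completion: 0.0 Reducer Completion: 0.0\n"
        "2012/07/27 11:16:46 - Pentaho MapReduce 2 - ERROR : Error: "
        "java.lang.ClassNotFoundException: "
        "org.pentaho.di.core.exception.KettleException\n"
    )
    print(summarize(sample))
```

    On the log above, this would show that setup finished, no mapper progress was made, and the only missing class is the Kettle exception class.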

  7. #7
    Join Date
    Aug 2010
    Posts
    87

    Hi kepha,

    You're correct that the PHD is no longer required. You appear to be running an older version of the Big Data Plugin that relies upon it though. Please try the latest PDI 4.3.0 stable release. You should see log messages indicating the Kettle environment is being staged into HDFS before Pentaho MapReduce begins execution.

    Best,
    Jordan
