Hitachi Vantara Pentaho Community Forums

Thread: Hadoop 2.7.1 and PDI

  1. #1
    Join Date
    Apr 2012
    Posts
    253

    Default Hadoop 2.7.1 and PDI

    I was just minding my own business when my boss snuck in and pushed me down the Hadoop rabbit hole.

    I've configured the HDP22 shim to work with my remote cluster. I have no problems moving files to HDFS and accessing them, and I've even compiled and submitted Spark jar jobs to the Spark cluster.

    I've now tried to use the Pentaho MapReduce step in PDI 6.0. Setup was moderately straightforward, but now I'm running into the following error. I'm fairly sure it's because the cluster is running Apache Hadoop 2.7.1 rather than Hortonworks, so it obviously can't find /hdp/apps/2.2.0.0-2041/mapreduce/mapreduce.tar.gz. I haven't been able to figure this one out, and I'm not even sure what would go into that tarball (other than shared jars). I don't know whether I should try to recompile the shim from 2.6.0 to 2.7.1 or whether there is something easier to try. Any suggestions would be great.

    Thanks!

    2015/12/10 13:11:54 - Spoon - Starting job...
    2015/12/10 13:11:54 - hadoop_mapreduce_nonjava - Start of job execution
    2015/12/10 13:11:54 - hadoop_mapreduce_nonjava - Starting entry [Pentaho MapReduce]
    2015/12/10 13:11:54 - Pentaho MapReduce - Cleaning output path: hdfs://<address>/devadm/user/<dir>/cities/out/city_ultimate_counts.txt
    2015/12/10 13:11:54 - Pentaho MapReduce - Installing Kettle to /opt/pentaho/mapreduce/6.0.0.0-353-6.0.0.0-353-hdp22
    2015/12/10 13:36:03 - Pentaho MapReduce - Kettle successfully installed to /opt/pentaho/mapreduce/6.0.0.0-353-6.0.0.0-353-hdp22
    2015/12/10 13:36:04 - Pentaho MapReduce - Configuring Pentaho MapReduce job to use Kettle installation from /opt/pentaho/mapreduce/6.0.0.0-353-6.0.0.0-353-hdp22
    2015/12/10 13:36:04 - Pentaho MapReduce - mapreduce.application.classpath: classes/,$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.2.0.0-2041/hadoop/lib/hadoop-lzo-0.6.0.2.2.0.0-2041.jar:/etc/hadoop/conf/secure
    2015/12/10 13:36:05 - Pentaho MapReduce - ERROR (version 6.0.0.0-353, build 1 from 2015-10-07 13.27.43 by buildguy) : File does not exist: hdfs://<address>/hdp/apps/2.2.0.0-2041/mapreduce/mapreduce.tar.gz
    2015/12/10 13:36:05 - Pentaho MapReduce - ERROR (version 6.0.0.0-353, build 1 from 2015-10-07 13.27.43 by buildguy) : java.io.FileNotFoundException: File does not exist: hdfs://<address>/hdp/apps/2.2.0.0-2041/mapreduce/mapreduce.tar.gz
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:137)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.fs.AbstractFileSystem.resolvePath(AbstractFileSystem.java:460)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.fs.FileContext$24.next(FileContext.java:2137)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.fs.FileContext$24.next(FileContext.java:2133)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.fs.FileContext.resolve(FileContext.java:2133)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.fs.FileContext.resolvePath(FileContext.java:595)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:753)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:435)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    2015/12/10 13:36:05 - Pentaho MapReduce - at java.security.AccessController.doPrivileged(Native Method)
    2015/12/10 13:36:05 - Pentaho MapReduce - at javax.security.auth.Subject.doAs(Subject.java:422)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.pentaho.hadoop.shim.hdp22.HadoopShim.submitJob(HadoopShim.java:83)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.pentaho.hadoop.shim.hdp22.delegating.DelegatingHadoopShim.submitJob(DelegatingHadoopShim.java:112)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.pentaho.di.job.entries.hadooptransjobexecutor.JobEntryHadoopTransJobExecutor.execute(JobEntryHadoopTransJobExecutor.java:869)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.pentaho.di.job.Job.execute(Job.java:730)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.pentaho.di.job.Job.execute(Job.java:873)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.pentaho.di.job.Job.execute(Job.java:546)
    2015/12/10 13:36:05 - Pentaho MapReduce - at org.pentaho.di.job.Job.run(Job.java:435)
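    For reference, the stack trace shows the failure comes out of JobSubmitter.addMRFrameworkToDistributedCache, which resolves the mapreduce.application.framework.path property, so I'm guessing the shim's mapred-site.xml sets it to that HDP tarball. This is the kind of override I'm thinking of trying in the shim's mapred-site.xml (just a sketch; the property name is standard Hadoop, but I haven't verified that clearing it works against my cluster):

    ```xml
    <!-- Sketch for the hdp22 shim's mapred-site.xml (not verified on my cluster).
         With an empty framework path, JobSubmitter should skip the distributed-cache
         tarball lookup that is failing above and use the cluster's own classpath. -->
    <property>
      <name>mapreduce.application.framework.path</name>
      <value></value>
    </property>
    ```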

  2. #2
    Join Date
    Apr 2012
    Posts
    253

    Default

    I'm going to try changing mapreduce.application.classpath, since that could be part of the issue. I'm not sure it will fix the file-not-found error, however.
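    For the record, the Apache Hadoop 2.x default for that property (from mapred-default.xml) is below. Swapping the shim's HDP-specific value (the $PWD/mr-framework and /usr/hdp entries in the log above) for this default is what I'm going to try first; whether it also cures the tarball lookup is a separate question:

    ```xml
    <!-- Apache Hadoop 2.x default from mapred-default.xml; would replace the
         HDP-specific classpath in the shim's mapred-site.xml (untested here) -->
    <property>
      <name>mapreduce.application.classpath</name>
      <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
    ```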


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.