
View Full Version : Error with custom MapReduce output format



better
08-02-2013, 06:30 PM
I used the wiki example files from here:

http://wiki.pentaho.com/display/BAD/Using+a+Custom+Input+or+Output+Format+in+Pentaho+MapReduce

But when I run it in my environment (cdh4.2), I get the following error. Any suggestions?

2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : class YearMultipleTextOutputFormat not org.apache.hadoop.mapred.OutputFormat
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : java.lang.RuntimeException: class YearMultipleTextOutputFormat not org.apache.hadoop.mapred.OutputFormat
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.apache.hadoop.conf.Configuration.setClass(Configuration.java:1661)
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.apache.hadoop.mapred.JobConf.setOutputFormat(JobConf.java:678)
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.hadoop.shim.common.ConfigurationProxy.setOutputFormat(ConfigurationProxy.java:84)
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.job.entries.hadooptransjobexecutor.JobEntryHadoopTransJobExecutor.execute(JobEntryHadoopTransJobExecutor.java:672)
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.job.Job.execute(Job.java:589)
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.job.Job.execute(Job.java:728)
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.job.Job.execute(Job.java:443)
2013/08/02 18:07:35 - Pentaho MapReduce - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.job.Job.run(Job.java:363)
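For what it's worth, the message "class YearMultipleTextOutputFormat not org.apache.hadoop.mapred.OutputFormat" comes from Hadoop's Configuration.setClass(), which rejects any class that is not assignable to the expected interface — often a sign that the custom format was compiled against the new org.apache.hadoop.mapreduce API instead of the old org.apache.hadoop.mapred API that PDI's setOutputFormat() uses. A plain-Java sketch of that check (stand-in types, no Hadoop on the classpath, purely illustrative):

```java
// Hedged, plain-Java reproduction of the check behind the posted error.
// The interface and class below are stand-ins, not the real Hadoop types.
public class SetClassCheck {
    interface OutputFormat {}          // stand-in for org.apache.hadoop.mapred.OutputFormat
    static class NewApiFormat {}       // stand-in for a format built against the new mapreduce API

    // Mirrors Configuration.setClass(): reject classes not assignable to the expected interface.
    static void setClass(Class<?> theClass, Class<?> xface) {
        if (!xface.isAssignableFrom(theClass)) {
            throw new RuntimeException(theClass + " not " + xface.getName());
        }
    }

    public static void main(String[] args) {
        try {
            setClass(NewApiFormat.class, OutputFormat.class);
        } catch (RuntimeException e) {
            // Same "class X not Y" shape as the posted log line
            System.out.println(e.getMessage());
        }
    }
}
```

If that is the cause, rebuilding the output format against the old-API classes (e.g. the org.apache.hadoop.mapred.lib hierarchy) would be the thing to try.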

mattb_pdi
08-02-2013, 10:51 PM
The log says you have PDI 4.4.0-stable, which did not include the "cdh42" hadoop configuration. In the plugins/pentaho-big-data-plugin/plugin.properties file, what is the "active.hadoop.configuration" property set to? I can run the examples fine, but I have PDI EE 4.4.0 and I added the "cdh42" hadoop configuration (aka "shim") to my PDI. This shim (and others) is available in PDI 4.4.2, but I'm not sure if there's been a Community Edition release yet.
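For anyone following along: the shim is selected by the active.hadoop.configuration key in plugins/pentaho-big-data-plugin/plugin.properties, which is an ordinary Java properties file. A minimal sketch of reading that key (the inlined contents are illustrative, not from a real install):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class ShimCheck {
    // Read the active shim name from properties-file text.
    // In a real install you would load plugins/pentaho-big-data-plugin/plugin.properties instead.
    static String activeShim(String propsText) {
        try {
            Properties p = new Properties();
            p.load(new StringReader(propsText));
            return p.getProperty("active.hadoop.configuration");
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative file contents
        System.out.println(activeShim("active.hadoop.configuration=cdh42\n")); // prints: cdh42
    }
}
```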

better
08-12-2013, 10:54 PM
The active.hadoop.configuration is set to cdh42. So you're saying that cdh42 and CE 4.4.0 are not supported?

better
08-13-2013, 10:57 AM
I just tried the "supported" cdh4 hadoop shim and get the same error. Has anyone successfully run the example using CE 4.4.0 + cdh4?

huxinglong
08-27-2013, 01:13 AM
I'm using PDI 4.4.0 and CDH 4.3 and the same error occurred. Also, when I use the "Pentaho MapReduce" step in PDI, an error like this occurs:
ERROR 27-08 12:46:14,690 - Pentaho MapReduce - Pathname ......:/work/pentaho/mapreduce/5.0.0-M1-TRUNK-SNAPSHOT-cdh42/lib/xom-1.1.jar:/work/pentaho/mapreduce/5.0.0-M1-TRUNK-SNAPSHOT-cdh42/lib/xpp3_min-1.1.4c.jar:/work/pentaho/mapreduce/5.0.0-M1-TRUNK-SNAPSHOT-cdh42/lib/xstream-1.4.2.jar is not a valid DFS filename.
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:176)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:820)
at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:730)
at org.apache.hadoop.mapreduce.v2.util.MRApps.addToClasspathIfNotJar(MRApps.java:230)
at org.apache.hadoop.mapreduce.v2.util.MRApps.setClasspath(MRApps.java:188)
at org.apache.hadoop.mapred.YARNRunner.createApplicationSubmissionContext(YARNRunner.java:413)
at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:288)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:391)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
at org.pentaho.hadoop.shim.common.CommonHadoopShim.submitJob(CommonHadoopShim.java:228)
at org.pentaho.di.job.entries.hadooptransjobexecutor.JobEntryHadoopTransJobExecutor.execute(JobEntryHadoopTransJobExecutor.java:821)
at org.pentaho.di.job.Job.execute(Job.java:589)
at org.pentaho.di.job.Job.execute(Job.java:728)
at org.pentaho.di.job.Job.execute(Job.java:728)
at org.pentaho.di.job.Job.execute(Job.java:443)
at org.pentaho.di.job.Job.run(Job.java:363)
How can I fix this error?
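One hedged observation about that log: the rejected "pathname" is an entire ':'-joined classpath string being handed to HDFS as a single filename, which DistributedFileSystem rejects. Splitting on the separator shows the jars that were presumably meant to be added one at a time (paths shortened, purely illustrative):

```java
// Hedged sketch: the log's "is not a valid DFS filename" names a whole classpath
// string, not one file. Splitting it shows the individual jar entries.
public class ClasspathSplit {
    // Split a ':'-joined classpath string into its individual entries.
    static String[] entries(String classpath) {
        return classpath.split(":");
    }

    public static void main(String[] args) {
        // Shortened, illustrative version of the string from the posted log
        String logged = "/work/pentaho/mapreduce/lib/xom-1.1.jar"
                + ":/work/pentaho/mapreduce/lib/xpp3_min-1.1.4c.jar"
                + ":/work/pentaho/mapreduce/lib/xstream-1.4.2.jar";
        for (String entry : entries(logged)) {
            System.out.println(entry); // each entry on its own is a plausible DFS path
        }
    }
}
```

If that reading is right, the problem would be in how the PDI shim builds the job classpath for YARN, not in the transformation itself.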