Hadoop Job Executor error (PDI CE 4.3)



kenneththo
08-08-2012, 09:18 PM
Hi all,
I am trying to get a simple wordcount job to work, but without luck. I'd greatly appreciate it if anyone could shed some light. Many thanks!

I set up the Simple Configuration by specifying the location of the jar, and the command line arguments are: hdfs://localhost:9000/user/hadoop/input hdfs://localhost:9000/user/hadoop/output

In the code, I added the following to specify the location of the jobtracker:
config.set("mapred.job.tracker", "localhost:9001");

Here is the main method:

public static void main(String[] args) throws Exception {
    Configuration config = new Configuration();
    config.set("mapred.job.tracker", "localhost:9001");
    Job job = new Job(config, "test");
    job.setJarByClass(LineIndexer.class);
    job.setMapperClass(LineIndexMapper.class);
    job.setCombinerClass(LineIndexCombiner.class);
    job.setReducerClass(LineIndexReducer.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}


But I am getting the following error. Can someone please help?

java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.pentaho.di.job.entries.hadoopjobexecutor.JobEntryHadoopJobExecutor$1.run(JobEntryHadoopJobExecutor.java:357)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.UnknownHostException: unknown host: bogus
at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:850)
at org.apache.hadoop.ipc.Client.call(Client.java:720)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at $Proxy16.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(FileInputFormat.java:357)
at LineIndexer.main(LineIndexer.java:166)
... 6 more

jganoff
08-14-2012, 10:53 AM
That "bogus" host is coming from $KETTLE/libext/bigdata/pigConf/core-site.xml. It's likely the value of the fs.default.name property, which you're not overriding in your Configuration object.
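
For illustration, here is a minimal sketch of one way to get a value for that override, assuming the first command line argument is a fully qualified HDFS path as in the original post. The class name DefaultFsFromPath and the helper defaultFsFor are hypothetical, and the Hadoop call is shown only as a comment so the snippet compiles without Hadoop on the classpath. I have not tested this against PDI CE 4.3.

```java
import java.net.URI;

public class DefaultFsFromPath {

    // Hypothetical helper: returns the scheme://authority part of a fully
    // qualified path, which is the form fs.default.name expects.
    static String defaultFsFor(String qualifiedPath) {
        URI uri = URI.create(qualifiedPath);
        if (uri.getScheme() == null || uri.getAuthority() == null) {
            throw new IllegalArgumentException(
                "Not a fully qualified URI: " + qualifiedPath);
        }
        return uri.getScheme() + "://" + uri.getAuthority();
    }

    public static void main(String[] args) {
        // Falls back to the input path from the original post if no argument is given.
        String input = args.length > 0
            ? args[0]
            : "hdfs://localhost:9000/user/hadoop/input";
        String defaultFs = defaultFsFor(input);

        // In the job's main method, before "new Job(config, ...)", the override
        // would then look like this (Hadoop API, not compiled here):
        //   config.set("fs.default.name", defaultFs);
        System.out.println("fs.default.name=" + defaultFs);
    }
}
```

Hard-coding config.set("fs.default.name", "hdfs://localhost:9000") next to the existing mapred.job.tracker line should have the same effect. Deriving it from the argument just keeps the namenode address in one place.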