View Full Version : How does Kettle write the Hadoop job XML?

05-22-2012, 04:07 AM
Hi everyone:
When I use Kettle to develop a Hadoop MapReduce job, I run into an issue. I have configured mapred.local.dir in the mapred-site.xml file (on both the namenode and the datanodes). When I run Pig directly (without Kettle), the mapred.local.dir in the job XML matches what I configured. But when I use Kettle to submit a Pig or MapReduce job, mapred.local.dir is not what I configured; it is always ${hadoop.tmp.dir}/mapred/local. I am confused by this. What should I do to make the job XML that Kettle submits contain the value I want?
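For reference, the property I set in mapred-site.xml looks roughly like this (the directory path below is just an illustration, not my actual path):

```xml
<!-- mapred-site.xml on the cluster nodes -->
<property>
  <name>mapred.local.dir</name>
  <!-- example path; the real value is site-specific -->
  <value>/data/mapred/local</value>
</property>
```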
Thanks to everyone who can offer help and advice.

05-23-2012, 12:30 AM
You can set any custom properties under the User Defined tab of the Pentaho MapReduce or Hadoop Job Executor steps. You can also make sure the Hadoop configuration in mapred-site.xml is on the Kettle classpath (any directory under $KETTLE/libext should work for you). We include a mapred-site.xml at $KETTLE/libext/bigdata/pigConf/mapred-site.xml, so you can edit that one if you'd rather not set the properties for every Hadoop job.
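For example, to use the second approach, you could add the property to the bundled file at $KETTLE/libext/bigdata/pigConf/mapred-site.xml, something like this (the value shown is only a placeholder; substitute your own directory):

```xml
<!-- $KETTLE/libext/bigdata/pigConf/mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <!-- placeholder value; set this to the directory your cluster uses -->
    <value>/data/mapred/local</value>
  </property>
</configuration>
```

Equivalently, under the User Defined tab you would add an entry with name mapred.local.dir and that same value, which applies it only to that one job rather than to every job Kettle submits.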

05-23-2012, 03:18 AM
Thanks for your help, jganoff. That's very helpful.