lyllbl
05-29-2012, 03:22 AM
Hi everyone:
I have a file that needs to be joined to itself 15 times. The file is 300M and has only two fields, parent and child. But when I execute the first join, it fails. This is my logic (attachment 8809).
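To make the goal concrete, here is a minimal sketch in plain Python of what I mean by one join step: each (parent, child) row is matched against the rows whose parent equals that child. The function name `self_join` and the sample rows are made up for illustration only; in the real job this is done with Kettle steps, not with this code.

```python
from collections import defaultdict

def self_join(edges):
    """Join (parent, child) rows to themselves on child == parent."""
    # Index the rows by parent so each lookup is a dictionary access.
    children_of = defaultdict(list)
    for parent, child in edges:
        children_of[parent].append(child)

    joined = []
    for parent, child in edges:
        # Inner join: rows whose child has no children are dropped.
        for grandchild in children_of.get(child, []):
            joined.append((parent, child, grandchild))
    return joined

# Hypothetical sample data, not taken from the real 300M file.
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "e")]
result = self_join(edges)
print(result)
```

Repeating this step on its own output is what I mean by joining the file 15 times.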
Hadoop reports the error "failed to report status for 600 seconds. Killing!", so I guess Kettle may need to download the file to the local machine first (I tested this: downloading the file directly through Kettle takes more than 2 hours). I then cut the records down to 10 and ran the job again (I changed only the Hadoop File Input file, not the map input file, which is still the 300M file). That run was successful.
After that I changed my logic to this (attachment 8810).
I just changed Hadoop File Input to MapReduce Input (the map input file is again the 300M file). The MapReduce job runs, but the output file is empty. (The same logic with Hadoop File Input and Hadoop File Output gives the correct result, see attachment 8811.)
I saw that the Hadoop "Combine output records" counter is 0.
My questions are:
1. Does Kettle execute the Hadoop File Input/Output steps on the local machine?
2. Is it impossible to join two MapReduce Input steps in one Map/Reduce transformation? (In other words, can a Map/Reduce transformation have only one MapReduce Input and one MapReduce Output?)
3. If I want to implement this logic, what should I change, or what else can I do?