Following the tutorial - Loading Data into HDFS



prav2828
12-11-2012, 02:06 AM
I have a two-node CDH4 cluster. I installed Pentaho on my box and followed the steps in the Pentaho wiki to integrate it with the CDH4 cluster. Now I am trying to run the Loading Data into HDFS job by following the tutorial, but the job fails with the following error log:

2012/12/11 10:39:56 - Spoon - Starting job...
2012/12/11 10:39:56 - load_hive - Start of job execution
2012/12/11 10:39:56 - load_hive - Starting entry [Hadoop Copy Files]
2012/12/11 10:39:56 - Hadoop Copy Files - Starting ...
2012/12/11 10:39:56 - Hadoop Copy Files - Processing row source File/folder source : [hdfs://myserver:8020/user/pdi/weblogs/parse] ... destination file/folder : [hdfs://myserver:8020/user/hive/warehouse/weblogs]... wildcard : [part-*]
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : Can not copy file/folder [hdfs://myserver:8020/user/pdi/weblogs/parse] to [hdfs://myserver:8020/user/hive/warehouse/weblogs]. Exception : [
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) :
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : Unable to get VFS File object for filename 'hdfs://myserver:8020/user/pdi/weblogs/parse' : Could not resolve file "hdfs://myserver:8020/user/pdi/weblogs/parse".
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) :
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : ]
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : org.pentaho.di.core.exception.KettleFileException:
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) :
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : Unable to get VFS File object for filename 'hdfs://myserver:8020/user/pdi/weblogs/parse' : Could not resolve file "hdfs://myserver:8020/user/pdi/weblogs/parse".
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) :
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) :
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : at org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:161)
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : at org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:104)
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.ProcessFileFolder(JobEntryCopyFiles.java:375)
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.execute(JobEntryCopyFiles.java:324)
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : at org.pentaho.di.job.Job.execute(Job.java:528)
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : at org.pentaho.di.job.Job.execute(Job.java:667)
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : at org.pentaho.di.job.Job.execute(Job.java:393)
2012/12/11 10:39:56 - Hadoop Copy Files - ERROR (version 4.3.0-GA, build 16753 from 2012-04-18 21.39.30 by buildguy) : at org.pentaho.di.job.Job.run(Job.java:313)
2012/12/11 10:39:56 - load_hive - Finished job entry [Hadoop Copy Files] (result=[false])
2012/12/11 10:39:56 - load_hive - Job execution finished
2012/12/11 10:39:56 - Spoon - Job has ended.


Where did I go wrong?
Please help me solve this issue.