Hitachi Vantara Pentaho Community Forums

Thread: Problem with "Using Pentaho MapReduce to Generate an Aggregate Dataset" tutorial

  1. #1
    Join Date
    Nov 2010
    Posts
    16

    Problem with "Using Pentaho MapReduce to Generate an Aggregate Dataset" tutorial

    Hi all!

    I am running Kettle 4.3 on a VMware virtual machine downloaded from cloudera.com: CDH3 Update 3 (cdh3u3) https://ccp.cloudera.com/display/SUPPORT/CDH+Downloads

    I followed the configuration steps provided at http://wiki.pentaho.com/display/BAD/...adoop+Versions (except for point C, because it says "For Hadoop 0.20.205", which, as I understand it, does not apply to my setup).

    I downloaded the example at http://wiki.pentaho.com/display/BAD/...regate+Dataset and configured the Pentaho MapReduce step.

    When I execute the job, everything seems to work fine until it logs "Configuring Pentaho MapReduce job to use Kettle installation ...", and then it doesn't go any further. The application stays responsive and there are no error messages in the log:

    INFO 17-10 16:19:28,777 - Spoon - Starting job...
    INFO 17-10 16:19:28,793 - aggregate_mr - Start of job execution
    INFO 17-10 16:19:28,873 - aggregate_mr - Starting entry [Pentaho MapReduce]
    INFO 17-10 16:19:29,570 - aggregate_mapper - Dispatching started for transformation [aggregate_mapper]
    INFO 17-10 16:19:29,668 - aggregate_reducer - Dispatching started for transformation [aggregate_reducer]
    INFO 17-10 16:19:29,696 - Pentaho MapReduce - Configuring for Hadoop distribution: Cloudera
    INFO 17-10 16:19:30,401 - Pentaho MapReduce - Cleaning output path: hdfs://localhost:8020/weblogs/aggregate_mr
    INFO 17-10 16:19:30,446 - Pentaho MapReduce - Configuring Pentaho MapReduce job to use Kettle installation from /opt/pentaho/mapreduce/4.3.0

    Any idea what the cause could be?

    Thank you very much!

    Alex
    Last edited by iShotAlex; 10-17-2012 at 10:46 AM.

  2. #2

    Resolved: the file path was incorrect.

    I had used hdfs://localhost:8020/weblogs/aggregate_mr instead of hdfs://localhost:8020/user/pdi/weblogs/aggregate_mr

    Suggestion: Kettle could throw an error instead of sitting idle when the path doesn't exist.

    Cheers

    Alex
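
Until Kettle reports this itself, a pre-flight check run before launching the job can catch a wrong path early. Below is a minimal sketch in Python, not part of Kettle or the tutorial. It assumes the `hadoop` command is on the PATH and relies on `hadoop fs -test -e`, which exits with status 0 when the path exists. The function names `hdfs_path_exists` and `check_paths` are hypothetical, chosen only for illustration.

```python
import subprocess


def hdfs_path_exists(path, runner=subprocess.call):
    """Return True if `hadoop fs -test -e <path>` exits with status 0.

    `runner` is injectable so the check can be exercised without a cluster;
    by default it shells out to the `hadoop` command (assumed to be on PATH).
    """
    return runner(["hadoop", "fs", "-test", "-e", path]) == 0


def check_paths(paths, exists=hdfs_path_exists):
    """Raise FileNotFoundError naming every path that does not exist."""
    missing = [p for p in paths if not exists(p)]
    if missing:
        raise FileNotFoundError("Missing HDFS paths: " + ", ".join(missing))


if __name__ == "__main__":
    # Hypothetical usage: list the HDFS paths your own job depends on.
    check_paths(["hdfs://localhost:8020/user/pdi/weblogs"])
```

With a check like this in front of the job, a mistyped location such as /weblogs in place of /user/pdi/weblogs stops the run with a clear message rather than leaving the job idle.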
