Hitachi Vantara Pentaho Community Forums

Thread: Scheduling options on a cluster

  1. #1
    Join Date
    Mar 2012
    Posts
    1

    Scheduling options on a cluster

    Greetings!

    New to Pentaho, but very interested in the potential.

    I have a question regarding scheduling on a Hadoop cluster: what are my options?

    In particular, I would be interested in any scenarios for scheduling Pentaho ETL jobs using Oozie.

    Thanks,

    Dave

  2. #2
    dmoran Guest


    Pentaho ETL jobs can be executed from the command line, so it should be easy to schedule them to run from Oozie or any other scheduler.

    We use "pan" to run transforms:
    http://wiki.pentaho.com/display/EAI/...+Documentation

    We use "kitchen" to run jobs:
    http://wiki.pentaho.com/display/EAI/...+Documentation

    Some advanced concepts for scheduling:
    http://diethardsteiner.blogspot.com/...uling-and.html
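As a rough illustration of the command-line route mentioned above, here is a minimal Python sketch that a scheduler could call to launch a job with "kitchen" or a transform with "pan". The launcher path, file paths, and parameter name in the usage note are hypothetical, and the sketch assumes the usual `-file=`, `-level=`, and `-param:NAME=VALUE` options and that an exit code of 0 means success; please check the documentation for your Kettle version before relying on it.

```python
import subprocess
from typing import Dict, List, Optional


def build_kettle_command(
    launcher: str,
    kettle_file: str,
    log_level: str = "Basic",
    params: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build the argument list for kitchen (jobs) or pan (transforms).

    Assumes the -file=, -level= and -param:NAME=VALUE option style.
    """
    cmd = [launcher, f"-file={kettle_file}", f"-level={log_level}"]
    # Named parameters are passed one per argument.
    for name, value in (params or {}).items():
        cmd.append(f"-param:{name}={value}")
    return cmd


def run_kettle(cmd: List[str]) -> int:
    """Run the command and return its exit code (0 is assumed to mean success)."""
    return subprocess.run(cmd, check=False).returncode
```

A scheduler wrapper might call `run_kettle(build_kettle_command("/opt/pdi/kitchen.sh", "/etl/load.kjb", params={"RUN_DATE": "2012-03-01"}))` and treat a non-zero return value as a failed run. Building the arguments as a list, rather than one shell string, avoids quoting problems when paths or parameter values contain spaces.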

    You can also use the Pentaho BI Server to schedule ETL jobs that have been configured to run on the cluster. I am currently working on the roadmap for Kettle as it relates to Big Data, and we have been wanting to look into Oozie integration but haven't heard much from people with real use cases. I am very interested in your specific use case and what your "wish list" for Oozie integration is. Are you already using Oozie and wanting to orchestrate from there? Are you assuming you need to use it but are open to other technology?

    Thanks,
    Doug

