Hitachi Vantara Pentaho Community Forums
Results 1 to 8 of 8

Thread: jbpm workflow for kettle entries

  1. #1
    dhartford Guest

    Default jbpm workflow for kettle entries

    Hello all,
    Just bringing up this topic for discussion (rumor of a summit in
    January).

    I'm no expert at either jBPM or Kettle, but I am familiar with and
    learning/using both. The .kjb and .ktr files created with Kettle have
    similarities to the jBPM processdefinition.xml files, and the
    individual Kettle entry types are similar to the different node types
    within jBPM.

    The Kettle designers are EXCELLENT and I would not want to break the
    designer tools, but it would be interesting to try changing the
    back-end hops to jBPM transitions and some of the basic entry types to
    jBPM nodes and see the results.

    Why?
    *jBPM is a Business Process Management solution. Doing ETL on top of
    BPM makes sense, especially if some people are already using the same
    BPM or other such solutions.

    *Could integrate ETL directly into a full BPM process, instead of ETL
    as a separate solution/process.

    *BAM/reporting tools built on top of jBPM could be used to watch Kettle
    ETL performance (again, particularly individual entries across an
    entire process).

    *Once fleshed out, we could focus more on the ETL node/entry pieces
    instead of the workflow mechanics (i.e. jBPM focuses on parallel
    processing, async processes, logging of individual nodes, etc., while
    individual Kettle entries/nodes focus on their job).

    Technical pros/cons:
    *jBPM understands variables that are carried from
    node-to-node/entry-to-entry.
    *jBPM understands sub-processes, so transformations-in-jobs and
    jobs-in-jobs would already be supported.
    *Support for both application embedding and enterprise (Java EE) use.

    *However, datastreams may be a challenge (I haven't looked at the
    Kettle code, so I'm not sure).
    *There is a lack of separation between what would be considered a
    'job' versus a 'transformation'.
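    The variable-passing and sub-process points above can be sketched in a few lines. This is a deliberately simplified, self-contained illustration of the idea; the class names (`Node`, `ProcessDef`, `WorkflowSketch`) are invented for this sketch and are not the real jBPM API:

    ```java
    import java.util.*;

    // Minimal sketch of the idea above: a shared context map (like a process
    // instance's variables) is carried from entry to entry, and a process is
    // itself a Node, so sub-processes (jobs-in-jobs) nest for free.
    interface Node {
        void execute(Map<String, Object> context);
    }

    class ProcessDef implements Node {
        private final List<Node> nodes = new ArrayList<>();
        ProcessDef add(Node n) { nodes.add(n); return this; }
        // Walking the "transitions" hands the same context to every node,
        // so variables set by one entry are visible to later entries.
        public void execute(Map<String, Object> context) {
            for (Node n : nodes) n.execute(context);
        }
    }

    public class WorkflowSketch {
        public static Map<String, Object> run() {
            ProcessDef extract = new ProcessDef()
                .add(ctx -> ctx.put("rowCount", 100));   // pretend extract entry
            ProcessDef job = new ProcessDef()
                .add(extract)                            // sub-process nested in the job
                .add(ctx -> ctx.put("status",
                    ((Integer) ctx.get("rowCount")) > 0 ? "loaded" : "empty"));
            Map<String, Object> context = new HashMap<>();
            job.execute(context);
            return context;
        }
        public static void main(String[] args) {
            // Variables set inside the sub-process survive into later entries.
            System.out.println(WorkflowSketch.run());
        }
    }
    ```

    The point of the sketch is only that both "variables carried entry-to-entry" and "sub-processes" fall out of one composable node abstraction, which is what makes the jBPM mapping tempting.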

    Comments?


    --~--~---------~--~----~------------~-------~--~----~
    You received this message because you are subscribed to the Google Groups "kettle-developers" group.
    To post to this group, send email to kettle-developers (AT) googlegroups (DOT) com
    To unsubscribe from this group, send email to kettle-developers-unsubscribe (AT) g...oups (DOT) com
    For more options, visit this group at http://groups.google.com/group/kettle-developers?hl=en
    -~----------~----~----~----~------~----~------~--~---

  2. #2
    Matt Casters Guest

    Default Re: jbpm workflow for kettle entries

    "When all you have is a hammer, you see nails everywhere."

    You haven't really explained why you think jBPM would do a better job
    of executing a transformation than what we have now.

    That's the problem that I see at the moment. Also, the Pentaho
    platform has workflow management built into it, so you could use that.
    It's not jBPM, but from what I've read jBPM would not be hard to
    implement.

    Personally, I'm not against the idea, but it would shoot the complexity
    of job handling through the roof, we would have all kinds of
    installation issues, and there would be a learning curve for the whole
    development team for months on end. And after all that, I'm not sure
    we would have a better ETL solution.

    All the best,

    Matt





    -------
    Matt Casters
    Chief Data Integration
    Pentaho Open Source Business Intelligence



  3. #3
    Darren Hartford Guest

    Default RE: jbpm workflow for kettle entries

    I never said it would do a better job of executing a transformation --
    nor do I expect it would. The only reason I brought it up is that if a
    number of people are already using a certain library or tool to solve a
    problem in multiple scenarios (example: Quartz), it is easier to learn
    that one library or tool instead of re-learning yet another
    workflow/scheduling library/reporting tool/etc.

    Shark workflow is actually pretty easy, I definitely give it that.
    However, following the multiple-scenario route: for solving business
    problems, most workflow/BPM solutions would go toward either a BPEL- or
    XPDL/jPDL-based solution, and the people solving the problem may
    already have that domain knowledge in place (jBPM = both, but I'm
    thinking of jPDL).

    And, as for what it would bring to the table -- as I mentioned,
    reporting and BAM tools built on top of jBPM would automatically
    improve Kettle/ETL processes and business domain knowledge.
    Integration with existing processes would be huge, as you wouldn't have
    to build an additional framework/reporting/log-management/custom-code
    layer to tie an ETL into the rest of a process. I'm definitely not
    recommending abandoning Shark or whatever is already in place until
    value is proven from an applied standpoint rather than an academic one.

    The big one that I may need to re-emphasize is integration. I'm not
    sure how many are familiar with data/document capture workflows, but the
    concept is you have a workflow system that moves the token/process
    around to (at times very) different entries/nodes/tasks. The workflow
    itself is only an enabler, the actual work is done by the
    entry/node/task. The beautiful part is that you can use other
    entry/node/tasks that already exist for that workflow system without
    having to custom-write one, especially if you were not expecting it
    (i.e. doing an ETL process that suddenly needs to include a
    user-interacted queue of individual items (say 5 out of 100) that need
    to be examined or QA'd).

    The "when all you have is a hammer, you see nails everywhere" quote is
    not always negative. When you are familiar with a tool that can
    accomplish the task at hand, but have to use another hammer because the
    *toolkit* requires it, you need to re-examine what can solve the most
    problems without cluttering your garage (or having a ton of tool
    manuals lying around that you have to reference for different tools).

    If integration with jBPM is a no-show, it is what it is, just brought it
    up for discussion. The existing solution works great.

    -D




  4. #4
    Matt Casters Guest

    Default RE: jbpm workflow for kettle entries

    > i.e. doing an ETL process that suddenly needs to include a
    > user-interacted queue of individual items (say 5 out of 100) that
    > need to be examined or QA'd.

    This is actually one of the pitfalls when designing a warehouse. A
    warehouse should reflect the content of the source systems as well as
    possible, keep it historically correct, etc.
    Having users come into play to correct, verify, or validate data is a
    very slippery and dangerous road, and only in the rarest cases would I
    recommend doing such a thing.
    I don't bring this up lightly, as I have seen several multi-man-year
    DWH projects come crashing down because of this.
    The problem is that it very soon becomes a "Data was bad, we can't load
    it because it wasn't verified/cleaned/checked by user X" type of
    situation. Ugh.

    > If integration with jBPM is a no-show, it is what it is, just brought
    > it up for discussion.

    Actually Darren, nothing is a no-show. I know of jBPM (I heard one of
    the main devs is a fellow Belgian) and have seen it at work before.
    I also know it's not the simplest thing to set up, the backend needs
    persistence, etc. It's not a 5-minute hack for sure :-)

    If I need to take something with me from all this, to the next iteration of
    the job execution system, then it is that we could use some abstraction so
    that we can plug it into any workflow system later on. And if I'm not
    mistaken, that is exactly what the Pentaho platform did, so I guess I should
    put it on the agenda of the next Pentaho development summit at the end of
    January.
    To my knowledge, in the long (looong) run we are looking for a solution to
    unify jobs, workflows and action sequences, so it's probably going to be
    something like a real workflow management system that will have to handle
    this anyway.
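    The abstraction Matt describes could look something like the sketch below: job execution goes through one small interface, and the built-in executor or a workflow-backed one plugs in behind it. Every name here (`JobExecutionEngine`, the engine classes, the `"nightly-load"` job name) is invented for illustration; nothing below is an actual Kettle, Pentaho, or jBPM API:

    ```java
    import java.util.*;

    // Hypothetical seam: Kettle's callers depend only on this interface,
    // so the execution backend can be swapped without touching job logic.
    interface JobExecutionEngine {
        String execute(String jobName, Map<String, String> parameters);
    }

    // The existing built-in executor would be one implementation...
    class BuiltInEngine implements JobExecutionEngine {
        public String execute(String jobName, Map<String, String> parameters) {
            return "built-in:" + jobName;    // run hops/entries as today
        }
    }

    // ...and a workflow-backed executor another, chosen by configuration.
    class WorkflowBackedEngine implements JobExecutionEngine {
        public String execute(String jobName, Map<String, String> parameters) {
            return "workflow:" + jobName;    // would delegate to jBPM/Shark here
        }
    }

    public class EngineAbstractionSketch {
        static JobExecutionEngine pick(boolean useWorkflow) {
            return useWorkflow ? new WorkflowBackedEngine() : new BuiltInEngine();
        }
        public static void main(String[] args) {
            System.out.println(pick(false).execute("nightly-load", Map.of()));
            System.out.println(pick(true).execute("nightly-load", Map.of()));
        }
    }
    ```

    The design point is only that once such a seam exists, "plug it into any workflow system later on" becomes a configuration choice rather than a rewrite.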

    All the best,

    Matt





  5. #5
    Darren Hartford Guest

    Default RE: jbpm workflow for kettle entries

    > If I need to take something with me from all this, to the
    > next iteration of the job execution system, then it is that
    > we could use some abstraction so that we can plug it into any
    > workflow system later on. And if I'm not mistaken, that is
    > exactly what the Pentaho platform did, so I guess I should
    > put it on the agenda of the next Pentaho development summit
    > at the end of January.
    > To my knowledge, in the long (looong) run we are looking for
    > a solution to unify jobs, workflows and action sequences, so
    > it's probably going to be something like a real workflow
    > management system that will have to handle this anyway.


    I think you summed this up nicely Matt :-)

    Changing the parent workflow/job-execution system, or adding an
    abstraction for use as workflow plugins, would indeed take a very long
    time. I was thinking of a two-year-some-odd plan, so it's definitely
    not a quick hack -- but that two-year timeline needs to start
    somewhere, and hopefully this discussion will be the catalyst.

    Thanks for your time, and have fun at the summit!
    -D


  6. #6
    James Dixon Guest

    Default Fwd: jbpm workflow for kettle entries

    Here are my thoughts on this subject as we have looked into this
    topic before. I think your interest in WorkFlow Engine (WFE) / ETL
    integration is well founded.

    First the bad news...

    You correctly identified some of the cons and challenges to this
    approach.

    Workflow engines (WFEs) are based on one of two design models: state
    machines or event transitions. The definitions of the workflows of
    each of these are very different, although both can be represented
    graphically. The Kettle engine is not based on either of these models
    (rightly so because it is an ETL engine not a workflow engine). When
    Kettle transformations and the two kinds of workflows are displayed
    graphically they do look very similar but the similarity ends there.

    We looked into this because the Pentaho action sequences use
    streaming data like Kettle does. After a lot of discussion and head-
    scratching we came to the conclusion that WFEs are not suitable for
    executing these kinds of processes. There seem to be several (bad)
    options if you try to use a WFE to execute a stream-based process.

    1) Treat each row of data as a new instance of the workflow. The
    contents of each row are passed from node to node. This is a huge
    overhead and makes any kind of multi-row operation very difficult /
    impossible to implement.

    2) Treat each row-set as a new instance of the workflow. The entire
    row-set is passed from node to node. This is clearly not scalable.

    3) Attempt to pass a stream between nodes in the workflow. This is a
    no go for many engines. WFEs are designed to be persistent and
    transaction oriented. The entire WFE can be restarted and the state
    of your instance of the workflow will be exactly as it was. A live
    stream cannot be persisted.

    Another issue is that WFEs require a single starting-point for a
    workflow and a single exit point. Streamed ETL data does not behave
    this way.

    We have not completely closed the door on using a WFE to execute
    action sequences but there are many technical obstacles to
    implementing this.
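    The gap between option 1 and what an ETL engine actually does can be made concrete with a toy sketch. This is purely illustrative (the class and method names are invented): one version allocates a workflow-instance context per row, the other streams the rows through a single pipeline where a multi-row operation like a sum is trivial:

    ```java
    import java.util.*;

    public class RowPerInstanceSketch {
        // Option 1 above: every row becomes its own workflow instance, so the
        // engine allocates (and, in a real WFE, would persist) a full context
        // and state record per row -- and no node ever sees more than one row,
        // so multi-row operations have nowhere to live.
        static int runAsWorkflowInstances(List<Integer> rows) {
            int instances = 0;
            for (int row : rows) {
                Map<String, Object> instanceContext = new HashMap<>(); // one per row
                instanceContext.put("row", row);
                instanceContext.put("state", "node-1"); // state the engine must track
                instances++;
            }
            return instances;
        }

        // What a streaming ETL engine does instead: one pipeline, all rows
        // flowing through it, so an aggregate over the whole set is natural.
        static int runAsStream(List<Integer> rows) {
            return rows.stream().mapToInt(Integer::intValue).sum();
        }

        public static void main(String[] args) {
            List<Integer> rows = List.of(1, 2, 3, 4, 5);
            System.out.println(runAsWorkflowInstances(rows)); // 5 contexts for 5 rows
            System.out.println(runAsStream(rows));            // one pipeline, sum = 15
        }
    }
    ```

    Scale the five rows up to millions and the per-instance bookkeeping (plus the persistence a real WFE adds per instance) is exactly the overhead described above.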

    Now the better news...

    Workflow-enabling a Kettle transformation (instead of a single Kettle
    step) has none of the problems above and works great, enabling you to
    perform ETL transformations as part of your workflow. If you need a
    process that involves lots of workflow-type steps, you might find that
    you end up with a larger number of transformations that contain fewer
    steps each. We have already done this kind of integration with the
    Pentaho platform and Shark, and jBPM support is on our roadmap (we
    prototyped it a long time ago). A direct jBPM-Kettle integration
    would not be difficult to do either. We will focus first on
    integrating jBPM with the Pentaho platform, as that enables all the
    Pentaho components (e.g. JFreeReport, Mondrian, etc.) to be used
    within jBPM workflows.
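    A sketch of why this level of granularity works: the workflow node treats the entire transformation as one opaque action, so no rows or streams ever cross a workflow transition -- only a summary result enters the workflow's persistable context. The names below (`executeTransformationNode`, the `Supplier` standing in for a whole transformation run) are simplified stand-ins, not the real jBPM ActionHandler or Kettle APIs:

    ```java
    import java.util.*;
    import java.util.function.*;

    public class TransformationNodeSketch {
        // Stand-in for a workflow action: run the whole transformation and
        // record only its outcome in the workflow context. The stream stays
        // entirely inside the transformation, so nothing unpersistable ever
        // needs to survive a workflow restart.
        static void executeTransformationNode(Supplier<Integer> transformation,
                                              Map<String, Object> workflowContext) {
            int rowsWritten = transformation.get();   // entire ETL run happens here
            workflowContext.put("rowsWritten", rowsWritten);
            workflowContext.put("outcome", rowsWritten > 0 ? "success" : "empty");
        }

        public static Map<String, Object> run() {
            Map<String, Object> ctx = new HashMap<>();
            // Opaque to the workflow: internally this could stream millions of
            // rows; the workflow only ever sees the summary numbers.
            executeTransformationNode(() -> 42, ctx);
            return ctx;
        }

        public static void main(String[] args) {
            System.out.println(TransformationNodeSketch.run());
        }
    }
    ```

    The workflow can then branch on `outcome` like any other variable, which is the "transformations with fewer steps each, wired together by the workflow" shape described above.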

    James Dixon

    Chief Geek, Pentaho Corp
    Citadel International, Suite 340

  7. #7
    Matt Casters Guest

    Default RE: jbpm workflow for kettle entries

    Hi James,

    It will be interesting to have a chat about this in a few weeks.
    However, I think it's the job engine we're talking about, not the
    transformation engine.

    All the best,


    Matt
    ____________________________________________
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters (AT) pentaho (DOT) org
    Tel. +32 (0) 486 97 29 37




  8. #8
    James Guest

    Default Re: jbpm workflow for kettle entries

    Hi Matt,

    It seemed to be a transformation-level and job-level discussion to me.
    What I was trying to say was that using jBPM as the engine for
    executing a transformation is not a good fit whereas using it as the
    engine at the job level is a good fit and is on the roadmap for the
    Pentaho platform.

    James




Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.