Hitachi Vantara Pentaho Community Forums

Thread: Spoon Jobs Plugins -- parallel paths

  1. #11
    Matt Casters Guest

    Re: Spoon Jobs Plugins -- parallel paths

    There is some info in Pentaho Kettle Solutions, or here:
    http://wiki.pentaho.com/display/EAI/...ript+job+entry


    2011/10/3 Joseph Chambers <joseph.chambers (AT) gmail (DOT) com>

    > Agree, will refactor once I get all the pieces working I need.
    >
    > Is there some place I can look to see the function definitions of the
    > Result class?
    >
    >
    >
    > On Mon, Oct 3, 2011 at 2:44 PM, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
    >
    >> Actually, we just added a "Job Executor" step in 4.3.0-M1 so the
    >> possibilities have increased a bit.
    >>
    >> As a general piece of advice, non-specific to Kettle: don't try to do
    >> everything in one transformation or job. Make things modular to keep a nice
    >> overview.
    >> Think about the idea of staging the data into a buffer (file) or queue
    >> (database table). Then you can scale as far as you like, for example like
    >> Diethard documented a while back:
    >> http://diethardsteiner.blogspot.com/...designing.html
    >>
    >> Matt
    >>
    >>
    >> 2011/10/3 Joseph Chambers <joseph.chambers (AT) gmail (DOT) com>
    >>
    >>> Yes, we started initially using steps, but needed a little more flow
    >>> control. Forgive my newbie questions, I am new to Spoon. We may need to
    >>> look back at steps (the lack of flow control might have been a knowledge
    >>> issue on my part), but we need a way to do the majority of things in
    >>> sequential order, each step waiting for the previous one, but also split
    >>> off into multiple paths when needed.
    >>>
    >>> If I can detect the number of inbound and outbound paths within the
    >>> plugin, I can handle what I need in the Jobs. Once we have the Jobs going I
    >>> will see if I can solve the flow issues we were having within the steps. My
    >>> project manager had run into those and told me to do the job plugins. I
    >>> had suggested "Wait on steps" to solve it, but he wanted something with
    >>> less user interaction.
    >>>
    >>> Also, just curious: is there a way to display data in a Job (open a
    >>> window with the results in a table) when it finishes? Right now I am
    >>> writing the data that I receive back from the server I'm calling to a CSV
    >>> file. I know there is in Steps/Transformations, and I've thought about
    >>> calling a Transformation from the Job to handle the display portion.
    >>>
    >>>
    >>>
    >>>
    >>>
    >>> On Mon, Oct 3, 2011 at 1:49 PM, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
    >>>
    >>>> I actually don't mind the questions about plugin development.
    >>>>
    >>>> Anyway, most people would write a step plugin for parallel work. All
    >>>> the questions you ask then have easy answers.
    >>>>
    >>>> Matt
    >>>>
    >>>>
    >>>> 2011/10/3 Joseph Chambers <joseph.chambers (AT) gmail (DOT) com>
    >>>>
    >>>>> Is there a group dedicated to developing plug-ins? I figured the
    >>>>> Development board was for both the core and the development of plugins.
    >>>>>
    >>>>> Thanks for the suggestions. The plug-ins fall outside the typical
    >>>>> use of Spoon as I understand it. What I'm doing in the multiple paths is
    >>>>> splitting off and pre-processing (across a cluster of servers) multiple
    >>>>> groups of data (this isn't a traditional database that I'm interfacing
    >>>>> with). The pre-processing then returns proprietary code that I must have in
    >>>>> later steps to utilize the preprocessed data.
    >>>>>
    >>>>> From a programming point of view, if I have 3 paths going into one step
    >>>>> within the Job, I assume only one object of the class is created. So if I
    >>>>> use a variable to switch my logic, I can merge the data together as it comes
    >>>>> in until I've reached the number of paths and then continue.
    >>>>>
    >>>>> Is there a programmatic way in a plugin to detect the number of
    >>>>> outgoing or inbound paths attached? I think I can handle the other issues
    >>>>> but I don't want this value to be a user input or hard coded.
    >>>>>
    >>>>>
    >>>>>
    >>>>> On Mon, Oct 3, 2011 at 12:43 PM, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
    >>>>>
    >>>>>> No special reason Andy, just old habits of a Kettle guy formerly known
    >>>>>> as DBA.
    >>>>>>
    >>>>>>
    >>>>>> 2011/10/3 Andy Grohe <agrohe21 (AT) gmail (DOT) com>
    >>>>>>
    >>>>>>> Since we are asking the questions, I would normally say use
    >>>>>>> "serialize to file" which keeps kettle data structures intact vs going out
    >>>>>>> to files or db.
    >>>>>>>
    >>>>>>> @matt, curious why you suggest db vs the native kettle serialize
    >>>>>>> inputs/outputs?
    >>>>>>>
    >>>>>>> Sent from my iPhone
    >>>>>>>
    >>>>>>> On Oct 3, 2011, at 11:33 AM, Matt Casters <mcasters (AT) pentaho (DOT) org>
    >>>>>>> wrote:
    >>>>>>>
    >>>>>>> Hi Joe,
    >>>>>>>
    >>>>>>> If you join different data streams, you can indeed use a step like
    >>>>>>> Merge Join.
    >>>>>>> However, if you want to simply merge the data from 2 or more copies
    >>>>>>> of the same step you don't need to do anything as it's standard behavior of
    >>>>>>> a step.
    >>>>>>>
    >>>>>>> In the case of job entries (not clear what you are building) it's
    >>>>>>> indeed hard to have parallel entries add to the result row list.
    >>>>>>> However, perhaps it would be more efficient to add the rows to a
    >>>>>>> database staging table or another similar temporary container.
    >>>>>>>
    >>>>>>> Matt
    >>>>>>>
    >>>>>>>
    >>>>>>> 2011/10/3 Joe Chambers <joseph.chambers (AT) gmail (DOT) com>
    >>>>>>>
    >>>>>>>> I am developing a set of plugins to interface with a new data
    >>>>>>>> platform. I've got it working in a linear fashion. However, I want to
    >>>>>>>> run some of the tasks in parallel or multiple paths/threads. I see
    >>>>>>>> you can run multiple paths, but rejoining them and having data passed
    >>>>>>>> to the merge step seems to be an issue. I am using the prevResult and
    >>>>>>>> returning the Result in the execute function to carry my data between
    >>>>>>>> steps. The problem is that the merge/join is just called by the thread
    >>>>>>>> that finishes first. Is there a way to have some type of wait loop so
    >>>>>>>> that I can merge the data from all the previous steps going into the
    >>>>>>>> merge step?
    >>>>>>>>
    >>>>>>>> I'm looking at using a static variable to enter a waiting loop that
    >>>>>>>> would block all other calls until all the data is available. Each
    >>>>>>>> additional call to this step would, based on this static variable, go
    >>>>>>>> into a merge function that would merge its data into a static variable
    >>>>>>>> and then, once the count has reached the number of paths, continue.
    >>>>>>>> For this I need a way to write a split step that can somehow
    >>>>>>>> detect the number of exiting paths. Is this possible?
    >>>>>>>>
    >>>>>>>> There has to be a better way but I don't see a construct to do it.
    >>>>>>>>
    >>>>>>>> I know this doesn't quite fit in with Spoon's existing infrastructure
    >>>>>>>> but I've been tasked with doing this.
    >>>>>>>>
    >>>>>>>> Thanks,
    >>>>>>>> Joseph
    >>>>>>>>
    >>>>>>>> --
    >>>>>>>> You received this message because you are subscribed to the Google
    >>>>>>>> Groups "kettle-developers" group.
    >>>>>>>> To post to this group, send email to
    >>>>>>>> kettle-developers (AT) googlegroups (DOT) com.
    >>>>>>>> To unsubscribe from this group, send email to
    >>>>>>>> kettle-developers+unsubscribe (AT) g...oups (DOT) com.
    >>>>>>>> For more options, visit this group at
    >>>>>>>> http://groups.google.com/group/kettle-developers?hl=en.
    >>>>>>>>
    >>>>>>>>
    >>>>>>>
    >>>>>>>
    >>>>>>> --
    >>>>>>> Matt Casters <mcasters (AT) pentaho (DOT) org>
    >>>>>>> Chief Data Integration, Kettle founder, Author of Pentaho Kettle
    >>>>>>> Solutions <http://www.amazon.com/Pentaho-Kettle-Solutions-Building-Integration/dp/0470635177>
    >>>>>>> (Wiley <http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470635177.html>)
    >>>>>>> Fonteinstraat 70, 9400 OKEGEM - Belgium - Cell : +32 486 97 29 37
    >>>>>>> Pentaho : The Commercial Open Source Alternative for Business
    >>>>>>> Intelligence
    >>>>>>>
    >>>>>>>
    >>>>>>>
    >>>>>>
    >>>>>>
    >>>>>>
    >>>>>>
    >>>>>
    >>>>>
    >>>>
    >>>>
    >>>>
    >>>>
    >>>
    >>>

    >>
    >>
    >>
    >>

    >
    >
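The staging idea quoted above (parallel work writes its rows into a temporary container, and the next stage reads them once everything has finished) can be sketched in plain Java. This is only an illustration under assumptions: the class `StagingSketch` and the method `runAndStage` are hypothetical names, an in-memory queue stands in for the staging table or file, and no Kettle classes are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration; it uses no Kettle classes.
final class StagingSketch {

    // Runs the tasks on a fixed-size thread pool, appends each task's rows to a
    // shared staging queue, and returns the staged rows once all tasks are done.
    static List<String> runAndStage(List<Callable<List<String>>> tasks, int threads) {
        Queue<String> staging = new ConcurrentLinkedQueue<>();
        List<Callable<Void>> writers = new ArrayList<>();
        for (Callable<List<String>> task : tasks) {
            writers.add(() -> {
                staging.addAll(task.call());
                return null;
            });
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every writer has completed.
            for (Future<Void> done : pool.invokeAll(writers)) {
                done.get(); // rethrows a failure from the corresponding task
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while staging", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a parallel task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(staging);
    }

    public static void main(String[] args) {
        List<Callable<List<String>>> tasks = new ArrayList<>();
        tasks.add(() -> Arrays.asList("a1", "a2"));
        tasks.add(() -> Arrays.asList("b1"));
        System.out.println(runAndStage(tasks, 2).size() + " rows staged"); // prints "3 rows staged"
    }
}
```

The order of the staged rows depends on thread timing, so a downstream reader should not rely on it. With a real staging table or file, the reader would simply be a separate job or transformation that starts after the writers have finished.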





  2. #12
    Joseph Chambers Guest

    Re: Spoon Jobs Plugins -- parallel paths

    Thanks, I'll look at that. Is there a way inside the execute function to tell
    it to die, but not die as failed? I've got what I want working, but I am
    having to call result.setResult(false) on the ones that finish first and
    result.setResult(true) on the last "thread" to come into my pathMerge
    plugin. This works, but it shows the red "No" symbol, and I would rather not
    do that.
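The "only the last arrival continues" logic described here can be sketched in plain Java as an arrival counter. This is a minimal sketch under assumptions, not the actual pathMerge plugin: `PathMergeSketch` and `arrive` are hypothetical names, the number of inbound paths is simply passed in by the caller, and nothing here touches the Kettle `Result` class or decides how an early arrival should be reported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration; it uses no Kettle classes.
final class PathMergeSketch {
    private final int expectedPaths;              // assumption: supplied by the caller
    private final List<String> merged = new ArrayList<>();
    private int arrived = 0;

    PathMergeSketch(int expectedPaths) {
        if (expectedPaths < 1) {
            throw new IllegalArgumentException("expectedPaths must be at least 1");
        }
        this.expectedPaths = expectedPaths;
    }

    // Called once per inbound path. Every caller contributes its rows; only the
    // caller that completes the count gets the merged rows back and may continue.
    synchronized Optional<List<String>> arrive(List<String> rows) {
        merged.addAll(rows);
        arrived++;
        if (arrived == expectedPaths) {
            return Optional.of(new ArrayList<>(merged));
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        PathMergeSketch merge = new PathMergeSketch(3);
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<Future<Optional<List<String>>>> futures = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            final String row = "path-" + i;
            futures.add(pool.submit(() -> merge.arrive(Arrays.asList(row))));
        }
        for (Future<Optional<List<String>>> f : futures) {
            // Exactly one of the three calls sees the completed count.
            f.get().ifPresent(rows ->
                System.out.println("last arrival merged " + rows.size() + " rows"));
        }
        pool.shutdown();
    }
}
```

Because early arrivals return immediately instead of blocking, no thread sits in a wait loop. The open question from the post remains: how an early arrival should end without being shown as failed.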


  3. #13
    Joseph Chambers Guest

    Default Re: Spoon Jobs Plugins -- parallel paths

    Can you elaborate on what result.setExitStatus(int) does? I can't seem
    to find any documentation on the allowed values to pass to it.

    On Mon, Oct 3, 2011 at 6:37 PM, Joseph Chambers
    <joseph.chambers (AT) gmail (DOT) com> wrote:

    > Thanks, I'll look at that. Is there a way inside the execute function to
    > tell it to die, but not die as failed? I've got what I want working, but I
    > have to call result.setResult(false) on the paths that finish first and
    > result.setResult(true) on the last "thread" to come into my pathMerge
    > plugin. This works, but it shows the red "No" symbol, and I would rather
    > not do that.
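
The "last thread in continues" behaviour described in the quoted message can be pictured as a small arrival counter. The sketch below is only an illustration under stated assumptions: PathMergeBarrier, arrive and mergedRows are hypothetical names (not part of the Kettle API), rows are modelled as plain strings, and the number of inbound paths is assumed to be known up front, which is exactly the value the thread is still trying to detect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the Kettle API: collects rows from
// several parallel paths and tells only the last arrival to continue.
final class PathMergeBarrier {
    private final int expectedPaths; // assumption: known in advance
    private final AtomicInteger arrived = new AtomicInteger();
    private final List<String> merged =
            Collections.synchronizedList(new ArrayList<>());

    PathMergeBarrier(int expectedPaths) {
        if (expectedPaths < 1) {
            throw new IllegalArgumentException("expectedPaths must be >= 1");
        }
        this.expectedPaths = expectedPaths;
    }

    // Called once per inbound path. The rows are merged first, then the
    // arrival is counted, so the caller that sees the full count also
    // sees every other path's rows. Returns true only for that caller.
    boolean arrive(List<String> rows) {
        merged.addAll(rows);
        return arrived.incrementAndGet() == expectedPaths;
    }

    // Snapshot of everything merged so far.
    List<String> mergedRows() {
        synchronized (merged) {
            return new ArrayList<>(merged);
        }
    }
}
```

In a job entry, the boolean returned by arrive(...) could be what gets passed to result.setResult(...), which matches the workaround described above. It does not change how Spoon draws the entries that return false.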

  4. #14
    Matt Casters Guest

    Default Re: Spoon Jobs Plugins -- parallel paths

    It sets an exit value from executing a shell script.
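
As a rough illustration of what "an exit value from executing a shell script" means in Java terms, the sketch below runs a command and records the process exit code on a result-like object. ShellOutcome is a hypothetical stand-in written for this example, not Kettle's Result class, and it assumes an "sh" executable is on the PATH. Treating 0 as success is the usual shell convention, not something stated in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for illustration only; not Kettle's Result class.
final class ShellOutcome {
    private int exitStatus;
    private boolean success;

    void setExitStatus(int exitStatus) { this.exitStatus = exitStatus; }
    int getExitStatus() { return exitStatus; }
    void setResult(boolean success) { this.success = success; }
    boolean getResult() { return success; }

    // Runs the script through "sh -c" (assumed to be available) and
    // copies the process exit code into the outcome object.
    static ShellOutcome run(String script) {
        try {
            Process process = new ProcessBuilder(Arrays.asList("sh", "-c", script))
                    .inheritIO()
                    .start();
            int code = process.waitFor();
            ShellOutcome outcome = new ShellOutcome();
            outcome.setExitStatus(code);
            outcome.setResult(code == 0); // shell convention: 0 means success
            return outcome;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the script", e);
        }
    }
}
```

Under this reading, the int is simply whatever code the script returned rather than a value from a fixed list of constants.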

    2011/10/4 Joseph Chambers <joseph.chambers (AT) gmail (DOT) com>

    > Can you elaborate on what result.setExitStatus(int) does? I can't seem
    > to find any documentation on the allowed values to pass to it.
    >>> Pentaho : The Commercial Open Source Alternative for Business
    >>> Intelligence
    >>>
    >>>
    >>> --
    >>> You received this message because you are subscribed to the Google Groups
    >>> "kettle-developers" group.
    >>> To post to this group, send email to kettle-developers (AT) googlegroups (DOT) com.
    >>> To unsubscribe from this group, send email to
    >>> kettle-developers+unsubscribe (AT) g...oups (DOT) com.
    >>> For more options, visit this group at
    >>> http://groups.google.com/group/kettle-developers?hl=en.
    >>>

    >>
    >>

    > --
    > You received this message because you are subscribed to the Google Groups
    > "kettle-developers" group.
    > To post to this group, send email to kettle-developers (AT) googlegroups (DOT) com.
    > To unsubscribe from this group, send email to
    > kettle-developers+unsubscribe (AT) g...oups (DOT) com.
    > For more options, visit this group at
    > http://groups.google.com/group/kettle-developers?hl=en.
    >




    --
    Matt Casters <mcasters (AT) pentaho (DOT) org>
    Chief Data Integration, Kettle founder, Author of Pentaho Kettle
    Solutions<http://www.amazon.com/Pentaho-Kettle-Solutions-Building-Integration/dp/0470635177>
    (Wiley <http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470635177.html>)
    Fonteinstraat 70, 9400 OKEGEM - Belgium - Cell : +32 486 97 29 37
    Pentaho : The Commercial Open Source Alternative for Business Intelligence


  5. #15
    Joseph Chambers Guest

    Default Re: Spoon Jobs Plugins -- parallel paths

    Thanks everyone for your help. I have the first iteration running and am
    doing a little cleanup and housekeeping within the code. Could someone
    point me in the right direction for the following?

    I would like to write some information to the Job Metrics in case the
    connection to the cluster server fails. Currently I mark the step with the
    red "no" symbol and write to the log using logBasic.
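
For anyone following the thread, the merge approach described in the quoted messages below (each inbound path adds its data to shared state, and only the last path to arrive carries on) can be sketched in plain Java. This is a minimal sketch under stated assumptions, not the actual plugin: the names PathMerge, arrive and PathMergeDemo are hypothetical, the number of inbound paths is simply passed to the constructor, rows are plain strings, and no Kettle classes are used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/**
 * Hypothetical stand-in for a merge job entry: collects rows from several
 * parallel paths and lets only the last arrival continue.
 */
final class PathMerge {
    private final int expectedPaths;                      // assumption: known up front
    private final List<String> merged = new ArrayList<>();
    private int arrived = 0;

    PathMerge(int expectedPaths) {
        if (expectedPaths < 1) {
            throw new IllegalArgumentException("expectedPaths must be >= 1");
        }
        this.expectedPaths = expectedPaths;
    }

    /**
     * Called once per inbound path. Returns the merged rows if this call
     * completed the set, or an empty Optional if other paths are outstanding.
     */
    synchronized Optional<List<String>> arrive(List<String> rows) {
        merged.addAll(rows);
        arrived++;
        if (arrived < expectedPaths) {
            return Optional.empty();                      // early arrival: nothing to pass on
        }
        List<String> all = new ArrayList<>(merged);       // last arrival: hand over everything
        merged.clear();                                   // reset so the object can be reused
        arrived = 0;
        return Optional.of(all);
    }
}

/** Small demo: three threads play the role of three parallel paths. */
final class PathMergeDemo {
    public static void main(String[] args) throws InterruptedException {
        PathMerge merge = new PathMerge(3);
        List<Thread> paths = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            final String row = "row-from-path-" + i;
            Thread t = new Thread(() ->
                merge.arrive(Collections.singletonList(row))
                     .ifPresent(all -> System.out.println(
                         "last arrival continues with " + all.size() + " rows")));
            paths.add(t);
            t.start();
        }
        for (Thread t : paths) {
            t.join();
        }
    }
}
```

The sketch does not block: early arrivals return immediately with an empty result instead of sitting in a wait loop, so no thread is parked while the other paths finish. It relies on all paths reaching the same object, as assumed earlier in the thread, and it says nothing about how an early arrival should be reported through the Result so that it is not shown as failed.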



    On Mon, Oct 3, 2011 at 7:17 PM, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:

    > It sets an exit value from executing a shell script.
    >
    >
    > 2011/10/4 Joseph Chambers <joseph.chambers (AT) gmail (DOT) com>
    >
    >> Can you elaborate on the function of result.setExitStatus(int)? I can't
    >> seem to find any documentation on the allowed values to pass it.
    >>
    >>
    >> On Mon, Oct 3, 2011 at 6:37 PM, Joseph Chambers <
    >> joseph.chambers (AT) gmail (DOT) com> wrote:
    >>
    >>> Thanks, I'll look at that. Is there a way inside the execute function to
    >>> tell it to die, but not die as failed? I've got what I want working, but I
    >>> am having to set result.setResult(false) on the ones that finish first and
    >>> result.setResult(true) on the last "thread" to come into my pathMerge
    >>> plugin. This works, but it shows the red "no" symbol, and I would rather
    >>> not do that.
    >>>
    >>> On Mon, Oct 3, 2011 at 5:15 PM, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
    >>>
    >>>> There is some info in Pentaho Kettle Solutions, or here:
    >>>> http://wiki.pentaho.com/display/EAI/...ript+job+entry
    >>>>
    >>>>
    >>>> 2011/10/3 Joseph Chambers <joseph.chambers (AT) gmail (DOT) com>
    >>>>
    >>>>> Agreed, will refactor once I get all the pieces I need working.
    >>>>>
    >>>>> Is there some place I can look to see the function definitions of the
    >>>>> Result class?
    >>>>>
    >>>>> On Mon, Oct 3, 2011 at 2:44 PM, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
    >>>>>
    >>>>>> Actually, we just added a "Job Executor" step in 4.3.0-M1 so the
    >>>>>> possibilities have increased a bit.
    >>>>>>
    >>>>>> As a general piece of advice, non-specific to Kettle: don't try to do
    >>>>>> everything in one transformation or job. Make things modular to keep a nice
    >>>>>> overview.
    >>>>>> Think about the idea of staging the data into a buffer (file) or queue
    >>>>>> (database table). Then you can scale as far as you like, for example like
    >>>>>> Diethard documented a while back:
    >>>>>> http://diethardsteiner.blogspot.com/...designing.html
    >>>>>>
    >>>>>> Matt
    >>>>>>
    >>>>>>
    >>>>>> 2011/10/3 Joseph Chambers <joseph.chambers (AT) gmail (DOT) com>
    >>>>>>
    >>>>>>> Yes, we initially started using steps, but needed a little more flow
    >>>>>>> control. Forgive my newbie questions; I am new to Spoon. We may need to
    >>>>>>> look back at steps (the lack of flow control might have been a knowledge
    >>>>>>> issue on my part), but we need a way to do the majority of things in
    >>>>>>> sequential order, each step waiting for the previous one, but also split
    >>>>>>> off into multiple paths when needed.
    >>>>>>>
    >>>>>>> If I can detect the number of inbound and outbound paths within the
    >>>>>>> plugin, I can handle what I need in the Jobs. Once we have the Jobs going,
    >>>>>>> I will see if I can solve the flow issues we were having within the steps.
    >>>>>>> My project manager had run into those and told me to do the job plugins. I
    >>>>>>> had suggested the "Wait on steps" to solve it, but he wanted something with
    >>>>>>> less user interaction.
    >>>>>>>
    >>>>>>> Also, just curious: is there a way to display data in a Job (open a
    >>>>>>> window with the results in a table) when it finishes? Right now I am
    >>>>>>> writing the data that I receive back from the server I'm calling to a CSV
    >>>>>>> file. I know there is in Steps/Transformations, and I've thought about
    >>>>>>> calling a Transformation from the Job to handle the display portion.
    >>>>>>>
    >>>>>>> On Mon, Oct 3, 2011 at 1:49 PM, Matt Casters <mcasters (AT) pentaho (DOT) org> wrote:
    >>>>>>>
    >>>>>>>> I actually don't mind the questions about plugin development.
    >>>>>>>>
    >>>>>>>> Anyway, most people would write a step plugin for parallel work.
    >>>>>>>> All the questions you ask then have easy answers.
    >>>>>>>>
    >>>>>>>> Matt
    >>>>>>>>
    >>>>>>>>
    >>>>>>>> 2011/10/3 Joseph Chambers <joseph.chambers (AT) gmail (DOT) com>
    >>>>>>>>
    >>>>>>>>> Is there a group dedicated to developing plug-ins? I figured the
    >>>>>>>>> Development board was for both the core and the development of plugins.
    >>>>>>>>>
    >>>>>>>>> Thanks for the suggestions. The plug-ins get outside of the
    >>>>>>>>> typical use of Spoon as I understand it. What I'm doing in the multiple
    >>>>>>>>> paths is splitting off and pre-processing (across a cluster of servers)
    >>>>>>>>> multiple groups of data (this isn't a traditional database that I'm
    >>>>>>>>> interfacing with). The pre-processing then returns proprietary code that I
    >>>>>>>>> must have in later steps to utilize the preprocessed data.
    >>>>>>>>>
    >>>>>>>>> From a programming point of view, if I have 3 paths going into one
    >>>>>>>>> step within the Job, I assume only one object of the class is created. So
    >>>>>>>>> if I use a variable to switch my logic, I can merge the data together as it
    >>>>>>>>> comes in until I've reached the number of paths and then continue.
    >>>>>>>>>
    >>>>>>>>> Is there a programmatic way in a plugin to detect the number of
    >>>>>>>>> outgoing or inbound paths attached? I think I can handle the other issues
    >>>>>>>>> but I don't want this value to be a user input or hard coded.
    >>>>>>>>>
    >>>>>>>>>
    >>>>>>>>>
    >>>>>>>>> On Mon, Oct 3, 2011 at 12:43 PM, Matt Casters <
    >>>>>>>>> mcasters (AT) pentaho (DOT) org> wrote:
    >>>>>>>>>
    >>>>>>>>>> No special reason Andy, just old habits of a Kettle guy formerly
    >>>>>>>>>> known as DBA.
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>> 2011/10/3 Andy Grohe <agrohe21 (AT) gmail (DOT) com>
    >>>>>>>>>>
    >>>>>>>>>>> Since we are asking the questions, I would normally say use
    >>>>>>>>>>> "serialize to file" which keeps kettle data structures intact vs going out
    >>>>>>>>>>> to files or db.
    >>>>>>>>>>>
    >>>>>>>>>>> @matt, curious why you suggest db vs the native kettle serialize
    >>>>>>>>>>> inputs/outputs?
    >>>>>>>>>>>
    >>>>>>>>>>> On Oct 3, 2011, at 11:33 AM, Matt Casters <mcasters (AT) pentaho (DOT) org>
    >>>>>>>>>>> wrote:
    >>>>>>>>>>>
    >>>>>>>>>>> Hi Joe,
    >>>>>>>>>>>
    >>>>>>>>>>> If you join different data streams, you can indeed use a step
    >>>>>>>>>>> like Merge Join.
    >>>>>>>>>>> However, if you want to simply merge the data from 2 or more
    >>>>>>>>>>> copies of the same step you don't need to do anything as it's standard
    >>>>>>>>>>> behavior of a step.
    >>>>>>>>>>>
    >>>>>>>>>>> In the case of job entries (not clear what you are building) it's
    >>>>>>>>>>> indeed hard to have parallel entries add to the result row list.
    >>>>>>>>>>> However, perhaps it would be more efficient to add the rows to a
    >>>>>>>>>>> database staging table or another similar temporary container.
    >>>>>>>>>>>
    >>>>>>>>>>> Matt
    >>>>>>>>>>>
    >>>>>>>>>>>
    >>>>>>>>>>> 2011/10/3 Joe Chambers <joseph.chambers (AT) gmail (DOT) com>
    >>>>>>>>>>>
    >>>>>>>>>>>> I am developing a set of plugins to interface with a new data
    >>>>>>>>>>>> platform. I've got it working in a linear fashion. However, I want
    >>>>>>>>>>>> to run some of the tasks in parallel, on multiple paths/threads. I
    >>>>>>>>>>>> see you can run multiple paths, but rejoining them and having data
    >>>>>>>>>>>> passed to the merge step seems to be an issue. I am using
    >>>>>>>>>>>> prevResult and returning the Result in the execute function to
    >>>>>>>>>>>> carry my data between steps. The problem is that the merge/join is
    >>>>>>>>>>>> just called by the thread that finishes first. Is there a way to
    >>>>>>>>>>>> have some type of wait loop so that I can merge the data from all
    >>>>>>>>>>>> the previous steps going into the merge step?
    >>>>>>>>>>>>
    >>>>>>>>>>>> I'm looking at using a static variable to enter a waiting loop that
    >>>>>>>>>>>> would block all other calls until all the data is available. Each
    >>>>>>>>>>>> additional call to this step would, based on this static variable,
    >>>>>>>>>>>> go into a merge function that would merge its data into a static
    >>>>>>>>>>>> variable and then, once the count has reached the number of paths,
    >>>>>>>>>>>> continue. For this I need a way to write a split step that can
    >>>>>>>>>>>> somehow detect the number of exiting paths. Is this possible?
    >>>>>>>>>>>>
    >>>>>>>>>>>> There has to be a better way, but I don't see a construct to do it.
    >>>>>>>>>>>>
    >>>>>>>>>>>> I know this doesn't quite fit in with Spoon's existing
    >>>>>>>>>>>> infrastructure, but I've been tasked with doing this.
    >>>>>>>>>>>>
    >>>>>>>>>>>> Thanks,
    >>>>>>>>>>>> Joseph
    >>>>>>>>>>>>


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.