Hitachi Vantara Pentaho Community Forums

Thread: Kettle SDK

  1. #1
    Matt Casters Guest

    Kettle SDK

    Dear Kettle developers,

    At least in my mind, the Kettle Java API is reasonably easy to work with and
    understand.
    However, I do realize that there is a serious lack of examples and
    documentation, and that needs fixing before we can really start to cater to
    Java developers out there.
    The new book Pentaho Kettle Solutions takes a stab at this as far as the
    basics are concerned (plugin development, running transformations and jobs,
    and so on) but we realize that a lot more is needed.

    So here is my invitation to tell us what you need as far as the ultimate
    Kettle Software Development Kit is concerned.
    What do you need in terms of examples and documentation?

    Beyond that, how can we make the Kettle API easier to work with? There have
    been initiatives like the CDA (Community Data Access) project that leverage
    Kettle through what you might call transformation assemblers for Pentaho
    reporting. How about a pure Kettle API? I've been prototyping code like
    the following:


    -----snip------
    DataService service = new DataService();

    // Add a step to the service that reads a CSV file...
    //
    service.add( new StepDataSource("ID_CSV", createCsvInputStep()) );

    // Add a SQL data source that reads from a database...
    //
    service.add( new SqlDataSource("ID_SQL", "SELECT id, gender FROM gender",
        getGenderDatabaseMeta()) );

    // Now explain to the service that these 2 data sources need to be
    // connected...
    //
    service.add( new MemoryLookupDataOperation( "ID_LOOKUP",
        "ID_CSV", new String[] { "id", },
        "ID_SQL", new String[] { "id", },
        new String[] { "gender", } ) );

    TransMeta transMeta = service.generateTransMeta("ID_TRANS");
    List<Object[]> rows = service.getRows("ID_LOOKUP");
    -----snap------

    Would that be something you guys are interested in? What sort of
    functionality would you like to see from the Kettle Java API?
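
    For anyone who wants to picture what an in-memory lookup between two
    row sources amounts to, here is a rough plain-Java sketch. It does not
    use Kettle at all: the class MemoryLookupSketch, its lookup method, and
    the sample rows are made up for illustration and are not part of the
    prototype or of any Kettle API. It only assumes that rows are Object[]
    arrays, as in the List<Object[]> returned by the prototype.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Kettle code.
public class MemoryLookupSketch {

    /**
     * Appends one looked-up value to every input row, matching
     * input[keyIndex] against lookupRows[lookupKeyIndex].
     * Input rows without a match get null in the added column.
     */
    public static List<Object[]> lookup(List<Object[]> input, int keyIndex,
                                        List<Object[]> lookupRows, int lookupKeyIndex,
                                        int lookupValueIndex) {
        // Load the lookup side into memory once, keyed by the join field.
        Map<Object, Object> cache = new HashMap<>();
        for (Object[] row : lookupRows) {
            cache.put(row[lookupKeyIndex], row[lookupValueIndex]);
        }

        // Stream over the input side and widen each row by one column.
        List<Object[]> result = new ArrayList<>();
        for (Object[] row : input) {
            Object[] out = new Object[row.length + 1];
            System.arraycopy(row, 0, out, 0, row.length);
            out[row.length] = cache.get(row[keyIndex]);
            result.add(out);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the rows a CSV input step might produce.
        List<Object[]> csvRows = new ArrayList<>();
        csvRows.add(new Object[] { 1L, "Alice" });
        csvRows.add(new Object[] { 2L, "Bob" });
        csvRows.add(new Object[] { 3L, "Carol" });

        // Stand-in for the result of "SELECT id, gender FROM gender".
        List<Object[]> genderRows = new ArrayList<>();
        genderRows.add(new Object[] { 1L, "F" });
        genderRows.add(new Object[] { 2L, "M" });

        for (Object[] row : lookup(csvRows, 0, genderRows, 0, 1)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

    The third sample row has no match on the lookup side, so its added
    column stays null. Whether a real lookup operation should pass such rows
    through or drop them is a design choice this sketch does not settle.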

    Regards,

    Matt
    --
    Matt Casters <mcasters (AT) pentaho (DOT) org>
    Chief Data Integration
    Pentaho : The Commercial Open Source Alternative for Business Intelligence

    --
    You received this message because you are subscribed to the Google Groups "kettle-developers" group.
    To post to this group, send email to kettle-developers (AT) googlegroups (DOT) com.
    To unsubscribe from this group, send email to kettle-developers+unsubscribe (AT) g...oups (DOT) com.
    For more options, visit this group at http://groups.google.com/group/kettle-developers?hl=en.

  2. #2
    Will Gorman Guest

    Re: Kettle SDK

    Hi Everyone,

    As Matt pointed out, we're very interested in hearing about how you
    are using Kettle's existing APIs. Within the Pentaho universe, we've
    started to use Kettle's transformation engine in some very interesting
    embedded contexts. In Pentaho Metadata, we implemented an inline ETL
    query engine, so you can send metadata queries to CSV files. It would
    be easy to extend this to arbitrary Kettle inputs. In our new data
    access wizard coming out in 3.7, we're using Kettle's transformation
    engine to stage CSV data into a database, so that Mondrian can query
    the data. We're also working on extending that so data sources such
    as Salesforce could be staged and analyzed. Another great use case
    of Kettle's transformation engine is in the work that Daniel and Pedro
    have done with CDA, where different data sources can be joined, etc.

    We know how powerful embedding Kettle's transformation engine is in
    our use cases, but is it useful for a broader set of cases? We think
    so. Before joining Pentaho, I wrote a number of Java-based systems
    for use in many different industries (Defense, Aerospace, Medical,
    etc). In all of these projects, there was always custom data access
    and transformation code implemented. If I had known of PDI back then, I
    could have saved myself a lot of time writing those systems.

    So again, please share your use cases, and areas you'd like to see
    improved related to our APIs.

    Thanks!

    Will

    (In reply to Matt Casters' message of Fri, Oct 8, 2010, 4:56 AM.)

  3. #3
    Jens Bleuel Guest

    Re: Kettle SDK

    Hi all,

    I see an SDK/API as interesting in two ways:
    1) Embedding the Kettle engine
    2) Plug-ins (we have a variety, like steps, job entries, partitioning,
    perspectives, databases, etc.)

    In both scenarios I propose a pretty stable API that should be upward
    compatible. This is not always possible, but we could improve things by
    including something like an SDK/API layer that acts as a wrapper (and Matt
    pointed out some nice examples).

    We could offer this as an extra option, and the developer could choose
    whether to use this more stable SDK/API or keep using the direct
    approach. With the SDK/API there could be a small performance impact,
    especially if it is meant to be a pretty flexible and viable long-term
    solution.
    To keep the SDK/API pretty stable we could think of a flexible solution
    like setting key/value pairs for each function, where even the object
    type of the values is kept flexible. The limitation would be in the return
    values of functions (and of callback functions) when these change over
    time, since the calling or handling application needs to deal with them.
    Compatibility layers could solve it in these cases. In my view this
    is a very flexible, long-term solution, but we need to weigh it against
    the effort and performance aspects.
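
    To make the key/value idea a bit more concrete, here is a minimal
    plain-Java sketch. Everything in it is an assumption made for
    illustration: the SdkFunction interface, the RUN_TRANSFORMATION function,
    and the "filename", "safeMode" and "errors" keys do not exist in Kettle.
    It only shows how a function with a fixed signature can accept new
    optional keys later without breaking older callers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Kettle.
public class KeyValueApiSketch {

    /** A stable entry point: parameters and results are both key/value maps. */
    public interface SdkFunction {
        Map<String, Object> call(Map<String, Object> params);
    }

    /** Reads an optional parameter, falling back to a default when the key is absent. */
    static Object param(Map<String, Object> params, String key, Object defaultValue) {
        return params.containsKey(key) ? params.get(key) : defaultValue;
    }

    /**
     * Made-up "run transformation" function. A later version can start
     * honoring new keys (here "safeMode") without changing the signature,
     * so callers written against the older key set keep compiling and running.
     */
    public static final SdkFunction RUN_TRANSFORMATION = params -> {
        String file = (String) params.get("filename");
        boolean safeMode = (Boolean) param(params, "safeMode", Boolean.FALSE);

        Map<String, Object> result = new HashMap<>();
        result.put("filename", file);
        result.put("safeMode", safeMode);
        result.put("errors", 0L); // placeholder: nothing is actually executed here
        return result;
    };

    public static void main(String[] args) {
        // An "old" caller that knows nothing about the safeMode key.
        Map<String, Object> params = new HashMap<>();
        params.put("filename", "example.ktr");

        Map<String, Object> result = RUN_TRANSFORMATION.call(params);
        System.out.println(result.get("safeMode")); // key omitted, so the default applies
    }
}
```

    The sketch also shows the limitation mentioned above: callers read
    results by key and cast the values themselves, so a change in a returned
    value's type only surfaces at run time unless a compatibility layer
    converts it first.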

    Keep on hacking
    Jens

    ---
    Jens Bleuel


    (In reply to Will Gorman's message of 08.10.2010, 17:00.)


  4. #4
    Nicholas Goodman Guest

    Re: Kettle SDK

    I use Kettle embedded in two different projects... Having an improved API and an official SDK would be great.

    While building the whole thing from scratch in code, even in an easier way like Matt outlined, is great, being able to start with "template" ktr/kjbs would be even better. So there ya go, my two cents: please make the SDK able to load existing transformations/jobs (from a repository or files) and then "mod" from there like Matt outlined before.
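
    As a rough picture of the "load a template, then mod it" workflow, here
    is a plain-Java sketch that treats a template as XML and changes one
    setting on one step. It is only an illustration under stated
    assumptions: the TEMPLATE string is a heavily simplified stand-in for a
    .ktr file, and TemplateModSketch, withFilename and firstStepFilename are
    made-up names. An actual SDK would presumably work on Kettle's own
    metadata objects rather than on raw XML.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; uses the JDK's DOM parser, not Kettle.
public class TemplateModSketch {

    // Simplified stand-in for a template .ktr file; real files carry far more detail.
    public static final String TEMPLATE =
        "<transformation>"
      + "<info><name>template</name></info>"
      + "<step><name>Read input</name><type>CsvInput</type>"
      + "<filename>PLACEHOLDER</filename></step>"
      + "</transformation>";

    /** Parses the template and sets <filename> on every step with the given name. */
    public static Document withFilename(String xml, String stepName, String filename) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse template", e);
        }

        NodeList steps = doc.getElementsByTagName("step");
        for (int i = 0; i < steps.getLength(); i++) {
            Element step = (Element) steps.item(i);
            String name = step.getElementsByTagName("name").item(0).getTextContent();
            if (stepName.equals(name)) {
                step.getElementsByTagName("filename").item(0).setTextContent(filename);
            }
        }
        return doc;
    }

    /** Returns the <filename> of the first step, for inspection. */
    public static String firstStepFilename(Document doc) {
        Element step = (Element) doc.getElementsByTagName("step").item(0);
        return step.getElementsByTagName("filename").item(0).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = withFilename(TEMPLATE, "Read input", "/tmp/customers.csv");
        System.out.println(firstStepFilename(doc));
    }
}
```

    Steps whose name does not match are left untouched, so the same
    template can be reused with different values for each run.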

    Nick

    (In reply to Matt Casters' message of Oct 8, 2010, 1:56 AM.)



Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.