Hitachi Vantara Pentaho Community Forums

Thread: Project road map, history of kettle

  1. #1

    Project road map, history of kettle

    Hi Matt


    I'm currently writing my bachelor thesis and have reached the point where I have to say a few words about the history of Kettle, the project road map, where Kettle is heading, etc. I have studied the documentation and every other source I know of very carefully, but unfortunately I have not found anything. A short sketch with a couple of facts, for example the "conceptual model", would be great.



    Thanks a lot,
    Pavel

  2. #2
    Join Date
    Nov 1999
    Posts
    9,727

    RE: Project road map, history of kettle

    Hi Pavel,

    Below is what I wrote some time ago to the kettle-developers mailing list. Maybe it will help you understand where Kettle came from.

    The roadmap is not that hard to see: on one hand it's the list of feature requests on JavaForge.
    On the other hand it's design work that is being added, like plugins for transformations and connections, and meta-data plugins so that extra information can be added/edited for the steps.
    These last are driven by the medium-term and long-term requirements we have. For example the plugin stuff comes from the (Pentaho) requirement to look into running ETL/Reporting/Mining on the Sun Grid.
    Finally, what is obviously also driving the roadmap strongly these days is integration with the rest of the Pentaho toolsets: Mondrian, JFreeReport, Pentaho framework and Weka.

    All the best,

    Matt



    Hi Kettle devs,

    As some of you might wonder how Kettle came to be, I wrote this (rather lengthy) historical overview.

    As some of you like to point out, in many respects Kettle is not the most beautiful piece of Java code ever written.
    That is because when I started with Kettle 5 years ago, I had only a little bit of experience with Java 1.1, actually in writing a Japanese chess database as an aid for my Shogi addiction. (http://www.ibridge.be/shogi/)
    The ugly piece of code referenced above is the reason why I picked SWT as a GUI framework at a certain time, but that's another story...

    In 2001, when the idea for writing my own ETL tool came about, I had been working as a BI consultant for a number of years and found that there was too much messing about when it came to transferring data from one place to another.
    As with much software, the driving force behind Kettle simply was: "there has to be a better and cheaper way to do this".

    It's one thing to say this. It's a completely different thing to actually write something that is better than inventing ugly data warehouse solutions written in PL/SQL, VB, Shell scripts and what not.

    It actually took me 2 years to do a thorough analysis of the problem. Although that might seem a long time, please remember that until last month, I could only work on Kettle during weekends or at night because I had a daytime job as a consultant.
    In those 2 years, I had written analysis documents and a couple of test programs in C that used sockets to transfer data. The advantage I saw was that it could work on multiple machines at the same time, etc.
    The problem with piping was of course that it was dead slow and that the real problem, extracting data from databases, was not solved at all.

    That's why in early 2003 I looked at Java again for the first time in years. A lot of things were moving in the Java language, new versions were coming out and, more importantly, there were free JDBC drivers to be found for most databases.
    Ever so slowly I started working on Value, Row and all the other base classes we have now.
    Some time was wasted on trying to write a scripting engine (using my own Byte-code) so that calculations could be done dynamically.

    For the archeologists (or freakshow fans) among you: here is the very first archived version of Kettle: http://www.javaforge.com/proj/doc/de...o?doc_id=10680

    By mid-2003 building the XML to test quickly became a showstopper. If you have no time to waste, you need to be efficient about it, so I needed a GUI.
    SWT was the next big thing, and writing Swing, let alone AWT, was not my thing. I was using Eclipse to program, and that's all there is to it really.
    Although there have been problems with SWT before, it seems to be working fine now for the most part.

    The first version of the tool that is now called Spoon was named "Stir" at the time and looked like this: http://www.javaforge.com/proj/doc/de...o?doc_id=10682 (It actually looked worse because this is running with the new SWT 3.2 libs)
    Stir featured a big X on the graphical view, the log view was not working, neither were most step dialogs, but it might help you understand how the current version came about... or not ;-)
    That version can also be found in the archive: http://www.javaforge.com/proj/doc/de...o?doc_id=10683

    In the second half of 2003, when the first GUI became available, it became less of a burden to write the XML and slowly but surely, things were advancing towards a first version.
    Here is a screenshot of 1.0 beta 7: http://www.javaforge.com/proj/doc/de...o?doc_id=10685 . Even though the number of available steps and supported databases has more than doubled since those days, you can see it's starting to look like the version we have now.
    The source code of that version is also placed in the archive: http://www.javaforge.com/proj/doc/de...o?doc_id=10686

    In 2004 it was reasonably stable and I was able to deploy it for the first time at a customer. Because of the "real-world" situation, a lot of things needed to be fixed and new features needed to be implemented.
    That's why in those days, things were advancing a lot faster than in the first 3 years. The code base grew so fast that several refactorings and code clean-ups were needed.
    Version 2.0 was one of the last "unstructured" versions, to be downloaded here: http://www.javaforge.com/proj/doc/de...o?doc_id=10687

    No real Java developer would consider NOT using packages in larger projects, and that's exactly what Kettle was slowly but surely becoming.
    It was thanks to the Java expertise from companies like ixor (Wim De Clerq especially) that Kettle survived that difficult period.

    It was difficult not only because around that period my son Sam was born and little to no time was left for development.
    It was also difficult because of the (needless) code complexity.
    Mostly it was difficult and frustrating because I would be told time and time again by BI colleagues that there were plenty of ETL tools on the market and that it would be completely pointless for me to continue writing one.

    Although this has changed "somewhat" since I open sourced Kettle, these kinds of remarks are still popping up now and then.
    They were (and are) bypassing the simple observation that it should not cost a company thousands of Euros, Dollars or any other currency to do simple things like moving data from one database to another.
    When I say cost I'm not only talking about the price of software, but also the time you spend on it. BI consultants like myself (certainly) do not work for free and the longer someone works on a problem the more it costs.

    This final observation was key to the success of Kettle because of its simplicity, but it is also a challenge for us as developers as we need to keep looking at the bottom line: an ETL tool helps you do things faster and cheaper.
    The end-user requirements are most important and that should remain the case. The Java code was and is nothing more than a way to achieve that goal.

    I hope you will find Kettle as fun to work with as I do, and I hope you found this little stroll into the past interesting. If not, you would probably not have read until this line anyway ;-)
    Now go back to work! :-) There is plenty of work to do, code to clean and bugs to fix!

    All the best,

    Matt
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters@pentaho.org

    Author of the book Pentaho Kettle Solutions by Wiley. Also available as e-Book and on the Kindle reading applications (iPhone, iPad, Android, Kindle devices, ...)

    Join us on IRC server Freenode.net, channel ##pentaho

  3. #3
    Join Date
    Sep 2007
    Posts
    829

    Just curious... where are all these ancient files and screenshots?
    The links are not valid.
    Thanks!

  4. #4
    Join Date
    Nov 1999
    Posts
    9,727

    Hi Maria,

    Unfortunately we had to clean up things on JavaForge. Here is the zip file of the very first Java Kettle version ever!

    FirstVersion.zip

    An early screenshot of Stir (now called Spoon), from June 2003.

    I'll find another place to post the Kettle Archive soon.

    All the best,
    Matt

  5. #5

    Impressive stuff!
    This is a signature.... everyone gets it.

    Join the Unofficial Pentaho IRC channel on freenode.
    Server: chat.freenode.net Channel: ##pentaho

    Please try and make an effort and search the wiki and forums before posting!
    Check out Saiku, the future of Open Source Interactive OLAP (http://analytical-labs.com)


  6. #6
    Join Date
    Sep 2007
    Posts
    829

    Thanks, Matt!


Copyright © 2005 - 2017 Pentaho Corporation. All Rights Reserved.