Hitachi Vantara Pentaho Community Forums

Thread: Amazon S3

  1. #1

    Amazon S3

    Hello. Does anyone have experience using Kettle to push data to Amazon S3? I'm interested in learning about approaches others have used.

    Thanks,

    Chris

  2. #2
    Join Date
    Nov 1999
    Posts
    9,727

    Hi Chris,

    Data can mean a lot of things. If you mean CSV/XML files, there is not much available out of the box.
    On Linux you can mount an S3 bucket as a filesystem through FUSE (s3fs, for example).

    Personally, I used an Amazon library to create a parallel reader for S3; a writer shouldn't be too hard.
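
A parallel reader of the kind mentioned above might look like this minimal Python sketch. It is not the Amazon-library code from this post; it only shows the fan-out pattern. `read_objects_parallel` and the `fetch` callable are hypothetical names: `fetch` stands in for whatever S3 client call returns the bytes for one key.

```python
from concurrent.futures import ThreadPoolExecutor


def read_objects_parallel(keys, fetch, max_workers=4):
    """Fetch every key concurrently and return {key: body} in input order.

    `fetch` is an assumed callable (key -> bytes), e.g. a thin wrapper
    around an S3 client; it is not part of any real library here.
    """
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `keys`;
        # an exception raised by fetch propagates to the caller.
        bodies = list(pool.map(fetch, keys))
    return dict(zip(keys, bodies))
```

Threads suit this case because each fetch spends most of its time waiting on the network rather than on the CPU.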

    Data can also mean a relational database. For example, MySQL has an AWS S3 storage engine by Mark Atwood.

    Matt
    Matt Casters, Chief Data Integration
    Pentaho, Open Source Business Intelligence
    http://www.pentaho.org -- mcasters@pentaho.org

    Author of the book Pentaho Kettle Solutions by Wiley. Also available as e-Book and on the Kindle reading applications (iPhone, iPad, Android, Kindle devices, ...)

    Join us on IRC server Freenode.net, channel ##pentaho

  3. #3

    Amazon S3

    Hi Matt,

    In this case, I'm talking about files. Thanks for the links, I'll take a look. I was thinking about an S3 writer. Then I could create a PDI job to extract data from source systems, write it to files, compress the files, and then write them to an S3 file system in the cloud.
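
The job outlined above (extract, write to files, compress, copy to S3) could be sketched roughly as follows. This is a minimal Python sketch, not a PDI job: it assumes the bucket is already mounted as a local directory (for instance through s3fs), so the final step is an ordinary file copy. The function name `export_compress_copy`, its parameters, and the mount path in the usage note are hypothetical.

```python
import csv
import gzip
import shutil
from pathlib import Path


def export_compress_copy(rows, header, work_dir, target_dir, name="extract"):
    """Write rows to CSV, gzip the file, and copy it into target_dir.

    target_dir is assumed to be an existing directory, e.g. a mounted
    S3 bucket; it is not created here.
    """
    work_dir = Path(work_dir)
    target_dir = Path(target_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: write the extracted rows to a plain CSV file.
    csv_path = work_dir / f"{name}.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

    # Step 2: compress the CSV with gzip.
    gz_path = work_dir / f"{name}.csv.gz"
    with csv_path.open("rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Step 3: copy the compressed file to the target directory.
    # On a mounted bucket, this copy is the upload.
    dest = target_dir / gz_path.name
    shutil.copyfile(gz_path, dest)
    return dest
```

Calling `export_compress_copy(rows, ["id", "name"], "/tmp/work", "/mnt/s3/exports")` would leave `extract.csv.gz` under the mount; `/mnt/s3/exports` is a made-up path. The target directory is deliberately not created, so a missing mount directory raises an error rather than being silently created on the local disk.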

    Chris

  4. #4

    You could probably start from the "Text File Output" step and adapt it to use the JetS3t Java library.
    Matt Casters, Chief Data Integration

  5. #5

    S3

    Hi Matt,

    I was also looking at using something like Jungle Disk ($20) - http://www.jungledisk.com/index.aspx. Then I could mount the S3 storage on the PDI server, and Jungle Disk would handle file transfer and encryption. For $1 a month extra it also offers incremental backup and can restart large file transfers from the point of failure.

    Do you know if there is an EC2 machine image with PDI?

    Chris

  6. #6

    If you're using Linux, s3fs will do the trick; I had some trouble with JungleDisk myself.
    Besides that, whatever gets the job done :-)

    No PDI image yet, sorry. Since I need a few for the MySQL UC (let's meet up again!), some should be popping up in the next couple of months.
    I'll make sure to make them public.

    Take care,
    Matt

  7. #7

    S3

    Great, I'll be at the MySQL UC this year - speaking on the last day - http://en.oreilly.com/mysql2009/publ...le/detail/5593

  8. #8

    I have my solo session on Wednesday: http://en.oreilly.com/mysql2009/publ...le/detail/6739
    On Tuesday I'm presenting with Roland Bouman: http://en.oreilly.com/mysql2009/publ...le/detail/7016

    Cheers,
    Matt

Copyright © 2005 - 2017 Pentaho Corporation. All Rights Reserved.