I've been asked to integrate, into a BI portal based on the GA PCI, a solution that:
1.- fetches log files from a configurable list of FTP servers
2.- transforms the log files appropriately
3.- inserts them into a data warehouse
Ideally, all of this (FTP locations, job scheduling) should be configurable in the admin portal.
After going over (among others) "Creating Pentaho Solutions", "Pentaho Advanced Installation Guide", "Spoon's User Manual" and "Chef's User Manual", and searching the forums, I'm still missing the following pieces:
1.- AFAICT, only Chef knows something about FTP, and Pentaho can't (yet) run Chef jobs (JIRA: PLATFORM-304). I haven't seen any FTP components in Pentaho.
2.- I need all the flexibility of cron jobs, but I've seen no documentation on how to set them up in the embedded Quartz, nor any components designed to do this.
I guess I can code my own component for 1, but I'm a bit lost regarding 2: is the Quartz scheduler already started? How do I configure jobs via actions?
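For 1, the core of what I have in mind is roughly the sketch below. It uses only the JDK (assuming the JDK in use still ships the built-in ftp:// URL handler), and the names FtpLogFetcher, buildUrl and fetch are my own placeholders, not anything from the Pentaho API. It also assumes the credentials contain no characters that need escaping in a URL. Wrapping it in an actual platform component is the part I'd still have to work out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Pentaho: downloads one log file over FTP.
class FtpLogFetcher {

    // Builds an ftp:// URL for a single remote file.
    // ";type=i" asks for a binary (image) transfer.
    // Assumption: user and password need no URL escaping.
    static String buildUrl(String host, String user, String password, String remotePath) {
        return "ftp://" + user + ":" + password + "@" + host + "/" + remotePath + ";type=i";
    }

    // Copies the remote file to a local path, overwriting any previous copy.
    static void fetch(String url, Path target) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

The idea would be to loop over the configured server list, call buildUrl and fetch for each entry, and hand the downloaded files to the transformation step.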
Am I missing something? What's the easiest way to accomplish this?