Hitachi Vantara Pentaho Community Forums

Thread: Filter file names

  1. #1
    Join Date
    Mar 2010
    Posts
    181

    Is it possible to list the file names on an FTP site using a wildcard and then keep only the file with the latest timestamp?

    Please let me know which sequence of steps to use. I have tried various steps but am not sure how to filter the result.

    Thanks

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    What do you mean by "latest time stamp"?
    The last hour, the last day, last minute?

  3. #3
    Join Date
    Mar 2010
    Posts
    181

    The last hour.

    I tried Get File Names with the VFS URL ftps://myusername:mypassword@somehost/download, which did not work. "Get a File with FTPS" works, but it downloads all the files, which I don't want.

    My requirement is to download only the file with the newest timestamp.

  4. #4
    Join Date
    Mar 2010
    Posts
    181

    Any help?

  5. #5
    Join Date
    Apr 2008
    Posts
    1,771

    Hi.
    I would create a job with:

    - A Shell entry that launches cURL or Wget to retrieve the file names and their details

    Then a transformation to load the file created by the shell script:
    - Text File Input
    - Sort by date (newest first)
    - Filter, keeping only the first row

    Mick
    Last edited by Mick_data; 10-14-2011 at 09:36 AM. Reason: Changed my mind!!
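
The approach above (dump a listing to a text file, sort by date, keep the first row) can be sketched outside Kettle as follows. This is a minimal illustration, not the poster's actual transformation. It assumes the shell step writes one file per line as an ISO-8601 timestamp, a tab, and the file name, which is a hypothetical layout; a real cURL or Wget listing would need its own parsing. The name `newest_matching` and the sample file names are invented for the example.

```python
from datetime import datetime
from fnmatch import fnmatch


def newest_matching(listing_text, wildcard):
    """Return the newest file name matching `wildcard`, or None if nothing matches.

    Assumed (hypothetical) line layout: <ISO-8601 timestamp><TAB><file name>.
    """
    rows = []
    for line in listing_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stamp, name = line.split("\t", 1)
        if fnmatch(name, wildcard):
            rows.append((datetime.fromisoformat(stamp), name))
    # Sort newest first, then keep only the first row.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[0][1] if rows else None


sample = (
    "2011-10-14T08:00:00\torders_a.csv\n"
    "2011-10-14T09:30:00\torders_b.csv\n"
    "2011-10-14T09:45:00\tnotes.txt\n"
)
print(newest_matching(sample, "orders_*.csv"))
```

Sorting in descending order and taking the first row mirrors the Sort and Filter steps; the wildcard check stands in for the file mask.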

  6. #6
    Join Date
    Apr 2008
    Posts
    4,696

    Why not use a Get File Names with the VFS handler, then filter that based on Time Stamp, then feed that into the Get Files?
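
The list, filter, fetch idea can also be sketched with Python's standard `ftplib`, purely as an illustration outside Kettle. `FTP.mlsd()` yields `(name, facts)` pairs in which `facts["modify"]` is a `YYYYMMDDHHMMSS` timestamp when the server supports MLSD. The helper below works on such pairs so it can be tried without a server. `pick_recent` and the sample entries are hypothetical names and data.

```python
import re
from datetime import datetime, timedelta


def pick_recent(entries, pattern, now, max_age=timedelta(hours=1)):
    """Return the newest name matching `pattern` modified within `max_age` of `now`.

    `entries` is an iterable of (name, facts) pairs shaped like the output of
    ftplib.FTP.mlsd(); facts["modify"] is assumed to be "YYYYMMDDHHMMSS".
    """
    best = None
    for name, facts in entries:
        if facts.get("type", "file") != "file" or "modify" not in facts:
            continue  # skip directories and entries without a timestamp
        if not re.fullmatch(pattern, name):
            continue
        modified = datetime.strptime(facts["modify"][:14], "%Y%m%d%H%M%S")
        if now - modified > max_age:
            continue  # older than the allowed window
        if best is None or modified > best[0]:
            best = (modified, name)
    return best[1] if best else None


entries = [
    ("old.csv", {"type": "file", "modify": "20111014080000"}),
    ("new.csv", {"type": "file", "modify": "20111014153000"}),
    ("newer.csv", {"type": "file", "modify": "20111014155900"}),
    ("archive", {"type": "dir", "modify": "20111014155959"}),
]
print(pick_recent(entries, r".*\.csv", now=datetime(2011, 10, 14, 16, 0, 0)))
```

With a live connection, the entries would come from the `mlsd()` method of an `ftplib.FTP_TLS` object, and the chosen name would then be passed to `retrbinary` to download just that file.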

  7. #7
    Join Date
    Mar 2010
    Posts
    181

    I am trying Get File Names with the VFS handler, but it does not work, although the same server works fine with the "Get a File with FTPS" step. I am using ftps://uNamewd@ftp.ftpsite.com:990/download/.* and I get the error "No File found! Please check the filename/directory and regular expression options."
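
A later reply in this thread reports success after putting the VFS path in the File/Directory column and the regular expression in the Wildcard column, rather than appending the pattern to the URL. That split can be sketched as below. `split_folder_and_wildcard` and `matching_names` are hypothetical helpers and the file names are made up; whether this alone resolves the FTPS error is not established by the thread.

```python
import re


def split_folder_and_wildcard(url):
    """Split "scheme://host/folder/<regex>" into (folder URL, regex)."""
    folder, _, wildcard = url.rpartition("/")
    return folder, wildcard


def matching_names(names, wildcard):
    """Keep the names that the regular expression matches in full."""
    regex = re.compile(wildcard)
    return [name for name in names if regex.fullmatch(name)]


folder, wildcard = split_folder_and_wildcard(
    "ftps://uNamewd@ftp.ftpsite.com:990/download/.*"
)
print(folder)    # value for the File/Directory column
print(wildcard)  # value for the Wildcard (RegExp) column
print(matching_names(["a.csv", "b.txt"], r".*\.csv"))
```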

    Thanks

  8. #8
    Join Date
    Apr 2008
    Posts
    4,696

    If you connect to the FTPS server directly, can you do a dir or ls and get results with date time information?

  9. #9

    I have the same problem as UcSam, and I'm using the same steps in my transformation.
    The difference is that I took the VFS Configuration Sample (from the Kettle samples folder) and tried to configure the transformation under Transformation --> Settings --> Parameters with these values:

    Parameter                        Default Value          Description
    vfs.http.proxyHost               xxx.xxx.xxx.xxx (IP)   A parameter that exists but is not used by any steps in this transformation
    vfs.sftp.StrictHostKeyChecking   N                      Accept the encryption key of any SFTP server

    and the error in the transformation log is:
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : java.lang.NullPointerException
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.core.vfs.configuration.KettleGenericFileSystemConfigBuilder.setParameter(KettleGenericFileSystemConfigBuilder.java:108)
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.core.vfs.KettleVFS.buildFsOptions(KettleVFS.java:171)
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:117)
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:94)
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.core.fileinput.FileInputList.createFileList(FileInputList.java:175)
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.trans.steps.getfilenames.GetFileNamesMeta.getFileList(GetFileNamesMeta.java:685)
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.trans.steps.getfilenames.GetFileNames.init(GetFileNames.java:325)
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.trans.step.StepInitThread.run(StepInitThread.java:52)
    2011/10/14 16:06:58 - FileInputList - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at java.lang.Thread.run(Unknown Source)

    Please, can somebody tell us how to configure this? I'm banging my head on the table. Thanks.

    Delia

  10. #10
    Join Date
    Mar 2010
    Posts
    181

    Yes, I am able to browse the directories using the FileZilla and WinSCP clients.

    "Get a File with FTPS" actually fails with:
    Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : org.ftp4che.exception.FtpFileNotFoundException: FtpWorkflowException --> Return Value: 550 Description: PROT P required

    Also, if I set Binary Mode, it fails with a NullPointerException:
    2011/10/14 13:20:27 - Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : Error getting files from FTPS :
    2011/10/14 13:20:27 - Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : java.lang.NullPointerException
    2011/10/14 13:20:27 - Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.job.Job.run (Job.java:288)
    2011/10/14 13:20:27 - Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.job.Job.execute (Job.java:368)
    2011/10/14 13:20:27 - Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.job.Job.execute (Job.java:642)
    2011/10/14 13:20:27 - Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.job.Job.execute (Job.java:503)
    2011/10/14 13:20:27 - Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.job.entries.ftpsget.JobEntryFTPSGet.execute (JobEntryFTPSGet.java:825)
    2011/10/14 13:20:27 - Get a file with FTPS - ERROR (version 4.2.0-stable, build 15748 from 2011-09-08 13.11.42 by buildguy) : at org.pentaho.di.job.entries.ftpsget.FTPSConnection.setBinaryMode (FTPSConnection.java:238)
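
The 550 reply "PROT P required" indicates that the server wants the data channel protected before it will list or transfer files (the PROT command used by FTP over TLS, RFC 4217). For comparison outside Kettle, Python's standard `ftplib.FTP_TLS` exposes this as `prot_p()`. The sketch below only shows the call order. `list_protected` and the recording stand-in class are hypothetical, and `FTP_TLS` speaks explicit FTPS, so a server that accepts only implicit TLS on port 990 would need a different client setup.

```python
def list_protected(ftps, directory):
    """List `directory` after switching the data channel to protected mode.

    `ftps` is expected to behave like a logged-in ftplib.FTP_TLS object.
    """
    ftps.prot_p()        # on a real FTP_TLS object this sends PBSZ 0 and PROT P
    ftps.cwd(directory)
    return ftps.nlst()


class RecordingFTPS:
    """Stand-in for ftplib.FTP_TLS that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def prot_p(self):
        self.calls.append("prot_p")

    def cwd(self, directory):
        self.calls.append("cwd " + directory)

    def nlst(self):
        self.calls.append("nlst")
        return ["a.csv", "b.csv"]


fake = RecordingFTPS()
print(list_protected(fake, "/download"))
print(fake.calls)
```

Against a real server, the object would come from `ftplib.FTP_TLS(host)` followed by `login(user, password)`. If a plain command-line or scripted client lists the directory only after `PROT P`, that supports the suspicion that the Kettle step is not issuing it.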

  11. #11
    Join Date
    Apr 2008
    Posts
    4,696

    Sounds like a JIRA entry to me...

  12. #12
    Join Date
    Mar 2010
    Posts
    181

    You mean this is a Kettle bug?

  13. #13
    Join Date
    Mar 2010
    Posts
    181

    @dorellan

    I tried setting

    Parameter                        Default Value          Description
    vfs.http.proxyHost               xxx.xxx.xxx.xxx (IP)   A parameter that exists but is not used by any steps in this transformation
    vfs.sftp.StrictHostKeyChecking   N                      Accept the encryption key of any SFTP server

    but that does not help much with the FTPS connection type. Is there any other option?

  14. #14
    Join Date
    Apr 2008
    Posts
    4,696

    Originally posted by UcSam:
        You mean this is a Kettle bug?

    That's what I would get out of it.

  15. #15
    Join Date
    Mar 2010
    Posts
    181

    Yes, I understand that. I initially had, but then removed it.

  16. #16

    Dear all, I got it working!

    In the transformation settings, I changed the parameters to sftp:

    Parameter                                        Default Value     Description
    vfs.sftp.StrictHostKeyChecking.xxx.xxx.xxx.xxx   N                 Accept the encryption key of any SFTP server
    vfs.sftp.proxyHost                               xxx.xxx.xxx.xxx   A parameter that exists but is not used by any steps in this transformation

    In the Get File Names step, I wrote the VFS path in the File/Directory column of the table and the regular expression in the Wildcard column.
    The result is a list of the file names on the FTP site (please note that the server supports both FTP and SFTP).

    Well, I hope this helps somewhat. Thanks, everybody, for the help!

    Delia

  17. #17
    Join Date
    Mar 2010
    Posts
    181

    JIRA issue http://jira.pentaho.com/browse/PDI-6868 has been created for this problem.
