Hitachi Vantara Pentaho Community Forums

Thread: JOB - "File exist" able to use wild card?

  1. #1

    Good morning community,

    I'm trying to use the "File exist" step after an FTP transfer to make sure the files have been received.

    However, the file names use the following pattern:

    constant_ddmmyyyy.csv

    Is there a way to use wildcards? The filename obviously changes all the time.

    I've tried the following:

    D:\PROJECT\TO_LOAD\CONSTANT(.*).csv

    and

    D:\PROJECT\TO_LOAD\CONSTANT*.csv

    The step simply fails.

    I keep getting the error message that

    D:\PROJECT\TO_LOAD\CONSTANT*.csv doesn't exists.

    What should I do to use wildcards?

    Thanks in advance for your help.
    There's no point going somewhere if you don't enjoy the ride.

  2. #2

    Well, make a small transformation with a "Get filenames" step and count or evaluate the number of rows it returns.
    That way you can use wildcards.
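
The count-the-matches idea can be sketched outside of Pentaho in a few lines of Python. This is only an illustration of the logic, not how the "Get filenames" step is implemented; the function name `count_matching_files` and the `CONSTANT_*.csv` pattern are assumptions made for the example.

```python
import fnmatch
import os


def count_matching_files(folder, pattern):
    """Return how many regular files in `folder` match a shell-style wildcard."""
    return sum(
        1
        for name in os.listdir(folder)
        # fnmatch applies shell-style wildcards (*, ?, [abc]) to the bare file name.
        if fnmatch.fnmatch(name, pattern)
        and os.path.isfile(os.path.join(folder, name))
    )
```

A caller would treat a count of zero as "nothing received yet", much like evaluating the row count after the transformation.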

  3. #3

    I think it would be an excellent feature to add in version 2.5.1 :)
    Can I post a request for it?

  4. #4

    You can always post a request for it, but personally I'm not going to implement it in the short term. Originally that step was only intended for "trigger files".

    And there's a very nice workaround available for it.

    Regards,
    Sven

  5. #5

    Nicholas,

    Sven makes an excellent point here. Suppose your files are coming in using FTP.
    A file is being written, let's call it FILE_20070328.txt.
    You don't know the size of that file in advance. Let's say it's 10MB and takes 30 seconds to FTP.
    If your transformation detects this file during those 30 seconds and starts working, chances are very high that you'll be reading an incomplete file.

    There are 2 ways to solve this problem:
    1) You write to FILE_20070328.txt.tmp and rename it when the FTP is done.
    2) You write all the files you need to transfer and then FTP a small file, called a trigger file or sentinel file.
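
As a rough sketch of option 1, the sending side writes under a temporary name and renames only once the data is complete. The example below works on a local filesystem to stay runnable; over FTP the same idea would use the server's rename command instead. The function name `publish_file` is an assumption made for the example.

```python
import os


def publish_file(final_path, data):
    """Write `data` under a .tmp name, then rename it to `final_path`."""
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "wb") as handle:
        handle.write(data)
    # The final name only appears once the content is fully written.
    os.replace(tmp_path, final_path)
    return final_path
```

A reader that only looks for names without the .tmp suffix never sees a half-written file.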

    The first option creates a problem on a different level, though. Suppose you have a data warehouse to update and you expect 10 files from a remote system (CUSTOMER_20070328.txt, SALES_20070328.txt, etc.).
    In that case the first option also requires counting the number of available files (perhaps that's why you're doing it), verifying the content, handling complex wildcards, etc., just to make sure. When you also need to handle half-complete FTP attempts, partial files, etc., it becomes messy pretty fast.

    The trigger/sentinel option is far easier and more elegant to use, and that is why we included that option.
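
On the receiving side, the trigger-file approach comes down to one existence check, optionally followed by a check that the expected data files arrived with it. The sketch below assumes the trigger is sent last; the function name `check_batch` and the trigger name used in the usage note are hypothetical.

```python
import os


def check_batch(folder, trigger_name, expected_names):
    """Return (ready, missing) for a batch announced by a trigger file.

    ready:   True once the trigger file is present in `folder`.
    missing: expected data files that are absent (only checked when ready).
    """
    if not os.path.isfile(os.path.join(folder, trigger_name)):
        # No trigger yet: the sender may still be transferring data files.
        return False, []
    missing = [
        name
        for name in expected_names
        if not os.path.isfile(os.path.join(folder, name))
    ]
    return True, missing
```

For example, `check_batch(folder, "BATCH_20070328.done", ["CUSTOMER_20070328.txt", "SALES_20070328.txt"])` returns `(False, [])` until the trigger arrives, so the load never starts on partial files.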

    All the best,
    Matt

  6. #6

    Use the "Check if a folder is empty" step. It supports regexp.
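
A regular expression is not the same thing as a shell wildcard: `CONSTANT(.*).csv` is regexp syntax, while `CONSTANT*.csv` is a wildcard. The Python sketch below shows a regexp for the constant_ddmmyyyy.csv naming from the first post. It does not reproduce the step's behaviour; the pattern and the function name `matching_names` are assumptions made for the example.

```python
import os
import re

# Hypothetical pattern for names such as CONSTANT_28032007.csv.
# \d{8} stands for ddmmyyyy; the dot before "csv" is escaped so it is literal.
DAILY_FILE = re.compile(r"CONSTANT_\d{8}\.csv", re.IGNORECASE)


def matching_names(folder, compiled=DAILY_FILE):
    """Return the sorted names in `folder` that match the regexp in full."""
    return sorted(
        name for name in os.listdir(folder) if compiled.fullmatch(name)
    )
```

An empty result plays the same role as an "empty folder" answer: nothing matching the pattern has arrived yet.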
