Hitachi Vantara Pentaho Community Forums
Thread: large file count breaks VFS in kettle 2.5

  1. #1

    I found that pointing a Get File Names or Text File Input step at a directory containing thousands of files breaks the commons-vfs library that ships with kettle 2.5. The method that should return the directory's file listing, so that the regex can be compared against each file name, never returns. The bug is entirely within VFS and has been fixed in recent builds of commons-vfs, so the workaround is simply to install a recent build of commons-vfs in place of the one bundled with kettle. I grabbed a nightly build from July 6, and it works fine. I haven't tested any other builds, and I don't know when the bug was actually fixed, but I will try to find a released version of the library that works correctly before we release our project to production.

    Just posting this as a heads-up for other kettle users. I haven't filed a bug against kettle, since it is a VFS problem that VFS has already fixed, but kettle users should be aware of it. My directories held anywhere from 4,000 to 8,000 files each, and none of them worked. I didn't try to find the file count at which it starts to break.
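
    For anyone who wants to rule out the directory or the regex as the culprit, here is a minimal sketch of the same "list the directory, then compare the regex against each file name" pattern in plain Java. It uses java.nio.file (Java 7 or later) rather than commons-vfs, so it is only a sanity check and not what kettle does internally. The class name, method name, and the input_N.txt naming scheme are all made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of kettle or commons-vfs.
final class DirectoryRegexLister {

    // Returns the sorted names of regular files in dir whose whole name matches regex.
    static List<String> listMatching(Path dir, String regex) throws IOException {
        Pattern pattern = Pattern.compile(regex);
        List<String> names = new ArrayList<>();
        // DirectoryStream iterates lazily, so large directories are not loaded all at once.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                String name = entry.getFileName().toString();
                if (Files.isRegularFile(entry) && pattern.matcher(name).matches()) {
                    names.add(name);
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Scratch directory with 8,000 files, the upper end of the directory sizes mentioned above.
        Path dir = Files.createTempDirectory("many-files");
        for (int i = 0; i < 8000; i++) {
            Files.createFile(dir.resolve("input_" + i + ".txt"));
        }
        // One file that should not match the regex.
        Files.createFile(dir.resolve("readme.md"));

        List<String> matches = listMatching(dir, "input_\\d+\\.txt");
        System.out.println(matches.size() + " files matched in " + dir);
    }
}
```

    If this returns promptly on the same directory while the kettle step hangs, that points at the bundled VFS library rather than at the filesystem or the pattern.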

    --sam

  2. #2

    Thank you Sam.
    I have my own pet peeve with VFS: https://issues.apache.org/jira/browse/VFS-130
    Did you come across this one?

    All the best,

    Matt

  3. #3

    You'd have to convince me to use Windows first, and I don't see that happening anytime soon, fortunately.

    --sam
