Hitachi Vantara Pentaho Community Forums

Thread: Too many open files in HTTP client step (3.2.0 GA)

  1. #1

    Hello all,

    Environment: PDI 3.2.0 GA, Ubuntu 10.04 64-bit, OpenJDK 1.6.0_18

    I'm having trouble running a transformation that uses an HTTP client step to retrieve some data from a web service.

    When I run the step with at least 1000 rows, I fairly consistently get an error like this:

    2010/08/05 13:49:52 - org.pentaho.di.trans.steps.http.HTTP - ERROR (version 3.2.0-GA, build 10572 from 2009-05-12 08.45.26 by buildguy) : Because of an error, this step can't continue:
    2010/08/05 13:49:52 - org.pentaho.di.trans.steps.http.HTTP - ERROR (version 3.2.0-GA, build 10572 from 2009-05-12 08.45.26 by buildguy) : Unable to get result from specified URL : <snip>
    2010/08/05 13:49:52 - org.pentaho.di.trans.steps.http.HTTP - ERROR (version 3.2.0-GA, build 10572 from 2009-05-12 08.45.26 by buildguy) : Too many open files


    The ulimit open-files setting is at the default (1024), so clearly there is a file descriptor leak of some kind.

    If I run:

    lsof -i6 | grep "10.1.7.71" | grep "ESTABLISHED" | wc -l

    and

    lsof -i6 | grep "10.1.7.71" | grep "CLOSE_WAIT" | wc -l

    to watch the number of open sockets to the web server we're hitting, I can see that CLOSE_WAIT stays at around 50, but ESTABLISHED can grow to over 200 before garbage collection (or whatever else kicks in) reduces it. Since lsof shows a good 600 open files for Spoon at startup alone, it is quite likely that we breach the 1024-file limit.
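
The two lsof pipelines above can be folded into one pass. The following is only a sketch: the function name `count_states` is made up for illustration, and it assumes lsof prints the TCP state in parentheses at the end of each line, as it does for the commands shown in the post.

```shell
# count_states HOST: read lsof-style lines on stdin and print how many
# sockets mentioning HOST are in the ESTABLISHED and CLOSE_WAIT states.
# (Hypothetical helper; HOST is matched as a plain substring.)
count_states() {
    host="$1"
    awk -v host="$host" '
        index($0, host) && /\(ESTABLISHED\)/ { est++ }
        index($0, host) && /\(CLOSE_WAIT\)/  { cw++ }
        END { printf "ESTABLISHED=%d CLOSE_WAIT=%d\n", est, cw }
    '
}

# Example usage with the address from the post
# (-n and -P only disable host and port name lookups):
#   lsof -nP -i6 | count_states 10.1.7.71
```

Wrapping the usage line in `watch -n 1` or a simple loop would show how the two counts move while the transformation runs.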

    In the hope of regulating how many client connections get created at once, I tried lowering the rowset size all the way down to 20, but I saw no real difference in the number of open sockets in my tests.

    I can of course increase the ulimit setting to help get around this, but it makes me nervous because I don't understand when or how this kind of failure can come up.
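
To see how close a process actually gets to its limit, the descriptor count can be compared with the soft limit directly. This is a rough sketch that assumes a Linux system with /proc mounted; the function name `fd_usage` and the `pgrep` pattern in the usage comment are assumptions, not anything PDI provides.

```shell
# fd_usage PID: print how many file descriptors PID currently has open
# and the soft "Max open files" limit that applies to it (Linux /proc only).
fd_usage() {
    pid="$1"
    # One entry per open descriptor lives under /proc/PID/fd.
    used=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # In /proc/PID/limits the soft limit is the 4th whitespace-separated field.
    limit=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits" 2>/dev/null)
    echo "pid=$pid open=$used soft_limit=${limit:-unknown}"
}

# Example usage (assumes the Spoon JVM can be found by the pattern "spoon"):
#   fd_usage "$(pgrep -f spoon | head -n 1)"
```

Sampling this while the transformation runs gives a direct view of the headroom left under the 1024 default, including the roughly 600 descriptors held at startup.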

    Is there anything else that someone can suggest to help avoid this issue? I am happy to have reduced performance if I can be certain that the transform will not fail.

    Thanks in advance,
    Alex

  2. #2

    I think this is caused by your operating system. By default your OS keeps sockets open for a short time (to make sure all data is passed or something like that).
    Try to slow down the operations a bit with a wait step (50 ms or something like that) and you'll notice that the open files count should stay at a lower figure.
    At least that's what I tried (also on 10.04) and it worked for me. (in fact I used a slower web server :-))

    The alternative is to allow your OS to open more files at once.

  3. #3

    Thanks for the response.

    I'll probably just go with increasing the limit in /etc/security/limits.conf to 3 or 4k, but the delay step seems like a simple enough way to deal with this. Not sure why I didn't think of that before!
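
For reference, a limits.conf change along these lines might look like the commented entries below, followed by a small check that the new limit is in effect. The user name `etl` and the value 4096 are assumptions chosen for illustration, and the entries normally only apply to sessions started after the change.

```shell
# Hypothetical entries for /etc/security/limits.conf
# (format: <domain> <type> <item> <value>; "etl" is an assumed user name):
#   etl  soft  nofile  4096
#   etl  hard  nofile  4096

# check_nofile MIN: succeed if this shell's soft open-file limit is at least MIN.
check_nofile() {
    min="$1"
    cur=$(ulimit -n)
    if [ "$cur" = "unlimited" ] || [ "$cur" -ge "$min" ]; then
        echo "ok: soft nofile limit is $cur"
    else
        echo "too low: soft nofile limit is $cur, wanted at least $min" >&2
        return 1
    fi
}

# Example usage, run from the same shell that launches Spoon:
#   check_nofile 4096
```

Running the check from the shell that launches Spoon matters because the limit is inherited from that session rather than read from the file at run time.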

  4. #4

    Originally posted by MattCasters:
    "Try to slow down the operations a bit with a wait step (50 ms or something like that) and you'll notice that the open files count should stay at a lower figure."

    I also got this error when POSTing to an Openbravo REST service. I needed a 150 ms Delay step to successfully post all 1500 records on my own laptop.

    Using PDI/Kettle 4.1.2, Sun Java 1.6.0_24 on Ubuntu 10.10.

    I really think the HTTP Post/Client steps should have a "maximum number of connections" setting, because using a Delay step for this is just... hackish!! (And it's still not 100% reliable.)
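
To illustrate what a connection cap does, as opposed to a fixed delay, here is a sketch done outside PDI with standard command-line tools. It is not a PDI feature or workaround: `bounded_run` is a made-up helper, and it assumes an xargs that supports `-P` (GNU and BSD versions do) and input items without spaces or quotes.

```shell
# bounded_run MAX CMD [ARGS...]: read one item per line on stdin and run
# "CMD ARGS... ITEM" for each, with at most MAX invocations in flight at once.
# Relies on the -P option of GNU and BSD xargs (not part of POSIX).
bounded_run() {
    max="$1"
    shift
    xargs -n 1 -P "$max" "$@"
}

# Example usage (urls.txt is a hypothetical file with one URL per line):
#   bounded_run 4 curl -sS -o /dev/null < urls.txt
```

With a cap like this, the number of simultaneous sockets stays bounded no matter how fast or slow the server responds, which is the guarantee a delay cannot give.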
    Last edited by HendyIrawan; 04-18-2011 at 02:10 PM.

  5. #5

    BTW, there is a JIRA issue for this: http://jira.pentaho.com/browse/PDI-5419

    You can vote on it or contribute a fix :-)
