Hitachi Vantara Pentaho Community Forums

Thread: Pentaho Data Integration Http Post

  1. #1
    Join Date
    Oct 2015
    Posts
    9

    Pentaho Data Integration Http Post

    Hi:

    I have a data extraction job that uses the HTTP Post step to call a website and extract data. The site becomes unresponsive after a couple of hits and the job stops. Is there a way to make the job retry a couple of times if it doesn't get a 200 response on the first attempt? (Bear in mind that all the steps in a transformation run in parallel, and no input row should be skipped while retries are made.) Any help would be much appreciated. Thanks in advance.

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    The step itself can't be configured to do retries, but you get the response status code, so you can implement whatever you need.
    As to parallel execution of steps, you would follow HTTP-Post with Filter-Rows to keep results != 200 from being processed.
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Oct 2015
    Posts
    9

    Originally posted by marabu:
    The step itself can't be configured to do retries, but you get the response status code, so you can implement whatever you need.
    As to parallel execution of steps, you would follow HTTP-Post with Filter-Rows to keep results != 200 from being processed.
    Yes, but the requirement is this: I have to hit the website with a different set of parameters (say, ids), each of which returns a different response. I shouldn't hit the next id until I get a 200 response for the current request. How do I "linger around" with a couple of retries on each request and then fail the execution if the response is still not 200?

  4. #4
    Join Date
    Apr 2016
    Posts
    156

    Originally posted by phantom26:
    How do I "linger around" with a couple of retries on each request itself and then fail the execution, if the response is still not 200.
    Is the number of retries small-ish (e.g. fewer than 5-10)? If so, you can build a manual chain of x test conditions (using the Filter Rows step), where x = number of retries. Each test condition checks the previous result code: if result <> 200, go to a Transformation Executor/Mapping and do the call again (returning the call's result into the stream for the next test); otherwise, if result == 200, break out of the manual loop.

    It's a dumb approach to implementing a simple loop for a fixed (i.e. hard-coded) number of iterations.

    Alternatively, you could maybe use recursion: create a transformation that does your call, gets a result, and calls itself if result <> 200. The difficult part would be fiddling with stream variables to control the recursion depth; I don't think variables would work (since they're set in the parent transformation's scope).
    My runtime environment: MacOS, JDK 1.8u121, PDI 7.0

  5. #5
    Join Date
    Aug 2011
    Posts
    360

    Maybe your best bet is to implement your POST call in a JavaScript step:
    use a Java REST client library, Jersey for example, to do the call on each row
    and implement retries.
    This may be simpler for such a requirement, avoiding nasty hacks with Pentaho loops.
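
A minimal sketch of the retry logic such a step could call, written in plain Java. Everything named here (`RetryingCall`, `StatusCall`, `callWithRetries`, the delay parameter) is hypothetical and not part of PDI or Jersey. The actual HTTP call is passed in as a lambda that returns the status code, so any client library could sit behind it. An exception thrown by the call is assumed to count as a failed attempt.

```java
final class RetryingCall {

    /** One attempt at the request; returns the HTTP status code. */
    @FunctionalInterface
    interface StatusCall {
        int run() throws Exception;
    }

    private RetryingCall() {}

    /**
     * Runs the call until it returns 200 or maxAttempts is used up.
     * Returns 200 on success; throws IllegalStateException otherwise,
     * which is one way to make the surrounding job fail.
     */
    static int callWithRetries(StatusCall call, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        int lastStatus = -1;        // -1 means no status code was received
        Exception lastError = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                lastStatus = call.run();
                lastError = null;
                if (lastStatus == 200) {
                    return lastStatus;
                }
            } catch (Exception e) {
                lastError = e;      // e.g. a timeout; treated as a failed attempt
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);  // pause before the next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", ie);
                }
            }
        }
        throw new IllegalStateException(
            "No 200 response after " + maxAttempts + " attempts (last status: " + lastStatus + ")",
            lastError);
    }
}
```

Because the method only returns once the current request has succeeded, calling it once per id in a sequential loop would also keep the next id from being sent before the current one gets a 200.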

    Or, if your calls are unrelated to each other, you could just filter rows with code <> 200 to
    an error table. Then rerun your job starting from the error table to do the retries.
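
The error-table idea can be sketched outside PDI as repeated passes over a shrinking list of failed ids. This is only an illustration of the control flow, assuming independent calls: the class and method names are hypothetical, an in-memory list stands in for the error table, and the request itself is supplied as a function from id to status code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

final class ErrorPass {

    private ErrorPass() {}

    /** One pass: calls every id once and returns the ids that did not get a 200. */
    static List<String> failedIds(List<String> ids, ToIntFunction<String> call) {
        List<String> failed = new ArrayList<>();
        for (String id : ids) {
            if (call.applyAsInt(id) != 200) {
                failed.add(id);     // this row would go to the error table
            }
        }
        return failed;
    }

    /**
     * Runs up to maxPasses passes, each one only over the ids that failed
     * in the previous pass. Returns the ids that are still failing at the end.
     */
    static List<String> runWithRetryPasses(List<String> ids,
                                           ToIntFunction<String> call,
                                           int maxPasses) {
        List<String> pending = new ArrayList<>(ids);
        for (int pass = 0; pass < maxPasses && !pending.isEmpty(); pass++) {
            pending = failedIds(pending, call);
        }
        return pending;
    }
}
```

An empty result means every id eventually returned 200; a non-empty result is the list of ids a job could report as failed.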
