Hitachi Vantara Pentaho Community Forums

Thread: Create a web service from my transformation

  1. #1



    I am new to the forum and new to PDI, but not to ETL. Our small company is trying to move away from Informatica, and I am building a proof of concept with PDI. Here is the concept: a web app creates data and sends it via POST to a URL that I supply. Whatever sits behind that URL consumes the data and applies a simple mapping into our Oracle DB staging tables, where a custom PL/SQL program within our ERP system picks it up.

    The "consume the data" part is where I am stuck. The rest is straightforward ETL, and the PDI documentation is pretty clear on how to build those transformations. However, creating a web service from the transformation, so that data is fed to it and the transformation is invoked to "do its thing", is not so straightforward. In Informatica this is done by enabling the web service function in the workflow (which I think corresponds to a PDI job).

    I have searched the PDI documentation, Papa Google, and this forum and could not find anything that even remotely suggests how to do this. Just to be clear, I need to provide my URL to the web app so data can be fed to me; I am not referencing a URL for some sort of WSDL lookup.

    Any help would be greatly appreciated.


  2. #2


    Anyone have any ideas? I can rephrase the question if there is uncertainty about what I am asking.

    If there is documentation or other threads that discuss this in detail, please point the way. I am eager to understand how to do this.


  3. #3


    Hey David,

    It is not clear to me what your web app does. Usually a web app is already connected directly to the server that hosts the app or site, and all communication goes through that server. In that case you just have to log whatever you need.

    If you actually want to send the data actively via POST to a server then you will have to come up with your own solution. I am not aware of PDI offering a solution to that. Write a PHP program with a simple API that you can POST your data to and have PHP store this data in a database.

    As soon as you have your data in a database the standard ETL pipeline can get started.
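
    A minimal sketch of the "small API that stores the POST in a database" idea, written in Python rather than PHP and using only the standard library. The file name `staging.db`, the table name `incoming`, and port 8000 are assumptions made for illustration; a real setup would write to whatever staging database the ETL reads from.

```python
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = "staging.db"  # hypothetical staging database file


def store_payload(db_path, payload):
    """Insert one raw POST body into a staging table and return its row id."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS incoming ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, "
                "received_at TEXT DEFAULT CURRENT_TIMESTAMP, "
                "body TEXT NOT NULL)"
            )
            cur = conn.execute(
                "INSERT INTO incoming (body) VALUES (?)", (payload,)
            )
            return cur.lastrowid
    finally:
        conn.close()


class IngestHandler(BaseHTTPRequestHandler):
    """Accepts POST requests and stores the body unchanged."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        row_id = store_payload(DB_PATH, body)
        self.send_response(201)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(f"stored as row {row_id}\n".encode("utf-8"))


def serve(port=8000):
    """Start the listener; blocks until interrupted."""
    HTTPServer(("", port), IngestHandler).serve_forever()
```

    Calling `serve()` starts the listener. A scheduled PDI transformation could then read new rows from the staging table with an ordinary Table Input step.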

    Kind regards



  4. #4


    Hi David,

    You are starting off with a more advanced task - so I'll break it down for you. It may take some time for you to understand how all of the pieces work together.

    You'll need to use Carte - that's the server part of what you'll be running.

    You'll need to use a step that outputs to a servlet. Google that and there are good examples from Matt. Text or JSON output are good examples. If you set them up simply so that you can preview your content in PDI, then change it to output to servlet - you should see about the same thing when you see the output in the servlet.

    Lastly - for your example, you'll need to have your transformation accept a parameter. When you use Kitchen or Pan, parameters are passed on the command line; for a servlet, they go in the URL. I've tested this with Impala and used it a few times and know it to work reliably.
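
    To illustrate "the parameter goes in the URL", here is a rough sketch that builds such a request URL. It assumes Carte's `executeTrans` servlet under `/kettle/executeTrans/` with a `trans` query parameter naming the .ktr file, which is how I recall it from the Carte documentation; check the path against your PDI version. The host, port, file path, and the `ORDER_ID` parameter are made up for the example.

```python
from urllib.parse import urlencode


def build_carte_url(host, port, trans_path, params):
    """Build a URL asking Carte to run a transformation with named parameters.

    Assumption: the servlet path and the 'trans' key follow Carte's
    executeTrans endpoint; verify both against your PDI version.
    """
    query = {"trans": trans_path}
    query.update(params)  # named parameters become extra query arguments
    # safe="/" keeps the file path readable instead of encoding it as %2F
    return f"http://{host}:{port}/kettle/executeTrans/?{urlencode(query, safe='/')}"


url = build_carte_url("localhost", 8080, "/opt/etl/orders.ktr", {"ORDER_ID": "42"})
print(url)
```

    Carte normally sits behind HTTP basic authentication, so a real caller would also have to send credentials with the request.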

    That being said... this effort isn't for the faint of heart - if you stick with it, you'll get it. You can be satisfied that you'll then be a somewhat advanced user.

    Hope that helps.

    ... Here is the best link to explain the process....
    and another good one:
    Last edited by dbron0000; 06-02-2014 at 11:09 PM. Reason: adding the link

  5. #5


    Thanks for the reply. The web app is a hosted web portal that customers use to get quotes and place orders. The data is sent to us via POST because the vendor's database also holds other customers that they host in their system. The POST includes a session ID and a transaction ID, which open a socket between the two systems and let the contained data flow in. That part is done. I just need to get the data from the POST (raw XML) into the PDI server and invoke the joblet or worklet or whatever it is called in Pentaho.
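
    For readers following along, a small sketch of pulling the two IDs out of a raw XML POST body. The element names `sessionId` and `transactionId` and the sample document are purely hypothetical, since the actual payload format is not shown in this thread.

```python
import xml.etree.ElementTree as ET


def extract_ids(raw_xml):
    """Return the session and transaction IDs from a raw XML payload.

    The element names are assumptions; a missing element yields None.
    """
    root = ET.fromstring(raw_xml)
    return {
        "session_id": root.findtext("sessionId"),
        "transaction_id": root.findtext("transactionId"),
    }


sample = (
    "<order>"
    "<sessionId>abc123</sessionId>"
    "<transactionId>t-001</transactionId>"
    "</order>"
)
print(extract_ids(sample))
```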

    I have a solution that Pentaho support sent me. I have started working through it now. I will post here if it works as designed in my head.

  6. #6


    Thanks for the reply. I have viewed both of those links before. They are not exactly what I am looking for, but the second post (diehardsteiner) is closer to the answer, I think. It references the use of Carte to expose the service to a web browser. The other issue I face is what happens if multiple transactions arrive at the same time. I will need some sort of caching service. Pentaho support has pointed me to Apache ActiveMQ for that, so I am testing that and a few other ideas they provided. I will post back once the solution is in place.
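
    The buffering idea can be sketched without any broker. The example below uses Python's in-process `queue.Queue` as a stand-in for ActiveMQ, so it is not the setup Pentaho support described: several senders enqueue payloads concurrently and a single worker drains them one at a time. The function name `run_buffered` and the `process` callback are hypothetical.

```python
import queue
import threading


def run_buffered(payloads, process):
    """Enqueue payloads from concurrent senders; process them one at a time."""
    buffer = queue.Queue()
    results = []

    def worker():
        while True:
            item = buffer.get()
            if item is None:  # sentinel: no more work
                break
            results.append(process(item))

    consumer = threading.Thread(target=worker)
    consumer.start()

    # Each sender thread stands in for one simultaneous POST.
    senders = [threading.Thread(target=buffer.put, args=(p,)) for p in payloads]
    for t in senders:
        t.start()
    for t in senders:
        t.join()

    buffer.put(None)  # all senders are done, so tell the worker to stop
    consumer.join()
    return results
```

    Arrival order is not guaranteed when senders run concurrently, which is also true of a real message queue with several producers.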

  7. #7


    Looking forward to hearing more. I've used RabbitMQ for about the past 18-24 months. I initially built a Java app that called Kettle via the API, but recently updated that to be a consumer that runs in a transformation. There is a good AMQP plugin on GitHub though - that does both production and consumption. Here is a link:

    It's not really a consumer - it uses gets to pull messages, but it's good for testing and getting things up and running quickly.

    Using Carte to expose your transformations as a web service is a good approach - I've been happy with it (although I haven't built much beyond a few proofs of concept).

  8. #8


    I would also like to hear how you solve getting the POST data into the Kettle web service. I have this need as well. An extra plus would be if other RESTful HTTP verbs, such as PUT and DELETE, could be read from a transformation too.

  9. #9



    Did you find a solution to this problem? If so, could you share it? I'm in need of the same thing.

    Thanks in advance and best regards,


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.