Hitachi Vantara Pentaho Community Forums
Results 1 to 7 of 7

Thread: Translate XML string and store as JSON record in MongoDB?

  1. #1
    Join Date
    Nov 2013
    Posts
    4

    Question Translate XML string and store as JSON record in MongoDB?

    I have a transformation step that calls a web service which returns a response via XML. The next step is a User Defined Java Step to translate the XML to JSON. My last step writes the data to MongoDB.

    My problem is that the JSON gets written as a single string to Mongo instead of a JSON object. This happens, I assume, because the output fields from the Java step must be either String or Serialized; I've tried Serialized but the MongoDBOutput step won't accept serialized fields. Is there a way I can write it as an object?

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Quote Originally Posted by Deeg View Post
    The next step is a User Defined Java Step to translate the XML to JSON.
    Not sure why you would have to do this.
    Using one of the XML parser steps would be my first choice.

    Quote Originally Posted by Deeg View Post
    My problem is that the JSON gets written as a single string to Mongo instead of a JSON object.
    Where is the difference?

    Did you ever see Mongo Output at work?
    So long, and thanks for all the fish.

  3. #3
    Join Date
    Nov 2013
    Posts
    4

    Default

    Quote Originally Posted by marabu View Post
    Not sure why you would have to do this.
    Using one of the XML parser steps would be my first choice.
    My problem with using the parser step is that (I believe) you have to list out each field you want to convert. Using the Java step I can parse the XML and convert it into JSON in a few lines without any knowledge of the individual fields. The advantage of this (besides having to add 60+ fields) is that if the fields change the kettle step doesn't have to change as well.

    Quote Originally Posted by marabu View Post
    Where is the difference?

    Did you ever see Mongo Output at work?
    The differences is that all the fields and values are stored as a single string value. What I want is something like this:

    response: {
    returnCode: "200"
    message: "OK"
    ...
    }


    What I get is

    response: "{ returnCode: \"200\" message: \"OK\"... }"

    In the second example the JSON is stored as a single string. Did I explain that ok?
    Last edited by Deeg; 11-20-2013 at 01:34 PM.

  4. #4
    Join Date
    Nov 2013
    Posts
    4

    Default

    I figured out how to get what I want which is...to upgrade to 5.0.1. The MongoDbOutput step has a flag to indicate that the string is really a JSON object. Thanks for looking into it marabu.

  5. #5
    Join Date
    Nov 2011
    Posts
    7

    Default

    Quote Originally Posted by Deeg View Post
    ... The next step is a User Defined Java Step to translate the XML to JSON.
    Deeg,

    Could you share how you did this? (what expression you used) This is exactly what I'm trying to do.

  6. #6
    Join Date
    Nov 2013
    Posts
    4

    Default

    Here's my code. The field "xml_result" is the string retrieved in my previous step and contains the XML string. "response" is the JSON string. You need to add jsonpath-1.0.jar file to your Pentaho lib directory.

    Code:
    import org.json.XML;
    import org.json.JSONException;
    import org.json.JSONObject;
    
    public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
    {
        // First, get a row from the default input hop
        //
        Object[] r = getRow();
    
        // If the row object is null, we are done processing.
        //
        if (r == null) {
          setOutputDone();
          return false;
        }
    
        // It is always safest to call createOutputRow() to ensure that your output row's Object[] is large
        // enough to handle any new fields you are creating in this step.
        //
        Object[] outputRow = createOutputRow(r, data.outputRowMeta.size());
    
        try
        {
            String xml = get(Fields.In, "xml_result").getString(r);
            JSONObject json = XML.toJSONObject( xml );
            String jsonString = json.toString();
            get(Fields.Out, "response").setValue(outputRow, jsonString);
        }
        catch ( JSONException e )
        {
            get(Fields.Out, "response").setValue(outputRow, "Error");
        }    
    
        // putRow will send the row on to the default output hop.
        //
        putRow(data.outputRowMeta, outputRow);
    
        return true;
    }
    

  7. #7

    Default

    How was your previous step to get a single string with the whole XML? Did you use the "Get data from XML" ?

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.