Hitachi Vantara Pentaho Community Forums

Thread: JSON input parse different/ breaking from version 6.1 to 7.1

  1. #1

    Hello there --

    I recently upgraded from 6.1 to 7.1 and noticed that the JSON input step now breaks.

    Not sure what could have changed between versions. I'm pretty sure "JSONquery" itself is fairly static and dated, so I'm a bit confused.

    The parse paths are $[*].[*].id and $[*].[*].author.name

    Essentially, the input is a combination of arrays that contain an equal number of id fields and author.name fields.

    For some reason the 7.1 version appears to be interpreting $[*] differently. Why was the logic changed here in particular?

    If you want an example of each row being parsed, it's this:

    [{"id":"3023", "author":{"name":"twain"}, "field3":"blah"},
    {"id":"3025", "author":{"name":"bob"}, "field3":"blah"},
    {"id":"3055", "author":{"name":"kim"}, "field3":"blah"}]


    As you can see, because unequal leaves require "multi parse" JSON, the JSON for each row starts as a "one item array".

    $[*] essentially identifies the '1 item array'.

    $[*].[*] says take this 1 item array (the whole object) and take the sub-objects in it. In this case, the {} entries.

    So I'm not sure why this is breaking (it worked fine in 6.1).

    It reports that $[*].[*].id returns all id values but $[*].[*].author.name returns only 5 or so (odd). The actual data is complete.

    I know I'll have to investigate further myself, but what was the grand logic change in JSON parsing between 6.1 and 7.1? And why has Pentaho not solved this unequal leaf problem in JSON parsing yet? I've seen plugins come and go, explanations offered, and rabbit holes chased down, and no solution presently exists other than using 2-3 JSON input steps or some wild custom JavaScript.
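
For readers unfamiliar with the "unequal leaf" problem mentioned above, here is a minimal Python sketch of the general idea. It is an illustration only and does not use or reproduce Pentaho's JSON input step. The record with no author is hypothetical (the data in this thread is complete), and `get_path` is a helper invented for the example. When each path is collected independently, a missing leaf shortens one column and the values pair up with the wrong rows; extracting per record keeps them aligned.

```python
# Hypothetical data: the second record has no "author" leaf.
records = [
    {"id": "3023", "author": {"name": "twain"}},
    {"id": "3025"},
    {"id": "3055", "author": {"name": "kim"}},
]

# Each path collected on its own, skipping records where the leaf is absent.
ids = [r["id"] for r in records if "id" in r]
names = [r["author"]["name"] for r in records
         if "author" in r and "name" in r["author"]]

# Pairing the two columns by position puts "kim" next to id 3025.
misaligned = list(zip(ids, names))


def get_path(obj, keys):
    """Follow a chain of keys; return None as soon as one is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


# Extracting both fields from the same record keeps the rows aligned.
aligned = [(get_path(r, ["id"]), get_path(r, ["author", "name"]))
           for r in records]

print(misaligned)
print(aligned)
```

The per-record approach is roughly what a custom script would have to do; whether a given Pentaho version can do the same inside one step is not something this sketch answers.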

  2. #2

    Alright, I figured it out about 5 minutes after posting. I'll leave it here anyway.

    Somewhere between 6.1 and 7.1, the logic was changed.

    It appears that the input [{"field":"value"}] is no longer treated as a 1 item array (because of the outer []) but as just one giant object. It looks like $.[*].id works now instead of $[*].[*].id.
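
To make the nesting difference concrete, here is a small Python sketch. It is a hand-rolled wildcard walker written for this illustration, not the JSONPath engine that Pentaho uses, so it only shows how the number of wildcard steps has to match the nesting depth; it does not reproduce the exact 7.1 behaviour described above. The names `select`, `evaluate`, and `wrapped` are made up for the example, and `wrapped` assumes the older version behaved as if the input had one extra outer array.

```python
import json


def select(nodes, step):
    """Apply one path step to every node.

    "*" expands a list into its elements and a dict into its values;
    any other step is treated as a key lookup on dicts.
    """
    out = []
    for node in nodes:
        if step == "*":
            if isinstance(node, list):
                out.extend(node)
            elif isinstance(node, dict):
                out.extend(node.values())
        elif isinstance(node, dict) and step in node:
            out.append(node[step])
    return out


def evaluate(doc, steps):
    """Run the steps in order, starting from the document root."""
    nodes = [doc]
    for step in steps:
        nodes = select(nodes, step)
    return nodes


rows = json.loads("""
[{"id":"3023", "author":{"name":"twain"}, "field3":"blah"},
 {"id":"3025", "author":{"name":"bob"}, "field3":"blah"},
 {"id":"3055", "author":{"name":"kim"}, "field3":"blah"}]
""")

# Assumption: the input seen as a one-item array around the rows.
wrapped = [rows]

ids_wrapped_two = evaluate(wrapped, ["*", "*", "id"])  # two wildcards needed
ids_plain_one = evaluate(rows, ["*", "id"])            # one wildcard is enough
ids_plain_two = evaluate(rows, ["*", "*", "id"])       # overshoots the objects

print(ids_wrapped_two)
print(ids_plain_one)
print(ids_plain_two)
```

In this sketch the second wildcard on the unwrapped input lands on the values inside each object rather than on the objects themselves, so the `id` lookup finds nothing. A real JSONPath implementation may treat that case differently.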


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.