US and Worldwide: +1 (866) 660-7555
Page 1 of 2 12 LastLast
Results 1 to 10 of 12

Thread: How to process a JSON file with multiple records?

  1. #1

    Default How to process a JSON file with multiple records?

    I am having a problem getting the JSON INPUT step to process multiple JSON records within a single file. If there is only one line in the file:

    {"unit_record":[{"TERM":"201220","INST_METHOD":"TR"}]}

    Then it processes just fine. However as soon as there is more than one JSON record in the file, the JSON INPUT step says there are no rows:

    {"unit_record":[{"TERM":"201220","INST_METHOD":"TR"}]}
    {"unit_record":[{"TERM":"200910","INST_METHOD":"IN"}]}

    Perhaps my REGEX is incorrect? Here is what I have in my JSON INPUT step for the path for each column:

    $.unit_record[*.TERM]
    $.unit_record[*.INSTRUCTIONAL_METHOD]

  2. #2
    Join Date
    Jun 2012
    Posts
    3,262

    Default

    While each line is valid JSON, the whole file content is not.
    Attached Files Attached Files
    pdi-ce-4.4.0-stable (with patches)
    java 1.7.0_51 (OpenJDK)
    ubuntu 13.10 (x86_64)
    timezone CET / CEST
    sig updated 2014-01-25

  3. #3

    Default

    Awesome! That works great...thanks!

  4. #4
    Join Date
    May 2013
    Posts
    10

    Default

    I have similar problem.
    My JSON input file contains more that one JSON Objects. When I go with "Preview Rows", the error message says "Error in parsing file".
    I checked the validity of the file online. It has no syntax/ formatting error.
    The file contains data as:

    {"possibly_sensitive_editable":true,"text":"\u2665 http:\/\/t.co\/MY9uDO3W","retweeted":false,"contributors":null,"coordinates":null,"in_reply_to_status_id":null,"possibly_sensitive":false,"in_reply_to_status_id_str":null,"retweet_count":0,"truncated":false,"source":"\u003Ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003ETwitter for iPhone\u003C\/a\u003E","entities":{"hashtags":[],"media":[{"type":"photo","indices":[2,22],"media_url_https":"https:\/\/p.twimg.com\/A4lQdJfCYAAp2xs.jpg","display_url":"pic.twitter.com\/MY9uDO3W","expanded_url":"http:\/\/twitter.com\/m3raftk\/status\/254823315611475968\/photo\/1","id_str":"254823315615670272","url":"http:\/\/t.co\/MY9uDO3W","media_url":"http:\/\/p.twimg.com\/A4lQdJfCYAAp2xs.jpg","sizes":{"thumb":{"resize":"crop","h":150,"w":150},"large":{"resize":"fit","h":299,"w":299},"small":{"resize":"fit","h":299,"w":299},"medium":{"resize":"fit","h":299,"w":299}},"id":254823315615670272}],"urls":[],"user_mentions":[]},"geo":null,"in_reply_to_user_id":null,"created_at":"Sun Oct 07 05:59:47 +0000 2012","in_reply_to_user_id_str":null,"id_str":"254823315611475968","place":null,"in_reply_to_screen_name":null,"favorited":false,"user":{"profile_background_image_url":"http:\/\/a0.twimg.com\/images\/themes\/theme1\/bg.png","followers_count":8817,"profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/2666832432\/1c4719290f5a41238dce1618d05b094e_normal.jpeg","screen_name":"m3raftk","default_profile_image":false,"geo_enabled":false,"profile_link_color":"0084B4","profile_banner_url":"https:\/\/si0.twimg.com\/profile_banners\/534963081\/1349031726","favourites_count":8,"notifications":null,"profile_background_color":"C0DEED","description":"\u0622\u0643\u0627\u0648\u0646\u062a \u062e\u0622\u0635 \u0644\u0639\u0645\u0644 \u0631\u062a\u0648\u064a\u062a \u0641\u0640\u0648\u0644\u0648 \u0645\u0640\u064a \n\n\u0631\u062a\u0648\u064a\u062a \u0644\u0627\u0639\u0644\u0627\u0646\u064a + \u062d\u0637 \u062d\u0633\u0627\u0628\u064a \u0641\u064a \u062a\u063a\u0631\u064a\u062f\u062a\u0643 \u0648\u0627\u0631\u0633\u0644\u0647\u0627 \u0648\u0627\u0646\u0627 \u0627\u0633\u0648\u064a \u0644\u0643 \u0631\u062a\u0648\u064a\u062a \u0633\u0648\u0644\u064a \u0641\u0648\u0644\u0648 \u0648\u0633\u0648\u064a\u0644\u0643 \u0631\u062a\u0648\u064a\u062a , \u0648\u0634\u0643\u0631\u0622 \u0639 \u062a\u0639\u0627\u0648\u0646\u0643\u0645 ","profile_background_tile":false,"contributors_enabled":false,"lang":"ar","time_zone":null,"created_at":"Sat Mar 24 01:20:43 +0000 2012","profile_sidebar_fill_color":"DDEEF6","id_str":"534963081","listed_count":5,"statuses_count":18243,"url":null,"follow_request_sent":null,"friends_count":9693,"profile_sidebar_border_color":"C0DEED","protected":false,"is_translator":false,"default_profile":true,"following":null,"profile_background_image_url_https":"https:\/\/si0.twimg.com\/images\/themes\/theme1\/bg.png","name":"\u0631\u062a\u0648\u064a\u062a\u0643\u0645 || \u2022RT\u2022","verified":false,"profile_use_background_image":true,"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/2666832432\/1c4719290f5a41238dce1618d05b094e_normal.jpeg","location":"kuwait ","id":534963081,"utc_offset":null,"profile_text_color":"333333"},"id":254823315611475968}
    {"possibly_sensitive_editable":true,"text":"Niko Liko's cousin was arrested today in Juarez Mx.when the victim was asked who violated her she said, \"Some Clown\" http:\/\/t.co\/5VZXsnso","retweeted":false,"contributors":null,"coordinates":null,"in_reply_to_status_id":null,"possibly_sensitive":false,"in_reply_to_status_id_str":null,"retweet_count":0,"truncated":false,"source":"\u003Ca href=\"http:\/\/www.apple.com\" rel=\"nofollow\"\u003EPhotos on iOS\u003C\/a\u003E","entities":{"hashtags":[],"media":[{"type":"photo","indices":[117,137],"media_url_https":"https:\/\/p.twimg.com\/A4lQdJ5CAAAxIP6.jpg","display_url":"pic.twitter.com\/5VZXsnso","expanded_url":"http:\/\/twitter.com\/senorkrinkle\/status\/254823315716308993\/photo\/1","id_str":"254823315724697600","url":"http:\/\/t.co\/5VZXsnso","media_url":"http:\/\/p.twimg.com\/A4lQdJ5CAAAxIP6.jpg","sizes":{"thumb":{"resize":"crop","h":150,"w":150},"large":{"resize":"fit","h":411,"w":347},"small":{"resize":"fit","h":403,"w":340},"medium":{"resize":"fit","h":411,"w":347}},"id":254823315724697600}],"urls":[],"user_mentions":[]},"geo":null,"in_reply_to_user_id":null,"created_at":"Sun Oct 07 05:59:47 +0000 2012","in_reply_to_user_id_str":null,"id_str":"254823315716308993","place":null,"in_reply_to_screen_name":null,"favorited":false,"user":{"profile_background_image_url":"http:\/\/a0.twimg.com\/images\/themes\/theme1\/bg.png","followers_count":40,"profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/1253096833\/qfI5zSRf_normal","screen_name":"senorkrinkle","default_profile_image":false,"geo_enabled":false,"profile_link_color":"0084B4","favourites_count":2,"notifications":null,"profile_background_color":"C0DEED","description":"anti right ...but Almost never wrong!","profile_background_tile":false,"contributors_enabled":false,"lang":"en","time_zone":null,"created_at":"Mon Feb 21 06:33:03 +0000 2011","profile_sidebar_fill_color":"DDEEF6","id_str":"255370866","listed_count":2,"statuses_count":2298,"url":null,"follow_request_sent":null,"friends_count":69,"profile_sidebar_border_color":"C0DEED","protected":false,"is_translator":false,"default_profile":true,"following":null,"profile_background_image_url_https":"https:\/\/si0.twimg.com\/images\/themes\/theme1\/bg.png","name":"se\u00f1orkrinkle ","verified":false,"profile_use_background_image":true,"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/1253096833\/qfI5zSRf_normal","location":"Earth ","id":255370866,"utc_offset":null,"profile_text_color":"333333"},"id":254823315716308993}



    Sharing with you the error message that I am getting while previewing the rows.

    2013/05/27 17:52:21 - Spoon - Transformation opened.
    2013/05/27 17:52:21 - Spoon - Launching transformation [testing_jason]...
    2013/05/27 17:52:21 - Spoon - Started the transformation execution.
    2013/05/27 17:52:21 - Transformation metadata - Natural sort of steps executed in 0 ms (1 time previous steps calculated)
    2013/05/27 17:52:21 - Spoon - The transformation has finished!!
    2013/05/27 19:43:48 - D:\Harshad Joshi(6848)\Pentaho Documentation\twitter_sample\testing_jason.ktr : testing_jason - Dispatching started for transformation [D:\Harshad Joshi(6848)\Pentaho Documentation\twitter_sample\testing_jason.ktr : testing_jason]
    2013/05/27 19:43:48 - Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) : Could not open file #1 : file:///D://Pentaho Documentation/twitter_sample/xyz.txt --> org.pentaho.di.core.exception.KettleException:
    2013/05/27 19:43:48 - Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) : org.pentaho.di.core.exception.KettleException:
    2013/05/27 19:43:48 - Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) : Error parsing file [D:\Pentaho Documentation\xyz.txt]!
    2013/05/27 19:43:48 - Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) :
    2013/05/27 19:43:48 - Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) :
    2013/05/27 19:43:48 - Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) : Error parsing file [D:\Pentaho Documentation\twitter_sample\xyz.txt]!
    2013/05/27 19:43:48 - Json Input.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=1)
    2013/05/27 19:43:48 - D:\Pentaho Documentation\twitter_sample\testing_jason.ktr : testing_jason - D:\Pentaho Documentation\testing_jason.ktr : testing_jason
    2013/05/27 19:43:48 - D:\Pentaho Documentation\twitter_sample\testing_jason.ktr : testing_jason - D:\Pentaho Documentation\testing_jason.ktr : testing_jason


    What corrections/ changes do I need to make in order to make JSON Input step work flawless ?
    Thanks in anticipation.
    Last edited by Harshad; 05-27-2013 at 10:28 AM.

  5. #5
    Join Date
    Jun 2012
    Posts
    3,262

    Default

    I have no problem to parse your sample data when reading it from a text file having two lines.
    If you need help, share your transformation.
    pdi-ce-4.4.0-stable (with patches)
    java 1.7.0_51 (OpenJDK)
    ubuntu 13.10 (x86_64)
    timezone CET / CEST
    sig updated 2014-01-25

  6. #6
    Join Date
    May 2013
    Posts
    10

    Default

    Quote Originally Posted by marabu View Post
    I have no problem to parse your sample data when reading it from a text file having two lines.
    If you need help, share your transformation.

    Actually the file has more than 100 rows I suppose and I have shared the data having 2 rows only.
    I'll share the transformation with you.
    The transformation has no confusing steps. Just a JSON input and a flat file output.
    Please confirm the Filename. I cannot upload the text file more that 900kbs so I am attaching the JASON input file of 500kbs having few records.
    Please help.
    Attached Files Attached Files

  7. #7
    Join Date
    Jun 2012
    Posts
    3,262

    Default

    It seems you hijacked a thread without even reading it.
    The solution I gave in post #2 was to read JSON line by line.
    This is done by a Text File Input step. JSON Input is ordered to read from a field, then. Have a look.
    You should learn about JSONPath, too. It is $.text what you want to read, not just text.
    And last not least, there is a twitter related sample in your Kettle install folder (samples/transformations/real-time-streaming).

    BTW: Only attach as much data as needed and zip large files ...
    pdi-ce-4.4.0-stable (with patches)
    java 1.7.0_51 (OpenJDK)
    ubuntu 13.10 (x86_64)
    timezone CET / CEST
    sig updated 2014-01-25

  8. #8
    Join Date
    May 2013
    Posts
    10

    Default

    Quote Originally Posted by marabu View Post
    It seems you hijacked a thread without even reading it.
    The solution I gave in post #2 was to read JSON line by line.
    This is done by a Text File Input step. JSON Input is ordered to read from a field, then. Have a look.
    You should learn about JSONPath, too. It is $.text what you want to read, not just text.
    And last not least, there is a twitter related sample in your Kettle install folder (samples/transformations/real-time-streaming).

    BTW: Only attach as much data as needed and zip large files ...


    Thanks for the help, marabu.
    Could you please share the transformation(.ktr) file with me which works on the data that I have provided in this thread. The one that has two rows.
    At least I will get an idea about what configurations do I need to make while reading a JSON file.

  9. #9
    Join Date
    Jun 2012
    Posts
    3,262

    Default

    Here you are
    Attached Files Attached Files
    pdi-ce-4.4.0-stable (with patches)
    java 1.7.0_51 (OpenJDK)
    ubuntu 13.10 (x86_64)
    timezone CET / CEST
    sig updated 2014-01-25

  10. #10
    Join Date
    May 2013
    Posts
    10

    Default

    Thanks a ton !
    Its working flawlessly...!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •