Hitachi Vantara Pentaho Community Forums

Thread: how to extract top N lines from each group?

  1. #1

    How to extract top N lines from each group?

    Hi All,

    Suppose there are two fields in the stream (or file): one field is the group (from a to c), and the other is the income of the group.

    Now I want to extract the top N (say, top 20) incomes from each group, that is, top 20 from a, top 20 from b, and top 20 from c. My clumsy way is to set up 2 filters (group == a, then group == b, and whatever is left is group == c). This can't be applied to many groups (if there are 1000 groups, I can't set up 999 filters).

    Do you have any good methods to solve this problem? Thanks in advance.

    ---
    Samuel Wu

  2. #2

    Javascript and 1 filter

    or a Groupby step and 1 filter ;-)

    Sven
    Last edited by sboden; 07-16-2007 at 02:16 AM.

  3. #3

    Sven,
    Thanks. I'm thinking of using JavaScript, but my JavaScript is too weak...

    Can you give me some lines as an example?

    ---
    Samuel

  4. #4

    If it works without JavaScript, go that way first.

    Assuming your input is in the right order (by group, then by income descending; else sort it first): use a Group by step, enable "Include all rows" and "Add line number, restart in each group", and then put a filter after it on linenumber <= 20.
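
    For anyone who wants to see the idea as code, here is a rough sketch of the same sort / number / filter logic in plain JavaScript. It runs outside Kettle and is not a script for the JavaScript step; the function name topNPerGroup and the field names group, income and lineNumber are made up for illustration.

```javascript
// Standalone sketch of the sort -> number-per-group -> filter idea.
// The names topNPerGroup, group, income and lineNumber are assumptions.
function topNPerGroup(rows, n) {
  // "Sort rows": by group ascending, then by income descending.
  const sorted = [...rows].sort((a, b) => {
    if (a.group !== b.group) return a.group < b.group ? -1 : 1;
    return b.income - a.income;
  });

  // "Group by" with all rows kept: add a line number that restarts
  // at 1 whenever the group value changes.
  let previousGroup;
  let lineNumber = 0;
  const numbered = sorted.map((row) => {
    lineNumber = row.group === previousGroup ? lineNumber + 1 : 1;
    previousGroup = row.group;
    return { ...row, lineNumber };
  });

  // "Filter rows": keep only lineNumber <= n.
  return numbered.filter((row) => row.lineNumber <= n);
}

const sample = [
  { group: "a", income: 10 },
  { group: "b", income: 7 },
  { group: "a", income: 30 },
  { group: "a", income: 20 },
  { group: "b", income: 9 },
];

// Top 2 incomes per group.
console.log(topNPerGroup(sample, 2));
```

    The numbering only works because the rows are sorted first, which is the same reason the Group by step needs sorted input.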

    Regards,
    Sven

  5. #5

    I got it solved with the group by.

    I attached the Kettle transformation I used.

    Thanks, Sven.
