Hitachi Vantara Pentaho Community Forums

Thread: How to Process data from MySQL Table

  1. #1
    Join Date
    May 2011
    Posts
    11

    Question How to Process data from MySQL Table

    Hello


    I want to do something that is probably really simple in Spoon, but I can't manage to make it work.


    What I need is simply to read some rows from a MySQL table, apply some processing to them, and finally load the result into another MySQL table.


    The data to be processed is like the figure below:

    Attachment 8049

    The loc_id column identifies each distinct user, and each click they make on the website creates a new row. So the user identified by "1011081020520110625234600729" clicked 4 times.

    First I need to count how many clicks each user made: if the count is between 2 and 300 the user is kept; otherwise they are discarded. (I tried to implement this with a COUNT(loc_id) SQL query, but it became much slower.)

    After that, I need to identify which prop_codes values the user has clicked (the last column in the figure). If only 'BP' appears, I flag the user as 1; if only 'BF', as 2; and if both 'BP' and 'BF' appear, as 3.

    Finally, I need to count how many distinct 'us' values appear (another field).

    After that, I'll load the data into another table called users with these values (the SQL sketch after this list shows the whole aggregation I'm after):
    • loc_id
    • IP (even though different IPs may appear, I just need to take one value)
    • number of distinct us
    • us (I can select any one, since I only use one value)
    • store of preference (1, 2, or 3)
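
    To make the logic concrete, this is roughly what I'm after in plain SQL; the table name clicks is just a placeholder for mine, and it's this kind of aggregate query that was too slow when I tried it:

    Code:
    SELECT loc_id,
           MIN(ip)            AS ip,      -- any one IP value will do
           COUNT(DISTINCT us) AS num_us,
           MIN(us)            AS us,      -- any one us value will do
           CASE                           -- assumes prop_codes only ever holds 'BP' or 'BF'
             WHEN MAX(prop_codes = 'BP') = 1 AND MAX(prop_codes = 'BF') = 0 THEN 1
             WHEN MAX(prop_codes = 'BP') = 0 AND MAX(prop_codes = 'BF') = 1 THEN 2
             ELSE 3                       -- both 'BP' and 'BF' appear
           END                AS store_of_preference
    FROM clicks
    GROUP BY loc_id
    HAVING COUNT(*) BETWEEN 2 AND 300;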
    Is it possible to build this transformation in Spoon? It seems fairly straightforward, but I couldn't make it work. I tried using 'Select values' after the Table Input step and then comparing the values, but I still need it to work per user, as well as to count how many clicks each user made.

    I appreciate any help!

    Will

  2. #2
    Join Date
    Mar 2006
    Posts
    170

    Default

    Hi Will,

    Try a Table Input step with your query, then sort the rows by loc_id, prop_codes, us using the Sort rows step.
    Next up, use the Group By step; that is where you will play around learning how you want to window/summarize the data.
    After that, Insert/Update your target table.
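
    For example, something along these lines in the Table Input (table and column names are placeholders for yours):

    Code:
    SELECT loc_id, ip, prop_codes, us
    FROM clicks
    ORDER BY loc_id, prop_codes, us

    If the database does the sorting for you, the Group By step can read the rows directly; otherwise keep the Sort rows step in front of it, since Group By expects sorted input.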

    Look in the examples and wiki for the various tricks inside the Group By step.

    That should hopefully point you in a direction.

    Good Luck!

    Kent

  3. #3
    Join Date
    May 2011
    Posts
    11

    Default

    Quote Originally Posted by kandrews
    Hi Will,

    Try a Table Input step with your query, then sort the rows by loc_id, prop_codes, us using the Sort rows step.
    Next up, use the Group By step; that is where you will play around learning how you want to window/summarize the data.
    After that, Insert/Update your target table.

    Look in the examples and wiki for the various tricks inside the Group By step.

    That should hopefully point you in a direction.

    Good Luck!

    Kent
    Hi Kent,

    Thank you very much for the help. It worked pretty well.

    Since the rows/s throughput isn't very high, I'm trying to break the join statement used in the Table Input into smaller parts.

    What I'm trying to do is first get a user value. With this value I search another table for all entries that match that user, and then process the resulting data.

    I tried using two Table Input steps. With the first I select the user, and then I feed this value into the second Table Input, whose query is supposed to take the previously selected user as an input variable, but doing this gives me an error message.
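
    The query in the second Table Input looks roughly like this (clicks is again a placeholder for the real table), with the ? meant to be filled by the user value coming from the first step:

    Code:
    SELECT *
    FROM clicks
    WHERE loc_id = ?

    From what I've read, the Table Input step has an 'Insert data from step' option for wiring in the previous step's output, but I may be using it wrong.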

    Is there a way to implement this strategy in Spoon?

    Thanks in advance!

    Will
