Hitachi Vantara Pentaho Community Forums
Results 1 to 14 of 14

Thread: DATA from on table to another with Hourly totals

  1. #1
    Join Date
    Aug 2013
    Posts
    139

    Default DATA from on table to another with Hourly totals

    hello all,

    I am new to PDI 4.4, BI Server 4.8 CE and DB MYsql
    our requirement is to load data from table1 to table2,
    table1 contains meter reading records for every 45 second for around 60 meters
    table2 should contain only one record per hour which is units consumed for that hour

    The typical part of this calculation is to calculate number of units consumed by each meter on hourly basis

    i guess below example would give some idea

    MeterID TimeStamp Units
    meter1 21-03-2014 02:59:15 40(difference of reading ( 2am last reading of 21-03-2014 02:59:15 - 1am last reading of 21-03-2014 01:59:45)
    meter 2 21-03-2014 02:59:45 60(difference of reading ( 2am last reading of 21-03-2014 02:59:45 - 1 am last reading of 21-03-2014 01:59:15)

    and so on every hour for everyday, Suggest me some ideas to crack this
    Last edited by suresh84c; 04-01-2014 at 10:56 AM.
    Thank U For Your Time
    Suresh

  2. #2
    Join Date
    Aug 2013
    Posts
    139

    Default

    did tried but i am struggling to start up give me something to start up............. i am counting on you guys... really confused how to achieve this....
    Thank U For Your Time
    Suresh

  3. #3
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Sorry to say this, but enthusiasm alone will get you nowhere.
    As someone who's reading this forum for several months now, you should have learned of the importance of a sound problem statement.
    Everybody will jump to your help if you are struggling with Kettle, but it's almost impossible to teach you the art of requirements specification, for example.
    For a start, let someone proofread your posting.
    Avoid non-existent words like u, plz, b/w, when communicating with professionals.
    Make your posting easy and pleasant to read. I for one will decide after about five seconds to skip a posting due to a lack of appeal, interest or understanding.
    Generally, don't expect someone to deal with your posting more than five minutes.
    If you have problems to find the right words, provide reusable sample data for input, but don't expect us to type it in.
    For your expected output you can use tables in the message editor.

    All the best to you.
    So long, and thanks for all the fish.

  4. #4
    Join Date
    Aug 2013
    Posts
    139

    Default

    Thank You Marabu, Point Noted
    Thank U For Your Time
    Suresh

  5. #5
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Let's assume table1 has columns (meter, timestamp, value).
    I would derive day (of the year) and hour (of the day) fields from timestamp using a Calculator step.
    Now I can read max(value) grouping by (meter, day, hour).
    I would use Analytic Query's LAG 1 function to carry the previous maxvalue and another Calculator to determine the hourly units.
    So long, and thanks for all the fish.

  6. #6
    Join Date
    Aug 2013
    Posts
    139

    Default DATA from on table to another with Hourly totals calulaction atttched .ktr

    Hello Everyone

    Thanks Marubu

    I created the Transformation as explained

    the problem i am facing is, its not grouping meter wise, hour, day. i.e its not getting one record per hour after grouping
    and while doing the Analytic Query Lag 1 operation its just picking the previous value
    which is of same hour and also of different meter i am attaching the .ktr file with data grid step with some sample data


    I tried also with sql query by getting one record per hour and value and tried to do grouping meter wise but no use

    have a look in .Ktr file attached below might get more idea. Kettle version PDI 5.0 CA

    which part am i missing in this



    Meter_Units.ktr
    Last edited by suresh84c; 04-03-2014 at 03:40 AM.
    Thank U For Your Time
    Suresh

  7. #7
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    You forgot to calculate the hour value.
    Also, you are supposed to provide sorted rows as input to a Group By step.
    So long, and thanks for all the fish.

  8. #8
    Join Date
    Aug 2013
    Posts
    139

    Default atttched .ktr file for refrence

    Thank You Marabu for your support.

    I created this ktr as suggested and achieved the format as required till Analytic query i am getting null value by using Lag 1 for group by Hour,

    its working if i changed it to day or meter id but that is not what i required, i am missing something



    Meter_Un.ktr
    Thank U For Your Time
    Suresh

  9. #9
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    You still didn't calculate the hour of the day.
    So long, and thanks for all the fish.

  10. #10
    Join Date
    Aug 2013
    Posts
    139

    Default

    Thank you Marabu

    Got it and successfully implemented


    now i getting units for meterid 1 from 0 to 23 for first day and second day so on
    i missing units for 0'th hourly every day i.e null value for 0'th hour lag 1 feild

    solution Guess:
    need to get 0'th lag 1 reading from previous day 23'rd hour reading, if so is it possible?
    Thank U For Your Time
    Suresh

  11. #11
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    The keys in "Analytic Query" must be the same as in "Group By", the list may be shortened at the end, though.
    I would expect to find meterid as the only key used in "Analytic Query".
    So long, and thanks for all the fish.

  12. #12
    Join Date
    Aug 2013
    Posts
    139

    Default

    Thanks marabu

    table1:meterid,reading,timestamp
    table2:meterid, units,Actual_units,timestamp-----[hourly data table]

    Actual_units calculation is like difference of units like meter1=meter1-[meter10 + meter30]

    These formulas we are reading it from Property file

    Example Table look like

    Formula: Meter1=meter1-meter2, Meter4=meter4-(meter3+meter5)

    Meter Id units Actual Units time
    meter1 20 10 08-04-2014 10:58
    meter2 10 10 08-04-2014 10:59
    meter3 40 40 08-04-2014 10:57
    meter4 80 15 08-04-2014 10:59
    meter5 25 25 08-04-2014 10:59
    meter1 30 15 08-04-2014 11:59
    meter2 15 15 08-04-2014 11:57
    meter3 20 20 08-04-2014 11:58
    meter4 90 35 08-04-2014 11:59
    meter5 35 35 08-04-2014 11:58


    Is it better to build another transformation or do it in same
    can we achieve this without java

    Please suggest us to solve this
    Last edited by suresh84c; 04-08-2014 at 08:11 AM.
    Thank U For Your Time
    Suresh

  13. #13
    Join Date
    Aug 2013
    Posts
    139

    Default

    Hi Marabu,


    Sorry for that thread that was created for another person and posted unknowingly,,

    we need to check the last row record of the output table for any meter Id, and read that timestamp and get the records from input table from next hour

    say suppose last record time stamp is 2014-03-04 10:58:59 in output table for meter1 we then retrieve all meter records from 11th hour of that date to today previous hour(get system time to check present and go to previous, which is completed hour)

    Regards,
    Thank U For Your Time
    Suresh

  14. #14
    Join Date
    Aug 2013
    Posts
    139

    Default

    Thanks Marubu,

    I did achieved the task now the calculations of units are perfect and everything run fine when i work with whole table.
    i am getting rows till completed hour from input table as suggested in other thread
    we get around 40k rows data for 1 day so thought to get data only required for calculation would be much faster


    This job has to run every hour and insert only one hour rows say if it run at 7:5:05 it will only insert 6th hour data.
    when started working on this Faced the following Problems,

    1. I need to check output table last(max time) row for every meter and then insert new rows from next hour
    (This is for in case if something went wrong and we missed few hours it has to insert all at next run)

    2. if i got those max times from output table and get the rows from there in Input table
    the first hour i will calculating lag 1 record will be null.(this also a case for exactly one hour data)
    if i some how get previous hour row and calculate how to avoid adding already existing row in output table

    Help me to get this, please tell me if i was in right direction or not

    I do apologize for my grammar
    Thank U For Your Time
    Suresh

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.