Hitachi Vantara Pentaho Community Forums
Results 1 to 8 of 8

Thread: Date vs DateTime

  1. #1

    Default Date vs DateTime

    I'm starting a new project using Kettle, with some characteristics of a data warehouse application. I have complete control over the DB schema that I use for internal and output data stores, stored in SQL Server. Looking for advice on "best practice" using DATE types.

    Most of the dates that I receive are just date, time is immaterial and usually zeroed. I was using the default DATETIME in SQL Server, as that is what Kettle defaults to. But when I do math such as "calcuate date 45 days ago", I have to strip off the time to do the math. Since my inputs and outputs don't care about time, I thought to switch to just DATE column types. However, it seems that my streams now are using String as data type. Is this standard Kettle behavior:

    • DATE in DB translates to String in stream
    • DATETIME in DB translates to Date in stream


    I don't like the idea of doing string-to-date conversions and worrying about whether the string has / or - separator. Should I just define all date-y columns as DATETIME, and just strip time off when calculating or doing output? I've done a few Kettle projects before, but they were very small; this is the first significant production app, and I'd like to do it the easiest/best way the first time.

    Thanks

  2. #2
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    The database driver uses java.sql.Date as the internal type for columns of type DATE.
    Date looks like DateTime with the time portion set to midnight.
    If you connect a Table Input step to a Text File Output step, you will observe Spoon using Date for your Date fields.

    Exactly when do you experience an unforced type conversion from Date to String?
    So long, and thanks for all the fish.

  3. #3

    Default

    If time is entirely of no importance you can also consider storing dates as integers in your DWH. There are a couple of formats that you may consider depending on the RDBMS. The drawback is , that you need to have a frontend action to transfer the integer to a real date or a lookup table that stores this information

  4. #4

    Default

    I'm seeing it in the Table Input step. My DB colum is DATE, yet Show Output Fields shows it as a String type in the stream.

  5. #5

    Default

    Here is a solution suggested by my DBA: Edit the SQL SELECT in Table Input to cast the column from DATE to DATETIME:

    SELECT
    Portfolio_Account_Number
    , Portfolio_Description
    , Security_Description
    , Security_Type
    , Symbol
    , Cost_Basis
    , Original_Trade_Date=CONVERT(DATETIME, Original_Trade_Date)
    , Quantity
    , Unrealized_Gain
    , Current_Price
    , Price_Date=CONVERT(DATETIME, Price_Date)
    , Non_Managed_Asset
    FROM lot_pc_raw


    2nd suggestion was to change the driver and or Connection. I'm using Microsoft JDBC driver, and Connection type of "MS SQL Serve"r. Others at my company are using the Connection Type of "Generic database".

  6. #6
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Quote Originally Posted by Bob McC View Post
    I'm seeing it in the Table Input step. My DB colum is DATE, yet Show Output Fields shows it as a String type in the stream.
    PDI 4.3.0 (current stable) has no "Show Output Fields" action in "Table Input" step...

    You are using SQL Server 2008 R2?
    You are using the current JDBC driver 4.0?
    What's your PDI version?

    Since the driver documentation and your observation don't match I'd hate to know a bug going unaccounted for: http://msdn.microsoft.com/en-us/library/ms378878.aspx
    So long, and thanks for all the fish.

  7. #7

    Default

    I thought every field has "Show input fields" and "Show output fields" on the pop-up menu.

    SQL Server 2008 R2 (SQL Server 10.50.4000)
    Not sure how to find JDBC version.
    PDI 4.1.0

  8. #8
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    You are right, I forgot the popup menu.

    You can learn the JDBC driver version with the console command "java -cp path-to-jar driver-classname".

    If you installed the most current driver you should find sqljdbc.jar or sqljdbc4.jar in the JDBC folder, else you are using JTDS.
    So long, and thanks for all the fish.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.