Hitachi Vantara Pentaho Community Forums

Thread: Parse Apache Tomcat access log with PDI?

  1. #1

    Hi,

    Has anybody here tried to parse the Apache Tomcat access log with PDI? I am looking for a way to extract fields such as the IP address, the timestamp and the accessed resource, and to write them to a database.

    Here is an excerpt showing what the lines look like:

    Code:
    10.61.4.5 - - [12/Nov/2013:00:20:38 +0100] "GET /birt/run?__report=vailable.rptdesign " 200 4320
    10.61.4.4 - - [12/Nov/2013:00:20:42 +0100] "GET /birt/run?__report=vailable.rptdesign " 200 4320
    10.49.18.10 - - [12/Nov/2013:00:20:42 +0100] "POST /birt/run?__report=\dashboard\db\main.rptdesign&__overwrite=true&__sessionId=20131112_002002_872 HTTP/1.1" 200 492738
    What would be the best approach here? Using the RegEx step?
    Thanks for any hint!

    Bobse

  2. #2

    Or maybe even the Text file input step.
    Try a space as the delimiter and a double quote (") as the enclosure.
    You should have 7 fields.
    You can discard 2 of them (- and -) and use the remaining fields.
    -- Mick --
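
As a rough check of the delimiter/enclosure idea outside PDI, the sketch below splits one of the sample lines with Python's `csv` module, using a space as the delimiter and `"` as the quote character. This only approximates the Text file input step, and the helper name `split_fields` is made up for illustration. One thing it shows: the bracketed timestamp contains a space, so in this approximation the line yields eight pieces, with the timestamp in two halves that have to be rejoined.

```python
import csv

# First sample line from the original post.
SAMPLE = ('10.61.4.5 - - [12/Nov/2013:00:20:38 +0100] '
          '"GET /birt/run?__report=vailable.rptdesign " 200 4320')


def split_fields(line):
    """Split on single spaces, treating "..." as an enclosure (hypothetical helper)."""
    return next(csv.reader([line], delimiter=" ", quotechar='"'))


fields = split_fields(SAMPLE)

# The timestamp was cut at its internal space; glue it back and drop the brackets.
timestamp = (fields[3] + " " + fields[4]).strip("[]")

host, request, status, size = fields[0], fields[5], fields[6], fields[7]
print(len(fields), host, timestamp, request.strip(), status, size)
```

Whether PDI produces the same split depends on how the step is configured, so treat this only as a quick way to see where the field boundaries fall.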

  3. #3

    From a quick Internet search, I found this link: http://tomcat.apache.org/tomcat-5.5-doc/config/valve.html#Access_Log_Valve

    The shorthand pattern name common (which is also the default) corresponds to '%h %l %u %t "%r" %s %b'.

    With that pattern in mind, my preference would be to use the Regex Evaluation step configured with capture groups. You can get as fancy as you want with the regex, but something like this seems to do the trick:

    [Attached image: tomcat_log.jpg]
    Last edited by darrell.nelson; 11-14-2013 at 10:46 AM.
    pdi-ce-4.4.0-stable
    Java 1.7 (64 bit)
    MySQL 5.6 (64 bit)
    Windows 7 (64 bit)
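
The regex from the attached screenshot is not reproduced in the text, so the following is only a sketch of what a capture-group expression for the common pattern `%h %l %u %t "%r" %s %b` might look like. It is written in Python rather than as a PDI step configuration. The group names, the `parse_line` helper and the decision to treat a `-` byte count as 0 are all assumptions made for illustration, not the expression from the attachment.

```python
import re
from datetime import datetime

# Assumed pattern for '%h %l %u %t "%r" %s %b'; one named group per field.
COMMON_LOG = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)


def parse_line(line):
    """Return a dict of parsed fields, or None if the line does not match."""
    match = COMMON_LOG.match(line.strip())
    if match is None:
        return None
    rec = match.groupdict()
    # Month abbreviations (%b) are locale dependent; this assumes an English locale.
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    rec["bytes"] = 0 if rec["bytes"] == "-" else int(rec["bytes"])
    # The request is "METHOD resource [protocol]"; the protocol may be missing.
    parts = rec["request"].split()
    rec["method"] = parts[0] if parts else None
    rec["resource"] = parts[1] if len(parts) > 1 else None
    return rec


sample = ('10.49.18.10 - - [12/Nov/2013:00:20:42 +0100] '
          '"POST /birt/run?__report=\\dashboard\\db\\main.rptdesign'
          '&__overwrite=true&__sessionId=20131112_002002_872 HTTP/1.1" 200 492738')
record = parse_line(sample)
print(record["host"], record["time"].isoformat(), record["method"], record["status"])
```

In PDI the same expression would go into the Regex Evaluation step with one output field per capture group. Java's regex syntax writes named groups as `(?<name>...)`, and plain numbered groups work as well.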

  4. #4

    Isn't this what the Regex sample is doing?

    samples/transformations/Regex Eval - parse NCSA access log records.ktr

  5. #5

    Hi Darrell, great regexp.
    You have achieved the same results but it would have taken *me* 3 hours to write that regexp!!
    -- Mick --

  6. #6

    Quote, originally posted by Mick_data:
    Hi Darrell, great regexp.
    You have achieved the same results but it would have taken *me* 3 hours to write that regexp!!

    I use those capture groups fairly often, but I had to force myself to learn them. Even then I still have to go back to www.regular-expressions.info most of the time!

    Quote, originally posted by MattCasters:
    Isn't this what the Regex sample is doing?

    Dang. I never remember to check the samples folders...

  7. #7

    Hi Darrell, hi Matt,

    Thanks a lot for your replies! I guess I really have to look more into regex.

    Bobse
