Hitachi Vantara Pentaho Community Forums

Thread: Using Pentaho MapReduce to Parse Weblog Data - can't start mapreduce with no error

  1. #1
    Join Date
    Apr 2016
    Posts
    5

    Question Using Pentaho MapReduce to Parse Weblog Data - can't start mapreduce with no error



    I followed http://wiki.pentaho.com/display/BAD/...se+Weblog+Data to create a job, "weblog_parse_mr.kjb", and a transformation, "weblog_parse_mapper.ktr".


    I start the job with the command below, but the MapReduce job never starts. There is no error; it just keeps waiting.
    /mnt/kettle/data-integration/kitchen.sh -file=/home/hduser/kettle_jobs/ini_test_jobs/weblog_parse_mr_less.kjb -level=Debug


    The logs are as follows. (Some lines were in Chinese; I don't think they are important. And my English is not good, sorry.)
    --------------------------------------------------------------------------------------------------------------------
    [hduser@master data-integration]$ /mnt/kettle/data-integration/kitchen.sh -file=/home/hduser/kettle_jobs/ini_test_jobs/weblog_parse_mr_less.kjb -level=Debug
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
    16:25:47,012 INFO [KarafInstance]
    *******************************************************************************
    *** Karaf Instance Number: 1 at /mnt/kettle/data-integration/./system/karaf ***
    *** //data1 ***
    *** Karaf Port:8801 ***
    *** OSGI Service Port:9050 ***
    *******************************************************************************
    Apr 01, 2016 4:25:48 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
    INFO: Lock acquired. Setting startlevel to 100
    2016/04/01 16:25:48 - Kitchen - Logging is at level : Debug
    2016/04/01 16:25:48 - Kitchen - Start of run.
    2016/04/01 16:25:48 - Kitchen - Allocate new job.
    2016/04/01 16:25:48 - Kitchen - Parsing command line options.
    2016/04/01 16:25:49 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
    2016-04-01 16:25:52.339:INFO:oejs.Server:jetty-8.1.15.v20140411
    2016-04-01 16:25:52.390:INFO:oejs.AbstractConnector:Started NIOSocketConnectorWrapper@0.0.0.0:9050
    log4j:ERROR Could not parse url [file:/mnt/kettle/data-integration/./system/osgi/log4j.xml].
    java.io.FileNotFoundException: /mnt/kettle/data-integration/./system/osgi/log4j.xml (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at sun.net.www.protocol.file.FileURLConn...ction.java:90)
    at sun.net.www.protocol.file.FileURLConn...tion.java:188)
    at org.apache.log4j.xml.DOMConfigurator$2.parse(DOMConfigurator.java:765)
    at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:871)
    at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:778)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.log4j.Logger.getLogger(Logger.java:104)
    at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:262)
    at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:108)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1025)
    at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:844)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:541)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:292)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:269)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657)
    at org.springframework.osgi.extender.internal.activator.ContextLoaderListener.<clinit>(ContextLoaderListener.java:253)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at java.lang.Class.newInstance(Class.java:442)
    at org.apache.felix.framework.Felix.createBundleActivator(Felix.java:4362)
    at org.apache.felix.framework.Felix.activateBundle(Felix.java:2149)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:2072)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1299)
    at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
    at java.lang.Thread.run(Thread.java:745)

    ........ (the text I entered was too long, so I removed this part of the logs)

    2016/04/01 16:26:02 - weblog_parse_mr_less - Starting job execution
    2016/04/01 16:26:02 - weblog_parse_mr_less - exec(0, 0, START.0)
    2016/04/01 16:26:02 - START - Starting job entry
    2016/04/01 16:26:02 - weblog_parse_mr_less - Starting entry [Pentaho MapReduce - mr]
    2016/04/01 16:26:02 - weblog_parse_mr_less - exec(1, 0, Pentaho MapReduce - mr.0)
    2016/04/01 16:26:02 - Pentaho MapReduce - mr - Starting job entry
    2016/04/01 16:26:02 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:///mnt/kettle/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp22/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:///mnt/kettle/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp22/lib/client/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:///mnt/kettle/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp22/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/mnt/kettle/data-integration/launcher/../lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/mnt/kettle/data-integration/plugins/pentaho-big-data-plugin/lib/slf4j-log4j12-1.7.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    2016/04/01 16:26:03 - weblog_parse_mapper - Dispatching started for transformation [weblog_parse_mapper]
    Attempting to load ESAPI.properties via file I/O.
    Attempting to load ESAPI.properties as resource file via file I/O.
    Not found in 'org.owasp.esapi.resources' directory or file not readable: /mnt/kettle/data-integration/ESAPI.properties
    Not found in SystemResource Directory/resourceDirectory: .esapi/ESAPI.properties
    Not found in 'user.home' (/home/hduser) directory: /home/hduser/esapi/ESAPI.properties
    Loading ESAPI.properties via file I/O failed. Exception was: java.io.FileNotFoundException
    Attempting to load ESAPI.properties via the classpath.
    SUCCESSFULLY LOADED ESAPI.properties via the CLASSPATH from '/ (root)' using current thread context class loader!
    SecurityConfiguration for Validator.ConfigurationFile not found in ESAPI.properties. Using default: validation.properties
    Attempting to load validation.properties via file I/O.
    Attempting to load validation.properties as resource file via file I/O.
    Not found in 'org.owasp.esapi.resources' directory or file not readable: /mnt/kettle/data-integration/validation.properties
    Not found in SystemResource Directory/resourceDirectory: .esapi/validation.properties
    Not found in 'user.home' (/home/hduser) directory: /home/hduser/esapi/validation.properties
    Loading validation.properties via file I/O failed.
    Attempting to load validation.properties via the classpath.
    validation.properties could not be loaded by any means. fail. Exception was: java.lang.IllegalArgumentException: Failed to load ESAPI.properties as a classloader resource.
    SecurityConfiguration for Logger.LogServerIP not either "true" or "false" in ESAPI.properties. Using default: true
    2016/04/01 16:26:03 - Pentaho MapReduce - mr - Using org.apache.hadoop.io.Text for the map output value
    2016/04/01 16:26:05 - Pentaho MapReduce - mr - Cleaning output path: hdfs://172.16.189.123:9000/user/pdi/weblogs/parse_less
    2016/04/01 16:26:05 - Pentaho MapReduce - mr - Using Kettle installation from /opt/pentaho/mapreduce/6.0.1.0-386-6.0.1.0-386-hdp22
    2016/04/01 16:26:05 - Pentaho MapReduce - mr - Configuring Pentaho MapReduce job to use Kettle installation from /opt/pentaho/mapreduce/6.0.1.0-386-6.0.1.0-386-hdp22
    2016/04/01 16:26:05 - Pentaho MapReduce - mr - mapreduce.application.classpath: classes/,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
    2016/04/01 16:26:07 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
    2016/04/01 16:26:17 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
    2016/04/01 16:26:27 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
    2016/04/01 16:26:37 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
    2016/04/01 16:26:47 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
    2016/04/01 16:26:57 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
    ......
    2016/04/01 16:39:27 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
    2016/04/01 16:39:37 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
    --------------------------------------------------------------------------------------------------------------------
    As shown above, there is no error, but the MapReduce job never starts. I don't know what to do.
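A first check worth trying (my assumption: the hang happens because the job is never submitted, e.g. the ResourceManager address in the shim's yarn-site.xml is wrong or the port is unreachable from the PDI machine) is to probe the cluster ports PDI needs. The port numbers below are guesses: 9000 is the NameNode RPC port taken from the `hdfs://` URL in the log; 8050 is the HDP default ResourceManager port (8032 upstream) and must match your yarn-site.xml.

```shell
# Probe the ports PDI must reach; ports are assumptions (see note above).
probe() {
  host=$1; port=$2
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "OK   ${host}:${port}"
  else
    echo "FAIL ${host}:${port}"
  fi
}

probe 172.16.189.123 9000   # NameNode (from the job log's hdfs:// URL)
probe 172.16.189.123 8050   # ResourceManager (HDP default; check yarn-site.xml)
```

If the ResourceManager probe fails while the NameNode probe succeeds, that would explain the symptom exactly: PDI can clean the output path over HDFS but then blocks forever trying to submit the job to YARN.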


    PS:
    I have two Hadoop development cluster environments.


    One of the environments runs this job successfully.
    Part of its logs follows:
    ------------------------------------------------
    ......
    SecurityConfiguration for Logger.LogServerIP not either "true" or "false" in ESAPI.properties. Using default: true
    2016/04/01 16:02:08 - Pentaho MapReduce - mr - Using org.apache.hadoop.io.Text for the map output value
    2016/04/01 16:02:09 - Pentaho MapReduce - mr - Cleaning output path: hdfs://192.168.124.129:9000/user/pdi/weblogs/parse_less
    2016/04/01 16:02:10 - Pentaho MapReduce - mr - Using Kettle installation from /opt/pentaho/mapreduce/6.0.1.0-386-6.0.1.0-386-hdp22
    2016/04/01 16:02:10 - Pentaho MapReduce - mr - Configuring Pentaho MapReduce job to use Kettle installation from /opt/pentaho/mapreduce/6.0.1.0-386-6.0.1.0-386-hdp22
    2016/04/01 16:02:10 - Pentaho MapReduce - mr - mapreduce.application.classpath: classes/,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
    2016/04/01 16:02:14 - Pentaho MapReduce - mr - Setup Complete: 0.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2016/04/01 16:02:24 - Pentaho MapReduce - mr - Setup Complete: 0.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2016/04/01 16:02:34 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2016/04/01 16:02:44 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2016/04/01 16:02:54 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2016/04/01 16:03:55 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2016/04/01 16:04:05 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
    2016/04/01 16:04:15 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 44.475254 Reducer Completion: 0.0
    2016/04/01 16:04:25 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 61.912037 Reducer Completion: 0.0
    2016/04/01 16:04:35 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 66.66667 Reducer Completion: 0.0
    2016/04/01 16:04:35 - Pentaho MapReduce - mr - [SUCCEEDED] -- Task: attempt_1459477876939_0003_m_000001_0 Attempt: attempt_1459477876939_0003_m_000001_0 Event: 0
    2016/04/01 16:04:35 - Pentaho MapReduce - mr - Container killed by the ApplicationMaster.
    2016/04/01 16:04:35 - Pentaho MapReduce - mr - Container killed on request. Exit code is 143
    2016/04/01 16:04:35 - Pentaho MapReduce - mr - Container exited with a non-zero exit code 143
    2016/04/01 16:04:35 - Pentaho MapReduce - mr - [SUCCEEDED] -- Task: attempt_1459477876939_0003_m_000000_0 Attempt: attempt_1459477876939_0003_m_000000_0 Event: 1
    ......
    ------------------------------------------------


    The software versions I used are as follows:


    pdi-ce-6.0.1.0-386
    hadoop-2.7.1
    CentOS 6.4


    Thanks for reading about my problem.

  2. #2
    Join Date
    Apr 2016
    Posts
    5

    Default

    I modified the transformation "weblog_parse_mapper.ktr" to remove the other steps, leaving only "MapReduce Input" and "MapReduce Output".
    It still doesn't work.
    It seems that "weblog_parse_mr.kjb" can't call "weblog_parse_mapper.ktr",
    and I have already run "chmod 777" on all the jobs and transformations.


    Can anyone help me? Or should I provide some other log info?
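One telltale in the two log excerpts above: the working cluster prints "Setup Complete: ..." progress lines, while the hung run only repeats heartbeat lines. A hypothetical helper (the function name and messages are mine) that classifies a saved kitchen.sh log this way:

```shell
# Hypothetical helper: decide from a kitchen.sh log whether the MapReduce
# job was ever submitted. A successful run prints "Setup Complete:" progress
# lines; a hung run only repeats heartbeat lines.
check_submitted() {
  logfile=$1
  if grep -q "Setup Complete" "$logfile"; then
    echo "submitted: job reached the cluster"
  elif grep -q "Triggering heartbeat signal" "$logfile"; then
    echo "stuck before submission: check the shim's yarn-site.xml"
  else
    echo "no MapReduce activity found in log"
  fi
}
```

If it reports "stuck before submission", then `yarn application -list` on the cluster should show no new application, which points at the client-side shim configuration (hadoop-configurations/hdp22) rather than at the job or the transformation itself.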
