Hitachi Vantara Pentaho Community Forums

Thread: jira solution - big jira.xml

  1. #1
    Join Date
    May 2007
    Posts
    8

    Default jira solution - big jira.xml

    Hi, is anybody using the JIRA solution? I wanted to try it out, but Kettle appears to have problems importing big XML files. Mine is 221 MB, and Java dies with an OutOfMemoryError even with 1.5 GB of dedicated heap.

    Any workaround? I'm not going to dedicate a machine with 6 GB or more of RAM just to loading the JIRA XML...
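
    For reference, this is roughly how I'm giving the JVM its heap before calling Kitchen. A sketch only: whether kitchen.sh honours a JAVAMAXMEM-style environment override depends on the Kettle version, so you may have to edit the hard-coded -Xmx value inside the script instead.

    # Sketch: raise the Kitchen JVM heap. JAVAMAXMEM (in MB) is read by some
    # kitchen.sh versions; if yours ignores it, change -Xmx in the script itself.
    export JAVAMAXMEM=1536
    ./kitchen.sh -file=load_jira_staging.kjb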

  2. #2
    Join Date
    May 2007
    Posts
    8

    Cool

    Cool. I've now tried with a 20 GB heap, plus 12 GB of spare RAM for the OS and the rest of the JVM... I got an OOM at a later stage of the data-loading process.

    2007/05/21 09:57:24 - jira_int_status - Initialising 4 steps...
    2007/05/21 09:57:24 - xml - component.0 - Starting to run...
    2007/05/21 09:57:24 - xml - component.0 - Opening file: /usr/local/pentahoOBS/pentaho-server/pentaho-solutions/software-quality/data/etl/../jira.xml
    2007/05/21 09:57:24 - Stream lookup.0 - Starting to run...
    2007/05/21 09:57:24 - Stream lookup.0 - Reading lookup values from step [jira_status.xls]
    2007/05/21 09:57:24 - jira_status.xls.0 - Starting to run...
    2007/05/21 09:57:24 - INT_STATUS.0 - Starting to run...
    2007/05/21 09:59:56 - jira_status.xls.0 - Finished processing (I=0, O=0, R=0, W=7, U=0, E=0
    2007/05/21 09:59:56 - Stream lookup.0 - Read 7 values in memory for lookup!
    2007/05/21 13:57:40 - xml - component.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=0
    Exception in thread "xml - component.0 (Thread-42)" java.lang.OutOfMemoryError: Java heap space

    JIRA solution developers, any idea how to fix this, or at least whether it's possible and/or feasible?

  3. #3
    Join Date
    Jun 2005
    Posts
    144

    Default Needs to be updated

    Quote Originally Posted by akostadinov
    Cool. I've now tried with a 20 GB heap, plus 12 GB of spare RAM for the OS and the rest of the JVM... I got an OOM at a later stage of the data-loading process.

    JIRA solution developers, any idea how to fix this, or at least whether it's possible and/or feasible?
    Thanks for posting. I haven't updated the JIRA/Bugzilla solution for Kettle 2.5.0 and the GA release of the Pentaho platform. I was on sabbatical for a while, but I'm back and need to have a look at updating it soon.

    I submitted a Kettle bug report on the memory issues you appear to be having:
    http://www.javaforge.com/proj/tracke...e&task_id=4143

    The response has come back that it should be resolved in newer versions of Kettle. Assuming you don't want to wait for me to update the solution (and do another release), you are welcome to try the solution on Kettle 2.5.x.

    Also, consider running the subjobs separately. For instance, instead of running ./kitchen.sh -file=bugz_do_everything.kjb, run the jobs individually:

    ./kitchen.sh -file=load_bugz_int_stage1.kjb
    ./kitchen.sh -file=load_bugz_int_stage2.kjb
    ./kitchen.sh -file=load_dimensions.kjb
    ./kitchen.sh -file=load_facts.kjb
    ./kitchen.sh -file=load_summaries.kjb
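
    If you want to see which step actually exhausts the heap while a job runs, the JDK's own tools can help. A sketch, assuming a Sun/HotSpot JDK (jps and jstat ship with it); JRockit has its own equivalent tooling:

    # Sketch: from a second terminal, sample heap and GC utilisation of the
    # running Kitchen JVM every 5 seconds. "Kitchen" is the main class name
    # that jps reports for a kitchen.sh run.
    PID=$(jps | awk '/Kitchen/ {print $1}')
    jstat -gcutil "$PID" 5000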

    Ultimately I'd like to build another release for everyone involved. Thanks for your interest, and I'll definitely post something when I do a new build.

    Let me know if any of the above helps out.

    Nick

  4. #4
    Join Date
    May 2007
    Posts
    8

    Default

    Thank you for the suggestion. I'll try it and let you know what happens.

  5. #5
    Join Date
    May 2007
    Posts
    8

    Default

    Tried running the tasks separately:

    ./kitchen.sh -file=/usr/local/pentahoOBS/pentaho-server/pentaho-solutions/software-quality/data/etl/load_jira_staging.kjb && echo 1 && \
    ./kitchen.sh -file=/usr/local/pentahoOBS/pentaho-server/pentaho-solutions/software-quality/data/etl/load_jira_int_stage1.kjb && echo 2 && \
    ./kitchen.sh -file=/usr/local/pentahoOBS/pentaho-server/pentaho-solutions/software-quality/data/etl/load_jira_int_stage2.kjb && echo 3 && \
    ./kitchen.sh -file=/usr/local/pentahoOBS/pentaho-server/pentaho-solutions/software-quality/data/etl/load_dimensions.kjb && echo 4 && \
    ./kitchen.sh -file=/usr/local/pentahoOBS/pentaho-server/pentaho-solutions/software-quality/data/etl/load_facts.kjb && echo 5 && \
    ./kitchen.sh -file=/usr/local/pentahoOBS/pentaho-server/pentaho-solutions/software-quality/data/etl/load_summaries.kjb
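
    The same chain is easier to maintain as a loop; a sketch using the same job files, stopping at the first failure:

    # Sketch: run the jobs in order; abort the sequence if one fails.
    ETL=/usr/local/pentahoOBS/pentaho-server/pentaho-solutions/software-quality/data/etl
    for job in load_jira_staging load_jira_int_stage1 load_jira_int_stage2 \
               load_dimensions load_facts load_summaries; do
        ./kitchen.sh -file="$ETL/$job.kjb" || { echo "$job failed" >&2; exit 1; }
    done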

    Used JRockit, and it completed much faster as well, using a maximum of 14 GB of RAM. Will it be any better with a recent Kettle?

  6. #6
    Join Date
    Jun 2005
    Posts
    144

    Default Updating for Kettle 2.5

    Quote Originally Posted by akostadinov
    Used JRockit, and it completed much faster as well, using a maximum of 14 GB of RAM. Will it be any better with a recent Kettle?
    I certainly hope so. I've tested the Bugzilla side of the solution and it's running quicker. I'll update the JIRA side as soon as I can and post an update.

    What do you think? Will you find the reports helpful?

  7. #7
    Join Date
    May 2007
    Posts
    8

    Default

    I've had some data-inconsistency issues, so I didn't actually get any useful reports. I hope to have some more time for it this week. I'll update you as soon as I get it working (or not).

  8. #8
    Join Date
    May 2007
    Posts
    8

    Default

    Hello again. I haven't had time to look further into the issue until now. Do you have any idea whether the error below is due to a data inconsistency (in the JIRA data), or could it be caused by something else?

    Thanks much,
    Aleksandar

    2007/06/14 10:48:11 - Unique rows.0 - Starting to run...
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : java.sql.SQLException: Duplicate entry '12315028-2005-05-12 11:13:09' for key 1
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2975)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1600)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1125)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:677)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1357)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1274)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1259)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at be.ibridge.kettle.core.database.Database.insertRow(Database.java:1456)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at be.ibridge.kettle.trans.step.tableoutput.TableOutput.writeToTable(TableOutput.java:178)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at be.ibridge.kettle.trans.step.tableoutput.TableOutput.processRow(TableOutput.java:72)
    2007/06/14 10:48:11 - Bugs - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : at be.ibridge.kettle.trans.step.tableoutput.TableOutput.run(TableOutput.java:309)
    2007/06/14 10:48:11 - stg_jira_issue_history.0 - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : Because of an error, this step can't continue:
    2007/06/14 10:48:11 - stg_jira_issue_history.0 - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : Error inserting row into table [stg_jira_issue_history] with values: [ISSUE_NAT_ID= 012315028, ISSUE_ACTION_DATE=2005/05/12 11:13:09.000]
    2007/06/14 10:48:11 - stg_jira_issue_history.0 - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) :
    2007/06/14 10:48:11 - stg_jira_issue_history.0 - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : Error inserting row
    2007/06/14 10:48:11 - stg_jira_issue_history.0 - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : Duplicate entry '12315028-2005-05-12 11:13:09' for key 1
    2007/06/14 10:48:11 - stg_jira_issue_history.0 - Finished processing (I=0, O=489, R=490, W=0, U=0, E=1
    2007/06/14 10:48:11 - jira_int_issue_step1 - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : Errors detected!
    2007/06/14 10:48:11 - jira_int_issue_step1 - ERROR (version 2.3.1, build 63 from 2006/09/14 12:04:05 @ sam) : Errors detected!
    2007/06/14 10:48:11 - CREATE RECORDS.0 - Finished reading query, closing connection.
    2007/06/14 10:48:11 - UPDATED RECORDS.0 - Finished reading query, closing connection.
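
    One way to tell whether the export itself contains the duplicate (as opposed to the ETL producing it) is to search jira.xml for the offending key directly. A sketch; the exact element and attribute layout of a JIRA backup varies by version, so treat the patterns as assumptions:

    # Sketch: list every line mentioning the duplicated issue id, then narrow
    # to the exact timestamp from the error. More than one hit for the pair
    # suggests the duplicate already exists in the export.
    grep -n '12315028' jira.xml | grep '2005-05-12 11:13:09'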

  9. #9
    Join Date
    May 2007
    Posts
    8

    Default

    I get the same primary-key duplication issue with several JIRA backup XMLs, even ones taken right after a data-consistency verification (the verification is run from the button in the JIRA admin interface). In the few attempts I made to load the data into MySQL, I got the error at the same point, with the same data value.

    Is there any other way to verify JIRA data consistency? And any idea how I can work around or fix the problem?

    And just a suggestion for the documentation: to get the project IDs out of jira.xml, you can use the (admittedly ugly) command below:
    sed -n 's/.*<Project .*id="\([0-9]\+\)".*/\1/p' jira.xml | sort -n
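
    If xmllint (from libxml2) is available and new enough to support --xpath, a less fragile variant might look like the sketch below; note that xmllint parses the whole document into memory, so it shares the size problem this thread is about:

    # Sketch: print the id attribute of every Project element, keep the digits,
    # and sort numerically. Assumes the backup's Project elements carry an id
    # attribute, which may differ between JIRA versions.
    xmllint --xpath '//Project/@id' jira.xml | grep -o '[0-9]\+' | sort -n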
