Hitachi Vantara Pentaho Community Forums

Type: Posts; User: dhartford; Keyword(s):


  1. PDI log4j.xml for debugging custom steps plugins

    There have been several references over the years of 'how do I enable the log4j logging?'. I just had to recently go through this for PDI 5.4, so wanted to share (and make it a new search result the...
  2. Replies: 2, Views: 1,019

    So, although the kettle.properties did not work...

    So, although the kettle.properties did not work (even after restart), when I go into the spoon.bat file and manually modify "-DKETTLE_LOG_SIZE_LIMIT=50000", this *did* work, confusing....
  3. Replies: 2, Views: 1,019

    Spoon Logging Tab - more lines/ log data?

    Hi all,
    I've tried my best to research several resources, but I'm at a loss --

    1) how can you increase the number of lines to show in the logging tab console when using the Spoon interface (or a...
  4. Replies: 1, Views: 846

    this morning, 9/24/2014, things look good now!

    this morning, 9/24/2014, things look good now!
  5. Replies: 1, Views: 846

    repo.pentaho.org issues 9/23/2014?

    Hi All,
    I'm trying to set up a Maven project for a Pentaho PDI plugin (Kettle plugin), but it appears the repository keeps getting proxy errors.

    <repositories>
    <repository>...
  6. Kudos - DataCleaner and PDI-CMIS-PLUGIN (alfresco migration)

    Converting a legacy MS Access storage with file system locators to PDFs and TIFFs to Alfresco (not exactly traditional ETL...).

    Fabulous success using the DataCleaner plugin to review outliers of...
  7. Replies: 4, Views: 930

    confirmed, I feel silly now :-) thanks, I'll...

    confirmed, I feel silly now :-)

    thanks, I'll play with WEKA some and see what it can do.
  8. Analysis services for IT (network diagrams, etc)?

    Hi all,
    Kind of a higher level topic - if one has a list of IT assets either in Excel, in a database or from a CMDB, are there tools/product suite/services that Pentaho offers or could offer around...
  9. Replies: 4, Views: 930

    I'm probably not using WEKA for its original...

    I'm probably not using WEKA for its original intent, but I have a list of servers that relate to each other, and trying to create a tree diagram of the relations (the actual list is rather large, and...
  10. Replies: 4, Views: 930

    First experience - CSV file loading

    Hi all,
    New to WEKA, but quite familiar with Kettle/PDI.

    I really appreciate the small stand-alone WEKA download, so I wanted to say that off the bat. Reading a little bit about the new...
  11. Reporting PDF accessibility (and HTML for the future)

    Hi all,
    I'm curious if Pentaho's reporting engine has addressed accessibility, particularly as it relates to PDF. For example, go to view->accessibility->quick check with any given PDF.

    There...
  12. Replies: 18, Views: 2,167

    Glad to hear the problem has been clearly...

    Glad to hear the problem has been clearly identified, a big step forward!

    If you want, try increasing the number of steps for your output at startup (again, making sure to use pooled connections). ...
  13. Replies: 18, Views: 2,167

    I have SSIS/sproc experience similar to this...

    I have SSIS/sproc experience similar to this (and, obviously, Kettle experience), and have similar challenges to these kinds of problems (outside of the fixed-width/delimitering *rows* based on...
  14. Thanks, I read through this reference, and it...

    Thanks,
    I read through this reference, and it looks close to what I want. I'll just have to test if it can stream (the io buffer control?) large zip files:
    ...
  15. gzip streaming task awesome, is there a regular zip version?

    Hey all,
    I'm quite impressed with the gzip streaming task (direct to transformation, no pre-uncompression). With large files (5-10G+), this makes a huge difference from the old job-level approach...
  16. Replies: 3, Views: 905

    JDBC Driver 'opt-in' downloader

    Hey all,
    Saw the recent note on the developer list about the JDBC challenges.

    One of the approaches a couple of projects I've used/worked on in the past is a 'default download' setup to 'opt-in'...
  17. Passing a org.dom4j.Document to Kettle/Transformation

    Hey all,
    I have a transformation whose first step is the 'Get XML Data' step. Currently everything is set up to pull from files and this is running fine (tested on PDI 4.0).

    Now I'm trying to...
  18. Replies: 5, Views: 2,923

    opened ticket:...

    opened ticket: http://jira.pentaho.com/browse/PDI-6455
  19. Replies: 5, Views: 2,923

    is there a way to use xpath functions within the...

    is there a way to use xpath functions within the 'Get XML Data' step? I haven't been successful and my ignorance is likely in the way as I was trying to do this:

    /rows/row/position()

    or...
  20. Replies: 5, Views: 2,923

    The approach I'm doing now is creating a Row...

    The approach I'm doing now is creating a Row Flattener for each /column/ node (i.e. a 'Name Row Flattener', a 'Value Row Flattener').

    Unfortunately, when I go back and 'Merge Join', I'm still...
  21. Replies: 5, Views: 2,923

    that is correct to get the data, but how do I...

    that is correct to get the data, but how do I group by for the row denormalizer if I didn't have attribute values?

    EDIT: Sorry, I didn't re-read it correctly -- the specific naming could work, but...
  22. Replies: 5, Views: 2,923

    XML file and denormalize

    Hey all,
    I have an XML file that is similar to a table layout that would be good for the row denormaliser. However, if I understand the denormalize step I do not necessarily have a 'field' to group...
  23. Replies: 5, Views: 4,797

    bumping up this thread as I ran into another...

    bumping up this thread as I ran into another scenario where the need to decrypt the file (as a whole, not individual data elements) has come up a couple more times.

    Reviewing JIRA, Tarallo already...
  24. Replies: 3, Views: 1,792

    You can use Kettle/PDI library and Spoon from the...

    You can use Kettle/PDI library and Spoon from the open source version for free (excluding certain embedding scenarios I would imagine).

    Otherwise if you want to use the PDI-EE server/BI Platform...
  25. Replies: 1, Views: 837

    OT or not: 'DIFF' file ability

    Hey all,
    Looking at some of the processes/steps people in the ETL world do on a day-to-day basis, one of the things I see often is different file viewers as well as 'DIFF' tools between two files to...
  26. Replies: 3, Views: 3,467

    I've written a proof of concept for X12-style EDI...

    I've written a proof of concept for X12-style EDI file *parsing*, in such a way as each segment becomes an independent datastream (or whatever the proper term is :-) including in-file...
  27. Java itself isn't that slow, we are all here...

    Java itself isn't that slow, we are all here using java :-)

    Image processing in java has traditionally been problematic. However...breaking up the work into parallel (i.e. small, focused,...
  28. I have done image manipulation in the past for...

    I have done image manipulation in the past for both fax image conversion, image splitting/combining, and basically anything related to Document/data capture or Forms Processing.

    As long as you...
  29. If you are testing the etl process, should be...

    If you are testing the etl process, should be able to see in the transformation the individual steps performance (records/sec) to help target the slower ones --> remember to go from the...
  30. Replies: 6, Views: 1,464

    Ahh, ok - I missed the part about slave servers. ...

    Ahh, ok - I missed the part about slave servers. Are you using the DB repository or the file-based approach? In either scenario, I think you may need to copy (I assume) the kettle.properties or...
  31. Poll: This is a huge +1 on the idea. One of the...

    This is a huge +1 on the idea. One of the challenges for quick/adhoc reporting of data with the Pentaho stack was the need to tie Kettle and Jfreereport (doh, showing my age, PDI and PRD). This...
  32. If you can change the title of your post, should...

    If you can change the title of your post, should mention hadoop/hive :-)
  33. Replies: 6, Views: 1,464

    It looks like the property is not getting...

    It looks like the property is not getting replaced. It's been a while since I used 3.2, but I thought the error messages provided the replaced property value:

    Caused by: java.net.UnknownHostException:...
  34. Replies: 4, Views: 1,045

    Unless the intent is that the blob data...

    Unless the intent is that the blob data represents an encrypted payload value item for use in XML as a WebService/REST interface, likely the requirement or expectation of usage is poorly understood....
  35. stock Software Quality reports for JIRA?

    Hey all,
    I came across an old SF project that has, what looks like, the full BI server stack of kettle and jfreereport files to make (I assume) nice reports out of JIRA.
    ...
  36. Poll: I'm averaging around 2500-5000 r/s with a...

    I'm averaging around 2500-5000 r/s with a networked mysql 4.1 instance using a mix of mysql-connector-5* drivers.

    *10 columns or greater, usually 3 varchar(50), 3 datetime, 4 numeric (usually...
  37. Poll: Table Input Average Input Speeds you see (mysql)

    Although I specified mysql for this poll, the intent is more about what users are experiencing on average. Please post any specifics in this thread such as:

    * The average column sizes and...
  38. Replies: 5, Views: 1,908

    I'm averaging around 2500-5000 r/s from a...

    I'm averaging around 2500-5000 r/s from a networked mysql 4 database instance, when I say on average, across multiple database schema/datatypes (usually at least 10 columns and mix of numeric,...
  39. Replies: 4, Views: 2,176

    For Data Quality/Profiling, with the description...

    For Data Quality/Profiling, with the description of reviewing the current data for patterns, occurrences, duplicates, high-low, etc of the data itself, I use the following:
    ...
  40. As Matt pointed out, when you are dealing with...

    As Matt pointed out, when you are dealing with data transformation with external datasources, it's more the blackbox/test file and expected result file approach.

    The only thing you could do...
  41. Replies: 13, Views: 2,400

    I'm of the same mindset, but this may be related...

    I'm of the same mindset, but this may be related to the expectation that this community-forum specifically is more direct user/developer focused who, again based on preference, generally prefer...
  42. Replies: 6, Views: 7,703

    The idea of packaging together certain...

    The idea of packaging together certain jobs/transformations together is kind of interesting...especially when you take into consideration some Web Content Management (WCM) solutions use this approach...
  43. For duplicates, you can review using the data...

    For duplicates, you can review using the data warehouse step for junk/slow-moving dimension updates with caching enabled.

    Otherwise, although the title is deceiving (it was a hijacked thread),...
  44. Replies: 2, Views: 1,387

    Rather than focusing on Kettle first, make sure...

    Rather than focusing on Kettle first, make sure you got your expectations right.

    Use your database tool of choice (mysql, ms enterprise manager, workbench/j, whatever), and connect remotely to the...
  45. Replies: 3, Views: 1,185

    I don't use the DI/EE version, but with Tomcat...

    I don't use the DI/EE version, but with Tomcat there is a server.xml (non-rpm install version assumed) that has the port values for everything used by tomcat (i.e. search for '8080' and you should...
  46. Replies: 3, Views: 1,185

    When you say "DI Server", do you mean the...

    When you say "DI Server", do you mean the Enterprise version?
  47. Thread: EDI Support

    by dhartford
    Replies: 3, Views: 1,302

    Not that I'm aware of. I wrote a prototype...

    Not that I'm aware of.

    I wrote a prototype (non open source unfortunately) one back for Kettle...2.5 I think, but that plugin is quite aged now and never got polish time.
  48. http://forums.pentaho.org/showthread.php?t=72834

    http://forums.pentaho.org/showthread.php?t=72834
  49. Probably one of the answers you will get is that...

    Probably one of the answers you will get is that the Pentaho BI Server/Enterprise BI Server has scheduling w/ a user interface that can be used to schedule PDI jobs.

    I've been playing with the...
  50. Replies: 10, Views: 4,365

    Depending on which version of Kettle/PDI you are...

    Depending on which version of Kettle/PDI you are using, there is a screen called 'keys' in the middle that has the fields used to look up the row in the dimension. These fields are the ones that need indexes....
Results 1 to 50 of 451

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.