Hitachi Vantara Pentaho Community Forums
Thread: Overall BI/DI solution questions/review

  1. #1
    Join Date
    Dec 2015
    Posts
    2

    Question Overall BI/DI solution questions/review

    Hello,

    we are evaluating the Pentaho application suite as a reporting and analytics solution for one of our customers. After reading a large number of official and community documents, we have some understanding of the solution we want to build. We'd like feedback on our approach and answers to the questions below.

    Our high-level vision of the solution is as follows:

    - use BI server CE in the development environment
    - use the BA enterprise version in production
    - don't use the DI components; instead, develop a custom service application with the following functionality:
    * run as a silent background process on one or more nodes
    * receive raw business data from the customer's business applications and transfer it into cubes for further analysis
    * provide automatic data schema evolution: control and update the physical DB schema and generate the appropriate OLAP schema for Mondrian
    * automatically update the OLAP schema on the BI server when it changes
    * later on, possibly also generate/update metadata (XMI files) for relational data analysis
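
To make the "generate appropriate OLAP schema" item concrete, here is a minimal sketch of producing a Mondrian-style schema file from a plain data structure. It does not use any Mondrian API; it only writes XML with Python's standard library. The element and attribute names (Schema, Cube, Table, Dimension, Hierarchy, Level, Measure) follow the Mondrian 3.x schema format as commonly documented, and should be checked against the Mondrian version actually deployed. The function name, the dictionary layout and all table and column names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical cube description; every table and column name is made up.
SALES_CUBE = {
    "name": "Sales",
    "fact_table": "fact_sales",
    "dimensions": [
        {
            "name": "Product",
            "foreign_key": "product_id",
            "table": "dim_product",
            "primary_key": "id",
            "levels": [
                {"name": "Category", "column": "category"},
                {"name": "Product", "column": "product_name"},
            ],
        }
    ],
    "measures": [
        {"name": "Amount", "column": "amount", "aggregator": "sum"},
    ],
}


def build_mondrian_schema(schema_name, cube):
    """Return a Mondrian 3.x-style schema as an XML string (one cube only)."""
    schema = ET.Element("Schema", name=schema_name)
    cube_el = ET.SubElement(schema, "Cube", name=cube["name"])
    # The fact table comes first, then dimensions, then measures.
    ET.SubElement(cube_el, "Table", name=cube["fact_table"])
    for dim in cube["dimensions"]:
        dim_el = ET.SubElement(
            cube_el, "Dimension", name=dim["name"], foreignKey=dim["foreign_key"]
        )
        hier = ET.SubElement(
            dim_el, "Hierarchy", hasAll="true", primaryKey=dim["primary_key"]
        )
        ET.SubElement(hier, "Table", name=dim["table"])
        for level in dim["levels"]:
            ET.SubElement(hier, "Level", name=level["name"], column=level["column"])
    for measure in cube["measures"]:
        ET.SubElement(
            cube_el,
            "Measure",
            name=measure["name"],
            column=measure["column"],
            aggregator=measure["aggregator"],
        )
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    print(build_mondrian_schema("Demo", SALES_CUBE))
```

Keeping the cube description as data means a schema change in the business application becomes a change to the dictionary, and the XML is regenerated from it. How the resulting file is delivered to the BI server is a separate problem that this sketch does not address.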

    The customer's business applications are under continuous development and evolve intensively. The data structures for analysis will therefore also change frequently, so this will be a continuous development process too. We expect our DI solution to grow and evolve over the next few years (mainly by adding cubes and dimensions). The customer wants to start using analysis immediately. That's why we decided to develop a custom data integration solution.

    Here are some questions:
    1. What do you think of this approach to data integration? We think it is "developer friendly" and well suited to an agile development process.
    2. Is it possible to programmatically add/update a Mondrian schema in the BI server? Through some remote service? Through direct file access? We didn't find any information about that.
    3. Is there an API in Mondrian for programmatically constructing a schema and exporting it to XML?
    4. Is it OK to use the CE server in development and the EE version in production for the same OLAP schema and physical data structures?
    5. Are any performance metrics/comparisons available for data analysis? How well does Mondrian perform under high load? Is it OK if the cube's data is updated frequently?

    Thank you in advance.

    // Dmitry
    Last edited by dmitry_ol; 12-04-2015 at 06:35 AM.

  3. #3
    Join Date
    Aug 2011
    Posts
    360

    Default

    Hello,

    Based on my experience (more with PDI than with the BA server), here's my point of view:
    - Do not use the CE version on dev and EE on production: if you can afford the EE licences, I bet you want
    to use particular EE features or plugins, so you'll have to develop with these plugins too.
    If you don't want to use EE features at all, go CE on production too. That would let you build everything
    from source if you want. If this is a question of having support from Pentaho... I bet that if you can code
    a full custom data integration system, you can resolve Pentaho bugs yourself (with the help of the community).

    Second point:
    Don't push raw data from operational systems directly into your cubes.
    You should have a staging area in between, so that your system continuously pushes data from the operational systems to the staging area.
    Then another system, which is designed and dedicated to that task, gets data from the staging area and pushes it to cubes, a data warehouse,
    a data vault or anything else you want.
    This separates concerns in your code, eases operation and future scaling, and lets you use the right technology
    for each job.
    The staging area can be anything: flat files, a database, a Hadoop cluster.
    Moreover, you can split your team and the application lifecycle between the two sides of the staging area.
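
The two-sided design described above can be sketched roughly as follows, using an in-memory SQLite database as a stand-in staging area. This is only an illustration under assumptions: the table names (staging_events, fact_sales), the record fields (product, amount) and the function names are all hypothetical, and a real staging area could just as well be flat files or a Hadoop cluster.

```python
import json
import sqlite3


def init_db(conn):
    """Create a hypothetical staging table and a hypothetical fact table."""
    conn.execute(
        "CREATE TABLE staging_events ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, "
        "loaded INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE fact_sales (product TEXT NOT NULL, amount REAL NOT NULL)"
    )


def push_to_staging(conn, records):
    """Producer side: store raw records untouched, as JSON text."""
    conn.executemany(
        "INSERT INTO staging_events (payload) VALUES (?)",
        [(json.dumps(record),) for record in records],
    )
    conn.commit()


def load_from_staging(conn):
    """Consumer side: move unloaded staging rows into the fact table.

    Returns the number of staging rows processed in this run.
    """
    rows = conn.execute(
        "SELECT id, payload FROM staging_events WHERE loaded = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        record = json.loads(payload)
        conn.execute(
            "INSERT INTO fact_sales (product, amount) VALUES (?, ?)",
            (record["product"], float(record["amount"])),
        )
        # Mark the row as loaded so a later run does not process it again.
        conn.execute("UPDATE staging_events SET loaded = 1 WHERE id = ?", (row_id,))
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    push_to_staging(conn, [{"product": "A", "amount": 10}, {"product": "B", "amount": 2.5}])
    print(load_from_staging(conn))
```

The producer only knows about the staging table and the loader only knows how to read from it, so each side can be developed, deployed and scaled on its own schedule.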

    You don't want to use the DI components... but why? If you make the effort to install the BA solution, you MUST try the DI solution too.
    Moreover, the data integration engine is fully embeddable in a Java application, so you could build your dynamic data integration system
    by generating Pentaho transformations on the fly. This could save you a HUGE amount of code compared with rebuilding everything from scratch!
    So my advice: use custom code to push raw data from the operational systems to the staging area.
    Then use a custom DI solution which uses the PDI components to build your cubes.


    For the questions on Mondrian, I don't know.

    Best regards
