Hitachi Vantara Pentaho Community Forums

Thread: [Mondrian] Testing a Hive dialect

  1. #1
    Fu Hongwei Guest

    [Mondrian] Testing a Hive dialect

    Hi,
    I'm new to the community, so I'm not sure who to ask about this.
    I've written a Hive dialect for Mondrian and have been testing it.
    DialectTest passes, but other tests are still failing.

    From what I can see, there is still a long way to go before an integration
    of Hive and Mondrian is practical.
    1. It's really slow. A run through all the tests takes 1 or 2 days, and that
    is a low-end estimate. The latency is too high for most applications.
    It might take a major architectural change on the Hive side to solve
    this problem.
    2. HiveQL is still rather immature, partly because it's not
    really intended to be a full-featured relational database. There are some
    bugs too; for example, the join behavior is incorrect on the current trunk.

    I know there is already a Jira case for this:
    http://jira.pentaho.com/browse/MONDRIAN-789
    How can I submit the code? Thanks.
    2011-02-10



    Fu Hongwei

    _______________________________________________
    Mondrian mailing list
    Mondrian (AT) pentaho (DOT) org
    http://lists.pentaho.org/mailman/listinfo/mondrian

  2. #2
    Julian Hyde Guest

    RE: [Mondrian] Testing a Hive dialect

    I am well aware of the compromises with Hive. There is certainly an
    impedance mismatch between Hadoop and real-time analysis, and that is
    reflected in query response time. It is still useful to have a Hive
    dialect because, as you say, Hive is improving all the time. And Pentaho is
    thinking about ways to bridge the impedance mismatch.

    Can you please attach your code to the Jira case as a patch? I will submit
    it.

    Also, please attach the output of the test suite and state the version of
    Hive you are running against. That will be a reference point for others who
    are working on Hive.

    Julian



  3. #3
    Fu Hongwei Guest

    Re: RE: [Mondrian] Testing a Hive dialect

    Hi Julian,
    Thanks for the prompt reply.
    The tests are running on Hive 0.7.0, but a patch is needed. I've opened a Jira issue on the Hive side and will submit the patch there soon too.
    https://issues.apache.org/jira/brows...ction_12987548

    There are too many failures in the test suite. I'm still working on it, but I will submit DialectTest.java and HiveDialect.java so that interested people can work on it together.
    I think the point is to get it running first, then speed it up.
    2011-02-11



    Fu Hongwei




  4. #4
    Julian Hyde Guest

    RE: RE: [Mondrian] Testing a Hive dialect

    If it is difficult to fix HIVE-1922, we could possibly work around the issue by
    changing Dialect.generateOrderItem. (Every dialect tends to have different
    rules for how to generate an ORDER BY clause -- order by column name, order
    by ordinal, order by expression, order by ordinal only when applied to a set
    operation such as union, etc. -- so there's no harm having yet another
    behavior.)
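
    A minimal sketch of the idea that each dialect picks its own way of
    referring to a sort key. This is not Mondrian's Dialect API: the class
    OrderItemSketch, the Style enum, and the orderItem method are hypothetical
    names made up for illustration, and which style Hive actually needs is not
    stated in this thread.

```java
/** Hypothetical illustration only; none of these names come from Mondrian. */
final class OrderItemSketch {
    /** Ways a dialect might require a sort key to be referenced. */
    enum Style { EXPRESSION, ALIAS, ORDINAL }

    /**
     * Builds one ORDER BY item.
     *
     * @param style     how this (assumed) dialect wants the key referenced
     * @param expr      the SQL expression, e.g. "t.city"
     * @param alias     the column alias used in the SELECT list, e.g. "c0"
     * @param ordinal   1-based position of the column in the SELECT list
     * @param ascending true for ASC, false for DESC
     */
    static String orderItem(
        Style style, String expr, String alias, int ordinal, boolean ascending)
    {
        final String key;
        switch (style) {
        case ALIAS:
            key = alias;
            break;
        case ORDINAL:
            key = Integer.toString(ordinal);
            break;
        default:
            key = expr;
            break;
        }
        return key + (ascending ? " ASC" : " DESC");
    }

    public static void main(String[] args) {
        for (Style style : Style.values()) {
            System.out.println(
                "ORDER BY " + orderItem(style, "t.city", "c0", 1, true));
        }
    }
}
```

    A Hive-specific dialect would then only need to return a different style
    (or override the method outright) without touching the other dialects.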

    Someone also mentioned that Hive only supports ANSI join syntax 'FROM t1
    JOIN t2 ON t1.x = t2.y', whereas Mondrian only generates 'FROM t1, t2 WHERE
    t1.x = t2.y'. Is this still an issue? We would need to fix either Hive or
    Mondrian's dialect. Mondrian's dialect is probably easier.

    Since other people have expressed an interest in a Hive dialect, it would be
    useful if you check in what you have right now, even though there are many
    failures. Send me the files and I will check them in.

    Julian




  5. #5
    Calum Miller Guest

    Re: [Mondrian] Testing a Hive dialect

    I have time next week to help make the necessary Mondrian dialect changes. Happy to start on Monday if the Jira case is updated by then.

    Calum

    Sent from my iPhone


  6. #6
    fuhongwei141 Guest

    Re: Re: [Mondrian] Testing a Hive dialect

    Hi All,
    I've attached the patch to
    https://issues.apache.org/jira/brows...ction_12987548

    It works fine on my machine with Hive 0.7.0 and Hadoop 0.20.0.
    Please tell me if anything is wrong.

    2011-02-13



    Fu Hongwei




  7. #7
    Fu Hongwei Guest

    Re: RE: RE: [Mondrian] Testing a Hive dialect

    Thanks, please commit the files for me. HIVE-1922 contains some things I consider easier to fix on the Hive side but, yes, I think it's better to have a version that does not need the Hive patch.

    The failures were caused by a few small problems. It's running beautifully now.

    I've fixed the join issue. You will also want to add a VM parameter when running the tests, or it will report some errors:
    -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
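
    A small sketch for checking which XML parser factory a test JVM ends up
    with, assuming the parameter above is meant as a standard JVM system
    property (the -D form). XmlFactorySketch and its methods are hypothetical
    helper names; only DocumentBuilderFactory.newInstance() is a real JDK call.

```java
import javax.xml.parsers.DocumentBuilderFactory;

/** Hypothetical helper for inspecting the JAXP factory in use. */
final class XmlFactorySketch {
    static final String PROPERTY =
        "javax.xml.parsers.DocumentBuilderFactory";
    static final String JDK_IMPL =
        "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl";

    /** The argument to pass to the JVM that runs the tests. */
    static String jvmArgument() {
        return "-D" + PROPERTY + "=" + JDK_IMPL;
    }

    /** Class name of the factory that JAXP resolves in this JVM. */
    static String activeFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("JVM argument:   " + jvmArgument());
        System.out.println("Property value: " + System.getProperty(PROPERTY));
        System.out.println("Active factory: " + activeFactoryClass());
    }
}
```

    Running this with and without the argument shows whether another parser
    on the classpath is being picked up instead of the JDK's built-in one.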

    Could I have commit access? It looks like I have a lot of files to submit.

    BTW, could somebody tell me how to make a patch in Perforce, as I would in svn? Thanks.
    2011-02-13



    Fu Hongwei




  8. #8
    Julian Hyde Guest

    RE: RE: RE: [Mondrian] Testing a Hive dialect

    I have checked in your patch as change 14118. Thank you for the
    contribution.

    I did some cleanup first (mainly to make the code comply with our coding
    conventions). Let me know if I broke anything.

    I'm doing some further cleanup to move the Hive-specific stuff (e.g.
    checking that the 'on' clause only contains 'x.a = y.b' or 'upper(x.a) =
    upper(y.b)') into the Hive dialect. My goal is to enable FROM-JOIN-ON syntax
    for other dialects such as Oracle.
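
    As a rough sketch of the kind of check described above, the following
    accepts only the two condition shapes mentioned, 'x.a = y.b' and
    'upper(x.a) = upper(y.b)'. It is an assumption-laden illustration:
    OnClauseSketch and isSimpleEquiJoin are hypothetical names, the check in
    the actual patch may well work on an expression tree instead of text, and
    the regular expressions only cover plain table.column identifiers.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration only; not the check from the patch. */
final class OnClauseSketch {
    /** A qualified column such as "x.a". */
    private static final String COLUMN =
        "[A-Za-z_][A-Za-z0-9_]*\\.[A-Za-z_][A-Za-z0-9_]*";

    /** Matches "x.a = y.b". */
    private static final Pattern PLAIN = Pattern.compile(
        "\\s*" + COLUMN + "\\s*=\\s*" + COLUMN + "\\s*");

    /** Matches "upper(x.a) = upper(y.b)", ignoring the case of "upper". */
    private static final Pattern UPPER = Pattern.compile(
        "\\s*upper\\(\\s*" + COLUMN + "\\s*\\)\\s*=\\s*"
        + "upper\\(\\s*" + COLUMN + "\\s*\\)\\s*",
        Pattern.CASE_INSENSITIVE);

    /** Returns whether the condition has one of the two accepted shapes. */
    static boolean isSimpleEquiJoin(String condition) {
        return PLAIN.matcher(condition).matches()
            || UPPER.matcher(condition).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "x.a = y.b",
            "upper(x.a) = upper(y.b)",
            "x.a < y.b",
            "x.a = y.b + 1",
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isSimpleEquiJoin(sample));
        }
    }
}
```

    Keeping such a restriction inside the Hive dialect leaves other dialects
    free to accept richer ON conditions.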

    I do not have a Hive instance to test against, so my apologies in advance if
    I break anything in this upcoming change. I hope that change 14118 will give
    others such as Calum enough to test and work against.

    Generally I like to have a few more contributions before I give newcomers
    committer access. For now, the best way to send patches is using the
    packChange utility (available in //open/util/bin in the eigenbase perforce
    repository). But I will also accept tar files or patch files as long as each
    file has a '$Id: $' header somewhere so that I can identify the version that
    you modified. Next time, can you also run the
    //open/mondrian/bin/checkFile.sh script on your changes, to check for
    compliance with our coding guidelines?

    Julian




  9. #9
    Fu Hongwei Guest

    Re: RE: RE: RE: [Mondrian] Testing a Hive dialect

    Hi,
    Thanks.
    There are still some failures. I will submit further changes once the test suite passes. Also, the Hive patch seems to be broken. I will try to fix that too.


    2011-02-14



    Fu Hongwei




  10. #10
    Calum Miller Guest

    Re: [Mondrian] Testing a Hive dialect

    Hi Fu,

    I came across this page on Hive and HBase integration, http://wiki.apache.org/hadoop/Hive/HBaseIntegration, and wondered whether it is something you have looked at. I'm thinking this integration may reduce the latency issues with Hive and improve Mondrian's responsiveness.

    Calum


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.