Thread: Latin Characters in Cube Views

  1. #1
    Join Date
    Jan 2007
    Posts
    485

    Latin Characters in Cube Views

    I am building cubes with data that contains "Latin" characters (accented letters, tildes, etc.). Someone in the Spanish subforum recommended that I place the following encoding declaration at the beginning of my schema definition:

    <?xml version="1.0" encoding="ISO-8859-1"?>

    I have done so, refreshed my solution repository, and even restarted my PCI, but the Pivot View of my cube still shows erroneous characters.

    Does anyone have another solution or recommendation?

    Regards, dtm

  2. #2
    Join Date
    Jan 2007
    Posts
    485

    Further to my original post:

    I have continued trying the ISO-8859-1 encoding noted above and am getting the following error on the PCI's cmd console:

    **Begin quote
    ERROR [MondrianModel] set UserMdx failed Mondrian Error:MDX object`[Productos.Productos].[Productos Todos].[Cr˫®dito Dependiente].' not found in cube 'CuboGenericoEncuestaServicios2'
    **End quote

    I'd appreciate any help you can offer. Regards, DMurray3

  3. #3
    Join Date
    Jan 2007
    Posts
    485

    After some further investigation, I find that the problem arises when tuning the Pivot View with the "MDX" button, when the MDX statement contains members that have "tildes" or similar Latin characters in their names.

    Please advise where and how one must tell Mondrian to recognize and accept these special characters. Apparently the encoding declaration that I am placing at the beginning of the xaction file (noted earlier in this thread) is not enough.

    I'd appreciate anyone's assistance. Regards, DMurray3

  4. #4
    Join Date
    Feb 2008
    Posts
    14

    I'm having a similar problem. Apparently you need to set the same encoding as your database. I'm guessing that if you put special characters in the .xml itself (in the name, description, etc.) they turn up OK; the ones coming from the DB are the ones that don't show. Good luck.

  5. #5
    Join Date
    Mar 2007
    Posts
    216

    Hi,

    As said, I would check whether my database's charset is ISO-8859-1.
    This thread may help you: http://forums.pentaho.org/showthread.php?t=27604
    It says to change the encoding in a web.xml parameter.
    Try it and reply.
    Hope this helps; it has been a few months since I last used Pentaho, although I'm a BI pro now.

    a+, =)
    -=Clément=-

  6. #6
    Join Date
    Jan 2007
    Posts
    485

    Hi Clement, thanks for the quick response.

    My character set in MySQL is "latin-1 --cp1252 West European" and the collation is "latin1_swedish_ci" (my MySQL version does not show the "latin1_spanish" collation as a selectable option).

    According to the MySQL 5.0 Manual at www.mysql.com (section 8. Internationalization):

    "latin1 is the default character set. MySQL's latin1 is the same as the Windows cp1252 character set. This means it is the same as the official ISO 8859-1 or IANA (Internet Assigned Numbers Authority) latin1, except that IANA latin1 treats the code points between 0x80 and 0x9f as “undefined,” whereas cp1252, and therefore MySQL's latin1, assign characters for those positions. For example, 0x80 is the Euro sign. For the “undefined” entries in cp1252, MySQL translates 0x81 to Unicode 0x0081, 0x8d to 0x008d, 0x8f to 0x008f, 0x90 to 0x0090, and 0x9d to 0x009d. "
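
    To make the quoted difference concrete, here is a small sketch using Python's built-in codecs (not MySQL itself, so it only approximates MySQL's behaviour). It shows that cp1252 and ISO-8859-1 disagree on byte 0x80 but agree on the Spanish accented letters, which suggests the cp1252/ISO-8859-1 distinction is probably not what breaks names like "Crédito".

```python
# Byte 0x80 is one of the positions where cp1252 (MySQL's latin1)
# and ISO-8859-1 differ.
euro_byte = b"\x80"
as_cp1252 = euro_byte.decode("cp1252")        # Euro sign, U+20AC
as_iso_8859_1 = euro_byte.decode("latin-1")   # C1 control character, U+0080

print(f"cp1252:     U+{ord(as_cp1252):04X}")
print(f"ISO-8859-1: U+{ord(as_iso_8859_1):04X}")

# The Spanish letters a-acute, e-acute, i-acute, o-acute, u-acute and
# n-tilde all sit above 0x9F, where the two charsets are identical.
spanish = "\u00e1\u00e9\u00ed\u00f3\u00fa\u00f1"
same_bytes = spanish.encode("cp1252") == spanish.encode("latin-1")
print("Spanish letters encode identically:", same_bytes)
```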

    I can change my DB properties to use ISO-8859-1, even though it is not specifically listed among the charset options; but I cannot change the charset of my tables to ISO-8859-1, given that MySQL does not offer that charset.

    In Pentaho, I had already discovered and adjusted the "encoding" to ISO-8859-1 (in order to see the "Latin characters" in JPivot's dimensions), as follows:

    - \PCI\jboss\server\default\deploy\pentaho.war\WEB-INF\portlet.xml
    (at the beginning, to read: <?xml version="1.0" encoding="ISO-8859-1"?>)
    - \PCI\jboss\server\default\deploy\pentaho.war\WEB-INF\web.xml
    (as suggested in J. Dixon's thread you quoted, i.e. "uncommenting the context param")
    - \PCI\jboss\server\default\deploy\jbossweb-tomcat55.sar\server.xml
    (in the <Service name="jboss.web"... connector, after ..Timeout="true", I added URIEncoding="ISO-8859-1")
    - and additionally I have included the following declaration in all my xaction and mondrian.xml files
    (at the beginning of each I added <?xml version="1.0" encoding="ISO-8859-1"?>)

    So... What do you suggest I do next..? Thanks again... Daniel Murray
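
    As a rough illustration of why the URIEncoding setting matters, the sketch below uses Python's urllib (not Tomcat) to decode the same percent-encoded member name under two charsets. It assumes the browser sent the name as UTF-8 in the URL; whether the MDX editor actually submits its query that way is not known from this thread.

```python
from urllib.parse import quote, unquote

# Hypothetical member name, borrowed from the error message earlier in the thread.
member = "Cr\u00e9dito"

# What a browser would put in the URL if it encodes the text as UTF-8.
on_the_wire = quote(member, encoding="utf-8")
print(on_the_wire)

# Server decodes with the same charset: the name survives.
decoded_utf8 = unquote(on_the_wire, encoding="utf-8")

# Server decodes as ISO-8859-1: each accented letter turns into two characters.
decoded_latin1 = unquote(on_the_wire, encoding="iso-8859-1")

print(decoded_utf8)
print(decoded_latin1)
```

    If the two sides disagree like this, a lookup of the garbled member name would be expected to fail with a "not found in cube" style error, because no member with that spelling exists.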

  7. #7
    Join Date
    Mar 2007
    Posts
    216

    Hi,

    You may need to go further with an example. You provided ˫® without the correct word, so I could not check it. If you are able to read the binary data from your MySQL field, note that ˫® has the Unicode code points 02EB 00AE. You can use Jujuedit in binary mode to compare against the content of your data field (see the attached picture); maybe you will be able to work out which encoding is used. Another idea is to look at the browser's encoding. Please also give details about the OS you use.
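
    The code points can also be checked without a hex editor. The sketch below prints them in Python and then tests one guess: that the console showed the UTF-8 bytes of "é" through the CP850 OEM code page. That code page is purely an assumption (nothing in the thread confirms it); it is suggested only because the result, "├®", looks very much like the "˫®" in the error message.

```python
# The two characters quoted from the error message, written as escapes.
sample = "\u02eb\u00ae"
code_points = [f"{ord(c):04X}" for c in sample]
print(code_points)

# UTF-8 bytes of e-acute.
utf8_bytes = "\u00e9".encode("utf-8")
print(utf8_bytes.hex())

# ASSUMPTION: the cmd console renders bytes with code page 850.
as_console = utf8_bytes.decode("cp850")
print(as_console, [f"{ord(c):04X}" for c in as_console])
```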

    a+, =)
    -=Clément=-

  8. #8
    Join Date
    Jul 2007
    Posts
    2,473

    I use cubes with latin accented chars with no problem at all.

    I use ISO-8859-1 on the Mondrian schema, but UTF-8 could be used; just make sure it is valid XML. There should be no need to change the DB encoding, since JDBC drivers are capable of doing the necessary conversions.
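
    A quick way to see what "valid XML" means here: the encoding named in the declaration has to match the bytes the file was actually saved with. The sketch below uses Python's standard XML parser on a made-up one-element schema fragment (the Dimension element and its name are illustrative only, and this is not how Mondrian itself loads schemas).

```python
import xml.etree.ElementTree as ET

TEMPLATE = '<?xml version="1.0" encoding="{enc}"?><Dimension name="Cr\u00e9dito"/>'

def read_name(xml_bytes: bytes) -> str:
    """Parse the bytes and return the name attribute of the root element."""
    return ET.fromstring(xml_bytes).get("name")

# Declaration and bytes agree: the name is read correctly.
matching = TEMPLATE.format(enc="ISO-8859-1").encode("latin-1")

# Declared ISO-8859-1 but saved as UTF-8: parses without error, name is garbled.
mismatched = TEMPLATE.format(enc="ISO-8859-1").encode("utf-8")

# Declared UTF-8 but saved as ISO-8859-1: not well-formed, the parser rejects it.
invalid = TEMPLATE.format(enc="UTF-8").encode("latin-1")

print(read_name(matching))
print(read_name(mismatched))
try:
    read_name(invalid)
    rejected = False
except ET.ParseError:
    rejected = True
print("invalid file rejected:", rejected)
```

    The middle case is the dangerous one, since it fails silently: adding the ISO-8859-1 declaration to a file that the editor saved as UTF-8 would produce wrong member names rather than an error.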

    Maybe it's a problem of the browser picking the wrong charset?
    Pedro Alves
    Meet us on ##pentaho, a FreeNode irc channel

  9. #9
    Join Date
    Jan 2007
    Posts
    485

    More on Latin Chars in Cube Views

    Thanks pmalves, Clement and everyone else who has tried to help… I still have the same problem.

    I am using Windows XP SP2; Pentaho PCI 1.2.0.534 GA (including the Mondrian that comes with it); JDK/JRE 1.5.0_13. This is what I have tried so far:

    1) I changed my DB tables to
    CHARSET = UTF8 (seen in MySQL as utf8 – UTF-8 Unicode)
    COLLAT = utf8_general_ci

    2) I have changed the encoding configurations throughout Pentaho (as far as I know where to go) back to UTF-8, including:

    \PCI\jboss\server\default\deploy\pentaho.war\WEB-INF\portlet.xml
    \PCI\jboss\server\default\deploy\pentaho.war\WEB-INF\web.xml
    \PCI\jboss\server\default\deploy\jbossweb-tomcat55.sar\server.xml
    …my_cube.xaction
    …my_cube.mondrian.xml

    3) I have tried with
    IE 6.0.2900.2180 (using [es-ec] and [es]; changing the preference between the two shows no change).

    Firefox 2.0.0.6 (using languages [es-ec], [es]). When I change the preference to [es-es], I get a JPivot error (javax.servlet.ServletException: javax.servlet.jsp.JspException: net.sf.saxon.trans.DynamicError: Illegal HTML character: decimal 131).

    4) Clement: in the above, the following characters are being changed to symbols:
    - the á (a with acute accent) is being switched to a combination of Ãi (Unicode U+00C3 and U+0069) [or what appears to be that; very similar to the result for ú]
    - the é (e with acute accent) is being switched to a combination of Ã© (Unicode U+00C3 and U+00A9 respectively);
    - the í (i with acute accent) is being switched to Ã (Unicode U+00C3);
    - the ó (o with acute accent) is being switched to the combination of Ã³ (Unicode U+00C3 and U+00B3);
    - the ú (u with acute accent) is being switched to a combination of Ãi (Unicode U+00C3 and U+0069) [or what appears to be that; very similar to the result for á]
    - the ñ (n with tilde) is being switched to Ã± (Unicode U+00C3 and U+00B1)
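
    The code points listed in item 4 all start with U+00C3, which is the usual signature of UTF-8 bytes being read as ISO-8859-1. The sketch below reproduces that mistake in Python so the list can be compared against it. Under this assumption á and ú would come out as "Ã¡" and "Ãº" (both easy to misread as "Ãi"), and í as "Ã" followed by an invisible soft hyphen (U+00AD). The last lines apply the same mistake twice, which yields code point 131; that may be related to the "Illegal HTML character: decimal 131" error in item 3, but that link is a guess.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode those bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")

# a-acute, e-acute, i-acute, o-acute, u-acute, n-tilde (escapes avoid editor issues).
letters = "\u00e1\u00e9\u00ed\u00f3\u00fa\u00f1"

table = {}
for ch in letters:
    garbled = misread_utf8_as_latin1(ch)
    table[ch] = [f"U+{ord(c):04X}" for c in garbled]
    print(f"{ch} -> {garbled!r} {table[ch]}")

# Making the same mistake twice on e-acute introduces a C1 control character.
twice = misread_utf8_as_latin1(misread_utf8_as_latin1("\u00e9"))
twice_decimal = [ord(c) for c in twice]
print(twice_decimal)
```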

    I’ll search for and download Jujuedit and see what I can do.

    In the above configurations, as long as I don’t use the MDX Editor, all my Spanish texts appear correctly (tildes and so forth) and I can even change Columns/Rows/Filters with the Navigator; it is when I try to modify the query through the MDX Query Editor that I get the “characters” errors (as can be seen in the enclosed screenshot sequences).

    I’ll try to reset my Pentaho configuration back to ISO-8859-1 and see how I can do the same with my tables in MySQL. At least with ISO-8859-1, the MDX Editor (other than not allowing me to use a conditional format) was recognizing my Latin characters with every edit of the MDX.

    I’ll post my results from this second avenue as soon as I get them. Again, many thanks, Daniel Murray
    Last edited by DMurray3; 03-17-2008 at 04:02 PM.