View Issue Details

ID: 0006441
Project: mantisbt
Category: localization
View Status: public
Last Update: 2007-05-08 03:43
Reporter: morpheus
Assigned To: achumakov
Priority: normal
Severity: feature
Reproducibility: N/A
Status: closed
Resolution: duplicate
Product Version: 1.0.0rc3
Summary: 0006441: I have created an english_utf8 locale
Description

At my company, about 30% of our testers input their bug reports in Japanese. We set them up to use japanese_utf8 as their locale, and I assumed that I would be able to read their Japanese bug reports in the English locale on Mantis. However, it seems that the English locale uses the Windows-1252 character set. Therefore, the Japanese characters appear garbled.

I have created a locale for an English user interface using UTF-8 character encoding. This has solved my problem. I can read bug reports in any language, so long as they were submitted using UTF-8, at the same time leaving the user interface in English.

At some point it would probably be best to separate the languages from the character encoding. That is, you could store all of the lang/strings_* files in UTF-8, and then convert all of your character output to different encodings based on the user's preference when you serve the page to the browser.
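The separation suggested above can be sketched as follows. This is only an illustration in Python, not Mantis code (Mantis is written in PHP): it assumes the interface strings are held as Unicode, loaded from UTF-8 files, and are encoded only when the page is served. The function name encode_for_user and the fallback to numeric character references are hypothetical choices.

```python
def encode_for_user(text: str, charset: str = "utf-8") -> bytes:
    """Encode a string in the charset the user prefers.

    Characters the target charset cannot represent are written as
    HTML numeric character references instead of being garbled.
    """
    return text.encode(charset, errors="xmlcharrefreplace")


summary = "Bug report: 日本語"

# Same stored string, two different output encodings.
utf8_page = encode_for_user(summary, "utf-8")
cp1252_page = encode_for_user(summary, "windows-1252")
```

With this approach only one copy of each translation is stored, and a Windows-1252 reader still gets something a browser can render as the original Japanese text.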

Anyway, I have attached my patch. Please feel free to contact me with questions.

It would be nice to get credit in the documentation. My information is:

James Ryan
jamesr@totalinfosecurity.com
http://www.totalinfosecurity.com

Thanks...

Tags: No tags attached.
Attached Files

Relationships

duplicate of 0004804 (closed, thraxisp): Unable to allow mangers to add custom fields
related to 0004084 (closed, siebrand): [all lang] Use UTF-8 codepage

Activities

xella


2005-12-02 02:54

reporter   ~0011679

As an international company, we have a similar problem.

Currently, we solve this problem fairly easily with Vim and the iconv library. In Vim, we open a language file and issue three commands:

:e ++enc=original_encoding
:file new_encoding_name.php
:w ++enc=utf-8

The first command loads the file in the right encoding. The second ensures that we won't overwrite the original file; we then change the encoding name in the text. Finally, the third command saves the file in UTF-8.
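For anyone who would rather script these steps than repeat them in Vim, here is a rough Python equivalent. It is a sketch under assumptions: the function name reencode_to_utf8, the file names, and the sample file contents are made up for the demonstration, and the charset name is assumed to appear in the file as a quoted string.

```python
import tempfile
from pathlib import Path


def reencode_to_utf8(src: Path, dst: Path, src_encoding: str) -> None:
    """Save a copy of src as UTF-8, rewriting the charset name inside it."""
    if dst.exists():
        # Extra guard (not part of the Vim steps): never clobber a target.
        raise FileExistsError(dst)
    # Step 1: read the file in its original encoding.
    text = src.read_bytes().decode(src_encoding)
    # Manual step: change the encoding name mentioned in the text.
    text = text.replace(f"'{src_encoding}'", "'utf-8'")
    # Steps 2 and 3: write to a new file, encoded as UTF-8.
    dst.write_bytes(text.encode("utf-8"))


# Usage on a throwaway file with hypothetical contents.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "strings_sample.txt"
    dst = Path(tmp) / "strings_sample_utf8.txt"
    sample = "$s_charset = 'windows-1252';\n$s_x = 'café';\n"
    src.write_bytes(sample.encode("windows-1252"))
    reencode_to_utf8(src, dst, "windows-1252")
    converted = dst.read_bytes()
```

Working on bytes rather than text mode keeps the original line endings untouched.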

The problem with this is that we have to regenerate the encodings every time a new Mantis version appears. So separating translations from encodings would be a really good feature.

xella


2005-12-02 02:58

reporter   ~0011680

I have a suggestion for this feature: each translation file could declare a "default" encoding, which would be used unless the admin sets a global encoding for Mantis.
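The fallback rule proposed here is small enough to write down. The sketch below is in Python purely for illustration; effective_charset and its parameters are hypothetical names, not existing Mantis settings.

```python
from typing import Optional


def effective_charset(translation_default: str,
                      global_charset: Optional[str] = None) -> str:
    """Return the admin's global charset if set, else the translation's default."""
    if global_charset:
        return global_charset
    return translation_default


# No global setting: the translation file's own default wins.
fallback = effective_charset("windows-1252")
# Admin has set a global charset: it overrides the default.
overridden = effective_charset("windows-1252", "utf-8")
```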

For translating from one encoding to another, maybe this could help:

http://www.gnu.org/software/libiconv/

morpheus


2005-12-02 03:21

reporter   ~0011681

Thank you all for your comments. Actually, I used iconv from the Linux command line to convert the lang/strings_english file to UTF-8. But since ASCII is identical to UTF-8 for the first 128 code points (bytes 0-127), and the English strings are presumably plain ASCII, the conversion probably wasn't necessary.
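A quick way to check whether a file really needs converting: only bytes 0-127 mean the same thing in ASCII, Windows-1252, and UTF-8, so a file made up entirely of such bytes is already valid UTF-8. The Python sketch below illustrates this; is_pure_ascii is a hypothetical helper and the sample strings are invented.

```python
def is_pure_ascii(data: bytes) -> bool:
    """True if every byte is below 0x80, i.e. already valid UTF-8 as-is."""
    return all(b < 0x80 for b in data)


plain = "Report Issue".encode("windows-1252")
accented = "café".encode("windows-1252")

# Pure ASCII text has the same bytes in both encodings.
same_in_utf8 = plain == "Report Issue".encode("utf-8")
# A character above 127 is one byte in Windows-1252 but two in UTF-8.
differs_in_utf8 = accented != "café".encode("utf-8")
```

If is_pure_ascii returns True for a language file, changing only the declared charset is enough; otherwise the file has to be re-encoded.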

PHP has native support for the iconv library, so long as you compile it in when you compile PHP. This is what my company uses to serve content in different character sets. It means you don't have to store multiple files on your server that have the same content but different character encodings.

This change would be relatively easy to make in Mantis...if only I had the time...

achumakov


2006-09-24 05:22

reporter   ~0013456

This problem could be fixed by converting Mantis to UTF-8 throughout.
For now, strings_english does not need to be converted; you can simply change $s_charset = 'windows-1252'; to $s_charset = 'utf-8';.