0006441: I have created an english_utf8 locale - MantisBT

ID	Project	Category	View Status	Date Submitted	Last Update
0006441	mantisbt	localization	public	2005-11-30 00:11	2007-05-08 03:43

Reporter	morpheus	Assigned To	achumakov
Priority	normal	Severity	feature	Reproducibility	N/A
Status	closed	Resolution	duplicate
Product Version	1.0.0rc3

Summary	0006441: I have created an english_utf8 locale
Description	At my company, about 30% of our testers input their bug reports in Japanese. We set them up to use japanese_utf8 as their locale, and I assumed that I would be able to read their Japanese bug reports in the English locale on Mantis. However, it seems that the English locale uses the Windows-1252 character set. Therefore, the Japanese characters appear garbled. I have created a locale for an English user interface using UTF-8 character encoding. This has solved my problem. I can read bug reports in any language, so long as they were submitted using UTF-8, at the same time leaving the user interface in English. At some point it would probably be best to separate the languages from the character encoding. That is, you could store all of the lang/strings_* files in UTF-8, and then convert all of your character output to different encodings based on the user's preference when you serve the page to the browser. Anyway, I have attached my patch. Please feel free to contact me with questions. It would be nice to get credit in the documentation. My information is: James Ryan jamesr@totalinfosecurity.com http://www.totalinfosecurity.com Thanks...
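The separation proposed in the description can be sketched roughly as follows. This is an illustration in Python rather than Mantis's PHP, and `STRINGS`, `render_for_user`, and the sample strings are hypothetical names, not part of Mantis. The idea is that each UI string is stored once as Unicode and encoded to the user's preferred charset only when the page is served.

```python
# Hypothetical sketch: keep one Unicode copy of each UI string and
# encode it per user at output time. All names here are illustrative.

STRINGS = {"report_bug": "Report Issue"}  # stand-in for a lang/strings_* file


def render_for_user(key: str, body: str, charset: str) -> bytes:
    """Return a page fragment encoded in the user's preferred charset.

    Characters the target charset cannot represent are written as
    numeric character references instead of raising an error.
    """
    text = f"{STRINGS[key]}: {body}"
    return text.encode(charset, errors="xmlcharrefreplace")


# The same stored string served to two users with different preferences.
utf8_page = render_for_user("report_bug", "日本語", "utf-8")
cp1252_page = render_for_user("report_bug", "日本語", "windows-1252")
```

With a legacy charset such as windows-1252, the Japanese text survives only as character references, which is one reason serving UTF-8 to everyone may be the simpler choice.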
Tags	No tags attached.
Attached Files	english_utf8.patch.tar.bz (16,412 bytes)

xella 2005-12-02 02:54 reporter ~0011679	As an international company, we have a similar problem. Currently, we solve it fairly easily with Vim and the iconv library. In Vim, we open a language file and issue three commands: (1) :e ++enc=original_encoding (2) :file new_encoding_name.php (3) :w ++enc=utf-8 The first command reloads the file in the correct encoding, and the second ensures that we won't overwrite the original file. We then change the encoding name in the text. Finally, the third command saves the file in UTF-8. The drawback is that we have to regenerate the encodings every time a new Mantis version appears, so separating translations from encodings would be a really good feature.
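The manual Vim workflow above could be scripted so it is easy to rerun after each Mantis upgrade. The following is a minimal sketch in Python, not the commenter's actual tooling; the function name `reencode_to_utf8`, the file names, and the sample contents are assumptions made for the demonstration.

```python
import tempfile
from pathlib import Path


def reencode_to_utf8(src: Path, dst: Path, original_encoding: str,
                     old_charset_name: str) -> None:
    """Write a UTF-8 copy of src to dst, leaving src untouched."""
    # Equivalent of ":e ++enc=original_encoding": decode with the right codec.
    text = src.read_text(encoding=original_encoding)
    # Update the charset name declared inside the file, if present.
    text = text.replace(old_charset_name, "utf-8")
    # Equivalent of ":file new_name" plus ":w ++enc=utf-8".
    dst.write_text(text, encoding="utf-8")


# Demo with a throwaway file; the file names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "strings_german.txt"
    dst = Path(tmp) / "strings_german_utf8.txt"
    src.write_bytes("$s_charset = 'windows-1252'; // Übersicht".encode("cp1252"))
    reencode_to_utf8(src, dst, "cp1252", "windows-1252")
    converted = dst.read_bytes()
    original_unchanged = src.read_bytes()
```

Writing to a separate destination path mirrors the second Vim command: the original language file is never overwritten.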

xella 2005-12-02 02:58 reporter ~0011680	I have a suggestion for the feature: each translation file could declare a "default" encoding, which would be used if the admin has not set a global encoding for Mantis. For converting from one encoding to another, maybe this could help: http://www.gnu.org/software/libiconv/

morpheus 2005-12-02 03:21 reporter ~0011681	Thank you all for your comments. Actually, I used iconv from the Linux command line to convert the lang/strings_english file to UTF-8. But since ASCII overlaps with UTF-8 for the first 128 code points, it probably wasn't necessary. PHP has native support for the iconv library, so long as you compile it in when you build PHP. This is what my company uses to serve content in different character sets. It means you don't have to store multiple files on your server that have the same content but different character encodings. This change would be relatively easy to make in Mantis...if only I had the time...
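The overlap holds for the 7-bit ASCII range (code points 0 to 127); above that, windows-1252 and UTF-8 produce different bytes. A small Python check, using made-up sample strings, illustrates why a pure-ASCII language file needs no conversion while one with accented characters does.

```python
# Pure ASCII text has identical bytes in ASCII, UTF-8, and windows-1252.
ascii_text = "Report Issue"
same_bytes = (
    ascii_text.encode("ascii")
    == ascii_text.encode("utf-8")
    == ascii_text.encode("cp1252")
)

# A character outside ASCII is encoded differently by the two charsets.
accented = "é"
cp1252_bytes = accented.encode("cp1252")  # single byte 0xE9
utf8_bytes = accented.encode("utf-8")     # two bytes 0xC3 0xA9
```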

achumakov 2006-09-24 05:22 reporter ~0013456	This problem could be fixed with utf8-ization of Mantis. For now, no conversion of strings_english is required; you can simply change $s_charset = 'windows-1252'; to $s_charset = 'utf-8';.
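Why the declared charset alone matters can be shown with a short Python sketch. It assumes the reports are stored as UTF-8 bytes, as in the original description, and mimics a browser decoding those bytes according to whichever charset the page declares; the variable names are illustrative only.

```python
# Bytes as a Japanese tester's report would be stored when submitted in UTF-8.
stored = "日本語".encode("utf-8")

# Page declares utf-8: the bytes decode back to the original text.
as_declared_utf8 = stored.decode("utf-8")

# Page declares windows-1252: each byte becomes its own character,
# producing the garbled output described in the report.
as_declared_cp1252 = stored.decode("cp1252", errors="replace")
```

The stored data is the same in both cases; only the interpretation changes, which is why switching the charset string is enough for an ASCII-only language file.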