View Issue Details

ID: 0011298
Project: mantisbt
Category: localization
View Status: public
Last Update: 2010-04-23 14:30
Reporter: dolmen
Assigned To: siebrand
Priority: normal
Severity: trivial
Reproducibility: always
Status: closed
Resolution: fixed
Target Version: 1.2.1
Fixed in Version: 1.2.1
Summary: 0011298: encoding bug line 841 of strings_french.txt
Description

The translation for $s_my_view_title_my_comments (line 841 of strings_french.txt) has a UTF-8 encoding that looks strange, at least to vim.

Patch attached.

I don't know how I could fix that through translatewiki.

Tags: No tags attached.
Attached Files
encoding.patch (748 bytes)   
diff --git a/lang/strings_french.txt b/lang/strings_french.txt
index 1e2441b..64fcc1f 100644
--- a/lang/strings_french.txt
+++ b/lang/strings_french.txt
@@ -838,7 +838,7 @@ $s_my_view_title_resolved = 'Résolu';
 $s_my_view_title_monitored = 'Surveillé par moi';
 $s_my_view_title_feedback = 'En attente de suivi de moi';
 $s_my_view_title_verify = 'En attente d\'une confirmation de résolution de moi';
-$s_my_view_title_my_comments = 'Bogues sur lesquels j’ai ajouté un commentaire';
+$s_my_view_title_my_comments = 'Bogues sur lesquels j’ai ajouté un commentaire';
 $s_news_added_msg = 'Nouvelle ajoutée...';
 $s_news_deleted_msg = 'Nouvelle supprimée...';
 $s_delete_news_sure_msg = 'Voulez-vous vraiment supprimer cette nouvelle ?';

Activities

siebrand

2009-12-22 06:03

developer   ~0023953

Looks fine to me on master. Possibly you have your shell encoding set incorrectly (?)

See http://translatewiki.net/wiki/Mantis:S_my_view_title_my_comments/fr for the current string.

dolmen

2009-12-22 10:37

reporter   ~0023958

The quote character (U+0092) is OK, but it is not correctly encoded: it uses 3 bytes (E2, 80, 99) instead of 2 (C2, 92). This is not valid according to the UTF-8 definition.

UTF-8 was ambiguous when it was initially defined, but it has since been made stricter.
See the colored table at http://en.wikipedia.org/wiki/UTF-8#Description
See this sentence from RFC 3629: http://tools.ietf.org/html/rfc3629#section-3

   It is important to note
   that the rows of the table are mutually exclusive, i.e., there is
   only one valid way to encode a given character.
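
One way to check which code point each of the two byte sequences quoted above stands for is to run them through a strict UTF-8 decoder. The sketch below uses Python's built-in decoder and the `unicodedata` module. The helper name `describe` is made up for illustration. It only reports what the bytes decode to and does not by itself decide which character the translation ought to contain.

```python
import unicodedata


def describe(raw: bytes) -> str:
    """Decode raw strictly as UTF-8 and list each code point with its name."""
    text = raw.decode("utf-8")  # strict mode: raises UnicodeDecodeError on ill-formed input
    return ", ".join(
        # Control characters have no Unicode name, so supply a default.
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<no name>')}"
        for ch in text
    )


print(describe(bytes([0xE2, 0x80, 0x99])))  # U+2019 RIGHT SINGLE QUOTATION MARK
print(describe(bytes([0xC2, 0x92])))        # U+0092 <no name>
```

Both sequences are accepted by the strict decoder, and they decode to different code points: U+2019 for the three-byte form and U+0092 (a C1 control character) for the two-byte form.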
dolmen

2009-12-22 10:50

reporter   ~0023959

Reopening as I've added explanations.

dolmen

2009-12-22 10:59

reporter   ~0023960

Here is some additional information from http://perldoc.perl.org/perlunicode.html#Unicode-Encodings


The following table is from Unicode 3.2.

  Code Points          1st Byte  2nd Byte  3rd Byte  4th Byte
  U+0000..U+007F       00..7F
  U+0080..U+07FF       C2..DF    80..BF
  U+0800..U+0FFF       E0        A0..BF    80..BF
  U+1000..U+CFFF       E1..EC    80..BF    80..BF
  U+D000..U+D7FF       ED        80..9F    80..BF
  U+D800..U+DFFF       ill-formed
  U+E000..U+FFFF       EE..EF    80..BF    80..BF
  U+10000..U+3FFFF     F0        90..BF    80..BF    80..BF
  U+40000..U+FFFFF     F1..F3    80..BF    80..BF    80..BF
  U+100000..U+10FFFF   F4        80..8F    80..BF    80..BF

Note the A0..BF in U+0800..U+0FFF, the 80..9F in U+D000..U+D7FF, the 90..BF in U+10000..U+3FFFF, and the 80..8F in U+100000..U+10FFFF. The "gaps" are caused by legal UTF-8 avoiding non-shortest encodings: it is technically possible to UTF-8-encode a single code point in different ways, but that is explicitly forbidden, and the shortest possible encoding should always be used.
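
The byte ranges in the table can be turned directly into a small validator. The following is a minimal sketch, assuming the table above is the complete definition of well-formed sequences. The names `ROWS` and `is_well_formed_utf8` are hypothetical and not taken from Perl or MantisBT.

```python
# Each row: (first-byte range, second-byte range or None, count of further 80..BF bytes)
ROWS = [
    ((0x00, 0x7F), None, 0),          # U+0000..U+007F
    ((0xC2, 0xDF), (0x80, 0xBF), 0),  # U+0080..U+07FF
    ((0xE0, 0xE0), (0xA0, 0xBF), 1),  # U+0800..U+0FFF
    ((0xE1, 0xEC), (0x80, 0xBF), 1),  # U+1000..U+CFFF
    ((0xED, 0xED), (0x80, 0x9F), 1),  # U+D000..U+D7FF (surrogates excluded)
    ((0xEE, 0xEF), (0x80, 0xBF), 1),  # U+E000..U+FFFF
    ((0xF0, 0xF0), (0x90, 0xBF), 2),  # U+10000..U+3FFFF
    ((0xF1, 0xF3), (0x80, 0xBF), 2),  # U+40000..U+FFFFF
    ((0xF4, 0xF4), (0x80, 0x8F), 2),  # U+100000..U+10FFFF
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if every sequence in data matches one row of the table."""
    i, n = 0, len(data)
    while i < n:
        first = data[i]
        for (lo, hi), second, tail in ROWS:
            if lo <= first <= hi:
                break
        else:
            return False  # C0, C1, F5..FF, or a stray continuation byte
        i += 1
        if second is None:
            continue  # single-byte (ASCII) sequence
        if i >= n or not (second[0] <= data[i] <= second[1]):
            return False  # missing or out-of-range second byte
        i += 1
        for _ in range(tail):
            if i >= n or not (0x80 <= data[i] <= 0xBF):
                return False  # missing or out-of-range trailing byte
            i += 1
    return True


print(is_well_formed_utf8(bytes([0xE2, 0x80, 0x99])))  # True: matches the E1..EC row
print(is_well_formed_utf8(bytes([0xC2, 0x92])))        # True: matches the C2..DF row
print(is_well_formed_utf8(bytes([0xC0, 0xAF])))        # False: non-shortest form of "/"
print(is_well_formed_utf8(bytes([0xE0, 0x80, 0xAF])))  # False: falls in the A0..BF gap
```

The non-shortest forms are rejected only because of the narrowed ranges (C2 instead of C0 as the lowest two-byte lead, A0 instead of 80 after E0, and so on), which is the point the quoted paragraph makes.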

siebrand

2010-03-20 17:57

developer   ~0024829

Fixed in commit d087800acb97cb6000497c617ea0ed3891dbc3dd

Related Changesets

MantisBT: master-1.2.x d087800a

2010-03-20 17:56

siebrand


Fix 0011298: encoding bug line 841 of strings_french.txt

Signed-off-by: Siebrand Mazeland <s.mazeland@xs4all.nl>
Affected Issues
0011298
mod - lang/strings_french.txt

MantisBT: master 70f8bd83

2010-03-20 17:56

siebrand


Fix 0011298: encoding bug line 841 of strings_french.txt

Signed-off-by: Siebrand Mazeland <s.mazeland@xs4all.nl>
Affected Issues
0011298
mod - lang/strings_french.txt