0010607: Non-ASCII characters, collation and MS SQL Server - MantisBT

ID	Project	Category	View Status	Date Submitted	Last Update
0010607	mantisbt	db mssql	public	2009-06-19 08:21	2014-05-16 15:00

Reporter	cbasset	Assigned To	dregad
Priority	normal	Severity	minor	Reproducibility	always
Status	closed	Resolution	no change required
Platform	x86	OS	Windows	OS Version	2003 server SP2
Product Version	1.1.6

Summary	0010607: Non-ASCII characters, collation and MS SQL Server
Description	Mantis stores text in the database as UTF-8, which MS SQL Server doesn't support as a collation. Non-ASCII characters such as é (eacute) work fine as long as only Mantis touches the data. Integrating Mantis with other tools raises the following issues: when the database is accessed directly with other tools, text created with Mantis is displayed as "Ã©" instead of é (eacute); when data is inserted directly into the database (data migration, interface with a CRM tool), text that is properly stored with non-ASCII characters doesn't display at all in Mantis, while text without non-ASCII characters is displayed properly. The workaround suggested for MySQL, i.e. using a UTF8 collation in the database, can't be applied to MS SQL Server because it doesn't handle this collation.
Steps To Reproduce	Create an issue with an accented character like "é". Query the database with a standard SQL client and read the record: the accented character is not properly stored in the database and can't be retrieved with standard features. Or: insert an issue directly into the database with an accented character like "é" in the description using a standard SQL client, then display the issue in Mantis => the description isn't displayed.
Tags	No tags attached.
Attached Files	mantis_sqlserver_utf8.png (22,617 bytes)
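
The two symptoms in the description can be reproduced outside Mantis and SQL Server. The following is a minimal Python sketch, assuming the other tools read and write the column with a single-byte Windows code page (cp1252 is used here only as an illustration; the report does not say which code page was involved).

```python
# Illustration only: reproduces the two reported symptoms in plain Python.
# Assumption: the non-Mantis client uses the cp1252 code page.
original = "é"  # U+00E9

# What a UTF-8 application writes for this character: two bytes.
utf8_bytes = original.encode("utf-8")  # b'\xc3\xa9'

# Symptom 1: a client that assumes cp1252 reads each UTF-8 byte
# as a separate character.
misread = utf8_bytes.decode("cp1252")
print(misread)  # Ã©

# The text is recoverable as long as the stored bytes are intact.
repaired = misread.encode("cp1252").decode("utf-8")

# Symptom 2: a cp1252 client stores the single byte 0xE9, which is
# not a valid UTF-8 sequence on its own, so a strict UTF-8 reader rejects it.
cp1252_bytes = original.encode("cp1252")  # b'\xe9'
try:
    cp1252_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
```

This only models the byte-level mismatch; how Mantis itself reacts to invalid UTF-8 (an empty description in the report) depends on its output handling.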

dhx 2009-06-20 06:34 reporter ~0022203	Microsoft decided to go with UCS-2 (now obsoleted by UTF-16) whereas the standard for just about everyone else in the world is UTF-8. It sounds like MS SQL will never give you the full Unicode range (some characters require a 4-byte representation). We could write some sort of string conversion from UTF-8 to UCS-2 (and vice versa) to interface with MS SQL. However, the problem is: what do we do with perfectly legitimate 4-byte characters? UTF-16 solves this problem by allowing surrogate pairs (two 2-byte code units). UCS-2 doesn't have this feature. Do we simply drop these characters as being "illegal" when working with MS SQL servers? Do we replace them with something else?
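
To make the surrogate-pair question concrete, here is a small Python sketch. It shows how a character outside the Basic Multilingual Plane becomes two 16-bit units in UTF-16, and what the "drop" and "replace" options would look like. The helper names (`utf16_units`, `fits_ucs2`, `to_bmp`) and the choice of U+FFFD as the replacement are assumptions made for illustration, not anything Mantis implements.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER (illustrative choice)


def utf16_units(text):
    """Return the 16-bit code units of text when encoded as UTF-16."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def fits_ucs2(text):
    """True if every character is in the BMP, i.e. needs no surrogate pair."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def to_bmp(text, policy="replace"):
    """Hypothetical policy helper: drop or replace characters UCS-2 cannot hold."""
    if policy == "drop":
        return "".join(ch for ch in text if ord(ch) <= 0xFFFF)
    return "".join(ch if ord(ch) <= 0xFFFF else REPLACEMENT for ch in text)


emoji = "\U0001F600"                       # needs 4 bytes in UTF-8
print(len(emoji.encode("utf-8")))          # 4
print([hex(u) for u in utf16_units(emoji)])  # ['0xd83d', '0xde00']
print(fits_ucs2("é"), fits_ucs2(emoji))    # True False
```

Either policy loses information, which is the trade-off raised in the note above.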

dhx 2009-06-20 06:36 reporter ~0022204	Also relevant to this discussion: Description of storing UTF-8 data in SQL Server http://support.microsoft.com/kb/232580

grangeway 2009-06-20 07:13 reporter ~0022205	We'll probably consider moving to the MS SQL driver as an option when they release 1.1, as that will support UTF-8, I believe.

dhx 2009-06-20 07:23 reporter ~0022207	Link for their site: http://blogs.msdn.com/sqlphp/ I'm not sure what they'll do with surrogate pairs/four-byte UTF-8 characters. Any info on that?

dhx 2009-06-20 07:29 reporter ~0022208 Last edited: 2009-06-20 07:33	Can we use iconv in PHP as a wrapper around queries sent to the server, text returned from the database, etc? As per http://msdn.microsoft.com/en-us/library/cc626307%28SQL.90%29.aspx Update: this seems buggy as per http://au.php.net/manual/en/function.iconv.php (but there are some workarounds included)
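
As a rough picture of the wrapper idea, the sketch below converts at the database boundary in both directions. It is written in Python rather than PHP and uses the built-in codecs instead of iconv, so it only illustrates the shape of the approach. The function names are hypothetical, and UTF-16LE is assumed as the wire encoding in place of strict UCS-2 so that 4-byte UTF-8 characters survive the round trip.

```python
def utf8_to_db(raw: bytes) -> bytes:
    """Convert UTF-8 bytes from the application into UTF-16LE for the database."""
    return raw.decode("utf-8").encode("utf-16-le")


def db_to_utf8(raw: bytes) -> bytes:
    """Convert UTF-16LE bytes read from the database back into UTF-8."""
    return raw.decode("utf-16-le").encode("utf-8")


# Round trip, including a character that needs a surrogate pair.
app_text = "é \U0001F600".encode("utf-8")
stored = utf8_to_db(app_text)
assert db_to_utf8(stored) == app_text
```

Whether a given driver and column type accept the converted bytes unchanged is a separate question that this sketch does not address.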

tomkraw1 2012-08-08 11:25 reporter ~0032510	I used SQL Server 2000 and 2005 with the SQL Server native driver. In both cases the application was sending correct UTF-8 headers and everything was stored in the db with the Polish_CI_AS collation. The data looked ugly in the db, but when retrieved and displayed on the (UTF-8 encoded) web pages it looked fine, without any character conversion. I think we should suggest in the manual using SQL Server 2005 (or newer) with the SQL Server native driver (which supports UTF-8). cheers.

dregad 2012-08-09 03:22 developer ~0032524	As mentioned in the previous note, use of the native driver is recommended with MSSQL 2005 or later. Thanks tomkraw1 for your contribution.

grangeway 2014-05-16 15:00 reporter ~0040364	MantisBT currently supports MySQL and has support for other database engines. The support for other databases is known to be problematic. Having implemented the current database layer in Mantis 10 years ago, I'm currently working on replacing it. If you are interested in using Mantis with non-MySQL databases (for example Oracle, PGSQL or MSSQL) and would be willing to help out testing the new database layer, please drop me an email at paul@mantisforge.org. In the meantime, I'd advise running Mantis with MySQL only to avoid issues. Thanks, Paul