How to find a hidden unicode character using SQL Server

This mini-blog describes how to analyze every character in a unicode text string in order to find hidden characters, variable-byte characters, and unexpected unicode characters. This can be very useful because certain unicode characters cause applications to fail unexpectedly. This code will help you analyze the data and find any problem characters.

DECLARE @nstring NVARCHAR(100)
SET @nstring =N'TeSt'

DECLARE @position INT SET @position = 1

DECLARE @CharList TABLE ( Position INT, UnicodeChar NVARCHAR(1), UnicodeValue INT )

WHILE @position <= DATALENGTH(@nstring) BEGIN INSERT @CharList SELECT @position as Position ,CONVERT(nchar(1),SUBSTRING(@nstring, @position, 1)) as UnicodeChar ,UNICODE(SUBSTRING(@nstring, @position, 1)) as UnicodeValue SET @position = @position + 1 END

SELECT * FROM @CharList

The results of this query will display the unicode value for every character in the string:

Position UnicodeChar UnicodeValue
1 T 84
2 e 101
3 Ș 536
4 t 116

Note the particularly high value highlighted above which shows this character is clearly not a standard ASCII value. 

Evan Schmidt

Published on 07/30/2013

Authored by admin