SQL Server accent-insensitive sorts and string comparisons
In general, SQL Server default collation is SQL_Latin1_General_CP1_CI_AS,
So by default SQL Server database is always insensitive to the case(CI) and sensitive to the accent(AS) of the characters. In sometimes we need to do accent-insensitive comparisons and sorts. To do such operations we can use the
COLLATE clause followed by collation name where collation should be
accent-insensitive (AI). We can check the collation case and accent
information in the name itself. Collation name contains _CS - case-sensitive;
_CI - case-insensitive; _AS - Accent-sensitive;_AI - Accent-insensitive.
In SQL Server, collation can be set at different levels such as
- SQL Server Level
- Database Level
- Column Level.
To check the default collation of the SQL Server
SELECT SERVERPROPERTY('COLLATION') as collation_default
OR
EXECUTE sp_helpsort
To list all collations supported by SQL Server
SELECT name, description FROM fn_helpcollations()
To list all Accent-insensitive collations supported by SQL Server
SELECT name, description FROM fn_helpcollations() where name like '%_AI%';
Accent-insensitive sorts
Reference :
https://dba.stackexchange.com/questions/199853/accent-sensitive-sort/199933#answer-199865
Thanks, Solomon Rutzky
In general, the Unicode collation algorithm (UCA) allows for complex and
multi-level sorting, such as
-
Sorting is not Comparison:
- Determining whether two strings are the same or different is fairly straight forward (given a particular locale/language and set of sensitivities). But determining the order of 2 or more strings can be highly complex.
-
Sorting is done in a series of steps, with each step applied to the entire
string, not character by character:
- Standard: sort base characters (regardless of accent and case differences)
- IF Accent-sensitive, apply accent/diacritic weights
- IF Case-sensitive, apply casing weights
Example
USE [tempdb];
create table names(name varchar(45) collate SQL_Latin1_General_CP1_CI_AS)
insert into names(name) values ('Joel'),('JoEl'),('Joêl')
-- order by name
select name from names order by name
Output
Joel
JoEl
Joêl
-- order by name with accent-insensitive collate select name from names order by name collate SQL_Latin1_General_CP1_CI_AI Output JoEl Joel Joêl
Accent-insensitive search
In sql server query, if you filter the data on column name='Joel' with accent-sensitive collation,then we end up retrieving only Joel records here we missed Joêl record. If we filter the data with accent-insensitive collation then we retrieve all the records matches with Joel.ExampleUSE [tempdb];
create table names(name varchar(45) collate SQL_Latin1_General_CP1_CI_AS) insert into names(name) values ('Joel'),('JoEl'),('Joêl')-- Search by name Joel select * from names where name='Joel' Output Joel JoEl
-- search by name Joel with accent-insensitive collate select * from names where name='Joel' collate SQL_Latin1_General_CP1_CI_AI Output Joel JoEl Joêl
That's all about SQL Server Accent-Insensitive search and sorting. Thanks for your time.Let me know if you have any questions, suggestions, or comments.
0 comments:
Post a Comment