找回密码
 注册

微信扫码登录

使用验证码登录

只需一步,快速开始

胜天工科技销售各种数字电视信号调制卡胜天工科技销售各种数字电视信号码流卡

【游客、新手、注册会员的区别】 【积分策略和会员晋级说明】 【发帖和附件上传规则】 【如何下载感兴趣的资料】 【如何获取梦游币】 【侵权资料处理及免责说明】
查看: 2685|回复: 2

MySQL的字符集和连接校对的概念和理解

 火... [复制链接]
  • TA的每日心情
    开心
    2023-5-10 22:52
  • 签到天数: 191 天

    [LV.7]常住居民III

    发表于 2010-4-16 11:44:15 | 显示全部楼层 |阅读模式
    分享到:
    消息来自- 北京
    本帖最后由 天·地·人 于 2010-4-16 12:04 编辑

    Character Sets and Collations in General

    A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. Let's make the distinction clear with an example of an imaginary character set.

    Suppose that we have an alphabet with four letters: “A”, “B”, “a”, “b”. We give each letter a number: “A” = 0, “B” = 1, “a” = 2, “b” = 3. The letter “A” is a symbol, the number 0 is the encoding for “A”, and the combination of all four letters and their encodings is a character set.

    Suppose that we want to compare two string values, “A” and “B”. The simplest way to do this is to look at the encodings: 0 for “A” and 1 for “B”. Because 0 is less than 1, we say “A” is less than “B”. What we've just done is apply a collation to our character set. The collation is a set of rules (only one rule in this case): “compare the encodings.” We call this simplest of all possible collations a binary collation.

    But what if we want to say that the lowercase and uppercase letters are equivalent? Then we would have at least two rules: (1) treat the lowercase letters “a” and “b” as equivalent to “A” and “B”; (2) then compare the encodings. We call this a case-insensitive collation. It is a little more complex than a binary collation.

    In real life, most character sets have many characters: not just “A” and “B” but whole alphabets, sometimes multiple alphabets or eastern writing systems with thousands of characters, along with many special symbols and punctuation marks. Also in real life, most collations have many rules, not just for whether to distinguish lettercase, but also for whether to distinguish accents, and for multiple-character mappings.

    MySQL can do these things for you:
    • Store strings using a variety of character sets
    • Compare strings using a variety of collations
    • Mix strings with different character sets or collations in the same server, the same database, or even the same table
    • Allow specification of character set and collation at any level

    In these respects, MySQL is far ahead of most other database management systems. However, to use these features effectively, you need to know what character sets and collations are available, how to change the defaults, and how they affect the behavior of string operators and functions.
  • TA的每日心情
    开心
    2023-5-10 22:52
  • 签到天数: 191 天

    [LV.7]常住居民III

     楼主| 发表于 2010-4-16 11:51:40 | 显示全部楼层
    消息来自- 北京
    本帖最后由 天·地·人 于 2010-4-16 12:18 编辑

    Character Sets and Collations in MySQL

    The MySQL server can support multiple character sets. To list the available character sets, use the SHOW CHARACTER SET statement. A partial listing follows.
    1.jpg
    Any given character set always has at least one collation. It may have several collations. To list the collations for a character set, use the SHOW COLLATION statement. For example, to see the collations for the latin1 (cp1252 West European) character set, use this statement to find those collation names that begin with latin1:
    2.jpg
    The latin1 collations have the following meanings.
    3.jpg

    Collations have these general characteristics:
    • Two different character sets cannot have the same collation.
    • Each character set has one collation that is the default collation. For example, the default collation for latin1 is latin1_swedish_ci. The output for SHOW CHARACTER SET indicates which collation is the default for each displayed character set.
    • There is a convention for collation names: They start with the name of the character set with which they are associated, they usually include a language name, and they end with _ci (case insensitive), _cs (case sensitive), or _bin (binary).
    In cases where a character set has multiple collations, it might not be clear which collation is most suitable for a given application. To avoid choosing the wrong collation, it can be helpful to perform some comparisons with representative data values to make sure that a given collation sorts values the way you expect.
    回复

    使用道具 举报

  • TA的每日心情
    开心
    2026-4-27 11:48
  • 签到天数: 4186 天

    [LV.Master]伴坛终老

    发表于 2010-4-17 16:48:31 | 显示全部楼层
    消息来自- 北京海淀
    字符集和校对规则有4个级别的默认设置:服务器级、数据库级、表级和连接级。


    数据库中关于字符集的种类有很多,对编程有影响的主要是客户端字符集和数据库字符集。

    数据库中常用的操作就是保存数据和读取数据,在这过程中,乱不乱码和数据库字符集貌似没有什么关系。我们只要保证写入时选择的字符集和读取时选择的字符集一致,即只需保证两次操作的客户端字符集一致即可。

    在写入时MySQL会将客户端指定的字符集转换成数据库字符集存入数据文件,读取时又将数据库字符集转换成客户端指定的字符集展示给客户端,把客户端字符集和数据库字符设置一致,显而易见的好处是免掉转换的性能损耗;另外,如果考虑到以后数据库的迁移,将数据库字符集设置为大多数数据库都支持的字符集会省掉很大麻烦。
    回复

    使用道具 举报

    您需要登录后才可以回帖 登录 | 注册

    本版积分规则

    QQ|Archiver|手机版|小黑屋|数字电视开发网 ( 京ICP备16008897号-5 )

    GMT+8, 2026-5-14 08:15 , Processed in 0.114015 second(s), 25 queries , Gzip On.

    Powered by Discuz! X3.5

    © 2001-2026 Discuz! Team.

    快速回复 返回顶部 返回列表