Method Sql.mysql()->set_unicode_encode_mode()
- Method
set_unicode_encode_mode
bool set_unicode_encode_mode(int enable)
- Description
Enables or disables unicode encode mode.
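As a rough illustration of the rule this mode applies (eight-bit strings go out as latin1, wide strings as utf8), here is a minimal Python sketch. It is not the Pike implementation: `pick_client_charset` and `encode_query` are hypothetical names, and the sketch models neither the SET character_set_client statements nor the special handling of literals with introducers.

```python
from typing import Tuple


def pick_client_charset(query: str) -> str:
    """Return the charset label the rule would pick for this query (hypothetical helper)."""
    # "Eight-bit" here means every character fits in one byte (code point <= 0xFF).
    return "latin1" if all(ord(ch) <= 0xFF for ch in query) else "utf8"


def encode_query(query: str) -> Tuple[str, bytes]:
    """Encode the query as the rule describes; returns (charset, bytes)."""
    charset = pick_client_charset(query)
    if charset == "latin1":
        # One byte per character, byte values passed through unchanged.
        return charset, query.encode("latin-1")
    # Wide string: at least one character above 0xFF, so encode as UTF-8.
    return charset, query.encode("utf-8")


if __name__ == "__main__":
    print(encode_query("SELECT 'caf\u00e9'"))  # narrow: all characters <= 0xFF
    print(encode_query("SELECT '\u20ac'"))     # wide: contains U+20AC
```

The decision is made per query, which is why a charset switch on the server side may be needed between a narrow query and a wide one.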
In this mode, if the server supports UTF-8 and the connection charset is latin1 (the default) or unicode, then big_query handles wide unicode queries. This mode is enabled by default.
Unicode encode mode works as follows: Eight-bit strings are sent as latin1 and wide strings are sent using utf8. big_query sends SET character_set_client statements as necessary to update the charset on the server side. If the server doesn't support that, the statement fails, but the wide string query would have failed anyway.
To make this transparent, string literals with introducers (e.g. _binary 'foo') are excluded from the UTF-8 encoding. This means that big_query needs to do some superficial parsing of the query when it is a wide string.
- Returns
1: Unicode encode mode is enabled.
0: Unicode encode mode couldn't be enabled because an incompatible connection charset is set. You need to do set_charset("latin1") or set_charset("unicode") to enable it.
- Note
Note that this mode doesn't affect the MySQL system variable character_set_connection, i.e. it will still be set to latin1 by default, which means server functions like UPPER() won't handle non-latin1 characters correctly in all cases.
To fix that, do set_charset("unicode"). That will allow unicode encode mode to work while utf8 is fully enabled at the server side.
Tip: If you enable utf8 on the server side, you need to send raw binary strings as _binary'...'. Otherwise they will get UTF-8 encoded by the server.
- Note
When unicode encode mode is enabled and the connection charset is latin1, the charset accepted by big_query is not quite Unicode since latin1 is based on cp1252. The differences are in the range 0x80..0x9f where Unicode has control chars.
This small discrepancy is not present when the connection charset is unicode.
- See also