Method Sql.mysql()->set_unicode_encode_mode()
- Method
set_unicode_encode_mode
bool set_unicode_encode_mode(int enable)
- Description
Enables or disables unicode encode mode.
In this mode, if the server supports UTF-8 and the connection charset is latin1 (the default) or unicode, then big_query handles wide unicode queries. Enabled by default.
Unicode encode mode works as follows: Eight bit strings are sent as latin1 and wide strings are sent using utf8. big_query sends SET character_set_client statements as necessary to update the charset on the server side. If the server doesn't support that then it fails, but the wide string query would fail anyway.
To make this transparent, string literals with introducers (e.g. _binary 'foo') are excluded from the UTF-8 encoding. This means that big_query needs to do some superficial parsing of the query when it is a wide string.
- Returns
1
Unicode encode mode is enabled.
0
Unicode encode mode couldn't be enabled because an incompatible connection charset is set. You need to do set_charset("latin1") or set_charset("unicode") to enable it.
- Note
Note that this mode doesn't affect the MySQL system variable character_set_connection, i.e. it will still be set to latin1 by default, which means server functions like UPPER() won't handle non-latin1 characters correctly in all cases.
To fix that, do set_charset("unicode"). That will allow unicode encode mode to work while utf8 is fully enabled at the server side.
Tip: If you enable utf8 on the server side, you need to send raw binary strings as _binary'...'. Otherwise they will get UTF-8 encoded by the server.
- Note
When unicode encode mode is enabled and the connection charset is latin1, the charset accepted by big_query is not quite Unicode since latin1 is based on cp1252. The differences are in the range 0x80..0x9f where Unicode has control chars.
This small discrepancy is not present when the connection charset is unicode.
- See also
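- Example
A minimal sketch of how the return value described above might be checked before sending a wide string query. The host, database name, credentials and the table "notes" are hypothetical placeholders. The sketch assumes a reachable MySQL server with UTF-8 support, and is an illustration rather than a recommended pattern.

```pike
// Sketch only: the connection parameters and the table "notes" are
// hypothetical and not taken from the documentation.
int main()
{
  Sql.mysql db = Sql.mysql("localhost", "testdb", "user", "secret");

  // A return value of 0 means the current connection charset is
  // incompatible with unicode encode mode.
  if (!db->set_unicode_encode_mode(1)) {
    // Switch to a compatible charset and try again.
    db->set_charset("unicode");
    if (!db->set_unicode_encode_mode(1))
      error("Could not enable unicode encode mode.\n");
  }

  // Eight bit string: every character is below 256.
  db->big_query("INSERT INTO notes (body) VALUES ('caf\xe9')");

  // Wide string: U+20AC does not fit in eight bits, so this query
  // relies on unicode encode mode.
  db->big_query("INSERT INTO notes (body) VALUES ('\u20ac 5')");

  return 0;
}
```

With the default latin1 connection charset the first call is expected to succeed, so the fallback branch only matters when another charset was set earlier.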