MariaDB-server/sql/sql_locale.h

101 lines
3.2 KiB
C
Raw Permalink Normal View History

2011-06-30 17:46:53 +02:00
/* Copyright (c) 2006, 2010, Oracle and/or its affiliates. All rights reserved.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA */
#ifndef SQL_LOCALE_INCLUDED
#define SQL_LOCALE_INCLUDED
typedef struct my_locale_errmsgs
{
const char *language;
const char ***errmsgs;
} MY_LOCALE_ERRMSGS;
typedef struct st_typelib TYPELIB;
class MY_LOCALE
{
public:
uint number;
MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp() This patch also fixes: MDEV-33050 Build-in schemas like oracle_schema are accent insensitive MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0 MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0 MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0 MDEV-33088 Cannot create triggers in the database `MYSQL` MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0 MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0 MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0 MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0 - Removing the virtual function strnncoll() from MY_COLLATION_HANDLER - Adding a wrapper function CHARSET_INFO::streq(), to compare two strings for equality. For now it calls strnncoll() internally. In the future it will turn into a virtual function. - Adding new accent sensitive case insensitive collations: - utf8mb4_general1400_as_ci - utf8mb3_general1400_as_ci They implement accent sensitive case insensitive comparison. The weight of a character is equal to the code point of its upper case variant. These collations use Unicode-14.0.0 casefolding data. The result of my_charset_utf8mb3_general1400_as_ci.strcoll() is very close to the former my_charset_utf8mb3_general_ci.strcasecmp() There is only a difference in a couple dozen rare characters, because: - the switch from "tolower" to "toupper" comparison, to make utf8mb3_general1400_as_ci closer to utf8mb3_general_ci - the switch from Unicode-3.0.0 to Unicode-14.0.0 This difference should be tolarable. See the list of affected characters in the MDEV description. Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters! Unlike utf8mb4_general_ci, it does not treat all BMP characters as equal. - Adding classes representing names of the file based database objects: Lex_ident_db Lex_ident_table Lex_ident_trigger Their comparison collation depends on the underlying file system case sensitivity and on --lower-case-table-names and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci. - Adding classes representing names of other database objects, whose names have case insensitive comparison style, using my_charset_utf8mb3_general1400_as_ci: Lex_ident_column Lex_ident_sys_var Lex_ident_user_var Lex_ident_sp_var Lex_ident_ps Lex_ident_i_s_table Lex_ident_window Lex_ident_func Lex_ident_partition Lex_ident_with_element Lex_ident_rpl_filter Lex_ident_master_info Lex_ident_host Lex_ident_locale Lex_ident_plugin Lex_ident_engine Lex_ident_server Lex_ident_savepoint Lex_ident_charset engine_option_value::Name - All the mentioned Lex_ident_xxx classes implement a method streq(): if (ident1.streq(ident2)) do_equal(); This method works as a wrapper for CHARSET_INFO::streq(). - Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name" in class members and in function/method parameters. - Replacing all calls like system_charset_info->coll->strcasecmp(ident1, ident2) to ident1.streq(ident2) - Taking advantage of the c++11 user defined literal operator for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h) data types. Use example: const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column; is now a shorter version of: const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
2023-04-26 15:27:01 +04:00
const Lex_ident_locale name;
const char *description;
const bool is_ascii;
TYPELIB *month_names;
TYPELIB *ab_month_names;
TYPELIB *day_names;
TYPELIB *ab_day_names;
uint max_month_name_length;
uint max_day_name_length;
uint decimal_point;
uint thousand_sep;
const char *grouping;
MY_LOCALE_ERRMSGS *errmsgs;
MY_LOCALE(uint number_par,
MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp() This patch also fixes: MDEV-33050 Build-in schemas like oracle_schema are accent insensitive MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0 MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0 MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0 MDEV-33088 Cannot create triggers in the database `MYSQL` MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0 MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0 MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0 MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0 - Removing the virtual function strnncoll() from MY_COLLATION_HANDLER - Adding a wrapper function CHARSET_INFO::streq(), to compare two strings for equality. For now it calls strnncoll() internally. In the future it will turn into a virtual function. - Adding new accent sensitive case insensitive collations: - utf8mb4_general1400_as_ci - utf8mb3_general1400_as_ci They implement accent sensitive case insensitive comparison. The weight of a character is equal to the code point of its upper case variant. These collations use Unicode-14.0.0 casefolding data. The result of my_charset_utf8mb3_general1400_as_ci.strcoll() is very close to the former my_charset_utf8mb3_general_ci.strcasecmp() There is only a difference in a couple dozen rare characters, because: - the switch from "tolower" to "toupper" comparison, to make utf8mb3_general1400_as_ci closer to utf8mb3_general_ci - the switch from Unicode-3.0.0 to Unicode-14.0.0 This difference should be tolarable. See the list of affected characters in the MDEV description. Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters! Unlike utf8mb4_general_ci, it does not treat all BMP characters as equal. - Adding classes representing names of the file based database objects: Lex_ident_db Lex_ident_table Lex_ident_trigger Their comparison collation depends on the underlying file system case sensitivity and on --lower-case-table-names and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci. - Adding classes representing names of other database objects, whose names have case insensitive comparison style, using my_charset_utf8mb3_general1400_as_ci: Lex_ident_column Lex_ident_sys_var Lex_ident_user_var Lex_ident_sp_var Lex_ident_ps Lex_ident_i_s_table Lex_ident_window Lex_ident_func Lex_ident_partition Lex_ident_with_element Lex_ident_rpl_filter Lex_ident_master_info Lex_ident_host Lex_ident_locale Lex_ident_plugin Lex_ident_engine Lex_ident_server Lex_ident_savepoint Lex_ident_charset engine_option_value::Name - All the mentioned Lex_ident_xxx classes implement a method streq(): if (ident1.streq(ident2)) do_equal(); This method works as a wrapper for CHARSET_INFO::streq(). - Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name" in class members and in function/method parameters. - Replacing all calls like system_charset_info->coll->strcasecmp(ident1, ident2) to ident1.streq(ident2) - Taking advantage of the c++11 user defined literal operator for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h) data types. Use example: const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column; is now a shorter version of: const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
2023-04-26 15:27:01 +04:00
const Lex_ident_locale &name_par,
const char *descr_par, bool is_ascii_par,
TYPELIB *month_names_par, TYPELIB *ab_month_names_par,
TYPELIB *day_names_par, TYPELIB *ab_day_names_par,
uint max_month_name_length_par, uint max_day_name_length_par,
uint decimal_point_par, uint thousand_sep_par,
const char *grouping_par, MY_LOCALE_ERRMSGS *errmsgs_par) :
number(number_par),
name(name_par), description(descr_par), is_ascii(is_ascii_par),
month_names(month_names_par), ab_month_names(ab_month_names_par),
day_names(day_names_par), ab_day_names(ab_day_names_par),
max_month_name_length(max_month_name_length_par),
max_day_name_length(max_day_name_length_par),
decimal_point(decimal_point_par),
thousand_sep(thousand_sep_par),
grouping(grouping_par),
errmsgs(errmsgs_par)
{}
my_repertoire_t repertoire() const
{ return is_ascii ? MY_REPERTOIRE_ASCII : MY_REPERTOIRE_EXTENDED; }
MDEV-36216 TO_CHAR FM format not recognized in SQL_MODE=Oracle Adding support for the "FM" format in function TO_CHAR(date_time, fmt). "FM" in the format string disables padding of all components following it. So now TO_CHAR() works as follows: - By default string format components DAY (weekday name) and MONTH (month name) are right-padded with spaces to the maximum possible DAY and MONTH name lengths respectively, according to the current locale specified in @@lc_time_names. So for example, with lc_time_names='en_US' all month names are right-padded with spaces up to 9 characters ('September' is the longest). SET lc_time_names='en_US'; SELECT TO_CHAR('0001-02-03', 'MONTH'); -> 'February ' (padded to 9 chars) NEW: When typed after FM, DAY and MONTH names are not right-padded with trailing spaces any more: SET lc_time_names='en_US'; SELECT TO_CHAR('0001-02-03', 'FMMONTH'); -> 'February' (not padded) - By default numeric components YYYY, YYY, YY, Y, DD, H12, H24, MI, SS are left-padded with leading digits '0' up to the maximum possible number of digits in the component (e.g. 4 for YYYY): SELECT TO_CHAR('0001-02-03', 'YYYY'); -> '0001' (padded to 4 chars) NEW: When typed after FM, these numeric components are not left-padded with leading zeros any more: SELECT TO_CHAR('0001-02-03', 'FMYYYY'); -> '1' (not padded) - If FM is specified multiple times in a format string, every FM negates the previous padding state: * an odd FM disables padding * an even FM enables padding Implementation details: - Adding a helper class Date_time_format_oracle. - Adding a helper method Date_time_format_oracle::append_lex_cstring() - Moving the function append_val() to Date_time_format_oracle as a method. - Moving the function make_date_time_oracle() to Date_time_format_oracle as a method format(). - Adding helper methods month_name() and day_name() in class MY_LOCALE, to return the corresponding components as LEX_CSTRINGs.
2025-04-07 17:17:44 +04:00
/*
Get a non-abbreviated month name by index
@param month - the month index 0..11
*/
LEX_CSTRING month_name(uint month) const
{
if (month > 11)
return Lex_cstring("##", 2);
return Lex_cstring_strlen(month_names->type_names[month]);
}
/*
Get a non-abbreviated weekday name by index
@param weekday - the weekday index 0..6
*/
LEX_CSTRING day_name(uint weekday) const
{
if (weekday > 6)
return Lex_cstring("##", 2);
return Lex_cstring_strlen(day_names->type_names[weekday]);
}
};
/* Exported variables */
extern MY_LOCALE my_locale_en_US;
MDEV-22214 mariadbd.exe calls function mysqld.exe, and crashes Stop linking plugins to the server executable on Windows. Instead, extract whole server functionality into a large DLL, called server.dll. Link both plugins, and small server "stub" exe to it. This eliminates plugin dependency on the name of the server executable. It also reduces the size of the packages (since tiny mysqld.exe and mariadbd.exe are now both linked to one big DLL) Also, simplify the functionality of exporing all symbols from selected static libraries. Rely on WINDOWS_EXPORT_ALL_SYMBOLS, rather than old self-backed solution. fix compile error replace GetProcAddress(GetModuleHandle(NULL), "variable_name") for server exported data with actual variable names. Runtime loading was never required,was error prone , since symbols could be missing at runtime, and now it actually failed, because we do not export symbols from executable anymore, but from a shared library This did require a MYSQL_PLUGIN_IMPORT decoration for the plugin, but made the code more straightforward, and avoids missing symbols at runtime (as mentioned before). The audit plugin is still doing some dynamic loading, as it aims to work cross-version. Now it won't work cross-version on Windows, as it already uses some symbols that are *not* dynamically loaded, e.g fn_format and those symbols now exported from server.dll , when earlier they were exported by mysqld.exe Windows, fixes for storage engine plugin loading after various rebranding stuff Create server.dll containing functionality of the whole server make mariadbd.exe/mysqld.exe a stub that is only calling mysqld_main() fix build
2020-04-10 14:09:18 +02:00
extern MYSQL_PLUGIN_IMPORT MY_LOCALE *my_locales[];
extern MY_LOCALE *my_default_lc_messages;
extern MY_LOCALE *my_default_lc_time_names;
/* Exported functions */
MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp() This patch also fixes: MDEV-33050 Build-in schemas like oracle_schema are accent insensitive MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0 MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0 MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0 MDEV-33088 Cannot create triggers in the database `MYSQL` MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0 MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0 MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0 MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0 - Removing the virtual function strnncoll() from MY_COLLATION_HANDLER - Adding a wrapper function CHARSET_INFO::streq(), to compare two strings for equality. For now it calls strnncoll() internally. In the future it will turn into a virtual function. - Adding new accent sensitive case insensitive collations: - utf8mb4_general1400_as_ci - utf8mb3_general1400_as_ci They implement accent sensitive case insensitive comparison. The weight of a character is equal to the code point of its upper case variant. These collations use Unicode-14.0.0 casefolding data. The result of my_charset_utf8mb3_general1400_as_ci.strcoll() is very close to the former my_charset_utf8mb3_general_ci.strcasecmp() There is only a difference in a couple dozen rare characters, because: - the switch from "tolower" to "toupper" comparison, to make utf8mb3_general1400_as_ci closer to utf8mb3_general_ci - the switch from Unicode-3.0.0 to Unicode-14.0.0 This difference should be tolarable. See the list of affected characters in the MDEV description. Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters! Unlike utf8mb4_general_ci, it does not treat all BMP characters as equal. - Adding classes representing names of the file based database objects: Lex_ident_db Lex_ident_table Lex_ident_trigger Their comparison collation depends on the underlying file system case sensitivity and on --lower-case-table-names and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci. - Adding classes representing names of other database objects, whose names have case insensitive comparison style, using my_charset_utf8mb3_general1400_as_ci: Lex_ident_column Lex_ident_sys_var Lex_ident_user_var Lex_ident_sp_var Lex_ident_ps Lex_ident_i_s_table Lex_ident_window Lex_ident_func Lex_ident_partition Lex_ident_with_element Lex_ident_rpl_filter Lex_ident_master_info Lex_ident_host Lex_ident_locale Lex_ident_plugin Lex_ident_engine Lex_ident_server Lex_ident_savepoint Lex_ident_charset engine_option_value::Name - All the mentioned Lex_ident_xxx classes implement a method streq(): if (ident1.streq(ident2)) do_equal(); This method works as a wrapper for CHARSET_INFO::streq(). - Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name" in class members and in function/method parameters. - Replacing all calls like system_charset_info->coll->strcasecmp(ident1, ident2) to ident1.streq(ident2) - Taking advantage of the c++11 user defined literal operator for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h) data types. Use example: const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column; is now a shorter version of: const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
2023-04-26 15:27:01 +04:00
MY_LOCALE *my_locale_by_name(const LEX_CSTRING &name);
MY_LOCALE *my_locale_by_number(uint number);
void cleanup_errmsgs(void);
#endif /* SQL_LOCALE_INCLUDED */