Victor Stinner
03c3e35d42
Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings:
...
cp037, cp500 and iso8859_1 codecs
2013-04-09 21:53:09 +02:00
Victor Stinner
cd777eaf53
Issue #17615 : Comparing two Unicode strings now uses wmemcmp() when possible
...
wmemcmp() is twice faster than a dummy loop (342 usec vs 744 usec) on Fedora
18/x86_64, GCC 4.7.2.
2013-04-08 22:43:44 +02:00
Victor Stinner
c1302bba4c
Issue #17615 : Expand expensive PyUnicode_READ() macro in unicode_compare():
...
write specialized functions for each combination of Unicode kinds.
2013-04-08 21:50:54 +02:00
Victor Stinner
207dd38726
fix unused variable
2013-04-03 03:14:58 +02:00
Victor Stinner
eb4b5ac8af
Close #16757 : Avoid calling the expensive _PyUnicode_FindMaxChar() function
...
when possible
2013-04-03 02:02:33 +02:00
Victor Stinner
cfc4c13b04
Add _PyUnicodeWriter_WriteSubstring() function
...
Write a function to enable more optimizations:
* If the substring is the whole string and overallocation is disabled, just
keep a reference to the string, don't copy characters
* Avoid a call to the expensive _PyUnicode_FindMaxChar() function when
possible
2013-04-03 01:48:39 +02:00
Raymond Hettinger
51612fd803
merge
2013-03-23 08:21:52 -07:00
Raymond Hettinger
378170d5d9
Issue 17447: Clarify that str.isidentifier doesn't check for reserved keywords.
2013-03-23 08:21:12 -07:00
Victor Stinner
fb84b5d48d
(Merge 3.3) _PyUnicode_Writer() now also reuses Unicode singletons:
...
empty string and latin1 single character
2013-03-06 19:29:09 +01:00
Victor Stinner
2cb16aa3cb
_PyUnicode_Writer() now also reuses Unicode singletons:
...
empty string and latin1 single character
2013-03-06 19:28:37 +01:00
Victor Stinner
cf77da9fb5
Backed out changeset b9f7b1bf36aa
2013-03-06 01:09:24 +01:00
Victor Stinner
313cac88c5
Issue #17223 : Fix PyUnicode_FromUnicode() on Windows (16-bit wchar_t type)
...
to reject invalid UTF-16 surrogate.
2013-03-06 00:41:50 +01:00
Victor Stinner
36025478bf
(Merge 3.3) Issue #17223 : Fix PyUnicode_FromUnicode() for string of 1 character
...
outside the range U+0000-U+10ffff.
2013-02-26 00:16:57 +01:00
Victor Stinner
d21b58c05d
Issue #17223 : Fix PyUnicode_FromUnicode() for string of 1 character outside
...
the range U+0000-U+10ffff.
2013-02-26 00:15:54 +01:00
Victor Stinner
cfd2c1b4cc
(Merge 3.3) Issue #17137 : When an Unicode string is resized, the internal wide
...
character string (wstr) format is now cleared.
2013-02-07 23:17:34 +01:00
Victor Stinner
bbbac2ec34
Issue #17137 : When an Unicode string is resized, the internal wide character
...
string (wstr) format is now cleared.
2013-02-07 23:12:46 +01:00
Serhiy Storchaka
d0c79dcda5
Issue #17043 : The unicode-internal decoder no longer read past the end of
...
input buffer.
2013-02-07 16:26:55 +02:00
Serhiy Storchaka
03ee12ed72
Issue #17043 : The unicode-internal decoder no longer read past the end of
...
input buffer.
2013-02-07 16:25:25 +02:00
Serhiy Storchaka
3fd4ab356d
Issue #17043 : The unicode-internal decoder no longer read past the end of
...
input buffer.
2013-02-07 16:23:21 +02:00
Serhiy Storchaka
2aee6a6460
Issue #16971 : Fix a refleak in the charmap decoder.
2013-01-29 12:16:57 +02:00
Serhiy Storchaka
afb1cb5579
Issue #16971 : Fix a refleak in the charmap decoder.
2013-01-29 12:13:22 +02:00
Serhiy Storchaka
8fe5a9f9c3
Issue #16979 : Fix error handling bugs in the unicode-escape-decode decoder.
2013-01-29 10:37:39 +02:00
Serhiy Storchaka
24193debd4
Issue #16979 : Fix error handling bugs in the unicode-escape-decode decoder.
2013-01-29 10:28:07 +02:00
Serhiy Storchaka
d679377be7
Issue #16979 : Fix error handling bugs in the unicode-escape-decode decoder.
2013-01-29 10:20:44 +02:00
Serhiy Storchaka
ed3c4128c0
Issue #10156 : In the interpreter's initialization phase, unicode globals
...
are now initialized dynamically as needed.
2013-01-26 12:18:17 +02:00
Serhiy Storchaka
678db84b37
Issue #10156 : In the interpreter's initialization phase, unicode globals
...
are now initialized dynamically as needed.
2013-01-26 12:16:36 +02:00
Serhiy Storchaka
059972535f
Issue #10156 : In the interpreter's initialization phase, unicode globals
...
are now initialized dynamically as needed.
2013-01-26 12:14:02 +02:00
Serhiy Storchaka
570c5b2354
Issue #16980 : Fix processing of escaped non-ascii bytes in the
...
unicode-escape-decode decoder.
2013-01-25 23:53:29 +02:00
Serhiy Storchaka
73e38809e0
Issue #16980 : Fix processing of escaped non-ascii bytes in the
...
unicode-escape-decode decoder.
2013-01-25 23:52:21 +02:00
Serhiy Storchaka
6481bfb2b5
Issue #16335 : Fix integer overflow in unicode-escape decoder.
2013-01-21 11:44:40 +02:00
Serhiy Storchaka
c35f3a9f61
Issue #16335 : Fix integer overflow in unicode-escape decoder.
2013-01-21 11:42:57 +02:00
Serhiy Storchaka
4f5f0e54e0
Issue #16335 : Fix integer overflow in unicode-escape decoder.
2013-01-21 11:38:00 +02:00
Serhiy Storchaka
441d30fac7
Issue #15989 : Fix several occurrences of integer overflow
...
when result of PyLong_AsLong() narrowed to int without checks.
This is a backport of changesets 13e2e44db99d and 525407d89277.
2013-01-19 12:26:26 +02:00
Serhiy Storchaka
9101e23ff6
Issue #15989 : Fix several occurrences of integer overflow
...
when result of PyLong_AsLong() narrowed to int without checks.
This is a backport of changesets 13e2e44db99d and 525407d89277.
2013-01-19 12:41:45 +02:00
Serhiy Storchaka
55e2cb497b
Issue #14850 : Now a chamap decoder treates U+FFFE as "undefined mapping"
...
in any mapping, not only in an unicode string.
2013-01-15 15:30:04 +02:00
Serhiy Storchaka
45d16d9924
Issue #14850 : Now a chamap decoder treates U+FFFE as "undefined mapping"
...
in any mapping, not only in an unicode string.
2013-01-15 15:01:20 +02:00
Serhiy Storchaka
4fb8caee87
Issue #14850 : Now a chamap decoder treates U+FFFE as "undefined mapping"
...
in any mapping, not only in an unicode string.
2013-01-15 14:43:21 +02:00
Serhiy Storchaka
7898043868
Issue #15989 : Fix several occurrences of integer overflow
...
when result of PyLong_AsLong() narrowed to int without checks.
2013-01-15 01:12:17 +02:00
Benjamin Peterson
0b32a480bd
merge 3.3 ( #16906 )
2013-01-09 09:52:22 -06:00
Benjamin Peterson
0c270a8bb7
correct static string clearing loop ( closes #16906 )
2013-01-09 09:52:01 -06:00
Serhiy Storchaka
24a3ef6999
Issue #11461 : Fix the incremental UTF-16 decoder. Original patch by
...
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:41:55 +02:00
Serhiy Storchaka
ae3b32ad6b
Issue #11461 : Fix the incremental UTF-16 decoder. Original patch by
...
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:40:52 +02:00
Serhiy Storchaka
48e188e573
Issue #11461 : Fix the incremental UTF-16 decoder. Original patch by
...
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:14:24 +02:00
Serhiy Storchaka
dec798eb46
Fix out of bound read in UTF-32 decoder on "narrow Unicode" builds.
2013-01-08 22:45:42 +02:00
Serhiy Storchaka
4e02538bf3
Issue #16856 : Fix a segmentation fault from calling repr() on a dict with
...
a key whose repr raise an exception.
2013-01-04 12:40:35 +02:00
Serhiy Storchaka
6c83e739d7
Issue #16856 : Fix a segmentation fault from calling repr() on a dict with
...
a key whose repr raise an exception.
2013-01-04 12:39:34 +02:00
Victor Stinner
18aa4477d3
Close #16281 : handle tailmatch() failure and remove useless comment
...
"honor direction and do a forward or backwards search": the runtime speed may
be different, but I consider that it doesn't really matter in practice. The
direction was never honored before: Python 2.7 uses memcmp() for the str type
for example.
2013-01-03 03:18:09 +01:00
Victor Stinner
7ae320d667
(Merge 3.2) Issue #16455 : On FreeBSD and Solaris, if the locale is C, the
...
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
2013-01-03 01:21:07 +01:00
Victor Stinner
20b654acb5
Issue #16455 : On FreeBSD and Solaris, if the locale is C, the
...
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
2013-01-03 01:08:58 +01:00
Andrew Svetlov
2606a6f197
Issue #16719 : Get rid of WindowsError. Use OSError instead
...
Patch by Serhiy Storchaka.
2012-12-19 14:33:35 +02:00