Compare commits

1 commit

Author SHA1 Message Date
Charalampos Stratakis
b82ac11fcb Fix the test suite support for Expat >= 2.4.5
Resolves: rhbz#2056970
2022-03-08 15:49:00 +01:00
54 changed files with 109 additions and 26940 deletions

@@ -1 +0,0 @@
-1

@@ -1,10 +1,9 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Wed, 13 Jan 2010 21:25:18 +0000
-Subject: 00001: Fixup distutils/unixccompiler.py to remove standard library
-path from rpath
+Subject: [PATCH] 00001: Fixup distutils/unixccompiler.py to remove standard
+library path from rpath Was Patch0 in ivazquez' python3000 specfile
Was Patch0 in ivazquez' python3000 specfile
---
Lib/distutils/unixccompiler.py | 9 +++++++++
1 file changed, 9 insertions(+)

@@ -1,8 +1,8 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Wed, 13 Jan 2010 21:25:18 +0000
-Subject: 00102: Change the various install paths to use /usr/lib64/ instead or
-/usr/lib/
+Subject: [PATCH] 00102: Change the various install paths to use /usr/lib64/
+instead or /usr/lib/
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Mon, 18 Jan 2010 17:59:07 +0000
-Subject: 00111: Don't try to build a libpythonMAJOR.MINOR.a
+Subject: [PATCH] 00111: Don't try to build a libpythonMAJOR.MINOR.a
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Fri, 19 Jun 2020 16:54:05 +0200
-Subject: 00132: Add rpmbuild hooks to unittest
+Subject: [PATCH] 00132: Add rpmbuild hooks to unittest
Add non-standard hooks to unittest for use in the "check" phase, when
running selftests within the build:

@@ -1,7 +1,8 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Fri, 19 Jun 2020 16:02:24 +0200
-Subject: 00155: avoid allocating thunks in ctypes unless absolutely necessary
+Subject: [PATCH] 00155: avoid allocating thunks in ctypes unless absolutely
+necessary
Avoid allocating thunks in ctypes unless absolutely necessary, to avoid
generating SELinux denials on "import ctypes" and "import uuid" when

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Fri, 19 Jun 2020 16:57:09 +0200
-Subject: 00160: Disable test_fs_holes in RPM build
+Subject: [PATCH] 00160: Disable test_fs_holes in RPM build
Python 3.3 added os.SEEK_DATA and os.SEEK_HOLE, which may be present in the
header files in the build chroot, but may not be supported in the running

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Fri, 19 Jun 2020 16:58:24 +0200
-Subject: 00163: Disable parts of test_socket in RPM build
+Subject: [PATCH] 00163: Disable parts of test_socket in RPM build
Some tests within test_socket fail intermittently when run inside Koji;
disable them using unittest._skipInRpmBuild

@@ -1,8 +1,8 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Fri, 19 Jun 2020 16:05:07 +0200
-Subject: 00170: In debug builds, try to print repr() when a C-level assert
-fails
+Subject: [PATCH] 00170: In debug builds, try to print repr() when a C-level
+assert fails
In debug builds, try to print repr() when a C-level assert fails in the
garbage collector (typically indicating a reference-counting error

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Miro=20Hron=C4=8Dok?= <miro@hroncok.cz>
Date: Wed, 15 Aug 2018 15:36:29 +0200
-Subject: 00189: Instead of bundled wheels, use our RPM packaged wheels
+Subject: [PATCH] 00189: Instead of bundled wheels, use our RPM packaged wheels
We keep them in /usr/share/python-wheels
---

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Michal Cyprian <m.cyprian@gmail.com>
Date: Mon, 26 Jun 2017 16:32:56 +0200
-Subject: 00251: Change user install location
+Subject: [PATCH] 00251: Change user install location
Set values of prefix and exec_prefix in distutils install command
to /usr/local if executable is /usr/bin/python* and RPM build

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Nick Coghlan <ncoghlan@redhat.com>
Date: Fri, 19 Jun 2020 17:02:52 +0200
-Subject: 00262: PEP538 - Coerce legacy C locale
+Subject: [PATCH] 00262: PEP538 - Coerce legacy C locale
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Charalampos Stratakis <cstratak@redhat.com>
Date: Fri, 19 Jun 2020 17:06:08 +0200
-Subject: 00292: Restore PyExc_RecursionErrorInst symbol
+Subject: [PATCH] 00292: Restore PyExc_RecursionErrorInst symbol
Restore the public PyExc_RecursionErrorInst symbol that was removed
from the 3.6.4 release upstream.

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Christian Heimes <christian@python.org>
Date: Fri, 19 Jun 2020 17:13:03 +0200
-Subject: 00294: Define TLS cipher suite on build time
+Subject: [PATCH] 00294: Define TLS cipher suite on build time
Define TLS cipher suite on build time depending
on the OpenSSL default cipher suite selection.

@@ -1,72 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Miss Islington (bot)"
<31488909+miss-islington@users.noreply.github.com>
Date: Mon, 21 Jan 2019 01:44:30 -0800
Subject: 00319: test_tarfile_ppc64
Fix sparse file tests of test_tarfile on ppc64le with the tmpfs
filesystem.
Upstream: https://bugs.python.org/issue35772
Co-authored-by: Victor Stinner <vstinner@redhat.com>
---
Lib/test/pythoninfo.py | 2 ++
Lib/test/test_tarfile.py | 9 +++++++--
.../next/Tests/2019-01-18-12-19-19.bpo-35772.sGBbsn.rst | 6 ++++++
3 files changed, 15 insertions(+), 2 deletions(-)
create mode 100644 Misc/NEWS.d/next/Tests/2019-01-18-12-19-19.bpo-35772.sGBbsn.rst
diff --git a/Lib/test/pythoninfo.py b/Lib/test/pythoninfo.py
index c5586b45a5..96b6db1cb7 100644
--- a/Lib/test/pythoninfo.py
+++ b/Lib/test/pythoninfo.py
@@ -515,6 +515,8 @@ def collect_resource(info_add):
value = resource.getrlimit(key)
info_add('resource.%s' % name, value)
+ call_func(info_add, 'resource.pagesize', resource, 'getpagesize')
+
def collect_test_socket(info_add):
try:
diff --git a/Lib/test/test_tarfile.py b/Lib/test/test_tarfile.py
index 573be812ea..8e0b275972 100644
--- a/Lib/test/test_tarfile.py
+++ b/Lib/test/test_tarfile.py
@@ -980,16 +980,21 @@ class GNUReadTest(LongnameTest, ReadTest, unittest.TestCase):
def _fs_supports_holes():
# Return True if the platform knows the st_blocks stat attribute and
# uses st_blocks units of 512 bytes, and if the filesystem is able to
- # store holes in files.
+ # store holes of 4 KiB in files.
+ #
+ # The function returns False if page size is larger than 4 KiB.
+ # For example, ppc64 uses pages of 64 KiB.
if sys.platform.startswith("linux"):
# Linux evidentially has 512 byte st_blocks units.
name = os.path.join(TEMPDIR, "sparse-test")
with open(name, "wb") as fobj:
+ # Seek to "punch a hole" of 4 KiB
fobj.seek(4096)
+ fobj.write(b'x' * 4096)
fobj.truncate()
s = os.stat(name)
support.unlink(name)
- return s.st_blocks == 0
+ return (s.st_blocks * 512 < s.st_size)
else:
return False
diff --git a/Misc/NEWS.d/next/Tests/2019-01-18-12-19-19.bpo-35772.sGBbsn.rst b/Misc/NEWS.d/next/Tests/2019-01-18-12-19-19.bpo-35772.sGBbsn.rst
new file mode 100644
index 0000000000..cfd282f1d0
--- /dev/null
+++ b/Misc/NEWS.d/next/Tests/2019-01-18-12-19-19.bpo-35772.sGBbsn.rst
@@ -0,0 +1,6 @@
+Fix sparse file tests of test_tarfile on ppc64 with the tmpfs filesystem. Fix
+the function testing if the filesystem supports sparse files: create a file
+which contains data and "holes", instead of creating a file which contains no
+data. tmpfs effective block size is a page size (tmpfs lives in the page cache).
+RHEL uses 64 KiB pages on aarch64, ppc64, ppc64le, only s390x and x86_64 use 4
+KiB pages, whereas the test punch holes of 4 KiB.
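
The detection logic in this patch can be tried in isolation. The following is a minimal sketch, not the upstream function: `fs_supports_holes` and its `directory` parameter are names assumed here for illustration (the patch itself defines `_fs_supports_holes()` and uses the test suite's `TEMPDIR`). It writes 4 KiB of data after a 4 KiB hole and reports whether the file occupies fewer bytes on disk than its apparent size.

```python
import os
import sys


def fs_supports_holes(directory):
    """Hypothetical stand-alone variant of the patched check."""
    if not sys.platform.startswith("linux"):
        # The 512-byte st_blocks unit is only assumed on Linux.
        return False
    name = os.path.join(directory, "sparse-test")
    try:
        with open(name, "wb") as fobj:
            fobj.seek(4096)            # leave a 4 KiB hole at the start
            fobj.write(b"x" * 4096)    # followed by 4 KiB of real data
        s = os.stat(name)
    finally:
        if os.path.exists(name):
            os.unlink(name)
    # A hole was stored only if allocated bytes are fewer than the file size.
    return s.st_blocks * 512 < s.st_size
```

On a filesystem whose effective block size is larger than 4 KiB, such as tmpfs on a 64 KiB page kernel, the hole cannot be represented and the function should return False.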

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Victor Stinner <vstinner@python.org>
Date: Fri, 19 Jun 2020 17:16:05 +0200
-Subject: 00343: Fix test_faulthandler on GCC 10
+Subject: [PATCH] 00343: Fix test_faulthandler on GCC 10
bpo-21131: Fix faulthandler.register(chain=True) stack (GH-15276)
https://bugs.python.org/issue21131

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Lumir Balhar <lbalhar@redhat.com>
Date: Tue, 4 Aug 2020 12:04:03 +0200
-Subject: 00353: Original names for architectures with different names
+Subject: [PATCH] 00353: Original names for architectures with different names
downstream
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Inada Naoki <songofacandy@gmail.com>
Date: Mon, 3 Jun 2019 10:51:32 +0900
-Subject: 00358: align allocations and PyGC_Head to 16 bytes on 64-bit
+Subject: [PATCH] 00358: align allocations and PyGC_Head to 16 bytes on 64-bit
platforms
Upstream bug: https://bugs.python.org/issue27987

File diff suppressed because it is too large

@@ -1,33 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Karolina Surma <ksurma@redhat.com>
Date: Mon, 24 Jan 2022 09:28:30 +0100
Subject: 00375: Fix test_distance to enable build on i686
Fix precision in test_distance (test.test_turtle.TestVec2D).
See: https://bugzilla.redhat.com/show_bug.cgi?id=2038843
---
Lib/test/test_turtle.py | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Lib/test/test_turtle.py b/Lib/test/test_turtle.py
index 2fd10ccd50..2d2034ef8a 100644
--- a/Lib/test/test_turtle.py
+++ b/Lib/test/test_turtle.py
@@ -220,7 +220,7 @@ class TestVec2D(VectorComparisonMixin, unittest.TestCase):
def test_distance(self):
vec = Vec2D(6, 8)
expected = 10
- self.assertEqual(abs(vec), expected)
+ self.assertAlmostEqual(abs(vec), expected)
vec = Vec2D(0, 0)
expected = 0
@@ -228,7 +228,7 @@ class TestVec2D(VectorComparisonMixin, unittest.TestCase):
vec = Vec2D(2.5, 6)
expected = 6.5
- self.assertEqual(abs(vec), expected)
+ self.assertAlmostEqual(abs(vec), expected)
def test_rotate(self):
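
The difference between the two assertions can be illustrated with a small sketch. The value `10.000000000000002` is a hypothetical stand-in for a result carrying a rounding error in the last bit; it is not taken from the i686 build logs. `assertAlmostEqual` compares the difference rounded to 7 decimal places by default, so it tolerates such an error where `assertEqual` does not.

```python
import unittest


class DistanceCheck(unittest.TestCase):
    def runTest(self):
        # Hypothetical result that is off by one unit in the last place.
        value = 10.000000000000002
        self.assertNotEqual(value, 10)      # an exact comparison would fail
        self.assertAlmostEqual(value, 10)   # passes: difference rounds to 0


result = unittest.TestResult()
DistanceCheck().run(result)
print(result.wasSuccessful())
```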

@@ -1,7 +1,7 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Sebastian Pipping <sebastian@pipping.org>
Date: Mon, 21 Feb 2022 15:48:32 +0100
-Subject: 00378: Support expat 2.4.5
+Subject: [PATCH] 00378: Support expat 2.4.5
Curly brackets were never allowed in namespace URIs
according to RFC 3986, and so-called namespace-validating

@@ -1,150 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Petr Viktorin <encukou@gmail.com>
Date: Fri, 3 Jun 2022 11:43:35 +0200
Subject: 00382: CVE-2015-20107
Make mailcap refuse to match unsafe filenames/types/params (GH-91993)
Upstream: https://github.com/python/cpython/issues/68966
Tracker bug: https://bugzilla.redhat.com/show_bug.cgi?id=2075390
---
Doc/library/mailcap.rst | 12 +++++++++
Lib/mailcap.py | 26 +++++++++++++++++--
Lib/test/test_mailcap.py | 8 ++++--
...2-04-27-18-25-30.gh-issue-68966.gjS8zs.rst | 4 +++
4 files changed, 46 insertions(+), 4 deletions(-)
create mode 100644 Misc/NEWS.d/next/Security/2022-04-27-18-25-30.gh-issue-68966.gjS8zs.rst
diff --git a/Doc/library/mailcap.rst b/Doc/library/mailcap.rst
index 896afd1d73..849d0bc05f 100644
--- a/Doc/library/mailcap.rst
+++ b/Doc/library/mailcap.rst
@@ -54,6 +54,18 @@ standard. However, mailcap files are supported on most Unix systems.
use) to determine whether or not the mailcap line applies. :func:`findmatch`
will automatically check such conditions and skip the entry if the check fails.
+ .. versionchanged:: 3.11
+
+ To prevent security issues with shell metacharacters (symbols that have
+ special effects in a shell command line), ``findmatch`` will refuse
+ to inject ASCII characters other than alphanumerics and ``@+=:,./-_``
+ into the returned command line.
+
+ If a disallowed character appears in *filename*, ``findmatch`` will always
+ return ``(None, None)`` as if no entry was found.
+ If such a character appears elsewhere (a value in *plist* or in *MIMEtype*),
+ ``findmatch`` will ignore all mailcap entries which use that value.
+ A :mod:`warning <warnings>` will be raised in either case.
.. function:: getcaps()
diff --git a/Lib/mailcap.py b/Lib/mailcap.py
index bd0fc0981c..dcd4b449e8 100644
--- a/Lib/mailcap.py
+++ b/Lib/mailcap.py
@@ -2,6 +2,7 @@
import os
import warnings
+import re
__all__ = ["getcaps","findmatch"]
@@ -13,6 +14,11 @@ def lineno_sort_key(entry):
else:
return 1, 0
+_find_unsafe = re.compile(r'[^\xa1-\U0010FFFF\w@+=:,./-]').search
+
+class UnsafeMailcapInput(Warning):
+ """Warning raised when refusing unsafe input"""
+
# Part 1: top-level interface.
@@ -165,15 +171,22 @@ def findmatch(caps, MIMEtype, key='view', filename="/dev/null", plist=[]):
entry to use.
"""
+ if _find_unsafe(filename):
+ msg = "Refusing to use mailcap with filename %r. Use a safe temporary filename." % (filename,)
+ warnings.warn(msg, UnsafeMailcapInput)
+ return None, None
entries = lookup(caps, MIMEtype, key)
# XXX This code should somehow check for the needsterminal flag.
for e in entries:
if 'test' in e:
test = subst(e['test'], filename, plist)
+ if test is None:
+ continue
if test and os.system(test) != 0:
continue
command = subst(e[key], MIMEtype, filename, plist)
- return command, e
+ if command is not None:
+ return command, e
return None, None
def lookup(caps, MIMEtype, key=None):
@@ -206,6 +219,10 @@ def subst(field, MIMEtype, filename, plist=[]):
elif c == 's':
res = res + filename
elif c == 't':
+ if _find_unsafe(MIMEtype):
+ msg = "Refusing to substitute MIME type %r into a shell command." % (MIMEtype,)
+ warnings.warn(msg, UnsafeMailcapInput)
+ return None
res = res + MIMEtype
elif c == '{':
start = i
@@ -213,7 +230,12 @@ def subst(field, MIMEtype, filename, plist=[]):
i = i+1
name = field[start:i]
i = i+1
- res = res + findparam(name, plist)
+ param = findparam(name, plist)
+ if _find_unsafe(param):
+ msg = "Refusing to substitute parameter %r (%s) into a shell command" % (param, name)
+ warnings.warn(msg, UnsafeMailcapInput)
+ return None
+ res = res + param
# XXX To do:
# %n == number of parts if type is multipart/*
# %F == list of alternating type and filename for parts
diff --git a/Lib/test/test_mailcap.py b/Lib/test/test_mailcap.py
index c08423c670..920283d9a2 100644
--- a/Lib/test/test_mailcap.py
+++ b/Lib/test/test_mailcap.py
@@ -121,7 +121,8 @@ class HelperFunctionTest(unittest.TestCase):
(["", "audio/*", "foo.txt"], ""),
(["echo foo", "audio/*", "foo.txt"], "echo foo"),
(["echo %s", "audio/*", "foo.txt"], "echo foo.txt"),
- (["echo %t", "audio/*", "foo.txt"], "echo audio/*"),
+ (["echo %t", "audio/*", "foo.txt"], None),
+ (["echo %t", "audio/wav", "foo.txt"], "echo audio/wav"),
(["echo \\%t", "audio/*", "foo.txt"], "echo %t"),
(["echo foo", "audio/*", "foo.txt", plist], "echo foo"),
(["echo %{total}", "audio/*", "foo.txt", plist], "echo 3")
@@ -205,7 +206,10 @@ class FindmatchTest(unittest.TestCase):
('"An audio fragment"', audio_basic_entry)),
([c, "audio/*"],
{"filename": fname},
- ("/usr/local/bin/showaudio audio/*", audio_entry)),
+ (None, None)),
+ ([c, "audio/wav"],
+ {"filename": fname},
+ ("/usr/local/bin/showaudio audio/wav", audio_entry)),
([c, "message/external-body"],
{"plist": plist},
("showexternal /dev/null default john python.org /tmp foo bar", message_entry))
diff --git a/Misc/NEWS.d/next/Security/2022-04-27-18-25-30.gh-issue-68966.gjS8zs.rst b/Misc/NEWS.d/next/Security/2022-04-27-18-25-30.gh-issue-68966.gjS8zs.rst
new file mode 100644
index 0000000000..da81a1f699
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2022-04-27-18-25-30.gh-issue-68966.gjS8zs.rst
@@ -0,0 +1,4 @@
+The deprecated mailcap module now refuses to inject unsafe text (filenames,
+MIME types, parameters) into shell commands. Instead of using such text, it
+will warn and act as if a match was not found (or for test commands, as if
+the test failed).
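
The character filter at the centre of this patch can be exercised on its own. In the sketch below the regular expression is copied from the patch, while `is_safe` is a hypothetical helper added only for illustration. The pattern rejects any ASCII character outside alphanumerics and `@+=:,./-_` and lets characters from U+00A1 upward through.

```python
import re

# Copied from the patch: finds the first disallowed character, if any.
_find_unsafe = re.compile(r'[^\xa1-\U0010FFFF\w@+=:,./-]').search


def is_safe(text):
    """Hypothetical helper: True if the text has no disallowed character."""
    return _find_unsafe(text) is None


for sample in ("audio/wav", "audio/*", "foo.txt", "foo; rm -rf ~"):
    print(repr(sample), is_safe(sample))
```

This matches the updated tests in the patch: `audio/wav` is accepted, while `audio/*` is rejected because `*` is a shell metacharacter.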

@@ -1,130 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Miss Islington (bot)"
<31488909+miss-islington@users.noreply.github.com>
Date: Wed, 22 Jun 2022 15:05:00 -0700
Subject: 00386: CVE-2021-28861
Fix an open redirection vulnerability in the `http.server` module when
an URI path starts with `//` that could produce a 301 Location header
with a misleading target. Vulnerability discovered, and logic fix
proposed, by Hamza Avvan (@hamzaavvan).
Test and comments authored by Gregory P. Smith [Google].
(cherry picked from commit 4abab6b603dd38bec1168e9a37c40a48ec89508e)
Upstream: https://github.com/python/cpython/pull/93879
Tracking bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2120642
Co-authored-by: Gregory P. Smith <greg@krypto.org>
---
Lib/http/server.py | 7 +++
Lib/test/test_httpservers.py | 53 ++++++++++++++++++-
...2-06-15-20-09-23.gh-issue-87389.QVaC3f.rst | 3 ++
3 files changed, 61 insertions(+), 2 deletions(-)
create mode 100644 Misc/NEWS.d/next/Security/2022-06-15-20-09-23.gh-issue-87389.QVaC3f.rst
diff --git a/Lib/http/server.py b/Lib/http/server.py
index 60a4dadf03..ce05be13d3 100644
--- a/Lib/http/server.py
+++ b/Lib/http/server.py
@@ -323,6 +323,13 @@ class BaseHTTPRequestHandler(socketserver.StreamRequestHandler):
return False
self.command, self.path, self.request_version = command, path, version
+ # gh-87389: The purpose of replacing '//' with '/' is to protect
+ # against open redirect attacks possibly triggered if the path starts
+ # with '//' because http clients treat //path as an absolute URI
+ # without scheme (similar to http://path) rather than a path.
+ if self.path.startswith('//'):
+ self.path = '/' + self.path.lstrip('/') # Reduce to a single /
+
# Examine the headers and look for a Connection directive.
try:
self.headers = http.client.parse_headers(self.rfile,
diff --git a/Lib/test/test_httpservers.py b/Lib/test/test_httpservers.py
index 66e937e04b..5a0a7c3f74 100644
--- a/Lib/test/test_httpservers.py
+++ b/Lib/test/test_httpservers.py
@@ -324,7 +324,7 @@ class SimpleHTTPServerTestCase(BaseTestCase):
pass
def setUp(self):
- BaseTestCase.setUp(self)
+ super().setUp()
self.cwd = os.getcwd()
basetempdir = tempfile.gettempdir()
os.chdir(basetempdir)
@@ -343,7 +343,7 @@ class SimpleHTTPServerTestCase(BaseTestCase):
except:
pass
finally:
- BaseTestCase.tearDown(self)
+ super().tearDown()
def check_status_and_reason(self, response, status, data=None):
def close_conn():
@@ -399,6 +399,55 @@ class SimpleHTTPServerTestCase(BaseTestCase):
self.check_status_and_reason(response, HTTPStatus.OK,
data=support.TESTFN_UNDECODABLE)
+ def test_get_dir_redirect_location_domain_injection_bug(self):
+ """Ensure //evil.co/..%2f../../X does not put //evil.co/ in Location.
+
+ //netloc/ in a Location header is a redirect to a new host.
+ https://github.com/python/cpython/issues/87389
+
+ This checks that a path resolving to a directory on our server cannot
+ resolve into a redirect to another server.
+ """
+ os.mkdir(os.path.join(self.tempdir, 'existing_directory'))
+ url = f'/python.org/..%2f..%2f..%2f..%2f..%2f../%0a%0d/../{self.tempdir_name}/existing_directory'
+ expected_location = f'{url}/' # /python.org.../ single slash single prefix, trailing slash
+ # Canonicalizes to /tmp/tempdir_name/existing_directory which does
+ # exist and is a dir, triggering the 301 redirect logic.
+ response = self.request(url)
+ self.check_status_and_reason(response, HTTPStatus.MOVED_PERMANENTLY)
+ location = response.getheader('Location')
+ self.assertEqual(location, expected_location, msg='non-attack failed!')
+
+ # //python.org... multi-slash prefix, no trailing slash
+ attack_url = f'/{url}'
+ response = self.request(attack_url)
+ self.check_status_and_reason(response, HTTPStatus.MOVED_PERMANENTLY)
+ location = response.getheader('Location')
+ self.assertFalse(location.startswith('//'), msg=location)
+ self.assertEqual(location, expected_location,
+ msg='Expected Location header to start with a single / and '
+ 'end with a / as this is a directory redirect.')
+
+ # ///python.org... triple-slash prefix, no trailing slash
+ attack3_url = f'//{url}'
+ response = self.request(attack3_url)
+ self.check_status_and_reason(response, HTTPStatus.MOVED_PERMANENTLY)
+ self.assertEqual(response.getheader('Location'), expected_location)
+
+ # If the second word in the http request (Request-URI for the http
+ # method) is a full URI, we don't worry about it, as that'll be parsed
+ # and reassembled as a full URI within BaseHTTPRequestHandler.send_head
+ # so no errant scheme-less //netloc//evil.co/ domain mixup can happen.
+ attack_scheme_netloc_2slash_url = f'https://pypi.org/{url}'
+ expected_scheme_netloc_location = f'{attack_scheme_netloc_2slash_url}/'
+ response = self.request(attack_scheme_netloc_2slash_url)
+ self.check_status_and_reason(response, HTTPStatus.MOVED_PERMANENTLY)
+ location = response.getheader('Location')
+ # We're just ensuring that the scheme and domain make it through, if
+ # there are or aren't multiple slashes at the start of the path that
+ # follows that isn't important in this Location: header.
+ self.assertTrue(location.startswith('https://pypi.org/'), msg=location)
+
def test_get(self):
#constructs the path relative to the root directory of the HTTPServer
response = self.request(self.base_url + '/test')
diff --git a/Misc/NEWS.d/next/Security/2022-06-15-20-09-23.gh-issue-87389.QVaC3f.rst b/Misc/NEWS.d/next/Security/2022-06-15-20-09-23.gh-issue-87389.QVaC3f.rst
new file mode 100644
index 0000000000..029d437190
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2022-06-15-20-09-23.gh-issue-87389.QVaC3f.rst
@@ -0,0 +1,3 @@
+:mod:`http.server`: Fix an open redirection vulnerability in the HTTP server
+when an URI path starts with ``//``. Vulnerability discovered, and initial
+fix proposed, by Hamza Avvan.
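
The normalization added to `BaseHTTPRequestHandler` is small enough to sketch as a free function. `collapse_leading_slashes` is a name assumed here for illustration; the patch performs the same two steps inline on `self.path`.

```python
def collapse_leading_slashes(path):
    """Reduce a run of leading slashes to one, mirroring the patched logic."""
    if path.startswith('//'):
        # '//evil.co/x' would otherwise be read by clients as a
        # scheme-less absolute URI pointing at host 'evil.co'.
        path = '/' + path.lstrip('/')
    return path


print(collapse_leading_slashes('//evil.co/x'))
print(collapse_leading_slashes('/a//b'))
```

Only the leading run is touched: slashes later in the path and full URIs with a scheme are left unchanged, which is consistent with the last case in the new test.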

File diff suppressed because it is too large

@@ -1,98 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Theo Buehler <botovq@users.noreply.github.com>
Date: Fri, 21 Oct 2022 20:37:54 -0700
Subject: 00392: CVE-2022-37454: Fix buffer overflows in _sha3 module
This is a port of the applicable part of XKCP's fix [1] for
CVE-2022-37454 and avoids the segmentation fault and the infinite
loop in the test cases published in [2].
[1]: https://github.com/XKCP/XKCP/commit/fdc6fef075f4e81d6b1bc38364248975e08e340a
[2]: https://mouha.be/sha-3-buffer-overflow/
(cherry picked from commit 0e4e058602d93b88256ff90bbef501ba20be9dd3)
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
---
Lib/test/test_hashlib.py | 9 +++++++++
.../2022-10-21-13-31-47.gh-issue-98517.SXXGfV.rst | 1 +
Modules/_sha3/kcp/KeccakSponge.inc | 15 ++++++++-------
3 files changed, 18 insertions(+), 7 deletions(-)
create mode 100644 Misc/NEWS.d/next/Security/2022-10-21-13-31-47.gh-issue-98517.SXXGfV.rst
diff --git a/Lib/test/test_hashlib.py b/Lib/test/test_hashlib.py
index 9711856853..08f0af3748 100644
--- a/Lib/test/test_hashlib.py
+++ b/Lib/test/test_hashlib.py
@@ -418,6 +418,15 @@ class HashLibTestCase(unittest.TestCase):
def test_case_md5_uintmax(self, size):
self.check('md5', b'A'*size, '28138d306ff1b8281f1a9067e1a1a2b3')
+ @unittest.skipIf(sys.maxsize < _4G - 1, 'test cannot run on 32-bit systems')
+ @bigmemtest(size=_4G - 1, memuse=1, dry_run=False)
+ def test_sha3_update_overflow(self, size):
+ """Regression test for gh-98517 CVE-2022-37454."""
+ h = hashlib.sha3_224()
+ h.update(b'\x01')
+ h.update(b'\x01'*0xffff_ffff)
+ self.assertEqual(h.hexdigest(), '80762e8ce6700f114fec0f621fd97c4b9c00147fa052215294cceeed')
+
# use the three examples from Federal Information Processing Standards
# Publication 180-1, Secure Hash Standard, 1995 April 17
# http://www.itl.nist.gov/div897/pubs/fip180-1.htm
diff --git a/Misc/NEWS.d/next/Security/2022-10-21-13-31-47.gh-issue-98517.SXXGfV.rst b/Misc/NEWS.d/next/Security/2022-10-21-13-31-47.gh-issue-98517.SXXGfV.rst
new file mode 100644
index 0000000000..2d23a6ad93
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2022-10-21-13-31-47.gh-issue-98517.SXXGfV.rst
@@ -0,0 +1 @@
+Port XKCP's fix for the buffer overflows in SHA-3 (CVE-2022-37454).
diff --git a/Modules/_sha3/kcp/KeccakSponge.inc b/Modules/_sha3/kcp/KeccakSponge.inc
index e10739deaf..cf92e4db4d 100644
--- a/Modules/_sha3/kcp/KeccakSponge.inc
+++ b/Modules/_sha3/kcp/KeccakSponge.inc
@@ -171,7 +171,7 @@ int SpongeAbsorb(SpongeInstance *instance, const unsigned char *data, size_t dat
i = 0;
curData = data;
while(i < dataByteLen) {
- if ((instance->byteIOIndex == 0) && (dataByteLen >= (i + rateInBytes))) {
+ if ((instance->byteIOIndex == 0) && (dataByteLen-i >= rateInBytes)) {
#ifdef SnP_FastLoop_Absorb
/* processing full blocks first */
@@ -199,10 +199,10 @@ int SpongeAbsorb(SpongeInstance *instance, const unsigned char *data, size_t dat
}
else {
/* normal lane: using the message queue */
-
- partialBlock = (unsigned int)(dataByteLen - i);
- if (partialBlock+instance->byteIOIndex > rateInBytes)
+ if (dataByteLen-i > rateInBytes-instance->byteIOIndex)
partialBlock = rateInBytes-instance->byteIOIndex;
+ else
+ partialBlock = (unsigned int)(dataByteLen - i);
#ifdef KeccakReference
displayBytes(1, "Block to be absorbed (part)", curData, partialBlock);
#endif
@@ -281,7 +281,7 @@ int SpongeSqueeze(SpongeInstance *instance, unsigned char *data, size_t dataByte
i = 0;
curData = data;
while(i < dataByteLen) {
- if ((instance->byteIOIndex == rateInBytes) && (dataByteLen >= (i + rateInBytes))) {
+ if ((instance->byteIOIndex == rateInBytes) && (dataByteLen-i >= rateInBytes)) {
for(j=dataByteLen-i; j>=rateInBytes; j-=rateInBytes) {
SnP_Permute(instance->state);
SnP_ExtractBytes(instance->state, curData, 0, rateInBytes);
@@ -299,9 +299,10 @@ int SpongeSqueeze(SpongeInstance *instance, unsigned char *data, size_t dataByte
SnP_Permute(instance->state);
instance->byteIOIndex = 0;
}
- partialBlock = (unsigned int)(dataByteLen - i);
- if (partialBlock+instance->byteIOIndex > rateInBytes)
+ if (dataByteLen-i > rateInBytes-instance->byteIOIndex)
partialBlock = rateInBytes-instance->byteIOIndex;
+ else
+ partialBlock = (unsigned int)(dataByteLen - i);
i += partialBlock;
SnP_ExtractBytes(instance->state, curData, instance->byteIOIndex, partialBlock);
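
The arithmetic behind the `partialBlock` change can be worked through in Python by simulating C's wrapping unsigned arithmetic. This is a sketch under stated assumptions: `unsigned int` is taken to be 32 bits wide, the function names are hypothetical, and the inputs model the second `update()` call of the new regression test (one byte already absorbed, then `0xffff_ffff` more bytes) with the SHA3-224 rate of 144 bytes.

```python
MASK32 = 0xFFFFFFFF   # assumption: `unsigned int` is 32 bits wide
RATE_SHA3_224 = 144   # SHA3-224 rate in bytes (1152 bits)


def partial_block_old(data_len, i, byte_io_index, rate):
    """Pre-patch logic: truncate first, then compare a wrapping sum."""
    partial = (data_len - i) & MASK32              # (unsigned int)(dataByteLen - i)
    if ((partial + byte_io_index) & MASK32) > rate:  # the sum wraps to 0 here
        partial = rate - byte_io_index
    return partial


def partial_block_new(data_len, i, byte_io_index, rate):
    """Patched logic: compare remaining lengths before any truncation."""
    if data_len - i > rate - byte_io_index:
        return rate - byte_io_index
    return (data_len - i) & MASK32


old = partial_block_old(0xFFFFFFFF, 0, 1, RATE_SHA3_224)
new = partial_block_new(0xFFFFFFFF, 0, 1, RATE_SHA3_224)
print(hex(old), new)
```

With the old logic the wrapped sum is 0, the bound check is skipped, and the block length stays at `0xffffffff`, far larger than the 144-byte state. The patched comparison clamps it to the 143 bytes that remain in the current block.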

@@ -1,95 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Miss Islington (bot)"
<31488909+miss-islington@users.noreply.github.com>
Date: Mon, 7 Nov 2022 19:22:14 -0800
Subject: 00394: CVE-2022-45061: CPU denial of service via inefficient IDNA
decoder
gh-98433: Fix quadratic time idna decoding.
There was an unnecessary quadratic loop in idna decoding. This restores
the behavior to linear.
(cherry picked from commit a6f6c3a3d6f2b580f2d87885c9b8a9350ad7bf15)
Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
---
Lib/encodings/idna.py | 32 +++++++++----------
Lib/test/test_codecs.py | 6 ++++
...2-11-04-09-29-36.gh-issue-98433.l76c5G.rst | 6 ++++
3 files changed, 27 insertions(+), 17 deletions(-)
create mode 100644 Misc/NEWS.d/next/Security/2022-11-04-09-29-36.gh-issue-98433.l76c5G.rst
diff --git a/Lib/encodings/idna.py b/Lib/encodings/idna.py
index ea4058512f..bf98f51336 100644
--- a/Lib/encodings/idna.py
+++ b/Lib/encodings/idna.py
@@ -39,23 +39,21 @@ def nameprep(label):
# Check bidi
RandAL = [stringprep.in_table_d1(x) for x in label]
- for c in RandAL:
- if c:
- # There is a RandAL char in the string. Must perform further
- # tests:
- # 1) The characters in section 5.8 MUST be prohibited.
- # This is table C.8, which was already checked
- # 2) If a string contains any RandALCat character, the string
- # MUST NOT contain any LCat character.
- if any(stringprep.in_table_d2(x) for x in label):
- raise UnicodeError("Violation of BIDI requirement 2")
-
- # 3) If a string contains any RandALCat character, a
- # RandALCat character MUST be the first character of the
- # string, and a RandALCat character MUST be the last
- # character of the string.
- if not RandAL[0] or not RandAL[-1]:
- raise UnicodeError("Violation of BIDI requirement 3")
+ if any(RandAL):
+ # There is a RandAL char in the string. Must perform further
+ # tests:
+ # 1) The characters in section 5.8 MUST be prohibited.
+ # This is table C.8, which was already checked
+ # 2) If a string contains any RandALCat character, the string
+ # MUST NOT contain any LCat character.
+ if any(stringprep.in_table_d2(x) for x in label):
+ raise UnicodeError("Violation of BIDI requirement 2")
+ # 3) If a string contains any RandALCat character, a
+ # RandALCat character MUST be the first character of the
+ # string, and a RandALCat character MUST be the last
+ # character of the string.
+ if not RandAL[0] or not RandAL[-1]:
+ raise UnicodeError("Violation of BIDI requirement 3")
return label
diff --git a/Lib/test/test_codecs.py b/Lib/test/test_codecs.py
index 56485de3f6..a798d1f287 100644
--- a/Lib/test/test_codecs.py
+++ b/Lib/test/test_codecs.py
@@ -1640,6 +1640,12 @@ class IDNACodecTest(unittest.TestCase):
self.assertEqual("pyth\xf6n.org".encode("idna"), b"xn--pythn-mua.org")
self.assertEqual("pyth\xf6n.org.".encode("idna"), b"xn--pythn-mua.org.")
+ def test_builtin_decode_length_limit(self):
+ with self.assertRaisesRegex(UnicodeError, "too long"):
+ (b"xn--016c"+b"a"*1100).decode("idna")
+ with self.assertRaisesRegex(UnicodeError, "too long"):
+ (b"xn--016c"+b"a"*70).decode("idna")
+
def test_stream(self):
r = codecs.getreader("idna")(io.BytesIO(b"abc"))
r.read(3)
diff --git a/Misc/NEWS.d/next/Security/2022-11-04-09-29-36.gh-issue-98433.l76c5G.rst b/Misc/NEWS.d/next/Security/2022-11-04-09-29-36.gh-issue-98433.l76c5G.rst
new file mode 100644
index 0000000000..5185fac2e2
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2022-11-04-09-29-36.gh-issue-98433.l76c5G.rst
@@ -0,0 +1,6 @@
+The IDNA codec decoder used on DNS hostnames by :mod:`socket` or :mod:`asyncio`
+related name resolution functions no longer involves a quadratic algorithm.
+This prevents a potential CPU denial of service if an out-of-spec excessive
+length hostname involving bidirectional characters were decoded. Some protocols
+such as :mod:`urllib` http ``3xx`` redirects potentially allow for an attacker
+to supply such a name.
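
The patched bidi check can be sketched as a stand-alone function. `check_bidi` is a hypothetical name; in the real module this logic is the tail of `nameprep()`, which also maps, normalizes, and screens prohibited characters first. The old code ran the requirement checks once for every RandAL character, and each run scans the whole label, which is where the quadratic cost came from. Here the scans happen at most once.

```python
import stringprep


def check_bidi(label):
    """Sketch of the bidi requirements as applied after the patch."""
    rand_al = [stringprep.in_table_d1(x) for x in label]
    if any(rand_al):
        # Requirement 2: no LCat character alongside a RandALCat character.
        if any(stringprep.in_table_d2(x) for x in label):
            raise UnicodeError("Violation of BIDI requirement 2")
        # Requirement 3: first and last characters must both be RandALCat.
        if not rand_al[0] or not rand_al[-1]:
            raise UnicodeError("Violation of BIDI requirement 3")
    return label


print(check_bidi("abc"))
```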

@@ -1,223 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Miss Islington (bot)"
<31488909+miss-islington@users.noreply.github.com>
Date: Mon, 22 May 2023 03:42:37 -0700
Subject: 00399: CVE-2023-24329
gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-102508)
`urllib.parse.urlsplit` has already been respecting the WHATWG spec a bit GH-25595.
This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/GH-url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329).
Backported from Python 3.12
(cherry picked from commit f48a96a28012d28ae37a2f4587a780a5eb779946)
Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
---
Doc/library/urllib.parse.rst | 40 +++++++++++-
Lib/test/test_urlparse.py | 61 ++++++++++++++++++-
Lib/urllib/parse.py | 12 ++++
...-03-07-20-59-17.gh-issue-102153.14CLSZ.rst | 3 +
4 files changed, 113 insertions(+), 3 deletions(-)
create mode 100644 Misc/NEWS.d/next/Security/2023-03-07-20-59-17.gh-issue-102153.14CLSZ.rst
diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst
index b717d7cc05..83a7a82089 100644
--- a/Doc/library/urllib.parse.rst
+++ b/Doc/library/urllib.parse.rst
@@ -126,6 +126,12 @@ or on combining URL components into a URL string.
``#``, ``@``, or ``:`` will raise a :exc:`ValueError`. If the URL is
decomposed before parsing, no error will be raised.
+
+ .. warning::
+
+ :func:`urlparse` does not perform validation. See :ref:`URL parsing
+ security <url-parsing-security>` for details.
+
.. versionchanged:: 3.2
Added IPv6 URL parsing capabilities.
@@ -288,8 +294,14 @@ or on combining URL components into a URL string.
``#``, ``@``, or ``:`` will raise a :exc:`ValueError`. If the URL is
decomposed before parsing, no error will be raised.
- Following the `WHATWG spec`_ that updates RFC 3986, ASCII newline
- ``\n``, ``\r`` and tab ``\t`` characters are stripped from the URL.
+ Following some of the `WHATWG spec`_ that updates RFC 3986, leading C0
+ control and space characters are stripped from the URL. ``\n``,
+ ``\r`` and tab ``\t`` characters are removed from the URL at any position.
+
+ .. warning::
+
+ :func:`urlsplit` does not perform validation. See :ref:`URL parsing
+ security <url-parsing-security>` for details.
.. versionchanged:: 3.6
Out-of-range port numbers now raise :exc:`ValueError`, instead of
@@ -302,6 +314,9 @@ or on combining URL components into a URL string.
.. versionchanged:: 3.6.14
ASCII newline and tab characters are stripped from the URL.
+ .. versionchanged:: 3.6.15
+ Leading WHATWG C0 control and space characters are stripped from the URL.
+
.. _WHATWG spec: https://url.spec.whatwg.org/#concept-basic-url-parser
.. function:: urlunsplit(parts)
@@ -371,6 +386,27 @@ or on combining URL components into a URL string.
.. versionchanged:: 3.2
Result is a structured object rather than a simple 2-tuple.
+.. _url-parsing-security:
+
+URL parsing security
+--------------------
+
+The :func:`urlsplit` and :func:`urlparse` APIs do not perform **validation** of
+inputs. They may not raise errors on inputs that other applications consider
+invalid. They may also succeed on some inputs that might not be considered
+URLs elsewhere. Their purpose is for practical functionality rather than
+purity.
+
+Instead of raising an exception on unusual input, they may instead return some
+component parts as empty strings. Or components may contain more than perhaps
+they should.
+
+We recommend that users of these APIs where the values may be used anywhere
+with security implications code defensively. Do some verification within your
+code before trusting a returned component part. Does that ``scheme`` make
+sense? Is that a sensible ``path``? Is there anything strange about that
+``hostname``? etc.
+
.. _parsing-ascii-encoded-bytes:
Parsing ASCII Encoded Bytes
diff --git a/Lib/test/test_urlparse.py b/Lib/test/test_urlparse.py
index 3509278a01..7fd61ffea9 100644
--- a/Lib/test/test_urlparse.py
+++ b/Lib/test/test_urlparse.py
@@ -660,6 +660,65 @@ class UrlParseTestCase(unittest.TestCase):
self.assertEqual(p.scheme, "https")
self.assertEqual(p.geturl(), "https://www.python.org/javascript:alert('msg')/?query=something#fragment")
+ def test_urlsplit_strip_url(self):
+ noise = bytes(range(0, 0x20 + 1))
+ base_url = "http://User:Pass@www.python.org:080/doc/?query=yes#frag"
+
+ url = noise.decode("utf-8") + base_url
+ p = urllib.parse.urlsplit(url)
+ self.assertEqual(p.scheme, "http")
+ self.assertEqual(p.netloc, "User:Pass@www.python.org:080")
+ self.assertEqual(p.path, "/doc/")
+ self.assertEqual(p.query, "query=yes")
+ self.assertEqual(p.fragment, "frag")
+ self.assertEqual(p.username, "User")
+ self.assertEqual(p.password, "Pass")
+ self.assertEqual(p.hostname, "www.python.org")
+ self.assertEqual(p.port, 80)
+ self.assertEqual(p.geturl(), base_url)
+
+ url = noise + base_url.encode("utf-8")
+ p = urllib.parse.urlsplit(url)
+ self.assertEqual(p.scheme, b"http")
+ self.assertEqual(p.netloc, b"User:Pass@www.python.org:080")
+ self.assertEqual(p.path, b"/doc/")
+ self.assertEqual(p.query, b"query=yes")
+ self.assertEqual(p.fragment, b"frag")
+ self.assertEqual(p.username, b"User")
+ self.assertEqual(p.password, b"Pass")
+ self.assertEqual(p.hostname, b"www.python.org")
+ self.assertEqual(p.port, 80)
+ self.assertEqual(p.geturl(), base_url.encode("utf-8"))
+
+ # Test that trailing space is preserved as some applications rely on
+ # this within query strings.
+ query_spaces_url = "https://www.python.org:88/doc/?query= "
+ p = urllib.parse.urlsplit(noise.decode("utf-8") + query_spaces_url)
+ self.assertEqual(p.scheme, "https")
+ self.assertEqual(p.netloc, "www.python.org:88")
+ self.assertEqual(p.path, "/doc/")
+ self.assertEqual(p.query, "query= ")
+ self.assertEqual(p.port, 88)
+ self.assertEqual(p.geturl(), query_spaces_url)
+
+ p = urllib.parse.urlsplit("www.pypi.org ")
+ # That "hostname" gets considered a "path" due to the
+ # trailing space and our existing logic... YUCK...
+ # and re-assembles via geturl aka unurlsplit into the original.
+ # django.core.validators.URLValidator (at least through v3.2) relies on
+ # this, for better or worse, to catch it in a ValidationError via its
+ # regular expressions.
+ # Here we test the basic round trip concept of such a trailing space.
+ self.assertEqual(urllib.parse.urlunsplit(p), "www.pypi.org ")
+
+ # with scheme as cache-key
+ url = "//www.python.org/"
+ scheme = noise.decode("utf-8") + "https" + noise.decode("utf-8")
+ for _ in range(2):
+ p = urllib.parse.urlsplit(url, scheme=scheme)
+ self.assertEqual(p.scheme, "https")
+ self.assertEqual(p.geturl(), "https://www.python.org/")
+
def test_attributes_bad_port(self):
"""Check handling of invalid ports."""
for bytes in (False, True):
@@ -667,7 +726,7 @@ class UrlParseTestCase(unittest.TestCase):
for port in ("foo", "1.5", "-1", "0x10"):
with self.subTest(bytes=bytes, parse=parse, port=port):
netloc = "www.example.net:" + port
- url = "http://" + netloc
+ url = "http://" + netloc + "/"
if bytes:
netloc = netloc.encode("ascii")
url = url.encode("ascii")
diff --git a/Lib/urllib/parse.py b/Lib/urllib/parse.py
index ac6e7a9cee..717e990997 100644
--- a/Lib/urllib/parse.py
+++ b/Lib/urllib/parse.py
@@ -25,6 +25,10 @@ currently not entirely compliant with this RFC due to defacto
scenarios for parsing, and for backward compatibility purposes, some
parsing quirks from older RFCs are retained. The testcases in
test_urlparse.py provides a good indicator of parsing behavior.
+
+The WHATWG URL Parser spec should also be considered. We are not compliant with
+it either due to existing user code API behavior expectations (Hyrum's Law).
+It serves as a useful guide when making changes.
"""
import re
@@ -76,6 +80,10 @@ scheme_chars = ('abcdefghijklmnopqrstuvwxyz'
'0123456789'
'+-.')
+# Leading and trailing C0 control and space to be stripped per WHATWG spec.
+# == "".join([chr(i) for i in range(0, 0x20 + 1)])
+_WHATWG_C0_CONTROL_OR_SPACE = '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f '
+
# Unsafe bytes to be removed per WHATWG spec
_UNSAFE_URL_BYTES_TO_REMOVE = ['\t', '\r', '\n']
@@ -426,6 +434,10 @@ def urlsplit(url, scheme='', allow_fragments=True):
url, scheme, _coerce_result = _coerce_args(url, scheme)
url = _remove_unsafe_bytes_from_url(url)
scheme = _remove_unsafe_bytes_from_url(scheme)
+ # Only lstrip url as some applications rely on preserving trailing space.
+ # (https://url.spec.whatwg.org/#concept-basic-url-parser would strip both)
+ url = url.lstrip(_WHATWG_C0_CONTROL_OR_SPACE)
+ scheme = scheme.strip(_WHATWG_C0_CONTROL_OR_SPACE)
allow_fragments = bool(allow_fragments)
key = url, scheme, allow_fragments, type(url), type(scheme)
cached = _parse_cache.get(key, None)
diff --git a/Misc/NEWS.d/next/Security/2023-03-07-20-59-17.gh-issue-102153.14CLSZ.rst b/Misc/NEWS.d/next/Security/2023-03-07-20-59-17.gh-issue-102153.14CLSZ.rst
new file mode 100644
index 0000000000..e57ac4ed3a
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2023-03-07-20-59-17.gh-issue-102153.14CLSZ.rst
@@ -0,0 +1,3 @@
+:func:`urllib.parse.urlsplit` now strips leading C0 control and space
+characters following the specification for URLs defined by WHATWG in
+response to CVE-2023-24329. Patch by Illia Volochii.
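
The two cleanup steps this patch applies at the top of `urlsplit` can be sketched in isolation, without relying on the behavior of any particular Python version. The helper names `sanitize_url` and `sanitize_scheme` are hypothetical; only the character sets and the lstrip/strip asymmetry are taken from the patch.

```python
# Characters U+0000 through U+0020, the same set the patch stores in
# _WHATWG_C0_CONTROL_OR_SPACE.
C0_CONTROL_OR_SPACE = "".join(chr(i) for i in range(0x20 + 1))

# Removed at any position, as _UNSAFE_URL_BYTES_TO_REMOVE already was.
UNSAFE_ANYWHERE = ("\t", "\r", "\n")


def _remove_unsafe(text):
    for ch in UNSAFE_ANYWHERE:
        text = text.replace(ch, "")
    return text


def sanitize_url(url):
    """Hypothetical helper: strip only the leading side of the URL."""
    return _remove_unsafe(url).lstrip(C0_CONTROL_OR_SPACE)


def sanitize_scheme(scheme):
    """Hypothetical helper: the default scheme is stripped on both sides."""
    return _remove_unsafe(scheme).strip(C0_CONTROL_OR_SPACE)
```

Trailing whitespace in the URL survives on purpose, which is what the `query_spaces_url` and `"www.pypi.org "` cases in `test_urlsplit_strip_url` rely on.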


@ -1,47 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Erlend E. Aasland" <erlend.aasland@protonmail.com>
Date: Sun, 6 Nov 2022 22:39:34 +0100
Subject: 00407: gh-99086: Fix implicit int compiler warning in configure check
for PTHREAD_SCOPE_SYSTEM
Co-authored-by: Sam James <sam@cmpct.info>
---
.../next/Build/2022-11-04-02-58-10.gh-issue-99086.DV_4Br.rst | 1 +
configure | 2 +-
configure.ac | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
create mode 100644 Misc/NEWS.d/next/Build/2022-11-04-02-58-10.gh-issue-99086.DV_4Br.rst
diff --git a/Misc/NEWS.d/next/Build/2022-11-04-02-58-10.gh-issue-99086.DV_4Br.rst b/Misc/NEWS.d/next/Build/2022-11-04-02-58-10.gh-issue-99086.DV_4Br.rst
new file mode 100644
index 0000000000..e320ecfdfb
--- /dev/null
+++ b/Misc/NEWS.d/next/Build/2022-11-04-02-58-10.gh-issue-99086.DV_4Br.rst
@@ -0,0 +1 @@
+Fix ``-Wimplicit-int`` compiler warning in :program:`configure` check for ``PTHREAD_SCOPE_SYSTEM``.
diff --git a/configure b/configure
index e39c16eee2..32c27a468d 100755
--- a/configure
+++ b/configure
@@ -10837,7 +10837,7 @@ else
void *foo(void *parm) {
return NULL;
}
- main() {
+ int main() {
pthread_attr_t attr;
pthread_t id;
if (pthread_attr_init(&attr)) exit(-1);
diff --git a/configure.ac b/configure.ac
index d5ca7172ca..f7668224be 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3185,7 +3185,7 @@ if test "$posix_threads" = "yes"; then
void *foo(void *parm) {
return NULL;
}
- main() {
+ int main() {
pthread_attr_t attr;
pthread_t id;
if (pthread_attr_init(&attr)) exit(-1);


@ -1,25 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: ngie-eign <1574099+ngie-eign@users.noreply.github.com>
Date: Mon, 25 Feb 2019 21:34:24 -0800
Subject: 00409: bpo-13497: Fix `broken nice` configure test
Per POSIX, `nice(3)` requires `unistd.h` and `exit(3)` requires `stdlib.h`.
Fixing the test will prevent false positives with pedantic compilers like clang.
---
configure.ac | 2 ++
1 file changed, 2 insertions(+)
diff --git a/configure.ac b/configure.ac
index f7668224be..2696661ab9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -4946,6 +4946,8 @@ LIBS=$LIBS_no_readline
AC_MSG_CHECKING(for broken nice())
AC_CACHE_VAL(ac_cv_broken_nice, [
AC_RUN_IFELSE([AC_LANG_SOURCE([[
+#include <stdlib.h>
+#include <unistd.h>
int main()
{
int val1 = nice(1);


@ -1,113 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Joshua Root <jmr@macports.org>
Date: Mon, 14 Dec 2020 07:56:34 +1100
Subject: 00410: bpo-42598: Fix implicit function declarations in configure
This is invalid in C99 and later and is an error with some compilers
(e.g. clang in Xcode 12), and can thus cause configure checks to
produce incorrect results.
---
.../Build/2020-12-13-14-43-10.bpo-42598.7ipr5H.rst | 2 ++
configure | 13 +++++++------
configure.ac | 13 +++++++------
3 files changed, 16 insertions(+), 12 deletions(-)
create mode 100644 Misc/NEWS.d/next/Build/2020-12-13-14-43-10.bpo-42598.7ipr5H.rst
diff --git a/Misc/NEWS.d/next/Build/2020-12-13-14-43-10.bpo-42598.7ipr5H.rst b/Misc/NEWS.d/next/Build/2020-12-13-14-43-10.bpo-42598.7ipr5H.rst
new file mode 100644
index 0000000000..7dafc105c4
--- /dev/null
+++ b/Misc/NEWS.d/next/Build/2020-12-13-14-43-10.bpo-42598.7ipr5H.rst
@@ -0,0 +1,2 @@
+Fix implicit function declarations in configure which could have resulted in
+incorrect configuration checks. Patch contributed by Joshua Root.
diff --git a/configure b/configure
index 32c27a468d..68a46deef5 100755
--- a/configure
+++ b/configure
@@ -10840,10 +10840,10 @@ else
int main() {
pthread_attr_t attr;
pthread_t id;
- if (pthread_attr_init(&attr)) exit(-1);
- if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM)) exit(-1);
- if (pthread_create(&id, &attr, foo, NULL)) exit(-1);
- exit(0);
+ if (pthread_attr_init(&attr)) return (-1);
+ if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM)) return (-1);
+ if (pthread_create(&id, &attr, foo, NULL)) return (-1);
+ return (0);
}
_ACEOF
if ac_fn_c_try_run "$LINENO"; then :
@@ -14891,7 +14891,7 @@ else
int main()
{
/* Success: exit code 0 */
- exit((((wchar_t) -1) < ((wchar_t) 0)) ? 0 : 1);
+ return ((((wchar_t) -1) < ((wchar_t) 0)) ? 0 : 1);
}
_ACEOF
@@ -15213,7 +15213,7 @@ else
int main()
{
- exit(((-1)>>3 == -1) ? 0 : 1);
+ return (((-1)>>3 == -1) ? 0 : 1);
}
_ACEOF
@@ -15725,6 +15725,7 @@ else
/* end confdefs.h. */
#include <poll.h>
+#include <unistd.h>
int main()
{
diff --git a/configure.ac b/configure.ac
index 2696661ab9..9d2ad9afba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3188,10 +3188,10 @@ if test "$posix_threads" = "yes"; then
int main() {
pthread_attr_t attr;
pthread_t id;
- if (pthread_attr_init(&attr)) exit(-1);
- if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM)) exit(-1);
- if (pthread_create(&id, &attr, foo, NULL)) exit(-1);
- exit(0);
+ if (pthread_attr_init(&attr)) return (-1);
+ if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM)) return (-1);
+ if (pthread_create(&id, &attr, foo, NULL)) return (-1);
+ return (0);
}]])],
[ac_cv_pthread_system_supported=yes],
[ac_cv_pthread_system_supported=no],
@@ -4743,7 +4743,7 @@ then
int main()
{
/* Success: exit code 0 */
- exit((((wchar_t) -1) < ((wchar_t) 0)) ? 0 : 1);
+ return ((((wchar_t) -1) < ((wchar_t) 0)) ? 0 : 1);
}
]])],
[ac_cv_wchar_t_signed=yes],
@@ -4818,7 +4818,7 @@ AC_CACHE_VAL(ac_cv_rshift_extends_sign, [
AC_RUN_IFELSE([AC_LANG_SOURCE([[
int main()
{
- exit(((-1)>>3 == -1) ? 0 : 1);
+ return (((-1)>>3 == -1) ? 0 : 1);
}
]])],
[ac_cv_rshift_extends_sign=yes],
@@ -4970,6 +4970,7 @@ AC_MSG_CHECKING(for broken poll())
AC_CACHE_VAL(ac_cv_broken_poll,
AC_RUN_IFELSE([AC_LANG_SOURCE([[
#include <poll.h>
+#include <unistd.h>
int main()
{


@ -1,500 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Victor Stinner <vstinner@python.org>
Date: Fri, 15 Dec 2023 16:10:40 +0100
Subject: 00415: [CVE-2023-27043] gh-102988: Reject malformed addresses in
email.parseaddr() (#111116)
Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.
Co-Authored-By: Thomas Dwyer <github@tomd.tel>
---
Doc/library/email.utils.rst | 19 +-
Lib/email/utils.py | 151 ++++++++++++-
Lib/test/test_email/test_email.py | 204 +++++++++++++++++-
...-10-20-15-28-08.gh-issue-102988.dStNO7.rst | 8 +
4 files changed, 361 insertions(+), 21 deletions(-)
create mode 100644 Misc/NEWS.d/next/Library/2023-10-20-15-28-08.gh-issue-102988.dStNO7.rst
diff --git a/Doc/library/email.utils.rst b/Doc/library/email.utils.rst
index 63fae2ab84..d1e1898591 100644
--- a/Doc/library/email.utils.rst
+++ b/Doc/library/email.utils.rst
@@ -60,13 +60,18 @@ of the new API.
begins with angle brackets, they are stripped off.
-.. function:: parseaddr(address)
+.. function:: parseaddr(address, *, strict=True)
Parse address -- which should be the value of some address-containing field such
as :mailheader:`To` or :mailheader:`Cc` -- into its constituent *realname* and
*email address* parts. Returns a tuple of that information, unless the parse
fails, in which case a 2-tuple of ``('', '')`` is returned.
+ If *strict* is true, use a strict parser which rejects malformed inputs.
+
+ .. versionchanged:: 3.13
+ Add *strict* optional parameter and reject malformed inputs by default.
+
.. function:: formataddr(pair, charset='utf-8')
@@ -84,12 +89,15 @@ of the new API.
Added the *charset* option.
-.. function:: getaddresses(fieldvalues)
+.. function:: getaddresses(fieldvalues, *, strict=True)
This method returns a list of 2-tuples of the form returned by ``parseaddr()``.
*fieldvalues* is a sequence of header field values as might be returned by
- :meth:`Message.get_all <email.message.Message.get_all>`. Here's a simple
- example that gets all the recipients of a message::
+ :meth:`Message.get_all <email.message.Message.get_all>`.
+
+ If *strict* is true, use a strict parser which rejects malformed inputs.
+
+ Here's a simple example that gets all the recipients of a message::
from email.utils import getaddresses
@@ -99,6 +107,9 @@ of the new API.
resent_ccs = msg.get_all('resent-cc', [])
all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)
+ .. versionchanged:: 3.13
+ Add *strict* optional parameter and reject malformed inputs by default.
+
.. function:: parsedate(date)
diff --git a/Lib/email/utils.py b/Lib/email/utils.py
index 39c2240607..f83b7e5d7e 100644
--- a/Lib/email/utils.py
+++ b/Lib/email/utils.py
@@ -48,6 +48,7 @@ TICK = "'"
specialsre = re.compile(r'[][\\()<>@,:;".]')
escapesre = re.compile(r'[\\"]')
+
def _has_surrogates(s):
"""Return True if s contains surrogate-escaped binary data."""
# This check is based on the fact that unless there are surrogates, utf8
@@ -106,12 +107,127 @@ def formataddr(pair, charset='utf-8'):
return address
+def _iter_escaped_chars(addr):
+ pos = 0
+ escape = False
+ for pos, ch in enumerate(addr):
+ if escape:
+ yield (pos, '\\' + ch)
+ escape = False
+ elif ch == '\\':
+ escape = True
+ else:
+ yield (pos, ch)
+ if escape:
+ yield (pos, '\\')
-def getaddresses(fieldvalues):
- """Return a list of (REALNAME, EMAIL) for each fieldvalue."""
- all = COMMASPACE.join(fieldvalues)
- a = _AddressList(all)
- return a.addresslist
+
+def _strip_quoted_realnames(addr):
+ """Strip real names between quotes."""
+ if '"' not in addr:
+ # Fast path
+ return addr
+
+ start = 0
+ open_pos = None
+ result = []
+ for pos, ch in _iter_escaped_chars(addr):
+ if ch == '"':
+ if open_pos is None:
+ open_pos = pos
+ else:
+ if start != open_pos:
+ result.append(addr[start:open_pos])
+ start = pos + 1
+ open_pos = None
+
+ if start < len(addr):
+ result.append(addr[start:])
+
+ return ''.join(result)
+
+
+supports_strict_parsing = True
+
+def getaddresses(fieldvalues, *, strict=True):
+ """Return a list of (REALNAME, EMAIL) or ('','') for each fieldvalue.
+
+ When parsing fails for a fieldvalue, a 2-tuple of ('', '') is returned in
+ its place.
+
+ If strict is true, use a strict parser which rejects malformed inputs.
+ """
+
+ # If strict is true, if the resulting list of parsed addresses is greater
+ # than the number of fieldvalues in the input list, a parsing error has
+ # occurred and consequently a list containing a single empty 2-tuple [('',
+ # '')] is returned in its place. This is done to avoid invalid output.
+ #
+ # Malformed input: getaddresses(['alice@example.com <bob@example.com>'])
+ # Invalid output: [('', 'alice@example.com'), ('', 'bob@example.com')]
+ # Safe output: [('', '')]
+
+ if not strict:
+ all = COMMASPACE.join(str(v) for v in fieldvalues)
+ a = _AddressList(all)
+ return a.addresslist
+
+ fieldvalues = [str(v) for v in fieldvalues]
+ fieldvalues = _pre_parse_validation(fieldvalues)
+ addr = COMMASPACE.join(fieldvalues)
+ a = _AddressList(addr)
+ result = _post_parse_validation(a.addresslist)
+
+ # Treat output as invalid if the number of addresses is not equal to the
+ # expected number of addresses.
+ n = 0
+ for v in fieldvalues:
+ # When a comma is used in the Real Name part it is not a delimiter.
+ # So strip those out before counting the commas.
+ v = _strip_quoted_realnames(v)
+ # Expected number of addresses: 1 + number of commas
+ n += 1 + v.count(',')
+ if len(result) != n:
+ return [('', '')]
+
+ return result
+
+
+def _check_parenthesis(addr):
+ # Ignore parenthesis in quoted real names.
+ addr = _strip_quoted_realnames(addr)
+
+ opens = 0
+ for pos, ch in _iter_escaped_chars(addr):
+ if ch == '(':
+ opens += 1
+ elif ch == ')':
+ opens -= 1
+ if opens < 0:
+ return False
+ return (opens == 0)
+
+
+def _pre_parse_validation(email_header_fields):
+ accepted_values = []
+ for v in email_header_fields:
+ if not _check_parenthesis(v):
+ v = "('', '')"
+ accepted_values.append(v)
+
+ return accepted_values
+
+
+def _post_parse_validation(parsed_email_header_tuples):
+ accepted_values = []
+ # The parser would have parsed a correctly formatted domain-literal
+ # The existence of an [ after parsing indicates a parsing failure
+ for v in parsed_email_header_tuples:
+ if '[' in v[1]:
+ v = ('', '')
+ accepted_values.append(v)
+
+ return accepted_values
@@ -214,16 +330,33 @@ def parsedate_to_datetime(data):
tzinfo=datetime.timezone(datetime.timedelta(seconds=tz)))
-def parseaddr(addr):
+def parseaddr(addr, *, strict=True):
"""
Parse addr into its constituent realname and email address parts.
Return a tuple of realname and email address, unless the parse fails, in
which case return a 2-tuple of ('', '').
+
+ If strict is True, use a strict parser which rejects malformed inputs.
"""
- addrs = _AddressList(addr).addresslist
- if not addrs:
- return '', ''
+ if not strict:
+ addrs = _AddressList(addr).addresslist
+ if not addrs:
+ return ('', '')
+ return addrs[0]
+
+ if isinstance(addr, list):
+ addr = addr[0]
+
+ if not isinstance(addr, str):
+ return ('', '')
+
+ addr = _pre_parse_validation([addr])[0]
+ addrs = _post_parse_validation(_AddressList(addr).addresslist)
+
+ if not addrs or len(addrs) > 1:
+ return ('', '')
+
return addrs[0]
diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py
index e4e40b612f..ce36efc1b1 100644
--- a/Lib/test/test_email/test_email.py
+++ b/Lib/test/test_email/test_email.py
@@ -19,6 +19,7 @@ except ImportError:
import email
import email.policy
+import email.utils
from email.charset import Charset
from email.header import Header, decode_header, make_header
@@ -3207,15 +3208,154 @@ Foo
[('Al Person', 'aperson@dom.ain'),
('Bud Person', 'bperson@dom.ain')])
+ def test_getaddresses_comma_in_name(self):
+ """GH-106669 regression test."""
+ self.assertEqual(
+ utils.getaddresses(
+ [
+ '"Bud, Person" <bperson@dom.ain>',
+ 'aperson@dom.ain (Al Person)',
+ '"Mariusz Felisiak" <to@example.com>',
+ ]
+ ),
+ [
+ ('Bud, Person', 'bperson@dom.ain'),
+ ('Al Person', 'aperson@dom.ain'),
+ ('Mariusz Felisiak', 'to@example.com'),
+ ],
+ )
+
+ def test_parsing_errors(self):
+ """Test for parsing errors from CVE-2023-27043 and CVE-2019-16056"""
+ alice = 'alice@example.org'
+ bob = 'bob@example.com'
+ empty = ('', '')
+
+ # Test utils.getaddresses() and utils.parseaddr() on malformed email
+ # addresses: default behavior (strict=True) rejects malformed address,
+ # and strict=False which tolerates malformed address.
+ for invalid_separator, expected_non_strict in (
+ ('(', [(f'<{bob}>', alice)]),
+ (')', [('', alice), empty, ('', bob)]),
+ ('<', [('', alice), empty, ('', bob), empty]),
+ ('>', [('', alice), empty, ('', bob)]),
+ ('[', [('', f'{alice}[<{bob}>]')]),
+ (']', [('', alice), empty, ('', bob)]),
+ ('@', [empty, empty, ('', bob)]),
+ (';', [('', alice), empty, ('', bob)]),
+ (':', [('', alice), ('', bob)]),
+ ('.', [('', alice + '.'), ('', bob)]),
+ ('"', [('', alice), ('', f'<{bob}>')]),
+ ):
+ address = f'{alice}{invalid_separator}<{bob}>'
+ with self.subTest(address=address):
+ self.assertEqual(utils.getaddresses([address]),
+ [empty])
+ self.assertEqual(utils.getaddresses([address], strict=False),
+ expected_non_strict)
+
+ self.assertEqual(utils.parseaddr([address]),
+ empty)
+ self.assertEqual(utils.parseaddr([address], strict=False),
+ ('', address))
+
+ # Comma (',') is treated differently depending on strict parameter.
+ # Comma without quotes.
+ address = f'{alice},<{bob}>'
+ self.assertEqual(utils.getaddresses([address]),
+ [('', alice), ('', bob)])
+ self.assertEqual(utils.getaddresses([address], strict=False),
+ [('', alice), ('', bob)])
+ self.assertEqual(utils.parseaddr([address]),
+ empty)
+ self.assertEqual(utils.parseaddr([address], strict=False),
+ ('', address))
+
+ # Real name between quotes containing comma.
+ address = '"Alice, alice@example.org" <bob@example.com>'
+ expected_strict = ('Alice, alice@example.org', 'bob@example.com')
+ self.assertEqual(utils.getaddresses([address]), [expected_strict])
+ self.assertEqual(utils.getaddresses([address], strict=False), [expected_strict])
+ self.assertEqual(utils.parseaddr([address]), expected_strict)
+ self.assertEqual(utils.parseaddr([address], strict=False),
+ ('', address))
+
+ # Valid parenthesis in comments.
+ address = 'alice@example.org (Alice)'
+ expected_strict = ('Alice', 'alice@example.org')
+ self.assertEqual(utils.getaddresses([address]), [expected_strict])
+ self.assertEqual(utils.getaddresses([address], strict=False), [expected_strict])
+ self.assertEqual(utils.parseaddr([address]), expected_strict)
+ self.assertEqual(utils.parseaddr([address], strict=False),
+ ('', address))
+
+ # Invalid parenthesis in comments.
+ address = 'alice@example.org )Alice('
+ self.assertEqual(utils.getaddresses([address]), [empty])
+ self.assertEqual(utils.getaddresses([address], strict=False),
+ [('', 'alice@example.org'), ('', ''), ('', 'Alice')])
+ self.assertEqual(utils.parseaddr([address]), empty)
+ self.assertEqual(utils.parseaddr([address], strict=False),
+ ('', address))
+
+ # Two addresses with quotes separated by comma.
+ address = '"Jane Doe" <jane@example.net>, "John Doe" <john@example.net>'
+ self.assertEqual(utils.getaddresses([address]),
+ [('Jane Doe', 'jane@example.net'),
+ ('John Doe', 'john@example.net')])
+ self.assertEqual(utils.getaddresses([address], strict=False),
+ [('Jane Doe', 'jane@example.net'),
+ ('John Doe', 'john@example.net')])
+ self.assertEqual(utils.parseaddr([address]), empty)
+ self.assertEqual(utils.parseaddr([address], strict=False),
+ ('', address))
+
+ # Test email.utils.supports_strict_parsing attribute
+ self.assertEqual(email.utils.supports_strict_parsing, True)
+
def test_getaddresses_nasty(self):
- eq = self.assertEqual
- eq(utils.getaddresses(['foo: ;']), [('', '')])
- eq(utils.getaddresses(
- ['[]*-- =~$']),
- [('', ''), ('', ''), ('', '*--')])
- eq(utils.getaddresses(
- ['foo: ;', '"Jason R. Mastaler" <jason@dom.ain>']),
- [('', ''), ('Jason R. Mastaler', 'jason@dom.ain')])
+ for addresses, expected in (
+ (['"Sürname, Firstname" <to@example.com>'],
+ [('Sürname, Firstname', 'to@example.com')]),
+
+ (['foo: ;'],
+ [('', '')]),
+
+ (['foo: ;', '"Jason R. Mastaler" <jason@dom.ain>'],
+ [('', ''), ('Jason R. Mastaler', 'jason@dom.ain')]),
+
+ ([r'Pete(A nice \) chap) <pete(his account)@silly.test(his host)>'],
+ [('Pete (A nice ) chap his account his host)', 'pete@silly.test')]),
+
+ (['(Empty list)(start)Undisclosed recipients :(nobody(I know))'],
+ [('', '')]),
+
+ (['Mary <@machine.tld:mary@example.net>, , jdoe@test . example'],
+ [('Mary', 'mary@example.net'), ('', ''), ('', 'jdoe@test.example')]),
+
+ (['John Doe <jdoe@machine(comment). example>'],
+ [('John Doe (comment)', 'jdoe@machine.example')]),
+
+ (['"Mary Smith: Personal Account" <smith@home.example>'],
+ [('Mary Smith: Personal Account', 'smith@home.example')]),
+
+ (['Undisclosed recipients:;'],
+ [('', '')]),
+
+ ([r'<boss@nil.test>, "Giant; \"Big\" Box" <bob@example.net>'],
+ [('', 'boss@nil.test'), ('Giant; "Big" Box', 'bob@example.net')]),
+ ):
+ with self.subTest(addresses=addresses):
+ self.assertEqual(utils.getaddresses(addresses),
+ expected)
+ self.assertEqual(utils.getaddresses(addresses, strict=False),
+ expected)
+
+ addresses = ['[]*-- =~$']
+ self.assertEqual(utils.getaddresses(addresses),
+ [('', '')])
+ self.assertEqual(utils.getaddresses(addresses, strict=False),
+ [('', ''), ('', ''), ('', '*--')])
def test_getaddresses_embedded_comment(self):
"""Test proper handling of a nested comment"""
@@ -3397,6 +3537,54 @@ multipart/report
m = cls(*constructor, policy=email.policy.default)
self.assertIs(m.policy, email.policy.default)
+ def test_iter_escaped_chars(self):
+ self.assertEqual(list(utils._iter_escaped_chars(r'a\\b\"c\\"d')),
+ [(0, 'a'),
+ (2, '\\\\'),
+ (3, 'b'),
+ (5, '\\"'),
+ (6, 'c'),
+ (8, '\\\\'),
+ (9, '"'),
+ (10, 'd')])
+ self.assertEqual(list(utils._iter_escaped_chars('a\\')),
+ [(0, 'a'), (1, '\\')])
+
+ def test_strip_quoted_realnames(self):
+ def check(addr, expected):
+ self.assertEqual(utils._strip_quoted_realnames(addr), expected)
+
+ check('"Jane Doe" <jane@example.net>, "John Doe" <john@example.net>',
+ ' <jane@example.net>, <john@example.net>')
+ check(r'"Jane \"Doe\"." <jane@example.net>',
+ ' <jane@example.net>')
+
+ # special cases
+ check(r'before"name"after', 'beforeafter')
+ check(r'before"name"', 'before')
+ check(r'b"name"', 'b') # single char
+ check(r'"name"after', 'after')
+ check(r'"name"a', 'a') # single char
+ check(r'"name"', '')
+
+ # no change
+ for addr in (
+ 'Jane Doe <jane@example.net>, John Doe <john@example.net>',
+ 'lone " quote',
+ ):
+ self.assertEqual(utils._strip_quoted_realnames(addr), addr)
+
+
+ def test_check_parenthesis(self):
+ addr = 'alice@example.net'
+ self.assertTrue(utils._check_parenthesis(f'{addr} (Alice)'))
+ self.assertFalse(utils._check_parenthesis(f'{addr} )Alice('))
+ self.assertFalse(utils._check_parenthesis(f'{addr} (Alice))'))
+ self.assertFalse(utils._check_parenthesis(f'{addr} ((Alice)'))
+
+ # Ignore real name between quotes
+ self.assertTrue(utils._check_parenthesis(f'")Alice((" {addr}'))
+
# Test the iterator/generators
class TestIterators(TestEmailBase):
diff --git a/Misc/NEWS.d/next/Library/2023-10-20-15-28-08.gh-issue-102988.dStNO7.rst b/Misc/NEWS.d/next/Library/2023-10-20-15-28-08.gh-issue-102988.dStNO7.rst
new file mode 100644
index 0000000000..3d0e9e4078
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2023-10-20-15-28-08.gh-issue-102988.dStNO7.rst
@@ -0,0 +1,8 @@
+:func:`email.utils.getaddresses` and :func:`email.utils.parseaddr` now
+return ``('', '')`` 2-tuples in more situations where invalid email
+addresses are encountered instead of potentially inaccurate values. Add
+optional *strict* parameter to these two functions: use ``strict=False`` to
+get the old behavior and accept malformed inputs.
+``getattr(email.utils, 'supports_strict_parsing', False)`` can be used to check
+if the *strict* parameter is available. Patch by Thomas Dwyer and Victor
+Stinner to improve the CVE-2023-27043 fix.
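
The feature check described in this NEWS entry could be used roughly as follows. This is a sketch only: `parse_one_address` is a hypothetical wrapper, and it assumes the `strict` keyword exists exactly when `supports_strict_parsing` is present, as in this patch.

```python
import email.utils


def parse_one_address(value, strict=True):
    """Hypothetical wrapper around email.utils.parseaddr().

    Passes the strict keyword only when the running interpreter advertises
    support for it; otherwise falls back to the legacy call signature.
    """
    if getattr(email.utils, "supports_strict_parsing", False):
        return email.utils.parseaddr(value, strict=strict)
    return email.utils.parseaddr(value)
```

Well-formed input parses the same way in both modes; the two modes differ only for malformed input such as the separator cases listed in `test_parsing_errors`, where the strict parser returns `('', '')`.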


@ -1,66 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Miro=20Hron=C4=8Dok?= <miro@hroncok.cz>
Date: Tue, 5 Dec 2023 21:02:06 +0100
Subject: 00419: gh-112769: test_zlib: Fix comparison of ZLIB_RUNTIME_VERSION
with non-int suffix (GH-112771) (GH-112774)
zlib-ng defines the version as "1.3.0.zlib-ng".
(cherry picked from commit d384813ff18b33280a90b6d2011654528a2b6ad1)
---
Lib/test/test_zlib.py | 28 ++++++++++++++++------------
1 file changed, 16 insertions(+), 12 deletions(-)
diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py
index b7170b4ff5..93615e1e1e 100644
--- a/Lib/test/test_zlib.py
+++ b/Lib/test/test_zlib.py
@@ -16,6 +16,20 @@ requires_Decompress_copy = unittest.skipUnless(
'requires Decompress.copy()')
+def _zlib_runtime_version_tuple(zlib_version=zlib.ZLIB_RUNTIME_VERSION):
+ # Register "1.2.3" as "1.2.3.0"
+ # or "1.2.0-linux","1.2.0.f","1.2.0.f-linux"
+ v = zlib_version.split('-', 1)[0].split('.')
+ if len(v) < 4:
+ v.append('0')
+ elif not v[-1].isnumeric():
+ v[-1] = '0'
+ return tuple(map(int, v))
+
+
+ZLIB_RUNTIME_VERSION_TUPLE = _zlib_runtime_version_tuple()
+
+
class VersionTestCase(unittest.TestCase):
def test_library_version(self):
@@ -437,9 +451,8 @@ class CompressObjectTestCase(BaseCompressTestCase, unittest.TestCase):
sync_opt = ['Z_NO_FLUSH', 'Z_SYNC_FLUSH', 'Z_FULL_FLUSH',
'Z_PARTIAL_FLUSH']
- ver = tuple(int(v) for v in zlib.ZLIB_RUNTIME_VERSION.split('.'))
# Z_BLOCK has a known failure prior to 1.2.5.3
- if ver >= (1, 2, 5, 3):
+ if ZLIB_RUNTIME_VERSION_TUPLE >= (1, 2, 5, 3):
sync_opt.append('Z_BLOCK')
sync_opt = [getattr(zlib, opt) for opt in sync_opt
@@ -762,16 +775,7 @@ class CompressObjectTestCase(BaseCompressTestCase, unittest.TestCase):
def test_wbits(self):
# wbits=0 only supported since zlib v1.2.3.5
- # Register "1.2.3" as "1.2.3.0"
- # or "1.2.0-linux","1.2.0.f","1.2.0.f-linux"
- v = zlib.ZLIB_RUNTIME_VERSION.split('-', 1)[0].split('.')
- if len(v) < 4:
- v.append('0')
- elif not v[-1].isnumeric():
- v[-1] = '0'
-
- v = tuple(map(int, v))
- supports_wbits_0 = v >= (1, 2, 3, 5)
+ supports_wbits_0 = ZLIB_RUNTIME_VERSION_TUPLE >= (1, 2, 3, 5)
co = zlib.compressobj(level=1, wbits=15)
zlib15 = co.compress(HAMLET_SCENE) + co.flush()
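
The version normalisation that the patch above factors out can be sketched as a standalone function. This is an illustration under stated assumptions, not the test suite's code: the name `zlib_version_tuple` is hypothetical, and the function takes the version string as an argument instead of reading `zlib.ZLIB_RUNTIME_VERSION`.

```python
def zlib_version_tuple(version):
    """Turn a zlib version string into a tuple of ints for comparison."""
    # Drop any "-suffix" such as "-linux", then split on dots.
    parts = version.split("-", 1)[0].split(".")
    if len(parts) < 4:
        # Register "1.2.3" as "1.2.3.0".
        parts.append("0")
    elif not parts[-1].isnumeric():
        # zlib-ng reports "1.3.0.zlib-ng"; treat the non-numeric part as 0.
        parts[-1] = "0"
    return tuple(map(int, parts))


supports_z_block = zlib_version_tuple("1.3.0.zlib-ng") >= (1, 2, 5, 3)
```

Comparing tuples avoids the `ValueError` that `int("zlib-ng")` raised in the code the patch removed.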


@ -1,106 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Serhiy Storchaka <storchaka@gmail.com>
Date: Sun, 11 Feb 2024 12:08:39 +0200
Subject: 00422: gh-115133: Fix tests for XMLPullParser with Expat 2.6.0
Feeding the parser by too small chunks defers parsing to prevent
CVE-2023-52425. Future versions of Expat may be more reactive.
(cherry picked from commit 4a08e7b3431cd32a0daf22a33421cd3035343dc4)
---
Lib/test/test_xml_etree.py | 58 ++++++++++++-------
...-02-08-14-21-28.gh-issue-115133.ycl4ko.rst | 2 +
2 files changed, 38 insertions(+), 22 deletions(-)
create mode 100644 Misc/NEWS.d/next/Library/2024-02-08-14-21-28.gh-issue-115133.ycl4ko.rst
diff --git a/Lib/test/test_xml_etree.py b/Lib/test/test_xml_etree.py
index acaa519f42..2195eb9485 100644
--- a/Lib/test/test_xml_etree.py
+++ b/Lib/test/test_xml_etree.py
@@ -10,6 +10,7 @@ import html
import io
import operator
import pickle
+import pyexpat
import sys
import types
import unittest
@@ -97,6 +98,10 @@ EXTERNAL_ENTITY_XML = """\
<document>&entity;</document>
"""
+fails_with_expat_2_6_0 = (unittest.expectedFailure
+ if pyexpat.version_info >= (2, 6, 0) else
+ lambda test: test)
+
class ModuleTest(unittest.TestCase):
def test_sanity(self):
# Import sanity.
@@ -1044,28 +1049,37 @@ class XMLPullParserTest(unittest.TestCase):
self.assertEqual([(action, elem.tag) for action, elem in events],
expected)
- def test_simple_xml(self):
- for chunk_size in (None, 1, 5):
- with self.subTest(chunk_size=chunk_size):
- parser = ET.XMLPullParser()
- self.assert_event_tags(parser, [])
- self._feed(parser, "<!-- comment -->\n", chunk_size)
- self.assert_event_tags(parser, [])
- self._feed(parser,
- "<root>\n <element key='value'>text</element",
- chunk_size)
- self.assert_event_tags(parser, [])
- self._feed(parser, ">\n", chunk_size)
- self.assert_event_tags(parser, [('end', 'element')])
- self._feed(parser, "<element>text</element>tail\n", chunk_size)
- self._feed(parser, "<empty-element/>\n", chunk_size)
- self.assert_event_tags(parser, [
- ('end', 'element'),
- ('end', 'empty-element'),
- ])
- self._feed(parser, "</root>\n", chunk_size)
- self.assert_event_tags(parser, [('end', 'root')])
- self.assertIsNone(parser.close())
+ def test_simple_xml(self, chunk_size=None):
+ parser = ET.XMLPullParser()
+ self.assert_event_tags(parser, [])
+ self._feed(parser, "<!-- comment -->\n", chunk_size)
+ self.assert_event_tags(parser, [])
+ self._feed(parser,
+ "<root>\n <element key='value'>text</element",
+ chunk_size)
+ self.assert_event_tags(parser, [])
+ self._feed(parser, ">\n", chunk_size)
+ self.assert_event_tags(parser, [('end', 'element')])
+ self._feed(parser, "<element>text</element>tail\n", chunk_size)
+ self._feed(parser, "<empty-element/>\n", chunk_size)
+ self.assert_event_tags(parser, [
+ ('end', 'element'),
+ ('end', 'empty-element'),
+ ])
+ self._feed(parser, "</root>\n", chunk_size)
+ self.assert_event_tags(parser, [('end', 'root')])
+ self.assertIsNone(parser.close())
+
+ @fails_with_expat_2_6_0
+ def test_simple_xml_chunk_1(self):
+ self.test_simple_xml(chunk_size=1)
+
+ @fails_with_expat_2_6_0
+ def test_simple_xml_chunk_5(self):
+ self.test_simple_xml(chunk_size=5)
+
+ def test_simple_xml_chunk_22(self):
+ self.test_simple_xml(chunk_size=22)
def test_feed_while_iterating(self):
parser = ET.XMLPullParser()
diff --git a/Misc/NEWS.d/next/Library/2024-02-08-14-21-28.gh-issue-115133.ycl4ko.rst b/Misc/NEWS.d/next/Library/2024-02-08-14-21-28.gh-issue-115133.ycl4ko.rst
new file mode 100644
index 0000000000..6f1015235c
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2024-02-08-14-21-28.gh-issue-115133.ycl4ko.rst
@@ -0,0 +1,2 @@
+Fix tests for :class:`~xml.etree.ElementTree.XMLPullParser` with Expat
+2.6.0.


@ -1,161 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Matthias Klose <doko42@users.noreply.github.com>
Date: Mon, 30 Apr 2018 19:22:16 +0200
Subject: 00423: bpo-33377: Add triplets for mips-r6 and riscv
---
.../2018-04-30-16-53-00.bpo-33377.QBh6vP.rst | 2 +
configure | 42 ++++++++++++++++++-
configure.ac | 28 +++++++++++++
3 files changed, 71 insertions(+), 1 deletion(-)
create mode 100644 Misc/NEWS.d/next/Build/2018-04-30-16-53-00.bpo-33377.QBh6vP.rst
diff --git a/Misc/NEWS.d/next/Build/2018-04-30-16-53-00.bpo-33377.QBh6vP.rst b/Misc/NEWS.d/next/Build/2018-04-30-16-53-00.bpo-33377.QBh6vP.rst
new file mode 100644
index 0000000000..f5dbd23c7c
--- /dev/null
+++ b/Misc/NEWS.d/next/Build/2018-04-30-16-53-00.bpo-33377.QBh6vP.rst
@@ -0,0 +1,2 @@
+Add new triplets for mips r6 and riscv variants (used in extension
+suffixes).
diff --git a/configure b/configure
index 68a46deef5..6ea6e7d742 100755
--- a/configure
+++ b/configure
@@ -785,6 +785,7 @@ infodir
docdir
oldincludedir
includedir
+runstatedir
localstatedir
sharedstatedir
sysconfdir
@@ -898,6 +899,7 @@ datadir='${datarootdir}'
sysconfdir='${prefix}/etc'
sharedstatedir='${prefix}/com'
localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
includedir='${prefix}/include'
oldincludedir='/usr/include'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
@@ -1150,6 +1152,15 @@ do
| -silent | --silent | --silen | --sile | --sil)
silent=yes ;;
+ -runstatedir | --runstatedir | --runstatedi | --runstated \
+ | --runstate | --runstat | --runsta | --runst | --runs \
+ | --run | --ru | --r)
+ ac_prev=runstatedir ;;
+ -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+ | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+ | --run=* | --ru=* | --r=*)
+ runstatedir=$ac_optarg ;;
+
-sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
ac_prev=sbindir ;;
-sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1287,7 +1298,7 @@ fi
for ac_var in exec_prefix prefix bindir sbindir libexecdir datarootdir \
datadir sysconfdir sharedstatedir localstatedir includedir \
oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
- libdir localedir mandir
+ libdir localedir mandir runstatedir
do
eval ac_val=\$$ac_var
# Remove trailing slashes.
@@ -1440,6 +1451,7 @@ Fine tuning of the installation directories:
--sysconfdir=DIR read-only single-machine data [PREFIX/etc]
--sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com]
--localstatedir=DIR modifiable single-machine data [PREFIX/var]
+ --runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run]
--libdir=DIR object code libraries [EPREFIX/lib]
--includedir=DIR C header files [PREFIX/include]
--oldincludedir=DIR C header files for non-gcc [/usr/include]
@@ -5261,6 +5273,26 @@ cat >> conftest.c <<EOF
ia64-linux-gnu
# elif defined(__m68k__) && !defined(__mcoldfire__)
m68k-linux-gnu
+# elif defined(__mips_hard_float) && defined(__mips_isa_rev) && (__mips_isa_rev >=6) && defined(_MIPSEL)
+# if _MIPS_SIM == _ABIO32
+ mipsisa32r6el-linux-gnu
+# elif _MIPS_SIM == _ABIN32
+ mipsisa64r6el-linux-gnuabin32
+# elif _MIPS_SIM == _ABI64
+ mipsisa64r6el-linux-gnuabi64
+# else
+# error unknown platform triplet
+# endif
+# elif defined(__mips_hard_float) && defined(__mips_isa_rev) && (__mips_isa_rev >=6)
+# if _MIPS_SIM == _ABIO32
+ mipsisa32r6-linux-gnu
+# elif _MIPS_SIM == _ABIN32
+ mipsisa64r6-linux-gnuabin32
+# elif _MIPS_SIM == _ABI64
+ mipsisa64r6-linux-gnuabi64
+# else
+# error unknown platform triplet
+# endif
# elif defined(__mips_hard_float) && defined(_MIPSEL)
# if _MIPS_SIM == _ABIO32
mipsel-linux-gnu
@@ -5303,6 +5335,14 @@ cat >> conftest.c <<EOF
sparc64-linux-gnu
# elif defined(__sparc__)
sparc-linux-gnu
+# elif defined(__riscv)
+# if __riscv_xlen == 32
+ riscv32-linux-gnu
+# elif __riscv_xlen == 64
+ riscv64-linux-gnu
+# else
+# error unknown platform triplet
+# endif
# else
# error unknown platform triplet
# endif
diff --git a/configure.ac b/configure.ac
index 9d2ad9afba..046aed9d7c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -804,6 +804,26 @@ cat >> conftest.c <<EOF
ia64-linux-gnu
# elif defined(__m68k__) && !defined(__mcoldfire__)
m68k-linux-gnu
+# elif defined(__mips_hard_float) && defined(__mips_isa_rev) && (__mips_isa_rev >=6) && defined(_MIPSEL)
+# if _MIPS_SIM == _ABIO32
+ mipsisa32r6el-linux-gnu
+# elif _MIPS_SIM == _ABIN32
+ mipsisa64r6el-linux-gnuabin32
+# elif _MIPS_SIM == _ABI64
+ mipsisa64r6el-linux-gnuabi64
+# else
+# error unknown platform triplet
+# endif
+# elif defined(__mips_hard_float) && defined(__mips_isa_rev) && (__mips_isa_rev >=6)
+# if _MIPS_SIM == _ABIO32
+ mipsisa32r6-linux-gnu
+# elif _MIPS_SIM == _ABIN32
+ mipsisa64r6-linux-gnuabin32
+# elif _MIPS_SIM == _ABI64
+ mipsisa64r6-linux-gnuabi64
+# else
+# error unknown platform triplet
+# endif
# elif defined(__mips_hard_float) && defined(_MIPSEL)
# if _MIPS_SIM == _ABIO32
mipsel-linux-gnu
@@ -846,6 +866,14 @@ cat >> conftest.c <<EOF
sparc64-linux-gnu
# elif defined(__sparc__)
sparc-linux-gnu
+# elif defined(__riscv)
+# if __riscv_xlen == 32
+ riscv32-linux-gnu
+# elif __riscv_xlen == 64
+ riscv64-linux-gnu
+# else
+# error unknown platform triplet
+# endif
# else
# error unknown platform triplet
# endif


@ -1,295 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Lumir Balhar <lbalhar@redhat.com>
Date: Wed, 24 Apr 2024 00:19:23 +0200
Subject: 00426: CVE-2023-6597
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Combines two fixes for tempfile.TemporaryDirectory:
https://github.com/python/cpython/commit/e9b51c0ad81da1da11ae65840ac8b50a8521373c
https://github.com/python/cpython/commit/02a9259c717738dfe6b463c44d7e17f2b6d2cb3a
Co-authored-by: Søren Løvborg <sorenl@unity3d.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
---
Lib/tempfile.py | 44 +++++++++-
Lib/test/test_tempfile.py | 166 +++++++++++++++++++++++++++++++++++---
2 files changed, 199 insertions(+), 11 deletions(-)
diff --git a/Lib/tempfile.py b/Lib/tempfile.py
index 2cb5434ba7..8e401f548a 100644
--- a/Lib/tempfile.py
+++ b/Lib/tempfile.py
@@ -276,6 +276,23 @@ def _mkstemp_inner(dir, pre, suf, flags, output_type):
"No usable temporary file name found")
+def _dont_follow_symlinks(func, path, *args):
+ # Pass follow_symlinks=False, unless not supported on this platform.
+ if func in _os.supports_follow_symlinks:
+ func(path, *args, follow_symlinks=False)
+ elif _os.name == 'nt' or not _os.path.islink(path):
+ func(path, *args)
+
+
+def _resetperms(path):
+ try:
+ chflags = _os.chflags
+ except AttributeError:
+ pass
+ else:
+ _dont_follow_symlinks(chflags, path, 0)
+ _dont_follow_symlinks(_os.chmod, path, 0o700)
+
# User visible interfaces.
def gettempprefix():
@@ -794,9 +811,32 @@ class TemporaryDirectory(object):
self, self._cleanup, self.name,
warn_message="Implicitly cleaning up {!r}".format(self))
+ @classmethod
+ def _rmtree(cls, name):
+ def onerror(func, path, exc_info):
+ if issubclass(exc_info[0], PermissionError):
+ try:
+ if path != name:
+ _resetperms(_os.path.dirname(path))
+ _resetperms(path)
+
+ try:
+ _os.unlink(path)
+ # PermissionError is raised on FreeBSD for directories
+ except (IsADirectoryError, PermissionError):
+ cls._rmtree(path)
+ except FileNotFoundError:
+ pass
+ elif issubclass(exc_info[0], FileNotFoundError):
+ pass
+ else:
+ raise
+
+ _shutil.rmtree(name, onerror=onerror)
+
@classmethod
def _cleanup(cls, name, warn_message):
- _shutil.rmtree(name)
+ cls._rmtree(name)
_warnings.warn(warn_message, ResourceWarning)
def __repr__(self):
@@ -810,4 +850,4 @@ class TemporaryDirectory(object):
def cleanup(self):
if self._finalizer.detach():
- _shutil.rmtree(self.name)
+ self._rmtree(self.name)
diff --git a/Lib/test/test_tempfile.py b/Lib/test/test_tempfile.py
index 710756bde6..c5560e12e7 100644
--- a/Lib/test/test_tempfile.py
+++ b/Lib/test/test_tempfile.py
@@ -1298,19 +1298,25 @@ class NulledModules:
class TestTemporaryDirectory(BaseTestCase):
"""Test TemporaryDirectory()."""
- def do_create(self, dir=None, pre="", suf="", recurse=1):
+ def do_create(self, dir=None, pre="", suf="", recurse=1, dirs=1, files=1):
if dir is None:
dir = tempfile.gettempdir()
tmp = tempfile.TemporaryDirectory(dir=dir, prefix=pre, suffix=suf)
self.nameCheck(tmp.name, dir, pre, suf)
- # Create a subdirectory and some files
- if recurse:
- d1 = self.do_create(tmp.name, pre, suf, recurse-1)
- d1.name = None
- with open(os.path.join(tmp.name, "test.txt"), "wb") as f:
- f.write(b"Hello world!")
+ self.do_create2(tmp.name, recurse, dirs, files)
return tmp
+ def do_create2(self, path, recurse=1, dirs=1, files=1):
+ # Create subdirectories and some files
+ if recurse:
+ for i in range(dirs):
+ name = os.path.join(path, "dir%d" % i)
+ os.mkdir(name)
+ self.do_create2(name, recurse-1, dirs, files)
+ for i in range(files):
+ with open(os.path.join(path, "test%d.txt" % i), "wb") as f:
+ f.write(b"Hello world!")
+
def test_mkdtemp_failure(self):
# Check no additional exception if mkdtemp fails
# Previously would raise AttributeError instead
@@ -1350,11 +1356,108 @@ class TestTemporaryDirectory(BaseTestCase):
"TemporaryDirectory %s exists after cleanup" % d1.name)
self.assertTrue(os.path.exists(d2.name),
"Directory pointed to by a symlink was deleted")
- self.assertEqual(os.listdir(d2.name), ['test.txt'],
+ self.assertEqual(os.listdir(d2.name), ['test0.txt'],
"Contents of the directory pointed to by a symlink "
"were deleted")
d2.cleanup()
+ @support.skip_unless_symlink
+ def test_cleanup_with_symlink_modes(self):
+ # cleanup() should not follow symlinks when fixing mode bits (#91133)
+ with self.do_create(recurse=0) as d2:
+ file1 = os.path.join(d2, 'file1')
+ open(file1, 'wb').close()
+ dir1 = os.path.join(d2, 'dir1')
+ os.mkdir(dir1)
+ for mode in range(8):
+ mode <<= 6
+ with self.subTest(mode=format(mode, '03o')):
+ def test(target, target_is_directory):
+ d1 = self.do_create(recurse=0)
+ symlink = os.path.join(d1.name, 'symlink')
+ os.symlink(target, symlink,
+ target_is_directory=target_is_directory)
+ try:
+ os.chmod(symlink, mode, follow_symlinks=False)
+ except NotImplementedError:
+ pass
+ try:
+ os.chmod(symlink, mode)
+ except FileNotFoundError:
+ pass
+ os.chmod(d1.name, mode)
+ d1.cleanup()
+ self.assertFalse(os.path.exists(d1.name))
+
+ with self.subTest('nonexisting file'):
+ test('nonexisting', target_is_directory=False)
+ with self.subTest('nonexisting dir'):
+ test('nonexisting', target_is_directory=True)
+
+ with self.subTest('existing file'):
+ os.chmod(file1, mode)
+ old_mode = os.stat(file1).st_mode
+ test(file1, target_is_directory=False)
+ new_mode = os.stat(file1).st_mode
+ self.assertEqual(new_mode, old_mode,
+ '%03o != %03o' % (new_mode, old_mode))
+
+ with self.subTest('existing dir'):
+ os.chmod(dir1, mode)
+ old_mode = os.stat(dir1).st_mode
+ test(dir1, target_is_directory=True)
+ new_mode = os.stat(dir1).st_mode
+ self.assertEqual(new_mode, old_mode,
+ '%03o != %03o' % (new_mode, old_mode))
+
+ @unittest.skipUnless(hasattr(os, 'chflags'), 'requires os.chflags')
+ @support.skip_unless_symlink
+ def test_cleanup_with_symlink_flags(self):
+ # cleanup() should not follow symlinks when fixing flags (#91133)
+ flags = stat.UF_IMMUTABLE | stat.UF_NOUNLINK
+ self.check_flags(flags)
+
+ with self.do_create(recurse=0) as d2:
+ file1 = os.path.join(d2, 'file1')
+ open(file1, 'wb').close()
+ dir1 = os.path.join(d2, 'dir1')
+ os.mkdir(dir1)
+ def test(target, target_is_directory):
+ d1 = self.do_create(recurse=0)
+ symlink = os.path.join(d1.name, 'symlink')
+ os.symlink(target, symlink,
+ target_is_directory=target_is_directory)
+ try:
+ os.chflags(symlink, flags, follow_symlinks=False)
+ except NotImplementedError:
+ pass
+ try:
+ os.chflags(symlink, flags)
+ except FileNotFoundError:
+ pass
+ os.chflags(d1.name, flags)
+ d1.cleanup()
+ self.assertFalse(os.path.exists(d1.name))
+
+ with self.subTest('nonexisting file'):
+ test('nonexisting', target_is_directory=False)
+ with self.subTest('nonexisting dir'):
+ test('nonexisting', target_is_directory=True)
+
+ with self.subTest('existing file'):
+ os.chflags(file1, flags)
+ old_flags = os.stat(file1).st_flags
+ test(file1, target_is_directory=False)
+ new_flags = os.stat(file1).st_flags
+ self.assertEqual(new_flags, old_flags)
+
+ with self.subTest('existing dir'):
+ os.chflags(dir1, flags)
+ old_flags = os.stat(dir1).st_flags
+ test(dir1, target_is_directory=True)
+ new_flags = os.stat(dir1).st_flags
+ self.assertEqual(new_flags, old_flags)
+
@support.cpython_only
def test_del_on_collection(self):
# A TemporaryDirectory is deleted when garbage collected
@@ -1385,7 +1488,7 @@ class TestTemporaryDirectory(BaseTestCase):
tmp2 = os.path.join(tmp.name, 'test_dir')
os.mkdir(tmp2)
- with open(os.path.join(tmp2, "test.txt"), "w") as f:
+ with open(os.path.join(tmp2, "test0.txt"), "w") as f:
f.write("Hello world!")
{mod}.tmp = tmp
@@ -1453,6 +1556,51 @@ class TestTemporaryDirectory(BaseTestCase):
self.assertEqual(name, d.name)
self.assertFalse(os.path.exists(name))
+ def test_modes(self):
+ for mode in range(8):
+ mode <<= 6
+ with self.subTest(mode=format(mode, '03o')):
+ d = self.do_create(recurse=3, dirs=2, files=2)
+ with d:
+ # Change files and directories mode recursively.
+ for root, dirs, files in os.walk(d.name, topdown=False):
+ for name in files:
+ os.chmod(os.path.join(root, name), mode)
+ os.chmod(root, mode)
+ d.cleanup()
+ self.assertFalse(os.path.exists(d.name))
+
+ def check_flags(self, flags):
+ # skip the test if these flags are not supported (ex: FreeBSD 13)
+ filename = support.TESTFN
+ try:
+ open(filename, "w").close()
+ try:
+ os.chflags(filename, flags)
+ except OSError as exc:
+ # "OSError: [Errno 45] Operation not supported"
+ self.skipTest(f"chflags() doesn't support flags "
+ f"{flags:#b}: {exc}")
+ else:
+ os.chflags(filename, 0)
+ finally:
+ support.unlink(filename)
+
+ @unittest.skipUnless(hasattr(os, 'chflags'), 'requires os.chflags')
+ def test_flags(self):
+ flags = stat.UF_IMMUTABLE | stat.UF_NOUNLINK
+ self.check_flags(flags)
+
+ d = self.do_create(recurse=3, dirs=2, files=2)
+ with d:
+ # Change files and directories flags recursively.
+ for root, dirs, files in os.walk(d.name, topdown=False):
+ for name in files:
+ os.chflags(os.path.join(root, name), flags)
+ os.chflags(root, flags)
+ d.cleanup()
+ self.assertFalse(os.path.exists(d.name))
+
if __name__ == "__main__":
unittest.main()


@ -1,314 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: John Jolly <john.jolly@gmail.com>
Date: Tue, 30 Jan 2018 01:51:35 -0700
Subject: 00427: ZipExtFile tell and seek, CVE-2024-0450
Backport of seek and tell methods for ZipExtFile makes it
possible to backport the fix for CVE-2024-0450.
Combines:
https://github.com/python/cpython/commit/066df4fd454d6ff9be66e80b2a65995b10af174f
https://github.com/python/cpython/commit/66363b9a7b9fe7c99eba3a185b74c5fdbf842eba
---
Doc/library/zipfile.rst | 6 +-
Lib/test/test_zipfile.py | 94 +++++++++++++++++++
Lib/zipfile.py | 94 +++++++++++++++++++
.../2017-12-21-22-00-11.bpo-22908.cVm89I.rst | 2 +
...-09-28-13-15-51.gh-issue-109858.43e2dg.rst | 3 +
5 files changed, 196 insertions(+), 3 deletions(-)
create mode 100644 Misc/NEWS.d/next/Library/2017-12-21-22-00-11.bpo-22908.cVm89I.rst
create mode 100644 Misc/NEWS.d/next/Library/2023-09-28-13-15-51.gh-issue-109858.43e2dg.rst
diff --git a/Doc/library/zipfile.rst b/Doc/library/zipfile.rst
index b65b61d8da..5c28ce52c2 100644
--- a/Doc/library/zipfile.rst
+++ b/Doc/library/zipfile.rst
@@ -230,9 +230,9 @@ ZipFile Objects
With *mode* ``'r'`` the file-like object
(``ZipExtFile``) is read-only and provides the following methods:
:meth:`~io.BufferedIOBase.read`, :meth:`~io.IOBase.readline`,
- :meth:`~io.IOBase.readlines`, :meth:`__iter__`,
- :meth:`~iterator.__next__`. These objects can operate independently of
- the ZipFile.
+ :meth:`~io.IOBase.readlines`, :meth:`~io.IOBase.seek`,
+ :meth:`~io.IOBase.tell`, :meth:`__iter__`, :meth:`~iterator.__next__`.
+ These objects can operate independently of the ZipFile.
With ``mode='w'``, a writable file handle is returned, which supports the
:meth:`~io.BufferedIOBase.write` method. While a writable file handle is open,
diff --git a/Lib/test/test_zipfile.py b/Lib/test/test_zipfile.py
index e62b82e1d3..03799090b9 100644
--- a/Lib/test/test_zipfile.py
+++ b/Lib/test/test_zipfile.py
@@ -1610,6 +1610,100 @@ class OtherTests(unittest.TestCase):
self.assertEqual(zipf.read('baz'), msg3)
self.assertEqual(zipf.namelist(), ['foo', 'bar', 'baz'])
+ def test_seek_tell(self):
+ # Test seek functionality
+ txt = b"Where's Bruce?"
+ bloc = txt.find(b"Bruce")
+ # Check seek on a file
+ with zipfile.ZipFile(TESTFN, "w") as zipf:
+ zipf.writestr("foo.txt", txt)
+ with zipfile.ZipFile(TESTFN, "r") as zipf:
+ with zipf.open("foo.txt", "r") as fp:
+ fp.seek(bloc, os.SEEK_SET)
+ self.assertEqual(fp.tell(), bloc)
+ fp.seek(-bloc, os.SEEK_CUR)
+ self.assertEqual(fp.tell(), 0)
+ fp.seek(bloc, os.SEEK_CUR)
+ self.assertEqual(fp.tell(), bloc)
+ self.assertEqual(fp.read(5), txt[bloc:bloc+5])
+ fp.seek(0, os.SEEK_END)
+ self.assertEqual(fp.tell(), len(txt))
+ # Check seek on memory file
+ data = io.BytesIO()
+ with zipfile.ZipFile(data, mode="w") as zipf:
+ zipf.writestr("foo.txt", txt)
+ with zipfile.ZipFile(data, mode="r") as zipf:
+ with zipf.open("foo.txt", "r") as fp:
+ fp.seek(bloc, os.SEEK_SET)
+ self.assertEqual(fp.tell(), bloc)
+ fp.seek(-bloc, os.SEEK_CUR)
+ self.assertEqual(fp.tell(), 0)
+ fp.seek(bloc, os.SEEK_CUR)
+ self.assertEqual(fp.tell(), bloc)
+ self.assertEqual(fp.read(5), txt[bloc:bloc+5])
+ fp.seek(0, os.SEEK_END)
+ self.assertEqual(fp.tell(), len(txt))
+
+ @requires_zlib
+ def test_full_overlap(self):
+ data = (
+ b'PK\x03\x04\x14\x00\x00\x00\x08\x00\xa0lH\x05\xe2\x1e'
+ b'8\xbb\x10\x00\x00\x00\t\x04\x00\x00\x01\x00\x00\x00a\xed'
+ b'\xc0\x81\x08\x00\x00\x00\xc00\xd6\xfbK\\d\x0b`P'
+ b'K\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xa0lH\x05\xe2'
+ b'\x1e8\xbb\x10\x00\x00\x00\t\x04\x00\x00\x01\x00\x00\x00\x00'
+ b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00aPK'
+ b'\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xa0lH\x05\xe2\x1e'
+ b'8\xbb\x10\x00\x00\x00\t\x04\x00\x00\x01\x00\x00\x00\x00\x00'
+ b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00bPK\x05'
+ b'\x06\x00\x00\x00\x00\x02\x00\x02\x00^\x00\x00\x00/\x00\x00'
+ b'\x00\x00\x00'
+ )
+ with zipfile.ZipFile(io.BytesIO(data), 'r') as zipf:
+ self.assertEqual(zipf.namelist(), ['a', 'b'])
+ zi = zipf.getinfo('a')
+ self.assertEqual(zi.header_offset, 0)
+ self.assertEqual(zi.compress_size, 16)
+ self.assertEqual(zi.file_size, 1033)
+ zi = zipf.getinfo('b')
+ self.assertEqual(zi.header_offset, 0)
+ self.assertEqual(zi.compress_size, 16)
+ self.assertEqual(zi.file_size, 1033)
+ self.assertEqual(len(zipf.read('a')), 1033)
+ with self.assertRaisesRegex(zipfile.BadZipFile, 'File name.*differ'):
+ zipf.read('b')
+
+ @requires_zlib
+ def test_quoted_overlap(self):
+ data = (
+ b'PK\x03\x04\x14\x00\x00\x00\x08\x00\xa0lH\x05Y\xfc'
+ b'8\x044\x00\x00\x00(\x04\x00\x00\x01\x00\x00\x00a\x00'
+ b'\x1f\x00\xe0\xffPK\x03\x04\x14\x00\x00\x00\x08\x00\xa0l'
+ b'H\x05\xe2\x1e8\xbb\x10\x00\x00\x00\t\x04\x00\x00\x01\x00'
+ b'\x00\x00b\xed\xc0\x81\x08\x00\x00\x00\xc00\xd6\xfbK\\'
+ b'd\x0b`PK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xa0'
+ b'lH\x05Y\xfc8\x044\x00\x00\x00(\x04\x00\x00\x01'
+ b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
+ b'\x00aPK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xa0l'
+ b'H\x05\xe2\x1e8\xbb\x10\x00\x00\x00\t\x04\x00\x00\x01\x00'
+ b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00$\x00\x00\x00'
+ b'bPK\x05\x06\x00\x00\x00\x00\x02\x00\x02\x00^\x00\x00'
+ b'\x00S\x00\x00\x00\x00\x00'
+ )
+ with zipfile.ZipFile(io.BytesIO(data), 'r') as zipf:
+ self.assertEqual(zipf.namelist(), ['a', 'b'])
+ zi = zipf.getinfo('a')
+ self.assertEqual(zi.header_offset, 0)
+ self.assertEqual(zi.compress_size, 52)
+ self.assertEqual(zi.file_size, 1064)
+ zi = zipf.getinfo('b')
+ self.assertEqual(zi.header_offset, 36)
+ self.assertEqual(zi.compress_size, 16)
+ self.assertEqual(zi.file_size, 1033)
+ with self.assertRaisesRegex(zipfile.BadZipFile, 'Overlapped entries'):
+ zipf.read('a')
+ self.assertEqual(len(zipf.read('b')), 1033)
+
def tearDown(self):
unlink(TESTFN)
unlink(TESTFN2)
diff --git a/Lib/zipfile.py b/Lib/zipfile.py
index edde0c5fd4..e6d7676079 100644
--- a/Lib/zipfile.py
+++ b/Lib/zipfile.py
@@ -338,6 +338,7 @@ class ZipInfo (object):
'compress_size',
'file_size',
'_raw_time',
+ '_end_offset',
)
def __init__(self, filename="NoName", date_time=(1980,1,1,0,0,0)):
@@ -376,6 +377,7 @@ class ZipInfo (object):
self.volume = 0 # Volume number of file header
self.internal_attr = 0 # Internal attributes
self.external_attr = 0 # External file attributes
+ self._end_offset = None # Start of the next local header or central directory
# Other attributes are set by class ZipFile:
# header_offset Byte offset to the file header
# CRC CRC-32 of the uncompressed file
@@ -718,6 +720,18 @@ class _SharedFile:
self._close = close
self._lock = lock
self._writing = writing
+ self.seekable = file.seekable
+ self.tell = file.tell
+
+ def seek(self, offset, whence=0):
+ with self._lock:
+ if self._writing():
+ raise ValueError("Can't reposition in the ZIP file while "
+ "there is an open writing handle on it. "
+ "Close the writing handle before trying to read.")
+ self._file.seek(offset, whence)
+ self._pos = self._file.tell()
+ return self._pos
def read(self, n=-1):
with self._lock:
@@ -768,6 +782,9 @@ class ZipExtFile(io.BufferedIOBase):
# Read from compressed files in 4k blocks.
MIN_READ_SIZE = 4096
+ # Chunk size to read during seek
+ MAX_SEEK_READ = 1 << 24
+
def __init__(self, fileobj, mode, zipinfo, decrypter=None,
close_fileobj=False):
self._fileobj = fileobj
@@ -800,6 +817,17 @@ class ZipExtFile(io.BufferedIOBase):
else:
self._expected_crc = None
+ self._seekable = False
+ try:
+ if fileobj.seekable():
+ self._orig_compress_start = fileobj.tell()
+ self._orig_compress_size = zipinfo.compress_size
+ self._orig_file_size = zipinfo.file_size
+ self._orig_start_crc = self._running_crc
+ self._seekable = True
+ except AttributeError:
+ pass
+
def __repr__(self):
result = ['<%s.%s' % (self.__class__.__module__,
self.__class__.__qualname__)]
@@ -985,6 +1013,62 @@ class ZipExtFile(io.BufferedIOBase):
finally:
super().close()
+ def seekable(self):
+ return self._seekable
+
+ def seek(self, offset, whence=0):
+ if not self._seekable:
+ raise io.UnsupportedOperation("underlying stream is not seekable")
+ curr_pos = self.tell()
+ if whence == 0: # Seek from start of file
+ new_pos = offset
+ elif whence == 1: # Seek from current position
+ new_pos = curr_pos + offset
+ elif whence == 2: # Seek from EOF
+ new_pos = self._orig_file_size + offset
+ else:
+ raise ValueError("whence must be os.SEEK_SET (0), "
+ "os.SEEK_CUR (1), or os.SEEK_END (2)")
+
+ if new_pos > self._orig_file_size:
+ new_pos = self._orig_file_size
+
+ if new_pos < 0:
+ new_pos = 0
+
+ read_offset = new_pos - curr_pos
+ buff_offset = read_offset + self._offset
+
+ if buff_offset >= 0 and buff_offset < len(self._readbuffer):
+ # Just move the _offset index if the new position is in the _readbuffer
+ self._offset = buff_offset
+ read_offset = 0
+ elif read_offset < 0:
+ # Position is before the current position. Reset the ZipExtFile
+
+ self._fileobj.seek(self._orig_compress_start)
+ self._running_crc = self._orig_start_crc
+ self._compress_left = self._orig_compress_size
+ self._left = self._orig_file_size
+ self._readbuffer = b''
+ self._offset = 0
+ self._decompressor = _get_decompressor(self._compress_type)
+ self._eof = False
+ read_offset = new_pos
+
+ while read_offset > 0:
+ read_len = min(self.MAX_SEEK_READ, read_offset)
+ self.read(read_len)
+ read_offset -= read_len
+
+ return self.tell()
+
+ def tell(self):
+ if not self._seekable:
+ raise io.UnsupportedOperation("underlying stream is not seekable")
+ filepos = self._orig_file_size - self._left - len(self._readbuffer) + self._offset
+ return filepos
+
class _ZipWriteFile(io.BufferedIOBase):
def __init__(self, zf, zinfo, zip64):
@@ -1264,6 +1348,12 @@ class ZipFile:
if self.debug > 2:
print("total", total)
+ end_offset = self.start_dir
+ for zinfo in sorted(self.filelist,
+ key=lambda zinfo: zinfo.header_offset,
+ reverse=True):
+ zinfo._end_offset = end_offset
+ end_offset = zinfo.header_offset
def namelist(self):
"""Return a list of file names in the archive."""
@@ -1418,6 +1508,10 @@ class ZipFile:
'File name in directory %r and header %r differ.'
% (zinfo.orig_filename, fname))
+ if (zinfo._end_offset is not None and
+ zef_file.tell() + zinfo.compress_size > zinfo._end_offset):
+ raise BadZipFile(f"Overlapped entries: {zinfo.orig_filename!r} (possible zip bomb)")
+
# check for encrypted flag & handle password
is_encrypted = zinfo.flag_bits & 0x1
zd = None
diff --git a/Misc/NEWS.d/next/Library/2017-12-21-22-00-11.bpo-22908.cVm89I.rst b/Misc/NEWS.d/next/Library/2017-12-21-22-00-11.bpo-22908.cVm89I.rst
new file mode 100644
index 0000000000..4f3cc01660
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2017-12-21-22-00-11.bpo-22908.cVm89I.rst
@@ -0,0 +1,2 @@
+Added seek and tell to the ZipExtFile class. This only works if the file
+object used to open the zipfile is seekable.
diff --git a/Misc/NEWS.d/next/Library/2023-09-28-13-15-51.gh-issue-109858.43e2dg.rst b/Misc/NEWS.d/next/Library/2023-09-28-13-15-51.gh-issue-109858.43e2dg.rst
new file mode 100644
index 0000000000..be279caffc
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2023-09-28-13-15-51.gh-issue-109858.43e2dg.rst
@@ -0,0 +1,3 @@
+Protect :mod:`zipfile` from "quoted-overlap" zipbomb. It now raises
+BadZipFile when trying to read an entry that overlaps with another entry or
+the central directory.
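
A short usage sketch of the `seek` and `tell` methods this patch backports to `ZipExtFile`, assuming an interpreter that already includes them (upstream Python 3.7 or later, or a build carrying this patch). The archive is built in memory so the underlying file object is seekable; the member name and payload are borrowed from the test above.

```python
import io
import os
import zipfile

payload = b"Where's Bruce?"

# Build a one-member archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("foo.txt", payload)

with zipfile.ZipFile(buf, "r") as zf:
    with zf.open("foo.txt") as fp:
        can_seek = fp.seekable()
        fp.seek(payload.find(b"Bruce"), os.SEEK_SET)
        pos = fp.tell()          # offset within the uncompressed member
        word = fp.read(5)
        fp.seek(0, os.SEEK_END)
        end = fp.tell()
```

Seeking backwards resets the decompressor and re-reads from the start of the member, so it can be slow on large compressed entries.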


@ -1,356 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Petr Viktorin <encukou@gmail.com>
Date: Tue, 7 May 2024 11:58:20 +0200
Subject: 00431: CVE-2024-4032: incorrect IPv4 and IPv6 private ranges
Upstream issue: https://github.com/python/cpython/issues/113171
Backported from 3.8.
---
Doc/library/ipaddress.rst | 43 ++++++++-
Doc/tools/susp-ignored.csv | 8 ++
Lib/ipaddress.py | 95 +++++++++++++++----
Lib/test/test_ipaddress.py | 52 ++++++++++
...-03-14-01-38-44.gh-issue-113171.VFnObz.rst | 9 ++
5 files changed, 186 insertions(+), 21 deletions(-)
create mode 100644 Misc/NEWS.d/next/Library/2024-03-14-01-38-44.gh-issue-113171.VFnObz.rst
diff --git a/Doc/library/ipaddress.rst b/Doc/library/ipaddress.rst
index 4ce1ed1ced..18613babc9 100644
--- a/Doc/library/ipaddress.rst
+++ b/Doc/library/ipaddress.rst
@@ -166,18 +166,53 @@ write code that handles both IP versions correctly. Address objects are
.. attribute:: is_private
- ``True`` if the address is allocated for private networks. See
+ ``True`` if the address is defined as not globally reachable by
iana-ipv4-special-registry_ (for IPv4) or iana-ipv6-special-registry_
- (for IPv6).
+ (for IPv6) with the following exceptions:
+
+ * ``is_private`` is ``False`` for the shared address space (``100.64.0.0/10``)
+ * For IPv4-mapped IPv6-addresses the ``is_private`` value is determined by the
+ semantics of the underlying IPv4 addresses and the following condition holds
+ (see :attr:`IPv6Address.ipv4_mapped`)::
+
+ address.is_private == address.ipv4_mapped.is_private
+
+ ``is_private`` has value opposite to :attr:`is_global`, except for the shared address space
+ (``100.64.0.0/10`` range) where they are both ``False``.
+
+ .. versionchanged:: 3.8.20
+
+ Fixed some false positives and false negatives.
+
+ * ``192.0.0.0/24`` is considered private with the exception of ``192.0.0.9/32`` and
+ ``192.0.0.10/32`` (previously: only the ``192.0.0.0/29`` sub-range was considered private).
+ * ``64:ff9b:1::/48`` is considered private.
+ * ``2002::/16`` is considered private.
+ * There are exceptions within ``2001::/23`` (otherwise considered private): ``2001:1::1/128``,
+ ``2001:1::2/128``, ``2001:3::/32``, ``2001:4:112::/48``, ``2001:20::/28``, ``2001:30::/28``.
+ The exceptions are not considered private.
.. attribute:: is_global
- ``True`` if the address is allocated for public networks. See
+ ``True`` if the address is defined as globally reachable by
iana-ipv4-special-registry_ (for IPv4) or iana-ipv6-special-registry_
- (for IPv6).
+ (for IPv6) with the following exception:
+
+ For IPv4-mapped IPv6-addresses the ``is_global`` value is determined by the
+ semantics of the underlying IPv4 addresses and the following condition holds
+ (see :attr:`IPv6Address.ipv4_mapped`)::
+
+ address.is_global == address.ipv4_mapped.is_global
+
+ ``is_global`` has value opposite to :attr:`is_private`, except for the shared address space
+ (``100.64.0.0/10`` range) where they are both ``False``.
.. versionadded:: 3.4
+ .. versionchanged:: 3.8.20
+
+ Fixed some false positives and false negatives, see :attr:`is_private` for details.
+
.. attribute:: is_unspecified
``True`` if the address is unspecified. See :RFC:`5735` (for IPv4)
diff --git a/Doc/tools/susp-ignored.csv b/Doc/tools/susp-ignored.csv
index ed434ce77d..6bc0741b12 100644
--- a/Doc/tools/susp-ignored.csv
+++ b/Doc/tools/susp-ignored.csv
@@ -160,6 +160,14 @@ library/ipaddress,,:db00,2001:db00::0/24
library/ipaddress,,::,2001:db00::0/24
library/ipaddress,,:db00,2001:db00::0/ffff:ff00::
library/ipaddress,,::,2001:db00::0/ffff:ff00::
+library/ipaddress,,:ff9b,64:ff9b:1::/48
+library/ipaddress,,::,64:ff9b:1::/48
+library/ipaddress,,::,2001::
+library/ipaddress,,::,2001:1::
+library/ipaddress,,::,2001:3::
+library/ipaddress,,::,2001:4:112::
+library/ipaddress,,::,2001:20::
+library/ipaddress,,::,2001:30::
library/itertools,,:step,elements from seq[start:stop:step]
library/itertools,,:stop,elements from seq[start:stop:step]
library/logging.handlers,,:port,host:port
diff --git a/Lib/ipaddress.py b/Lib/ipaddress.py
index 98492136ca..55d4d62d70 100644
--- a/Lib/ipaddress.py
+++ b/Lib/ipaddress.py
@@ -1302,18 +1302,41 @@ class IPv4Address(_BaseV4, _BaseAddress):
@property
@functools.lru_cache()
def is_private(self):
- """Test if this address is allocated for private networks.
+ """``True`` if the address is defined as not globally reachable by
+ iana-ipv4-special-registry_ (for IPv4) or iana-ipv6-special-registry_
+ (for IPv6) with the following exceptions:
- Returns:
- A boolean, True if the address is reserved per
- iana-ipv4-special-registry.
+ * ``is_private`` is ``False`` for ``100.64.0.0/10``
+ * For IPv4-mapped IPv6-addresses the ``is_private`` value is determined by the
+ semantics of the underlying IPv4 addresses and the following condition holds
+ (see :attr:`IPv6Address.ipv4_mapped`)::
+ address.is_private == address.ipv4_mapped.is_private
+
+ ``is_private`` has value opposite to :attr:`is_global`, except for the ``100.64.0.0/10``
+ IPv4 range where they are both ``False``.
"""
- return any(self in net for net in self._constants._private_networks)
+ return (
+ any(self in net for net in self._constants._private_networks)
+ and all(self not in net for net in self._constants._private_networks_exceptions)
+ )
@property
@functools.lru_cache()
def is_global(self):
+ """``True`` if the address is defined as globally reachable by
+ iana-ipv4-special-registry_ (for IPv4) or iana-ipv6-special-registry_
+ (for IPv6) with the following exception:
+
+ For IPv4-mapped IPv6-addresses the ``is_global`` value is determined by the
+ semantics of the underlying IPv4 addresses and the following condition holds
+ (see :attr:`IPv6Address.ipv4_mapped`)::
+
+ address.is_global == address.ipv4_mapped.is_global
+
+ ``is_global`` has value opposite to :attr:`is_private`, except for the ``100.64.0.0/10``
+ IPv4 range where they are both ``False``.
+ """
return self not in self._constants._public_network and not self.is_private
@property
@@ -1548,13 +1571,15 @@ class _IPv4Constants:
_public_network = IPv4Network('100.64.0.0/10')
+ # Not globally reachable address blocks listed on
+ # https://www.iana.org/assignments/iana-ipv4-special-registry/iana-ipv4-special-registry.xhtml
_private_networks = [
IPv4Network('0.0.0.0/8'),
IPv4Network('10.0.0.0/8'),
IPv4Network('127.0.0.0/8'),
IPv4Network('169.254.0.0/16'),
IPv4Network('172.16.0.0/12'),
- IPv4Network('192.0.0.0/29'),
+ IPv4Network('192.0.0.0/24'),
IPv4Network('192.0.0.170/31'),
IPv4Network('192.0.2.0/24'),
IPv4Network('192.168.0.0/16'),
@@ -1565,6 +1590,11 @@ class _IPv4Constants:
IPv4Network('255.255.255.255/32'),
]
+ _private_networks_exceptions = [
+ IPv4Network('192.0.0.9/32'),
+ IPv4Network('192.0.0.10/32'),
+ ]
+
_reserved_network = IPv4Network('240.0.0.0/4')
_unspecified_address = IPv4Address('0.0.0.0')
@@ -1953,23 +1983,42 @@ class IPv6Address(_BaseV6, _BaseAddress):
@property
@functools.lru_cache()
def is_private(self):
- """Test if this address is allocated for private networks.
+ """``True`` if the address is defined as not globally reachable by
+ iana-ipv4-special-registry_ (for IPv4) or iana-ipv6-special-registry_
+ (for IPv6) with the following exceptions:
- Returns:
- A boolean, True if the address is reserved per
- iana-ipv6-special-registry.
+ * ``is_private`` is ``False`` for ``100.64.0.0/10``
+ * For IPv4-mapped IPv6-addresses the ``is_private`` value is determined by the
+ semantics of the underlying IPv4 addresses and the following condition holds
+ (see :attr:`IPv6Address.ipv4_mapped`)::
+ address.is_private == address.ipv4_mapped.is_private
+
+ ``is_private`` has value opposite to :attr:`is_global`, except for the ``100.64.0.0/10``
+ IPv4 range where they are both ``False``.
"""
- return any(self in net for net in self._constants._private_networks)
+ ipv4_mapped = self.ipv4_mapped
+ if ipv4_mapped is not None:
+ return ipv4_mapped.is_private
+ return (
+ any(self in net for net in self._constants._private_networks)
+ and all(self not in net for net in self._constants._private_networks_exceptions)
+ )
@property
def is_global(self):
- """Test if this address is allocated for public networks.
+ """``True`` if the address is defined as globally reachable by
+ iana-ipv4-special-registry_ (for IPv4) or iana-ipv6-special-registry_
+ (for IPv6) with the following exception:
- Returns:
- A boolean, true if the address is not reserved per
- iana-ipv6-special-registry.
+ For IPv4-mapped IPv6-addresses the ``is_global`` value is determined by the
+ semantics of the underlying IPv4 addresses and the following condition holds
+ (see :attr:`IPv6Address.ipv4_mapped`)::
+ address.is_global == address.ipv4_mapped.is_global
+
+ ``is_global`` has value opposite to :attr:`is_private`, except for the ``100.64.0.0/10``
+ IPv4 range where they are both ``False``.
"""
return not self.is_private
@@ -2236,19 +2285,31 @@ class _IPv6Constants:
_multicast_network = IPv6Network('ff00::/8')
+ # Not globally reachable address blocks listed on
+ # https://www.iana.org/assignments/iana-ipv6-special-registry/iana-ipv6-special-registry.xhtml
_private_networks = [
IPv6Network('::1/128'),
IPv6Network('::/128'),
IPv6Network('::ffff:0:0/96'),
+ IPv6Network('64:ff9b:1::/48'),
IPv6Network('100::/64'),
IPv6Network('2001::/23'),
- IPv6Network('2001:2::/48'),
IPv6Network('2001:db8::/32'),
- IPv6Network('2001:10::/28'),
+ # IANA says N/A, let's consider it not globally reachable to be safe
+ IPv6Network('2002::/16'),
IPv6Network('fc00::/7'),
IPv6Network('fe80::/10'),
]
+ _private_networks_exceptions = [
+ IPv6Network('2001:1::1/128'),
+ IPv6Network('2001:1::2/128'),
+ IPv6Network('2001:3::/32'),
+ IPv6Network('2001:4:112::/48'),
+ IPv6Network('2001:20::/28'),
+ IPv6Network('2001:30::/28'),
+ ]
+
_reserved_networks = [
IPv6Network('::/8'), IPv6Network('100::/8'),
IPv6Network('200::/7'), IPv6Network('400::/6'),
diff --git a/Lib/test/test_ipaddress.py b/Lib/test/test_ipaddress.py
index 7de444af4a..716846b2ae 100644
--- a/Lib/test/test_ipaddress.py
+++ b/Lib/test/test_ipaddress.py
@@ -1665,6 +1665,10 @@ class IpaddrUnitTest(unittest.TestCase):
self.assertEqual(True, ipaddress.ip_address(
'172.31.255.255').is_private)
self.assertEqual(False, ipaddress.ip_address('172.32.0.0').is_private)
+ self.assertFalse(ipaddress.ip_address('192.0.0.0').is_global)
+ self.assertTrue(ipaddress.ip_address('192.0.0.9').is_global)
+ self.assertTrue(ipaddress.ip_address('192.0.0.10').is_global)
+ self.assertFalse(ipaddress.ip_address('192.0.0.255').is_global)
self.assertEqual(True,
ipaddress.ip_address('169.254.100.200').is_link_local)
@@ -1680,6 +1684,40 @@ class IpaddrUnitTest(unittest.TestCase):
self.assertEqual(False, ipaddress.ip_address('128.0.0.0').is_loopback)
self.assertEqual(True, ipaddress.ip_network('0.0.0.0').is_unspecified)
+ def testPrivateNetworks(self):
+ self.assertEqual(True, ipaddress.ip_network("0.0.0.0/0").is_private)
+ self.assertEqual(False, ipaddress.ip_network("1.0.0.0/8").is_private)
+
+ self.assertEqual(True, ipaddress.ip_network("0.0.0.0/8").is_private)
+ self.assertEqual(True, ipaddress.ip_network("10.0.0.0/8").is_private)
+ self.assertEqual(True, ipaddress.ip_network("127.0.0.0/8").is_private)
+ self.assertEqual(True, ipaddress.ip_network("169.254.0.0/16").is_private)
+ self.assertEqual(True, ipaddress.ip_network("172.16.0.0/12").is_private)
+ self.assertEqual(True, ipaddress.ip_network("192.0.0.0/29").is_private)
+ self.assertEqual(False, ipaddress.ip_network("192.0.0.9/32").is_private)
+ self.assertEqual(True, ipaddress.ip_network("192.0.0.170/31").is_private)
+ self.assertEqual(True, ipaddress.ip_network("192.0.2.0/24").is_private)
+ self.assertEqual(True, ipaddress.ip_network("192.168.0.0/16").is_private)
+ self.assertEqual(True, ipaddress.ip_network("198.18.0.0/15").is_private)
+ self.assertEqual(True, ipaddress.ip_network("198.51.100.0/24").is_private)
+ self.assertEqual(True, ipaddress.ip_network("203.0.113.0/24").is_private)
+ self.assertEqual(True, ipaddress.ip_network("240.0.0.0/4").is_private)
+ self.assertEqual(True, ipaddress.ip_network("255.255.255.255/32").is_private)
+
+ self.assertEqual(False, ipaddress.ip_network("::/0").is_private)
+ self.assertEqual(False, ipaddress.ip_network("::ff/128").is_private)
+
+ self.assertEqual(True, ipaddress.ip_network("::1/128").is_private)
+ self.assertEqual(True, ipaddress.ip_network("::/128").is_private)
+ self.assertEqual(True, ipaddress.ip_network("::ffff:0:0/96").is_private)
+ self.assertEqual(True, ipaddress.ip_network("100::/64").is_private)
+ self.assertEqual(True, ipaddress.ip_network("2001:2::/48").is_private)
+ self.assertEqual(False, ipaddress.ip_network("2001:3::/48").is_private)
+ self.assertEqual(True, ipaddress.ip_network("2001:db8::/32").is_private)
+ self.assertEqual(True, ipaddress.ip_network("2001:10::/28").is_private)
+ self.assertEqual(True, ipaddress.ip_network("fc00::/7").is_private)
+ self.assertEqual(True, ipaddress.ip_network("fe80::/10").is_private)
+
def testReservedIpv6(self):
self.assertEqual(True, ipaddress.ip_network('ffff::').is_multicast)
@@ -1753,6 +1791,20 @@ class IpaddrUnitTest(unittest.TestCase):
self.assertEqual(True, ipaddress.ip_address('0::0').is_unspecified)
self.assertEqual(False, ipaddress.ip_address('::1').is_unspecified)
+ self.assertFalse(ipaddress.ip_address('64:ff9b:1::').is_global)
+ self.assertFalse(ipaddress.ip_address('2001::').is_global)
+ self.assertTrue(ipaddress.ip_address('2001:1::1').is_global)
+ self.assertTrue(ipaddress.ip_address('2001:1::2').is_global)
+ self.assertFalse(ipaddress.ip_address('2001:2::').is_global)
+ self.assertTrue(ipaddress.ip_address('2001:3::').is_global)
+ self.assertFalse(ipaddress.ip_address('2001:4::').is_global)
+ self.assertTrue(ipaddress.ip_address('2001:4:112::').is_global)
+ self.assertFalse(ipaddress.ip_address('2001:10::').is_global)
+ self.assertTrue(ipaddress.ip_address('2001:20::').is_global)
+ self.assertTrue(ipaddress.ip_address('2001:30::').is_global)
+ self.assertFalse(ipaddress.ip_address('2001:40::').is_global)
+ self.assertFalse(ipaddress.ip_address('2002::').is_global)
+
# some generic IETF reserved addresses
self.assertEqual(True, ipaddress.ip_address('100::').is_reserved)
self.assertEqual(True, ipaddress.ip_network('4000::1/128').is_reserved)
diff --git a/Misc/NEWS.d/next/Library/2024-03-14-01-38-44.gh-issue-113171.VFnObz.rst b/Misc/NEWS.d/next/Library/2024-03-14-01-38-44.gh-issue-113171.VFnObz.rst
new file mode 100644
index 0000000000..f9a72473be
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2024-03-14-01-38-44.gh-issue-113171.VFnObz.rst
@@ -0,0 +1,9 @@
+Fixed various false positives and false negatives in
+
+* :attr:`ipaddress.IPv4Address.is_private` (see these docs for details)
+* :attr:`ipaddress.IPv4Address.is_global`
+* :attr:`ipaddress.IPv6Address.is_private`
+* :attr:`ipaddress.IPv6Address.is_global`
+
+Also in the corresponding :class:`ipaddress.IPv4Network` and :class:`ipaddress.IPv6Network`
+attributes.
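
The "listed networks minus exceptions" rule that patch 00431 introduces for `is_private` can be sketched independently of the installed Python version. The following is only an illustration: the names `PRIVATE_NETWORKS`, `PRIVATE_EXCEPTIONS` and `is_private_sketch` are hypothetical, the network list is a small subset of the one in the patch, and only IPv4 is covered.

```python
import ipaddress

# Illustrative subset of the blocks in the patched _IPv4Constants
# (hypothetical names; the real lists live in Lib/ipaddress.py).
PRIVATE_NETWORKS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.0.0.0/24", "192.168.0.0/16")
]
PRIVATE_EXCEPTIONS = [
    ipaddress.ip_network(n)
    for n in ("192.0.0.9/32", "192.0.0.10/32")
]


def is_private_sketch(text):
    """Return True if the IPv4 address is in a listed block and not in an exception."""
    addr = ipaddress.ip_address(text)
    return (
        any(addr in net for net in PRIVATE_NETWORKS)
        and all(addr not in net for net in PRIVATE_EXCEPTIONS)
    )


if __name__ == "__main__":
    for candidate in ("192.0.0.8", "192.0.0.9", "192.0.0.255", "8.8.8.8"):
        print(candidate, is_private_sketch(candidate))
```

Because the exceptions are checked with `all(... not in ...)`, widening `192.0.0.0/29` to `192.0.0.0/24` does not turn `192.0.0.9` and `192.0.0.10` into private addresses, which is the behavior the added tests in `test_ipaddress.py` assert.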

View file

@ -1,381 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tom=C3=A1=C5=A1=20Hrn=C4=8Diar?= <thrnciar@redhat.com>
Date: Fri, 16 Aug 2024 14:12:58 +0200
Subject: 00435: gh-121650: Encode newlines in headers, and verify headers are
sound (GH-122233)
Per RFC 2047:
> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects
It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.
This should fail for custom fold() implementations that aren't careful
about newlines.
(cherry picked from commit 097633981879b3c9de9a1dd120d3aa585ecc2384)
This patch also contains modified commit cherry picked from
c5bba853d5e7836f6d4340e18721d3fb3a6ee0f7.
This commit was backported to simplify the backport of the other commit
fixing CVE. The only modification is a removal of one test case which
tests multiple changes in Python 3.7 and it wasn't working properly
with Python 3.6 where we backported only one change.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>
---
Doc/library/email.errors.rst | 6 ++
Doc/library/email.policy.rst | 18 ++++++
Lib/email/_header_value_parser.py | 9 +++
Lib/email/_policybase.py | 8 +++
Lib/email/errors.py | 4 ++
Lib/email/generator.py | 16 ++++-
Lib/test/test_email/test_generator.py | 62 +++++++++++++++++++
Lib/test/test_email/test_headerregistry.py | 16 +++++
Lib/test/test_email/test_policy.py | 26 ++++++++
.../2019-07-09-11-20-21.bpo-37482.auzvev.rst | 1 +
...-07-27-16-10-41.gh-issue-121650.nf6oc9.rst | 5 ++
11 files changed, 170 insertions(+), 1 deletion(-)
create mode 100644 Misc/NEWS.d/next/Library/2019-07-09-11-20-21.bpo-37482.auzvev.rst
create mode 100644 Misc/NEWS.d/next/Library/2024-07-27-16-10-41.gh-issue-121650.nf6oc9.rst
diff --git a/Doc/library/email.errors.rst b/Doc/library/email.errors.rst
index 511ad16358..7e51f74467 100644
--- a/Doc/library/email.errors.rst
+++ b/Doc/library/email.errors.rst
@@ -59,6 +59,12 @@ The following exception classes are defined in the :mod:`email.errors` module:
:class:`~email.mime.image.MIMEImage`).
+.. exception:: HeaderWriteError()
+
+ Raised when an error occurs when the :mod:`~email.generator` outputs
+ headers.
+
+
Here is the list of the defects that the :class:`~email.parser.FeedParser`
can find while parsing messages. Note that the defects are added to the message
where the problem was found, so for example, if a message nested inside a
diff --git a/Doc/library/email.policy.rst b/Doc/library/email.policy.rst
index 8e70762598..8617b2ed5b 100644
--- a/Doc/library/email.policy.rst
+++ b/Doc/library/email.policy.rst
@@ -229,6 +229,24 @@ added matters. To illustrate::
.. versionadded:: 3.6
+
+ .. attribute:: verify_generated_headers
+
+ If ``True`` (the default), the generator will raise
+ :exc:`~email.errors.HeaderWriteError` instead of writing a header
+ that is improperly folded or delimited, such that it would
+ be parsed as multiple headers or joined with adjacent data.
+ Such headers can be generated by custom header classes or bugs
+ in the ``email`` module.
+
+ As it's a security feature, this defaults to ``True`` even in the
+ :class:`~email.policy.Compat32` policy.
+ For backwards compatible, but unsafe, behavior, it must be set to
+ ``False`` explicitly.
+
+ .. versionadded:: 3.8.20
+
+
The following :class:`Policy` method is intended to be called by code using
the email library to create policy instances with custom settings:
diff --git a/Lib/email/_header_value_parser.py b/Lib/email/_header_value_parser.py
index bc9c9b6241..04035b2612 100644
--- a/Lib/email/_header_value_parser.py
+++ b/Lib/email/_header_value_parser.py
@@ -92,6 +92,8 @@ TOKEN_ENDS = TSPECIALS | WSP
ASPECIALS = TSPECIALS | set("*'%")
ATTRIBUTE_ENDS = ASPECIALS | WSP
EXTENDED_ATTRIBUTE_ENDS = ATTRIBUTE_ENDS - set('%')
+NLSET = {'\n', '\r'}
+SPECIALSNL = SPECIALS | NLSET
def quote_string(value):
return '"'+str(value).replace('\\', '\\\\').replace('"', r'\"')+'"'
@@ -2611,6 +2613,13 @@ def _refold_parse_tree(parse_tree, *, policy):
wrap_as_ew_blocked -= 1
continue
tstr = str(part)
+ if not want_encoding:
+ if part.token_type == 'ptext':
+ # Encode if tstr contains special characters.
+ want_encoding = not SPECIALSNL.isdisjoint(tstr)
+ else:
+ # Encode if tstr contains newlines.
+ want_encoding = not NLSET.isdisjoint(tstr)
try:
tstr.encode(encoding)
charset = encoding
diff --git a/Lib/email/_policybase.py b/Lib/email/_policybase.py
index c9cbadd2a8..d1f48211f9 100644
--- a/Lib/email/_policybase.py
+++ b/Lib/email/_policybase.py
@@ -157,6 +157,13 @@ class Policy(_PolicyBase, metaclass=abc.ABCMeta):
message_factory -- the class to use to create new message objects.
If the value is None, the default is Message.
+ verify_generated_headers
+ -- if true, the generator verifies that each header
+ is properly folded, so that a parser won't
+ treat it as multiple headers, start-of-body, or
+ part of another header.
+ This is a check against custom Header & fold()
+ implementations.
"""
raise_on_defect = False
@@ -165,6 +172,7 @@ class Policy(_PolicyBase, metaclass=abc.ABCMeta):
max_line_length = 78
mangle_from_ = False
message_factory = None
+ verify_generated_headers = True
def handle_defect(self, obj, defect):
"""Based on policy, either raise defect or call register_defect.
diff --git a/Lib/email/errors.py b/Lib/email/errors.py
index d28a680010..1a0d5c63e6 100644
--- a/Lib/email/errors.py
+++ b/Lib/email/errors.py
@@ -29,6 +29,10 @@ class CharsetError(MessageError):
"""An illegal charset was given."""
+class HeaderWriteError(MessageError):
+ """Error while writing headers."""
+
+
# These are parsing defects which the parser was able to work around.
class MessageDefect(ValueError):
"""Base class for a message defect."""
diff --git a/Lib/email/generator.py b/Lib/email/generator.py
index ae670c2353..6deb95ba8a 100644
--- a/Lib/email/generator.py
+++ b/Lib/email/generator.py
@@ -14,12 +14,14 @@ import random
from copy import deepcopy
from io import StringIO, BytesIO
from email.utils import _has_surrogates
+from email.errors import HeaderWriteError
UNDERSCORE = '_'
NL = '\n' # XXX: no longer used by the code below.
NLCRE = re.compile(r'\r\n|\r|\n')
fcre = re.compile(r'^From ', re.MULTILINE)
+NEWLINE_WITHOUT_FWSP = re.compile(r'\r\n[^ \t]|\r[^ \n\t]|\n[^ \t]')
@@ -219,7 +221,19 @@ class Generator:
def _write_headers(self, msg):
for h, v in msg.raw_items():
- self.write(self.policy.fold(h, v))
+ folded = self.policy.fold(h, v)
+ if self.policy.verify_generated_headers:
+ linesep = self.policy.linesep
+ if not folded.endswith(self.policy.linesep):
+ raise HeaderWriteError(
+ f'folded header does not end with {linesep!r}: {folded!r}')
+ folded_no_linesep = folded
+ if folded.endswith(linesep):
+ folded_no_linesep = folded[:-len(linesep)]
+ if NEWLINE_WITHOUT_FWSP.search(folded_no_linesep):
+ raise HeaderWriteError(
+ f'folded header contains newline: {folded!r}')
+ self.write(folded)
# A blank line always separates headers from body
self.write(self._NL)
diff --git a/Lib/test/test_email/test_generator.py b/Lib/test/test_email/test_generator.py
index c1aeaefab7..cdf1075bab 100644
--- a/Lib/test/test_email/test_generator.py
+++ b/Lib/test/test_email/test_generator.py
@@ -5,6 +5,7 @@ from email import message_from_string, message_from_bytes
from email.message import EmailMessage
from email.generator import Generator, BytesGenerator
from email import policy
+import email.errors
from test.test_email import TestEmailBase, parameterize
@@ -215,6 +216,44 @@ class TestGeneratorBase:
g.flatten(msg)
self.assertEqual(s.getvalue(), self.typ(expected))
+ def test_keep_encoded_newlines(self):
+ msg = self.msgmaker(self.typ(textwrap.dedent("""\
+ To: nobody
+ Subject: Bad subject=?UTF-8?Q?=0A?=Bcc: injection@example.com
+
+ None
+ """)))
+ expected = textwrap.dedent("""\
+ To: nobody
+ Subject: Bad subject=?UTF-8?Q?=0A?=Bcc: injection@example.com
+
+ None
+ """)
+ s = self.ioclass()
+ g = self.genclass(s, policy=self.policy.clone(max_line_length=80))
+ g.flatten(msg)
+ self.assertEqual(s.getvalue(), self.typ(expected))
+
+ def test_keep_long_encoded_newlines(self):
+ msg = self.msgmaker(self.typ(textwrap.dedent("""\
+ To: nobody
+ Subject: Bad subject =?UTF-8?Q?=0A?=Bcc: injection@example.com
+
+ None
+ """)))
+ expected = textwrap.dedent("""\
+ To: nobody
+ Subject: Bad subject \n\
+ =?utf-8?q?=0A?=Bcc:
+ injection@example.com
+
+ None
+ """)
+ s = self.ioclass()
+ g = self.genclass(s, policy=self.policy.clone(max_line_length=30))
+ g.flatten(msg)
+ self.assertEqual(s.getvalue(), self.typ(expected))
+
class TestGenerator(TestGeneratorBase, TestEmailBase):
@@ -223,6 +262,29 @@ class TestGenerator(TestGeneratorBase, TestEmailBase):
ioclass = io.StringIO
typ = str
+ def test_verify_generated_headers(self):
+ """gh-121650: by default the generator prevents header injection"""
+ class LiteralHeader(str):
+ name = 'Header'
+ def fold(self, **kwargs):
+ return self
+
+ for text in (
+ 'Value\r\nBad Injection\r\n',
+ 'NoNewLine'
+ ):
+ with self.subTest(text=text):
+ message = message_from_string(
+ "Header: Value\r\n\r\nBody",
+ policy=self.policy,
+ )
+
+ del message['Header']
+ message['Header'] = LiteralHeader(text)
+
+ with self.assertRaises(email.errors.HeaderWriteError):
+ message.as_string()
+
class TestBytesGenerator(TestGeneratorBase, TestEmailBase):
diff --git a/Lib/test/test_email/test_headerregistry.py b/Lib/test/test_email/test_headerregistry.py
index 08634daa7f..d7c1878940 100644
--- a/Lib/test/test_email/test_headerregistry.py
+++ b/Lib/test/test_email/test_headerregistry.py
@@ -1546,6 +1546,22 @@ class TestAddressAndGroup(TestEmailBase):
class TestFolding(TestHeaderBase):
+ def test_address_display_names(self):
+ """Test the folding and encoding of address headers."""
+ for name, result in (
+ ('Foo Bar, France', '"Foo Bar, France"'),
+ ('Foo Bar (France)', '"Foo Bar (France)"'),
+ ('Foo Bar, España', 'Foo =?utf-8?q?Bar=2C_Espa=C3=B1a?='),
+ ('Foo Bar (España)', 'Foo Bar =?utf-8?b?KEVzcGHDsWEp?='),
+ ('Foo, Bar España', '=?utf-8?q?Foo=2C_Bar_Espa=C3=B1a?='),
+ ('Foo, Bar [España]', '=?utf-8?q?Foo=2C_Bar_=5BEspa=C3=B1a=5D?='),
+ ('Foo Bär, France', 'Foo =?utf-8?q?B=C3=A4r=2C?= France'),
+ ('Foo Bär <France>', 'Foo =?utf-8?q?B=C3=A4r_=3CFrance=3E?='),
+ ):
+ h = self.make_header('To', Address(name, addr_spec='a@b.com'))
+ self.assertEqual(h.fold(policy=policy.default),
+ 'To: %s <a@b.com>\n' % result)
+
def test_short_unstructured(self):
h = self.make_header('subject', 'this is a test')
self.assertEqual(h.fold(policy=policy.default),
diff --git a/Lib/test/test_email/test_policy.py b/Lib/test/test_email/test_policy.py
index 6999e4af10..76198392e5 100644
--- a/Lib/test/test_email/test_policy.py
+++ b/Lib/test/test_email/test_policy.py
@@ -26,6 +26,7 @@ class PolicyAPITests(unittest.TestCase):
'raise_on_defect': False,
'mangle_from_': True,
'message_factory': None,
+ 'verify_generated_headers': True,
}
# These default values are the ones set on email.policy.default.
# If any of these defaults change, the docs must be updated.
@@ -265,6 +266,31 @@ class PolicyAPITests(unittest.TestCase):
with self.assertRaises(email.errors.HeaderParseError):
policy.fold("Subject", subject)
+ def test_verify_generated_headers(self):
+ """Turning protection off allows header injection"""
+ policy = email.policy.default.clone(verify_generated_headers=False)
+ for text in (
+ 'Header: Value\r\nBad: Injection\r\n',
+ 'Header: NoNewLine'
+ ):
+ with self.subTest(text=text):
+ message = email.message_from_string(
+ "Header: Value\r\n\r\nBody",
+ policy=policy,
+ )
+ class LiteralHeader(str):
+ name = 'Header'
+ def fold(self, **kwargs):
+ return self
+
+ del message['Header']
+ message['Header'] = LiteralHeader(text)
+
+ self.assertEqual(
+ message.as_string(),
+ f"{text}\nBody",
+ )
+
# XXX: Need subclassing tests.
# For adding subclassed objects, make sure the usual rules apply (subclass
# wins), but that the order still works (right overrides left).
diff --git a/Misc/NEWS.d/next/Library/2019-07-09-11-20-21.bpo-37482.auzvev.rst b/Misc/NEWS.d/next/Library/2019-07-09-11-20-21.bpo-37482.auzvev.rst
new file mode 100644
index 0000000000..e09ff63eed
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2019-07-09-11-20-21.bpo-37482.auzvev.rst
@@ -0,0 +1 @@
+Fix serialization of display name in originator or destination address fields with both encoded words and special chars.
diff --git a/Misc/NEWS.d/next/Library/2024-07-27-16-10-41.gh-issue-121650.nf6oc9.rst b/Misc/NEWS.d/next/Library/2024-07-27-16-10-41.gh-issue-121650.nf6oc9.rst
new file mode 100644
index 0000000000..83dd28d4ac
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2024-07-27-16-10-41.gh-issue-121650.nf6oc9.rst
@@ -0,0 +1,5 @@
+:mod:`email` headers with embedded newlines are now quoted on output. The
+:mod:`~email.generator` will now refuse to serialize (write) headers that
+are unsafely folded or delimited; see
+:attr:`~email.policy.Policy.verify_generated_headers`. (Contributed by Bas
+Bloemsaat and Petr Viktorin in :gh:`121650`.)
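
The check that patch 00435 adds to `Generator._write_headers` can be approximated outside the `email` package as follows. This is a minimal sketch, not the library code: the regular expression is copied from the patch, while `HeaderWriteErrorSketch`, `verify_folded` and `is_safe` are hypothetical names introduced here for illustration.

```python
import re

# Same pattern as NEWLINE_WITHOUT_FWSP in the patch: a line break that is
# not followed by folding whitespace (space or tab).
NEWLINE_WITHOUT_FWSP = re.compile(r'\r\n[^ \t]|\r[^ \n\t]|\n[^ \t]')


class HeaderWriteErrorSketch(ValueError):
    """Hypothetical stand-in for email.errors.HeaderWriteError."""


def verify_folded(folded, linesep="\r\n"):
    """Raise if a folded header could be parsed as more than one header."""
    if not folded.endswith(linesep):
        raise HeaderWriteErrorSketch(
            f'folded header does not end with {linesep!r}: {folded!r}')
    # Drop the terminating line separator before looking for stray newlines.
    without_linesep = folded[:-len(linesep)]
    if NEWLINE_WITHOUT_FWSP.search(without_linesep):
        raise HeaderWriteErrorSketch(
            f'folded header contains newline: {folded!r}')
    return folded


def is_safe(folded, linesep="\r\n"):
    """Convenience wrapper returning a boolean instead of raising."""
    try:
        verify_folded(folded, linesep)
    except HeaderWriteErrorSketch:
        return False
    return True


if __name__ == "__main__":
    print(is_safe("Subject: a\r\n b\r\n"))                   # properly folded
    print(is_safe("Header: Value\r\nBad: Injection\r\n"))    # injected header
    print(is_safe("Header: NoNewLine"))                      # missing terminator
```

The two rejected inputs correspond to the cases exercised by `test_verify_generated_headers` in the patch: a value that smuggles in a second header line, and a value that is not terminated by the policy's line separator.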

View file

@ -1,248 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Seth Michael Larson <seth@python.org>
Date: Wed, 4 Sep 2024 10:41:42 -0500
Subject: 00437: CVE-2024-6232 Remove backtracking when parsing tarfile headers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* Remove backtracking when parsing tarfile headers
* Rewrite PAX header parsing to be stricter
* Optimize parsing of GNU extended sparse headers v0.0
(cherry picked from commit 34ddb64d088dd7ccc321f6103d23153256caa5d4)
Co-authored-by: Seth Michael Larson <seth@python.org>
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Lumír Balhar <lbalhar@redhat.com>
---
Lib/tarfile.py | 104 +++++++++++-------
Lib/test/test_tarfile.py | 42 +++++++
...-07-02-13-39-20.gh-issue-121285.hrl-yI.rst | 2 +
3 files changed, 111 insertions(+), 37 deletions(-)
create mode 100644 Misc/NEWS.d/next/Security/2024-07-02-13-39-20.gh-issue-121285.hrl-yI.rst
diff --git a/Lib/tarfile.py b/Lib/tarfile.py
index c18590325a..ee1bf37bfd 100755
--- a/Lib/tarfile.py
+++ b/Lib/tarfile.py
@@ -846,6 +846,9 @@ _NAMED_FILTERS = {
# Sentinel for replace() defaults, meaning "don't change the attribute"
_KEEP = object()
+# Header length is digits followed by a space.
+_header_length_prefix_re = re.compile(br"([0-9]{1,20}) ")
+
class TarInfo(object):
"""Informational class which holds the details about an
archive member given by a tar header block.
@@ -1371,41 +1374,60 @@ class TarInfo(object):
else:
pax_headers = tarfile.pax_headers.copy()
- # Check if the pax header contains a hdrcharset field. This tells us
- # the encoding of the path, linkpath, uname and gname fields. Normally,
- # these fields are UTF-8 encoded but since POSIX.1-2008 tar
- # implementations are allowed to store them as raw binary strings if
- # the translation to UTF-8 fails.
- match = re.search(br"\d+ hdrcharset=([^\n]+)\n", buf)
- if match is not None:
- pax_headers["hdrcharset"] = match.group(1).decode("utf-8")
-
- # For the time being, we don't care about anything other than "BINARY".
- # The only other value that is currently allowed by the standard is
- # "ISO-IR 10646 2000 UTF-8" in other words UTF-8.
- hdrcharset = pax_headers.get("hdrcharset")
- if hdrcharset == "BINARY":
- encoding = tarfile.encoding
- else:
- encoding = "utf-8"
-
# Parse pax header information. A record looks like that:
# "%d %s=%s\n" % (length, keyword, value). length is the size
# of the complete record including the length field itself and
- # the newline. keyword and value are both UTF-8 encoded strings.
- regex = re.compile(br"(\d+) ([^=]+)=")
+ # the newline.
pos = 0
- while True:
- match = regex.match(buf, pos)
+ encoding = None
+ raw_headers = []
+ while len(buf) > pos and buf[pos] != 0x00:
+ match = _header_length_prefix_re.match(buf, pos)
if not match:
- break
+ raise InvalidHeaderError("invalid header")
+ try:
+ length = int(match.group(1))
+ except ValueError:
+ raise InvalidHeaderError("invalid header")
+ # Headers must be at least 5 bytes, shortest being '5 x=\n'.
+ # Value is allowed to be empty.
+ if length < 5:
+ raise InvalidHeaderError("invalid header")
+ if pos + length > len(buf):
+ raise InvalidHeaderError("invalid header")
- length, keyword = match.groups()
- length = int(length)
- if length == 0:
+ header_value_end_offset = match.start(1) + length - 1 # Last byte of the header
+ keyword_and_value = buf[match.end(1) + 1:header_value_end_offset]
+ raw_keyword, equals, raw_value = keyword_and_value.partition(b"=")
+
+ # Check the framing of the header. The last character must be '\n' (0x0A)
+ if not raw_keyword or equals != b"=" or buf[header_value_end_offset] != 0x0A:
raise InvalidHeaderError("invalid header")
- value = buf[match.end(2) + 1:match.start(1) + length - 1]
+ raw_headers.append((length, raw_keyword, raw_value))
+
+ # Check if the pax header contains a hdrcharset field. This tells us
+ # the encoding of the path, linkpath, uname and gname fields. Normally,
+ # these fields are UTF-8 encoded but since POSIX.1-2008 tar
+ # implementations are allowed to store them as raw binary strings if
+ # the translation to UTF-8 fails. For the time being, we don't care about
+ # anything other than "BINARY". The only other value that is currently
+ # allowed by the standard is "ISO-IR 10646 2000 UTF-8" in other words UTF-8.
+ # Note that we only follow the initial 'hdrcharset' setting to preserve
+ # the initial behavior of the 'tarfile' module.
+ if raw_keyword == b"hdrcharset" and encoding is None:
+ if raw_value == b"BINARY":
+ encoding = tarfile.encoding
+ else: # This branch ensures only the first 'hdrcharset' header is used.
+ encoding = "utf-8"
+
+ pos += length
+
+ # If no explicit hdrcharset is set, we use UTF-8 as a default.
+ if encoding is None:
+ encoding = "utf-8"
+ # After parsing the raw headers we can decode them to text.
+ for length, raw_keyword, raw_value in raw_headers:
# Normally, we could just use "utf-8" as the encoding and "strict"
# as the error handler, but we better not take the risk. For
# example, GNU tar <= 1.23 is known to store filenames it cannot
@@ -1413,17 +1435,16 @@ class TarInfo(object):
# hdrcharset=BINARY header).
# We first try the strict standard encoding, and if that fails we
# fall back on the user's encoding and error handler.
- keyword = self._decode_pax_field(keyword, "utf-8", "utf-8",
+ keyword = self._decode_pax_field(raw_keyword, "utf-8", "utf-8",
tarfile.errors)
if keyword in PAX_NAME_FIELDS:
- value = self._decode_pax_field(value, encoding, tarfile.encoding,
+ value = self._decode_pax_field(raw_value, encoding, tarfile.encoding,
tarfile.errors)
else:
- value = self._decode_pax_field(value, "utf-8", "utf-8",
+ value = self._decode_pax_field(raw_value, "utf-8", "utf-8",
tarfile.errors)
pax_headers[keyword] = value
- pos += length
# Fetch the next header.
try:
@@ -1438,7 +1459,7 @@ class TarInfo(object):
elif "GNU.sparse.size" in pax_headers:
# GNU extended sparse format version 0.0.
- self._proc_gnusparse_00(next, pax_headers, buf)
+ self._proc_gnusparse_00(next, raw_headers)
elif pax_headers.get("GNU.sparse.major") == "1" and pax_headers.get("GNU.sparse.minor") == "0":
# GNU extended sparse format version 1.0.
@@ -1460,15 +1481,24 @@ class TarInfo(object):
return next
- def _proc_gnusparse_00(self, next, pax_headers, buf):
+ def _proc_gnusparse_00(self, next, raw_headers):
"""Process a GNU tar extended sparse header, version 0.0.
"""
offsets = []
- for match in re.finditer(br"\d+ GNU.sparse.offset=(\d+)\n", buf):
- offsets.append(int(match.group(1)))
numbytes = []
- for match in re.finditer(br"\d+ GNU.sparse.numbytes=(\d+)\n", buf):
- numbytes.append(int(match.group(1)))
+ for _, keyword, value in raw_headers:
+ if keyword == b"GNU.sparse.offset":
+ try:
+ offsets.append(int(value.decode()))
+ except ValueError:
+ raise InvalidHeaderError("invalid header")
+
+ elif keyword == b"GNU.sparse.numbytes":
+ try:
+ numbytes.append(int(value.decode()))
+ except ValueError:
+ raise InvalidHeaderError("invalid header")
+
next.sparse = list(zip(offsets, numbytes))
def _proc_gnusparse_01(self, next, pax_headers):
diff --git a/Lib/test/test_tarfile.py b/Lib/test/test_tarfile.py
index f261048615..04ef000e71 100644
--- a/Lib/test/test_tarfile.py
+++ b/Lib/test/test_tarfile.py
@@ -1046,6 +1046,48 @@ class PaxReadTest(LongnameTest, ReadTest, unittest.TestCase):
finally:
tar.close()
+ def test_pax_header_bad_formats(self):
+ # Malformed pax header records must be rejected when the
+ # archive is read back.
+ pax_header_replacements = (
+ b" foo=bar\n",
+ b"0 \n",
+ b"1 \n",
+ b"2 \n",
+ b"3 =\n",
+ b"4 =a\n",
+ b"1000000 foo=bar\n",
+ b"0 foo=bar\n",
+ b"-12 foo=bar\n",
+ b"000000000000000000000000036 foo=bar\n",
+ )
+ pax_headers = {"foo": "bar"}
+
+ for replacement in pax_header_replacements:
+ with self.subTest(header=replacement):
+ tar = tarfile.open(tmpname, "w", format=tarfile.PAX_FORMAT,
+ encoding="iso8859-1")
+ try:
+ t = tarfile.TarInfo()
+ t.name = "pax" # plain ASCII name
+ t.uid = 1
+ t.pax_headers = pax_headers
+ tar.addfile(t)
+ finally:
+ tar.close()
+
+ with open(tmpname, "rb") as f:
+ data = f.read()
+ self.assertIn(b"11 foo=bar\n", data)
+ data = data.replace(b"11 foo=bar\n", replacement)
+
+ with open(tmpname, "wb") as f:
+ f.truncate()
+ f.write(data)
+
+ with self.assertRaisesRegex(tarfile.ReadError, r"file could not be opened successfully"):
+ tarfile.open(tmpname, encoding="iso8859-1")
+
class WriteTestBase(TarTest):
# Put all write tests in here that are supposed to be tested
diff --git a/Misc/NEWS.d/next/Security/2024-07-02-13-39-20.gh-issue-121285.hrl-yI.rst b/Misc/NEWS.d/next/Security/2024-07-02-13-39-20.gh-issue-121285.hrl-yI.rst
new file mode 100644
index 0000000000..81f918bfe2
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2024-07-02-13-39-20.gh-issue-121285.hrl-yI.rst
@@ -0,0 +1,2 @@
+Remove backtracking from tarfile header parsing for ``hdrcharset``, PAX, and
+GNU sparse headers.
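
The parsing strategy in the tarfile hunks above can be sketched in isolation. This is a minimal illustration, not the patched `tarfile` code: the function name `parse_pax_records`, the use of `ValueError` instead of `InvalidHeaderError`, and the 20-digit cap on the length field are assumptions made for the example. It walks `<length> <keyword>=<value>\n` records by their declared length, so no backtracking regular expression is needed.

```python
MAX_LENGTH_DIGITS = 20  # assumed cap for this sketch; not taken from the hunk above


def parse_pax_records(buf: bytes) -> list:
    """Parse b'<length> <keyword>=<value>\\n' records by their declared length."""
    records = []
    pos = 0
    while pos < len(buf) and buf[pos] != 0:
        # The decimal length field ends at the first space.
        space = buf.find(b" ", pos)
        if space == -1:
            raise ValueError("invalid header")
        digits = buf[pos:space]
        if not digits.isdigit() or len(digits) > MAX_LENGTH_DIGITS:
            raise ValueError("invalid header")
        length = int(digits)
        # The shortest valid record is b"5 x=\n"; it must also fit in the buffer.
        if length < 5 or pos + length > len(buf):
            raise ValueError("invalid header")
        end = pos + length - 1  # index of the trailing newline
        keyword, equals, value = buf[space + 1:end].partition(b"=")
        # Framing check: non-empty keyword, an '=' sign, and a final '\n'.
        if not keyword or equals != b"=" or buf[end] != 0x0A:
            raise ValueError("invalid header")
        records.append((keyword, value))
        pos += length
    return records
```

Each record is consumed in one forward pass, so the cost stays linear in the size of the header, and every malformed input from `test_pax_header_bad_formats` is rejected by one of the explicit checks.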

View file

@ -1,280 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Victor Stinner <vstinner@python.org>
Date: Fri, 1 Nov 2024 14:11:47 +0100
Subject: 00443: gh-124651: Quote template strings in `venv` activation scripts
(cherry picked from 3.9)
---
Lib/test/test_venv.py | 82 +++++++++++++++++++
Lib/venv/__init__.py | 42 ++++++++--
Lib/venv/scripts/common/activate | 8 +-
Lib/venv/scripts/posix/activate.csh | 8 +-
Lib/venv/scripts/posix/activate.fish | 8 +-
...-09-28-02-03-04.gh-issue-124651.bLBGtH.rst | 1 +
6 files changed, 132 insertions(+), 17 deletions(-)
create mode 100644 Misc/NEWS.d/next/Library/2024-09-28-02-03-04.gh-issue-124651.bLBGtH.rst
diff --git a/Lib/test/test_venv.py b/Lib/test/test_venv.py
index 842470fef0..67fdcd86bb 100644
--- a/Lib/test/test_venv.py
+++ b/Lib/test/test_venv.py
@@ -13,6 +13,8 @@ import struct
import subprocess
import sys
import tempfile
+import shlex
+import shutil
from test.support import (captured_stdout, captured_stderr, requires_zlib,
can_symlink, EnvironmentVarGuard, rmtree)
import unittest
@@ -80,6 +82,10 @@ class BaseTest(unittest.TestCase):
result = f.read()
return result
+ def assertEndsWith(self, string, tail):
+ if not string.endswith(tail):
+ self.fail(f"String {string!r} does not end with {tail!r}")
+
class BasicTest(BaseTest):
"""Test venv module functionality."""
@@ -293,6 +299,82 @@ class BasicTest(BaseTest):
'import sys; print(sys.executable)'])
self.assertEqual(out.strip(), envpy.encode())
+ # gh-124651: test quoted strings
+ @unittest.skipIf(os.name == 'nt', 'contains invalid characters on Windows')
+ def test_special_chars_bash(self):
+ """
+ Test that the template strings are quoted properly (bash)
+ """
+ rmtree(self.env_dir)
+ bash = shutil.which('bash')
+ if bash is None:
+ self.skipTest('bash required for this test')
+ env_name = '"\';&&$e|\'"'
+ env_dir = os.path.join(os.path.realpath(self.env_dir), env_name)
+ builder = venv.EnvBuilder(clear=True)
+ builder.create(env_dir)
+ activate = os.path.join(env_dir, self.bindir, 'activate')
+ test_script = os.path.join(self.env_dir, 'test_special_chars.sh')
+ with open(test_script, "w") as f:
+ f.write(f'source {shlex.quote(activate)}\n'
+ 'python -c \'import sys; print(sys.executable)\'\n'
+ 'python -c \'import os; print(os.environ["VIRTUAL_ENV"])\'\n'
+ 'deactivate\n')
+ out, err = check_output([bash, test_script])
+ lines = out.splitlines()
+ self.assertTrue(env_name.encode() in lines[0])
+ self.assertEndsWith(lines[1], env_name.encode())
+
+ # gh-124651: test quoted strings
+ @unittest.skipIf(os.name == 'nt', 'contains invalid characters on Windows')
+ def test_special_chars_csh(self):
+ """
+ Test that the template strings are quoted properly (csh)
+ """
+ rmtree(self.env_dir)
+ csh = shutil.which('tcsh') or shutil.which('csh')
+ if csh is None:
+ self.skipTest('csh required for this test')
+ env_name = '"\';&&$e|\'"'
+ env_dir = os.path.join(os.path.realpath(self.env_dir), env_name)
+ builder = venv.EnvBuilder(clear=True)
+ builder.create(env_dir)
+ activate = os.path.join(env_dir, self.bindir, 'activate.csh')
+ test_script = os.path.join(self.env_dir, 'test_special_chars.csh')
+ with open(test_script, "w") as f:
+ f.write(f'source {shlex.quote(activate)}\n'
+ 'python -c \'import sys; print(sys.executable)\'\n'
+ 'python -c \'import os; print(os.environ["VIRTUAL_ENV"])\'\n'
+ 'deactivate\n')
+ out, err = check_output([csh, test_script])
+ lines = out.splitlines()
+ self.assertTrue(env_name.encode() in lines[0])
+ self.assertEndsWith(lines[1], env_name.encode())
+
+ # gh-124651: test quoted strings on Windows
+ @unittest.skipUnless(os.name == 'nt', 'only relevant on Windows')
+ def test_special_chars_windows(self):
+ """
+ Test that the template strings are quoted properly on Windows
+ """
+ rmtree(self.env_dir)
+ env_name = "'&&^$e"
+ env_dir = os.path.join(os.path.realpath(self.env_dir), env_name)
+ builder = venv.EnvBuilder(clear=True)
+ builder.create(env_dir)
+ activate = os.path.join(env_dir, self.bindir, 'activate.bat')
+ test_batch = os.path.join(self.env_dir, 'test_special_chars.bat')
+ with open(test_batch, "w") as f:
+ f.write('@echo off\n'
+ f'"{activate}" & '
+ f'{self.exe} -c "import sys; print(sys.executable)" & '
+ f'{self.exe} -c "import os; print(os.environ[\'VIRTUAL_ENV\'])" & '
+ 'deactivate')
+ out, err = check_output([test_batch])
+ lines = out.splitlines()
+ self.assertTrue(env_name.encode() in lines[0])
+ self.assertEndsWith(lines[1], env_name.encode())
+
@unittest.skipUnless(os.name == 'nt', 'only relevant on Windows')
def test_unicode_in_batch_file(self):
"""
diff --git a/Lib/venv/__init__.py b/Lib/venv/__init__.py
index 716129d139..0c44dfd07d 100644
--- a/Lib/venv/__init__.py
+++ b/Lib/venv/__init__.py
@@ -10,6 +10,7 @@ import shutil
import subprocess
import sys
import types
+import shlex
logger = logging.getLogger(__name__)
@@ -280,11 +281,41 @@ class EnvBuilder:
:param context: The information for the environment creation request
being processed.
"""
- text = text.replace('__VENV_DIR__', context.env_dir)
- text = text.replace('__VENV_NAME__', context.env_name)
- text = text.replace('__VENV_PROMPT__', context.prompt)
- text = text.replace('__VENV_BIN_NAME__', context.bin_name)
- text = text.replace('__VENV_PYTHON__', context.env_exe)
+ replacements = {
+ '__VENV_DIR__': context.env_dir,
+ '__VENV_NAME__': context.env_name,
+ '__VENV_PROMPT__': context.prompt,
+ '__VENV_BIN_NAME__': context.bin_name,
+ '__VENV_PYTHON__': context.env_exe,
+ }
+
+ def quote_ps1(s):
+ """
+ This should satisfy PowerShell quoting rules [1], unless the quoted
+ string is passed directly to Windows native commands [2].
+ [1]: https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_quoting_rules
+ [2]: https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_parsing#passing-arguments-that-contain-quote-characters
+ """
+ s = s.replace("'", "''")
+ return f"'{s}'"
+
+ def quote_bat(s):
+ return s
+
+ # gh-124651: need to quote the template strings properly
+ quote = shlex.quote
+ script_path = context.script_path
+ if script_path.endswith('.ps1'):
+ quote = quote_ps1
+ elif script_path.endswith('.bat'):
+ quote = quote_bat
+ else:
+ # fall back to POSIX shell-compliant quoting
+ quote = shlex.quote
+
+ replacements = {key: quote(s) for key, s in replacements.items()}
+ for key, quoted in replacements.items():
+ text = text.replace(key, quoted)
return text
def install_scripts(self, context, path):
@@ -321,6 +352,7 @@ class EnvBuilder:
with open(srcfile, 'rb') as f:
data = f.read()
if not srcfile.endswith('.exe'):
+ context.script_path = srcfile
try:
data = data.decode('utf-8')
data = self.replace_variables(data, context)
diff --git a/Lib/venv/scripts/common/activate b/Lib/venv/scripts/common/activate
index fff0765af5..c2e2f968fa 100644
--- a/Lib/venv/scripts/common/activate
+++ b/Lib/venv/scripts/common/activate
@@ -37,11 +37,11 @@ deactivate () {
# unset irrelevant variables
deactivate nondestructive
-VIRTUAL_ENV="__VENV_DIR__"
+VIRTUAL_ENV=__VENV_DIR__
export VIRTUAL_ENV
_OLD_VIRTUAL_PATH="$PATH"
-PATH="$VIRTUAL_ENV/__VENV_BIN_NAME__:$PATH"
+PATH="$VIRTUAL_ENV/"__VENV_BIN_NAME__":$PATH"
export PATH
# unset PYTHONHOME if set
@@ -54,8 +54,8 @@ fi
if [ -z "${VIRTUAL_ENV_DISABLE_PROMPT:-}" ] ; then
_OLD_VIRTUAL_PS1="${PS1:-}"
- if [ "x__VENV_PROMPT__" != x ] ; then
- PS1="__VENV_PROMPT__${PS1:-}"
+ if [ "x"__VENV_PROMPT__ != x ] ; then
+ PS1=__VENV_PROMPT__"${PS1:-}"
else
if [ "`basename \"$VIRTUAL_ENV\"`" = "__" ] ; then
# special case for Aspen magic directories
diff --git a/Lib/venv/scripts/posix/activate.csh b/Lib/venv/scripts/posix/activate.csh
index b0c7028a92..0e90d54008 100644
--- a/Lib/venv/scripts/posix/activate.csh
+++ b/Lib/venv/scripts/posix/activate.csh
@@ -8,17 +8,17 @@ alias deactivate 'test $?_OLD_VIRTUAL_PATH != 0 && setenv PATH "$_OLD_VIRTUAL_PA
# Unset irrelevant variables.
deactivate nondestructive
-setenv VIRTUAL_ENV "__VENV_DIR__"
+setenv VIRTUAL_ENV __VENV_DIR__
set _OLD_VIRTUAL_PATH="$PATH"
-setenv PATH "$VIRTUAL_ENV/__VENV_BIN_NAME__:$PATH"
+setenv PATH "$VIRTUAL_ENV/"__VENV_BIN_NAME__":$PATH"
set _OLD_VIRTUAL_PROMPT="$prompt"
if (! "$?VIRTUAL_ENV_DISABLE_PROMPT") then
- if ("__VENV_NAME__" != "") then
- set env_name = "__VENV_NAME__"
+ if (__VENV_NAME__ != "") then
+ set env_name = __VENV_NAME__
else
if (`basename "VIRTUAL_ENV"` == "__") then
# special case for Aspen magic directories
diff --git a/Lib/venv/scripts/posix/activate.fish b/Lib/venv/scripts/posix/activate.fish
index 4d4f0bd7a4..0407f9c7be 100644
--- a/Lib/venv/scripts/posix/activate.fish
+++ b/Lib/venv/scripts/posix/activate.fish
@@ -29,10 +29,10 @@ end
# unset irrelevant variables
deactivate nondestructive
-set -gx VIRTUAL_ENV "__VENV_DIR__"
+set -gx VIRTUAL_ENV __VENV_DIR__
set -gx _OLD_VIRTUAL_PATH $PATH
-set -gx PATH "$VIRTUAL_ENV/__VENV_BIN_NAME__" $PATH
+set -gx PATH "$VIRTUAL_ENV/"__VENV_BIN_NAME__ $PATH
# unset PYTHONHOME if set
if set -q PYTHONHOME
@@ -52,8 +52,8 @@ if test -z "$VIRTUAL_ENV_DISABLE_PROMPT"
set -l old_status $status
# Prompt override?
- if test -n "__VENV_PROMPT__"
- printf "%s%s" "__VENV_PROMPT__" (set_color normal)
+ if test -n __VENV_PROMPT__
+ printf "%s%s" __VENV_PROMPT__ (set_color normal)
else
# ...Otherwise, prepend env
set -l _checkbase (basename "$VIRTUAL_ENV")
diff --git a/Misc/NEWS.d/next/Library/2024-09-28-02-03-04.gh-issue-124651.bLBGtH.rst b/Misc/NEWS.d/next/Library/2024-09-28-02-03-04.gh-issue-124651.bLBGtH.rst
new file mode 100644
index 0000000000..17fc917139
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2024-09-28-02-03-04.gh-issue-124651.bLBGtH.rst
@@ -0,0 +1 @@
+Properly quote template strings in :mod:`venv` activation scripts.
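
The quoting approach in `replace_variables` above can be tried outside of `venv`. The sketch below is illustrative only: `fill_template` is a hypothetical helper, not part of the `venv` API, and `quote_ps1` simply mirrors the nested function shown in the hunk. It substitutes a placeholder with a shell-quoted value and uses `shlex.split` to confirm that a POSIX shell would see the original string again.

```python
import shlex


def quote_ps1(s: str) -> str:
    # PowerShell single-quoted strings escape ' by doubling it.
    return "'" + s.replace("'", "''") + "'"


def fill_template(text: str, replacements: dict, quote=shlex.quote) -> str:
    # Hypothetical helper: quote every value before substituting it.
    for key, value in replacements.items():
        text = text.replace(key, quote(value))
    return text


# Same hostile directory name that the new tests use.
env_dir = "/tmp/\"';&&$e|'\""
line = fill_template("VIRTUAL_ENV=__VENV_DIR__", {"__VENV_DIR__": env_dir})

# shlex.split parses POSIX shell words, so the raw path should come back intact.
assert shlex.split(line) == ["VIRTUAL_ENV=" + env_dir]
```

Because the value now carries its own quotes, the templates drop the surrounding double quotes (for example `VIRTUAL_ENV=__VENV_DIR__`), which is why the activation scripts change alongside `Lib/venv/__init__.py`.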

View file

@ -1,110 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Miss Islington (bot)"
<31488909+miss-islington@users.noreply.github.com>
Date: Tue, 9 May 2023 23:35:24 -0700
Subject: 00444: Security fix for CVE-2024-11168
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
gh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (GH-103849)
Tests are adjusted because Python <3.9 doesn't support scoped IPv6 addresses.
(cherry picked from commit 29f348e232e82938ba2165843c448c2b291504c5)
Co-authored-by: JohnJamesUtley <81572567+JohnJamesUtley@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Lumír Balhar <lbalhar@redhat.com>
---
Lib/test/test_urlparse.py | 26 +++++++++++++++++++
Lib/urllib/parse.py | 15 +++++++++++
...-04-26-09-54-25.gh-issue-103848.aDSnpR.rst | 2 ++
3 files changed, 43 insertions(+)
create mode 100644 Misc/NEWS.d/next/Library/2023-04-26-09-54-25.gh-issue-103848.aDSnpR.rst
diff --git a/Lib/test/test_urlparse.py b/Lib/test/test_urlparse.py
index 7fd61ffea9..090d2f17bf 100644
--- a/Lib/test/test_urlparse.py
+++ b/Lib/test/test_urlparse.py
@@ -1076,6 +1076,32 @@ class UrlParseTestCase(unittest.TestCase):
self.assertEqual(p2.scheme, 'tel')
self.assertEqual(p2.path, '+31641044153')
+ def test_invalid_bracketed_hosts(self):
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[192.0.2.146]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[important.com:8000]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[v123r.IP]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[v12ae]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[v.IP]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[v123.]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[v]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[0439:23af::2309::fae7:1234]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[0439:23af:2309::fae7:1234:2342:438e:192.0.2.146]/Path?Query')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@]v6a.ip[/Path')
+
+ def test_splitting_bracketed_hosts(self):
+ p1 = urllib.parse.urlsplit('scheme://user@[v6a.ip]/path?query')
+ self.assertEqual(p1.hostname, 'v6a.ip')
+ self.assertEqual(p1.username, 'user')
+ self.assertEqual(p1.path, '/path')
+ p2 = urllib.parse.urlsplit('scheme://user@[0439:23af:2309::fae7]/path?query')
+ self.assertEqual(p2.hostname, '0439:23af:2309::fae7')
+ self.assertEqual(p2.username, 'user')
+ self.assertEqual(p2.path, '/path')
+ p3 = urllib.parse.urlsplit('scheme://user@[0439:23af:2309::fae7:1234:192.0.2.146]/path?query')
+ self.assertEqual(p3.hostname, '0439:23af:2309::fae7:1234:192.0.2.146')
+ self.assertEqual(p3.username, 'user')
+ self.assertEqual(p3.path, '/path')
+
def test_telurl_params(self):
p1 = urllib.parse.urlparse('tel:123-4;phone-context=+1-650-516')
self.assertEqual(p1.scheme, 'tel')
diff --git a/Lib/urllib/parse.py b/Lib/urllib/parse.py
index 717e990997..bf186b7984 100644
--- a/Lib/urllib/parse.py
+++ b/Lib/urllib/parse.py
@@ -34,6 +34,7 @@ It serves as a useful guide when making changes.
import re
import sys
import collections
+import ipaddress
__all__ = ["urlparse", "urlunparse", "urljoin", "urldefrag",
"urlsplit", "urlunsplit", "urlencode", "parse_qs",
@@ -425,6 +426,17 @@ def _remove_unsafe_bytes_from_url(url):
url = url.replace(b, "")
return url
+# Valid bracketed hosts are defined in
+# https://www.rfc-editor.org/rfc/rfc3986#page-49 and https://url.spec.whatwg.org/
+def _check_bracketed_host(hostname):
+ if hostname.startswith('v'):
+ if not re.match(r"\Av[a-fA-F0-9]+\..+\Z", hostname):
+ raise ValueError(f"IPvFuture address is invalid")
+ else:
+ ip = ipaddress.ip_address(hostname) # Raises ValueError if not IPv6 or IPv4
+ if isinstance(ip, ipaddress.IPv4Address):
+ raise ValueError(f"An IPv4 address cannot be in brackets")
+
def urlsplit(url, scheme='', allow_fragments=True):
"""Parse a URL into 5 components:
<scheme>://<netloc>/<path>?<query>#<fragment>
@@ -480,6 +492,9 @@ def urlsplit(url, scheme='', allow_fragments=True):
if (('[' in netloc and ']' not in netloc) or
(']' in netloc and '[' not in netloc)):
raise ValueError("Invalid IPv6 URL")
+ if '[' in netloc and ']' in netloc:
+ bracketed_host = netloc.partition('[')[2].partition(']')[0]
+ _check_bracketed_host(bracketed_host)
if allow_fragments and '#' in url:
url, fragment = url.split('#', 1)
if '?' in url:
diff --git a/Misc/NEWS.d/next/Library/2023-04-26-09-54-25.gh-issue-103848.aDSnpR.rst b/Misc/NEWS.d/next/Library/2023-04-26-09-54-25.gh-issue-103848.aDSnpR.rst
new file mode 100644
index 0000000000..81e5904aa6
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2023-04-26-09-54-25.gh-issue-103848.aDSnpR.rst
@@ -0,0 +1,2 @@
+Add checks to ensure that ``[`` bracketed ``]`` hosts found by
+:func:`urllib.parse.urlsplit` are of IPv6 or IPvFuture format.
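
The bracketed-host rule above can be exercised on its own, independent of which Python version is installed. The following is a standalone sketch that copies the logic of `_check_bracketed_host` under the hypothetical public name `check_bracketed_host`; it is meant for experimentation, not as a replacement for `urllib.parse`.

```python
import ipaddress
import re


def check_bracketed_host(hostname: str) -> None:
    """Raise ValueError unless hostname is an IPv6 or IPvFuture literal."""
    if hostname.startswith("v"):
        # IPvFuture: 'v', one or more hex digits, a dot, then at least one character.
        if not re.match(r"\Av[a-fA-F0-9]+\..+\Z", hostname):
            raise ValueError("IPvFuture address is invalid")
    else:
        # ip_address() raises ValueError for anything that is not IPv4 or IPv6.
        ip = ipaddress.ip_address(hostname)
        if isinstance(ip, ipaddress.IPv4Address):
            raise ValueError("An IPv4 address cannot be in brackets")


check_bracketed_host("v6a.ip")                 # accepted: IPvFuture
check_bracketed_host("0439:23af:2309::fae7")   # accepted: IPv6
```

The inputs from `test_invalid_bracketed_hosts`, such as `192.0.2.146` or `important.com:8000`, raise `ValueError` here as well.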

View file

@ -1,61 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dima Pasechnik <dimpase@gmail.com>
Date: Wed, 18 Dec 2024 14:31:08 +0100
Subject: 00446: Resolve sinpi name clash with libm
bpo-36106: Resolve sinpi name clash with libm (IEEE-754 violation). (GH-12027)
The standard math library (libm) may follow IEEE-754 recommendation to
include an implementation of sinPi(), i.e. sinPi(x):=sin(pi*x).
And this triggers a name clash, found by FreeBSD developer
Steve Kargl, who worken on putting sinpi into libm used on FreeBSD
(it has to be named "sinpi", not "sinPi", cf. e.g.
https://en.cppreference.com/w/c/experimental/fpext4).
(cherry picked from commit f57cd8288dbe6aba99c057f37ad6d58f8db75350)
Co-authored-by: Victor Stinner <vstinner@python.org>
---
Modules/mathmodule.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/Modules/mathmodule.c b/Modules/mathmodule.c
index 95ea4f7fef..670f1a04ad 100644
--- a/Modules/mathmodule.c
+++ b/Modules/mathmodule.c
@@ -67,7 +67,7 @@ static const double sqrtpi = 1.772453850905516027298167483341145182798;
static const double logpi = 1.144729885849400174143427351353058711647;
static double
-sinpi(double x)
+m_sinpi(double x)
{
double y, r;
int n;
@@ -296,7 +296,7 @@ m_tgamma(double x)
integer. */
if (absx > 200.0) {
if (x < 0.0) {
- return 0.0/sinpi(x);
+ return 0.0/m_sinpi(x);
}
else {
errno = ERANGE;
@@ -320,7 +320,7 @@ m_tgamma(double x)
}
z = z * lanczos_g / y;
if (x < 0.0) {
- r = -pi / sinpi(absx) / absx * exp(y) / lanczos_sum(absx);
+ r = -pi / m_sinpi(absx) / absx * exp(y) / lanczos_sum(absx);
r -= z * r;
if (absx < 140.0) {
r /= pow(y, absx - 0.5);
@@ -390,7 +390,7 @@ m_lgamma(double x)
r += (absx - 0.5) * (log(absx + lanczos_g - 0.5) - 1);
if (x < 0.0)
/* Use reflection formula to get value for negative x. */
- r = logpi - log(fabs(sinpi(absx))) - log(absx) - r;
+ r = logpi - log(fabs(m_sinpi(absx))) - log(absx) - r;
if (Py_IS_INFINITY(r))
errno = ERANGE;
return r;

View file

@ -1,119 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Seth Michael Larson <seth@python.org>
Date: Fri, 31 Jan 2025 11:41:34 -0600
Subject: 00450: CVE-2025-0938: Disallow square brackets ([ and ]) in domain
names for parsed URLs
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
---
Lib/test/test_urlparse.py | 37 ++++++++++++++++++-
Lib/urllib/parse.py | 20 +++++++++-
...-01-28-14-08-03.gh-issue-105704.EnhHxu.rst | 4 ++
3 files changed, 58 insertions(+), 3 deletions(-)
create mode 100644 Misc/NEWS.d/next/Security/2025-01-28-14-08-03.gh-issue-105704.EnhHxu.rst
diff --git a/Lib/test/test_urlparse.py b/Lib/test/test_urlparse.py
index 090d2f17bf..8b2f5ca50f 100644
--- a/Lib/test/test_urlparse.py
+++ b/Lib/test/test_urlparse.py
@@ -1087,16 +1087,51 @@ class UrlParseTestCase(unittest.TestCase):
self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[0439:23af::2309::fae7:1234]/Path?Query')
self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@[0439:23af:2309::fae7:1234:2342:438e:192.0.2.146]/Path?Query')
self.assertRaises(ValueError, urllib.parse.urlsplit, 'Scheme://user@]v6a.ip[/Path')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[v6a.ip]')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[v6a.ip].suffix')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[v6a.ip]/')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[v6a.ip].suffix/')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[v6a.ip]?')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[v6a.ip].suffix?')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[::1]')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[::1].suffix')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[::1]/')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[::1].suffix/')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[::1]?')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[::1].suffix?')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[::1]:a')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[::1].suffix:a')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[::1]:a1')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[::1].suffix:a1')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[::1]:1a')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[::1].suffix:1a')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[::1]:')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[::1].suffix:/')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[::1]:?')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://user@prefix.[v6a.ip]')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://user@[v6a.ip].suffix')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://[v6a.ip')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://v6a.ip]')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://]v6a.ip[')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://]v6a.ip')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://v6a.ip[')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix.[v6a.ip')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://v6a.ip].suffix')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix]v6a.ip[suffix')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://prefix]v6a.ip')
+ self.assertRaises(ValueError, urllib.parse.urlsplit, 'scheme://v6a.ip[suffix')
def test_splitting_bracketed_hosts(self):
- p1 = urllib.parse.urlsplit('scheme://user@[v6a.ip]/path?query')
+ p1 = urllib.parse.urlsplit('scheme://user@[v6a.ip]:1234/path?query')
self.assertEqual(p1.hostname, 'v6a.ip')
self.assertEqual(p1.username, 'user')
self.assertEqual(p1.path, '/path')
+ self.assertEqual(p1.port, 1234)
p2 = urllib.parse.urlsplit('scheme://user@[0439:23af:2309::fae7]/path?query')
self.assertEqual(p2.hostname, '0439:23af:2309::fae7')
self.assertEqual(p2.username, 'user')
self.assertEqual(p2.path, '/path')
+ self.assertIs(p2.port, None)
p3 = urllib.parse.urlsplit('scheme://user@[0439:23af:2309::fae7:1234:192.0.2.146]/path?query')
self.assertEqual(p3.hostname, '0439:23af:2309::fae7:1234:192.0.2.146')
self.assertEqual(p3.username, 'user')
diff --git a/Lib/urllib/parse.py b/Lib/urllib/parse.py
index bf186b7984..af41edf2ca 100644
--- a/Lib/urllib/parse.py
+++ b/Lib/urllib/parse.py
@@ -426,6 +426,23 @@ def _remove_unsafe_bytes_from_url(url):
url = url.replace(b, "")
return url
+def _check_bracketed_netloc(netloc):
+ # Note that this function must mirror the splitting
+ # done in NetlocResultMixins._hostinfo().
+ hostname_and_port = netloc.rpartition('@')[2]
+ before_bracket, have_open_br, bracketed = hostname_and_port.partition('[')
+ if have_open_br:
+ # No data is allowed before a bracket.
+ if before_bracket:
+ raise ValueError("Invalid IPv6 URL")
+ hostname, _, port = bracketed.partition(']')
+ # No data is allowed after the bracket but before the port delimiter.
+ if port and not port.startswith(":"):
+ raise ValueError("Invalid IPv6 URL")
+ else:
+ hostname, _, port = hostname_and_port.partition(':')
+ _check_bracketed_host(hostname)
+
# Valid bracketed hosts are defined in
# https://www.rfc-editor.org/rfc/rfc3986#page-49 and https://url.spec.whatwg.org/
def _check_bracketed_host(hostname):
@@ -493,8 +510,7 @@ def urlsplit(url, scheme='', allow_fragments=True):
(']' in netloc and '[' not in netloc)):
raise ValueError("Invalid IPv6 URL")
if '[' in netloc and ']' in netloc:
- bracketed_host = netloc.partition('[')[2].partition(']')[0]
- _check_bracketed_host(bracketed_host)
+ _check_bracketed_netloc(netloc)
if allow_fragments and '#' in url:
url, fragment = url.split('#', 1)
if '?' in url:
diff --git a/Misc/NEWS.d/next/Security/2025-01-28-14-08-03.gh-issue-105704.EnhHxu.rst b/Misc/NEWS.d/next/Security/2025-01-28-14-08-03.gh-issue-105704.EnhHxu.rst
new file mode 100644
index 0000000000..bff1bc6b0d
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2025-01-28-14-08-03.gh-issue-105704.EnhHxu.rst
@@ -0,0 +1,4 @@
+When using :func:`urllib.parse.urlsplit` and :func:`urllib.parse.urlparse`, host
+parsing would not reject domain names containing square brackets (``[`` and
+``]``). Square brackets are only valid for IPv6 and IPvFuture hosts according to
+`RFC 3986 Section 3.2.2 <https://www.rfc-editor.org/rfc/rfc3986#section-3.2.2>`__.
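
The position checks added by `_check_bracketed_netloc` can be sketched as a small standalone function. The name `split_bracketed_netloc` and its return value are assumptions made for illustration; the patched code returns nothing and passes the hostname on to `_check_bracketed_host`. Like the original, the sketch relies on the caller having already rejected unmatched brackets.

```python
def split_bracketed_netloc(netloc: str):
    """Return (hostname, port_text); raise ValueError for misplaced brackets."""
    # Drop any userinfo, mirroring the rpartition('@') in the patch.
    hostname_and_port = netloc.rpartition("@")[2]
    before, have_open, bracketed = hostname_and_port.partition("[")
    if not have_open:
        hostname, _, port = hostname_and_port.partition(":")
        return hostname, port
    # No data is allowed before the opening bracket.
    if before:
        raise ValueError("Invalid IPv6 URL")
    hostname, _, rest = bracketed.partition("]")
    # After the closing bracket, only a ':port' suffix is allowed.
    if rest and not rest.startswith(":"):
        raise ValueError("Invalid IPv6 URL")
    return hostname, rest[1:]


assert split_bracketed_netloc("user@[v6a.ip]:1234") == ("v6a.ip", "1234")
```

Inputs such as `prefix.[::1]` and `[::1].suffix` raise `ValueError`, which is the behaviour the new `test_invalid_bracketed_hosts` cases check through `urlsplit`.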

View file

@ -1,49 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Miss Islington (bot)"
<31488909+miss-islington@users.noreply.github.com>
Date: Mon, 31 Mar 2025 20:29:04 +0200
Subject: 00452: Properly apply exported CFLAGS for dtrace/systemtap builds
When using --with-dtrace the resulting object file could be missing
specific CFLAGS exported by the build system due to the systemtap
script using specific defaults.
Exporting the CC and CFLAGS variables before the dtrace invocation
allows us to properly apply CFLAGS exported by the build system
even when cross-compiling.
Co-authored-by: stratakis <cstratak@redhat.com>
---
Makefile.pre.in | 4 ++--
.../next/Build/2025-03-31-19-22-41.gh-issue-131865.PIJy7X.rst | 2 ++
2 files changed, 4 insertions(+), 2 deletions(-)
create mode 100644 Misc/NEWS.d/next/Build/2025-03-31-19-22-41.gh-issue-131865.PIJy7X.rst
diff --git a/Makefile.pre.in b/Makefile.pre.in
index b074b26039..825cefafd9 100644
--- a/Makefile.pre.in
+++ b/Makefile.pre.in
@@ -892,13 +892,13 @@ Python/frozen.o: $(srcdir)/Python/importlib.h $(srcdir)/Python/importlib_externa
# an include guard, so we can't use a pipeline to transform its output.
Include/pydtrace_probes.h: $(srcdir)/Include/pydtrace.d
$(MKDIR_P) Include
- $(DTRACE) $(DFLAGS) -o $@ -h -s $<
+ CC="$(CC)" CFLAGS="$(CFLAGS)" $(DTRACE) $(DFLAGS) -o $@ -h -s $<
: sed in-place edit with POSIX-only tools
sed 's/PYTHON_/PyDTrace_/' $@ > $@.tmp
mv $@.tmp $@
Python/pydtrace.o: $(srcdir)/Include/pydtrace.d $(DTRACE_DEPS)
- $(DTRACE) $(DFLAGS) -o $@ -G -s $< $(DTRACE_DEPS)
+ CC="$(CC)" CFLAGS="$(CFLAGS)" $(DTRACE) $(DFLAGS) -o $@ -G -s $< $(DTRACE_DEPS)
Objects/typeobject.o: Objects/typeslots.inc
diff --git a/Misc/NEWS.d/next/Build/2025-03-31-19-22-41.gh-issue-131865.PIJy7X.rst b/Misc/NEWS.d/next/Build/2025-03-31-19-22-41.gh-issue-131865.PIJy7X.rst
new file mode 100644
index 0000000000..a287e0b228
--- /dev/null
+++ b/Misc/NEWS.d/next/Build/2025-03-31-19-22-41.gh-issue-131865.PIJy7X.rst
@@ -0,0 +1,2 @@
+The DTrace build now properly passes the ``CC`` and ``CFLAGS`` variables
+to the ``dtrace`` command when utilizing SystemTap on Linux.

View file

@ -1,43 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Victor Stinner <vstinner@python.org>
Date: Thu, 3 Apr 2025 18:26:17 +0200
Subject: 00457: ssl: Raise OSError for ERR_LIB_SYS
The patch resolves the flakiness of test_ftplib
Backported from upstream 3.10+:
https://github.com/python/cpython/pull/127361
Co-authored-by: Petr Viktorin <encukou@gmail.com>
---
Modules/_ssl.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/Modules/_ssl.c b/Modules/_ssl.c
index 3375b2bf3f..ab8a327d10 100644
--- a/Modules/_ssl.c
+++ b/Modules/_ssl.c
@@ -638,6 +638,11 @@ PySSL_SetError(PySSLSocket *obj, int ret, const char *filename, int lineno)
errstr = "Some I/O error occurred";
}
} else {
+ if (ERR_GET_LIB(e) == ERR_LIB_SYS) {
+ // A system error is being reported; reason is set to errno
+ errno = ERR_GET_REASON(e);
+ return PyErr_SetFromErrno(PyExc_OSError);
+ }
p = PY_SSL_ERROR_SYSCALL;
}
break;
@@ -648,6 +653,11 @@ PySSL_SetError(PySSLSocket *obj, int ret, const char *filename, int lineno)
if (e == 0)
/* possible? */
errstr = "A failure in the SSL library occurred";
+ if (ERR_GET_LIB(e) == ERR_LIB_SYS) {
+ // A system error is being reported; reason is set to errno
+ errno = ERR_GET_REASON(e);
+ return PyErr_SetFromErrno(PyExc_OSError);
+ }
break;
}
default:

File diff suppressed because it is too large Load diff

View file

@ -1,212 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Urieles <aeurielesn@users.noreply.github.com>
Date: Mon, 28 Jul 2025 17:37:26 +0200
Subject: 00467: tarfile CVE-2025-8194
tarfile now validates archives to ensure member offsets are non-negative (GH-137027)
Co-authored-by: Gregory P. Smith <greg@krypto.org>
---
Lib/tarfile.py | 3 +
Lib/test/test_tarfile.py | 156 ++++++++++++++++++
...-07-23-00-35-29.gh-issue-130577.c7EITy.rst | 3 +
3 files changed, 162 insertions(+)
create mode 100644 Misc/NEWS.d/next/Library/2025-07-23-00-35-29.gh-issue-130577.c7EITy.rst
diff --git a/Lib/tarfile.py b/Lib/tarfile.py
index 1a7d5f772a..4f536cb002 100755
--- a/Lib/tarfile.py
+++ b/Lib/tarfile.py
@@ -1582,6 +1582,9 @@ class TarInfo(object):
"""Round up a byte count by BLOCKSIZE and return it,
e.g. _block(834) => 1024.
"""
+ # Only non-negative offsets are allowed
+ if count < 0:
+ raise InvalidHeaderError("invalid offset")
blocks, remainder = divmod(count, BLOCKSIZE)
if remainder:
blocks += 1
diff --git a/Lib/test/test_tarfile.py b/Lib/test/test_tarfile.py
index c5d837e716..484f114180 100644
--- a/Lib/test/test_tarfile.py
+++ b/Lib/test/test_tarfile.py
@@ -43,6 +43,7 @@ bz2name = os.path.join(TEMPDIR, "testtar.tar.bz2")
xzname = os.path.join(TEMPDIR, "testtar.tar.xz")
tmpname = os.path.join(TEMPDIR, "tmp.tar")
dotlessname = os.path.join(TEMPDIR, "testtar")
+SPACE = b" "
md5_regtype = "65f477c818ad9e15f7feab0c6d37742f"
md5_sparse = "a54fbc4ca4f4399a90e1b27164012fc6"
@@ -4005,6 +4006,161 @@ class TestExtractionFilters(unittest.TestCase):
self.expect_exception(TypeError) # errorlevel is not int
+class OffsetValidationTests(unittest.TestCase):
+ tarname = tmpname
+ invalid_posix_header = (
+ # name: 100 bytes
+ tarfile.NUL * tarfile.LENGTH_NAME
+ # mode, space, null terminator: 8 bytes
+ + b"000755" + SPACE + tarfile.NUL
+ # uid, space, null terminator: 8 bytes
+ + b"000001" + SPACE + tarfile.NUL
+ # gid, space, null terminator: 8 bytes
+ + b"000001" + SPACE + tarfile.NUL
+ # size, space: 12 bytes
+ + b"\xff" * 11 + SPACE
+ # mtime, space: 12 bytes
+ + tarfile.NUL * 11 + SPACE
+ # chksum: 8 bytes
+ + b"0011407" + tarfile.NUL
+ # type: 1 byte
+ + tarfile.REGTYPE
+ # linkname: 100 bytes
+ + tarfile.NUL * tarfile.LENGTH_LINK
+ # magic: 6 bytes, version: 2 bytes
+ + tarfile.POSIX_MAGIC
+ # uname: 32 bytes
+ + tarfile.NUL * 32
+ # gname: 32 bytes
+ + tarfile.NUL * 32
+ # devmajor, space, null terminator: 8 bytes
+ + tarfile.NUL * 6 + SPACE + tarfile.NUL
+ # devminor, space, null terminator: 8 bytes
+ + tarfile.NUL * 6 + SPACE + tarfile.NUL
+ # prefix: 155 bytes
+ + tarfile.NUL * tarfile.LENGTH_PREFIX
+ # padding: 12 bytes
+ + tarfile.NUL * 12
+ )
+ invalid_gnu_header = (
+ # name: 100 bytes
+ tarfile.NUL * tarfile.LENGTH_NAME
+ # mode, null terminator: 8 bytes
+ + b"0000755" + tarfile.NUL
+ # uid, null terminator: 8 bytes
+ + b"0000001" + tarfile.NUL
+ # gid, null terminator: 8 bytes
+ + b"0000001" + tarfile.NUL
+ # size, space: 12 bytes
+ + b"\xff" * 11 + SPACE
+ # mtime, space: 12 bytes
+ + tarfile.NUL * 11 + SPACE
+ # chksum: 8 bytes
+ + b"0011327" + tarfile.NUL
+ # type: 1 byte
+ + tarfile.REGTYPE
+ # linkname: 100 bytes
+ + tarfile.NUL * tarfile.LENGTH_LINK
+ # magic: 8 bytes
+ + tarfile.GNU_MAGIC
+ # uname: 32 bytes
+ + tarfile.NUL * 32
+ # gname: 32 bytes
+ + tarfile.NUL * 32
+ # devmajor, null terminator: 8 bytes
+ + tarfile.NUL * 8
+ # devminor, null terminator: 8 bytes
+ + tarfile.NUL * 8
+ # padding: 167 bytes
+ + tarfile.NUL * 167
+ )
+ invalid_v7_header = (
+ # name: 100 bytes
+ tarfile.NUL * tarfile.LENGTH_NAME
+ # mode, space, null terminator: 8 bytes
+ + b"000755" + SPACE + tarfile.NUL
+ # uid, space, null terminator: 8 bytes
+ + b"000001" + SPACE + tarfile.NUL
+ # gid, space, null terminator: 8 bytes
+ + b"000001" + SPACE + tarfile.NUL
+ # size, space: 12 bytes
+ + b"\xff" * 11 + SPACE
+ # mtime, space: 12 bytes
+ + tarfile.NUL * 11 + SPACE
+ # chksum: 8 bytes
+ + b"0010070" + tarfile.NUL
+ # type: 1 byte
+ + tarfile.REGTYPE
+ # linkname: 100 bytes
+ + tarfile.NUL * tarfile.LENGTH_LINK
+ # padding: 255 bytes
+ + tarfile.NUL * 255
+ )
+ valid_gnu_header = tarfile.TarInfo("filename").tobuf(tarfile.GNU_FORMAT)
+ data_block = b"\xff" * tarfile.BLOCKSIZE
+
+ def _write_buffer(self, buffer):
+ with open(self.tarname, "wb") as f:
+ f.write(buffer)
+
+ def _get_members(self, ignore_zeros=None):
+ with open(self.tarname, "rb") as f:
+ with tarfile.open(
+ mode="r", fileobj=f, ignore_zeros=ignore_zeros
+ ) as tar:
+ return tar.getmembers()
+
+ def _assert_raises_read_error_exception(self):
+ with self.assertRaisesRegex(
+ tarfile.ReadError, "file could not be opened successfully"
+ ):
+ self._get_members()
+
+ def test_invalid_offset_header_validations(self):
+ for tar_format, invalid_header in (
+ ("posix", self.invalid_posix_header),
+ ("gnu", self.invalid_gnu_header),
+ ("v7", self.invalid_v7_header),
+ ):
+ with self.subTest(format=tar_format):
+ self._write_buffer(invalid_header)
+ self._assert_raises_read_error_exception()
+
+ def test_early_stop_at_invalid_offset_header(self):
+ buffer = self.valid_gnu_header + self.invalid_gnu_header + self.valid_gnu_header
+ self._write_buffer(buffer)
+ members = self._get_members()
+ self.assertEqual(len(members), 1)
+ self.assertEqual(members[0].name, "filename")
+ self.assertEqual(members[0].offset, 0)
+
+ def test_ignore_invalid_archive(self):
+ # 3 invalid headers with their respective data
+ buffer = (self.invalid_gnu_header + self.data_block) * 3
+ self._write_buffer(buffer)
+ members = self._get_members(ignore_zeros=True)
+ self.assertEqual(len(members), 0)
+
+ def test_ignore_invalid_offset_headers(self):
+ for first_block, second_block, expected_offset in (
+ (
+ (self.valid_gnu_header),
+ (self.invalid_gnu_header + self.data_block),
+ 0,
+ ),
+ (
+ (self.invalid_gnu_header + self.data_block),
+ (self.valid_gnu_header),
+ 1024,
+ ),
+ ):
+ self._write_buffer(first_block + second_block)
+ members = self._get_members(ignore_zeros=True)
+ self.assertEqual(len(members), 1)
+ self.assertEqual(members[0].name, "filename")
+ self.assertEqual(members[0].offset, expected_offset)
+
+
def setUpModule():
support.unlink(TEMPDIR)
os.makedirs(TEMPDIR)
diff --git a/Misc/NEWS.d/next/Library/2025-07-23-00-35-29.gh-issue-130577.c7EITy.rst b/Misc/NEWS.d/next/Library/2025-07-23-00-35-29.gh-issue-130577.c7EITy.rst
new file mode 100644
index 0000000000..342cabbc86
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2025-07-23-00-35-29.gh-issue-130577.c7EITy.rst
@@ -0,0 +1,3 @@
+:mod:`tarfile` now validates archives to ensure member offsets are
+non-negative. (Contributed by Alexander Enrique Urieles Nieto in
+:gh:`130577`.)
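
The check added to `TarInfo._block` can be tried in isolation. The following is a minimal sketch that mirrors the patched logic as a standalone function; the name `round_up_to_block` is an assumption made for this example, while `tarfile.BLOCKSIZE` and `tarfile.InvalidHeaderError` are the real module attributes.

```python
import tarfile

BLOCKSIZE = tarfile.BLOCKSIZE  # 512 bytes per tar block


def round_up_to_block(count):
    """Round a byte count up to a multiple of BLOCKSIZE.

    Standalone copy of the patched logic, for illustration only.
    """
    # Only non-negative offsets are allowed.
    if count < 0:
        raise tarfile.InvalidHeaderError("invalid offset")
    blocks, remainder = divmod(count, BLOCKSIZE)
    if remainder:
        blocks += 1
    return blocks * BLOCKSIZE


print(round_up_to_block(834))
```

Without the guard, floor division would round a negative count to zero or to a negative multiple of `BLOCKSIZE`, so the next member offset would fail to advance or would move backwards.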

View file

@ -1,42 +0,0 @@
diff --git a/pip/_internal/utils/misc.py b/pip/_internal/utils/misc.py
index 84a421f..fbdb654 100644
--- a/pip/_internal/utils/misc.py
+++ b/pip/_internal/utils/misc.py
@@ -532,6 +532,13 @@ def untar_file(filename, location):
if leading:
fn = split_leading_dir(fn)[1]
path = os.path.join(location, fn)
+
+ # Call the `data` filter for its side effect (raising exception)
+ try:
+ tarfile.data_filter(member.replace(name=fn), location)
+ except tarfile.LinkOutsideDestinationError:
+ pass
+
if member.isdir():
ensure_dir(path)
elif member.issym():
diff --git a/pip/_vendor/distlib/util.py b/pip/_vendor/distlib/util.py
index 0b14a93..8f3f12e 100644
--- a/pip/_vendor/distlib/util.py
+++ b/pip/_vendor/distlib/util.py
@@ -1238,6 +1238,19 @@ def unarchive(archive_filename, dest_dir, format=None, check=True):
for tarinfo in archive.getmembers():
if not isinstance(tarinfo.name, text_type):
tarinfo.name = tarinfo.name.decode('utf-8')
+
+ # Limit extraction of dangerous items, if this Python
+ # allows it easily. If not, just trust the input.
+ # See: https://docs.python.org/3/library/tarfile.html#extraction-filters
+ def extraction_filter(member, path):
+ """Run tarfile.tar_filter, but raise the expected ValueError"""
+ # This is only called if the current Python has tarfile filters
+ try:
+ return tarfile.tar_filter(member, path)
+ except tarfile.FilterError as exc:
+ raise ValueError(str(exc))
+ archive.extraction_filter = extraction_filter
+
archive.extractall(dest_dir)
finally:
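
Both hunks above call a tarfile extraction filter for its side effect of raising on dangerous members. A minimal sketch of that idea follows, assuming a Python that ships the extraction-filter API (the function returns `None` when it is missing). The helper name `is_rejected_by_data_filter` is hypothetical.

```python
import tarfile
import tempfile


def is_rejected_by_data_filter(name, dest):
    """Return True if tarfile.data_filter refuses a member with this name.

    Returns None when this Python has no extraction filters.
    Hypothetical helper for illustration; not part of the patch.
    """
    if not hasattr(tarfile, "data_filter"):
        return None
    member = tarfile.TarInfo(name)
    try:
        # Called only for its side effect: it raises on unsafe members.
        tarfile.data_filter(member, dest)
    except tarfile.FilterError:
        return True
    return False


with tempfile.TemporaryDirectory() as dest:
    escaping = is_rejected_by_data_filter("../evil.txt", dest)
    harmless = is_rejected_by_data_filter("ok.txt", dest)

print(escaping, harmless)
```

Catching the base class `tarfile.FilterError` covers all filter rejections, whereas the pip hunk deliberately swallows only `LinkOutsideDestinationError` and lets the other rejections propagate.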

View file

@ -1,38 +0,0 @@
execute:
how: tmt
environment:
pybasever: '3.6'
discover:
- name: tests_python
how: shell
url: https://src.fedoraproject.org/tests/python.git
tests:
- name: smoke
path: /smoke
test: "VERSION=${pybasever} TOX_REQUIRES='virtualenv<20.22.0' ./venv.sh"
- name: debugsmoke
path: /smoke
test: "PYTHON=python${pybasever}dm TOX=false VERSION=${pybasever} INSTALL_OR_SKIP=true ./venv.sh"
- name: marshalparser
path: /marshalparser
test: "VERSION=${pybasever} SAMPLE=10 ./test_marshalparser_compatibility.sh"
prepare:
- name: Install dependencies
how: install
package:
- gcc
- python3-tox
- python${pybasever}
- glibc-all-langpacks # for locale tests
- marshalparser # for testing compatibility (magic numbers) with marshalparser
- dnf # for upgrade
- name: Update packages
how: shell
script: dnf upgrade -y
- name: rpm_qa
order: 100
how: shell
script: rpm -qa | sort | tee $TMT_PLAN_DATA/rpmqa.txt

View file

@ -77,9 +77,3 @@ addFilter(r'\bpython3(\.\d+)?\.(src|spec): (E|W): specfile-error\s+$')
# SPELLING ERRORS
addFilter(r'spelling-error .* en_US (bytecode|pyc|filename|tkinter|namespaces|pytest) ')
# These bundled provides are declared twice, as they're bundled twice
# separately in pip and setuptools.
addFilter(r'useless-provides bundled\(python3dist\(packaging\)\)')
addFilter(r'useless-provides bundled\(python3dist\(setuptools\)\)')
addFilter(r'useless-provides bundled\(python3dist\(six\)\)')

File diff suppressed because it is too large

View file

@ -1,83 +0,0 @@
From 8af1b3e03edc8a38565558aff3bf1689c1ca3545 Mon Sep 17 00:00:00 2001
From: Lumir Balhar <lbalhar@redhat.com>
Date: Fri, 26 Jul 2024 13:49:11 +0200
Subject: [PATCH] CVE-2024-6345
---
setuptools/package_index.py | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
diff --git a/setuptools/package_index.py b/setuptools/package_index.py
index bdcf4a6..1d3e5b4 100755
--- a/setuptools/package_index.py
+++ b/setuptools/package_index.py
@@ -1,4 +1,5 @@
"""PyPI and direct package downloading"""
+import subprocess
import sys
import os
import re
@@ -848,7 +849,7 @@ class PackageIndex(Environment):
def _download_svn(self, url, filename):
url = url.split('#', 1)[0] # remove any fragment for svn's sake
- creds = ''
+ creds = []
if url.lower().startswith('svn:') and '@' in url:
scheme, netloc, path, p, q, f = urllib.parse.urlparse(url)
if not netloc and path.startswith('//') and '/' in path[2:]:
@@ -857,14 +858,14 @@ class PackageIndex(Environment):
if auth:
if ':' in auth:
user, pw = auth.split(':', 1)
- creds = " --username=%s --password=%s" % (user, pw)
+ creds = ["--username=" + user, "--password=" + pw]
else:
- creds = " --username=" + auth
+ creds = ["--username=" + auth]
netloc = host
parts = scheme, netloc, url, p, q, f
url = urllib.parse.urlunparse(parts)
self.info("Doing subversion checkout from %s to %s", url, filename)
- os.system("svn checkout%s -q %s %s" % (creds, url, filename))
+ subprocess.check_call(["svn", "checkout"] + creds + ["-q", url, filename])
return filename
@staticmethod
@@ -890,14 +891,11 @@ class PackageIndex(Environment):
url, rev = self._vcs_split_rev_from_url(url, pop_prefix=True)
self.info("Doing git clone from %s to %s", url, filename)
- os.system("git clone --quiet %s %s" % (url, filename))
+ subprocess.check_call(["git", "clone", "--quiet", url, filename])
if rev is not None:
self.info("Checking out %s", rev)
- os.system("(cd %s && git checkout --quiet %s)" % (
- filename,
- rev,
- ))
+ subprocess.check_call(["git", "-C", filename, "checkout", "--quiet", rev])
return filename
@@ -906,14 +904,11 @@ class PackageIndex(Environment):
url, rev = self._vcs_split_rev_from_url(url, pop_prefix=True)
self.info("Doing hg clone from %s to %s", url, filename)
- os.system("hg clone --quiet %s %s" % (url, filename))
+ subprocess.check_call(["hg", "clone", "--quiet", url, filename])
if rev is not None:
self.info("Updating to %s", rev)
- os.system("(cd %s && hg up -C -r %s -q)" % (
- filename,
- rev,
- ))
+ subprocess.check_call(["hg", "--cwd", filename, "up", "-C", "-r", rev, "-q"])
return filename
--
2.45.2
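
The patch above replaces shell command strings with argument lists. The sketch below shows why that matters, without running any command: a crafted URL stays a single argument in the list form, while the string form would be re-parsed by a shell. The helper names and the example URL are assumptions made for this illustration.

```python
import shlex


def build_clone_argv(url, filename):
    """Argument list in the style passed to subprocess.check_call."""
    return ["git", "clone", "--quiet", url, filename]


def build_clone_shell_string(url, filename):
    """Command string in the style of the old os.system call (unsafe)."""
    return "git clone --quiet %s %s" % (url, filename)


# Hypothetical attacker-controlled URL containing shell metacharacters.
malicious = "https://example.invalid/repo.git; touch /tmp/pwned"

argv = build_clone_argv(malicious, "dest")
shell_words = shlex.split(build_clone_shell_string(malicious, "dest"))

# The list form keeps the URL as one element; the string form splits it.
print(len(argv), len(shell_words))
```

`subprocess.check_call` also raises `CalledProcessError` on a non-zero exit status, whereas the removed `os.system` calls ignored the return value.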

22
tests/tests.yml Normal file
View file

@ -0,0 +1,22 @@
---
- hosts: localhost
roles:
- role: standard-test-basic
tags:
- classic
repositories:
- repo: "https://src.fedoraproject.org/tests/python.git"
dest: "python"
tests:
- smoke:
dir: python/smoke
run: VERSION=3.6 ./venv.sh
- marshalparser:
dir: python/marshalparser
run: VERSION=3.6 SAMPLE=10 test_marshalparser_compatibility.sh
required_packages:
- gcc
- python3-tox
- python3.6
- glibc-all-langpacks # for locale tests
- marshalparser # for testing compatibility (magic numbers) with marshalparser