gh-113775: Fix HttpOnly Prefix Issue in MozillaCookieJar.save method#113795
gh-113775: Fix HttpOnly Prefix Issue in MozillaCookieJar.save method#113795kairi003 wants to merge 7 commits intopython:mainfrom
Conversation
Cookie.has_nonstandard_attr method is case-sensitive, but HTTP is case-insensitive for cookie attribute names. This option is useful for MozillaCookieJar to check the HttpOnly flag when saving.
This commit updates the value of the HTTPONLY_ATTR constant from "HTTPOnly" to the more commonly used "HttpOnly". It makes HTTP communication more consistent.
Modified attribute checking in MozillaCookieJar.save to be case-insensitive, aligning with HTTP standards. This change resolves the issue where HttpOnly prefix was not correctly appended due to case-sensitive checks.
145e60e to
6329495
Compare
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
4f99b12 to
6329495
Compare
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
picnixz
left a comment
There was a problem hiding this comment.
I don't really know whether it should considered a feature or a bug fix as we're aligning with the RFC now. I think it could be considered a bug fix but OTOH, we're changing the HTTPONLY_ATTR.
Now, be careful that the value is also used in _really_load:
if line.startswith(HTTPONLY_PREFIX):
rest[HTTPONLY_ATTR] = ""
line = line[len(HTTPONLY_PREFIX):]In this case, shouldn't we do something as well with the case? because otherwise, files that were successfully parsed, wouldn't be now.
cc @vadmium as my favorite web RFC expert c:
| for key in ["foo1", "foo2", "foo3"]: | ||
| matches = [x for x in lines if key in x] | ||
| self.assertEqual(len(matches), 1, | ||
| "Incorrect number of matches for cookie with value %r" % key) | ||
| self.assertTrue(matches[0].startswith("#HttpOnly_"), | ||
| "Cookie with value %r is missing the HttpOnly prefix" % key) | ||
|
|
||
| # Check that the HttpOnly prefix is not added to the correct cookies | ||
| for key in ["foo4"]: | ||
| matches = [x for x in lines if key in x] | ||
| self.assertEqual(len(matches), 1, | ||
| "Incorrect number of matches for cookie with value %r" % key) | ||
| self.assertFalse(matches[0].startswith("#HttpOnly_"), | ||
| "Cookie with value %r has the HttpOnly prefix" % key) |
There was a problem hiding this comment.
You can use self.subTest(key=key) and drop the assertion messages. It could be easier.
|
@picnixz
You're right, Here's what I've found upon further review:
Therefore, this change only affects users who directly reference If full backward compatibility is required, we could keep I am not a specialist in this area, so I will defer to the judgment of the reviewers and those who have more expertise. Thank you :) |
|
As a side note, when you make a typical request, import requests
resp = requests.get('https://google.com')
cookie = list(resp.cookies)[0]
print(cookie._rest)
# -> {'HttpOnly': None, 'SameSite': 'lax'} |
That I know. I encountered this kind of issues when writing my own crawler so I'm not surprised. However, the best way to do make it smooth is to support both names, and emit a warning if |
|
@picnixz Both values in `_rest` dictHTTPONLY_ATTR = "HttpOnly"
HTTPONLY_ATTR_OLD = "HTTPOnly"
class MozillaCookieJar(FileCookieJar):
def _really_load(self, f, filename, ignore_discard, ignore_expires):
...
if line.startswith(HTTPONLY_PREFIX):
rest[HTTPONLY_ATTR] = ""
rest[HTTPONLY_ATTR_OLD] = ""
...
class Cookie:
def has_nonstandard_attr(self, name, case_insensitive=False):
if name == HTTPONLY_ATTR_OLD:
warnings.warn(warning_message)
...
def get_nonstandard_attr(self, name, default=None):
if name == HTTPONLY_ATTR_OLD:
warnings.warn(warning_message)
...
def set_nonstandard_attr(self, name, value):
if name == HTTPONLY_ATTR_OLD:
warnings.warn(warning_message)
...Automatically replacingclass MozillaCookieJar(FileCookieJar):
def _really_load(self, f, filename, ignore_discard, ignore_expires):
...
if line.startswith(HTTPONLY_PREFIX):
rest[HTTPONLY_ATTR] = ""
...
class Cookie:
def has_nonstandard_attr(self, name, case_insensitive=False):
if name == HTTPONLY_ATTR_OLD:
warnings.warn(warning_message)
name = HTTPONLY_ATTR
...
def get_nonstandard_attr(self, name, default=None):
if name == HTTPONLY_ATTR_OLD:
warnings.warn(warning_message)
name = HTTPONLY_ATTR
...
def set_nonstandard_attr(self, name, value):
if name == HTTPONLY_ATTR_OLD:
warnings.warn(warning_message)
name = HTTPONLY_ATTR
... |
|
I'm more worried with APIs overriding those methods so I think we should rather not change it automatically. We should either way document this change but I don't know which one would be the best (which is why I'd like to wait for another core dev). Personally, I would just replace |
|
I think adding the new case_insensitive public parameter is not great for a bug fix. I presume there should be an “Added/changed in version X” notice for it in the documentation. As a new feature, I would make it a keyword-only parameter and probably also add it to the get_nonstandard_attr method. And I wonder if case_insensitive=True should be the default or only behaviour. On one hand this would be a change in behaviour; on the other hand it might mitigate compatibility concerns with changing HTTPONLY_ADDR. Regarding the letter case of HTTPONLY_ATTR, changing it seems unnecessary to fix the bug. I understand changing it would only affect the “nonstandard attr” name in Cookie objects read from a file. That seems to have minimal reward and minimal risk, but it does seem nicer to use the casing from RFC 6265. I would only make the change in the next Python version. |
|
This PR is stale because it has been open for 30 days with no activity. |
MozillaCookieJar recognizes and writes the
#HttpOnly_prefix, based on curl's specifications. However, it is uncommon for cookies obtained via HTTP to be written with the HttpOnly prefix.This issue stems from the case-sensitive nature of the HttpOnly attribute check and the fact that the value of the constant
HTTPONLY_ATTRis"HTTPOnly"instead of the more commonly used"HttpOnly".In accordance with RFC6265, this PR proposes the addition of a
case_insensitiveoption toCookie.has_nonstandard_attr. Additionally, it modifies the value ofHTTPONLY_ATTRto the more widely recognized"HttpOnly".This PR includes code changes, docs fixes, and additional tests.
📚 Documentation preview 📚: https://cpython-previews--113795.org.readthedocs.build/