Solver:
Let’s add a take a look at case first to confirm the difficulty with a number of slashes and authentication.
Let’s test the present take a look at file first to see the place we should always add our take a look at.
Studying exams/test_requests.py
Let’s add our take a look at case to the TestRequests class. First, let’s discover a good location close to different URL-related exams.
Trying to find test_path
Excellent, let’s add our take a look at case proper after the test_path_is_not_double_encoded take a look at.
IDG
Now let’s modify the prepare_url technique in fashions.py to protect a number of slashes. The problem is probably going in how the trail is dealt with earlier than being handed to urlunparse.
IDG
Be aware {that a} null is a suspect alternative of marker.
Solver:
Let’s run the exams to confirm our adjustments repair the difficulty. We’ll use pytest to run simply our new take a look at case.
$ python -m pytest exams/test_requests.py -k test_path_with_multiple_slashes -v ============================= take a look at session begins ============================== platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /usr/native/bin/python cachedir: .pytest_cache rootdir: /repo configfile: pyproject.toml plugins: cov-5.0.0, httpbin-2.1.0 gathering ... collected 330 gadgets / 329 deselected / 1 chosen exams/test_requests.py::TestRequests::test_path_with_multiple_slashes FAILED [100%] =================================== FAILURES =================================== _________________ TestRequests.test_path_with_multiple_slashes _________________ self =
def test_path_with_multiple_slashes(self): """Check that URLs with a number of slashes are preserved.""" url = "http://instance.com//path//to//useful resource" request = requests.Request("GET", url).put together() > assert request.url == url E AssertionError: assert 'http://examp...topercent00resource' == 'http://examp.../to//useful resource' E E - http://instance.com//path//to//useful resource E ? ^ ^^ ^^ E + http://instance.com/%00pathpercent00topercent00resource E ? ^^^ ^^^ ^^^ exams/test_requests.py:151: AssertionError =========================== brief take a look at abstract data ============================ FAILED exams/test_requests.py::TestRequests::test_path_with_multiple_slashes ====================== 1 failed, 329 deselected in 0.75s ======================= Exit Code: 1 The take a look at failure exhibits that our null byte substitute strategy isn’t working appropriately – the null bytes are getting URL encoded. Let’s strive a unique strategy utilizing a extra URL-safe non permanent marker.
IDG
We have to add the re import on the prime of the file.
Let’s test the present imports first.
Studying src/requests/fashions.py
I see we are able to add re to the present imports. Let’s add it with the opposite customary library imports on the prime.
IDG
Now let’s run the take a look at once more to see if our new strategy works.
$ python -m pytest exams/test_requests.py -k test_path_with_multiple_slashes -v ============================= take a look at session begins ============================== platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /usr/native/bin/python cachedir: .pytest_cache rootdir: /repo configfile: pyproject.toml plugins: httpbin-2.1.0, cov-5.0.0 gathering ... collected 330 gadgets / 329 deselected / 1 chosen exams/test_requests.py::TestRequests::test_path_with_multiple_slashes FAILED [100%] =================================== FAILURES =================================== _________________ TestRequests.test_path_with_multiple_slashes _________________ self =
def test_path_with_multiple_slashes(self): """Check that URLs with a number of slashes are preserved.""" url = "http://instance.com//path//to//useful resource" > request = requests.Request("GET", url).put together() exams/test_requests.py:150: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ src/requests/fashions.py:299: in put together p.put together( src/requests/fashions.py:368: in put together self.prepare_url(url, params) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = , url="http://instance.com/path/to/useful resource" params = {} def prepare_url(self, url, params): """Prepares the given HTTP URL.""" #: Settle for objects which have string representations. #: We're unable to blindly name unicode/str features #: as this may embrace the bytestring indicator (b'') #: on python 3.x. #: https://github.com/psf/requests/pull/2238 if isinstance(url, bytes): url = url.decode("utf8") else: url = str(url) # Take away main whitespaces from url url = url.lstrip() # Do not do any URL preparation for non-HTTP schemes like `mailto`, # `knowledge` and many others to work round exceptions from `url_parse`, which # handles RFC 3986 solely. if ":" in url and never url.decrease().startswith("http"): self.url = url return # Help for unicode domains and paths. strive: scheme, auth, host, port, path, question, fragment = parse_url(url) besides LocationParseError as e: elevate InvalidURL(*e.args) if not scheme: elevate MissingSchema( f"Invalid URL {url!r}: No scheme provided. " f"Maybe you meant https://{url}?" ) if not host: elevate InvalidURL(f"Invalid URL {url!r}: No host provided") # Basically, we wish to strive IDNA encoding the hostname if the string comprises # non-ASCII characters. This permits customers to mechanically get the right IDNA # behaviour. For strings containing solely ASCII characters, we have to additionally confirm # it does not begin with a wildcard (*), earlier than permitting the unencoded hostname. if not unicode_is_ascii(host): strive: host = self._get_idna_encoded_host(host) besides UnicodeError: elevate InvalidURL("URL has an invalid label.") elif host.startswith(("*", ".")): elevate InvalidURL("URL has an invalid label.") # Rigorously reconstruct the community location netloc = auth or "" if netloc: netloc += "@" netloc += host if port: netloc += f":{port}" # Naked domains aren't legitimate URLs. if not path: path = "/" else: # Protect a number of consecutive slashes within the path original_path = path path = re.sub("//+", "/", path) # Normalize to single slashes for parsing if isinstance(params, (str, bytes)): params = to_native_string(params) enc_params = self._encode_params(params) if enc_params: if question: question = f"{question}&{enc_params}" else: question = enc_params url = requote_uri(urlunparse([scheme, netloc, path, None, query, fragment])) # Restore a number of slashes in the event that they had been current within the authentic path if path != "/" and "original_path" in locals(): # Discover the trail portion within the remaining URL and change it > url_parts = listing(urlparse(url)) E NameError: title 'urlparse' isn't outlined src/requests/fashions.py:490: NameError =========================== brief take a look at abstract data ============================ FAILED exams/test_requests.py::TestRequests::test_path_with_multiple_slashes ====================== 1 failed, 329 deselected in 0.55s ======================= Exit Code: 1 Ah, we have to use the imported urlparse. Let’s repair the imports and check out a less complicated strategy.
IDG
Let’s run the take a look at once more to see if this less complicated strategy works.
$ python -m pytest exams/test_requests.py -k test_path_with_multiple_slashes -v ============================= take a look at session begins ============================== platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /usr/native/bin/python cachedir: .pytest_cache rootdir: /repo configfile: pyproject.toml plugins: httpbin-2.1.0, cov-5.0.0 gathering ... collected 330 gadgets / 329 deselected / 1 chosen exams/test_requests.py::TestRequests::test_path_with_multiple_slashes PASSED [100%] ====================== 1 handed, 329 deselected in 0.62s ======================= Exit Code: 0
The take a look at passes. The less complicated answer works as a result of we’re letting the URL parts go via with none normalization. Let’s submit our adjustments which embrace:
- Added a take a look at case to confirm a number of slashes are preserved in URLs with and with out authentication
- Modified prepare_url to protect the precise path construction with out normalization
- Added a remark explaining the preservation of a number of slashes
Solver completed
Dialogue with a founder
Someplace in the course of the session proven within the earlier part, you’ll see that I pressed the proceed fixing button and advised Solver, “You’re heading in the right direction; please proceed.”
I puzzled why Solver had stopped right here to attend for consumer enter, so I despatched in a problem report and requested, “Why does Solver cease after figuring out the issue? Sure, it did proceed after I pressed the suitable button. This can be a query, not a bug report.”