URL Parsing in Python

This program shows how to parse URLs in Python with the standard library's urllib.parse module.

from urllib.parse import urlparse, parse_qs

def main():
    # We'll parse this example URL, which includes a
    # scheme, authentication info, host, port, path,
    # query params, and fragment.
    s = "postgres://user:pass@host.com:5432/path?k=v#f"

    # Parse the URL. urlparse returns a ParseResult named
    # tuple; there is no separate error value to check.
    u = urlparse(s)

    # Accessing the scheme is straightforward.
    print(u.scheme)

    # For authentication info, we split the netloc by hand.
    # This assumes the URL contains a user:password@ part.
    auth, rest = u.netloc.split('@', 1)
    user, password = auth.split(':', 1)
    print(f"{user}:{password}")
    print(user)
    print(password)

    # The rest of the netloc holds the hostname and the
    # port. This manual split assumes a port is present.
    host, port = rest.split(':', 1)
    print(f"{host}:{port}")
    print(host)
    print(port)

    # Here we extract the path and the fragment after
    # the #.
    print(u.path)
    print(u.fragment)

    # To get query params in a string of k=v format,
    # use query. You can also parse query params into a
    # dictionary that maps each key to a list of values,
    # so index [0] to get the first value.
    print(u.query)
    m = parse_qs(u.query)
    print(m)
    print(m['k'][0])

if __name__ == "__main__":
    main()

Running our URL parsing program shows all the different pieces that we extracted.

$ python url_parsing.py
postgres
user:pass
user
pass
host.com:5432
host.com
5432
/path
f
k=v
{'k': ['v']}
v

This code uses the urllib.parse module: urlparse breaks the URL into its components, and parse_qs parses the query string into a dictionary that maps each key to a list of values.
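
As a small sketch of why parse_qs returns lists, consider a query string in which a key repeats. The query string below is made up for illustration and is not part of the example program; parse_qsl is the sibling function that returns ordered pairs instead of a dictionary.

```python
from urllib.parse import parse_qs, parse_qsl

# Hypothetical query string with a repeated key.
query = "k=v1&k=v2&page=2"

# parse_qs groups values by key; every value is a list.
params = parse_qs(query)
print(params)  # {'k': ['v1', 'v2'], 'page': ['2']}

# parse_qsl keeps the original order as (key, value) pairs.
pairs = parse_qsl(query)
print(pairs)   # [('k', 'v1'), ('k', 'v2'), ('page', '2')]
```

Values are always strings, so a numeric parameter such as page still needs an explicit int() conversion.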

Note that the result of urlparse also exposes username, password, hostname, and port attributes directly. These are more robust than splitting netloc by hand, which raises ValueError when the URL has no credentials or no port.
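
A minimal sketch of reading credentials, host, and port straight from the parsed result, using the same example URL. The second URL (https://example.com/path) is an assumption added here to show what happens when those parts are missing.

```python
from urllib.parse import urlparse

u = urlparse("postgres://user:pass@host.com:5432/path?k=v#f")

# These attributes are derived from netloc by the standard library.
print(u.username)  # user
print(u.password)  # pass
print(u.hostname)  # host.com
print(u.port)      # 5432 (an int, not a string)

# Missing parts come back as None instead of raising an exception.
bare = urlparse("https://example.com/path")
print(bare.username, bare.port)  # None None
```

Note that hostname is returned in lower case and port is an integer, so formatting them back into host:port needs no extra conversion inside an f-string.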

The rest closely mirrors the original Go example: each part of the URL is read as an attribute of the parsed result.