darc.parse.URL_PAT: List[re.Pattern]

Regular expression patterns to match all reasonable URLs.

Currently, we have two built-in patterns:

  1. HTTP(S) and other regular URLs, e.g. WebSocket, IRC, etc.

re.compile(r'(?P<url>((https?|wss?|irc):)?(//)?\w+(\.\w+)+/?\S*)', re.UNICODE),

  2. Bitcoin accounts, data URIs, (ED2K) magnet links, email addresses, telephone numbers, JavaScript functions, etc.

re.compile(r'(?P<url>(bitcoin|data|ed2k|magnet|mailto|script|tel):\w+)', re.ASCII)

Environ

DARC_URL_PAT

See also

The patterns are used in darc.parse.extract_links_from_text().
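
As an illustration of how the two built-in patterns behave, here is a minimal sketch that applies them to a piece of text and collects the named group url from every match. The helper find_urls and the sample string are hypothetical and exist only for this example; this is not the implementation of darc.parse.extract_links_from_text(), which may process matches differently.

```python
import re

# The two built-in patterns, copied from the listing above.
URL_PAT = [
    re.compile(r'(?P<url>((https?|wss?|irc):)?(//)?\w+(\.\w+)+/?\S*)', re.UNICODE),
    re.compile(r'(?P<url>(bitcoin|data|ed2k|magnet|mailto|script|tel):\w+)', re.ASCII),
]


def find_urls(text):
    """Hypothetical helper: return the ``url`` group of every match.

    Patterns are tried in list order, and matches of each pattern are
    collected in the order they appear in ``text``.
    """
    found = []
    for pattern in URL_PAT:
        for match in pattern.finditer(text):
            found.append(match.group('url'))
    return found


# Made-up sample text for illustration only.
sample = 'Visit https://www.example.com/index.html or call tel:5550100 today'
print(find_urls(sample))
# → ['https://www.example.com/index.html', 'tel:5550100']
```

Note that the second pattern only accepts word characters after the scheme, so it captures the leading part of a link such as a telephone number or a bare mailbox name, not arbitrary punctuation that may follow.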