- darc.parse.URL_PAT: List[re.Pattern]
Regular expression patterns to match all reasonable URLs.
Currently, there are two built-in patterns:
HTTP(S) and other regular URLs, e.g. WebSocket, IRC, etc.
re.compile(r'(?P<url>((https?|wss?|irc):)?(//)?\w+(\.\w+)+/?\S*)', re.UNICODE),
Bitcoin accounts, data URIs, (ED2K) magnet links, email addresses, telephone numbers, JavaScript functions, etc.
re.compile(r'(?P<url>(bitcoin|data|ed2k|magnet|mailto|script|tel):\w+)', re.ASCII)
- Environ
See also
The patterns are used in darc.parse.extract_links_from_text().
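To illustrate how the two built-in patterns behave, here is a minimal sketch that applies them to a short piece of text. The list is copied from the definitions above; the helper name `find_urls` and the sample string are hypothetical and only for demonstration. This is not the implementation of darc.parse.extract_links_from_text(), which may post-process or filter matches differently.

```python
import re

# The two built-in patterns, copied from the documentation above.
URL_PAT = [
    re.compile(r'(?P<url>((https?|wss?|irc):)?(//)?\w+(\.\w+)+/?\S*)', re.UNICODE),
    re.compile(r'(?P<url>(bitcoin|data|ed2k|magnet|mailto|script|tel):\w+)', re.ASCII),
]


def find_urls(text):
    """Hypothetical helper: collect the ``url`` group of every match.

    Results are grouped by pattern, in the order the patterns
    appear in ``URL_PAT``.
    """
    found = []
    for pattern in URL_PAT:
        for match in pattern.finditer(text):
            found.append(match.group('url'))
    return found


# Sample text is an assumption chosen for illustration.
sample = 'See https://www.example.com/index.html or call tel:5551234 today'
print(find_urls(sample))
# → ['https://www.example.com/index.html', 'tel:5551234']
```

Note that the second pattern only consumes word characters after the scheme, so a value such as an email address would be cut off at the first `@`.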