Default Hooks
The darc.sites.default module is the fallback for site customisation.
- class darc.sites.default.DefaultSite
Bases:
BaseSite
Default hooks.
- static crawler(timestamp, session, link)
Default crawler hook.
- Parameters:
timestamp (datetime) – Timestamp of the worker node reference.
session (requests.Session) – Session object with proxy settings.
link (Link) – Link object to be crawled.
- Returns:
The final response object with crawled data.
- Return type:
See also
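As an illustration of the crawler hook signature described above, here is a minimal sketch rather than darc's actual implementation. The `Link` dataclass and the `ExampleSite` class are hypothetical stand-ins, and the sketch assumes the link exposes a `url` attribute and that a single GET request through the session is sufficient.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Link:
    """Hypothetical stand-in for darc's Link object; only ``url`` is modelled."""
    url: str


class ExampleSite:
    """Hypothetical site class showing the shape of a crawler hook."""

    @staticmethod
    def crawler(timestamp: datetime, session, link: Link):
        # Assumption: fetching the link's URL with the given session is all
        # the hook needs to do. ``session`` is expected to behave like a
        # requests.Session, i.e. to provide a ``get(url)`` method.
        return session.get(link.url)
```

With a real `requests.Session`, the call would return the response for `link.url`; any object with a compatible `get` method works for experimentation.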
- static loader(timestamp, driver, link)
Default loader hook.
When loading, if SE_WAIT is a valid time lapse, the function will sleep for that duration so the page can finish loading its contents.
- Parameters:
- Returns:
The web driver object with loaded data.
- Return type:
selenium.webdriver.Chrome
Note
Internally, selenium will wait for the browser to finish loading the page before returning (i.e. the web API event DOMContentLoaded). However, some extra scripts may take more time to run after the event.
See also
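The wait-after-load behaviour described for the loader hook can be sketched as follows. This is not darc's implementation: the module-level `SE_WAIT` value, the `wait` and `sleep` parameters, and the reading of "valid time lapse" as "a positive number of seconds" are all assumptions made for illustration. The link is assumed to expose a `url` attribute.

```python
import time

# Hypothetical stand-in for darc's SE_WAIT setting, in seconds.
# None (or a non-positive value) is treated here as "no extra wait".
SE_WAIT = 0.05


def loader(timestamp, driver, link, wait=SE_WAIT, sleep=time.sleep):
    """Load ``link.url`` in the web driver, then optionally pause."""
    # ``driver`` is expected to behave like a selenium WebDriver,
    # whose ``get(url)`` navigates to the page.
    driver.get(link.url)

    # Assumption: a "valid time lapse" is a positive int or float.
    if isinstance(wait, (int, float)) and wait > 0:
        # Give late-running scripts time to finish after the page load.
        sleep(wait)

    return driver
```

Passing `sleep` as a parameter is a design choice of this sketch only; it lets the pause be replaced with a recorder or a no-op when experimenting without a real browser.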