Base Sites Customisation

The darc.sites._abc module provides the abstract base class for site customisations. All site customisations must inherit from BaseSite.

Important

The BaseSite class is NOT intended to be imported directly from the darc.sites._abc module. Import it from darc.sites instead.

class darc.sites._abc.BaseSite

Bases: object

Abstract base class for sites customisation.

static crawler(timestamp, session, link)

Crawler hook for my site.

Parameters:
  • timestamp (datetime) – Timestamp of the worker node reference.

  • session (Session) – Session object with proxy settings.

  • link (Link) – Link object to be crawled.

Raises:

LinkNoReturn – This link has no return response.

Return type:

Union[NoReturn, Response]
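The crawler hook described above might be overridden roughly as follows. This is a minimal sketch, not darc's own implementation: BaseSite and LinkNoReturn are local stand-ins so that the snippet runs without darc installed, FakeLink and FakeSession are hypothetical test doubles, and the url attribute on the link and the get() method on the session are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


class LinkNoReturn(Exception):
    """Local stand-in for darc's LinkNoReturn exception."""


class BaseSite:
    """Local stand-in for darc.sites.BaseSite so the sketch runs without darc."""
    hostname = None


@dataclass
class FakeLink:
    """Hypothetical link object; only a ``url`` attribute is assumed."""
    url: str


class FakeSession:
    """Hypothetical session object; records every URL it is asked to fetch."""

    def __init__(self):
        self.requested = []

    def get(self, url):
        self.requested.append(url)
        return f"response for {url}"


class ExampleSite(BaseSite):
    """Illustrative customisation for a made-up host."""
    hostname = ['example.com']

    @staticmethod
    def crawler(timestamp, session, link):
        # Signal "no response for this link" by raising LinkNoReturn.
        if link.url.endswith('/logout'):
            raise LinkNoReturn(link.url)
        # Otherwise fetch the link and hand the response back to the caller.
        return session.get(link.url)
```

In a real customisation the class would inherit from the BaseSite imported from darc.sites, and the session would be the proxy-configured object that darc passes in.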

static loader(timestamp, driver, link)

Loader hook for my site.

Parameters:
  • timestamp (datetime) – Timestamp of the worker node reference.

  • driver (selenium.webdriver.Chrome) – Web driver object with proxy settings.

  • link (Link) – Link object to be loaded.

Raises:

LinkNoReturn – This link has no return response.

Return type:

Union[NoReturn, WebDriver]
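A loader hook follows the same shape as the crawler hook, but works on a web driver and returns that driver. The sketch below is illustrative only: BaseSite and LinkNoReturn are local stand-ins, FakeLink and FakeDriver are hypothetical test doubles, and the url attribute on the link is an assumption. The get() call mirrors the Selenium WebDriver method of the same name.

```python
from datetime import datetime


class LinkNoReturn(Exception):
    """Local stand-in for darc's LinkNoReturn exception."""


class BaseSite:
    """Local stand-in for darc.sites.BaseSite so the sketch runs without darc."""
    hostname = None


class FakeLink:
    """Hypothetical link object; only a ``url`` attribute is assumed."""

    def __init__(self, url):
        self.url = url


class FakeDriver:
    """Hypothetical web driver; only get() is mimicked."""

    def __init__(self):
        self.current_url = None

    def get(self, url):
        # Like Selenium's WebDriver.get, this navigates and returns None.
        self.current_url = url


class ExampleSite(BaseSite):
    """Illustrative customisation for a made-up host."""
    hostname = ['example.com']

    @staticmethod
    def loader(timestamp, driver, link):
        # Refuse anything that is not a web page by raising LinkNoReturn.
        if not link.url.startswith(('http://', 'https://')):
            raise LinkNoReturn(link.url)
        # Navigate, then return the same driver so the caller can read the page.
        driver.get(link.url)
        return driver
```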

hostname: Optional[List[str]] = None

Hostnames (case-insensitive) that the site customisation is designed for.
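One way to honour the case-insensitive hostname contract is to normalise hostnames both when registering and when looking up a customisation. The sketch below is an assumption about how such a lookup could work, not a description of darc's registration mechanism: build_registry and find_site are hypothetical helpers, and BaseSite is a local stand-in.

```python
class BaseSite:
    """Local stand-in for darc.sites.BaseSite so the sketch runs without darc."""
    hostname = None


class ExampleSite(BaseSite):
    """Illustrative customisation that declares its hostnames in mixed case."""
    hostname = ['Example.com', 'www.example.com']


def build_registry(*sites):
    """Map each declared hostname, case-folded, to its site class (hypothetical helper)."""
    registry = {}
    for site in sites:
        # ``hostname`` may be None, as it is on the base class.
        for host in site.hostname or []:
            registry[host.casefold()] = site
    return registry


def find_site(registry, host, default=BaseSite):
    """Look up a host regardless of letter case; fall back to ``default``."""
    return registry.get(host.casefold(), default)
```

With this registry, find_site(build_registry(ExampleSite), 'EXAMPLE.COM') resolves to ExampleSite, while an unregistered host falls back to the default class.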