Task Queues
The darc.model.tasks module defines the data models required for the task queue of darc.
See also
Please refer to the darc.db module for more information about the task queues.
Hostname Queue
Important
The hostname queue is a set named queue_hostname in a Redis-based task queue.
The darc.model.tasks.hostname module contains the data model defined for the hostname queue.
class darc.model.tasks.hostname.HostnameQueueModel(*args, **kwargs)
Bases: darc.model.abc.BaseModel
Hostname task queue.
- DoesNotExist
  alias of HostnameQueueModelDoesNotExist
- hostname: Union[str, peewee.TextField] = <TextField: HostnameQueueModel.hostname>
  Hostname (cf. link.host).
- id = <AutoField: HostnameQueueModel.id>
- timestamp: Union[datetime.datetime, peewee.DateTimeField] = <DateTimeField: HostnameQueueModel.timestamp>
  Timestamp of last update.
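As a rough illustration of the columns listed above, the sketch below mirrors the hostname queue layout with the standard-library sqlite3 module rather than peewee. The table name hostname_queue, the UNIQUE constraint on hostname, and the helpers create_hostname_queue and add_hostname are assumptions made for this example and are not part of darc; the uniqueness constraint only imitates the set-like behaviour of the Redis queue_hostname set.

```python
import sqlite3
from datetime import datetime


def create_hostname_queue(conn):
    """Create a table with the same three columns as HostnameQueueModel."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hostname_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " hostname TEXT NOT NULL UNIQUE,"  # assumption: one row per hostname
        " timestamp TEXT NOT NULL)"
    )


def add_hostname(conn, hostname, now):
    """Insert a hostname, or refresh its timestamp if it is already queued.

    Returns True when a new row was created, False when an existing row
    was updated.
    """
    row = conn.execute(
        "SELECT id FROM hostname_queue WHERE hostname = ?", (hostname,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO hostname_queue (hostname, timestamp) VALUES (?, ?)",
            (hostname, now.isoformat()),
        )
        return True
    conn.execute(
        "UPDATE hostname_queue SET timestamp = ? WHERE id = ?",
        (now.isoformat(), row[0]),
    )
    return False
```

Calling add_hostname twice with the same hostname leaves a single row whose timestamp reflects the second call, which is the behaviour the "timestamp of last update" column suggests.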
Crawler Queue
The darc.model.tasks.requests module contains the data model defined for the crawler queue.
class darc.model.tasks.requests.RequestsQueueModel(*args, **kwargs)
Bases: darc.model.abc.BaseModel
Task queue for crawler().
- DoesNotExist
  alias of RequestsQueueModelDoesNotExist
- hash: Union[str, peewee.CharField] = <CharField: RequestsQueueModel.hash>
  SHA-256 hash value (cf. Link.name).
- id = <AutoField: RequestsQueueModel.id>
- link: Union[darc.link.Link, darc.model.utils.PickleField] = <PickleField: RequestsQueueModel.link>
  Pickled target Link instance.
- text: Union[str, peewee.TextField] = <TextField: RequestsQueueModel.text>
  URL as raw text (cf. Link.url).
- timestamp: Union[datetime.datetime, peewee.DateTimeField] = <DateTimeField: RequestsQueueModel.timestamp>
  Timestamp of last update.
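To make the relationship between the text, hash and link columns more concrete, here is a minimal sketch that stores a pickled link object next to its URL and SHA-256 digest. It uses the standard-library sqlite3 and pickle modules as a stand-in for peewee and darc.model.utils.PickleField. The dict returned by make_link is a hypothetical substitute for darc.link.Link, and hashing the URL itself is an assumption about how Link.name is derived; the table name and helper functions are likewise invented for illustration.

```python
import hashlib
import pickle
import sqlite3
from datetime import datetime
from urllib.parse import urlparse


def make_link(url):
    """Hypothetical stand-in for darc.link.Link, kept as a plain dict."""
    return {
        "url": url,
        "host": urlparse(url).hostname,
        # Assumption: the name is the SHA-256 hex digest of the URL.
        "name": hashlib.sha256(url.encode("utf-8")).hexdigest(),
    }


def create_requests_queue(conn):
    """Create a table with the same columns as RequestsQueueModel."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " text TEXT NOT NULL,"
        " hash TEXT NOT NULL UNIQUE,"  # assumption: one row per link
        " link BLOB NOT NULL,"
        " timestamp TEXT NOT NULL)"
    )


def enqueue(conn, link, now):
    """Store the link; an existing row with the same hash is replaced."""
    conn.execute(
        "INSERT OR REPLACE INTO requests_queue (text, hash, link, timestamp)"
        " VALUES (?, ?, ?, ?)",
        (link["url"], link["name"], pickle.dumps(link), now.isoformat()),
    )


def load_links(conn):
    """Return the unpickled link objects, oldest timestamp first."""
    rows = conn.execute(
        "SELECT link FROM requests_queue ORDER BY timestamp"
    ).fetchall()
    return [pickle.loads(blob) for (blob,) in rows]
```

Keeping the raw URL and its digest in separate columns lets a row be looked up without unpickling the link column, which is one plausible reason for the redundancy in the model.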
Loader Queue
The darc.model.tasks.selenium module contains the data model defined for the loader queue.
class darc.model.tasks.selenium.SeleniumQueueModel(*args, **kwargs)
Bases: darc.model.abc.BaseModel
Task queue for loader().
- DoesNotExist
  alias of SeleniumQueueModelDoesNotExist
- hash: Union[str, peewee.CharField] = <CharField: SeleniumQueueModel.hash>
  SHA-256 hash value (cf. Link.name).
- id = <AutoField: SeleniumQueueModel.id>
- link: Union[darc.link.Link, darc.model.utils.PickleField] = <PickleField: SeleniumQueueModel.link>
  Pickled target Link instance.
- text: Union[str, peewee.TextField] = <TextField: SeleniumQueueModel.text>
  URL as raw text (cf. Link.url).
- timestamp: Union[datetime.datetime, peewee.DateTimeField] = <DateTimeField: SeleniumQueueModel.timestamp>
  Timestamp of last update.
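The timestamp column makes it possible to read queue entries in time order. The sketch below shows one way a consumer could select the oldest entries from a table shaped like the loader queue. It is not a description of how darc itself schedules work: the table name selenium_queue and the helpers add_entry and oldest_entries are hypothetical, sqlite3 stands in for peewee, and the link column is omitted for brevity.

```python
import hashlib
import sqlite3
from datetime import datetime


def create_selenium_queue(conn):
    """Create a reduced table modelled on SeleniumQueueModel (no link column)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS selenium_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " text TEXT NOT NULL,"
        " hash TEXT NOT NULL UNIQUE,"
        " timestamp TEXT NOT NULL)"
    )


def add_entry(conn, url, now):
    """Queue a URL; the hash is assumed to be the SHA-256 digest of the URL."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO selenium_queue (text, hash, timestamp)"
        " VALUES (?, ?, ?)",
        # Fixed-width ISO 8601 strings sort in chronological order.
        (url, digest, now.isoformat(timespec="microseconds")),
    )


def oldest_entries(conn, before, limit=10):
    """Return up to `limit` URLs last updated before `before`, oldest first."""
    rows = conn.execute(
        "SELECT text FROM selenium_queue"
        " WHERE timestamp < ? ORDER BY timestamp LIMIT ?",
        (before.isoformat(timespec="microseconds"), limit),
    ).fetchall()
    return [text for (text,) in rows]
```

Because the crawler and loader queue models declare the same fields, the same query shape would apply to either table.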