Docker Integration
==================
The :mod:`darc` project is integrated with Docker and
Docker Compose. Although the image is published to `Docker Hub`_,
you can still build it yourself.

.. _Docker Hub: https://hub.docker.com/r/jsnbzh/darc
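
Both routes can be sketched as follows. The local tag ``darc:local`` is a
hypothetical name chosen for illustration, and the build command assumes
it is run from the repository root, where the ``Dockerfile`` lives.

.. code:: shell

   # pull the published image from Docker Hub (default tag)
   docker pull jsnbzh/darc

   # or build the image locally from the repository root;
   # "darc:local" is an illustrative tag, not one defined by the project
   docker build --tag darc:local .
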
.. important::

   The ``debug`` image contains miscellaneous documents, i.e. the
   whole repository, and comes with some useful debugging tools
   pre-installed, such as IPython.

The Docker image is based on `Ubuntu Bionic`_ (18.04 LTS). It
sets up all Python dependencies for the :mod:`darc` project,
installs `Google Chrome`_ (version 79.0.3945.36) and the
corresponding `ChromeDriver`_, and installs and configures the
Tor_, I2P_, ZeroNet_, Freenet_ and NoIP_ proxies.

.. _Ubuntu Bionic: http://releases.ubuntu.com/18.04.4
.. _Google Chrome: https://www.google.com/chrome
.. _ChromeDriver: https://chromedriver.chromium.org
.. _Tor: https://www.torproject.org
.. _I2P: https://geti2p.net
.. _ZeroNet: https://zeronet.io
.. _Freenet: https://freenetproject.org
.. _NoIP: https://www.noip.com
.. note::

   `NoIP`_ is not yet fully integrated into :mod:`darc`, due to
   an incomplete understanding of its configuration process.
   Contributions are welcome.

When building the image, you can pass an *optional* argument
to set up a *non-root* user; see the environment variable
:data:`DARC_USER` and the module constant :data:`~darc.const.DARC_USER`.
By default, the username is ``darc``.
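
As a minimal sketch, a custom username could be supplied at build time as
shown below. This assumes the build argument carries the same name as the
environment variable (``DARC_USER``); check the ``Dockerfile`` for the
actual ``ARG`` name. The username ``crawler`` and the tag ``darc:local``
are hypothetical.

.. code:: shell

   # assumption: the Dockerfile declares "ARG DARC_USER";
   # "crawler" and "darc:local" are illustrative values only
   docker build --build-arg DARC_USER=crawler --tag darc:local .
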
.. raw:: html
Content of Dockerfile
.. literalinclude:: ../../Dockerfile
   :language: dockerfile
.. note::

   * ``retry`` is a shell script that retries a command until it
     succeeds.

.. raw:: html
Content of retry
.. literalinclude:: ../../extra/retry.sh
   :language: shell
.. raw:: html
* ``pty-install`` is a Python script that simulates user
  input for APT package installation with ``DEBIAN_FRONTEND``
  set to ``Teletype``.
.. raw:: html
Content of pty-install
.. literalinclude:: ../../extra/install.py
   :language: python
.. raw:: html
.. raw:: html
You can also use Docker Compose to manage the :mod:`darc`
image. Environment variables can be set as described in the
configuration section.
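
A typical Compose workflow might look like the sketch below. It assumes
the commands are run from the directory holding ``docker-compose.yml``
and that the Docker Compose v2 plugin is installed (with the older
standalone tool, replace ``docker compose`` with ``docker-compose``).

.. code:: shell

   # print the resolved configuration, including environment variables
   docker compose config

   # build the image(s) and start the services in the background
   docker compose up --build --detach

   # follow the logs of the running services
   docker compose logs --follow

   # stop and remove the containers
   docker compose down
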
.. raw:: html
Content of docker-compose.yml
.. literalinclude:: ../../docker-compose.yml
   :language: yaml
.. raw:: html
.. note::

   Should you wish to run :mod:`darc` in reboot mode, i.e. with
   :envvar:`DARC_REBOOT` and/or :data:`~darc.const.REBOOT`
   set to :data:`True`, you may wish to change the entrypoint to

   .. code:: shell

      bash /app/run.sh

   where ``run.sh`` is a shell script that wraps around :mod:`darc`
   specifically for reboot mode.

.. raw:: html
Content of run.sh
.. literalinclude:: ../../extra/run.sh
   :language: shell
.. raw:: html
In such a scenario, you can customise ``run.sh`` to, for
instance, archive the data crawled so far by :mod:`darc`, upload
it elsewhere, and free up some disk space.
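
A customised ``run.sh`` could include an archiving helper along the
following lines. This is only a sketch, not the script shipped with
:mod:`darc`: the function name ``archive_data`` and the paths
``/app/data`` and ``/app/archive`` are hypothetical, and the upload step
is left as a placeholder for whichever tool you use.

.. code:: shell

   # Hypothetical locations; adjust them to your deployment.
   DATA_DIR="${DATA_DIR:-/app/data}"
   ARCHIVE_DIR="${ARCHIVE_DIR:-/app/archive}"

   # Pack a directory into a timestamped tarball and print the tarball path.
   # The body runs in a subshell so its variables do not leak to the caller.
   archive_data() (
       src="$1"
       dest="$2"
       stamp="$(date -u +%Y%m%dT%H%M%SZ)"
       archive="${dest}/darc-${stamp}.tar.gz"
       mkdir -p "${dest}"
       tar -czf "${archive}" -C "$(dirname "${src}")" "$(basename "${src}")"
       printf '%s\n' "${archive}"
   )

   # Intended use between two crawl rounds:
   #   archive="$(archive_data "${DATA_DIR}" "${ARCHIVE_DIR}")"
   #   ...upload "${archive}" with a tool of your choice...
   #   ...then delete the archived data to free disk space...

Printing the archive path lets the caller hand it straight to the upload
step, and delete the original data only after the upload has succeeded.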