Docker Integration
==================

The :mod:`darc` project is integrated with Docker and Compose.
Although the image is published to `Docker Hub`_, you can still
build it yourself.

.. _Docker Hub: https://hub.docker.com/r/jsnbzh/darc

.. important::

   The ``debug`` image contains miscellaneous documents, i.e. the
   whole repository, and comes with some useful debugging tools
   pre-installed, such as IPython.

The Docker image is based on `Ubuntu Bionic`_ (18.04 LTS). It sets up
all Python dependencies for the :mod:`darc` project, installs
`Google Chrome`_ (version 79.0.3945.36) and the corresponding
`ChromeDriver`_, and installs and configures the Tor_, I2P_, ZeroNet_,
Freenet_ and NoIP_ proxies.

.. _Ubuntu Bionic: http://releases.ubuntu.com/18.04.4
.. _Google Chrome: https://www.google.com/chrome
.. _ChromeDriver: https://chromedriver.chromium.org
.. _Tor: https://www.torproject.org
.. _I2P: https://geti2p.net
.. _ZeroNet: https://zeronet.io
.. _Freenet: https://freenetproject.org
.. _NoIP: https://www.noip.com

.. note::

   `NoIP`_ is currently not fully integrated in :mod:`darc` because
   its configuration process is not yet fully understood.
   Contributions are welcome.

When building the image, there is an *optional* argument for setting up
a *non-root* user, cf. environment variable :data:`DARC_USER` and module
constant :data:`~darc.const.DARC_USER`. By default, the username is
``darc``.

.. raw:: html

   <details>
     <summary>Content of Dockerfile</summary>

.. literalinclude:: ../../Dockerfile
   :language: dockerfile

.. note::

   * ``retry`` is a shell script that retries a command until it
     succeeds.

     .. raw:: html

        <details>
          <summary>Content of retry</summary>

     .. literalinclude:: ../../extra/retry.sh
        :language: shell

     .. raw:: html

        </details>

   * ``pty-install`` is a Python script that simulates user input for
     APT package installation with ``DEBIAN_FRONTEND`` set to
     ``Teletype``.

     .. raw:: html

        <details>
          <summary>Content of pty-install</summary>

     .. literalinclude:: ../../extra/install.py
        :language: python

     .. raw:: html

        </details>

.. raw:: html

   </details>
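As a rough illustration of building the image locally with a custom
non-root user, the sketch below composes a ``docker build`` command.
It assumes that ``DARC_USER`` is declared as a build argument in the
``Dockerfile``; the user name ``crawler`` and the tag ``darc:local`` are
placeholders, not names used by the project.

.. code-block:: shell

   set -eu

   # Build the darc image with a custom non-root user.
   # Assumption: the Dockerfile accepts DARC_USER as a build argument.
   # With DRY_RUN=1 (the default) the command is only printed.
   darc_build() {
       user="${1:-darc}"        # non-root user name, defaults to "darc"
       tag="${2:-darc:local}"   # illustrative image tag
       if [ "${DRY_RUN:-1}" = "1" ]; then
           echo "docker build --build-arg DARC_USER=${user} --tag ${tag} ."
       else
           docker build --build-arg "DARC_USER=${user}" --tag "${tag}" .
       fi
   }

   darc_build "${DARC_USER:-darc}" "${IMAGE_TAG:-darc:local}"

Run it from the repository root with ``DRY_RUN=0`` to perform the actual
build; the dry-run default makes it easy to inspect the command first.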
You can also use Docker Compose to manage the :mod:`darc` image.
Environment variables can be set as described in the configuration
section.

.. raw:: html

   <details>
     <summary>Content of docker-compose.yml</summary>

.. literalinclude:: ../../docker-compose.yml
   :language: yaml

.. raw:: html

   </details>
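A typical Compose workflow might look like the following sketch. It
uses only generic Compose subcommands and makes no assumption about the
service names defined in ``docker-compose.yml``. The ``COMPOSE`` and
``DRY_RUN`` variables and the ``run`` helper are conveniences introduced
here for illustration.

.. code-block:: shell

   set -eu

   # Compose front end; older installations ship the standalone
   # "docker-compose" binary instead of the "docker compose" plugin.
   COMPOSE="${COMPOSE:-docker compose}"

   # Print the command in dry-run mode (the default), otherwise execute it.
   run() {
       if [ "${DRY_RUN:-1}" = "1" ]; then
           echo "$*"
       else
           "$@"
       fi
   }

   # ${COMPOSE} is left unquoted on purpose so that "docker compose"
   # is split into two words.
   run ${COMPOSE} build             # build the image(s) defined in docker-compose.yml
   run ${COMPOSE} up --detach       # start the services in the background
   run ${COMPOSE} logs --tail=20    # show the last lines of the service logs

Set ``DRY_RUN=0`` to execute the commands, and ``COMPOSE=docker-compose``
if only the standalone binary is available.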
.. note::

   Should you wish to run :mod:`darc` in reboot mode, i.e. set
   :envvar:`DARC_REBOOT` and/or :data:`~darc.const.REBOOT` to
   :data:`True`, you may wish to change the entrypoint to

   .. code-block:: shell

      bash /app/run.sh

   where ``run.sh`` is a shell script that wraps around :mod:`darc`
   specifically for reboot mode.

   .. raw:: html

      <details>
        <summary>Content of run.sh</summary>

   .. literalinclude:: ../../extra/run.sh
      :language: shell

   .. raw:: html

      </details>

   In such a scenario, you can customise your ``run.sh`` to, for
   instance, archive and then upload the data crawled so far by
   :mod:`darc` to somewhere else and free up some disk space.
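The archive-then-upload idea could be sketched as a pair of helper
functions called from a customised ``run.sh`` between crawling rounds.
This is not the project's own script: the ``data`` and ``archive``
directory names, the function names and the upload step are all
hypothetical and should be adapted to the actual deployment.

.. code-block:: shell

   set -eu

   # Hypothetical locations; adjust to wherever darc writes its data.
   DATA_DIR="${DATA_DIR:-data}"
   ARCHIVE_DIR="${ARCHIVE_DIR:-archive}"

   # Pack a directory into a timestamped tarball and print the tarball path.
   archive_data() {
       src="$1"
       dest="$2"
       mkdir -p "${dest}" || return 1
       tarball="${dest}/darc-$(date +%Y%m%d-%H%M%S).tar.gz"
       tar -czf "${tarball}" -C "$(dirname "${src}")" "$(basename "${src}")" || return 1
       echo "${tarball}"
   }

   # Placeholder upload step; replace the body with scp, rclone or similar.
   upload_archive() {
       echo "would upload $1"
   }

   if [ -d "${DATA_DIR}" ]; then
       tarball="$(archive_data "${DATA_DIR}" "${ARCHIVE_DIR}")"
       upload_archive "${tarball}"
       # Free disk space; reached only if archiving and upload succeeded.
       rm -rf "${DATA_DIR:?}"/*
   fi

Deleting the crawled data only after both earlier steps succeed keeps a
failed upload from losing anything; ``set -e`` stops the script at the
first failing step.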