* Luigi orchestrator (orchestrator). This service provides both the task executor and a web interface to check the status of your workflows.
The tasks are executed periodically, according to the period defined in `tasks.py:Main`.
By default, the period is 24h.
The web interface shows the status of each task and is available at http://localhost:8082.
* GSICrawler: a service that retrieves data from different sources (e.g. Twitter and Facebook). Its API is available at http://localhost:5000, and its Flower monitoring interface at http://localhost:5555.
* Elasticsearch: the official Elasticsearch image, available at localhost:19200.
* Senpy: a service for sentiment and semantic analysis, mapped to http://localhost:5000/.
* Somedi dashboard (sefarad): a website developed with Sefarad that displays the data stored in Elasticsearch.
It is available at http://localhost:8080.
* Redis: a dependency of GSICrawler.
The docker-compose definition adds all these services to the same network, so they can communicate with each other using the service name, without exposing external ports.
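Such a shared network can be sketched as follows. This is a minimal, illustrative fragment only; the images, ports, and environment variables shown here are assumptions, not the project's actual `docker-compose.yml`:

```yaml
# Illustrative sketch: images, ports, and variables are assumptions.
services:
  orchestrator:
    build: .
    environment:
      # Services reach each other by service name on the shared network.
      - ES_ENDPOINT=http://elasticsearch:9200
  elasticsearch:
    image: elasticsearch
    ports:
      - "19200:9200"   # exposed externally on localhost:19200
  redis:
    image: redis
```

Note that only the ports listed under `ports:` are reachable from the host; internal traffic (e.g. orchestrator to Elasticsearch) stays on the compose network.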
The endpoints used in each service (e.g. the Elasticsearch endpoint in the gsicrawler service) are configurable through environment variables.
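A minimal sketch of how a service can read such an endpoint, falling back to the internal service name on the compose network. The variable name `ES_ENDPOINT` and the default value are illustrative assumptions, not the project's actual settings:

```python
import os


def es_endpoint() -> str:
    """Return the Elasticsearch endpoint, preferring the environment variable.

    NOTE: the variable name and default below are illustrative assumptions;
    on the docker-compose network, services reach each other by service name.
    """
    return os.environ.get("ES_ENDPOINT", "http://elasticsearch:9200")


print(es_endpoint())
```

When run inside the compose network with no override, this resolves to the internal service name; setting `ES_ENDPOINT` (e.g. to `http://localhost:19200` on the host) redirects the service without changing its code.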