Commit 342f52dd authored by Oscar Araque

Changed video in documentation

parent bda7f754
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 2d5a5c7809c635f59f57d3d6c5b0e526
config: b90dfcb485c09835d2491617fac32ade
tags: 645f666f9bcd5a90fca523b33c5a78b7
......@@ -6,7 +6,7 @@ GSI Crawler is an innovative and useful framework which aims to extract informat
.. image:: images/crawler1.png
:align: center
In this documentation we are going to introduce this framework, detailing the global architecture of the project and explaining each module functionality. Finally we will expose most a case study in order to better understand the system itself. A demo video about GSI Crawler is available `here <https://www.youtube.com/watch?v=x9jzGDZs5hY&feature=youtu.be>`_.
In this documentation we are going to introduce this framework, detailing the global architecture of the project and explaining each module functionality. Finally we will expose most a case study in order to better understand the system itself. A demo video about GSI Crawler is available `here <https://www.youtube.com/watch?v=3s894sjevBQ&feature=youtu.be>`_.
......
......@@ -272,10 +272,15 @@ div.admonition {
}
div.admonition tt.xref, div.admonition code.xref, div.admonition a tt {
background-color: #FBFBFB;
background-color: ;
border-bottom: 1px solid #fafafa;
}
dd div.admonition {
margin-left: -60px;
padding-left: 60px;
}
div.admonition p.admonition-title {
font-family: 'Garamond', 'Georgia', serif;
font-weight: normal;
......@@ -438,16 +443,6 @@ table.field-list p {
margin-bottom: 0.8em;
}
/* Cloned from
* https://github.com/sphinx-doc/sphinx/commit/ef60dbfce09286b20b7385333d63a60321784e68
*/
.field-name {
-moz-hyphens: manual;
-ms-hyphens: manual;
-webkit-hyphens: manual;
hyphens: manual;
}
table.footnote td.label {
width: .1px;
padding: 0.3em 0 0.3em 0.5em;
......@@ -493,6 +488,11 @@ dl pre, blockquote pre, li pre {
padding-left: 30px;
}
dl dl pre {
margin-left: -90px;
padding-left: 90px;
}
tt, code {
background-color: #ecf0f3;
color: #222;
......
......@@ -4,7 +4,7 @@
*
* Sphinx stylesheet -- basic theme.
*
* :copyright: Copyright 2007-2017 by the Sphinx team, see AUTHORS.
* :copyright: Copyright 2007-2016 by the Sphinx team, see AUTHORS.
* :license: BSD, see LICENSE for details.
*
*/
......@@ -398,13 +398,6 @@ table.field-list td, table.field-list th {
margin: 0;
}
.field-name {
-moz-hyphens: manual;
-ms-hyphens: manual;
-webkit-hyphens: manual;
hyphens: manual;
}
/* -- other body styles ----------------------------------------------------- */
ol.arabic {
......
......@@ -4,7 +4,7 @@
*
* Sphinx JavaScript utilities for all documentation.
*
* :copyright: Copyright 2007-2017 by the Sphinx team, see AUTHORS.
* :copyright: Copyright 2007-2016 by the Sphinx team, see AUTHORS.
* :license: BSD, see LICENSE for details.
*
*/
......
......@@ -4,7 +4,7 @@
*
* Sphinx JavaScript utilities for the full-text search.
*
* :copyright: Copyright 2007-2017 by the Sphinx team, see AUTHORS.
* :copyright: Copyright 2007-2016 by the Sphinx team, see AUTHORS.
* :license: BSD, see LICENSE for details.
*
*/
......
......@@ -4,7 +4,7 @@
*
* sphinx.websupport utilities for all documentation.
*
* :copyright: Copyright 2007-2017 by the Sphinx team, see AUTHORS.
* :copyright: Copyright 2007-2016 by the Sphinx team, see AUTHORS.
* :license: BSD, see LICENSE for details.
*
*/
......
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Architecture &#8212; GSI Crawler 1.0 documentation</title>
<link rel="stylesheet" href="_static/alabaster.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
......@@ -31,7 +34,7 @@
<meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" />
</head>
<body>
<body role="document">
<div class="document">
......@@ -64,7 +67,7 @@
<ul class="simple">
<li><strong>Fetch</strong> refers to the process of obtaining tweets, comments or any content which is desired to be analyzed, from the provided URL. Most of the times, this task involves webpage parsing, recognizing valuable information contained inside html tags and building a new JSON file with the selected data. This process is commonly known as <em>scraping</em> a website. In order to facilitate this filtering process,there exist multiple extensions or libraries that offer a well-formed structure to carry out this task in a more comfortable way. Inside the Tasks Server, we have imported the Scrapy library in order to agilize the data mining process. Scrapy is an open source and collaborative framework for extracting the data from websites, in a fast, simple, yet extensible way. It is based on sub classes named <em>spiders</em>, which contain the required methods to extract the information. Apart from the use of the Scrapy library, several APIs have also been used for retrieving data. The GSI Crawler application has three available scrapers, one for each Twitter and Reddit platform, and another one which includes spiders for different news sources. So to conclude, this task focuses on extracting the valuable data and generates a JSON which can be analyzed by the following task in the pipeline.</li>
<li><strong>Analyze</strong> task is responsible of taking the input JSON file generated by the previous task, parsing it and analyzing each text strign using Senpy remote server for it. Senpy service is based on HTTP calls, obtaining an analyzed result for the text attached in the request. Once the task has collected the analysis result, it generates another JSON containing the original sentence and its analysis result.</li>
<li><strong>Store</strong> process consists on storing the JSON generated previously which contains the analysis result inside ElasticSearch instance or Fuseki. ElasticSearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores the data so it is possible to discover the expected and uncover the unexpected. To carry out the saving process, its necessary to provide two arguments, the <strong>index</strong>, which represents the elastic index where the information will be saved, and the <strong>doc type</strong>, which allows to categorize information that belongs to the same index. It exists a third parameter which is the <strong>id</strong> of the query, but it is automatically generated by default.</li>
<li><strong>Store</strong> process consists on storing the JSON generated previously which contains the analysis result inside ElasticSearch instance or Fuseki. ElasticSearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores the data so it is possible to discover the expected and uncover the unexpected. To carry out the saving process, it&#8217;s necessary to provide two arguments, the <strong>index</strong>, which represents the elastic index where the information will be saved, and the <strong>doc type</strong>, which allows to categorize information that belongs to the same index. It exists a third parameter which is the <strong>id</strong> of the query, but it is automatically generated by default.</li>
</ul>
<p>To better understand these concepts, we are going to give a clear example that shows how the pipeline processes work internally. Imagine that the user requests a <strong>sentiment</strong> analysis for a certain <strong>Tweet</strong>. One ElasticSearch parameters approach that would fit could be, <strong>twitter</strong> as the ElasticSearch <em>index</em>, <strong>sentiment</strong> as the <em>doc type</em> because there could exist an emotion within the same platform, and lastly the <em>id</em> that could be the <strong>datetime</strong> when the task request was triggered.</p>
<p>Once the Luigi orchestator has been explained, we will conclude this section detailing how the server behaves when it receives a user request, and what parameters are mandatory to run the operation. The workflow is shown in diagram below:</p>
......@@ -138,8 +141,8 @@
&copy;2017, Antonio F. Llamas and Rodrigo Barbado Esteban.
|
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.6.3</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.10</a>
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.5</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.9</a>
|
<a href="_sources/architecture.rst.txt"
......
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Index &#8212; GSI Crawler 1.0 documentation</title>
<link rel="stylesheet" href="_static/alabaster.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
......@@ -31,7 +34,7 @@
<meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" />
</head>
<body>
<body role="document">
<div class="document">
......@@ -99,8 +102,8 @@
&copy;2017, Antonio F. Llamas and Rodrigo Barbado Esteban.
|
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.6.3</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.10</a>
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.5</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.9</a>
</div>
......
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>What is GSI Crawler? &#8212; GSI Crawler 1.0 documentation</title>
<link rel="stylesheet" href="_static/alabaster.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
......@@ -32,7 +35,7 @@
<meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" />
</head>
<body>
<body role="document">
<div class="document">
......@@ -44,7 +47,7 @@
<h1>What is GSI Crawler?<a class="headerlink" href="#what-is-gsi-crawler" title="Permalink to this headline"></a></h1>
<p>GSI Crawler is an innovative and useful framework which aims to extract information from web pages enriching following semantic approaches. At the moment, there are three available platforms: Twitter, Reddit and News. The user interacts with the tool through a web interface, selecting the analysis type he wants to carry out and the platform that is going to be examined.</p>
<img alt="_images/crawler1.png" class="align-center" src="_images/crawler1.png" />
<p>In this documentation we are going to introduce this framework, detailing the global architecture of the project and explaining each module functionality. Finally we will expose most a case study in order to better understand the system itself. A demo video about GSI Crawler is available <a class="reference external" href="https://www.youtube.com/watch?v=x9jzGDZs5hY&amp;feature=youtu.be">here</a>.</p>
<p>In this documentation we are going to introduce this framework, detailing the global architecture of the project and explaining each module functionality. Finally we will expose most a case study in order to better understand the system itself. A demo video about GSI Crawler is available <a class="reference external" href="https://www.youtube.com/watch?v=3s894sjevBQ&amp;feature=youtu.be">here</a>.</p>
</div>
......@@ -100,8 +103,8 @@
&copy;2017, Antonio F. Llamas and Rodrigo Barbado Esteban.
|
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.6.3</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.10</a>
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.5</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.9</a>
|
<a href="_sources/gsicrawler.rst.txt"
......
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Welcome to GSI Crawler’s documentation! &#8212; GSI Crawler 1.0 documentation</title>
<link rel="stylesheet" href="_static/alabaster.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
......@@ -31,7 +34,7 @@
<meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" />
</head>
<body>
<body role="document">
<div class="document">
......@@ -40,7 +43,7 @@
<div class="body" role="main">
<div class="section" id="welcome-to-gsi-crawler-s-documentation">
<h1>Welcome to GSI Crawlers documentation!<a class="headerlink" href="#welcome-to-gsi-crawler-s-documentation" title="Permalink to this headline"></a></h1>
<h1>Welcome to GSI Crawler&#8217;s documentation!<a class="headerlink" href="#welcome-to-gsi-crawler-s-documentation" title="Permalink to this headline"></a></h1>
<p>Contents:</p>
<div class="toctree-wrapper compound">
<ul>
......@@ -119,8 +122,8 @@
&copy;2017, Antonio F. Llamas and Rodrigo Barbado Esteban.
|
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.6.3</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.10</a>
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.5</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.9</a>
|
<a href="_sources/index.rst.txt"
......
......@@ -2,6 +2,5 @@
# Project: GSI Crawler
# Version:
# The remainder of this file is compressed using zlib.
(zlib-compressed binary payload, old version; not representable as text)
\ No newline at end of file
(zlib-compressed binary payload, new version; not representable as text)
\ No newline at end of file
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Search &#8212; GSI Crawler 1.0 documentation</title>
<link rel="stylesheet" href="_static/alabaster.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
......@@ -38,7 +41,7 @@
</head>
<body>
<body role="document">
<div class="document">
......@@ -112,8 +115,8 @@
&copy;2017, Antonio F. Llamas and Rodrigo Barbado Esteban.
|
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.6.3</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.10</a>
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.5</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.9</a>
</div>
......
Search.setIndex({docnames:["architecture","gsicrawler","index","tutorials"],envversion:53,filenames:["architecture.rst","gsicrawler.rst","index.rst","tutorials.rst"],objects:{},objnames:{},objtypes:{},terms:{"000z":[],"02t21":3,"03t14":3,"04t18":3,"05t16":[],"26z":3,"29z":3,"30z":3,"36z":3,"55km":3,"case":[0,1,3],"class":[0,3],"default":0,"final":[1,3],"function":[0,1],"garc\u00eda":[],"import":[0,3],"jos\u00e9":[],"new":[0,1,2],"return":3,"s\u00e1nchez":[],"true":[],"try":[],Adding:[],For:[0,3],Its:[],Las:3,One:0,That:0,The:[0,1,3],Then:3,There:3,These:0,_id:3,_index:3,_score:3,_scrapi:3,_search:3,_shard:[],_sourc:3,_type:3,abl:[],about:[1,3],abov:[0,3],absa:[],acces:3,access:3,access_token:[],access_token_secret:[],accommod:[],accomplish:[],accord:[0,3],aces:[],achiev:[],acquisit:0,across:[],activ:0,add:3,add_demo:[],add_tweet:[],added:3,adding:3,addit:[],additionali:[],addr:[],address:3,adit:[],administr:0,after:0,afterward:[],again:3,against:[],agil:0,aim:[0,1],alberto:[],alert:[],algo:[],algorithm:[],align:3,all:[0,3],alloc:[],allow:[0,3],along:[],also:[0,3],amado:[],amazon:[],ambush:3,amount:[],analys:[],analysi:[0,1],analysistyp:[],analyt:0,analyz:0,anew:[],ani:0,anoth:0,apart:[0,3],api:[0,3],api_key_meaning_cloud:3,aplic:[],app:2,app_nam:[],appear:3,append:[],appli:[],applic:0,approach:[0,1],araqu:[],architectur:[1,2],argument:0,arm:3,articl:[],articlebodi:3,ask:[],aspect:[],assad:3,asset:3,assign:[],associ:[],asyncron:[],attach:0,attr:[],attract:[],attribut:[],author:3,automat:0,aux:3,avail:[0,1,3],background:3,baghdadi:3,bar:[],base:[0,3],bashar:3,basic:[],becaus:0,been:0,befor:[0,3],begin:[],behav:0,behavior:[],being:[0,3],belong:0,below:[0,3],better:[0,1],between:[],big:[],bodi:3,book:[],both:[0,3],bower:[],bower_compon:[],brand:[],briefli:[],bring:[],browser:3,buffer:3,build:[0,3],call:[0,3],can:[0,3],capabl:0,card:[],carlo:[],carmona:[],carri:[0,1],categor:0,cdn:3,center:3,central:0,certain:0,certainti:0,chang:3,charg:3,chart:3,check:[],choos:[],chown:
[],citi:3,clear:0,client:0,clone:3,cluster:3,cnn:3,cnnnext:3,code:3,collabor:0,collect:[0,3],color:3,com:3,comfort:0,command:[],comment:0,commonli:0,commun:[],complet:[],complex:3,compon:[2,3],compos:3,compound:[],concaten:[],concept:0,concert:3,conclud:0,concret:[],cond:[],config:[],configur:[],conflict:3,connect:[],consequ:[],consider:[],consist:0,consol:3,consumer_kei:[],consumer_secret:[],contain:[0,3],containsemotionsanalysi:[],containssentimentsanalysi:[],content:[0,2,3],convent:[],copi:[],corcuera:[],core:0,correctli:[],could:[0,3],crawl:2,crawler:0,crawler_endpoint:[],crawler_endpoint_extern:[],crawlertask:3,creat:3,credenti:3,cron:0,crontask:[],crowd:[],css:[],current:[],custom:[],dai:3,dam:3,dashboard:[0,2],data:[0,2],databas:[],dataset:[],date:[],datemodifi:3,datepublish:3,datetim:0,dbpedia:[],decod:[],deeper:[],deepth:[],def:3,defin:0,demo:[1,3],demodashboard:3,depend:3,depict:[],describ:0,descript:[],design:[],desir:0,destin:[],detail:[0,1,3],develop:2,diagram:0,dict:3,dictionari:[],did:3,difer:[],differ:[0,3],direct:[],directori:3,discov:[0,3],displai:[0,3],distribut:0,dit:3,div:[],divid:0,do_some_funct:[],doc:0,doc_typ:[],docker:3,dockerfil:[],doctyp:[],document:[0,1,3],doesn:[],dom:[],domain:[],domest:3,don:[],donald:3,done:[],download:[],dsaa:[],due:[],dump:3,each:[0,1,3],easi:[],easili:[],edit:3,editor:[],effort:[],elast:[0,3],elasticdemo:[],elasticsearch:[0,3],element:3,elev:[],email:[],emilio:[],emoji:[],emot:0,enabl:[],encapsul:[],end:[],endpoint:[],engin:0,enhanc:[],enrich:[0,1,2],enriqu:[],enter:3,entiti:3,entri:[],env:3,env_fil:[],enviro:[],environ:[0,3],error:[],es_endpoint:3,es_endpoint_extern:3,es_port:3,essentiali:[],etc:[],europ:[],european:[],even:[],everi:[],everydai:0,examin:1,exampl:[0,3],exec:[],execut:3,exemplifi:[],exist:0,expect:0,explain:[0,1,3],exploit:[],explor:3,expos:1,express:[],extend:[],extens:0,extern:[],extra:[],extract:[0,1,3],facebook:[],facilit:0,fail:[],failur:[],fake:[],fals:[],fast:0,featur:[],fernando:[],fetch:0,
fetchdatatask:[],field:3,figur:0,file:[0,3],filenam:[],filepath:3,filesystem:3,filter:0,financi:[],find:3,finish:3,fire:3,first:2,firstli:[],firstpublishd:3,fit:0,flask:[],flow:[],focu:[],focus:0,folder:3,follow:[0,1,3],footbal:[],footballmood:[],forc:3,form:0,format:3,found:3,four:[],foursquar:[],framework:[0,1],from:[0,1,3],fulful:0,fuseki:[0,3],fuseki_endpoint:3,fuseki_endpoint_dashboard:[],fuseki_endpoint_extern:3,fuseki_password:3,fuseki_port:3,gather:3,gener:[0,3],geoloc:3,get:2,git:3,github:3,gitlab:[],give:0,given:[],glanc:2,global:1,goal:[],goe:3,going:[0,1,3],googl:3,googleplac:[],graphic:[],group:0,grow:0,gsi2017fuseki:[],gsi:0,gsicrawl:3,handl:[],has:[0,3],hashtag:[],haspolar:[],have:[0,3],hawija:3,headlin:3,heart:0,hello:[],help:[],here:[1,3],high:0,highli:[],hit:[],homepag:[],host:[],hostil:3,how:[0,3],href:[],html:[0,3],http:[0,3],ico:3,icon:3,identif:3,iglesia:[],ignacio:[],iii:2,illustr:[],imag:3,imagin:0,implement:[],incid:3,includ:0,incom:0,incurs:3,independ:0,index:[0,3],inferfac:[],infil:[],inform:[0,1,3],ingest:0,initi:3,innov:1,input:0,insid:[0,3],instal:2,instanc:0,instruct:3,intellig:[],interact:[0,1],interest:[],interfac:[0,1],intern:0,introduc:1,involv:0,iraq:3,iraqi:3,iron:[],isi:3,item:[],its:[0,3],itself:1,javascript:[],job:[],jpg:3,json:[0,3],just:[],keyboard:[],kill:3,know:[],known:0,lab:3,las:3,lastli:0,lastmodifiedd:3,later:0,latest:3,layer:[],less:[],let:[],level:0,librari:[0,3],licens:[],like:3,line:3,link:3,list:[],load:[],local:3,localhost:3,localtarget:3,locat:[],log:3,logo:[],look:[],luigi:[0,3],luigi_auto_en:[],luigi_endpoint:[],luigi_endpoint_extern:[],made:3,mai:[],mail:[],main:[0,3],mainli:[],make:[],manag:0,mandatori:0,mani:3,manuel:[],marketplac:[],marl:[],mass:3,materi:[],max_scor:[],mean:[],meaningcloud:3,mechan:[],media:[],method:0,middleeast:3,militari:3,mine:0,miss:[],mit:[],modern:[],modifi:[],modul:[1,2],modular:0,moment:[0,1],monitor:[],more:[0,3],most:[0,1],move:[],msdkfmsdflsdml:[],multipl:0,mum:3,murder:3,must
:3,my_compon:[],my_dashboard_rout:[],mydashboard:[],myfootballtweet:[],myweb:[],myweb_compon:[],name:[0,3],necessari:0,need:3,network:[],newdashboard:[],newsarticl:3,newsitem:3,next:3,niger:3,node:[],node_modul:[],node_path:[],nomber:3,notic:[],notif:[],notifi:[],now:3,num:3,number:[0,3],object:[],observ:[],obtain:[0,3],obtent:[],offer:[0,3],onc:0,one:[0,3],ones:[],onli:3,onlin:[],ontolog:[],onyx:[],open:[0,3],openn:[],oper:[0,3],opinion:[],option:3,orchest:0,orchestr:[],order:[0,1,3],org:[],organ:0,organis:[],origin:0,oscar:[],other:0,our:3,out:[0,1],outfil:3,output:3,overview:2,own:[],packag:[],page:[1,3],paper:[],paradigm:0,paramet:[0,3],pars:0,part:[],parti:[],particip:[],partit:[],pascual:[],path:3,peopl:3,perform:[],period:0,permiss:[],persist:[],pertin:[],petit:0,piec:3,pipelin:[0,3],pipelinetask:3,place:3,plata:[],platform:[0,1],player:[],pleas:[0,3],point:0,polar:[],polarityvalu:[],polit:3,polym:2,polymerel:3,pop:[],popul:[],port:[],possibl:[0,3],power:0,pragmat:[],pre:[],prebuilt:[],present:[],presid:3,pretti:3,previou:0,previous:0,print:3,proceed:[],process:[0,3],produc:[],product:[],profund:[],program:3,progress:[],project:[1,3],properli:[],properti:[],provid:0,put:[],python3:[],python:[],querei:[],queri:0,queu:[],queue:[],quickest:3,rada:[],ran:[],rdf:[],read:[],readi:[],recaptur:3,receiv:0,recogn:0,recognit:[],recommend:[],reddit:[0,1],redirect:[],refer:0,refresh:3,regim:3,regist:[],rel:[],relat:3,relev:[],reload:[],rememb:[],remot:0,repositori:3,repres:0,represent:0,request:[0,3],requir:[0,3],resolut:[],resourc:[],respect:[],respons:[0,3],response_json:[],rest:0,restart:[],restaur:[],result:[0,3],retriev:0,retrievecnnnew:3,retrievenytimesnew:[],reusabl:[],review:[],root:3,rout:[],rtype:[],run:[0,3],russia:3,saavedra:[],sai:3,same:0,save:0,scalabl:[],scenario:[],schedul:[],schema:3,scheme:[],score:[],scrap:[],scrape:[0,3],scraper:[0,3],scrapi:[0,3],scrapytask:[],scratch:[],script:[],seach:[],search:[0,3],search_queri:[],second:3,section:[0,3],see:3,sef
arad:[0,3],sefarad_demo:[],select:[0,1],selector:[],self:3,semant:[0,1,2],semev:[],send:[],senpi:[0,3],senpytask:[],sent:0,sentenc:0,sentiment:[0,3],sentisdata:[],sequenc:0,server:2,servic:[0,3],set:[0,3],sever:0,shoot:3,should:3,show:[0,3],shown:0,side:[],similar:[],simpl:0,simpli:0,size:3,smtp:[],smtp_host:[],smtp_port:[],social:[],solv:0,some:3,sophist:[],sourc:[0,3],sourcecod:[],sourcer:[],spain:[],span:[],sparql:[],spec:[],special:[],specif:[],specifi:[],spider:0,stack:0,standard:[],start:2,static_fil:[],step:[0,3],storag:[0,2],store:[0,3],stori:3,str:3,strign:0,string:[],structur:[0,3],studi:1,style:[],sub:0,subindex:[],submodul:0,success:3,successfulli:[],sudo:3,suggest:[],summari:[],suppli:[],support:[],surround:3,syria:3,syrian:3,system:[0,1],tag:0,take:0,talk:[],tanf:3,target:3,task:[2,3],tass:[],team:[],techniqu:[],technolog:[],tediou:[],templat:[],terror:3,test:[],text:0,thai:[],thank:0,thei:3,them:[],thi:[0,1,3],third:0,those:3,three:[0,1],through:1,thumbnail:3,thumbnailurl:3,time:0,timed_out:[],timestamp:[],titl:3,tmp:3,todai:[],took:[],tool:[0,1,3],top:[],topic:3,total:[],tourpedia:[],track:[],tracker:[],treat:[],trend:[],trigger:0,triplet:[],troop:3,trump:3,tuesdai:3,turn:[],turner:3,tutori:2,tutorial2:3,tutorial3:3,tutorialtask:3,tweet:0,twitter:[0,1,3],twitter_access_token:3,twitter_access_token_secret:3,twitter_consumer_kei:3,twitter_consumer_secret:3,two:0,type:[0,1,3],ubuntu:3,unanalys:[],uncov:0,understand:[0,1],understood:[],unexpect:0,updat:[],upload:[],upm:3,uri:[],url:[0,3],usb:[],use:[0,3],used:0,useful:1,user:[0,1],user_loc:[],uses:0,using:0,utf:[],valid:0,valu:3,valuabl:0,variabl:3,vega:3,via:[],video:1,view:0,visit:[0,3],visual:[0,3],visualis:[],wai:[0,3],want:[1,3],web:[1,2,3],webpag:0,websit:0,wednesdai:3,well:0,were:[],what:[0,2],whatev:[],when:[0,3],where:0,whether:3,which:[0,1,3],whose:0,widget:[],wikipedia:[],wire:[],wish:3,within:0,work:0,workflow:[0,3],would:[0,3],write:3,wsgi:[],www:3,xxxx:[],yet:0,yml:[],you:3,your:2,youracces
stoken:3,youraccesstokensecret:3,yourconsumerkei:3,yourconsumersecret:3,yourfusekiendpoint:3,yourfusekiendpointextern:[],yourfusekipass:3,yourmeaningcloudapikei:3,zone:3},titles:["Architecture","What is GSI Crawler?","Welcome to GSI Crawler\u2019s documentation!","Getting started"],titleterms:{"import":[],"new":3,Adding:[],about:[],analys:[],app:0,architectur:0,aspect:[],avail:[],collect:[],compon:0,compos:[],configur:[],crawl:3,crawler:[1,2,3],cron:[],custom:[],dashboard:3,data:3,dataset:[],dbpedia:[],demo:[],develop:3,docker:[],document:2,elast:[],elasticsearch:[],enrich:3,extra:[],file:[],financi:[],first:3,footballmood:[],get:3,glanc:3,gsi:[1,2,3],gsicrawl:[],guid:[],iii:3,indic:[],instal:3,json:[],knowledg:[],librari:[],load:[],luigi:[],modul:0,overview:0,own:[],pipelin:[],polym:0,previou:[],quick:[],refer:[],run:[],seach:[],semant:3,senpi:[],server:0,servic:[],sourc:[],sparql:[],start:3,storag:3,store:[],tabl:[],task:0,tourpedia:[],tracker:[],tutori:3,tweet:[],twitter:[],updat:[],visualis:[],web:0,welcom:2,what:1,widget:[],your:3}})
\ No newline at end of file
Search.setIndex({docnames:["architecture","gsicrawler","index","tutorials"],envversion:51,filenames:["architecture.rst","gsicrawler.rst","index.rst","tutorials.rst"],objects:{},objnames:{},objtypes:{},terms:{"02t21":3,"03t14":3,"04t18":3,"26z":3,"29z":3,"30z":3,"36z":3,"55km":3,"case":[0,1,3],"class":[0,3],"default":0,"final":[1,3],"function":[0,1],"import":[0,3],"new":[0,1,2],"return":3,For:[0,3],Las:3,One:0,That:0,The:[0,1,3],Then:3,There:3,These:0,_id:3,_index:3,_score:3,_scrapi:3,_search:3,_sourc:3,_type:3,about:[1,3],abov:[0,3],acces:3,access:3,accord:[0,3],acquisit:0,activ:0,add:3,added:3,adding:3,address:3,administr:0,after:0,again:3,agil:0,aim:[0,1],align:3,all:[0,3],allow:[0,3],also:[0,3],ambush:3,analysi:[0,1],analyt:0,analyz:0,ani:0,anoth:0,apart:[0,3],api:[0,3],api_key_meaning_cloud:3,app:2,appear:3,applic:0,approach:[0,1],architectur:[1,2],argument:0,arm:3,articlebodi:3,assad:3,asset:3,attach:0,author:3,automat:0,aux:3,avail:[0,1,3],background:3,baghdadi:3,base:[0,3],bashar:3,becaus:0,been:0,befor:[0,3],behav:0,being:[0,3],belong:0,below:[0,3],better:[0,1],bodi:3,both:[0,3],browser:3,buffer:3,build:[0,3],call:[0,3],can:[0,3],capabl:0,carri:[0,1],categor:0,cdn:3,center:3,central:0,certain:0,certainti:0,chang:3,charg:3,chart:3,citi:3,clear:0,client:0,clone:3,cluster:3,cnn:3,cnnnext:3,code:3,collabor:0,collect:[0,3],color:3,com:3,comfort:0,comment:0,commonli:0,complex:3,compon:[2,3],compos:3,concept:0,concert:3,conclud:0,conflict:3,consist:0,consol:3,contain:[0,3],content:[0,2,3],core:0,could:[0,3],crawl:2,crawler:0,crawlertask:3,creat:3,credenti:3,cron:0,dai:3,dam:3,dashboard:[0,2],data:[0,2],datemodifi:3,datepublish:3,datetim:0,def:3,defin:0,demo:[1,3],demodashboard:3,depend:3,describ:0,desir:0,detail:[0,1,3],develop:2,diagram:0,dict:3,did:3,differ:[0,3],directori:3,discov:[0,3],displai:[0,3],distribut:0,dit:3,divid:0,doc:0,docker:3,document:[0,1,3],domest:3,donald:3,dump:3,each:[0,1,3],edit:3,elast:[0,3],elasticsearch:[0,3],element:3,emot:0,engin:0,enri
ch:[0,1,2],enter:3,entiti:3,env:3,environ:[0,3],es_endpoint:3,es_endpoint_extern:3,es_port:3,everydai:0,examin:1,exampl:[0,3],execut:3,exist:0,expect:0,explain:[0,1,3],explor:3,expos:1,extens:0,extract:[0,1,3],facilit:0,fast:0,fetch:0,field:3,figur:0,file:[0,3],filepath:3,filesystem:3,filter:0,find:3,finish:3,fire:3,first:2,firstpublishd:3,fit:0,focus:0,folder:3,follow:[0,1,3],forc:3,form:0,format:3,found:3,framework:[0,1],from:[0,1,3],fulful:0,fuseki:[0,3],fuseki_endpoint:3,fuseki_endpoint_extern:3,fuseki_password:3,fuseki_port:3,gather:3,gener:[0,3],geoloc:3,get:2,git:3,github:3,give:0,glanc:2,global:1,goe:3,going:[0,1,3],googl:3,group:0,grow:0,gsi:0,gsicrawl:3,has:[0,3],have:[0,3],hawija:3,headlin:3,heart:0,here:[1,3],high:0,hostil:3,how:[0,3],html:[0,3],http:[0,3],ico:3,icon:3,identif:3,iii:2,imag:3,imagin:0,incid:3,includ:0,incom:0,incurs:3,independ:0,index:[0,3],inform:[0,1,3],ingest:0,initi:3,innov:1,input:0,insid:[0,3],instal:2,instanc:0,instruct:3,interact:[0,1],interfac:[0,1],intern:0,introduc:1,involv:0,iraq:3,iraqi:3,isi:3,its:[0,3],itself:1,jpg:3,json:[0,3],kill:3,known:0,lab:3,las:3,lastli:0,lastmodifiedd:3,later:0,latest:3,level:0,librari:[0,3],like:3,line:3,link:3,local:3,localhost:3,localtarget:3,log:3,luigi:[0,3],made:3,main:[0,3],manag:0,mandatori:0,mani:3,mass:3,meaningcloud:3,method:0,middleeast:3,militari:3,mine:0,modul:[1,2],modular:0,moment:[0,1],more:[0,3],most:[0,1],multipl:0,mum:3,murder:3,must:3,name:[0,3],necessari:0,need:3,newsarticl:3,newsitem:3,next:3,niger:3,nomber:3,now:3,num:3,number:[0,3],obtain:[0,3],offer:[0,3],onc:0,one:[0,3],onli:3,open:[0,3],oper:[0,3],option:3,orchest:0,order:[0,1,3],organ:0,origin:0,other:0,our:3,out:[0,1],outfil:3,output:3,overview:2,page:[1,3],paradigm:0,paramet:[0,3],pars:0,path:3,peopl:3,period:0,petit:0,piec:3,pipelin:[0,3],pipelinetask:3,place:3,platform:[0,1],pleas:[0,3],point:0,polit:3,polym:2,polymerel:3,possibl:[0,3],power:0,presid:3,pretti:3,previou:0,previous:0,print:3,process:[0,3],program:3,pr
oject:[1,3],provid:0,queri:0,quickest:3,recaptur:3,receiv:0,recogn:0,reddit:[0,1],refer:0,refresh:3,regim:3,relat:3,remot:0,repositori:3,repres:0,represent:0,request:[0,3],requir:[0,3],respons:[0,3],rest:0,result:[0,3],retriev:0,retrievecnnnew:3,root:3,run:[0,3],russia:3,sai:3,same:0,save:0,schema:3,scrape:[0,3],scraper:[0,3],scrapi:[0,3],search:[0,3],second:3,section:[0,3],see:3,sefarad:[0,3],select:[0,1],self:3,semant:[0,1,2],senpi:[0,3],sent:0,sentenc:0,sentiment:[0,3],sequenc:0,server:2,servic:[0,3],set:[0,3],sever:0,shoot:3,should:3,show:[0,3],shown:0,simpl:0,simpli:0,size:3,solv:0,some:3,sourc:[0,3],spider:0,stack:0,start:2,step:[0,3],storag:[0,2],store:[0,3],stori:3,str:3,strign:0,structur:[0,3],studi:1,sub:0,submodul:0,success:3,sudo:3,surround:3,syria:3,syrian:3,system:[0,1],tag:0,take:0,tanf:3,target:3,task:[2,3],terror:3,text:0,thank:0,thei:3,thi:[0,1,3],third:0,those:3,three:[0,1],through:1,thumbnail:3,thumbnailurl:3,time:0,titl:3,tmp:3,tool:[0,1,3],topic:3,trigger:0,troop:3,trump:3,tuesdai:3,turner:3,tutori:2,tutorial2:3,tutorial3:3,tutorialtask:3,tweet:0,twitter:[0,1,3],twitter_access_token:3,twitter_access_token_secret:3,twitter_consumer_kei:3,twitter_consumer_secret:3,two:0,type:[0,1,3],ubuntu:3,uncov:0,understand:[0,1],unexpect:0,upm:3,url:[0,3],use:[0,3],used:0,useful:1,user:[0,1],uses:0,using:0,valid:0,valu:3,valuabl:0,variabl:3,vega:3,video:1,view:0,visit:[0,3],visual:[0,3],wai:[0,3],want:[1,3],web:[1,2,3],webpag:0,websit:0,wednesdai:3,well:0,what:[0,2],when:[0,3],where:0,whether:3,which:[0,1,3],whose:0,wish:3,within:0,work:0,workflow:[0,3],would:[0,3],write:3,www:3,yet:0,you:3,your:2,youraccesstoken:3,youraccesstokensecret:3,yourconsumerkei:3,yourconsumersecret:3,yourfusekiendpoint:3,yourfusekipass:3,yourmeaningcloudapikei:3,zone:3},titles:["Architecture","What is GSI Crawler?","Welcome to GSI Crawler&#8217;s documentation!","Getting 
started"],titleterms:{"new":3,app:0,architectur:0,compon:0,crawl:3,crawler:[1,2,3],dashboard:3,data:3,develop:3,document:2,enrich:3,first:3,get:3,glanc:3,gsi:[1,2,3],iii:3,instal:3,modul:0,overview:0,polym:0,semant:3,server:0,start:3,storag:3,task:0,tutori:3,web:0,welcom:2,what:1,your:3}})
\ No newline at end of file
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Getting started &#8212; GSI Crawler 1.0 documentation</title>
<link rel="stylesheet" href="_static/alabaster.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
......@@ -32,7 +35,7 @@
<meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" />
</head>
<body>
<body role="document">
<div class="document">
......@@ -247,7 +250,7 @@ We have create the main structure inside demodashboard folder. Open a web browse
</pre></div>
</div>
<p>This icon must be stored inside images folder. Refresh your web browser to see your changes.</p>
<p>This web components has many more options like changing the background color, the title For more information visit <a class="reference external" href="https://lab.cluster.gsi.dit.upm.es/sefarad/number-chart">https://lab.cluster.gsi.dit.upm.es/sefarad/number-chart</a>.</p>
<p>This web components has many more options like changing the background color, the title... For more information visit <a class="reference external" href="https://lab.cluster.gsi.dit.upm.es/sefarad/number-chart">https://lab.cluster.gsi.dit.upm.es/sefarad/number-chart</a>.</p>
<p>You can add as Web Components as you want, there are some examples in <a class="reference external" href="https://github.com/PolymerElements/">https://github.com/PolymerElements/</a></p>
<p>If you wish to discover more about how to create dashboards, please visit <a class="reference external" href="http://sefarad.readthedocs.io/en/latest/">Sefarad documentation</a>.</p>
</div>
......@@ -313,8 +316,8 @@ We have create the main structure inside demodashboard folder. Open a web browse
&copy;2017, Antonio F. Llamas and Rodrigo Barbado Esteban.
|
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.6.3</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.10</a>
Powered by <a href="http://sphinx-doc.org/">Sphinx 1.5</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.9</a>
|
<a href="_sources/tutorials.rst.txt"
......