Commit bf30c04a authored by Ignacio Corcuera

Updated readthedocs

parent 7de5b413
Demo
----
A demo is available at http://senpy.demos.gsi.dit.upm.es/, where you can try a series of different plugins. You can use them in the playground or make requests directly to the service.
.. image:: senpy-playground.png
   :height: 400px
   :width: 800px
   :scale: 100 %
   :align: center
Plugins Demo
============
The following plugins are available in the demo:
* emoTextAnew extracts the VAD (valence-arousal-dominance) of a sentence by matching words from the ANEW dictionary.
* emoTextWordnetAffect uses the WordNet-Affect hierarchy to calculate the emotion of a sentence.
* vaderSentiment uses the vaderSentiment software to calculate the sentiment of a sentence.
* sentiText is a tool developed during the TASS 2015 competition; it has been adapted for English and Spanish.
emoTextANEW plugin
******************
This plugin uses the ANEW lexicon to calculate the VAD (valence-arousal-dominance) of a sentence and then determines which emotion is closest to that value.
Each emotion has a centroid, which has been approximated using the formula described in this article:
http://www.aclweb.org/anthology/W10-0208
The plugin looks for the words in the sentence that appear in the ANEW dictionary and calculates the average VAD score of the sentence. Once this score is calculated, it selects the emotion whose centroid is closest to it.
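
The two steps described above (averaging the VAD scores of the matched words, then picking the nearest centroid) can be sketched as follows. The lexicon entries, the centroid values and the use of Euclidean distance are assumptions made for illustration; the real scores come from ANEW and the real centroids from the article cited above.

.. code:: python

    import math

    # Hypothetical (valence, arousal, dominance) scores, for illustration only.
    ANEW = {
        "happy": (8.2, 6.5, 6.6),
        "party": (7.9, 6.7, 6.0),
        "alone": (2.4, 4.8, 3.7),
    }

    # Hypothetical emotion centroids in the same VAD space.
    CENTROIDS = {
        "joy": (8.0, 6.5, 6.5),
        "sadness": (2.0, 4.5, 3.5),
        "anger": (2.5, 7.0, 5.5),
    }


    def average_vad(sentence, lexicon=ANEW):
        """Average the VAD scores of the words that appear in the lexicon."""
        scores = [lexicon[w] for w in sentence.lower().split() if w in lexicon]
        if not scores:
            return None  # no word of the sentence is in the lexicon
        return tuple(sum(dim) / len(scores) for dim in zip(*scores))


    def closest_emotion(vad, centroids=CENTROIDS):
        """Return the emotion whose centroid is nearest to the VAD point."""
        def distance(emotion):
            return math.sqrt(sum((a - b) ** 2
                                 for a, b in zip(vad, centroids[emotion])))
        return min(centroids, key=distance)

For example, ``closest_emotion(average_vad("a happy party"))`` selects ``joy`` with the illustrative values above.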
emoTextWAF plugin
*****************
This plugin uses WordNet-Affect (http://wndomains.fbk.eu/wnaffect.html) to calculate the percentage of each emotion. The emotions used are anger, fear, disgust, joy and sadness. An emotion mapping has been used to extend each emotion with related WordNet-Affect categories:
* anger : general-dislike
* fear : negative-fear
* disgust : shame
* joy : gratitude, affective, enthusiasm, love, joy, liking
* sadness : ingratitude, daze, humility, compassion, despair, anxiety, sadness
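
As a rough sketch of how such a mapping could be turned into percentages, the snippet below counts the emotion behind each WordNet-Affect category found in a sentence. The function name and the input format (a list of category labels already looked up for the words of the sentence) are assumptions; how the plugin obtains the categories is not covered here.

.. code:: python

    from collections import Counter

    EMOTIONS = ("anger", "fear", "disgust", "joy", "sadness")

    # WordNet-Affect category -> emotion, following the mapping listed above.
    CATEGORY_TO_EMOTION = {
        "general-dislike": "anger",
        "negative-fear": "fear",
        "shame": "disgust",
        "gratitude": "joy",
        "affective": "joy",
        "enthusiasm": "joy",
        "love": "joy",
        "joy": "joy",
        "liking": "joy",
        "ingratitude": "sadness",
        "daze": "sadness",
        "humility": "sadness",
        "compassion": "sadness",
        "despair": "sadness",
        "anxiety": "sadness",
        "sadness": "sadness",
    }


    def emotion_percentages(categories):
        """Return the percentage of each emotion among the mapped categories."""
        counts = Counter(CATEGORY_TO_EMOTION[c] for c in categories
                         if c in CATEGORY_TO_EMOTION)
        total = sum(counts.values())
        if total == 0:
            return {emotion: 0.0 for emotion in EMOTIONS}
        return {emotion: 100.0 * counts[emotion] / total for emotion in EMOTIONS}

Categories that are not part of the mapping are ignored, so the percentages always refer to the five emotions listed above.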
sentiText plugin
****************
This plugin is based on the classifier developed for the TASS 2015 competition. It has been developed for Spanish and English. When the plugin is activated, it goes through the following phases:
* Train both classifiers (English and Spanish).
* Initialize resources (dictionaries, stopwords, etc.).
* Extract bag of words, lemmas and chars.
Once the plugin is activated, the following features are extracted for the classifiers:
* Matches with the bag of words extracted from the train corpus.
* Sentiment score of the sentences extracted from the dictionaries (lexicons and emoticons).
* Negations and intensifiers identified in the sentences.
* Complementary features such as exclamation and question marks, elongated and capitalized words, hashtags, etc.
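
The complementary features in the last item can be approximated with a few counts. This is a minimal sketch rather than the plugin's actual extractor: the function name, the feature names and the rule that an elongated word has a character repeated three or more times are all assumptions.

.. code:: python

    import re

    # A character repeated three or more times in a row, e.g. "goooood".
    ELONGATED = re.compile(r"(\w)\1{2,}")


    def complementary_features(text):
        """Count simple surface features of a tweet-like text."""
        words = text.split()
        return {
            "exclamations": text.count("!"),
            "interrogations": text.count("?"),
            "elongated": sum(1 for w in words if ELONGATED.search(w)),
            "caps": sum(1 for w in words if len(w) > 1 and w.isupper()),
            "hashtags": sum(1 for w in words if w.startswith("#") and len(w) > 1),
        }

These counts would typically be appended to the bag-of-words and lexicon features before training the classifier.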
The plugin has a preprocessor, focused on Twitter corpora, that cleans the text to simplify feature extraction.
More information is available in the following article.
Aspect based Sentiment Analysis of Spanish Tweets, Oscar Araque and Ignacio Corcuera-Platas and Constantino Román-Gómez and Carlos A. Iglesias and J. Fernando Sánchez-Rada. http://gsi.dit.upm.es/es/investigacion/publicaciones?view=publication&task=show&id=37
vaderSentiment plugin
*********************
This plugin was developed using the vaderSentiment module, which is described in the paper: VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. C.J. Hutto and Eric Gilbert. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
If you use this plugin in your research, please cite the above paper.
For more information about the functionality, check the official repository:
https://github.com/cjhutto/vaderSentiment
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "http://example.com#NIFExample",
"analysis": [
],
"entries": [
{
"@id": "http://example.org#char=0,40",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:beginIndex": 0,
"nif:endIndex": 40,
"nif:isString": "My favourite actress is Natalie Portman"
}
]
}
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"analysis": [
{
"@id": "me:SAnalysis1",
"@type": "marl:SentimentAnalysis",
"marl:maxPolarityValue": 1,
"marl:minPolarityValue": 0
},
{
"@id": "me:SgAnalysis1",
"@type": "me:SuggestionAnalysis"
},
{
"@id": "me:EmotionAnalysis1",
"@type": "me:EmotionAnalysis"
},
{
"@id": "me:NER1",
"@type": "me:NER"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
{
"@id": "http://micro.blog/status1#char=5,13",
"nif:beginIndex": 5,
"nif:endIndex": 13,
"nif:anchorOf": "Microsoft",
"me:references": "http://dbpedia.org/page/Microsoft",
"prov:wasGeneratedBy": "me:NER1"
},
{
"@id": "http://micro.blog/status1#char=25,37",
"nif:beginIndex": 25,
"nif:endIndex": 37,
"nif:anchorOf": "Windows Phone",
"me:references": "http://dbpedia.org/page/Windows_Phone",
"prov:wasGeneratedBy": "me:NER1"
}
],
"suggestions": [
{
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program"
}
],
"sentiments": [
{
"@id": "http://micro.blog/status1#char=80,97",
"nif:beginIndex": 80,
"nif:endIndex": 97,
"nif:anchorOf": "You'll be awesome.",
"marl:hasPolarity": "marl:Positive",
"marl:polarityValue": 0.9,
"prov:wasGeneratedBy": "me:SAnalysis1"
}
],
"emotions": [
{
"@id": "http://micro.blog/status1#char=0,109",
"nif:anchorOf": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"prov:wasGeneratedBy": "me:EAnalysis1",
"onyx:hasEmotion": [
{
"onyx:hasEmotionCategory": "wna:liking"
},
{
"onyx:hasEmotionCategory": "wna:excitement"
}
]
}
]
}
]
}
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"analysis": [
{
"@id": "me:EmotionAnalysis1",
"@type": "onyx:EmotionAnalysis"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
],
"suggestions": [
],
"sentiments": [
],
"emotions": [
{
"@id": "http://micro.blog/status1#char=0,109",
"nif:anchorOf": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"prov:wasGeneratedBy": "me:EmotionAnalysis1",
"onyx:hasEmotion": [
{
"onyx:hasEmotionCategory": "wna:liking"
},
{
"onyx:hasEmotionCategory": "wna:excitement"
}
]
}
]
}
]
}
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"analysis": [
{
"@id": "me:NER1",
"@type": "me:NERAnalysis"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
{
"@id": "http://micro.blog/status1#char=5,13",
"nif:beginIndex": 5,
"nif:endIndex": 13,
"nif:anchorOf": "Microsoft",
"me:references": "http://dbpedia.org/page/Microsoft",
"prov:wasGeneratedBy": "me:NER1"
},
{
"@id": "http://micro.blog/status1#char=25,37",
"nif:beginIndex": 25,
"nif:endIndex": 37,
"nif:anchorOf": "Windows Phone",
"me:references": "http://dbpedia.org/page/Windows_Phone",
"prov:wasGeneratedBy": "me:NER1"
}
],
"suggestions": [
],
"sentiments": [
],
"emotionSets": [
]
}
]
}
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"analysis": [
{
"@id": "me:SAnalysis1",
"@type": "marl:SentimentAnalysis",
"marl:maxPolarityValue": 1,
"marl:minPolarityValue": 0
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
],
"suggestions": [
],
"sentiments": [
{
"@id": "http://micro.blog/status1#char=80,97",
"nif:beginIndex": 80,
"nif:endIndex": 97,
"nif:anchorOf": "You'll be awesome.",
"marl:hasPolarity": "marl:Positive",
"marl:polarityValue": 0.9,
"prov:wasGeneratedBy": "me:SAnalysis1"
}
],
"emotionSets": [
]
}
]
}
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"analysis": [
{
"@id": "me:SgAnalysis1",
"@type": "me:SuggestionAnalysis"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"prov:wasGeneratedBy": "me:SAnalysis1",
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
],
"suggestions": [
{
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program"
}
],
"sentiments": [
],
"emotionSets": [
]
}
]
}
@@ -9,8 +9,11 @@ Welcome to Senpy's documentation!
Contents:
.. toctree::
   :maxdepth: 2

   senpy
   installation
   usage
   api
   schema
   plugins
   demo
Developing new plugins
----------------------
Each plugin represents a different analysis process. Senpy needs two types of files to load a plugin:
- Definition file, which has the ".senpy" extension.
- Code file, which is a Python file.
See the examples at: `<http://github.com/gsi-upm/senpy-plugins-community>`_.
Plugins Definitions
===================
The definition file can be written in JSON or YAML, where the data representation consists of attribute-value pairs.
The principal attributes are:
* name: plugin name used in senpy to call the plugin.
* module: indicates the module that will be loaded.
.. code:: json

    {
        "name" : "senpyPlugin",
        "module" : "{python code file}"
    }

.. code:: yaml

    name: senpyPlugin
    module: {python code file}

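
To make the structure concrete, the following sketch parses a JSON definition and checks that the two principal attributes are present. It is not senpy's loader: the helper name is hypothetical, and treating both attributes as mandatory is an assumption based on the list above.

.. code:: python

    import json

    # Attributes described above as the principal ones.
    PRINCIPAL_ATTRIBUTES = ("name", "module")


    def load_definition(text):
        """Parse a JSON plugin definition and check its principal attributes."""
        info = json.loads(text)
        missing = [attr for attr in PRINCIPAL_ATTRIBUTES if attr not in info]
        if missing:
            raise ValueError("missing attributes: " + ", ".join(missing))
        return info

A definition written in YAML could be handled the same way after parsing it with a YAML library.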
Plugins Code
============
The basic methods in a plugin are:
* __init__
* activate: used to load memory-hungry resources.
* deactivate: used to free up resources.
* analyse: called on every user request. It takes the parameters supplied by the user and should return a senpy Results object.
Plugins are loaded asynchronously, so don't worry if the activate method takes a long time. The plugin is marked as activated once the method has finished executing.
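
The life cycle formed by these methods can be illustrated with a stand-in class that does not import senpy. The class name, the is_activated flag and the dictionary returned by analyse are assumptions for the sake of a self-contained example; a real plugin would subclass one of senpy's plugin classes and return a Results object.

.. code:: python

    class EchoPlugin:
        """Stand-in that mimics the method names listed above."""

        def __init__(self, info=None):
            self.info = info or {}
            self.resources = None
            self.is_activated = False

        def activate(self):
            # Load memory-hungry resources here (lexicons, models, ...).
            self.resources = {"greeting": "hello"}
            self.is_activated = True

        def deactivate(self):
            # Free the resources loaded in activate.
            self.resources = None
            self.is_activated = False

        def analyse(self, **params):
            # Called once per request with the parameters sent by the user.
            if not self.is_activated:
                raise RuntimeError("plugin is not activated")
            return {"entries": [{"text": params["input"]}]}

Keeping the expensive work in activate, rather than in __init__, means that creating the plugin object stays cheap.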
F.A.Q.
======
If I'm using a classifier, where should I train it?
???????????????????????????????????????????????????
Training a classifier can be time-consuming. To avoid running the training unnecessarily, you can use ShelfMixin to store the classifier. For instance:
.. code:: python

    from senpy.plugins import ShelfMixin, SenpyPlugin


    class MyPlugin(ShelfMixin, SenpyPlugin):
        def train(self):
            ''' Code to train the classifier
            '''
            # Here goes the code
            # ...
            return classifier

        def activate(self):
            if 'classifier' not in self.sh:
                classifier = self.train()
                self.sh['classifier'] = classifier
            self.classifier = self.sh['classifier']

        def deactivate(self):
            self.close()

You can specify a 'shelf_file' in your .senpy file. By default the ShelfMixin creates a file based on the plugin name and stores it in that plugin's folder.
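
The underlying idea (train once, store the result on disk, reuse it on later activations) can also be shown with the standard library alone. This sketch uses pickle as a stand-in and says nothing about how ShelfMixin is implemented; the helper name and its arguments are hypothetical.

.. code:: python

    import os
    import pickle


    def load_or_train(path, train):
        """Return the object stored at path, training and storing it if absent."""
        if os.path.exists(path):
            with open(path, "rb") as handle:
                return pickle.load(handle)
        classifier = train()
        with open(path, "wb") as handle:
            pickle.dump(classifier, handle)
        return classifier

The second call with the same path skips the training function entirely, which is the behaviour the ShelfMixin example above relies on.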
I want to implement my service as a plugin. How can I do it?
????????????????????????????????????????????????????????????
This example illustrates how to implement the Sentiment140 service as a plugin in senpy:
.. code:: python

    import json

    import requests

    from senpy.models import Entry, Results, Sentiment
    from senpy.plugins import SentimentPlugin


    class Sentiment140Plugin(SentimentPlugin):
        def analyse(self, **params):
            lang = params.get("language", "auto")
            # Send the text to the external service as a JSON payload.
            res = requests.post("http://www.sentiment140.com/api/bulkClassifyJson",
                                json.dumps({"language": lang,
                                            "data": [{"text": params["input"]}]
                                            }
                                           )
                                )

            p = params.get("prefix", None)
            response = Results(prefix=p)
            # Scale the polarity returned by the service to the plugin's range.
            polarity_value = self.maxPolarityValue*int(res.json()["data"][0]
                                                       ["polarity"]) * 0.25
            polarity = "marl:Neutral"
            neutral_value = self.maxPolarityValue / 2.0
            if polarity_value > neutral_value:
                polarity = "marl:Positive"
            elif polarity_value < neutral_value:
                polarity = "marl:Negative"

            entry = Entry(id="Entry0",
                          nif__isString=params["input"])
            sentiment = Sentiment(id="Sentiment0",
                                  prefix=p,
                                  marl__hasPolarity=polarity,
                                  marl__polarityValue=polarity_value)
            sentiment.prov__wasGeneratedBy = self.id
            entry.sentiments = []
            entry.sentiments.append(sentiment)
            entry.language = lang
            response.entries.append(entry)
            return response

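
The polarity arithmetic in the plugin can be checked in isolation. The helper below repeats the same scaling and thresholding as a pure function; its name is hypothetical, and it assumes that the service reports polarity as 0 (negative), 2 (neutral) or 4 (positive) and that the maximum polarity value is 1.

.. code:: python

    def to_marl(raw_polarity, max_polarity=1.0):
        """Scale a 0-4 polarity to [0, max_polarity] and pick a Marl label."""
        value = max_polarity * int(raw_polarity) * 0.25
        neutral = max_polarity / 2.0
        if value > neutral:
            label = "marl:Positive"
        elif value < neutral:
            label = "marl:Negative"
        else:
            label = "marl:Neutral"
        return value, label

Under those assumptions a raw polarity of 4 becomes 1.0 (positive), 2 becomes 0.5 (neutral) and 0 becomes 0.0 (negative).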
Where can I define extra parameters to be introduced in the request to my plugin?
?????????????????????????????????????????????????????????????????????????????????
You can add these parameters in the definition file under the "extra_params" attribute, keyed by parameter name. Each parameter is itself described by attribute-value pairs. The basic attributes are:
* aliases: the different names that can be used in the request to refer to the parameter.
* required: a boolean that indicates whether the parameter is mandatory for the plugin.
* options: the values that the parameter accepts.
* default: the default value of the parameter. This is useful when the parameter is required and you want to provide a fallback value.
.. code:: json

    "extra_params": {
        "language": {
            "aliases": ["language", "l"],
            "required": true,
            "options": ["es","en"],
            "default": "es"
        }
    }

This example shows how to define a parameter for the language.
The parameter is then read in the analyse method of the plugin:

.. code:: python

    lang = params.get("language")

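
To show how the four attributes interact, the sketch below resolves the parameters of a request against an extra_params definition. It is only an illustration of the semantics described above, not senpy's own code; the function name and the error handling are assumptions.

.. code:: python

    EXTRA_PARAMS = {
        "language": {
            "aliases": ["language", "l"],
            "required": True,
            "options": ["es", "en"],
            "default": "es",
        }
    }


    def resolve_params(request_args, extra_params=EXTRA_PARAMS):
        """Map request arguments to parameter names using aliases and defaults."""
        resolved = {}
        for name, spec in extra_params.items():
            value = None
            # Accept the first alias that is present in the request.
            for alias in spec.get("aliases", [name]):
                if alias in request_args:
                    value = request_args[alias]
                    break
            if value is None:
                value = spec.get("default")
            if value is None:
                if spec.get("required", False):
                    raise ValueError("missing required parameter: " + name)
                continue
            if "options" in spec and value not in spec["options"]:
                raise ValueError("invalid value for " + name + ": " + str(value))
            resolved[name] = value
        return resolved

With this definition, a request that sends l=en resolves to the language "en", and a request without the parameter falls back to "es".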
Where can I set up variables for using them in my plugin?
?????????????????????????????????????????????????????????
You can add these variables to the definition file as attribute-value pairs.
Once you have added your variables, the next step is to read them in the plugin. The plugin's __init__ method has a parameter called `info`, which holds the values of the variables. This info parameter has the structure of a Python dictionary.
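
A minimal sketch of that step, using a plain class instead of a senpy base class so that it runs on its own. The variable "threshold" and its fallback value are hypothetical; any attribute added to the definition file would be read the same way.

.. code:: python

    class ThresholdPlugin:
        def __init__(self, info):
            # info is the dictionary built from the definition file.
            self.name = info["name"]
            # "threshold" is a hypothetical custom variable with a fallback.
            self.threshold = info.get("threshold", 0.5)


    info = {"name": "senpyPlugin", "module": "my_plugin", "threshold": 0.8}
    plugin = ThresholdPlugin(info)

Using info.get with a fallback keeps the plugin usable when the variable is left out of the definition file.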
Can I activate a DEBUG mode for my plugin?
???????????????????????????????????????????
You can activate DEBUG mode from the command-line tool with the `-d` option.

.. code:: bash

    python -m senpy -d

Where can I find more code examples?
????????????????????????????????????
See: `<http://github.com/gsi-upm/senpy-plugins-community>`_.
Schema Examples
===============
All the examples in this page use the schema defined in :ref:`schema`.
Simple NIF annotation
---------------------
Description
...........
This example covers the basic example in the NIF documentation: `<http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html>`_.
Representation
..............
.. literalinclude:: examples/example-basic.json
   :language: json-ld
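
A document with the same shape can also be generated programmatically. In this sketch the function name and the default identifiers are assumptions, and the offsets are derived from the length of the string, so they may differ from values written by hand.

.. code:: python

    import json

    CONTEXT = "http://mixedemotions-project.eu/ns/context.jsonld"


    def nif_document(text, doc_id="http://example.com#NIFExample",
                     base="http://example.org"):
        """Build a document with a single NIF context entry for text."""
        end = len(text)
        return {
            "@context": CONTEXT,
            "@id": doc_id,
            "analysis": [],
            "entries": [{
                "@id": "{}#char=0,{}".format(base, end),
                "@type": ["nif:RFC5147String", "nif:Context"],
                "nif:beginIndex": 0,
                "nif:endIndex": end,
                "nif:isString": text,
            }],
        }


    print(json.dumps(nif_document("My favourite actress is Natalie Portman"),
                     indent=2))
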
Sentiment Analysis
---------------------
Description
...........
Representation
..............
.. literalinclude:: examples/example-sentiment.json
   :emphasize-lines: 5-10,25-33
   :language: json-ld
Suggestion Mining
-----------------
Description
...........
Representation
..............
.. literalinclude:: examples/example-suggestion.json
   :emphasize-lines: 5-8,22-27
   :language: json-ld
Emotion Analysis
----------------
Description
...........
Representation
..............
.. literalinclude:: examples/example-emotion.json
   :language: json-ld
   :emphasize-lines: 5-8,25-37
Named Entity Recognition
------------------------
Description
...........
Representation
..............
.. literalinclude:: examples/example-ner.json
   :emphasize-lines: 5-8,19-34
   :language: json-ld
Complete example
----------------
Description
...........
This example covers all of the above cases, integrating all the annotations in the same document.
Representation
..............
.. literalinclude:: examples/example-complete.json
   :language: json-ld
What is Senpy?
--------------
Senpy is an open source reference implementation of a linked data model for sentiment and emotion analysis services based on the vocabularies NIF, Marl and Onyx.
The overall goal of Senpy is to ease the adoption of the proposed linked data model for sentiment and emotion analysis services, so that services from different providers become interoperable. With this aim, the design of the reference implementation has focused on extensibility and reusability.
A modular approach allows organizations to replace individual components with custom ones developed in-house. Furthermore, organizations can benefit from reusing prepackaged modules that provide advanced functionalities, such as algorithms for sentiment and emotion analysis, linked data publication or emotion and sentiment mapping between different providers.
Specifications
==============
The model used in Senpy is based on the following specifications:
* Marl, a vocabulary designed to annotate and describe subjective opinions expressed on the web or in information systems.
* Onyx, which is built on the same principles as Marl to annotate and describe emotions, and provides interoperability with Emotion Markup Language.
* NIF 2.0, which defines a semantic format and API for improving interoperability among natural language processing services.
Architecture
============
The main component of a sentiment analysis service is the algorithm itself. However, for the algorithm to work, it needs to get the appropriate parameters from the user, format the results according to the defined API, interact with the user when errors occur or more information is needed, etc.
Senpy proposes a modular and dynamic architecture that allows:
* Implementing different algorithms in an extensible way, yet offering a common interface.
* Offering common services that facilitate development, so developers can focus on implementing new and better algorithms.
The framework consists of two main modules: Senpy core, which is the building block of the service, and Senpy plugins, which contain the analysis algorithms. The following figure depicts a simplified version of the processes involved in an analysis with the Senpy framework.
.. image:: senpy-architecture.png
   :height: 400px
   :width: 800px
   :scale: 100 %
   :align: center
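
The split between core and plugins can be pictured with a toy dispatcher: the core validates the request and handles errors, while each plugin only implements the algorithm behind a common interface. All class and method names here are assumptions made for illustration and do not reflect senpy's internal API.

.. code:: python

    class Core:
        """Toy core: checks the request and delegates to the chosen plugin."""

        def __init__(self):
            self.plugins = {}

        def register(self, name, plugin):
            self.plugins[name] = plugin

        def analyse(self, algorithm, **params):
            # Common services (validation, error reporting) live in the core.
            if algorithm not in self.plugins:
                return {"error": "unknown plugin: " + algorithm}
            if "input" not in params:
                return {"error": "missing parameter: input"}
            return self.plugins[algorithm].analyse(**params)


    class UppercasePlugin:
        """Toy plugin: the algorithm is the only thing it implements."""

        def analyse(self, **params):
            return {"entries": [{"text": params["input"].upper()}]}


    core = Core()
    core.register("upper", UppercasePlugin())

Adding a new algorithm then amounts to registering another object that offers the same analyse method.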
@@ -15,6 +15,61 @@ Or, alternatively:
This will create a server with any modules found in the current path.
Useful command-line options
===========================
To load modules located in different folders under the root folder, use the following option.

.. code:: bash

    python -m senpy -f .

The default port used by senpy is 5000, but you can change it using the option `--port`.

.. code:: bash

    python -m senpy --port 8080

You can also change the host on which senpy is deployed. The default value is `127.0.0.1`.

.. code:: bash

    python -m senpy --host 0.0.0.0

For more options, see the `--help` page.
Alternatively, you can use the modules included in senpy to build your own application.