SentiStrength Frequently Asked Questions

Why do the reported strengths in the calculations (not the final results) not match the dictionary strengths?

In the Windows version, the result scores are always 1 point less than the dictionary scores. This is just a convenience for the way that the algorithm works.

For example:
"vanity" is -2, but in the results is -3
"tender" is 3, but in the results is 2

In the Java version, the results are occasionally out by 1 because some of the steps in the algorithm are too complex to include in the explanation of the coding.

How does the text size affect results?

Text size affects the results because it is completely ignored: the maximum sentiment strength found anywhere in the text is used. This means that most long texts will have high scores for both positive and negative sentiment. You might need to experiment with different options to alter this. For instance, for larger texts the "average" options might work better than the "maximise" options in the top 4 options.
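The effect of text length can be illustrated with a small Python sketch. This is not SentiStrength's own code: the function names, the example term scores, and the use of 1 and -1 as the "no sentiment" defaults are assumptions made only for illustration.

```python
def split_scores(term_scores):
    """Separate positive and negative term scores (hypothetical helper)."""
    positives = [s for s in term_scores if s > 0]
    negatives = [s for s in term_scores if s < 0]
    return positives, negatives


def maximise(term_scores):
    """Strongest positive and strongest negative term found anywhere in the text."""
    positives, negatives = split_scores(term_scores)
    # Assumption: 1 and -1 stand for "no sentiment found".
    return max(positives, default=1), min(negatives, default=-1)


def average(term_scores):
    """Mean positive and mean negative term strength in the text."""
    positives, negatives = split_scores(term_scores)
    pos = sum(positives) / len(positives) if positives else 1
    neg = sum(negatives) / len(negatives) if negatives else -1
    return pos, neg


# Made-up term scores: the long text simply contains more sentiment terms.
short_text = [3, -2]
long_text = [3, -2, 2, -4, 5, 2, -2, 4]

print("maximise:", maximise(short_text), maximise(long_text))
print("average: ", average(short_text), average(long_text))
```

With the maximum, the longer list ends up with stronger scores on both scales simply because it contains more terms, whereas the average stays closer to the typical term strength.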

Does SentiStrength use the linguistic contexts of words?

SentiStrength does not use context at all, except for a few specific exceptions, as noted in the options menu. SentiStrength is quite a simple method designed for short, low-quality texts.

How does the domain of discussion affect the results? (e.g., movies, product reviews)

SentiStrength is really designed for short texts without any particular domain, so it will work less well on any other type of text. It has parameters that can help it train itself for different kinds of data, however. To use these, you will need to (a) generate a corpus of at least 1000 human-annotated (strength 1-5) texts for your domain, (b) extend the sentiment word dictionary EmotionLookupTable.txt with any new relevant domain-specific words, and (c) train SentiStrength to learn the best emotion term weights on your 1000+ domain-specific human-coded texts (the "Optimise the emotion dictionary weights" menu option).

I have just run a 10-fold cross-validation assessment. What do the results mean?

To see the overall results for the program clearly, the easiest way is to copy the table at the bottom of the file "...summary.txt.sum.txt" (produced by the classification process) into a spreadsheet program to align the columns. This table contains evidence of the overall performance of the algorithm in n separate identical tests (however many you selected), so the average of each column would be the likely performance of the program.

The key results and abbreviations are (in decreasing order of importance):

Corr+ the correlation between the human coded positive scores and SentiStrength positive score predictions. If the values in the table are > 0 then the program is working, but the English version gets a score of 0.45-0.55, so if your score is lower than this, the program could possibly be improved by extending the input dictionary.
Corr- the correlation between the human coded negative scores and SentiStrength negative score predictions (see above).
Acc+ the accuracy of the program: the percentage agreement between the human coded positive scores and SentiStrength positive score predictions.
Acc- the accuracy of the program: the percentage agreement between the human coded negative scores and SentiStrength negative score predictions.
AccWithin1+ the accuracy of the program within 1: the percentage near agreement between the human coded positive scores and SentiStrength positive score predictions, with the difference allowed to be up to 1.
AccWithin1- the accuracy of the program within 1: the percentage near agreement between the human coded negative scores and SentiStrength negative score predictions, with the difference allowed to be up to 1.
MeanAbsErr+ Not important.
MeanAbsErr- Not important.
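
As a rough guide to what these figures measure, the Python sketch below computes them for one scale (positive or negative) from two lists of scores. It is an illustration only: the function names and the example scores are made up, and the use of Pearson correlation is an assumption, since the description above only says "correlation".

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists (assumed measure)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def evaluate(human, predicted):
    """Return the four measures for one scale as a dictionary."""
    n = len(human)
    diffs = [abs(h - p) for h, p in zip(human, predicted)]
    return {
        "Corr": pearson(human, predicted),
        "Acc": 100 * sum(d == 0 for d in diffs) / n,         # exact agreement, %
        "AccWithin1": 100 * sum(d <= 1 for d in diffs) / n,  # difference up to 1, %
        "MeanAbsErr": sum(diffs) / n,
    }


# Made-up positive scores (1-5) for eight texts.
human_scores = [1, 2, 3, 4, 5, 2, 1, 3]
predicted_scores = [1, 3, 3, 2, 4, 2, 1, 1]

results = evaluate(human_scores, predicted_scores)
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

The same function can be applied to the negative scores to get the "-" versions of each measure.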

This process does not produce an optimal sentiment dictionary, but if you select "Optimise the emotion dictionary weights" from the "Sentiment Strength Analysis" menu then this should produce an optimal dictionary (see below).

How do I produce optimised sentiment strength term weights?

You will need:

your own or the original SentiStrength files with initial term weights, and
a tab-separated file containing classified texts (positive strength 1-5, negative strength 1-5, text), ideally at least 1000

Select the option "Optimise the emotion dictionary weights" from the "Sentiment Strength Analysis" menu to produce an optimal dictionary.
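
Before running the optimisation, it may help to check that the tab-separated file of classified texts is well formed. The Python sketch below assumes the column order described above (positive strength, negative strength, text), no header row, and both strengths written as 1-5; the function name and the sample lines are hypothetical.

```python
import csv
import io


def load_classified_texts(handle):
    """Read (positive, negative, text) rows and validate the strength ranges."""
    rows = []
    # QUOTE_NONE keeps quotation marks inside texts as ordinary characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, fields in enumerate(reader, start=1):
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 tab-separated fields")
        positive, negative, text = int(fields[0]), int(fields[1]), fields[2]
        if not (1 <= positive <= 5 and 1 <= negative <= 5):
            raise ValueError(f"line {line_no}: strengths must be between 1 and 5")
        rows.append((positive, negative, text))
    return rows


# Made-up sample standing in for a real file of classified texts.
sample = io.StringIO("3\t1\tI love this\n1\t4\tThis is awful\n")
texts = load_classified_texts(sample)
print(len(texts), "texts loaded")
if len(texts) < 1000:
    print("Fewer than the recommended 1000 classified texts")
```

For a real file, the same function can be called on an opened file instead of the in-memory sample, for example with open(path, encoding="utf-8", newline="") where path is your own file.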
