Key Words display


The display shows

1. each key word

2. its frequency in the source text(s) in which it is key, shown in italics.

3. the name of the source text file (or the word list file name if there is more than one source text) and %, also in italics.

4. its frequency in the reference corpus

5. the name of the reference corpus file (or the corpus word list file name if that was based on more than one text) and %

6. keyness (chi-square or log likelihood statistic)

7. p value.

How unusual a frequency is depends on the statistical procedure used. The statistic appears at the right of the display. If the procedure is log likelihood, or if it is chi-square and the usual conditions for chi-square hold (expected value >= 5 in all four cells), the probability (p) is displayed to the right of the keyness value.
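As a rough illustration of what such a statistic involves, here is a minimal sketch of the textbook log likelihood and chi-square calculations on a 2 x 2 table (this word versus all other tokens, source versus reference). The function name `keyness`, its parameters and the example counts are hypothetical, and the sketch does not claim to reproduce the program's own computation (for instance, no continuity correction is applied). It assumes both corpora are non-empty and that the word occurs at least once overall.

```python
import math


def keyness(freq_src, size_src, freq_ref, size_ref, procedure="log_likelihood"):
    """Return (statistic, p) for one word; p is None when it should not be shown."""
    # Observed 2x2 table: rows are source / reference, columns are
    # "this word" / "all other tokens".
    observed = [
        [freq_src, size_src - freq_src],
        [freq_ref, size_ref - freq_ref],
    ]
    total = size_src + size_ref
    row_totals = [size_src, size_ref]
    col_totals = [freq_src + freq_ref, total - freq_src - freq_ref]
    # Expected counts if the word were equally frequent in both corpora.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    cells = [(o, e) for row_o, row_e in zip(observed, expected)
             for o, e in zip(row_o, row_e)]

    if procedure == "log_likelihood":
        # Cells with an observed count of 0 contribute nothing to the sum.
        stat = 2 * sum(o * math.log(o / e) for o, e in cells if o > 0)
        show_p = True
    else:
        stat = sum((o - e) ** 2 / e for o, e in cells)
        # Usual chi-square condition: expected value >= 5 in all four cells.
        show_p = all(e >= 5 for _, e in cells)

    stat = max(stat, 0.0)  # guard against tiny negative rounding errors
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2)) if show_p else None
    return stat, p


# Hypothetical counts: 500 hits in a 10,000-word source text,
# 1,000 hits in a 100,000-word reference corpus.
stat, p = keyness(500, 10_000, 1_000, 100_000)
print(f"keyness={stat:.2f}, p={p:.3g}")
```

In this sketch a chi-square result whose expected values fall below 5 still returns the statistic but gives `None` for p, mirroring the condition described above.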

The criterion for what counts as "outstanding" is the probability (p) threshold selected before the key words were calculated. The smaller the threshold, the fewer key words appear in the display. You will usually not want to handle more than about 40 key words.
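The effect of the probability setting can be sketched as a simple filter. The function `select_key_words`, the word list and the p values below are all invented for illustration, and treating the setting as "keep words whose p is at or below the chosen value" is an assumption about how the comparison works.

```python
def select_key_words(candidates, max_p):
    """Keep only the words whose p value is at or below the chosen setting."""
    return [word for word, p in candidates if p is not None and p <= max_p]


# Hypothetical (word, p) pairs, not taken from any real text.
candidates = [("whale", 1e-12), ("ship", 1e-7), ("sea", 0.0004), ("the", 0.03)]

for setting in (0.05, 0.001, 0.000001):
    print(setting, select_key_words(candidates, setting))
```

Running the loop shows the list shrinking as the setting gets smaller, which is the behaviour to exploit if the display contains more key words than you can handle.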

The words appear sorted according to how outstanding their frequencies of occurrence are. Those near the top are outstandingly frequent. At the end of the listing you'll find any which are outstandingly infrequent (negative key words), shown in a different colour.
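One way to picture this ordering is a descending sort on a signed keyness score. The sketch below assumes that negative key words carry a negative score, which is an assumption made for illustration; the function `display_order` and the sample rows are hypothetical.

```python
def display_order(rows):
    """Sort (word, signed keyness) rows from most positive to most negative."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical rows: a negative score marks an outstandingly infrequent word.
rows = [("of", -45.2), ("whale", 310.7), ("sea", 88.1), ("she", -120.4), ("ship", 150.3)]

for word, score in display_order(rows):
    label = "negative" if score < 0 else "positive"
    print(f"{word:<6} {score:>8.1f}  {label}")
```

With this ordering the outstandingly frequent words come first and the negative key words collect at the end of the listing.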

view button

This lets you see the original source text in Viewer, with the key words highlighted.

See COLUMNS layout to change the individual colours or font of each column of data, e.g. if you don't like the italics.