Colour Categories

The point of it ...

With a concordance or word list on your screen, it can be hard to know how many of the thousands of entries meet certain criteria. For example, which ones derive from only a few texts? Which ones end in -NESS? How many of the concordance lines came both from mytext.txt and from the first 40 words in the sentence, and which ones are they?

The idea is to let you re-sort your existing data by your own criteria. (Since last millennium WordSmith lists have been sortable by standard criteria, and there has long been a Set column for your own classification, but this feature makes it possible to have multiple and complex sorts.)


How to do it

The menu option Compute | Colour categories is available whenever the data have a Set column.



The menu option brings up a window where you specify your search criteria. Here is an example:




Complete the form by choosing a data column (above, the user chose the File column) and a condition (here X ends with Y, with .txt typed below, which means 'search the File column for any entry where the File ends in .txt'). Then choose a colour (here colour 67 was chosen) and press Add a search. Finally, press Find.


As you can just see, the Set column in the concordance has some items coloured.


A more complex example:


where the user wants to process the Word column of data, looking for a condition where the word starts with UN and occurs at least 5 times. For any word which meets this condition, the Set column will show the colour selected.
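The logic of a two-part search like this one can be sketched in Python. This is only an illustration of the idea, not WordSmith's own code: the Entry class, the colour_matching function and the sample words are hypothetical, and treating the comparison as case-insensitive is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    word: str
    freq: int
    colour: Optional[str] = None  # stands in for the colour shown in the Set column


def colour_matching(entries, prefix, min_freq, colour):
    """Colour every entry whose word starts with prefix and occurs at least min_freq times."""
    hits = 0
    for entry in entries:
        # Assumption: the comparison ignores case.
        if entry.word.upper().startswith(prefix.upper()) and entry.freq >= min_freq:
            entry.colour = colour
            hits += 1
    return hits


# Hypothetical word-list data, invented for the illustration.
entries = [Entry("UNDER", 12), Entry("UNTIL", 3), Entry("HAPPINESS", 9), Entry("unhappy", 5)]
hits = colour_matching(entries, "UN", 5, "green")
```

Only entries meeting both parts of the condition are coloured; a word starting with UN but occurring fewer than 5 times is left alone.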


When you have specified the criteria, press the Find button.



The top of the Colour Categories window shows the percentage results. In the example below, the user has decided to omit their first search and to carry out another on the same word-list which found 188 words ending in NESS which were present in more than 40 BNC texts.


colour_categorising various

But not

This option lets you add a negative condition: items which match the main condition are coloured unless they also match the But not condition.
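As a rough sketch of what a negative condition amounts to, the following Python fragment selects words ending in NESS but not starting with UN. The function name, its parameters and the word list are hypothetical and chosen only for illustration; they are not part of WordSmith.

```python
def matches(word, ends_with, but_not_starts_with):
    """True when the word meets the positive condition and fails the negative one."""
    w = word.upper()
    return w.endswith(ends_with) and not w.startswith(but_not_starts_with)


# Hypothetical sample words.
words = ["HAPPINESS", "UNHAPPINESS", "KINDNESS", "UNDER"]
selected = [w for w in words if matches(w, "NESS", "UN")]
```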




Where are they in the list?

To locate the items which colour categorising has found, simply sort the Set column. (If it's a Freq. list you may have to go to the Alphabetical tab first.) The categorised items float to the top. Here, the 6 words between BE and BF with frequency above 5 are coloured green at the top of the word list, with the 13 NESS items with frequency less than or equal to 5 coloured blue.
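The effect of sorting on the Set column can be pictured with a small Python sketch. The rows are invented, and the order in which the colours themselves are ranked (green before blue) is an assumption made to mirror the example above.

```python
# Each row is (word, frequency, colour); None means no colour was assigned.
rows = [
    ("ABOUT", 120, None),
    ("BEAR", 9, "green"),
    ("DARKNESS", 4, "blue"),
    ("BED", 7, "green"),
    ("THE", 5000, None),
]

# Assumed ranking of colours; uncoloured rows get the highest rank and so sort last.
colour_rank = {"green": 0, "blue": 1}

# sorted() is stable, so rows sharing a colour keep their previous relative order.
sorted_rows = sorted(rows, key=lambda row: colour_rank.get(row[2], len(colour_rank)))
```

The coloured rows rise to the top in colour groups, and the uncoloured rows follow in their earlier order.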



Once sorted, the data can be saved as before.


What if I already have a Set classification?

Here is a concordance where the exclamation O or Ah had already been identified and marked in the Set column.


romeo with o and ah set

As the Set column is already in use, classifying further by colour will take second priority to the existing forms typed in. So in this case:


romeo colour categorising

where 58 cases were found in which the exclamation came in the first 49% of the text, we see that line 10 goes green. (Line 11 did not go green because the criterion was 'less than 50' and it had exactly 50%.)


romeo with o and ah categorised

but clicking the Set column gives priority to the exclamation typed in.


romeo colour categorising sorted

In this case the Os follow the Ahs, and the coloured Ahs follow the uncoloured ones.
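This two-level ordering can be sketched in Python as a sort on a pair of keys: the typed Set label first, then whether the line is coloured. The data are invented, and applying the same uncoloured-before-coloured rule to the Os is an assumption, as the text above only describes the Ahs.

```python
# Each row is (line number, typed Set label, colour); None means uncoloured.
lines = [
    (1, "O", None),
    (2, "Ah", "green"),
    (3, "Ah", None),
    (4, "O", "green"),
]

# Primary key: the typed label, so Ah sorts before O.
# Secondary key: False (uncoloured) sorts before True (coloured).
ordered = sorted(lines, key=lambda row: (row[1], row[2] is not None))
```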


Removing the colours?

Use the Clear colours button.


What if more than one condition is met?

If you colour words ending NESS blue, and also colour words starting UN yellow, any word meeting both conditions will get a mixture of the two colours as shown here:
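One way to picture such a mixture is to average the red, green and blue values of every colour whose condition the word meets. The sketch below is purely illustrative: how WordSmith actually blends colours is not stated here, so the averaging rule, the RGB values and the names RULES and colour_for are all assumptions.

```python
# Each rule pairs a test on the word with an RGB colour.
RULES = [
    (lambda w: w.endswith("NESS"), (0, 0, 255)),    # blue
    (lambda w: w.startswith("UN"), (255, 255, 0)),  # yellow
]


def colour_for(word):
    """Return the average of all matching colours, or None if no rule matches."""
    matched = [rgb for test, rgb in RULES if test(word)]
    if not matched:
        return None
    # Assumed blending rule: average each channel, rounding down.
    return tuple(sum(channel) // len(matched) for channel in zip(*matched))


both = colour_for("UNHAPPINESS")  # meets both conditions
only_ness = colour_for("KINDNESS")  # meets the NESS condition only
```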




See also: setting categories by typing, colour categories for concordances, search.
