Collocation Settings

 

 

To set collocation horizons and other Concord settings, choose Concord Settings in the main WordSmith Controller menu at the top.

 

collocate_settings_in_controller

 

Collocates are computed case-insensitively (so "my" in the concordance line will be treated like "My") unless you check the case-sensitive box.

If you don't want certain collocates such as THE to be included, use a stop-list.

Numbers will be summarised as # or not according to your language settings.

You can lemmatise (join related forms like SPEAK -> SPEAKS, SPOKE, SPOKEN) using a lemma list file.
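The four options above can be pictured as one normalisation step applied to each candidate collocate before it is counted. The following is only an illustrative sketch in Python, not WordSmith's own code: the function name `normalise`, its parameters, the number pattern and the order in which the steps are applied are all assumptions made for the example.

```python
import re

def normalise(token, case_sensitive=False, stop_list=frozenset(),
              lemmas=None, numbers_as_hash=True):
    """Return the form under which a collocate is counted, or None if it is dropped.

    Hypothetical helper for illustration only; not part of WordSmith.
    """
    # Case-insensitive by default: "my" and "My" end up as the same form.
    form = token if case_sensitive else token.upper()
    # Optionally summarise numbers as "#".
    if numbers_as_hash and re.fullmatch(r"\d+([.,]\d+)*", form):
        return "#"
    # Join related forms under their headword, e.g. SPOKEN -> SPEAK.
    if lemmas:
        form = lemmas.get(form, form)
    # Drop anything on the stop-list.
    if form in stop_list:
        return None
    return form

LEMMAS = {"SPEAKS": "SPEAK", "SPOKE": "SPEAK", "SPOKEN": "SPEAK"}
STOP = frozenset({"THE"})

tokens = ["My", "my", "the", "spoken", "1984"]
forms = [normalise(t, stop_list=STOP, lemmas=LEMMAS) for t in tokens]
print(forms)
```

In this sketch the stop-list and lemma list are written in upper case, so they only match when case-insensitive processing is left on.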

 

Minimum Specifications

The minimum length can be as low as 1, and the minimum frequency as low as 1 (default is 10). The minimum frequency specifies how often a collocate must have appeared in the neighbourhood of the Search Word. Words which occur only once or twice are less likely to be informative, so specifying 5 will show only collocates which occur 5 or more times in the neighbouring context.

Similarly, you can specify how long a collocate must be for it to be stored in memory, e.g. 3 for collocates of 3 letters or more. You can also set a minimum number of texts, so as to avoid words which collocate only in one or two texts.
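As a rough illustration of how the three minimum settings interact, the sketch below applies them to a small list of collocate occurrences. It is written in Python under stated assumptions: `filter_collocates`, its parameter names and the sample data are invented for the example, and the filtering is done after counting purely for simplicity.

```python
from collections import Counter, defaultdict

def filter_collocates(hits, min_freq=5, min_length=3, min_texts=1):
    """Keep collocates that satisfy all three minimum settings.

    hits: iterable of (collocate, text_id) pairs, one pair per occurrence
    in the neighbourhood of the Search Word. Hypothetical helper.
    """
    freq = Counter()
    texts = defaultdict(set)
    for word, text_id in hits:
        freq[word] += 1
        texts[word].add(text_id)
    return {
        word: n
        for word, n in freq.items()
        if n >= min_freq and len(word) >= min_length and len(texts[word]) >= min_texts
    }

# Invented sample data: MAPS occurs 5 times in 2 texts, OF 9 times, PIXEL twice.
hits = (
    [("MAPS", "a.txt")] * 3
    + [("MAPS", "b.txt")] * 2
    + [("OF", "a.txt")] * 9
    + [("PIXEL", "a.txt")] * 2
)
kept = filter_collocates(hits, min_freq=5, min_length=3, min_texts=2)
print(kept)
```

Here OF is dropped for being shorter than 3 letters even though it is frequent, and PIXEL is dropped for occurring fewer than 5 times.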

 

Horizons

Here you specify how many words to the left and right of the Search Word are to be included in the collocation search: the size of the "neighbourhood" referred to above. The maximum is 25 left and 25 right. Results will later show you these in separate columns so you can examine exactly how many times a given collocate cropped up, say, 3 words to the left of your Search Word.

The most frequent will be signalled in the most frequent collocate colour (default=red).
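The idea of horizons and per-position columns can be sketched as follows. This is a minimal Python illustration, not WordSmith's implementation; `collocates_by_position` and the sample sentence are assumptions made for the example, and only the 25-word limit is taken from the text above.

```python
from collections import Counter, defaultdict

def collocates_by_position(tokens, search_word, left=5, right=5):
    """Map each collocate to a Counter of offsets from the Search Word.

    An offset of -3 means 3 words to the left, +1 means 1 word to the right.
    Hypothetical helper for illustration only.
    """
    if not (0 <= left <= 25 and 0 <= right <= 25):
        raise ValueError("horizons must be between 0 and 25")
    upper = [t.upper() for t in tokens]  # case-insensitive counting
    target = search_word.upper()
    table = defaultdict(Counter)
    for i, tok in enumerate(upper):
        if tok != target:
            continue
        start = max(0, i - left)
        stop = min(len(upper), i + right + 1)
        for j in range(start, stop):
            if j != i:
                table[upper[j]][j - i] += 1
    return table

# Invented sample text.
text = "we compared these two maps and then we compared the other two maps".split()
table = collocates_by_position(text, "compared", left=3, right=3)

# The position where a collocate occurs most often, as (offset, count).
print(table["WE"].most_common(1))
```

Each inner `Counter` corresponds to one row of position columns; its largest entry is the position where that collocate turns up most often.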

 

Breaks

These are

 

break_alternatives

 

which you will see in the bottom right corner of the Controller Concord Settings screen.

When the collocates are computed, if the setting is to stop at sentence breaks, collocates will be counted within the above horizons but taking sentence breaks into account.

 

For example, if a concordance line contains

 

source, per pointing integration times, respectively. However, when we compared these two maps

 

and the search-word is however,

only

when we compared these two

will be used for collocates because there is a sentence break to the left of the search word. If the setting is "stop at punctuation", then nothing will come into the collocate list for that line (because there is a more major break than punctuation to the left of the search-word, and no word to its right before a punctuation symbol).
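The example above can be reproduced with a short sketch. This Python code is only an approximation written for illustration: the function `neighbourhood`, the two sets of break characters and the horizon of 5 words on each side are assumptions (the horizon is inferred from the five words kept in the example), not settings read from WordSmith.

```python
import re

# Assumed break characters for the two settings; the real lists may differ.
SENTENCE_BREAKS = frozenset({".", "!", "?"})
PUNCTUATION_BREAKS = SENTENCE_BREAKS | {",", ";", ":"}

def neighbourhood(line, search_word, left=5, right=5, stop_at=frozenset()):
    """Return the words within the horizons, stopping at any break in stop_at."""
    tokens = re.findall(r"\w+|[^\w\s]", line)  # words and punctuation marks
    node = next(i for i, t in enumerate(tokens) if t.lower() == search_word.lower())

    def walk(indices, limit):
        words = []
        for i in indices:
            if len(words) == limit or tokens[i] in stop_at:
                break
            if re.match(r"\w", tokens[i]):  # skip punctuation that is not a break
                words.append(tokens[i])
        return words

    left_words = walk(range(node - 1, -1, -1), left)[::-1]
    right_words = walk(range(node + 1, len(tokens)), right)
    return left_words + right_words

line = ("source, per pointing integration times, respectively. "
        "However, when we compared these two maps")

at_sentence = neighbourhood(line, "however", stop_at=SENTENCE_BREAKS)
at_punctuation = neighbourhood(line, "however", stop_at=PUNCTUATION_BREAKS)
print(at_sentence)
print(at_punctuation)
```

With sentence breaks the full stop blocks everything to the left, while the comma after the search-word is ignored; with punctuation breaks that comma blocks the right-hand side as well, so the line contributes no collocates.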

 

stop at end of text: end of text is by default assumed to be the end of the text file.

stop at heading or section: this works by recognising ends of heading or section which you can specify in the text format box (language settings):

 

text_format_in_language_settings

 

 

 
