Definitions
A word is defined as a sequence of valid characters with a word separator at each end. Valid characters include all the letters from A to Z, all accented characters available in the current character set, and any characters which the user has defined as acceptable within a word (such as the apostrophe or hyphen).
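As an illustration of this definition, here is a minimal Python sketch. It is not the program's own code: the function name `tokenize` and the setting `EXTRA_WORD_CHARS` are hypothetical, and "valid characters" is approximated as any Unicode letter plus the user-defined extras.

```python
import re

# Hypothetical user setting: extra characters accepted inside a word.
EXTRA_WORD_CHARS = "'-"

def tokenize(text, extra_chars=EXTRA_WORD_CHARS):
    """Return the runs of valid characters found in text.

    Anything that is not a letter or an accepted extra character
    is treated as a word separator (an assumption of this sketch).
    """
    # [^\W\d_] matches any letter, accented or not:
    # word characters minus digits and the underscore.
    pattern = re.compile(r"(?:[^\W\d_]|[" + re.escape(extra_chars) + r"])+")
    return pattern.findall(text)

print(tokenize("The café's well-known menu."))
```

Note that this sketch also keeps an apostrophe or hyphen at the very start or end of a run, and it drops anything containing digits; a real tokeniser would need explicit rules for both cases.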
A word can be of any length, but you may set a maximum length for words stored in a word list (up to 50 characters) -- any word which exceeds your limit is cut off at that point and has + tagged onto it. You can decide in text characteristics whether or not to include words containing numbers (e.g. $35.50).
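The length limit can be sketched as follows. This is an illustration only: the name `stored_form` is hypothetical, and the sketch assumes the + is added after the last character kept, so the stored form is one character longer than the limit.

```python
MAX_STORED_LENGTH = 50  # upper bound stated above

def stored_form(word, limit=MAX_STORED_LENGTH):
    """Return the word as it would be stored under a length limit."""
    if not 1 <= limit <= MAX_STORED_LENGTH:
        raise ValueError("limit must be between 1 and 50")
    if len(word) <= limit:
        return word
    # Assumption: keep the first `limit` characters and mark the cut with +.
    return word[:limit] + "+"

print(stored_form("internationalisation", limit=8))
```

With a limit of 8 this gives `internat+`, while shorter words are stored unchanged.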
A cluster is a group of words which follow each other in a text. The term phrase is not used here because it has technical senses in linguistics which would imply a grammatical relation between the words in it. In WordList cluster processing or Concord cluster processing there can be no certainty of this, though clusters often do match phrases or idioms. See also: general cluster information.
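Because a cluster is simply a run of consecutive words, counting clusters needs no grammatical analysis. The following Python sketch shows the idea; the function name `clusters` and the `size` parameter are hypothetical, and the sketch ignores any settings the program itself may apply (such as minimum frequency or stopping at sentence breaks).

```python
from collections import Counter

def clusters(words, size=3):
    """Count every run of `size` consecutive words in a list of words."""
    if size < 2:
        raise ValueError("a cluster needs at least two words")
    # Slide a window of `size` words along the text, one word at a time.
    runs = zip(*(words[i:] for i in range(size)))
    return Counter(" ".join(run) for run in runs)

words = "at the end of the day at the end".split()
for cluster, freq in clusters(words).most_common(3):
    print(freq, cluster)
```

Here "at the end" is counted twice and every other three-word run once, whether or not the run forms a phrase in the linguistic sense.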
The end of a sentence is defined as a full stop, question mark or exclamation mark (.?!) immediately followed by one or more word separators and then a capital letter in the current language, a number or a currency symbol. (For more discussion see Starts and Ends of Text Segments or Viewer & Aligner technical information.)
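The sentence rule can be sketched in Python as below. This is an approximation, not the program's own code: the name `split_sentences` is hypothetical, word separators are assumed to be whitespace, and "capital letter in the current language" is approximated by Unicode upper case.

```python
import re
import unicodedata

# A possible sentence end: one of . ? ! followed by whitespace.
_CANDIDATE = re.compile(r"[.?!]\s+")

def _can_start_sentence(ch):
    """Capital letter, digit or currency symbol (Unicode category Sc)."""
    return ch.isupper() or ch.isdigit() or unicodedata.category(ch) == "Sc"

def split_sentences(text):
    """Split text wherever the sentence rule is satisfied."""
    sentences = []
    start = 0
    for match in _CANDIDATE.finditer(text):
        following = text[match.end():match.end() + 1]
        if following and _can_start_sentence(following):
            # Keep the punctuation mark with the sentence it closes.
            sentences.append(text[start:match.start() + 1])
            start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

for s in split_sentences("It cost e.g. ten pounds. $35.50 was too much! Was it? yes."):
    print(s)
```

The example shows two consequences of the rule: "e.g. ten" does not end a sentence because a lower-case letter follows, and for the same reason "Was it? yes." stays together as one sentence.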
Paragraphs are user-defined. See Starts and Ends of Text Segments for further details.
Headings are also user-defined -- see Starts and Ends of Text Segments.