Technical Aspects in Viewer & Aligner

When is a sentence not a sentence?

There is no perfect mechanical way of determining sentence breaks. For example, a heading may well have no final full stop but would not normally be considered part of the sentence which follows it. Likewise, a sentence often has no final full stop if what follows it is a list of items.

The algorithm used by Viewer & Aligner is: a sentence ends if a full stop, question mark or exclamation mark (.?!) is immediately followed by one or more word separators and if the next non-punctuation symbol is a capital letter A..Z, an accented capital letter, a number or a currency symbol. WordList uses the same routine.
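The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual code: the names split_sentences, is_word_char and can_start_sentence are hypothetical, "word separators" are taken to mean whitespace, and closing quotation marks or brackets are allowed to sit between the terminator and the separators.

```python
import unicodedata

TERMINATORS = ".?!"


def is_word_char(ch):
    """Letters, digits and currency symbols count as non-punctuation here."""
    return ch.isalnum() or unicodedata.category(ch) == "Sc"


def can_start_sentence(ch):
    """Capital letter (accented or not), a number, or a currency symbol."""
    return ch.isupper() or ch.isdigit() or unicodedata.category(ch) == "Sc"


def split_sentences(text):
    """Split text using the .?! + separator + capital/number/currency rule."""
    sentences = []
    start = 0
    i = 0
    n = len(text)
    while i < n:
        if text[i] not in TERMINATORS:
            i += 1
            continue
        # Assumption: closing quotes, brackets or further terminators may
        # follow the terminator before the word separators begin.
        end = i + 1
        while end < n and not text[end].isspace() and not is_word_char(text[end]):
            end += 1
        # One or more word separators (plain whitespace in this sketch).
        nxt = end
        while nxt < n and text[nxt].isspace():
            nxt += 1
        # Look ahead to the next non-punctuation symbol.
        probe = nxt
        while probe < n and not is_word_char(text[probe]):
            probe += 1
        if nxt > end and probe < n and can_start_sentence(text[probe]):
            sentences.append(text[start:end].strip())
            start = nxt
            i = nxt
        else:
            i += 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    for sentence in split_sentences("I saw Mr. Smith. It cost 3.5 dollars."):
        print(sentence)
```

Because the rule only looks at punctuation and capitalisation, an abbreviation such as Mr. followed by a name triggers a break, while a decimal point inside a number does not.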

 

Consider this chunk from A Tale of Two Cities:

"Wo-ho!" said the coachman. "So, then! One more pull and you're at the top and be damned to you, for I have had trouble enough to get you to it! - Joe!"

 

Viewer & Aligner will mistakenly treat - Joe! as a separate sentence, but it handles "Wo-ho!" said the coachman. as a single sentence. The program would, however, split it in two if the word after ho! began with a capital letter (e.g. in Wild Bill, the coachman, said.)

 

Viewer & Aligner cannot therefore be expected to handle all sentence boundaries exactly as you would. (I saw Mr. Smith. would be considered two sentences; several headings may be bundled together as one sentence.) For this reason you can choose Find Short Sentences to seek out any odd one-word sentences.
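The idea of hunting for suspiciously short sentences can be illustrated as follows. This is a hedged sketch only: find_short_sentences and count_words are hypothetical helper names, the input is assumed to be a list of sentences that has already been split, and the way the program itself counts words is not documented here.

```python
def count_words(sentence):
    """Count whitespace-separated tokens that contain a letter or digit."""
    return sum(1 for tok in sentence.split() if any(c.isalnum() for c in tok))


def find_short_sentences(sentences, max_words=1):
    """Return (position, sentence) pairs for sentences of max_words or fewer.

    Positions are 1-based so they can be read as sentence numbers.
    """
    return [
        (idx, s)
        for idx, s in enumerate(sentences, start=1)
        if count_words(s) <= max_words
    ]


if __name__ == "__main__":
    # "I saw Mr. Smith." as it would be (wrongly) split into two sentences.
    split = ["I saw Mr.", "Smith.", "He waved."]
    for position, sentence in find_short_sentences(split):
        print(position, sentence)
```

A one-word "sentence" such as Smith. is a strong hint that the preceding break was caused by an abbreviation, so it is a good candidate for manual correction.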

 

See also: Viewer & Aligner contents