You have six choices here:
1. cut out a header
and/or
2. make one change only
3. insert numbering
4. replace some problem characters
5. use a script to determine a whole set of changes. There is a sample to see.
6. link any multi-word units in the text using an underscore character, based on a multi-word unit text file.
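To picture what choice 6 does, here is a minimal Python sketch. It is not the Text Converter's own code: the function name link_mwus is hypothetical, and it assumes the multi-word unit text file simply lists one unit per line, which has already been read into a list.

```python
import re

def link_mwus(text, units):
    """Join the words of each listed multi-word unit with underscores."""
    # Process the longest units first so "New York City" wins over "New York".
    for unit in sorted(units, key=len, reverse=True):
        words = unit.split()
        if len(words) < 2:
            continue  # a single word is not a multi-word unit
        # Allow any run of whitespace between the words of the unit.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
        text = re.sub(pattern, lambda m: "_".join(words), text)
    return text

units = ["New York", "New York City", "Ulan Bator"]
linked = link_mwus("She flew to Ulan Bator from New York City.", units)
```

Here linked holds "She flew to Ulan_Bator from New_York_City."; the shorter unit "New York" no longer matches once the longer one has been linked.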
If you make one change only, you type something into the left box, and it gets replaced by whatever is in the right box. In the case above, Dorothy will get changed to <Tab>+Dorothy, that is, the word Dorothy will get a tab inserted to its left. The tab was inserted simply by dragging it to the box above it; when that happened, {CHR(9)} appeared automatically, this being the syntax for a <Tab>. If you know the decimal number for a character, you can specify it as {CHR(n)} or simply #n, where n represents your number.
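The character-code notation can be illustrated with a short Python sketch. This is only an approximation under stated assumptions: the function name expand_codes is hypothetical, and the exact rules the program applies to #n (for example, how it decides where the number ends) are not given here, so the sketch simply takes every digit that follows the # sign.

```python
import re

def expand_codes(spec):
    """Turn {CHR(n)} and #n tokens into the characters they stand for."""
    # Group 1 captures n in {CHR(n)}; group 2 captures n in #n.
    token = re.compile(r"\{CHR\((\d+)\)\}|#(\d+)")
    return token.sub(lambda m: chr(int(m.group(1) or m.group(2))), spec)

# The Dorothy example: the right-hand box holds {CHR(9)}Dorothy.
replacement = expand_codes("{CHR(9)}Dorothy")
changed = "Toto followed Dorothy.".replace("Dorothy", replacement)
```

After this, changed contains a tab character (decimal 9) immediately to the left of Dorothy.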
With one change only you can see some further options: to force the search-and-replace to be case sensitive, to treat the search as a whole-word search, to cut out any cases where there is more than one space or more than one <Enter> (line-break or paragraph end), to treat the search as literal (see syntax) and to force the results into Unicode.
It might be best to check the "confirm each" box too if there's any danger of confusing two different Dorothies with each other.
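The case-sensitive, whole-word and multiple-space options can be sketched in Python as follows. The function and parameter names are hypothetical, the search is always treated as literal, and the "confirm each" and Unicode options are left out; it is an illustration of the idea, not a description of how the program works internally.

```python
import re

def replace_once(text, search, replacement,
                 case_sensitive=True, whole_word=False, squeeze=False):
    """Literal search-and-replace with a few optional switches."""
    pattern = re.escape(search)  # treat the search string literally
    if whole_word:
        pattern = r"\b" + pattern + r"\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    result = re.sub(pattern, lambda m: replacement, text, flags=flags)
    if squeeze:
        # Cut runs of spaces or of line-breaks down to a single one.
        result = re.sub(r" {2,}", " ", result)
        result = re.sub(r"\n{2,}", "\n", result)
    return result

sample = "Dorothy met dorothy's aunt and the Dorothys."
strict = replace_once(sample, "Dorothy", "\tDorothy", whole_word=True)
```

With these settings only the first Dorothy is changed: dorothy's differs in case, and Dorothys is not a whole-word match.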
The insert numbering option allows you to insert paragraph numbering into your corpus texts. When you click the specify numbering button you'll get options like these:
With these choices, for each of your texts, a string like <para "1">, <para "2"> etc. will get inserted at the start of each paragraph that has at least 50 characters. The "only if containing" box allows you to specify that numbers only get inserted into paragraphs containing a particular (case-sensitive) string of your choice, such as Ulan Bator. Paragraphs here are identified simply as sequences ending in one <Enter>.
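The numbering rule can be sketched in Python like this. The function and parameter names are hypothetical, and one point is an assumption rather than something stated above: the counter here advances only for paragraphs that actually receive a number, so skipped paragraphs do not use one up.

```python
def number_paragraphs(text, min_chars=50, only_if_containing=None):
    """Prefix qualifying paragraphs with <para "1">, <para "2"> etc."""
    numbered = []
    count = 0
    # A paragraph is simply a sequence ending in one line-break.
    for para in text.split("\n"):
        long_enough = len(para) >= min_chars
        # The containment test is case-sensitive, as in the dialogue box.
        has_string = only_if_containing is None or only_if_containing in para
        if long_enough and has_string:
            count += 1
            para = f'<para "{count}">{para}'
        numbered.append(para)
    return "\n".join(numbered)

text = "Short title\n" + "The train left Ulan Bator at dawn. " * 2 + "\n" + "x" * 60
result = number_paragraphs(text)
filtered = number_paragraphs(text, only_if_containing="Ulan Bator")
```

In result the short title is left alone and the two longer paragraphs are numbered 1 and 2; in filtered only the paragraph mentioning Ulan Bator gets a number.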
The replace some problem characters option allows you to specify a number of characters which you wish to replace with a character of your choice, or to remove completely if you leave the "with this character" space blank.
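As a rough illustration of the same idea, a minimal Python sketch follows. The function and parameter names are hypothetical; the sketch only mirrors the behaviour described above, with an empty replacement meaning that the characters are removed.

```python
def replace_characters(text, problem_chars, with_char=""):
    """Replace each listed character, or delete it if with_char is blank."""
    # In a translate table, mapping a character to None deletes it.
    table = {ord(c): (with_char or None) for c in problem_chars}
    return text.translate(table)

# Replace curly quotation marks with a plain one.
plain = replace_characters("\u201csmart\u201d quotes", "\u201c\u201d", '"')
# Leave the replacement blank to remove soft hyphens altogether.
clean = replace_characters("co\u00adoperate", "\u00ad")
```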
See also: convert whole file, sample conversion file, syntax, Text Converter Contents.