Concord

1. Trying to concordance "can", it keeps giving me "can't" as well. I've told the text settings that I don't want the apostrophe to be included in the word but the only difference that seems to make is that the resulting concordance treats them as different words in the sorting (it doesn't when the apostrophe is 'included in word').

The problem is one of ambiguity. The apostrophe is used for at least 3 purposes in English (genitives, irony and quotes). Here you're concerned with how Concord recognises the end of a word, since by default a request for "can" means the whole word " can ", with a word separator on each side. But there are other word separators besides the space symbol, including carriage returns and punctuation symbols such as apostrophes. In other words, the apostrophe in BROTHERS' must be seen as a word separator, just like the space after SISTER or the full stop after COUSIN. That is why the "can" in "can't" counts as a hit: it is followed by a separator.
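To make the point concrete, here is a minimal Python sketch of a whole-word search in which the apostrophe counts as a word separator. It is only an illustration, not Concord's actual code: the SEPARATORS set and the function name whole_word_hits are assumptions made for this example.

```python
import re

# Assumed separator set for this sketch: whitespace plus common punctuation,
# including the apostrophe. Concord's real list is not given here.
SEPARATORS = " \t\r\n.,;:!?\"'"

def whole_word_hits(text, word):
    """Return (start, following_char) for each whole-word hit of `word`."""
    hits = []
    for m in re.finditer(re.escape(word), text, flags=re.IGNORECASE):
        # Treat the start and end of the text as if they were spaces.
        before = text[m.start() - 1] if m.start() > 0 else " "
        after = text[m.end()] if m.end() < len(text) else " "
        if before in SEPARATORS and after in SEPARATORS:
            hits.append((m.start(), after))
    return hits

text = "I can go, but she can't; my brothers' car."
can_hits = whole_word_hits(text, "can")
brothers_hits = whole_word_hits(text, "brothers")
print(can_hits)       # two hits: one followed by a space, one by an apostrophe
print(brothers_hits)  # one hit, followed by an apostrophe
```

Under this rule "can't" has to be a hit for "can", for the same reason that BROTHERS' is a hit for BROTHERS.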

One solution -- the easiest -- is to delete the unwanted concordance lines with "can't". Sort on the search word (F6) so that they all come together, then delete them all in one go. Another is to use a stop list. This will be a bit slower than solution 1, because it forces Concord to check every occurrence of "can" to see whether it is in the stop list.
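Both solutions can be sketched in a few lines of Python. The hits list, its tuple layout and the two function names are hypothetical, invented for this illustration; they are not how Concord stores its results.

```python
# Hypothetical concordance hits: (left context, node word as found, right context).
hits = [
    ("she", "can't", "stay"),
    ("I", "can", "go"),
    ("you", "can't", "win"),
    ("we", "can", "try"),
]

def sort_on_search_word(hits):
    """Sort so that identical node words come together, ready for deletion."""
    return sorted(hits, key=lambda h: h[1].lower())

def apply_stop_list(hits, stop_list):
    """Drop every hit whose node word is in the stop list."""
    stopped = {w.lower() for w in stop_list}
    # Each hit has to be checked against the stop list, hence the extra cost.
    return [h for h in hits if h[1].lower() not in stopped]

grouped = sort_on_search_word(hits)
kept = apply_stop_list(hits, ["can't"])
print(grouped)
print(kept)
```

After sorting, the "can't" lines form one block that can be removed in one go; with the stop list they never reach the results at all.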

2. I wanted to concordance phrases like big, bold and beautiful so I entered *  ,  *  and  * but it didn't work. Why not?

Use *  *,  *  and  * (in other words put an asterisk before the comma). By default Concord assumes anything, even a comma, is a "whole word".
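A simplified model shows why the first search fails. The Python sketch below splits text and search phrase on spaces and compares them word by word using shell-style wildcards; this is an assumption made for illustration, not a description of Concord's real matcher, and phrase_matches is a made-up name.

```python
from fnmatch import fnmatchcase

def phrase_matches(text, pattern):
    """Return every run of space-separated tokens matching `pattern` token by token."""
    tokens = text.split()
    parts = pattern.split()
    found = []
    for i in range(len(tokens) - len(parts) + 1):
        window = tokens[i:i + len(parts)]
        if all(fnmatchcase(tok.lower(), part.lower())
               for tok, part in zip(window, parts)):
            found.append(" ".join(window))
    return found

text = "It was a big, bold and beautiful design."
failed = phrase_matches(text, "*  ,  *  and  *")
worked = phrase_matches(text, "*  *,  *  and  *")
print(failed)  # nothing: no token in the text is a bare comma
print(worked)  # the phrase is found, because "big," matches "*,"
```

In this model a comma standing alone is a whole word that the text never contains, because in running text the comma is attached to the word before it.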

3. Can the software work with large corpora of over 200M words? Would there be notable delays before concordance lines start appearing on a high-spec PC?

Yes. No noticeable delays; concordance lines start appearing the millisecond they're found. You can stop at any time if you have enough. But it will take a long time to go through all 200M words. That is about 1,200 MB of pure untagged text, and on a fast PC I'd guess about 3 minutes, though this depends on the search-word etc. More info is supplied on this in the Help. See also no. 4 below.
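The figures above imply a rough processing rate, worked out in the short Python calculation below. It assumes 1 MB = 1,000,000 bytes and takes the 3 minutes at face value, so treat the results as ballpark numbers only.

```python
words = 200_000_000      # corpus size in running words
size_mb = 1_200          # size as pure untagged text
seconds = 3 * 60         # the guessed 3 minutes on a fast PC

bytes_per_word = size_mb * 1_000_000 / words   # average, including the separator
mb_per_second = size_mb / seconds
words_per_second = words / seconds

print(f"{bytes_per_word:.1f} bytes per word")
print(f"{mb_per_second:.2f} MB per second")
print(f"{words_per_second:,.0f} words per second")
```

That works out at about 6 bytes per word and a little over a million words per second.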

4. Is the corpus indexed?

Not usually. But now it can be, if you so choose. To do lots of concordances always using the same very big corpus you'd be best advised to make an index of it.
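The idea behind an index can be sketched as follows in Python: read the corpus once, remember where every word occurs, then answer each later concordance request by looking the word up instead of re-reading the texts. This is a toy illustration only; the function names and the in-memory layout are assumptions and say nothing about the index format WordSmith actually uses.

```python
import re
from collections import defaultdict

def build_index(texts):
    """Map each lower-cased word to a list of (text name, character offset)."""
    index = defaultdict(list)
    for name, text in texts.items():
        for m in re.finditer(r"[A-Za-z]+", text):
            index[m.group().lower()].append((name, m.start()))
    return index

def concordance(index, texts, word, width=10):
    """Build context lines for `word` from the index, without a full scan."""
    lines = []
    for name, pos in index.get(word.lower(), []):
        text = texts[name]
        lines.append(text[max(0, pos - width):pos + len(word) + width])
    return lines

texts = {
    "a.txt": "You can go. She can't stay.",
    "b.txt": "A tin can.",
}
index = build_index(texts)          # done once, up front
lines = concordance(index, texts, "can")
print(index["can"])
print(lines)
```

Building the index costs one full pass, which is why it pays off only when the same big corpus is searched again and again.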

5. I got a "General Protection Fault" message when running Concord.

The GPF message means that Concord tried to tread into some memory space it wasn't allowed access to. This is a pain, and even MS Word, Eudora, Windows Help and other programs occasionally do it.

Possibility 1) A GPF is most likely to happen if there is a straightforward error in the program. In the current OUP version this is not very likely in ordinary use; I would have had lots of angry messages since launch date if this usually happened! Solution: get a new version from my website and try again.

Possibility 2) A GPF could be caused by a shortage of memory whilst Concord is trying to do its job. This might happen if there were other sizeable programs such as MS Word loaded up at the same time, or else if there was something wrong with the hard disk. Windows uses the hard disk as a storage area for memory when it runs out of room on the chips in the machine. Solution: re-boot the machine so as to start with a nice fresh reset setup. Then run Scandisk to check whether the hard disk is screwed up at all and correct any faults which appear. Ensure there is at least 25 MB of room on the hard disk. Now run WordSmith again.
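If you'd rather check the free disk space with a script than by hand, a minimal Python sketch is shown below. The 25 MB threshold comes from the advice above; the function name is made up for this example, and the path "." (the current drive) is an assumption you may need to change.

```python
import shutil

MIN_FREE_MB = 25  # the minimum suggested above

def enough_free_space(free_bytes, minimum_mb=MIN_FREE_MB):
    """Return True if `free_bytes` is at least `minimum_mb` megabytes."""
    return free_bytes >= minimum_mb * 1024 * 1024

# Free space on the drive holding the current directory (assumed location).
free = shutil.disk_usage(".").free
print(f"{free / (1024 * 1024):.0f} MB free")
print("OK" if enough_free_space(free) else "Free up some disk space first")
```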

Possibility 3) There might be something special about the machine in question, especially if it differs from the usual setup. I developed WordSmith Tools 1-3 using Pentium PCs running Win 95B, 98, NT and 2000. Things which might be special include: non-standard operating system, networked setup, unusual non-Intel CPU, very old CPU (e.g. 386) or very latest CPU, laptop machine. Solution: try it on another machine if possible.


