1. In the WordSmith Text Converter (simplify XML section), use these settings on a copy of your BNC:
which leaves the text like this:
2. Then, in the Text Converter's section on using a conversion file, apply a conversion file containing these lines to the copy you made in step 1:
which will get rid of the c5=" and the closing ", leaving you with
You should now have a relatively clean copy of the BNC, with each word marked simply with a <part of speech> tag next to it.
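For readers who want to check the result of step 2 outside WordSmith, the same substitution can be sketched in Python. This is only an illustration, not the Text Converter's own mechanism: it assumes that after step 1 each tag looks like <c5="VVB"> (optionally with an element name in front, as in <w c5="VVB">), and the names TAG and strip_c5 are hypothetical.

```python
import re

# Assumed tag shape after step 1: <c5="VVB"> or <w c5="VVB">.
# This pattern is an illustrative guess, not WordSmith's own rule.
TAG = re.compile(r'<(?:\w+\s+)?c5="([^"]+)">')

def strip_c5(text: str) -> str:
    """Rewrite tags such as <c5="VVB"> as plain <VVB>."""
    return TAG.sub(r"<\1>", text)

sample = '<c5="PNP">She <c5="VVD">walked <c5="AV0">home'
print(strip_c5(sample))  # prints: <PNP>She <VVD>walked <AV0>home
```

If your files come out of step 1 in a different shape, the pattern would need adjusting to match.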
3. Making a word list with only the verbs.
You will need a tag file which specifies the verb tags. Something like this will do:
<VVB> /description "lex. verb Base" /colour="Cream on Black"
<VVD> /description "lex. verb Past" /colour="Cream on Maroon"
<VVG> /description "lex. verb Pres. Participle" /colour="Cream on Green"
<VVI> /description "lex. verb Infinitive" /colour="Cream on Purple"
<VVN> /description "lex. verb Past Participle" /colour="Cream on Navy"
<VVZ> /description "lex. verb 3rd Sing." /colour="White on Olive"
4. Load up the tag file and make your word list.
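As a rough cross-check on the verb word list, the cleaned text can also be counted in a few lines of Python. This is a sketch under stated assumptions, not a description of how WordSmith builds its list: it assumes each tag comes immediately before its word (for example <VVD>walked), it uses the six lexical-verb tags from the tag file above, and the names VERB_TAGS, TOKEN and verb_counts are hypothetical.

```python
import re
from collections import Counter

# The six lexical-verb tags listed in the tag file.
VERB_TAGS = {"VVB", "VVD", "VVG", "VVI", "VVN", "VVZ"}

# Assumption: a tag is immediately followed by the word it marks.
TOKEN = re.compile(r"<([A-Z0-9-]+)>\s*([^<\s]+)")

def verb_counts(text: str) -> Counter:
    """Count lower-cased word forms that carry one of the verb tags."""
    counts = Counter()
    for tag, word in TOKEN.findall(text):
        if tag in VERB_TAGS:
            counts[word.lower()] += 1
    return counts

sample = "<PNP>She <VVD>walked <AV0>home <CJC>and <PNP>he <VVD>walked <AV0>too"
print(verb_counts(sample).most_common())  # prints: [('walked', 2)]
```

Words carrying any other tag, including combined tags that are not in the set, are ignored by this sketch.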