Custom settings
Custom Tagsets
In the main Settings | Tags window you will see a Custom Tagsets setting, but you won't find "Shakespeare" as one of its options.
The point of it...
The point of this choice is to change a whole series of settings at once, according to the type of corpus you wish to process. When you change the setting above, any valid data defined as explained below is loaded into your defaults.
How to do it
1. Create a plain text file called "custom_tag_settings.txt" and save it in your \wsmith4 folder. The format is like this:
<label> </label>
<default> </default> (this can be used for one entry only and will determine which label is selected)
<entity_file> </entity_file>
<tag_file> </tag_file>
<tags_exclude_file> </tags_exclude_file>
<ignore_string> </ignore_string>
<header_string> </header_string>
<sentence_begin> </sentence_begin>
<sentence_end> </sentence_end>
<paragraph_begin> </paragraph_begin>
<paragraph_end> </paragraph_end>
<heading_begin> </heading_begin>
<heading_end> </heading_end>
<section_begin> </section_begin>
<section_end> </section_end>
Example
I wanted a choice of Shakespeare to determine which tags were chosen and how sentences, paragraphs etc. would be recognised in my Shakespeare corpus. Here is how I made "Shakespeare":
<1>
<label> Shakespeare </label>
<entity_file> sgmltrns.tag</entity_file>
<tag_file> Shakespeare.tag</tag_file>
<tags_exclude_file> Shakespeare exclusion tags.tag</tags_exclude_file>
<ignore_string> <*> </ignore_string>
<header_string> </Header></header_string>
<sentence_begin> </sentence_begin>
<sentence_end>auto</sentence_end>
<paragraph_begin> </paragraph_begin>
<paragraph_end> </paragraph_end>
<heading_begin> </heading_begin>
<heading_end> </heading_end>
<section_begin> </section_begin>
<section_end> </section_end>
</1>
There were <2>...</2>, <3>...</3> etc., but they aren't supplied here. There was no point in trying to recognise paragraph breaks in Shakespeare plays, but I did want an idea of sentences, to be recognised simply by full stops etc.
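To check a custom_tag_settings.txt file before relying on it, it can help to read it back with a small script. The following Python sketch is not WordSmith's own loader; it only illustrates the layout shown above. It assumes each entry is wrapped in a numbered pair such as <1>...</1>, as in the Shakespeare example. The names FIELDS and parse_custom_tag_settings are hypothetical and chosen for illustration only.

```python
import re

# Field names taken from the format listing above.
FIELDS = [
    "label", "default", "entity_file", "tag_file", "tags_exclude_file",
    "ignore_string", "header_string", "sentence_begin", "sentence_end",
    "paragraph_begin", "paragraph_end", "heading_begin", "heading_end",
    "section_begin", "section_end",
]


def parse_custom_tag_settings(text):
    """Return one dict per numbered entry (<1>...</1>, <2>...</2>, ...).

    Assumption: entries are delimited by numbered tags, as in the example.
    Values are stripped of surrounding whitespace; fields that are absent
    from an entry are simply left out of its dict.
    """
    entries = []
    for block in re.finditer(r"<(\d+)>(.*?)</\1>", text, re.DOTALL):
        body = block.group(2)
        entry = {}
        for field in FIELDS:
            # Non-greedy match so values that themselves contain tags,
            # e.g. "</Header>" or "<*>", are captured intact.
            found = re.search(rf"<{field}>(.*?)</{field}>", body, re.DOTALL)
            if found:
                entry[field] = found.group(1).strip()
        entries.append(entry)
    return entries


SAMPLE = """<1>
<label> Shakespeare </label>
<entity_file> sgmltrns.tag</entity_file>
<tag_file> Shakespeare.tag</tag_file>
<tags_exclude_file> Shakespeare exclusion tags.tag</tags_exclude_file>
<ignore_string> <*> </ignore_string>
<header_string> </Header></header_string>
<sentence_begin> </sentence_begin>
<sentence_end>auto</sentence_end>
<paragraph_begin> </paragraph_begin>
<paragraph_end> </paragraph_end>
</1>"""

entries = parse_custom_tag_settings(SAMPLE)
for entry in entries:
    # List the fields that actually carry a value in each entry.
    filled = {name: value for name, value in entry.items() if value}
    print(entry.get("label", "(no label)"), filled)
```

Fields left empty, such as the paragraph markers in the Shakespeare entry, come back as empty strings, which makes it easy to see which settings an entry would actually change.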
See also: Tags as text selectors