WordSmith Tools Manual

corpus sampler

The point of it


Texts downloaded from the Internet often seem to promise one thing but in fact contain another. A news story, for example, may mention an issue you are interested in only incidentally and be mainly about something else.


Use the sampler when you need to read individual texts in your corpus to check whether they meet your purposes. This utility picks a random set of texts and can save it so that you can share it with others and agree on criteria.


How to do it


Choose the number of texts you want. Press Random sample from folder if you want the program to go through the texts in a folder (and its sub-folders), or Random sample from a list if you have a plain text list of the texts. The procedure looks at all the text files and picks out that number at random, so the chosen texts can come from anywhere within the total.
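The selection step can be imitated outside the program. The following is a minimal Python sketch of the idea, not WordSmith's own procedure. The function names, the .txt extension filter, the optional seed and the one-path-per-line list format are all assumptions made for illustration.

```python
import random
from pathlib import Path


def sample_from_folder(folder, n, seed=None):
    """Pick n text files at random from a folder and its sub-folders.

    Assumption: the texts are the files ending in .txt.
    """
    files = sorted(p for p in Path(folder).rglob("*.txt") if p.is_file())
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    # Never ask for more files than actually exist.
    return rng.sample(files, min(n, len(files)))


def sample_from_list(list_file, n, seed=None):
    """Pick n entries at random from a plain text list.

    Assumption: the list holds one file path per line; blank lines are skipped.
    """
    lines = Path(list_file).read_text(encoding="utf-8").splitlines()
    paths = [Path(line.strip()) for line in lines if line.strip()]
    rng = random.Random(seed)
    return rng.sample(paths, min(n, len(paths)))
```

Sampling without replacement means no text is chosen twice, and every text in the total has the same chance of being picked.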




Zipped text button

This generates a zip file in which the original folder structure of the sample texts is preserved.


In the .zip you will find a separate copy of each text file, as well as one large text containing all the texts joined one after the other. It also contains a plain text list of the text files, ready for you or fellow researchers to note down observations about each text. You might find it useful to copy this list into Excel for your record-keeping.
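For readers who want to build a comparable archive themselves, here is a hedged Python sketch of the layout described above. It is not WordSmith's implementation. The function name zip_sample and the entry names all_texts.txt and file_list.txt are hypothetical, and the sketch assumes that every sampled file lies under one root folder and can be read as UTF-8.

```python
import zipfile
from pathlib import Path


def zip_sample(sample, root, zip_path):
    """Write sampled texts to a zip file, keeping their folder structure.

    sample   -- paths of the chosen text files, all located under root
    root     -- folder the sample was drawn from
    zip_path -- where to write the archive
    Returns the relative names stored in the archive.
    """
    root = Path(root)
    names = []
    combined = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sample:
            path = Path(path)
            # Store each file under its path relative to root, so that
            # sub-folders are reproduced inside the archive.
            rel = path.relative_to(root).as_posix()
            zf.write(path, arcname=rel)
            names.append(rel)
            combined.append(path.read_text(encoding="utf-8", errors="replace"))
        # One large text with every sampled text joined one after the other
        # (hypothetical entry name).
        zf.writestr("all_texts.txt", "\n".join(combined))
        # Plain text list of the files, one per line, for noting observations
        # (hypothetical entry name).
        zf.writestr("file_list.txt", "\n".join(names) + "\n")
    return names
```

Because the list has one file name per line, it pastes into a single spreadsheet column, leaving the next column free for notes on each text.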


See also: relevance check.