Publications/Authors

The point of it

The idea here is to be able to select texts from certain publications or certain authors only, e.g. broadsheets, or texts by Smith and Jones but not Robinson.

A list of publications and a list of authors will already have been created (or added to) for the language you're working with. Both are saved in the same folder as this text, which is also where the program sits.

How to do it

 

[Screenshot: the Publications/Authors page of the download parser]

 

Each line in the publications file has the format name<TAB>number, where number is the code shown in the example above: 00001 means publication number 1, and 00004 means author number 4 in the list. (All this is explained further in the Reference page under Filenames.)
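
If you want to read such a list outside the program, the name<TAB>number format is easy to parse. What follows is only a minimal sketch in Python, not the program's own code: the function name parse_code_list and the sample names are invented for illustration, and the codes are kept as text so that leading zeros survive.

```python
import io

def parse_code_list(lines):
    """Parse name<TAB>number lines into a list of (name, code) pairs."""
    entries = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab, so the code is whatever follows it.
        name, sep, code = line.rpartition("\t")
        if not sep:
            continue  # no tab found: not a name<TAB>number line
        entries.append((name.strip(), code.strip()))
    return entries

# Hypothetical sample data; the publication names are invented.
sample = io.StringIO("Daily Example\t00001\nEvening Sample\t00002\n")
publications = parse_code_list(sample)
print(publications)
```

With a real file you would pass an open file object instead of the sample; its encoding is not stated here, so you would need to check that first.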

You can load up the list from your already-collected list of publications or of authors. The list is ordered by frequency unless you check code-sort, in which case it is ordered by the codes the program has created for each publication, so that you can easily find a given publication.
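
The two orderings can be pictured with a short Python sketch. It is an illustration only: the (name, code, frequency) tuples and all their values are invented, and it assumes that "ordered by frequency" means most frequent first.

```python
# Hypothetical entries: (name, code, frequency). All values are invented.
entries = [
    ("Daily Example", "00001", 120),
    ("Evening Sample", "00002", 480),
    ("Weekly Placeholder", "00003", 75),
]

def order_entries(entries, code_sort=False):
    """Order by code if code_sort is set, else by frequency (highest first)."""
    if code_sort:
        # Comparing codes as text works because they are zero-padded
        # to the same width.
        return sorted(entries, key=lambda entry: entry[1])
    return sorted(entries, key=lambda entry: entry[2], reverse=True)

by_frequency = order_entries(entries)
by_code = order_entries(entries, code_sort=True)
print([name for name, _, _ in by_frequency])
print([name for name, _, _ in by_code])
```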

You can select or unselect all of them, and will probably want to pick the higher-frequency ones. You may want to save these preferences.

There is a newswire list box: if you load up a plain text file listing newswires, you can choose to Select or Unselect them. The frequency number box lets you (un)select any publication whose frequency is at that number or greater. The years button lets you specify a range of years of interest, e.g. 2000-2010. The Countries button selects by country.
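
As a rough illustration of the frequency and year filters, here is a small Python sketch. It is not the program's own logic: the function names and the frequencies are invented, and it assumes that a year range such as 2000-2010 includes both end years.

```python
def select_by_frequency(frequencies, threshold):
    """Return the names whose frequency is at the threshold or greater."""
    return {name for name, freq in frequencies.items() if freq >= threshold}

def parse_year_range(text):
    """Turn a string such as '2000-2010' into a (start, end) pair of ints."""
    first, _, last = text.strip().partition("-")
    start = int(first)
    end = int(last) if last else start  # a single year stands for itself
    if end < start:
        raise ValueError("end year is before start year: " + text)
    return start, end

def in_year_range(year, year_range):
    """Assumption: both ends of the range are included."""
    start, end = year_range
    return start <= year <= end

# Hypothetical frequencies, invented for illustration.
frequencies = {"Daily Example": 120, "Evening Sample": 40, "Weekly Placeholder": 75}
selected = select_by_frequency(frequencies, 75)
years = parse_year_range("2000-2010")
print(sorted(selected), years, in_year_range(2005, years))
```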

Save this check-list lets you save the list (including which items are ticked) as a plain text file, and Load from saved list reads such a saved list back in.

Export

The Export button produces a tab-delimited list, which you can read into Excel, showing the articles in the data for the selected publications or authors. It saves a complete list of all text files published in any publication (or written by any author) whose name you've checked. (If you choose to export dates only, you get a sorted list of the dates on which each publication (or author) had articles.) The publication-oriented file-list box (in the Make sub-corpora tab) is then filled in with this filename, so you can create corpora using the list. You can also use such a list as input to WordSmith for choosing text files.

If you've checked random sample, pressing Export generates only a sample of the size you select, e.g. 100 texts. This is useful, for example, for sending to colleagues when discussing filtering criteria.
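
If you need a comparable sample from a list you already have, something like the following Python sketch would do. It is only an illustration: the function name and file names are invented, and it assumes sampling without replacement, which may not be how the program itself draws its sample.

```python
import random

def sample_texts(file_names, sample_size, seed=None):
    """Pick up to sample_size distinct file names at random."""
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    names = list(file_names)
    k = min(sample_size, len(names))  # never ask for more than exist
    return rng.sample(names, k)

# Hypothetical file names, invented for illustration.
all_files = ["text_%04d.txt" % n for n in range(1, 251)]
sample = sample_texts(all_files, 100, seed=1)
print(len(sample))
```

Passing the same seed again gives the same sample, which helps if you and your colleagues need to look at identical texts.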

 

Check Sample

 

Text by Mike Scott, Help system by Help&Manual
