The point of it
This lets you filter texts according to the country in which they were published.
How to do it
Choose a text file which specifies the relevant countries for your language. For example, a small plain text file containing this:
USA:
The New York Times
The Washington Post
San Jose Mercury News (California)
The Houston Chronicle
USA TODAY
UK:
The Times (London)
Daily Mail (London)
The Express
The Evening Standard (London)
The Mirror
will be parsed to use 5 newspaper sources from the USA and 5 from the UK. Each country name must be followed by a colon. The newspaper names must match exactly those supplied by the provider of the download.
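To make the format concrete, here is a minimal Python sketch of how a file like the one above could be read. It is not the program's own parser. It assumes that any line ending in a colon is a country heading and that every other non-blank line is a source name belonging to the most recent heading. The function name parse_country_file is hypothetical.

```python
def parse_country_file(text):
    """Map each country heading to the list of source names listed under it.

    Assumption (not confirmed by the tool): a line ending in ':' starts a
    new country, and all other non-blank lines are newspaper names.
    """
    countries = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.endswith(":"):
            current = line[:-1].strip()
            countries.setdefault(current, [])
        elif current is None:
            raise ValueError(f"source name before any country heading: {line!r}")
        else:
            countries[current].append(line)
    return countries


sample = """USA:
The New York Times
The Washington Post
San Jose Mercury News (California)
The Houston Chronicle
USA TODAY
UK:
The Times (London)
Daily Mail (London)
The Express
The Evening Standard (London)
The Mirror
"""

sources = parse_country_file(sample)
for country, names in sources.items():
    print(country, len(names))
```

A heading without its colon would be read as just another newspaper name under the previous country, which is why the colon is required.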
Hint: open the PUBLICATIONS_ENGLISH.TXT file, which is produced when the downloaded texts are parsed. It lists all the news sources in alphabetical order. Some may be misspelled in the original provider's database, in which case you need to copy the name as it appears there.
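Because the names have to match, it can help to check your list before using it. The Python sketch below is a separate check you could run yourself and is not part of the program. It assumes you have already copied the provider's source names into a list; how to read them from PUBLICATIONS_ENGLISH.TXT depends on that file's layout, which is not described here. The function name find_unmatched and the sample names, including the deliberate misspelling, are illustrative only.

```python
import difflib


def find_unmatched(wanted, known):
    """Return each wanted name with no exact match, plus close candidates."""
    known_set = set(known)
    report = {}
    for name in wanted:
        if name not in known_set:
            # Suggest similar-looking names from the provider's list, best first.
            report[name] = difflib.get_close_matches(name, known, n=3)
    return report


# Hypothetical data: names as the provider spells them.
known = ["Daily Mail (London)", "The Express", "The Times (London)"]
# Names typed into the country file, one with a typo.
wanted = ["The Times (London)", "Daily Mail (Londn)"]

report = find_unmatched(wanted, known)
for name, candidates in report.items():
    print(f"No exact match for {name!r}; closest: {candidates}")
```

Any name that appears in the report should be corrected in the country file to the provider's spelling, even if the provider's spelling is itself wrong.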
Each country’s results will go into a separate folder if you make sub-corpora.