The point of it

The idea here is to let you filter texts according to the country in which they were published.


How to do it

Choose a text file which specifies the relevant countries for your language. For example, a small plain text file containing this:


The New York Times
The Washington Post
San Jose Mercury News (California)
The Houston Chronicle

The Times (London)
Daily Mail (London)
The Express
The Evening Standard (London)
The Mirror

will be parsed to use the listed newspaper sources from the USA and those from the UK. Each country name must be followed by a colon. The newspaper names must match those supplied by the provider of the download.
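To make the file layout concrete, here is a minimal Python sketch of how such a file could be read. It is an illustration only, not the program's own parser. It assumes that each country name sits on its own line ending in a colon (the labels "USA:" and "UK:" are hypothetical) and that the non-blank lines which follow are that country's news sources. The function names are invented for this example.

```python
from collections import defaultdict


def parse_country_sources(text):
    """Map each country to the news sources listed under it.

    Assumed layout (not confirmed by the help text): a line ending in a
    colon names a country; the non-blank lines after it are its sources.
    """
    countries = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate groups
        if line.endswith(":"):
            current = line[:-1].strip()
            countries[current]  # register the country even if it has no sources
        elif current is None:
            raise ValueError(f"source listed before any country: {line!r}")
        else:
            countries[current].append(line)
    return dict(countries)


def country_of(source, countries):
    """Return the country a source is listed under, or None if unlisted."""
    for country, sources in countries.items():
        if source in sources:
            return country
    return None


# Hypothetical sample in the assumed layout.
SAMPLE = """\
USA:
The New York Times
The Washington Post

UK:
The Times (London)
The Mirror
"""

if __name__ == "__main__":
    table = parse_country_sources(SAMPLE)
    for country, sources in table.items():
        print(country, len(sources), "sources")
```

Note that the lookup in country_of is an exact string comparison, which is why the names in your file need to match the provider's spelling.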


Hint: open the PUBLICATIONS_ENGLISH.TXT file which you get once the downloaded texts have been parsed; in it you will see all the news sources in alphabetical order. Some may be misspelled in the original provider's database, so copy each name as it appears there.
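If you want to check your list against the publications file before running the filter, something like the following Python sketch may help. It assumes that the publications file holds one news-source name per line, which the help text does not state, and the function name check_sources is invented for this example. It uses difflib from the Python standard library to suggest near matches for names which are not found.

```python
import difflib


def check_sources(wanted, known):
    """Return {name: suggestions} for each wanted name absent from known.

    wanted: names you typed into your country file
    known:  names read from the publications list (assumed one per line)
    """
    known_set = set(known)
    problems = {}
    for name in wanted:
        if name not in known_set:
            # Suggest up to three close spellings from the known list.
            problems[name] = difflib.get_close_matches(
                name, known, n=3, cutoff=0.8
            )
    return problems


if __name__ == "__main__":
    # Hypothetical data standing in for the real files.
    known = ["The New York Times", "The Washington Post", "The Mirror"]
    wanted = ["The New York Times", "The Washington Pst", "Le Monde"]
    for name, suggestions in check_sources(wanted, known).items():
        print(f"{name!r} not found; close matches: {suggestions}")
```

A name with an empty suggestion list is probably absent from the download altogether, whereas a name with a suggestion is more likely a spelling difference.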

Each country’s results will go into a separate folder if you make sub-corpora.


Text by Mike Scott, Help system by Help&Manual
