WordSmith Tools Help

build sub-corpora


The point of it

It can be convenient to merge all the text files from one given month into one single text. The function uses the date of each text to decide where it belongs. It reduced 120 thousand single articles dated between 2005 and 2018 to 168 monthly texts (14 years of 12 months each), which are easier to handle when seeking trends.



[Screenshot: build sub-corpora]


How to do it

Choose a source folder, the file-types and a suitable results folder, then choose the delicacy (level of detail) required: years, months, weeks and days are the obvious choices.

In the screen-shot you can see that, after choosing the source folder j:\climate_study_2019\parsed\GE\GERMAN\FAZ, I added | and a further folder-name (KW). This allows the program to search within both the FAZ and KW folders, both of which are sub-folders of j:\climate_study_2019\parsed\GE\GERMAN.
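If it helps to see how a source entry with | can be read, here is a minimal sketch in Python. It is not WordSmith's own code: the function name expand_source_spec is made up for illustration, and it simply assumes that every name after a | is a sibling of the first folder, as in the FAZ and KW example.

```python
from pathlib import PureWindowsPath


def expand_source_spec(spec: str) -> list[PureWindowsPath]:
    """Expand 'folder|NAME|...' into a list of sibling folders.

    Hypothetical helper: assumes each name after '|' lives in the same
    parent folder as the first folder, as described for FAZ and KW.
    """
    first, *extra_names = spec.split("|")
    first_folder = PureWindowsPath(first.strip())
    parent = first_folder.parent
    # Keep the first folder, then add one sibling per extra name.
    siblings = [parent / name.strip() for name in extra_names if name.strip()]
    return [first_folder] + siblings


folders = expand_source_spec(r"j:\climate_study_2019\parsed\GE\GERMAN\FAZ|KW")
for folder in folders:
    print(folder)
```

PureWindowsPath is used so that the sketch handles Windows-style paths the same way on any operating system; it never touches the disk.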



As each text file is found, it gets added to the collection for the appropriate year/month/week/day in a folder of that name. This is a partial view of what I got:


[Screenshot: build sub-corpora result]
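The grouping idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual method: the names period_key and build_sub_corpora are hypothetical, the date of each text is assumed to be supplied by the caller (this page does not say where WordSmith reads it from), and the label formats such as 2005-03 or 2005-W09 are my own choice.

```python
from collections import defaultdict
from datetime import date


def period_key(d: date, delicacy: str) -> str:
    """Return a label for the year, month, week or day that d falls in."""
    if delicacy == "years":
        return f"{d.year:04d}"
    if delicacy == "months":
        return f"{d.year:04d}-{d.month:02d}"
    if delicacy == "weeks":
        # ISO weeks: the ISO year can differ from the calendar year
        # for dates close to 1 January.
        iso_year, iso_week, _ = d.isocalendar()
        return f"{iso_year:04d}-W{iso_week:02d}"
    if delicacy == "days":
        return d.isoformat()
    raise ValueError(f"unknown delicacy: {delicacy!r}")


def build_sub_corpora(dated_texts, delicacy="months"):
    """Merge (date, text) pairs into one combined text per period."""
    merged = defaultdict(list)
    for d, text in dated_texts:
        merged[period_key(d, delicacy)].append(text)
    # Join the texts of each period, keeping the periods in sorted order.
    return {key: "\n".join(parts) for key, parts in sorted(merged.items())}


articles = [
    (date(2005, 1, 3), "first article"),
    (date(2005, 1, 20), "second article"),
    (date(2005, 2, 1), "third article"),
]
for label, text in build_sub_corpora(articles, "months").items():
    print(label, len(text.splitlines()), "article(s)")
```

With one or more articles in every month from 2005 to 2018, the same function yields 168 monthly texts, which matches the figure given in "The point of it".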