WordSmith Tools Manual

Batch Processing

The point of it...

Batch processing is for when you want a separate list for each text file but don't want the trouble of doing it one by one: manually selecting each text file, making the word list or concordance, saving it, and so on.

If you have selected more than one text file, you can ask WordList, Concord and KeyWords to process them as a batch.

 

[Screenshot: batch processing settings]

 

Folder where they end up

The name suggested is today's date. Edit it if you like. The folder you name will be created when the batch process starts.

The results will be stored in folders stemming from the folder name. That is, if you start making word lists in

c:\wsmith\wordlist\05_07_19_12_00, they will end up like this:

c:\wsmith\wordlist\05_07_19_12_00\0\fred1.lst

c:\wsmith\wordlist\05_07_19_12_00\0\jim2.lst

..

c:\wsmith\wordlist\05_07_19_12_00\0\mary512.lst

then

c:\wsmith\wordlist\05_07_19_12_00\1\joanna513.lst

etc.

Filenames will be the source text filename with the standard extension (.lst, .cnc, .kws).
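
The layout above can be sketched in Python. This is only an illustration, not WordSmith's own code: the function name result_path and the source paths are hypothetical, and the figure of 512 results per numbered sub-folder is an assumption inferred from the example (mary512 is the last file in \0 and joanna513 the first in \1).

```python
from pathlib import PureWindowsPath

# Standard result extensions named in this manual.
EXTENSIONS = {"wordlist": ".lst", "concord": ".cnc", "keywords": ".kws"}

# ASSUMPTION: 512 results per numbered sub-folder, inferred from the example
# (mary512 is the last file in \0, joanna513 the first in \1).
FILES_PER_SUBFOLDER = 512


def result_path(batch_folder, source_file, position, tool="wordlist"):
    """Return a hypothetical result path for the position-th source text (1-based)."""
    subfolder = (position - 1) // FILES_PER_SUBFOLDER
    # Keep the source filename, swap its extension for the standard one.
    stem = PureWindowsPath(source_file).stem
    return PureWindowsPath(batch_folder) / str(subfolder) / (stem + EXTENSIONS[tool])


if __name__ == "__main__":
    base = r"c:\wsmith\wordlist\05_07_19_12_00"
    # Hypothetical source texts; only the filename part matters here.
    print(result_path(base, r"c:\texts\fred1.txt", 1))
    print(result_path(base, r"c:\texts\joanna513.txt", 513))
```

Under these assumptions the first call lands in sub-folder 0 and the second in sub-folder 1, matching the listing above.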

 

Zip them

If checked, the results are stored in a standard .zip file. You can extract them using a standard zipping tool such as WinZip, or you can let WordSmith do it for you. The files within are exactly the same as the uncompressed versions but take up less disk space, and the disk system will also be less unhappy than if there are many hundreds of files in the same folder.

If you zip them, you will get

c:\wsmith\wordlist\05_07_19_12_00\batch.zip

and all the sub-files will get deleted unless you check "keep both .zip and results".
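
Because the batch file is a standard .zip, it should also be possible to inspect or unpack it from a script. The following is a minimal sketch using Python's standard zipfile module; the function names are hypothetical, and it only lists and extracts the result files. It does not read the word lists themselves, which are still meant to be opened in WordSmith.

```python
import zipfile
from pathlib import Path


def list_batch_results(zip_path):
    """Return the names of the result files stored in a batch .zip."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


def extract_batch_results(zip_path, target_folder):
    """Extract every result file into target_folder and return that folder."""
    target = Path(target_folder)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

For example, list_batch_results could be pointed at the batch.zip shown above to see which lists it holds before extracting anything.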

 

One file / One file per folder?

The first alternative (default) makes one .zip file with all your individual word-lists in it. Each word-list or concordance or keywords list is for one source text.

But what if your text files are structured like this:

\..\BNC

\..\BNC\written

\..\BNC\written\humanities

\..\BNC\written\medicine

\..\BNC\written\science

\..\BNC\spoken

etc.

The "One file per folder, individual zipfiles" option makes a separate .zip for each folderful of text files (e.g. one for humanities, another for medicine, etc.), with one list for each source text.

The "One file per folder, amalgamated zipfiles" option also makes a separate .zip for each folderful, but makes a single word-list or concordance from that whole folderful of texts.
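
The difference between the two per-folder options can be sketched as a grouping step. This is an illustration under stated assumptions, not a description of WordSmith's internals: the function names and the sample file names are hypothetical, and the sketch only counts how many lists each per-folder .zip would hold.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def group_by_folder(text_files):
    """Map each parent folder to the text files it directly contains."""
    groups = defaultdict(list)
    for name in text_files:
        path = PureWindowsPath(name)
        groups[str(path.parent)].append(path.name)
    return dict(groups)


def lists_per_zip(text_files, amalgamated=False):
    """Return, for each folder's .zip, how many result lists it would hold.

    individual zipfiles:  one list per source text in the folder
    amalgamated zipfiles: a single list built from the whole folderful
    """
    return {
        folder: 1 if amalgamated else len(names)
        for folder, names in group_by_folder(text_files).items()
    }


if __name__ == "__main__":
    # Hypothetical text files in two of the folders shown above.
    files = [
        r"BNC\written\humanities\a.txt",
        r"BNC\written\humanities\b.txt",
        r"BNC\written\medicine\c.txt",
    ]
    print(lists_per_zip(files))
    print(lists_per_zip(files, amalgamated=True))
```

Either way there is one .zip per folder; only the number of lists inside each one changes.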

 

Batch Processing and Excel

These options may also offer a chance for data to be copied automatically to an Excel file.

 

Faster (Minimal) Processing

 

[Screenshot: batch processing settings in KeyWords]

 

This checkbox is only enabled if you are about to start a process where more than one kind of result can be computed simultaneously. For example, if you are computing a concordance, by default collocates, patterns and dispersion plots will be computed when each concordance is done. In KeyWords, likewise, there will be dispersion plots, link calculations etc. which will be computed as the KWs are calculated.

If checked, only the minimal computation will be done (KWs in KeyWords processing, concordance in Concord). This will be faster, and you can always get the plots computed later as long as the source texts don't get moved or deleted.

 

Example: you're making word lists and have chosen 1,200 text files which are from a magazine called "The Elephant".

You specify

C:\WSMITH\WORDLIST\ELEPHANT as your folder name.

 

If you already have a folder called C:\WSMITH\WORDLIST\ELEPHANT, you will be asked for permission to erase it and all sub-folders of it!

 

After you press OK,

1,200 new word-lists are created, called trunk.LST, tail.LST .. digestive system.LST. They are all in numbered sub-folders of a folder called

C:\WSMITH\WORDLIST\ELEPHANT.

 

If you did not check "zip them into 1 .zip file", you will find them under C:\WSMITH\WORDLIST\ELEPHANT\0.

If you did check "zip them into 1 .zip file", there is now a C:\WSMITH\WORDLIST\ELEPHANT.ZIP file which contains all your results. (The 1,200 .LST files created will have been erased but the .ZIP file contains all your lists.)

 

The advantage of a .zip file is that it takes up much less disk space and is easy to email to others. WordSmith can access the results from within a .zip file, letting you choose which word list, concordance etc. you want to see.

 

Getting at the results in WordSmith

Choose File | Open as usual, then change the file-type to "Batch file *.zip". When you choose a .zip file, you will see a window listing its contents. Double-click on any one to open it.

 

Note: of course Concord will only succeed in opening a concordance, and KeyWords a key word list file. If you choose a .zip file which contains something else, the tool will give an error message.
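
One way to check in advance which tool a result belongs to is to look at its extension. The sketch below assumes the standard extensions mentioned earlier map to the tools in the obvious way (.lst for WordList, .cnc for Concord, .kws for KeyWords); the function names are hypothetical and the comparison ignores case, so trunk.LST counts as a word list.

```python
from pathlib import PurePath

# ASSUMPTION: each standard extension belongs to exactly one tool.
TOOL_FOR_EXTENSION = {".lst": "WordList", ".cnc": "Concord", ".kws": "KeyWords"}


def tool_for(result_name):
    """Return the tool expected to open a result file, or None if unrecognised."""
    return TOOL_FOR_EXTENSION.get(PurePath(result_name).suffix.lower())


def openable_by(result_names, tool):
    """Keep only the result files that the given tool is expected to open."""
    return [name for name in result_names if tool_for(name) == tool]
```

Applied to the contents of a batch .zip, openable_by would show whether the file holds anything the current tool can open.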

 

See also: batch scripts