Importing words from a text list



the point of it

You might want a word list based on some data you have obtained in the form of a list, but whose original texts you do not have access to.



Your text file can be in any language (select this before you make the list), and can be in Unicode or ASCII.

It must follow a format similar to the one a stop list uses, except that each word must be followed by a <tab> character and its frequency as a plain number (decimal points will be ignored). Do not use commas as a thousands delimiter, as otherwise they'll be interpreted as different words. The words do not need to be in frequency or alphabetical order.




; My word list for test purposes.
THIS	67543
IT	33218
WILL	2978
BE	5679
COMPLETE	45
AND	99345
UTTER	54
RUBBISH	99
THE	578965
IS	55678
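If you want to check a list before importing it, you can parse it yourself first. The Python sketch below is not part of WordSmith, and the name `parse_word_list` is made up for illustration. It assumes that lines beginning with a semicolon are comments (as in the example above) and that blank lines can be skipped. It is deliberately stricter than the description above: it reports commas, missing tabs and anything that is not a run of digits, including decimal points, rather than guessing what was meant.

```python
def parse_word_list(text):
    """Return (entries, problems) for tab-separated word/frequency text."""
    entries = {}
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        # Assumption: blank lines and ';' comment lines are skipped.
        if not line or line.startswith(";"):
            continue
        word, tab, freq = line.partition("\t")
        word, freq = word.strip(), freq.strip()
        # A \TOTAL=\ line carries no word, so it is skipped here.
        if word == "\\TOTAL=\\":
            continue
        if not tab:
            problems.append(f"line {number}: no tab between word and frequency")
        elif "," in freq:
            problems.append(f"line {number}: comma in frequency {freq!r}")
        elif not freq.isdigit():
            problems.append(f"line {number}: frequency {freq!r} is not a plain number")
        else:
            entries[word] = int(freq)
    return entries, problems


sample = (
    "; My word list for test purposes.\n"
    "THIS\t67543\n"
    "IT\t33218\n"
    "WILL\t2,978\n"   # comma as thousands delimiter: reported
    "BE 5679\n"       # space instead of a tab: reported
)
entries, problems = parse_word_list(sample)
print(entries)
for problem in problems:
    print(problem)
```

Only THIS and IT are accepted from the sample; the other two lines are reported with their line numbers so they can be fixed in the text file.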


You should get results like these.


[Screenshot: importing text into wordlist results]


Statistics are calculated in the simplest possible way: the word-lengths (plus mean and standard deviation), and the number of types and tokens. Most procedures need to know the total number of running words (tokens) and the number of different word types, so you should be able to use the word list in KeyWords etc.
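To see where these figures come from, here is a minimal Python sketch, assuming the list has already been read into a dictionary of word to frequency. The function name `basic_stats` is illustrative only. How WordSmith weights the word-length figures is not stated here, so the sketch assumes they are weighted by frequency (per token) and uses the population standard deviation; treat both as assumptions.

```python
import math


def basic_stats(freqs):
    """Types, tokens and word-length figures for a {word: frequency} dict."""
    types = len(freqs)            # number of different words
    tokens = sum(freqs.values())  # number of running words
    if tokens == 0:
        raise ValueError("the word list has no tokens")
    # Assumption: each word length counts once per token, not once per type.
    mean_len = sum(len(w) * f for w, f in freqs.items()) / tokens
    variance = sum(f * (len(w) - mean_len) ** 2 for w, f in freqs.items()) / tokens
    return {
        "types": types,
        "tokens": tokens,
        "mean_length": mean_len,
        "sd_length": math.sqrt(variance),
    }


freqs = {"THIS": 67543, "IT": 33218, "WILL": 2978}
stats = basic_stats(freqs)
print(stats["types"], stats["tokens"])  # 3 103739
```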


The total is computed by adding the frequencies of each word-type (67543+33218+2978 etc. in the example above).

Optionally, a line can start with \TOTAL=\ and contain a numerical total, e.g.

\TOTAL=\        299981

In this case the total number of tokens will be assumed to be 299981 instead of the sum of the frequencies.
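The rule for the token total can be sketched in Python as follows: add up the frequencies, unless a \TOTAL=\ line supplies a figure. The helper `token_total` is hypothetical and not part of WordSmith. It assumes a tab between \TOTAL=\ and the number, as for ordinary entries, and skips comment lines and malformed lines.

```python
TOTAL_MARKER = "\\TOTAL=\\"  # the literal text \TOTAL=\


def token_total(text):
    """Sum the frequencies, unless a \\TOTAL=\\ line declares the total."""
    summed = 0
    declared = None
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and ';' comment lines are skipped.
        if not line or line.startswith(";"):
            continue
        word, _, freq = line.partition("\t")
        word, freq = word.strip(), freq.strip()
        if not freq.isdigit():
            continue  # malformed lines are ignored in this sketch
        if word == TOTAL_MARKER:
            declared = int(freq)
        else:
            summed += int(freq)
    return declared if declared is not None else summed


listing = "THIS\t67543\nIT\t33218\nWILL\t2978\n"
print(token_total(listing))                           # 103739
print(token_total(listing + "\\TOTAL=\\\t299981\n"))  # 299981
```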


how to do it

When you choose the New menu option (Start) in WordList you get a window offering three tabs: a Main tab for most usual purposes, one for Detailed Consistency, and another (Advanced) for creating a word list using a plain text file.


[Screenshot: importing text into wordlist]


Choose your .txt file(s) and a suitable folder to save to, add any notes you wish, and press create word list(s) now.
