Inside a .doc or .docx file there is a lot of extra coding besides the plain text. (In fact, a .docx is compressed, so if you open it in a text editor you won't even see the ordinary words!) The extra coding includes, for example, the name of your printer, the owner of the software and information about styles. For accurate results, WordSmith needs clean text from which all this has been removed.
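To see why the words are hidden, it may help to look inside a .docx yourself. The sketch below is only an illustration, not how WordSmith or its Text Converter works. It assumes the file is a standard .docx, that is, a zip archive whose main text is stored as XML in word/document.xml; the function name docx_to_plain_text is made up for this example. It keeps only the visible text of each paragraph and ignores headers, footnotes, tabs and manual line breaks. It does not handle the older binary .doc format.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by the main text elements in a .docx
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_plain_text(path):
    """Return the body text of a .docx, one line per paragraph (illustrative helper)."""
    # A .docx is a zip archive; the main text lives in word/document.xml
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> is a paragraph; each <w:t> inside it holds a stretch of text
    for para in root.iter(f"{{{W_NS}}}p"):
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

Calling docx_to_plain_text("report.docx") (a hypothetical file name) returns the words only; everything else in the archive, such as style and settings information, is simply left behind.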
Converting your .doc or .docx files
The easiest method, for multiple .doc or .docx files, is to convert using the Text Converter.
Alternatively, you can do it in Word.
To convert a .doc or .docx into plain text in Word, choose File | Save As | Plain text,
then choose Windows (1 byte per character) or Other encoding -- Unicode (2 bytes).
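The difference between the two encoding choices can be shown with a small sketch. It rests on assumptions: Python's "cp1252" codec stands in for the Windows option and "utf-16" for the Unicode option, and the function name encode_plain_text is invented for this example. Word's own output may differ in detail.

```python
def encode_plain_text(text, use_unicode):
    """Encode text roughly as the two Save As options would (illustrative helper)."""
    # "cp1252": Windows Western European, 1 byte per character (assumed equivalent)
    # "utf-16": 2 bytes per ordinary character, plus a 2-byte marker at the start
    encoding = "utf-16" if use_unicode else "cp1252"
    return text.encode(encoding)


word = "café"
print(len(encode_plain_text(word, use_unicode=False)))  # 4 bytes
print(len(encode_plain_text(word, use_unicode=True)))   # 10 bytes: 2-byte marker + 4 x 2
```

The 1-byte option gives smaller files but can only represent the characters in its own table; text in, say, Japanese or Russian cannot be encoded in cp1252 and raises an error in this sketch, so the Unicode option is the safer choice for such texts.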