MS Word™ documents

A .doc or .docx file contains a lot of extra coding besides the plain text words: for example, the name of your printer, the owner of the software, and information about styles. For accurate results, WordSmith needs clean text from which this extra coding has been removed.
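To see what that extra coding looks like in practice, here is a minimal Python sketch, not part of WordSmith. It relies on the fact that a .docx file is a ZIP archive whose main text is stored as XML in word/document.xml. The function name docx_to_plain_text is hypothetical, and the sketch deliberately ignores headers, footnotes, tabs and table layout. It does not handle the older binary .doc format.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements such as w:p and w:t.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_plain_text(path):
    """Return the paragraph text of a .docx file, one paragraph per line.

    Illustrative helper only (hypothetical name); styles, printer details
    and other metadata stored elsewhere in the archive are simply skipped.
    """
    with zipfile.ZipFile(path) as archive:
        # The main document body lives in word/document.xml.
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each run of visible text is held in a w:t element.
        runs = [t.text or "" for t in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Listing the archive with zipfile.ZipFile(path).namelist() shows the other parts (styles, settings, document properties) that make up the extra coding described above.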


Converting your .doc or .docx files

The easiest method, especially for multiple .doc or .docx files, is to convert them using the Text Converter.


Alternatively you can do it in Word


To convert a .doc or .docx file into plain text in Word:

Choose File | Save As | Plain text,



then choose Windows (1 byte per character)


or Other encoding -- Unicode (2 bytes per character).
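The difference between the two choices can be illustrated with a short Python sketch. It assumes that the "Windows" option corresponds to the Windows-1252 code page (the usual default on Western European systems) and that "Unicode (2 bytes)" corresponds to UTF-16. Both are assumptions about Word's behaviour, not something stated on this page. The helper name fits_windows_1252 is hypothetical.

```python
sample = "café"

# Windows code page: one byte for each character.
one_byte = sample.encode("cp1252")

# UTF-16 (little-endian, no byte-order mark): two bytes for each of these characters.
two_byte = sample.encode("utf-16-le")

print(len(sample), len(one_byte), len(two_byte))


def fits_windows_1252(text):
    """Return True if every character in text exists in Windows-1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False
```

Under these assumptions, text containing characters outside the Windows code page (for example Polish "żółw") cannot be saved losslessly with the 1-byte option, which is a reason to prefer the Unicode option for such texts.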






