Text_converter_whole_files_other

 

Unix to Windows

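Unix-saved texts don't use the same codes for end-of-paragraph as Windows-saved ones: Unix uses a single line feed (LF), whereas Windows uses a carriage return followed by a line feed (CR+LF).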
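
As a rough illustration of what such a conversion involves, here is a minimal Python sketch. It is not WordSmith's own code; the function name unix_to_windows is made up for this example. It turns every bare line feed into a CR+LF pair and leaves pairs that are already in Windows form untouched.

```python
import re


def unix_to_windows(data: bytes) -> bytes:
    """Replace bare LF line endings with CR+LF.

    The negative lookbehind skips any LF that already follows a CR,
    so text that is partly in Windows form is not given doubled CRs.
    """
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


if __name__ == "__main__":
    sample = b"first paragraph\nsecond paragraph\r\nthird paragraph\n"
    print(unix_to_windows(sample))
```

Working on bytes rather than decoded text means the sketch does not need to know the encoding of the file, as long as it is an ASCII-compatible one.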

 

encrypting using

... allows you to encrypt your text files. You supply your own password in the box to the right. When WordSmith processes your text files, e.g. when running a concordance, it will restore the text as needed, but otherwise the text will be unintelligible. Encrypted files get the file extension .WSencrypted. For example, if your original is wonderful.txt, the copy will be wonderful.WSencrypted. Requires the safer "copy to" button above to be selected.

 

lemmatising using

... converts each file using a lemma file. If for example your source text has "she was tired" and your lemma file has BE -> AM, WAS, WERE, IS, ARE, then you will get "she be tired" in your converted text file. Where your source text has "Was she tired?" you'll get "Be she tired?"
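
The substitution described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not WordSmith's implementation: it assumes each lemma-file line has the form shown in the example (headword, "->", comma-separated variants), and the helper names build_lookup, match_case and lemmatise are invented for this sketch.

```python
import re


def build_lookup(lemma_lines):
    """Map each variant form (lower-cased) to its headword (lower-cased)."""
    lookup = {}
    for line in lemma_lines:
        if "->" not in line:
            continue  # skip blank or malformed lines
        head, _, variants = line.partition("->")
        for variant in variants.split(","):
            variant = variant.strip()
            if variant:
                lookup[variant.lower()] = head.strip().lower()
    return lookup


def match_case(original, replacement):
    """Give the replacement the same capitalisation pattern as the original."""
    if len(original) > 1 and original.isupper():
        return replacement.upper()
    if original[0].isupper():
        return replacement.capitalize()
    return replacement.lower()


def lemmatise(text, lookup):
    """Replace every word found in the lookup with its headword."""
    def swap(match):
        word = match.group(0)
        lemma = lookup.get(word.lower())
        return match_case(word, lemma) if lemma else word

    # [^\W\d_]+ matches runs of letters only (no digits or underscores).
    return re.sub(r"[^\W\d_]+", swap, text)


if __name__ == "__main__":
    lookup = build_lookup(["BE -> AM, WAS, WERE, IS, ARE"])
    print(lemmatise("she was tired", lookup))
    print(lemmatise("Was she tired?", lookup))
```

Matching is done on lower-cased forms and the original capitalisation is re-applied afterwards, which is what turns "Was" into "Be" rather than "be".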

 

SRT Transcripts

... converts SRT files (subtitle transcripts) such as those obtained from the TED Open Translation Project. If using TED files, you may need to add some seconds to allow for the standard TED lead-in.
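
To show what adding some seconds means for an SRT file, the Python sketch below shifts every timing line by a fixed offset. It is an illustration only and does not reproduce WordSmith's converter; the function names and the 15-second offset in the usage example are arbitrary assumptions, not the actual length of the TED lead-in.

```python
import re
from datetime import timedelta

# SRT timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_seconds):
    """Return the matched timestamp moved by offset_seconds, never below zero."""
    h, m, s, ms = (int(group) for group in match.groups())
    moved = timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms)
    moved += timedelta(seconds=offset_seconds)
    total_ms = max(0, moved // timedelta(milliseconds=1))
    h, rest = divmod(total_ms, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def shift_srt(text, offset_seconds):
    """Shift the start and end times on every timing line of an SRT text."""
    shifted = []
    for line in text.splitlines():
        if "-->" in line:  # only timing lines, not the subtitle text itself
            line = TIMESTAMP.sub(
                lambda match: shift_timestamp(match, offset_seconds), line
            )
        shifted.append(line)
    return "\n".join(shifted)


if __name__ == "__main__":
    sample = "1\n00:00:12,500 --> 00:00:15,000\nHello everyone.\n"
    print(shift_srt(sample, 15))  # 15 is an arbitrary example offset
```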

 

Example

These text files in English (.en), Spanish (.es), Italian (.it) and Japanese (.ja) originally downloaded

 

[Screenshot: SRT files]

got converted thus:

[Screenshot: SRT files converted]

To enable Concord to play the .mp4 file, I had to rename EloraHardy_2015-480p.mp4 so that it has the same title as the transcripts: Elora Hardy Magical houses made of bamboo.mp4. Note that the converted files are bigger (they have been converted into Unicode) and that their file names no longer have two dots. This is so that Concord will find a match between the video's file name and the transcripts in these 4 languages.

 

 

See also: Convert Entire Texts

 
