Obtaining sound and video files

Sources of sound and video files

WordSmith does not provide or include corpora. However, there are specialised corpora such as NECTE, MICASE and ICE, as well as publicly available sources such as the TED Talks. You are expected to respect copyright provisions in all cases.


There is a lot of useful advice at the TED Open Translation Project, where you will also find transcripts.


These text files in English (.en), Spanish (.es), Italian (.it) and Japanese (.ja) were downloaded from there and later converted using the Text Converter.


SRT files
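
If you want to check what a downloaded SRT transcript contains before converting it, the following Python sketch may help. It assumes the common SubRip layout (a cue number, a timing line such as 00:00:01,000 --> 00:00:04,200, one or more caption lines, then a blank line). It is not the Text Converter and does not reproduce WordSmith's own conversion; the function names parse_srt and to_seconds and the sample text are made up for illustration.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,200".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(h, m, s, ms):
    """Convert hours, minutes, seconds and milliseconds (as strings) to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, caption_text) tuples.

    Hypothetical helper: assumes cues are separated by blank lines.
    """
    cues = []
    normalised = text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalised):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                parts = match.groups()
                # Everything after the timing line is caption text.
                caption = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append(
                    (to_seconds(*parts[:4]), to_seconds(*parts[4:]), caption)
                )
                break
    return cues


# Invented sample, not taken from any real transcript.
sample = """1
00:00:01,000 --> 00:00:04,200
Hello and welcome.

2
00:00:04,500 --> 00:00:07,000
Today we look at
corpus tools.
"""

cues = parse_srt(sample)
for start, end, caption in cues:
    print(f"{start:.1f}-{end:.1f}\t{caption}")
```

Each tuple pairs a caption with its start and end time in seconds, which is the information needed to line up a stretch of transcript with the matching point in a sound or video file.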



If you wish to use a transcript and sound file format which is incompatible with the syntax described here, please contact us.
