Display


As webgetter works, it shows the URLs visited. URLs that are greyed out were either too small to be of use or haven't been contacted yet. URLs in dark blue were saved to disk. Above, you will see the number of bytes visited, and every time a file which meets your requirements is stored, you'll see the number of files and the number of words go up. At the bottom are the current time and the elapsed time.

 

There is a tab giving access to a list of the successfully downloaded files.

 

Here is a partial list of what I got with a broadband connection, in 1 minute and 1 second, using the search term "history of the English language" (with quotes).

 

    [Screenshot: webgetter_output2]

 

As you can see, about 1.3MB of web-pages were examined, and 90,000 words (1.1MB) were found worth saving with the default settings (each page had to be at least 10K in size and contain at least 300 words). In that time I got a couple of time-outs, presumably because 20 seconds isn't long enough for some slow websites or servers.
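To make the default thresholds concrete, here is a minimal Python sketch of that kind of check. It is not webgetter's own code: the function name worth_saving, the reading of "10K" as 10 x 1024 bytes, the UTF-8 byte count and the word-splitting rule are all my assumptions, chosen only for illustration.

```python
import re

# Assumed defaults, taken from the figures quoted above.
MIN_BYTES = 10 * 1024  # "10K", read here as 10 x 1024 bytes (an assumption)
MIN_WORDS = 300


def worth_saving(page_text, min_bytes=MIN_BYTES, min_words=MIN_WORDS):
    """Return True if a page passes both the size and the word-count threshold.

    Hypothetical helper: webgetter may measure size and count words differently.
    """
    size_in_bytes = len(page_text.encode("utf-8"))
    # Count runs of letters, digits or underscores as words (a simplification).
    word_count = len(re.findall(r"\w+", page_text))
    return size_in_bytes >= min_bytes and word_count >= min_words


if __name__ == "__main__":
    long_page = "word " * 3000   # 15,000 bytes and 3,000 words
    short_page = "word " * 300   # 300 words, but only 1,500 bytes
    print(worth_saving(long_page))
    print(worth_saving(short_page))
```

Under these assumptions a page must pass both tests, so a page with 300 words that is well under 10K would still be skipped and shown greyed out.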

 

 

See also: Settings, Limitations