Custom Processing
This feature, which like the API is not for those without a tame programmer to help, is found under Adjust Settings | Advanced.
The point of it… I cannot know which criteria you have in processing your texts, other than the criteria already set up (the choice of texts, of search-word, etc.). You might need to do some specialised checks or alterations of data before it enters the WordSmith formats. For example, you might need to lemmatise a word according to the special requirements of your language. This function makes that possible. If, for example, you have chosen to filter concordances, then as Concord processes your text files, every time it finds a match for your search-word it will call your .dll file. It will tell your .dll what it has found, and give it a chance to alter the result or to tell Concord to ignore this one.
How to do it… Choose your .dll file (it can have any filename you've chosen for it) and check one or more of the options in the Advanced page. WordSmith will call standard functions in your .dll, so you need to know their names and formats. It is up to you to write your own .dll program which can do the job you want. This can be written in any programming language able to build a Windows .dll which exports such functions (C++, Pascal, etc.).
An example for lemmatising a word in WordList
The following DLL is supplied with your installation, compiled & ready to run.
Your .dll needs to contain a function with the following specification:
function WordlistChangeWord( original : pointer; language_identifier : DWORD; is_Unicode : WordBool) : pointer; stdcall;
The language_identifier is a number corresponding to the language you're working with. See List of Locale ID (LCID) Values as Assigned by Microsoft.
So the "original" (sent by WordSmith) can be a PCHAR (7 or 8-bit) or a PWIDECHAR (16-bit Unicode), and the result which your .dll supplies can point to
a) nil (if you simply do not want the original word in your list)
b) the same PCHAR/PWIDECHAR if it is not to be changed at all
c) a replacement form
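The three possible results can be seen side by side in a small sketch. This is only an illustration, not the supplied .dll: the library name and the words chosen (THE dropped, WENT lemmatised to GO) are invented for the purpose, and the helper SameWord is hypothetical.

```pascal
library LemmaSketchDLL;  { hypothetical name, for illustration only }

uses
  Windows;

const
  { Typed constants have static storage, so the pointers handed back
    to WordSmith remain valid after the function has returned. }
  GoA : PAnsiChar = 'GO';
  GoW : PWideChar = 'GO';

{ Hypothetical helper: True if the word sent by WordSmith matches the
  candidate, ignoring case. CompareStringA and CompareStringW return 2
  when the two strings are equal. }
function SameWord(original : pointer; lang : DWORD; is_Unicode : WordBool;
  candidateA : PAnsiChar; candidateW : PWideChar) : boolean;
begin
  if is_Unicode then
    Result := CompareStringW(lang, NORM_IGNORECASE,
      PWideChar(original), -1, candidateW, -1) = 2
  else
    Result := CompareStringA(lang, NORM_IGNORECASE,
      PAnsiChar(original), -1, candidateA, -1) = 2;
end;

function WordlistChangeWord(original : pointer;
  language_identifier : DWORD; is_Unicode : WordBool) : pointer; stdcall;
begin
  if SameWord(original, language_identifier, is_Unicode, 'THE', 'THE') then
    Result := nil                { (a) leave this word out of the list }
  else if SameWord(original, language_identifier, is_Unicode, 'WENT', 'WENT') then
  begin
    if is_Unicode then           { (c) supply a replacement form }
      Result := GoW
    else
      Result := GoA;
  end
  else
    Result := original;          { (b) hand the word back unchanged }
end;

exports
  WordlistChangeWord;

begin
end.
```

Note that the replacement is returned in the same width as the word received: a PWIDECHAR when is_Unicode is true, otherwise a PCHAR.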
Here's an example where the source text was:
Today is Easter Day.
The source code for the .dll in Delphi is this:
************************************
library WordSmith4CustomDLL;

uses
  Windows, SysUtils;

{ This example uses a very straightforward Windows routine for comparing
  strings, CompareStringA and CompareStringW which are in a Windows .dll.

  The function does a case-insensitive comparison because NORM_IGNORECASE (=1)
  is used. If it was replaced by 0, the comparison would be case-sensitive.

  In this example, EASTER gets changed to CHRISTMAS. }

function WordlistChangeWord(
  original : pointer;
  language_identifier : DWORD;
  is_Unicode : WordBool) : pointer; stdcall;
begin
  Result := original;
  { CompareStringW and CompareStringA return 2 when the strings are equal }
  if is_Unicode then begin
    if CompareStringW(
         language_identifier,
         NORM_IGNORECASE,
         PWideChar(original), -1,
         PWideChar(widestring('EASTER')), -1) - 2 = 0 then
      Result := pwidechar(widestring('CHRISTMAS'));
  end else begin
    if CompareStringA(
         language_identifier,
         NORM_IGNORECASE,
         PChar(original), -1,
         PChar('EASTER'), -1) - 2 = 0 then
      Result := pchar('CHRISTMAS');
  end;
end;

function ConcordChangeWord(
  original : pointer;
  language_identifier : DWORD;
  is_Unicode : WordBool) : pointer; stdcall;
begin
  Result := WordlistChangeWord(original, language_identifier, is_unicode);
end;

function KeyWordsChangeWord(
  original : pointer;
  language_identifier : DWORD;
  is_Unicode : WordBool) : pointer; stdcall;
begin
  Result := WordlistChangeWord(original, language_identifier, is_unicode);
end;

function HandleConcordanceLine(
  source_line : pointer;
  hit_position, hit_length : word;
  byte_position_in_file, language_id : DWORD;
  is_Unicode : WordBool;
  filename : pchar) : pointer; stdcall;

  { Tab-separated details appended to each line written out }
  function extrasA : string;
  begin
    Result := #9 + pchar(filename) +
              #9 + IntToStr(byte_position_in_file) +
              #9 + IntToStr(hit_position) +
              #9 + IntToStr(hit_length);
  end;

  function extrasW : widestring;
  begin
    Result := extrasA;
  end;

var
  f : TextFile;
  output_file : string;
begin
  Result := source_line;
  { The output file is named after the running program and sits beside it }
  output_file := ChangeFileExt(ParamStr(0), '') + '_user_dll_concordance_lines.txt';
  { Write only if the program is not on a network (UNC) path and the
    drive has more than 2,048,000 bytes free }
  if (not IsPathDelimiter(ExpandUNCFileName(ParamStr(0)), 1)) and
     (DiskFree(Ord(UpCase(output_file[1])) - 64) > 1024 * 2000) then
  try
    if FileExists(output_file) then begin
      AssignFile(f, output_file);
      Append(f);
    end else begin
      AssignFile(f, output_file);
      Rewrite(f);
    end;
    if is_Unicode then
      Writeln(f, pwidechar(source_line) + extrasW)
    else
      Writeln(f, pchar(source_line) + extrasA);
    Flush(f);
    CloseFile(f);
  except
  end;
end;

exports
  ConcordChangeWord,
  KeyWordsChangeWord,
  WordlistChangeWord,
  HandleConcordanceLine;

begin
end.
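
Before choosing a custom .dll in the Advanced page, it may be worth checking it from a small console program of your own. The sketch below is not part of WordSmith: the program name and the file name MyCustom.dll are assumptions, to be replaced with your own. It loads the .dll with the standard Windows routines LoadLibrary and GetProcAddress, sends the 8-bit word Easter to WordlistChangeWord, and reports what comes back.

```pascal
program TestCustomDLL;  { hypothetical harness, not part of WordSmith }

{$APPTYPE CONSOLE}

uses
  Windows;

type
  { Same parameters and calling convention as WordlistChangeWord }
  TChangeWord = function(original : pointer; language_identifier : DWORD;
    is_Unicode : WordBool) : pointer; stdcall;

const
  DllName = 'MyCustom.dll';   { assumed file name; substitute your own }
  EnglishUK = 2057;           { LCID $0809, English (United Kingdom) }

var
  lib : HMODULE;
  ChangeWord : TChangeWord;
  answer : pointer;
begin
  lib := LoadLibrary(DllName);
  if lib = 0 then
    Writeln('Could not load ', DllName)
  else
  begin
    @ChangeWord := GetProcAddress(lib, 'WordlistChangeWord');
    if not Assigned(ChangeWord) then
      Writeln('WordlistChangeWord is not exported by ', DllName)
    else
    begin
      { An 8-bit string is sent, so is_Unicode is False }
      answer := ChangeWord(PAnsiChar('Easter'), EnglishUK, False);
      if answer = nil then
        Writeln('Easter -> (word dropped)')
      else
        Writeln('Easter -> ', PAnsiChar(answer));
    end;
    FreeLibrary(lib);
  end;
end.
```

The result is printed before FreeLibrary is called, because a replacement form returned by the .dll may live inside the .dll itself and would no longer be readable once it is unloaded.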
See also: API, custom settings