Custom Processing


 

This feature -- which, like API, is not for those without a tame programmer to help -- is found under Adjust Settings | Advanced.

 

The point of it…

I cannot know which criteria you have in processing your texts, other than the criteria already set up (the choice of texts, of search-word, etc.). You might need to do some specialised checks or alteration of data before it enters the WordSmith formats. For example, you might need to lemmatise a word according to the special requirements of your language.

This function makes that possible. If for example you have chosen to filter concordances, as Concord processes your text files, every time it finds a match for your search-word, it will call your .dll file. It'll tell your own .dll what it has found, and give it a chance to alter the result or tell Concord to ignore this one.

 

How to do it…

Choose your .dll file (it can have any filename you like) and check one or more of the options in the Advanced page. WordSmith calls standard functions in your .dll, so you need to know their names and formats and export them exactly as specified. It is up to you to write your own .dll program which can do the job you want. This can be written in any programming language which can build a standard Windows .dll (C++, Pascal, etc.).

 

An example for lemmatising a word in WordList

 

The following DLL is supplied with your installation, compiled & ready to run.

 

Your .dll needs to contain a function with the following specifications:

 

function WordlistChangeWord(
  original : pointer;
  language_identifier : DWORD;
  is_Unicode : WordBool) : pointer; stdcall;

 

The language_identifier is a number corresponding to the language you're working with. See List of Locale ID (LCID) Values as Assigned by Microsoft.
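
If your .dll only needs to know the language and not the regional variant, it can look at the low 10 bits of the identifier, which hold Microsoft's primary language number. The following is a minimal sketch of that idea and not part of the supplied .dll; the helper name PrimaryLanguage and the two constants are my own, chosen for illustration.

```pascal
program LanguageCheckSketch;
{$APPTYPE CONSOLE}

uses
  Windows;

const
  // Primary language numbers as assigned by Microsoft.
  PRIMARY_GERMAN  = $07;
  PRIMARY_ENGLISH = $09;

// Hypothetical helper: the low 10 bits of a language identifier
// hold the primary language; the remaining bits hold the variant.
function PrimaryLanguage(language_identifier : DWORD) : DWORD;
begin
  Result := language_identifier and $3FF;
end;

begin
  // $0809 is English (United Kingdom), $0407 is German (Germany).
  if PrimaryLanguage($0809) = PRIMARY_ENGLISH then
    Writeln('$0809 is a variety of English');
  if PrimaryLanguage($0407) = PRIMARY_GERMAN then
    Writeln('$0407 is a variety of German');
end.
```

A function such as WordlistChangeWord could call a helper like this on its language_identifier parameter and apply different rules for each language.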

 

So the "original" (sent by WordSmith) can be a PCHAR (7 or 8-bit) or a PWIDECHAR (16-bit Unicode), as indicated by is_Unicode, and the result which your .dll supplies can point to:

 

a) nil (if you simply do not want the original word in your list)

b) the same PCHAR/PWIDECHAR if it is not to be changed at all

c) a replacement form
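
The three possible results can be sketched in a few lines. This is only an illustration under assumptions of my own: the library name LemmaSketch, the helper SameWordW and the rules themselves (UM is dropped, WENT becomes GO) are hypothetical, and for brevity the sketch handles Unicode text only and leaves 8-bit text unchanged.

```pascal
library LemmaSketch;

uses
  Windows;

// Hypothetical helper: case-insensitive comparison of two 16-bit strings.
// CompareStringW returns 2 (CSTR_EQUAL) when the strings match.
function SameWordW(a, b : PWideChar; lcid : DWORD) : boolean;
begin
  Result := CompareStringW(lcid, NORM_IGNORECASE, a, -1, b, -1) = 2;
end;

function WordlistChangeWord(
  original : pointer;
  language_identifier : DWORD;
  is_Unicode : WordBool) : pointer; stdcall;
const
  // Typed constants stay in memory for as long as the .dll is loaded,
  // so it is safe to hand pointers to them back to WordSmith.
  DropW  : PWideChar = 'UM';
  FromW  : PWideChar = 'WENT';
  LemmaW : PWideChar = 'GO';
begin
  Result := original;                 // b) not to be changed at all
  if not is_Unicode then
    Exit;                             // 8-bit text is not handled in this sketch
  if SameWordW(PWideChar(original), DropW, language_identifier) then
    Result := nil                     // a) leave this word out of the list
  else if SameWordW(PWideChar(original), FromW, language_identifier) then
    Result := LemmaW;                 // c) a replacement form
end;

exports
  WordlistChangeWord;

begin
end.
```

A real lemmatiser would look the word up in a table or apply rules suited to your language instead of comparing against one or two fixed words.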

 

Here's an example where the source text was

 

Today is Easter Day.

 

 

[Screenshot: custom_processingEASTER]

 

The source code for the .dll in Delphi is this:

 

************************************

library WordSmith4CustomDLL;

uses
  Windows, SysUtils;

{
  This example uses a very straightforward Windows routine for comparing
  strings, CompareStringA and CompareStringW which are in a Windows .dll.

  The function does a case-insensitive comparison because
  NORM_IGNORECASE (=1) is used. If it was replaced by 0, the comparison
  would be case-sensitive.

  In this example, EASTER gets changed to CHRISTMAS.
}

 

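function WordlistChangeWord(
  original : pointer;
  language_identifier : DWORD;
  is_Unicode : WordBool) : pointer; stdcall;
const
  { Typed constants stay in memory for as long as the .dll is loaded, so
    pointers to them are still valid after the function returns. A pointer
    to a temporary widestring would not be.
    PAnsiChar is the 8-bit type called PChar in older versions of Delphi. }
  SearchA  : PAnsiChar = 'EASTER';
  ReplaceA : PAnsiChar = 'CHRISTMAS';
  SearchW  : PWideChar = 'EASTER';
  ReplaceW : PWideChar = 'CHRISTMAS';
begin
  Result := original;
  { CompareString returns 2 (CSTR_EQUAL) when the two strings match }
  if is_Unicode then begin
    if CompareStringW(
      language_identifier,
      NORM_IGNORECASE,
      PWideChar(original), -1,
      SearchW, -1) = 2
    then
      Result := ReplaceW;
  end else begin
    if CompareStringA(
      language_identifier,
      NORM_IGNORECASE,
      PAnsiChar(original), -1,
      SearchA, -1) = 2
    then
      Result := ReplaceA;
  end;
end;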

 

function ConcordChangeWord(
  original : pointer;
  language_identifier : DWORD;
  is_Unicode : WordBool) : pointer; stdcall;
begin
  Result := WordlistChangeWord(original, language_identifier, is_Unicode);
end;

function KeyWordsChangeWord(
  original : pointer;
  language_identifier : DWORD;
  is_Unicode : WordBool) : pointer; stdcall;
begin
  Result := WordlistChangeWord(original, language_identifier, is_Unicode);
end;

 

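function HandleConcordanceLine(
  source_line : pointer;
  hit_position,
  hit_length : word;
  byte_position_in_file,
  language_id : DWORD;
  is_Unicode : WordBool;
  filename : PAnsiChar) : pointer; stdcall;
  { PAnsiChar is the 8-bit type called PChar in older versions of Delphi }

  { Tab-separated details added to the end of each saved line }
  function extrasA : string;
  begin
    Result := #9 + string(filename) +
              #9 + IntToStr(byte_position_in_file) +
              #9 + IntToStr(hit_position) +
              #9 + IntToStr(hit_length);
  end;

  function extrasW : widestring;
  begin
    Result := extrasA;
  end;

var
  f : TextFile;
  output_file : string;
begin
  { The concordance line itself is passed back unchanged }
  Result := source_line;
  { The output file is named after the program which loaded the .dll }
  output_file := ChangeFileExt(ParamStr(0), '') +
    '_user_dll_concordance_lines.txt';
  { Write only if the program is not on a network (UNC) path and
    the drive has about 2 MB free }
  if (not IsPathDelimiter(ExpandUNCFileName(ParamStr(0)), 1)) and
     (DiskFree(Ord(UpCase(output_file[1])) - 64) > 1024 * 2000) then
  try
    AssignFile(f, output_file);
    if FileExists(output_file) then
      Append(f)
    else
      Rewrite(f);
    try
      if is_Unicode then
        Writeln(f, WideString(PWideChar(source_line)) + extrasW)
      else
        Writeln(f, string(PAnsiChar(source_line)) + extrasA);
    finally
      { CloseFile flushes the text buffer and runs even if Writeln fails }
      CloseFile(f);
    end;
  except
    { File errors are ignored so that Concord is not interrupted }
  end;
end;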

 

exports
  ConcordChangeWord,
  KeyWordsChangeWord,
  WordlistChangeWord,
  HandleConcordanceLine;

begin
end.
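
Before choosing your .dll in the Advanced page, your programmer may like to check it outside WordSmith. The following is a minimal sketch of a console program which loads a .dll and calls WordlistChangeWord the way the specification above describes. The program name, the file name WordSmithCustomDLL.dll and the test word are assumptions of mine; substitute your own.

```pascal
program TestCustomDLL;
{$APPTYPE CONSOLE}

uses
  Windows;

type
  // Same parameters and calling convention as the specification above.
  TChangeWord = function(
    original : pointer;
    language_identifier : DWORD;
    is_Unicode : WordBool) : pointer; stdcall;

const
  DllName   = 'WordSmithCustomDLL.dll';  // hypothetical; use your own file name
  EnglishUK = $0809;                     // LCID for English (United Kingdom)

var
  lib        : THandle;
  ChangeWord : TChangeWord;
  word_in    : WideString;
  answer     : PWideChar;
begin
  lib := LoadLibrary(DllName);
  if lib = 0 then
    Writeln('Could not load ', DllName)
  else
  try
    @ChangeWord := GetProcAddress(lib, 'WordlistChangeWord');
    if not Assigned(ChangeWord) then
      Writeln('WordlistChangeWord is not exported by ', DllName)
    else begin
      word_in := 'Easter';
      // Pass the word as 16-bit Unicode, as WordSmith would.
      answer := ChangeWord(PWideChar(word_in), EnglishUK, True);
      if answer = nil then
        Writeln('(the word would be left out)')
      else
        Writeln(WideString(answer));
    end;
  finally
    FreeLibrary(lib);
  end;
end.
```

If the .dll behaves like the example above, the word Easter should come back as CHRISTMAS. The test program must be compiled for the same platform (32-bit or 64-bit) as the .dll it loads.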

 

See also : API, custom settings