In this section of text A0T, there are two entity tags: &bquo; at the start of the sentence and &equo; at the end.
<s n="131"><c PUQ>&bquo;<w DT0>Most <w NN2>churches <w VBB>are <w AV0>completely <w AJ0>unprepared
<w PRP>for <w AT0>the <w NN1>shock <w PRF>of <w VVG>finding <w AT0>an <w AJ0>established <w NN1>member
<w PRF>of <w AT0>the <w NN1>congregation <w VBZ>is <w VVN>infected <w PRP>with <w NP0>HIV <w CJC>or
<w VVG-AJ0>dying <w PRP>with <w NN1>AIDS<c PUN>, <w CJS>even though <w DT0>this <w VBZ>is
<w AV0>increasingly <w AJ0>common<c PUN>.<c PUQ>&equo;
These two entities, &bquo; and &equo;, represent single quotation marks: the beginning quote and the ending quote.
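To make the idea of an entity concrete, here is a minimal Python sketch that swaps the two entities for real quote characters. The function name `replace_quote_entities` and the mapping table are hypothetical, made up for illustration. Mapping &equo; to U+2019 is an assumption. This is not how WordSmith itself does the conversion.

```python
# Hypothetical mapping from the two BNC quote entities to Unicode characters.
# U+2018 for &bquo; follows the text; U+2019 for &equo; is an assumption.
QUOTE_ENTITIES = {
    "&bquo;": "\u2018",  # left single quotation mark
    "&equo;": "\u2019",  # right single quotation mark
}


def replace_quote_entities(text: str) -> str:
    """Return text with each known quote entity replaced by one character."""
    for entity, char in QUOTE_ENTITIES.items():
        text = text.replace(entity, char)
    return text


# Two short fragments taken from the A0T sample above.
opening = '<s n="131"><c PUQ>&bquo;<w DT0>Most <w NN2>churches'
closing = '<w AJ0>common<c PUN>.<c PUQ>&equo;'

converted_opening = replace_quote_entities(opening)
converted_closing = replace_quote_entities(closing)
print(converted_opening)
print(converted_closing)
```

All the other tags (<s>, <c>, <w>) are left untouched; only the two entities change.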
In the XML edition viewed using Internet Explorer, the opening quote looks like a rather ugly left-leaning single quote mark, but if we examine the same text file carefully using File Viewer, we find that the ` mark is really represented by three codes: E2 80 98.
What is happening is that Internet Explorer (or a similar browser) reads the strange-looking E2 80 98 sequence as the UTF-8 encoding of a single character and displays it as `.
If we convert the whole text into Unicode, the sequence of three codes is converted into a single one (2018), which looks like `.
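The relationship between the three codes and the single code 2018 can be checked with a few lines of Python. This is only a sketch: it assumes that "Unicode" here means a 16-bit (UTF-16) encoding, and the variable names are my own.

```python
# The three bytes seen in the XML edition.
raw = bytes([0xE2, 0x80, 0x98])

# Decoding them as UTF-8 gives exactly one character.
char = raw.decode("utf-8")
code_point = f"{ord(char):04X}"

# Assumption: the converted text is stored as UTF-16 (little-endian here),
# so the same character now takes two bytes instead of three.
utf16 = char.encode("utf-16-le")
utf16_hex = " ".join(f"{b:02X}" for b in utf16)

print(len(raw), "bytes in UTF-8")
print(len(char), "character, code point", code_point)
print("UTF-16 bytes:", utf16_hex)
```

The code point reported is 2018, the left single quotation mark, which matches the single code that appears after conversion.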
In WordSmith, you can use Text Converter to get the BNC into a more useful format.
Page url: http://www.lexically.net/wordsmith/Handling_BNC/index.html?overviewentities.htm