
In this section of text A0T, there are two entity tags, &bquo; at the start and &equo; at the end (marked in bold in the original page).


<s n="131"><c PUQ>&bquo;<w DT0>Most <w NN2>churches <w VBB>are <w AV0>completely <w AJ0>unprepared

<w PRP>for <w AT0>the <w NN1>shock <w PRF>of <w VVG>finding <w AT0>an <w AJ0>established <w NN1>member

<w PRF>of <w AT0>the <w NN1>congregation <w VBZ>is <w VVN>infected <w PRP>with <w NP0>HIV <w CJC>or

<w VVG-AJ0>dying <w PRP>with <w NN1>AIDS<c PUN>, <w CJS>even though <w DT0>this <w VBZ>is

<w AV0>increasingly <w AJ0>common<c PUN>.<c PUQ>&equo;


These represent quotation marks: &bquo; is a single beginning (opening) quote and &equo; is a single ending (closing) quote.
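As a rough illustration of what replacing these entities involves, here is a minimal Python sketch. The mapping to U+2018 and U+2019 is an assumption based on the description above (single opening and closing quotes), and the names `ENTITY_MAP` and `replace_entities` are hypothetical, not part of WordSmith or the BNC distribution.

```python
import re

# Assumed mapping: the text describes these entities as single opening and
# closing quotes, so they are mapped to U+2018 and U+2019 here.
ENTITY_MAP = {
    "bquo": "\u2018",  # beginning quote
    "equo": "\u2019",  # ending quote
}


def replace_entities(text):
    """Replace known &name; references; leave unknown ones unchanged."""
    def _sub(match):
        return ENTITY_MAP.get(match.group(1), match.group(0))
    return re.sub(r"&([A-Za-z0-9]+);", _sub, text)


# Start and end of the sample sentence shown above.
start = "<c PUQ>&bquo;<w DT0>Most <w NN2>churches"
end = "<w AJ0>common<c PUN>.<c PUQ>&equo;"
print(replace_entities(start))
print(replace_entities(end))
```

Entities that are not in the table are passed through untouched, so the sketch does not silently damage markup it does not know about.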


In the XML edition viewed using Internet Explorer, the opening quote looks like a rather ugly left-leaning single quote mark, but if we examine the same text file carefully using File Viewer, we find that the ` mark is really represented by three byte codes, E2 80 98 (highlighted in green in the original screenshot).



What is happening is that Internet Explorer (or a similar browser) translates the E2 80 98 sequence, which is the UTF-8 encoding of the single character U+2018 (left single quotation mark), into `.
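The same translation can be reproduced outside a browser. The short Python sketch below decodes the three bytes as UTF-8, and for comparison as Windows-1252, a single-byte encoding chosen here purely as an example of what a program unaware of UTF-8 might show.

```python
# The three bytes seen in the file viewer.
raw = bytes.fromhex("E28098")

# Decoded as UTF-8, the three bytes form one character.
as_utf8 = raw.decode("utf-8")
print(len(raw), len(as_utf8), f"U+{ord(as_utf8):04X}")

# Decoded with a single-byte encoding (cp1252 is an illustrative choice),
# each byte becomes a separate, unrelated character.
as_cp1252 = raw.decode("cp1252")
print(len(as_cp1252), [f"U+{ord(ch):04X}" for ch in as_cp1252])
```

The first decode yields one character, U+2018; the second yields three, which is why the sequence looks strange when the bytes are inspected one at a time.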


If we convert the whole text into Unicode and look at the character above the cursor (in the original screenshot), we see that the sequence of three codes has been converted into one (2018), which looks like `.
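To see the three codes collapse into one, the following sketch re-encodes the character as UTF-16. It assumes that "Unicode" here means a 16-bit encoding such as little-endian UTF-16; that is an assumption for illustration, and the helper name `utf8_to_utf16` is hypothetical.

```python
def utf8_to_utf16(data):
    """Re-encode UTF-8 bytes as little-endian UTF-16 (assumed target format)."""
    return data.decode("utf-8").encode("utf-16-le")


utf8_bytes = bytes.fromhex("E28098")
utf16_bytes = utf8_to_utf16(utf8_bytes)

# Three UTF-8 bytes become a single 16-bit code unit, 0x2018,
# stored low byte first in little-endian order.
code_unit = int.from_bytes(utf16_bytes, "little")
print(len(utf8_bytes), len(utf16_bytes), f"{code_unit:04X}")
```

The stored bytes are 18 20 because of the little-endian byte order; read as one 16-bit value they give 2018.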


In WordSmith, you can use Text Converter to get the BNC into a more useful format.

