Overview


In this extract from text A0T there are two entity tags: &bquo; at the start and &equo; at the end.

 

<s n="131"><c PUQ>&bquo;<w DT0>Most <w NN2>churches <w VBB>are <w AV0>completely <w AJ0>unprepared

<w PRP>for <w AT0>the <w NN1>shock <w PRF>of <w VVG>finding <w AT0>an <w AJ0>established <w NN1>member

<w PRF>of <w AT0>the <w NN1>congregation <w VBZ>is <w VVN>infected <w PRP>with <w NP0>HIV <w CJC>or

<w VVG-AJ0>dying <w PRP>with <w NN1>AIDS<c PUN>, <w CJS>even though <w DT0>this <w VBZ>is

<w AV0>increasingly <w AJ0>common<c PUN>.<c PUQ>&equo;

 

These represent the opening and closing single quotation marks.
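As a rough illustration of what these two entities stand for, the sketch below swaps them for the corresponding quote characters. The mapping to U+2018 and U+2019 is an assumption based on the description above, the function name is hypothetical, and the sample line is a shortened version of the extract. This is not how WordSmith itself handles the entities.

```python
# Assumed mapping, inferred from the description above:
# &bquo; = opening single quote, &equo; = closing single quote.
ENTITY_MAP = {
    "&bquo;": "\u2018",  # LEFT SINGLE QUOTATION MARK
    "&equo;": "\u2019",  # RIGHT SINGLE QUOTATION MARK
}


def replace_quote_entities(text: str) -> str:
    """Replace the two quote entities with the characters they stand for."""
    for entity, char in ENTITY_MAP.items():
        text = text.replace(entity, char)
    return text


# Shortened version of the A0T extract shown earlier.
sample = "<c PUQ>&bquo;<w DT0>Most <w NN2>churches <w AJ0>common<c PUN>.<c PUQ>&equo;"
converted = replace_quote_entities(sample)
print(converted)
```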

 

In the XML edition viewed in Internet Explorer, the opening quote looks like a rather ugly left-leaning single quotation mark:

 

XML_entity

 

But if we examine the same text file carefully using File Viewer, we find that the ‘ mark is really stored as the three byte codes shown in green below.

 

XML_entity_in_File_Viewer

What is happening is that Internet Explorer (or a similar program) reads the byte sequence E2 80 98, which is the UTF-8 encoding of a single character, and displays it as ‘.
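The translation described above can be checked with a few lines of Python. This is only a sketch of the decoding step: it takes the three bytes E2 80 98, decodes them as UTF-8, and reports which character results. The variable names are illustrative.

```python
# The three bytes seen in File Viewer.
raw = bytes([0xE2, 0x80, 0x98])

# Decoding them as UTF-8 yields a single character.
char = raw.decode("utf-8")

# Format its code point in the usual U+XXXX notation.
code_point = f"U+{ord(char):04X}"

print(len(raw), "bytes ->", len(char), "character:", char, code_point)
```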

 

If we convert the whole text into Unicode, we get this:

 

XML_entity_in_File_Viewer_Unicode

If you look at the character above the cursor, you can see that the sequence of three codes has been converted into a single one (2018), which looks like ‘.
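The same conversion can be sketched in Python, assuming that "Unicode" here means UTF-16 and that the source file is UTF-8. The helper name `utf8_to_utf16` is hypothetical. The three bytes collapse into one 16-bit unit with the value 2018; in a little-endian file the two bytes of that unit are stored in the order 18 20.

```python
def utf8_to_utf16(data: bytes, byte_order: str = "le") -> bytes:
    """Re-encode UTF-8 bytes as UTF-16 (assumed target format).

    byte_order is "le" (little-endian) or "be" (big-endian).
    """
    return data.decode("utf-8").encode(f"utf-16-{byte_order}")


quote_utf8 = b"\xe2\x80\x98"  # the three codes E2 80 98

# Big-endian makes the code value easy to read as hex digits.
as_be = utf8_to_utf16(quote_utf8, "be").hex()
# Little-endian shows the byte order as stored in such a file.
as_le = utf8_to_utf16(quote_utf8, "le").hex()

print("UTF-8:", quote_utf8.hex(), "UTF-16 BE:", as_be, "UTF-16 LE:", as_le)
```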

 

In WordSmith, you can use Text Converter to get the BNC into a more useful format.

 

Page url: http://www.lexically.net/wordsmith/Handling_BNC/index.html?overviewentities.htm