
The other space character… ASCII 160

I recently discovered that the space character in ASCII has an evil twin. Like all evil twins, it looks exactly the same, but it does evil…

Here they are in SSMS:

[Image: spacesSideBySide]

They look the same – why is there another space character in ASCII?

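The second space is a non-breaking space character, so any program displaying the text should not break the line at that point. Strictly speaking, code 160 lies outside 7-bit ASCII, which stops at 127; it comes from extended character sets such as ISO-8859-1 and Windows-1252, and it is U+00A0 in Unicode.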
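To make the difference visible, here is a small Python sketch (Python is my choice for illustration, and the helper name `describe` is made up) that prints the code point and Unicode name of each character:

```python
import unicodedata

SPACE = " "       # ordinary space, code 32 (U+0020)
NBSP = "\u00a0"   # non-breaking space, code 160 (U+00A0)


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"{ord(ch)} {unicodedata.name(ch)}"


print(describe(SPACE))  # 32 SPACE
print(describe(NBSP))   # 160 NO-BREAK SPACE
```

The two characters render identically but compare as different, which is why a visual inspection of the data does not help.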

Why is ASCII 160 evil?

The other day I was trying to extract some data into XML, but I kept getting an illegal XML character error.

[Image: illegalCharacter]

The source data had a single ASCII 160 character in it, but all the other spaces were ASCII 32. Even once you find the row of data that contains the non-breaking space, it still takes a lot of work to find where in the text the character actually is!
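As a rough sketch of that hunt outside SQL Server, assuming the rows have already been pulled into Python strings (the sample rows and the function name `find_nbsp` are made up for illustration), something like this reports the row and position of each non-breaking space:

```python
NBSP = "\u00a0"  # non-breaking space, code 160


def find_nbsp(rows):
    """Yield (row_index, char_position) for every non-breaking space found."""
    for row_index, text in enumerate(rows):
        for position, ch in enumerate(text):
            if ch == NBSP:
                yield row_index, position


# Hypothetical sample data: only the second row hides a non-breaking space.
rows = ["plain text", "evil\u00a0twin", "more plain text"]
print(list(find_nbsp(rows)))  # [(1, 4)]
```

Positions are zero-based here, so `(1, 4)` means the fifth character of the second row.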

How can you solve this error?

If you intend to extract the data as XML, the best fix is to make sure that non-breaking spaces are cleaned out of the data as it goes into your system.

Another possibility is to replace non-breaking spaces with ordinary (breaking) spaces in the output.

[Image: replace160]
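The same replacement idea can be sketched in Python, assuming the text is available as an ordinary string before it is written out (the function name `replace_nbsp` is hypothetical, and this is not the SSMS query from the screenshot):

```python
NBSP = "\u00a0"  # non-breaking space, code 160


def replace_nbsp(text: str) -> str:
    """Swap every non-breaking space for an ordinary space (code 32)."""
    return text.replace(NBSP, " ")


cleaned = replace_nbsp("evil\u00a0twin")
print(cleaned == "evil twin")  # True
print(NBSP in cleaned)         # False
```

Note that this changes the data: any line-breaking hint the original author intended is lost, which may or may not matter to the receiving system.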

Or, if the system that will receive the XML is happy with it, you could change the encoding to UTF-16.

[Image: illegalCharToUTF16]
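For comparison, here is a minimal Python sketch of the UTF-16 route using the standard library's `xml.etree.ElementTree`. The element name `row` and the function name `to_utf16_xml` are assumptions for illustration; the SQL Server approach in the screenshot may differ.

```python
import xml.etree.ElementTree as ET


def to_utf16_xml(value: str) -> bytes:
    """Serialise a single value as a UTF-16 encoded XML document."""
    root = ET.Element("row")  # hypothetical element name
    root.text = value
    # With a non-UTF-8 encoding, tostring() emits an XML declaration
    # naming the encoding, and the bytes start with a byte order mark.
    return ET.tostring(root, encoding="utf-16")


value = "evil\u00a0twin"
data = to_utf16_xml(value)

# The non-breaking space survives a round trip through the parser.
print(ET.fromstring(data).text == value)  # True
```

The non-breaking space is kept intact rather than replaced, so this only helps if the receiving system really does accept UTF-16.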