Archive of posts tagged UTF-8

Filtering MS Word Text

A common annoyance when dealing with user-supplied content is the way MS Word substitutes characters that are non-standard (at least in terms of the web). Among others, these include the directional (a.k.a. “smart”) quotes. The problem occurs when you output text that contains those characters as a result of a user copying and pasting [...]
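
As a rough illustration of the kind of filtering the title refers to, here is a minimal Python sketch that swaps a few Word-style characters for plain ASCII equivalents. It is not taken from the post: the function name `filter_word_text`, the choice of Python, and the particular set of characters in the mapping are all assumptions made for the example, and the text is assumed to have already been decoded to a Unicode string.

```python
# Hypothetical mapping of characters Word commonly substitutes to ASCII
# stand-ins. The selection is an assumption, not an exhaustive list.
WORD_CHARS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# str.maketrans accepts a dict of single-character keys mapped to
# replacement strings of any length.
_TABLE = str.maketrans(WORD_CHARS)


def filter_word_text(text: str) -> str:
    """Return text with the mapped Word-style characters replaced."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(filter_word_text("\u201cIt\u2019s done\u201d \u2026 maybe"))
```

Whether to replace these characters or to keep them and serve the page with a matching encoding is a design choice; this sketch only covers the replacement route.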