In previous web applications I've built, users have entered unusual characters into forms that were then stored incorrectly in the database, and that sometimes came back altered or double-encoded when retrieved and displayed in the browser. I'm starting a new project now, and I want to prevent these issues right from the start.
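To make the symptom concrete, here is a minimal Python sketch of how double encoding can arise. It is only an illustration: the Latin-1 misread in the middle is my assumption about what one layer (a database connection, a driver, or a form handler) might have been doing, not a confirmed diagnosis of my earlier applications.

```python
# Minimal reproduction of double encoding, for illustration only.
# Assumption: one layer misreads UTF-8 bytes as Latin-1.

original = "caf\u00e9"  # "cafe" with an acute accent on the e, 4 characters

# Step 1: the browser submits the text correctly encoded as UTF-8.
utf8_bytes = original.encode("utf-8")

# Step 2: a misconfigured layer decodes those bytes as Latin-1,
# turning the 2-byte accented letter into 2 separate characters.
misread = utf8_bytes.decode("latin-1")

# Step 3: the misread text is encoded as UTF-8 again before storage.
double_encoded = misread.encode("utf-8")

# Reading the stored bytes back as UTF-8 yields the garbled form.
displayed = double_encoded.decode("utf-8")

print(len(original), len(displayed))  # 4 5

# The damage is reversible only if the wrong step is known exactly.
repaired = displayed.encode("latin-1").decode("utf-8")
```

The point of the sketch is that every hop has to agree on the encoding: a single layer that assumes something other than UTF-8 is enough to corrupt the text, even when both the database and the page are set to UTF-8.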
What I'm looking for is a checklist of things I can do to prevent character encoding issues like these, no matter what users enter into forms. If I set my database tables to UTF-8 and set all of my web pages to declare their content as UTF-8, is that enough? Will some characters still appear differently from how the user entered them? Should I add client-side validation that prevents users from entering certain characters?