Character Set Encoding – It Matters!
Lab: PHP String Building and Processing Basics
Video Runtime: 11:42
Character set encoding matters because many of the characters used in the world's languages are not part of the standard ASCII character set. Functions such as strlen will not behave the way you expect when used with extended characters such as €, ©, ®, à, ë, and many others. Languages such as Mandarin and Russian require a different approach when working with strings.
In this episode, let’s talk about why the encoding matters and how the character set is really represented in the computer. Hint: Everything in the computer breaks down into binary, i.e. ones and zeros.
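To make the "ones and zeros" hint concrete, here is a minimal sketch that prints the bits of each byte in a string. The helper name toBinaryBytes is made up for this illustration, and the example assumes the text is encoded as UTF-8 (the "\u{...}" escape requires PHP 7 or later and always produces UTF-8 bytes).

```php
<?php
// Hypothetical helper for illustration: shows each byte of a string as 8 bits.
function toBinaryBytes(string $text): string
{
    $bits = [];
    // str_split() with no length argument splits the string into single bytes.
    foreach (str_split($text) as $byte) {
        $bits[] = str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
    }
    return implode(' ', $bits);
}

// An ASCII letter fits in one byte.
echo toBinaryBytes('A'), PHP_EOL;        // 01000001

// The euro sign (U+20AC) takes three bytes in UTF-8.
echo toBinaryBytes("\u{20AC}"), PHP_EOL; // 11100010 10000010 10101100
```

The letter A is stored as a single byte, while the euro sign needs three. That difference is the reason byte-based string functions can surprise you.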
Simple string functions like strlen count bytes, not characters. The single-byte characters are the digits 0-9, the letters a-z and A-Z, punctuation (e.g. exclamation mark, period or dot, opening and closing parentheses, asterisk, comma, hyphen, etc.), operators (such as plus, minus, divide, multiply, greater than, less than, equals, etc.), and control characters. Once you move into the extended character set, the characters become multibyte: in an encoding such as UTF-8, each one takes more than one byte to store. Because strlen counts bytes, it will not return the number of characters you expect for strings that contain them.
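The following sketch compares the byte count with the character count for a few strings. It assumes UTF-8 text and that PHP's mbstring extension is installed. mb_strlen is used here as one common way to count characters and is not necessarily the approach taken in the video.

```php
<?php
// The "\u{...}" escapes (PHP 7+) produce UTF-8 bytes regardless of how this file is saved.
$words = [
    'hello',                                      // ASCII only: 1 byte per character
    "caf\u{E9}",                                  // café: é takes 2 bytes in UTF-8
    "\u{20AC}100",                                // €100: € takes 3 bytes in UTF-8
    "\u{43F}\u{440}\u{438}\u{432}\u{435}\u{442}", // привет: 2 bytes per letter
];

foreach ($words as $word) {
    // strlen() counts bytes; mb_strlen() counts characters in the given encoding.
    printf(
        "%s: strlen=%d, mb_strlen=%d\n",
        $word,
        strlen($word),
        mb_strlen($word, 'UTF-8')
    );
}
```

For the ASCII-only word both functions agree. For the other three, strlen reports 5, 6, and 12 bytes, while mb_strlen reports 4, 4, and 6 characters.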
Additional Resources
Keep It Simple, Stupid (KISS) - the best kiss you'll get in code.
Episodes
Total Lab Runtime: 02:50:33
- 1 Lab Introduction (free) 09:39
- 2 Embedding Variables in a String (pro) 15:16
- 3 Embedding Complex Variables (pro) 13:37
- 4 Concatenating Strings with a Dot (pro) 08:47
- 5 Concatenating and Assigning Shorthand (pro) 05:58
- 6 Formatting a String using Placeholders (pro) 15:16
- 7 Specifying Which Argument in a Formatted String (pro) 04:18
- 8 Has Substring (pro) 21:47
- 9 Replacing Substrings (pro) 13:03
- 10 Get the String's Length (pro) 14:11
- 11 Character Set Encoding - It Matters! (pro) 11:42
- 12 Has Substring - for UTF-8 (pro) 16:45
- 13 Replacing a UTF-8 Substring (pro) 05:56
- 14 Stripping out Characters or Entities (pro) 10:51
- 15 Wrap it Up (free) 03:27