ASCII

ASCII encodes 128 specified characters as 7-bit integers (values 0 to 127). https://en.wikipedia.org/wiki/ASCII
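
A minimal Python sketch of the 7-bit limit. The helper name to_ascii_bytes is made up for illustration; it only wraps the standard str.encode call.

```python
def to_ascii_bytes(text: str) -> bytes:
    """Encode text as ASCII; raises UnicodeEncodeError for non-ASCII input."""
    return text.encode("ascii")


encoded = to_ascii_bytes("Hi!")
print(list(encoded))                    # [72, 105, 33]
print(all(b < 0x80 for b in encoded))   # True: every value fits in 7 bits

try:
    to_ascii_bytes("é")  # U+00E9 is outside the 0..127 range
except UnicodeEncodeError:
    print("'é' is not an ASCII character")
```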

UTF-8

UTF-8 is a variable-width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes per code point. https://en.wikipedia.org/wiki/UTF-8
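
The figure 1,112,064 can be checked by hand: Unicode spans U+0000 to U+10FFFF, and the surrogate range U+D800 to U+DFFF is excluded because those values are reserved for UTF-16. A small Python sketch of that arithmetic (the constant names are illustrative only):

```python
# Unicode code space runs from U+0000 to U+10FFFF inclusive.
TOTAL_CODE_POINTS = 0x10FFFF + 1           # 1,114,112

# Surrogates U+D800..U+DFFF are not valid for UTF-8 encoding.
SURROGATE_COUNT = 0xDFFF - 0xD800 + 1      # 2,048

VALID_CODE_POINTS = TOTAL_CODE_POINTS - SURROGATE_COUNT
print(VALID_CODE_POINTS)                   # 1112064
```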

The first 128 characters (US-ASCII) need one byte.

The next 1,920 characters need two bytes to encode. This covers the remainder of almost all Latin alphabets, and also Greek, Cyrillic, Coptic, Armenian, Hebrew, Arabic, Syriac and Tāna alphabets, as well as Combining Diacritical Marks.

Three bytes are needed for characters in the rest of the Basic Multilingual Plane, which contains virtually all characters in common use, including most Chinese, Japanese and Korean (CJK) characters.

Four bytes are needed for characters in the other planes of Unicode, which include less common CJK characters, various historic scripts, mathematical symbols, and emoji (pictographic symbols).
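
The four size classes above can be observed directly in Python. This is only a sketch: utf8_length is a hypothetical helper, and the sample characters were picked to land in each range.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


# One sample per size class: ASCII, Latin-1 Supplement, CJK (BMP), emoji.
samples = ["A", "é", "中", "😀"]
for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
# U+0041 -> 1 byte(s)
# U+00E9 -> 2 byte(s)
# U+4E2D -> 3 byte(s)
# U+1F600 -> 4 byte(s)

# The two-byte range is U+0080..U+07FF, which gives the 1,920 figure.
print(0x07FF - 0x0080 + 1)  # 1920
```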

UTF8 in MySQL

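Prefer utf8mb4 for all new tables and connections.

utf8 (an alias for utf8mb3): at most 3 bytes per character, so it covers only the Basic Multilingual Plane and cannot store 4-byte characters such as emoji.

utf8mb4: at most 4 bytes per character, so it covers every Unicode code point.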
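
To see which strings need the 4-byte charset, one option is to check for code points above U+FFFF before writing to the database. The sketch below is plain Python and does not talk to MySQL; fits_in_3_bytes is a hypothetical helper name.

```python
def fits_in_3_bytes(text: str) -> bool:
    """True if every character encodes to at most 3 UTF-8 bytes.

    Characters above U+FFFF (outside the Basic Multilingual Plane)
    need 4 bytes and therefore need the utf8mb4 charset.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


print(fits_in_3_bytes("héllo 中文"))   # True: all characters are in the BMP
print(fits_in_3_bytes("hello 😀"))     # False: the emoji needs 4 bytes
```

With utf8mb4 this check is unnecessary, which is the practical reason to default to it.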