Reposted from the PHP manual: comparing substr, mb_substr and mb_strcut
I found this function to be extremely useful.
Here is a practical example, showing the difference between substr(), mb_substr() and mb_strcut():
<?php
mb_internal_encoding('UTF-8');
$string = 'cioèòà';
var_dump(
substr($string, 0, 6),
mb_substr($string, 0, 6),
mb_strcut($string, 0, 6)
);
?>
Output:
string(6) "cioè?"
string(9) "cioèòà"
string(5) "cioè"
Explanation:
$string is 9 bytes long:
c - 1 byte
i - 1 byte
o - 1 byte
è - 2 bytes
ò - 2 bytes
à - 2 bytes
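The byte counts above can be checked directly. This is a minimal sketch, assuming PHP 7.4 or later (for mb_str_split()) with the mbstring extension loaded; the string is written with \u{...} escapes only so that the result does not depend on how the file is saved.

```php
<?php
mb_internal_encoding('UTF-8');

// Same text as 'cioèòà', spelled with code point escapes (è = U+00E8, ò = U+00F2, à = U+00E0).
$string = "cio\u{E8}\u{F2}\u{E0}";

// strlen() counts bytes; mb_strlen() counts characters.
$bytes = strlen($string);
$chars = mb_strlen($string);

// Split into characters and record how many bytes each one occupies.
$sizes = [];
foreach (mb_str_split($string) as $char) {
    $sizes[] = strlen($char);
}

printf("%d bytes, %d characters\n", $bytes, $chars);   // 9 bytes, 6 characters
echo implode(' + ', $sizes), ' = ', array_sum($sizes), "\n"; // 1 + 1 + 1 + 2 + 2 + 2 = 9
```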
substr() works with bytes, so it returns a string which is exactly 6 bytes long. The sixth byte is only the first half of the ò character, so the result ends in an incomplete byte sequence (shown as ? in the output).
mb_substr(), instead, works with characters, so it returns a string which is exactly 6 characters long (in this case 9 bytes, the whole string).
mb_strcut() counts bytes like substr(), but if the cut would fall in the middle of a multibyte character, it omits that character entirely. Here that leaves 5 bytes.
When you use
$string = mb_strcut($string, 0, 6);
you can be sure that strlen($string) <= 6, and that no multibyte character will be cut in half.
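To see that guarantee hold at different limits, here is a small sketch. The helper name truncate_to_bytes() is hypothetical (it is not part of PHP), and the code assumes PHP 7.0 or later with mbstring and UTF-8 input.

```php
<?php
// Hypothetical helper (not a PHP built-in): cut $text to at most $maxBytes bytes
// without splitting a multibyte character.
function truncate_to_bytes(string $text, int $maxBytes, string $encoding = 'UTF-8'): string
{
    return mb_strcut($text, 0, $maxBytes, $encoding);
}

// Same text as 'cioèòà', spelled with code point escapes.
$string = "cio\u{E8}\u{F2}\u{E0}";

foreach ([3, 4, 5, 6, 7] as $limit) {
    $cut = truncate_to_bytes($string, $limit);
    printf(
        "limit %d -> %d bytes, valid UTF-8: %s\n",
        $limit,
        strlen($cut),
        mb_check_encoding($cut, 'UTF-8') ? 'yes' : 'no'
    );
}
// limit 3 -> 3 bytes, valid UTF-8: yes
// limit 4 -> 3 bytes, valid UTF-8: yes
// limit 5 -> 5 bytes, valid UTF-8: yes
// limit 6 -> 5 bytes, valid UTF-8: yes
// limit 7 -> 7 bytes, valid UTF-8: yes

// For comparison, substr() at the same limit leaves a dangling lead byte.
var_dump(mb_check_encoding(substr($string, 0, 6), 'UTF-8')); // bool(false)
```

This is the kind of cut you would want before storing text in a field whose limit is expressed in bytes rather than characters.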
I hope this comment serves as a simple explanation.