Reposted from the PHP manual: comparing substr, mb_substr and mb_strcut

By Symphony - Last updated: Monday, August 24, 2009

I found this function to be extremely useful.

Here is a practical example, showing the difference between substr(), mb_substr() and mb_strcut():

<?php
mb_internal_encoding('UTF-8');
$string = 'cioèòà';
var_dump(
substr($string, 0, 6),
mb_substr($string, 0, 6),
mb_strcut($string, 0, 6)
);
?>

Output:
string(6) "cioè?"
string(9) "cioèòà"
string(5) "cioè"

Explanation:
$string is 9 bytes long:
c - 1 byte
i - 1 byte
o - 1 byte
è - 2 bytes
ò - 2 bytes
à - 2 bytes
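
If you want to check these byte counts yourself, here is a small sketch. It assumes the mbstring extension is loaded, PHP 7.4 or later (for mb_str_split()), and a source file saved as UTF-8; the $breakdown variable is just an illustrative name.

```php
<?php
// Assumes the mbstring extension is loaded, PHP 7.4+ (for mb_str_split),
// and that this file is saved as UTF-8.
mb_internal_encoding('UTF-8');

$string = 'cioèòà';

// Split into characters, then measure each one in bytes with strlen().
$breakdown = [];
foreach (mb_str_split($string) as $char) {
    $breakdown[] = [$char, strlen($char)];
    echo $char, ' - ', strlen($char), " byte(s)\n";
}

echo 'strlen: ', strlen($string), "\n";       // 9 (bytes)
echo 'mb_strlen: ', mb_strlen($string), "\n"; // 6 (characters)
```

strlen() always counts bytes, while mb_strlen() counts characters in the internal encoding, which is why the two totals differ for this string.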

substr() works with bytes, so it returns a string which is exactly 6 bytes long. Thus, it truncates the ò character.
mb_substr(), instead, works with characters, so it returns a string which is exactly 6 characters long (9 bytes in this case).
mb_strcut() counts bytes like substr(), but if the cut would fall in the middle of a multibyte character, it omits that character entirely. That is why it returns only 5 bytes here.

When you use
$string = mb_strcut($string, 0, 6);
you know for sure that strlen($string) <= 6, and that no multibyte character has been cut in half.
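
A minimal sketch of that guarantee, again assuming mbstring is loaded and the file is saved as UTF-8. The truncateToBytes() helper is a hypothetical name of mine, not something from the PHP manual; it just wraps mb_strcut() with a start offset of 0.

```php
<?php
mb_internal_encoding('UTF-8');

/**
 * Hypothetical helper: trim a UTF-8 string so that it fits in $maxBytes
 * bytes without splitting a multibyte character.
 */
function truncateToBytes(string $text, int $maxBytes): string
{
    // Start at byte 0 and keep at most $maxBytes bytes.
    return mb_strcut($text, 0, $maxBytes, 'UTF-8');
}

$string = 'cioèòà';
$short  = truncateToBytes($string, 6);

var_dump(strlen($short) <= 6);                // bool(true)
var_dump(mb_check_encoding($short, 'UTF-8')); // bool(true): no partial character

// substr() keeps the first byte of ò, so its result is not valid UTF-8.
var_dump(mb_check_encoding(substr($string, 0, 6), 'UTF-8')); // bool(false)
```

This kind of helper could be handy when a storage field has a limit expressed in bytes rather than characters.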

I hope this comment serves as a simple explanation.
