r45734 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r45734
Date:17:54, 14 January 2009
Author:simetrical
Status:ok (Comments)
Tags:
Comment:
(bug 16852) padleft and padright now handle multibyte characters and multicharacter pad strings

Patch by RememberTheDot, with adjustments to comments by me
Modified paths:
  • /trunk/phase3/RELEASE-NOTES (modified) (history)
  • /trunk/phase3/includes/parser/CoreParserFunctions.php (modified) (history)
  • /trunk/phase3/maintenance/parserTests.txt (modified) (history)

Diff

Index: trunk/phase3/RELEASE-NOTES
@@ -34,7 +34,9 @@
 * Added "__\" magic word to eat up all whitespace and newlines to the next
 non-whitespace character, to facilitate writing readable template code where
 whitespace is significant.
-* (bug 17002) Add &minor= and &summary= as parameters in the url when editing, to automatically add a summary or a minor edit.
+* (bug 17002) Add &minor= and &summary= as parameters in the url when editing,
+ to automatically add a summary or a minor edit.
+* (bug 16852) padleft and padright now accept multiletter pad characters
 
 === Bug fixes in 1.15 ===
 * Fixing the caching issue by using -{T|xxx}- syntax (only applies on wiki with LanguageConverter class)
@@ -42,6 +44,7 @@
 * (bug 16968) Special:Upload no longer throws useless warnings.
 * (bug 15470) Special:Upload no longer force-capitalizes titles
 * (bug 17000) Special:RevisionDelete now checks if the database is locked before trying to delete the edit.
+* (bug 16852) padleft and padright now handle multibyte characters correctly
 
 == API changes in 1.15 ==
 * (bug 16798) JSON encoding errors for some characters outside the BMP
Index: trunk/phase3/maintenance/parserTests.txt
@@ -7233,6 +7233,24 @@
 </p>
 !! end
 
+!! test
+Multibyte character in padleft
+!! input
+{{padleft:-Hello|7|Æ}}
+!! result
+<p>Æ-Hello
+</p>
+!! end
+
+!! test
+Multibyte character in padright
+!! input
+{{padright:Hello-|7|Æ}}
+!! result
+<p>Hello-Æ
+</p>
+!! end
+
 #
 #
 #
Index: trunk/phase3/includes/parser/CoreParserFunctions.php
@@ -310,20 +310,38 @@
 		return $lang != '' ? $lang : $arg;
 	}
 
-	static function pad( $string = '', $length = 0, $char = 0, $direction = STR_PAD_RIGHT ) {
-		$length = min( max( $length, 0 ), 500 );
-		$char = substr( $char, 0, 1 );
-		return ( $string !== '' && (int)$length > 0 && strlen( trim( (string)$char ) ) > 0 )
-			? str_pad( $string, $length, (string)$char, $direction )
-			: $string;
+	/**
+	 * Unicode-safe str_pad with the restriction that $length is forced to be <= 500
+	 */
+	static function pad( $string, $length, $padding = '0', $direction = STR_PAD_RIGHT ) {
+		$lengthOfPadding = mb_strlen( $padding );
+		if ( $lengthOfPadding == 0 ) return $string;
+
+		# The remaining length to add counts down to 0 as padding is added
+		$length = min( $length, 500 ) - mb_strlen( $string );
+		# $finalPadding is just $padding repeated enough times so that
+		# mb_strlen( $string ) + mb_strlen( $finalPadding ) == $length
+		$finalPadding = '';
+		while ( $length > 0 ) {
+			# If $length < $lengthofPadding, truncate $padding so we get the
+			# exact length desired.
+			$finalPadding .= mb_substr( $padding, 0, $length );
+			$length -= $lengthOfPadding;
+		}
+
+		if ( $direction == STR_PAD_LEFT ) {
+			return $finalPadding . $string;
+		} else {
+			return $string . $finalPadding;
+		}
 	}
 
-	static function padleft( $parser, $string = '', $length = 0, $char = 0 ) {
-		return self::pad( $string, $length, $char, STR_PAD_LEFT );
+	static function padleft( $parser, $string = '', $length = 0, $padding = '0' ) {
+		return self::pad( $string, $length, $padding, STR_PAD_LEFT );
 	}
 
-	static function padright( $parser, $string = '', $length = 0, $char = 0 ) {
-		return self::pad( $string, $length, $char );
+	static function padright( $parser, $string = '', $length = 0, $padding = '0' ) {
+		return self::pad( $string, $length, $padding );
 	}
 
 	static function anchorencode( $parser, $text ) {
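
To try the new padding behaviour outside the parser, the logic in the diff can be copied into a standalone function. The following is only a sketch: pad_sketch() and $remaining are hypothetical names, not part of MediaWiki, and it assumes the mbstring extension with UTF-8 as the internal encoding.

```php
<?php
// Standalone sketch of the patched logic. pad_sketch() is a hypothetical
// name used for illustration; it is not a MediaWiki function.
mb_internal_encoding( 'UTF-8' ); // assumption: input strings are UTF-8

function pad_sketch( $string, $length, $padding = '0', $direction = STR_PAD_RIGHT ) {
	$lengthOfPadding = mb_strlen( $padding );
	if ( $lengthOfPadding == 0 ) {
		return $string;
	}

	// Number of characters still to add; the target length is capped at 500.
	$remaining = min( $length, 500 ) - mb_strlen( $string );
	$finalPadding = '';
	while ( $remaining > 0 ) {
		// mb_substr() truncates the last repetition when fewer characters
		// remain than the pad string contains.
		$finalPadding .= mb_substr( $padding, 0, $remaining );
		$remaining -= $lengthOfPadding;
	}

	return $direction == STR_PAD_LEFT
		? $finalPadding . $string
		: $string . $finalPadding;
}

echo pad_sketch( '-Hello', 7, 'Æ', STR_PAD_LEFT ), "\n"; // Æ-Hello
echo pad_sketch( 'Hello-', 7, 'Æ' ), "\n";               // Hello-Æ
echo pad_sketch( 'x', 6, 'ab', STR_PAD_LEFT ), "\n";     // ababax
```

The first two calls mirror the parser tests added in this revision; the third shows a multicharacter pad string being truncated to reach the exact length.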

Comments

#Comment by Brion VIBBER (talk | contribs)   21:28, 20 January 2009

Looks ok... :D

#Comment by Plustgarten (talk | contribs)   21:17, 13 April 2009

What if the padding character in padleft or padright is an HTML character code? In particular, it sure would be useful if it could be (or include) an &nbsp; (which should count as one character). Could we get this effect by judicious application of something like html_entity_decode() to the padding string, before measuring it?

#Comment by Simetrical (talk | contribs)   21:51, 13 April 2009

Would be sensible. Patches are welcome, post them on Bugzilla if you'd like.
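
The suggestion above could look roughly like this. It is a sketch only, not a patch that was committed: decode_padding_sketch() is a hypothetical helper, UTF-8 is assumed, and whether the decoded string would need re-escaping before output is left open.

```php
<?php
// Hypothetical helper, not MediaWiki code: decode HTML entities in the pad
// string before it is measured, so that "&nbsp;" counts as one character.
function decode_padding_sketch( $padding ) {
	return html_entity_decode( $padding, ENT_QUOTES, 'UTF-8' );
}

$raw = '&nbsp;';
$decoded = decode_padding_sketch( $raw );

echo mb_strlen( $raw, 'UTF-8' ), "\n";     // 6: the entity is measured as literal text
echo mb_strlen( $decoded, 'UTF-8' ), "\n"; // 1: a single U+00A0 no-break space
```

In pad(), such a call would have to run before mb_strlen( $padding ) for the length arithmetic to treat the entity as one character.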
