Highlight the performance advantages of calling `string.split` with a block, instead of `string.split.each` with the same block. Includes other minor formatting corrections.
Returns an array of substrings of +self+
that are the result of splitting +self+
at each occurrence of the given field separator +field_sep+.

When +field_sep+ is <tt>$;</tt>:

- If <tt>$;</tt> is +nil+ (its default value),
  the split occurs just as if +field_sep+ were given as a space character
  (see below).

- If <tt>$;</tt> is a string,
  the split occurs just as if +field_sep+ were given as that string
  (see below).

When +field_sep+ is <tt>' '</tt> and +limit+ is +0+ (its default value),
the split occurs at each sequence of whitespace:

  'abc def ghi'.split(' ')         # => ["abc", "def", "ghi"]
  "abc \n\tdef\t\n ghi".split(' ') # => ["abc", "def", "ghi"]
  'abc  def   ghi'.split(' ')      # => ["abc", "def", "ghi"]
  ''.split(' ')                    # => []

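As a small illustrative sketch (the variable names are hypothetical), the whitespace mode selected by <tt>' '</tt> can be contrasted with a \Regexp that matches a single space. With <tt>' '</tt>, leading whitespace is ignored and runs of whitespace act as one separator; with <tt>/ /</tt>, every single space is a separate split point:

```ruby
padded = ' abc  def '

# Separator ' ': leading whitespace is skipped, runs of whitespace collapse.
by_space_string = padded.split(' ')  # => ["abc", "def"]

# Separator / /: each single space splits, so empty substrings appear.
by_space_regexp = padded.split(/ /)  # => ["", "abc", "", "def"]
```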
When +field_sep+ is a string different from <tt>' '</tt>
and +limit+ is +0+,
the split occurs at each occurrence of +field_sep+;
trailing empty substrings are not returned:

  'abracadabra'.split('ab')   # => ["", "racad", "ra"]
  'aaabcdaaa'.split('a')      # => ["", "", "", "bcd"]
  ''.split('a')               # => []
  '3.14159'.split('1')        # => ["3.", "4", "59"]
  '!@#$%^$&*($)_+'.split('$') # => ["!@#", "%^", "&*(", ")_+"]
  'тест'.split('т')           # => ["", "ес"]
  'こんにちは'.split('に')     # => ["こん", "ちは"]

When +field_sep+ is a Regexp and +limit+ is +0+,
the split occurs at each occurrence of a match;
trailing empty substrings are not returned:

  'abracadabra'.split(/ab/) # => ["", "racad", "ra"]
  'aaabcdaaa'.split(/a/)    # => ["", "", "", "bcd"]
  'aaabcdaaa'.split(//)     # => ["a", "a", "a", "b", "c", "d", "a", "a", "a"]
  '1 + 1 == 2'.split(/\W+/) # => ["1", "1", "2"]

If the \Regexp contains groups, their matches are also included
in the returned array:

  '1:2:3'.split(/(:)()()/, 2) # => ["1", ":", "", "", "2:3"]

As seen above, if +limit+ is +0+,
trailing empty substrings are not returned:

  'aaabcdaaa'.split('a') # => ["", "", "", "bcd"]

If +limit+ is a positive integer +n+, no more than <tt>n - 1</tt>
splits occur, so that at most +n+ substrings are returned,
and trailing empty substrings are included:

  'aaabcdaaa'.split('a', 1) # => ["aaabcdaaa"]
  'aaabcdaaa'.split('a', 2) # => ["", "aabcdaaa"]
  'aaabcdaaa'.split('a', 5) # => ["", "", "", "bcd", "aa"]
  'aaabcdaaa'.split('a', 7) # => ["", "", "", "bcd", "", "", ""]
  'aaabcdaaa'.split('a', 8) # => ["", "", "", "bcd", "", "", ""]

Note that if +field_sep+ is a \Regexp containing groups,
their matches are in the returned array, but do not count toward the limit.

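A minimal sketch of that rule, using hypothetical variable names: with a +limit+ of +2+, both calls perform exactly one split, and the group's match appears as an extra element that does not use up the limit:

```ruby
# No group: one split, two substrings.
without_group = '1:2:3'.split(/:/, 2)    # => ["1", "2:3"]

# One capture group: still one split; the matched ":" is an extra element.
with_group = '1:2:3'.split(/(:)/, 2)     # => ["1", ":", "2:3"]

# No limit: every separator match is captured between the substrings.
unlimited = '1:2:3'.split(/(:)/)         # => ["1", ":", "2", ":", "3"]
```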
If +limit+ is negative, there is no limit on the number of splits
(as when +limit+ is zero),
but trailing empty substrings are included:

  'aaabcdaaa'.split('a', -1) # => ["", "", "", "bcd", "", "", ""]

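The three kinds of +limit+ can be compared side by side. The following is only an illustrative sketch; the comma-separated +record+ string and the variable names are assumptions chosen for the example:

```ruby
record = 'a,b,,'

# limit 0 (the default): trailing empty substrings are dropped.
default_fields = record.split(',')      # => ["a", "b"]

# Negative limit: unlimited splits, trailing empty substrings kept.
all_fields = record.split(',', -1)      # => ["a", "b", "", ""]

# Positive limit 3: at most two splits; the remainder is left unsplit.
capped_fields = record.split(',', 3)    # => ["a", "b", ","]
```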
If a block is given, it is called with each substring,
and the method returns +self+:

  'abc def ghi'.split(' ') {|substring| p substring }

Output:

  "abc"
  "def"
  "ghi"
  => "abc def ghi"

Note that the above example produces the same output as calling +#each+ on
the result of +#split+ with the same block. However, the block form has better
performance because it avoids the creation of an intermediate array.
Note also that the return values differ:

  'abc def ghi'.split(' ').each {|substring| p substring }

Output:

  "abc"
  "def"
  "ghi"
  => ["abc", "def", "ghi"]

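The difference in return values can also be checked directly. This is a small sketch with hypothetical variable names; it collects substring lengths through the block form and confirms that the receiver itself is returned, whereas the +#each+ form returns the intermediate array:

```ruby
text = 'abc def ghi'

# Block form: substrings are yielded one at a time; the receiver is returned.
lengths = []
returned = text.split(' ') { |substring| lengths << substring.length }
same_object = returned.equal?(text)  # => true
lengths                              # => [3, 3, 3]

# #each form: #split first builds an array, and #each returns that array.
array = text.split(' ').each { |substring| substring.length }
array                                # => ["abc", "def", "ghi"]
```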
Related: String#partition, String#rpartition.