spec: be precise about rune/string literals and comments
See #10248 for details. Fixes #10248. Change-Id: I373545b2dca5d1da1c7149eb0a8f6c6dd8071a4c Reviewed-on: https://go-review.googlesource.com/10503 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org>
This commit is contained in:
parent
b1177d390c
commit
37a097519f
@ -1,6 +1,6 @@
|
|||||||
<!--{
|
<!--{
|
||||||
"Title": "The Go Programming Language Specification",
|
"Title": "The Go Programming Language Specification",
|
||||||
"Subtitle": "Version of June 23, 2015",
|
"Subtitle": "Version of July 23, 2015",
|
||||||
"Path": "/ref/spec"
|
"Path": "/ref/spec"
|
||||||
}-->
|
}-->
|
||||||
|
|
||||||
@ -129,27 +129,27 @@ hex_digit = "0" … "9" | "A" … "F" | "a" … "f" .
|
|||||||
<h3 id="Comments">Comments</h3>
|
<h3 id="Comments">Comments</h3>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
There are two forms of comments:
|
Comments serve as program documentation. There are two forms:
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<ol>
|
<ol>
|
||||||
<li>
|
<li>
|
||||||
<i>Line comments</i> start with the character sequence <code>//</code>
|
<i>Line comments</i> start with the character sequence <code>//</code>
|
||||||
and stop at the end of the line. A line comment acts like a newline.
|
and stop at the end of the line.
|
||||||
</li>
|
</li>
|
||||||
<li>
|
<li>
|
||||||
<i>General comments</i> start with the character sequence <code>/*</code>
|
<i>General comments</i> start with the character sequence <code>/*</code>
|
||||||
and continue through the character sequence <code>*/</code>. A general
|
and stop with the first subsequent character sequence <code>*/</code>.
|
||||||
comment containing one or more newlines acts like a newline, otherwise it acts
|
|
||||||
like a space.
|
|
||||||
</li>
|
</li>
|
||||||
</ol>
|
</ol>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
Comments do not nest.
|
A comment cannot start inside a <a href="#Rune_literals">rune</a> or
|
||||||
|
<a href="#String_literals">string literal</a>, or inside a comment.
|
||||||
|
A general comment containing no newlines acts like a space.
|
||||||
|
Any other comment acts like a newline.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
|
|
||||||
<h3 id="Tokens">Tokens</h3>
|
<h3 id="Tokens">Tokens</h3>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
@ -176,11 +176,8 @@ using the following two rules:
|
|||||||
|
|
||||||
<ol>
|
<ol>
|
||||||
<li>
|
<li>
|
||||||
<p>
|
|
||||||
When the input is broken into tokens, a semicolon is automatically inserted
|
When the input is broken into tokens, a semicolon is automatically inserted
|
||||||
into the token stream at the end of a non-blank line if the line's final
|
into the token stream immediately after a line's final token if that token is
|
||||||
token is
|
|
||||||
</p>
|
|
||||||
<ul>
|
<ul>
|
||||||
<li>an
|
<li>an
|
||||||
<a href="#Identifiers">identifier</a>
|
<a href="#Identifiers">identifier</a>
|
||||||
@ -357,9 +354,10 @@ imaginary_lit = (decimals | float_lit) "i" .
|
|||||||
<p>
|
<p>
|
||||||
A rune literal represents a <a href="#Constants">rune constant</a>,
|
A rune literal represents a <a href="#Constants">rune constant</a>,
|
||||||
an integer value identifying a Unicode code point.
|
an integer value identifying a Unicode code point.
|
||||||
A rune literal is expressed as one or more characters enclosed in single quotes.
|
A rune literal is expressed as one or more characters enclosed in single quotes,
|
||||||
Within the quotes, any character may appear except single
|
as in <code>'x'</code> or <code>'\n'</code>.
|
||||||
quote and newline. A single quoted character represents the Unicode value
|
Within the quotes, any character may appear except newline and unescaped single
|
||||||
|
quote. A single quoted character represents the Unicode value
|
||||||
of the character itself,
|
of the character itself,
|
||||||
while multi-character sequences beginning with a backslash encode
|
while multi-character sequences beginning with a backslash encode
|
||||||
values in various formats.
|
values in various formats.
|
||||||
@ -433,6 +431,7 @@ escaped_char = `\` ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | `\` | "'" | `
|
|||||||
'\xff'
|
'\xff'
|
||||||
'\u12e4'
|
'\u12e4'
|
||||||
'\U00101234'
|
'\U00101234'
|
||||||
|
'\'' // rune literal containing single quote character
|
||||||
'aa' // illegal: too many characters
|
'aa' // illegal: too many characters
|
||||||
'\xa' // illegal: too few hexadecimal digits
|
'\xa' // illegal: too few hexadecimal digits
|
||||||
'\0' // illegal: too few octal digits
|
'\0' // illegal: too few octal digits
|
||||||
@ -449,8 +448,8 @@ obtained from concatenating a sequence of characters. There are two forms:
|
|||||||
raw string literals and interpreted string literals.
|
raw string literals and interpreted string literals.
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
Raw string literals are character sequences between back quotes
|
Raw string literals are character sequences between back quotes, as in
|
||||||
<code>``</code>. Within the quotes, any character is legal except
|
<code>`foo`</code>. Within the quotes, any character may appear except
|
||||||
back quote. The value of a raw string literal is the
|
back quote. The value of a raw string literal is the
|
||||||
string composed of the uninterpreted (implicitly UTF-8-encoded) characters
|
string composed of the uninterpreted (implicitly UTF-8-encoded) characters
|
||||||
between the quotes;
|
between the quotes;
|
||||||
@ -461,8 +460,9 @@ are discarded from the raw string value.
|
|||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
Interpreted string literals are character sequences between double
|
Interpreted string literals are character sequences between double
|
||||||
quotes <code>""</code>. The text between the quotes,
|
quotes, as in <code>"bar"</code>.
|
||||||
which may not contain newlines, forms the
|
Within the quotes, any character may appear except newline and unescaped double quote.
|
||||||
|
The text between the quotes forms the
|
||||||
value of the literal, with backslash escapes interpreted as they
|
value of the literal, with backslash escapes interpreted as they
|
||||||
are in <a href="#Rune_literals">rune literals</a> (except that <code>\'</code> is illegal and
|
are in <a href="#Rune_literals">rune literals</a> (except that <code>\'</code> is illegal and
|
||||||
<code>\"</code> is legal), with the same restrictions.
|
<code>\"</code> is legal), with the same restrictions.
|
||||||
@ -484,17 +484,17 @@ interpreted_string_lit = `"` { unicode_value | byte_value } `"` .
|
|||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
<pre>
|
<pre>
|
||||||
`abc` // same as "abc"
|
`abc` // same as "abc"
|
||||||
`\n
|
`\n
|
||||||
\n` // same as "\\n\n\\n"
|
\n` // same as "\\n\n\\n"
|
||||||
"\n"
|
"\n"
|
||||||
""
|
"\"" // same as `"`
|
||||||
"Hello, world!\n"
|
"Hello, world!\n"
|
||||||
"日本語"
|
"日本語"
|
||||||
"\u65e5本\U00008a9e"
|
"\u65e5本\U00008a9e"
|
||||||
"\xff\u00FF"
|
"\xff\u00FF"
|
||||||
"\uD800" // illegal: surrogate half
|
"\uD800" // illegal: surrogate half
|
||||||
"\U00110000" // illegal: invalid Unicode code point
|
"\U00110000" // illegal: invalid Unicode code point
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
|
Loading…
x
Reference in New Issue
Block a user