author Lukas Rytz <lukas.rytz@typesafe.com> 2015-08-24 11:43:35 +0200
committer Lukas Rytz <lukas.rytz@typesafe.com> 2015-08-24 11:43:35 +0200
commit 3d62009a8e2e715fe12981e7c72c5a701ce6bf96
tree 0ce88e1e4780d8f04d371f67c3e1400ffb819220 /spec
parent 3a543d64158e85b65f8998460c832362bdddec4f
parent ab527ce8cc0220443bda5cc3337ebae158c2fe74
Merge pull request #4590 from som-snytt/issue/6810
SI-6810 Disallow EOL in char literal
Diffstat (limited to 'spec')
-rw-r--r-- spec/01-lexical-syntax.md | 49
-rw-r--r-- spec/13-syntax-summary.md | 5
2 files changed, 30 insertions, 24 deletions
diff --git a/spec/01-lexical-syntax.md b/spec/01-lexical-syntax.md
index e26cb796c8..06e3a458a4 100644
--- a/spec/01-lexical-syntax.md
+++ b/spec/01-lexical-syntax.md
@@ -398,40 +398,46 @@ members of type `Boolean`.
### Character Literals
```ebnf
-characterLiteral ::= ‘'’ (printableChar | charEscapeSeq) ‘'’
+characterLiteral ::= ‘'’ (charNoQuoteOrNewline | UnicodeEscape | charEscapeSeq) ‘'’
```
A character literal is a single character enclosed in quotes.
-The character is either a printable unicode character or is described
-by an [escape sequence](#escape-sequences).
+The character can be any Unicode character except the single quote
+delimiter or `\u000A` (LF) or `\u000D` (CR);
+or any Unicode character represented by either a
+[Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).
> ```scala
> 'a' '\u0041' '\n' '\t'
> ```
-Note that `'\u000A'` is _not_ a valid character literal because
-Unicode conversion is done before literal parsing and the Unicode
-character `\u000A` (line feed) is not a printable
-character. One can use instead the escape sequence `'\n'` or
-the octal escape `'\12'` ([see here](#escape-sequences)).
+Note that although Unicode conversion is done early during parsing,
+so that Unicode characters are generally equivalent to their escaped
+expansion in the source text, literal parsing accepts arbitrary
+Unicode escapes, including the character literal `'\u000A'`,
+which can also be written using the escape sequence `'\n'`.
### String Literals
```ebnf
stringLiteral ::= ‘"’ {stringElement} ‘"’
-stringElement ::= printableCharNoDoubleQuote | charEscapeSeq
+stringElement ::= charNoDoubleQuoteOrNewline | UnicodeEscape | charEscapeSeq
```
-A string literal is a sequence of characters in double quotes. The
-characters are either printable unicode character or are described by
-[escape sequences](#escape-sequences). If the string literal
-contains a double quote character, it must be escaped,
-i.e. `"\""`. The value of a string literal is an instance of
-class `String`.
+A string literal is a sequence of characters in double quotes.
+The characters can be any Unicode character except the double quote
+delimiter or `\u000A` (LF) or `\u000D` (CR);
+or any Unicode character represented by either a
+[Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).
+
+If the string literal contains a double quote character, it must be escaped using
+`"\""`.
+
+The value of a string literal is an instance of class `String`.
> ```scala
-> "Hello,\nWorld!"
-> "This string contains a \" character."
+> "Hello, world!\n"
+> "\"Hello,\" replied the world."
> ```
#### Multi-Line String Literals
@@ -443,11 +449,10 @@ multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
A multi-line string literal is a sequence of characters enclosed in
triple quotes `""" ... """`. The sequence of characters is
-arbitrary, except that it may contain three or more consuctive quote characters
-only at the very end. Characters
-must not necessarily be printable; newlines or other
-control characters are also permitted. Unicode escapes work as everywhere else, but none
-of the escape sequences [here](#escape-sequences) are interpreted.
+arbitrary, except that it may contain three or more consecutive quote characters
+only at the very end. In particular, embedded newlines
+are permitted. Unicode escapes work as everywhere else, but none
+of the [escape sequences](#escape-sequences) are interpreted.
> ```scala
> """the present string
diff --git a/spec/13-syntax-summary.md b/spec/13-syntax-summary.md
index 7f73e107de..a4b4aae570 100644
--- a/spec/13-syntax-summary.md
+++ b/spec/13-syntax-summary.md
@@ -57,11 +57,12 @@ floatType ::= ‘F’ | ‘f’ | ‘D’ | ‘d’
booleanLiteral ::= ‘true’ | ‘false’
-characterLiteral ::= ‘'’ (printableChar | charEscapeSeq) ‘'’
+characterLiteral ::= ‘'’ (charNoQuoteOrNewline | UnicodeEscape | charEscapeSeq) ‘'’
stringLiteral ::= ‘"’ {stringElement} ‘"’
| ‘"""’ multiLineChars ‘"""’
-stringElement ::= (printableChar except ‘"’)
+stringElement ::= charNoDoubleQuoteOrNewline
+ | UnicodeEscape
| charEscapeSeq
multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
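
The revised wording can be checked with a small program. The following is a minimal sketch, not part of the commit: the object name `LiteralSketch` and its fields are hypothetical, and it assumes a compiler that implements the rules added in this change (in particular, that the Unicode escape for line feed is accepted inside a character literal). Comments deliberately avoid spelling out Unicode escapes, since a compiler that converts them early would also expand them inside comments.

```scala
// Hypothetical demo object; names are illustrative only.
object LiteralSketch {
  // A Unicode escape and the plain character denote the same Char value.
  val escapedA: Char = '\u0041'

  // The escaped line feed is accepted as a character literal
  // under the revised rule and equals the backslash-n escape.
  val lineFeed: Char = '\u000A'

  // A double quote inside a single-line string literal must be escaped.
  val quoted: String = "\"Hello,\" replied the world."

  // In a multi-line string literal, backslash escape sequences are not
  // interpreted, so this holds four characters: a, backslash, n, b.
  val raw: String = """a\nb"""

  def main(args: Array[String]): Unit = {
    assert(escapedA == 'A')
    assert(lineFeed == '\n')
    assert(quoted.head == '"')
    assert(raw.length == 4)
    println("all literal checks passed")
  }
}
```

A literal containing an actual line break between the single quotes, as opposed to its escape, is what the commit title disallows, so it is left out of the sketch because it would not compile.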