summaryrefslogtreecommitdiff
path: root/spec/01-lexical-syntax.md
diff options
context:
space:
mode:
Diffstat (limited to 'spec/01-lexical-syntax.md')
-rw-r--r--spec/01-lexical-syntax.md46
1 files changed, 27 insertions, 19 deletions
diff --git a/spec/01-lexical-syntax.md b/spec/01-lexical-syntax.md
index 53c8caf745..78f1a1a408 100644
--- a/spec/01-lexical-syntax.md
+++ b/spec/01-lexical-syntax.md
@@ -41,7 +41,7 @@ classes (Unicode general category given in parentheses):
1. Parentheses `‘(’ | ‘)’ | ‘[’ | ‘]’ | ‘{’ | ‘}’ `.
1. Delimiter characters ``‘`’ | ‘'’ | ‘"’ | ‘.’ | ‘;’ | ‘,’ ``.
1. Operator characters. These consist of all printable ASCII characters
- `\u0020` - `\u007F` which are in none of the sets above, mathematical
+ (`\u0020` - `\u007E`) that are in none of the sets above, mathematical
symbols (`Sm`) and other symbols (`So`).
## Identifiers
@@ -49,11 +49,13 @@ classes (Unicode general category given in parentheses):
```ebnf
op ::= opchar {opchar}
varid ::= lower idrest
+boundvarid ::= varid
+ | ‘`’ varid ‘`’
plainid ::= upper idrest
| varid
| op
id ::= plainid
- | ‘`’ stringLiteral ‘`’
+ | ‘`’ { charNoBackQuoteOrNewline | UnicodeEscape | charEscapeSeq } ‘`’
idrest ::= {letter | digit} [‘_’ op]
```
@@ -398,40 +400,46 @@ members of type `Boolean`.
### Character Literals
```ebnf
-characterLiteral ::= ‘'’ (printableChar | charEscapeSeq) ‘'’
+characterLiteral ::= ‘'’ (charNoQuoteOrNewline | UnicodeEscape | charEscapeSeq) ‘'’
```
A character literal is a single character enclosed in quotes.
-The character is either a printable unicode character or is described
-by an [escape sequence](#escape-sequences).
+The character can be any Unicode character except the single quote
+delimiter or `\u000A` (LF) or `\u000D` (CR);
+or any Unicode character represented by either a
+[Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).
> ```scala
> 'a' '\u0041' '\n' '\t'
> ```
-Note that `'\u000A'` is _not_ a valid character literal because
-Unicode conversion is done before literal parsing and the Unicode
-character `\u000A` (line feed) is not a printable
-character. One can use instead the escape sequence `'\n'` or
-the octal escape `'\12'` ([see here](#escape-sequences)).
+Note that although Unicode conversion is done early during parsing,
+so that Unicode characters are generally equivalent to their escaped
+expansion in the source text, literal parsing accepts arbitrary
+Unicode escapes, including the character literal `'\u000A'`,
+which can also be written using the escape sequence `'\n'`.
### String Literals
```ebnf
stringLiteral ::= ‘"’ {stringElement} ‘"’
-stringElement ::= printableCharNoDoubleQuote | charEscapeSeq
+stringElement ::= charNoDoubleQuoteOrNewline | UnicodeEscape | charEscapeSeq
```
-A string literal is a sequence of characters in double quotes. The
-characters are either printable unicode character or are described by
-[escape sequences](#escape-sequences). If the string literal
-contains a double quote character, it must be escaped,
-i.e. `"\""`. The value of a string literal is an instance of
-class `String`.
+A string literal is a sequence of characters in double quotes.
+The characters can be any Unicode character except the double quote
+delimiter or `\u000A` (LF) or `\u000D` (CR);
+or any Unicode character represented by either a
+[Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).
+
+If the string literal contains a double quote character, it must be escaped using
+`"\""`.
+
+The value of a string literal is an instance of class `String`.
> ```scala
-> "Hello,\nWorld!"
-> "This string contains a \" character."
+> "Hello, world!\n"
+> "\"Hello,\" replied the world."
> ```
#### Multi-Line String Literals