From 5e2a7881337e008a7de79914646ebe3b4fcd993e Mon Sep 17 00:00:00 2001 From: Iain McGinniss Date: Wed, 17 Oct 2012 22:18:13 +0100 Subject: preface and lexical syntax chapter converted, other chapters split into their own files --- 12-xml-expressions-and-patterns.md | 144 +++++++++++++++++++++++++++++++++++++ 1 file changed, 144 insertions(+) create mode 100644 12-xml-expressions-and-patterns.md (limited to '12-xml-expressions-and-patterns.md') diff --git a/12-xml-expressions-and-patterns.md b/12-xml-expressions-and-patterns.md new file mode 100644 index 0000000000..2f6756b05f --- /dev/null +++ b/12-xml-expressions-and-patterns.md @@ -0,0 +1,144 @@ +XML Expressions and Patterns +============================ + +{\bf By Burak Emir}\bigskip\bigskip + + +This chapter describes the syntactic structure of XML expressions and patterns. +It follows as closely as possible the XML 1.0 specification \cite{w3c:xml}, +changes being mandated by the possibility of embedding Scala code fragments. + +\section{XML expressions} +XML expressions are expressions generated by the following production, where the +opening bracket `<' of the first element must be in a position to start the lexical +[XML mode](#xml-mode). + +\syntax\begin{lstlisting} +XmlExpr ::= XmlContent {Element} +\end{lstlisting} +Well-formedness constraints of the XML specification apply, which +means for instance that start tags and end tags must match, and +attributes may only be defined once, with the exception of constraints +related to entity resolution. + +The following productions describe Scala's extensible markup language, +designed as close as possible to the W3C extensible markup language +standard. Only the productions for attribute values and character data +are changed. Scala does not support declarations, CDATA +sections or processing instructions. Entity references are not +resolved at runtime. 
+ +\syntax\begin{lstlisting} +Element ::= EmptyElemTag + | STag Content ETag + +EmptyElemTag ::= `<' Name {S Attribute} [S] `/>' + +STag ::= `<' Name {S Attribute} [S] `>' +ETag ::= `</' Name [S] `>' +Content ::= [CharData] {Content1 [CharData]} +Content1 ::= XmlContent + | Reference + | ScalaExpr +XmlContent ::= Element + | CDSect + | PI + | Comment +\end{lstlisting} + +If an XML expression is a single element, its value is a runtime +representation of an XML node (an instance of a subclass of +\lstinline@scala.xml.Node@). If the XML expression consists of more +than one element, then its value is a runtime representation of a +sequence of XML nodes (an instance of a subclass of +\lstinline@scala.Seq[scala.xml.Node]@). + +If an XML expression is an entity reference, a CDATA section, a processing +instruction or a comment, it is represented by an instance of the +corresponding Scala runtime class. + +By default, beginning and trailing whitespace in element content is removed, +and consecutive occurrences of whitespace are replaced by a single space +character \U{0020}. This behavior can be changed to preserve all whitespace +with a compiler option. + +\syntax\begin{lstlisting} +Attribute ::= Name Eq AttValue + +AttValue ::= `"' {CharQ | CharRef} `"' + | `'' {CharA | CharRef} `'' + | ScalaExpr + +ScalaExpr ::= Block + +CharData ::= { CharNoRef } $\mbox{\rm\em without}$ {CharNoRef}`{'CharB {CharNoRef} + $\mbox{\rm\em and without}$ {CharNoRef}`]]>'{CharNoRef} +\end{lstlisting} +XML expressions may contain Scala expressions as attribute values or +within nodes. In the latter case, these are embedded using a single opening +brace `\{' and ended by a closing brace `\}'. To express a single opening brace +within XML text as generated by CharData, it must be doubled. Thus, `\{\{' +represents the XML text `\{' and does not introduce an embedded Scala +expression. 
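The brace rule described above can be illustrated with a minimal sketch. It assumes the `scala.xml` library is available on the classpath; the object name `XmlBraceSketch` and its value names are hypothetical and chosen only for this illustration.

```scala
object XmlBraceSketch {
  val user = "Ada"

  // A single `{` starts an embedded Scala expression inside element content.
  val embedded = <p>user={user}</p>

  // A Scala expression may also be used directly as an attribute value.
  val withAttr = <p id={user}/>

  // A doubled `{{` stands for the literal XML text `{`.
  val literal = <p>brace:{{</p>

  def main(args: Array[String]): Unit = {
    println(embedded)
    println(withAttr)
    println(literal)
  }
}
```

The text segments deliberately carry no leading or trailing whitespace, so the result should not depend on the whitespace handling mentioned earlier.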
+ +\syntax\begin{lstlisting} +BaseChar, Char, Comment, CombiningChar, Ideographic, NameChar, S, Reference + ::= $\mbox{\rm\em ``as in W3C XML''}$ + +Char1 ::= Char $\mbox{\rm\em without}$ `<' | `&' +CharQ ::= Char1 $\mbox{\rm\em without}$ `"' +CharA ::= Char1 $\mbox{\rm\em without}$ `'' +CharB ::= Char1 $\mbox{\rm\em without}$ `{' + +Name ::= XNameStart {NameChar} + +XNameStart ::= `_' | BaseChar | Ideographic + $\mbox{\rm\em (as in W3C XML, but without }$ `:'$\mbox{\rm\em )}$ + +\end{lstlisting} +\section{XML patterns}\label{sec:xml-pats} +XML patterns are patterns generated by the following production, where +the opening bracket `<' of the element patterns must be in a position +to start the lexical [XML mode](#xml-mode). + +\syntax\begin{lstlisting} +XmlPattern ::= ElemPattern +\end{lstlisting}%{ElementPattern} +Well-formedness constraints of the XML specification apply. + +An XML pattern has to be a single element pattern. It %expects the type of , and +matches exactly those runtime +representations of an XML tree +that have the same structure as described by the pattern. %If an XML pattern +%consists of more than one element, then it expects the type of sequences +%of runtime representations of XML trees, and matches every sequence whose +%elements match the sequence described by the pattern. +XML patterns may contain Scala patterns (\sref{sec:pattern-match}). + +Whitespace is treated the same way as in XML expressions. Patterns +that are entity references, CDATA sections, processing +instructions and comments match runtime representations which are the +same. + +By default, beginning and trailing whitespace in element content is removed, +and consecutive occurrences of whitespace are replaced by a single space +character \U{0020}. This behavior can be changed to preserve all whitespace +with a compiler option. 
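As a rough illustration of an XML pattern that contains a Scala pattern, the following sketch matches a single element and binds its only child node to a variable. It assumes the `scala.xml` library is on the classpath; `XmlPatternSketch` and `titleText` are hypothetical names introduced only for this example.

```scala
import scala.xml.Node

object XmlPatternSketch {
  // Matches a <title> element whose content is exactly one node and binds
  // that node to the Scala variable pattern `t`; anything else falls through.
  def titleText(n: Node): String = n match {
    case <title>{t}</title> => t.text
    case _                  => ""
  }

  def main(args: Array[String]): Unit = {
    println(titleText(<title>Scala</title>))
    println(titleText(<author>Emir</author>))
  }
}
```

The pattern is a single element pattern, in line with the restriction stated above; matching a sequence of sibling elements would need a different approach.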
+ +\syntax\begin{lstlisting} +ElemPattern ::= EmptyElemTagP + | STagP ContentP ETagP + +EmptyElemTagP ::= `<' Name [S] `/>' +STagP ::= `<' Name [S] `>' +ETagP ::= `</' Name [S] `>' +ContentP ::= [CharData] {(ElemPattern|ScalaPatterns) [CharData]} +ContentP1 ::= ElemPattern + | Reference + | CDSect + | PI + | Comment + | ScalaPatterns +ScalaPatterns ::= `{' Patterns `}' +\end{lstlisting} + -- cgit v1.2.3