|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactoring the collections api to support differentiation between
referring to a sequential collection and a parallel collection, and to
support referring to both types of collections.
New set of traits Gen* are now superclasses of both their * and Par* subclasses. For example, GenIterable is a superclass of both Iterable and ParIterable. Iterable and ParIterable are not in a subclassing relation. The new class hierarchy is illustrated below (simplified, not all relations and classes are shown):
TraversableOnce --> GenTraversableOnce
^ ^
| |
Traversable --> GenTraversable
^ ^
| |
Iterable --> GenIterable <-- ParIterable
^ ^ ^
| | |
Seq --> GenSeq <-- ParSeq
(the *Like, *View and *ViewLike traits have a similar hierarchy)
General views extract common view functionality from parallel and
sequential collections.
This design also allows for more flexible extensions to the collections
framework. It also allows slowly factoring out common functionality up
into Gen* traits.
From now on, it is possible to write this:
import collection._
val p = parallel.ParSeq(1, 2, 3)
val g: GenSeq[Int] = p // meaning a General Sequence
val s = g.seq // type of s is Seq[Int]
for (elem <- g) {
// do something without guarantees on sequentiality of foreach
// this foreach may be executed in parallel
}
for (elem <- s) {
// do something with a guarantee that foreach is executed in order, sequentially
}
for (elem <- p) {
// do something concurrently, in parallel
}
This also means that some signatures had to be changed. For example,
method `flatMap` now takes `A => GenTraversableOnce[B]`, and `zip` takes
a `GenIterable[B]`.
Also, there are mutable & immutable Gen* trait variants. They have
generic companion functionality.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Removing toPar* methods, since we've agreed they're difficult to: -
underestand - maintain
Also, changed the docs and some tests appropriately.
Description:
1) Every collection is now parallelizable - switch to the parallel version of the collection is done via `par`.
- Traversable collections and iterators have `par` return a parallel
- collection of type `ParIterable[A]` with the implementation being the
- representative of `ParIterable`s (currently, `ParArray`). Iterable
- collections do the same thing. Sequences refine `par`'s returns type
- to `ParSeq[A]`. Maps and sets do a similar thing.
The above means that the contract for `par` changed - it is no longer guaranteed to be O(1), nor reflect the same underlying data, as was the case for mutable collections before. Method `par` is now at worst linear.
Furthermore, specific collection implementations override `par` to a more efficient alternative - instead of copying the dataset, the dataset is shared between the old and the new version. Implementation complexity may be sublinear or constant in these cases, and the underlying data structure may be shared. Currently, these data structures include parallel arrays, maps and sets, vectors, hash trie maps and sets, and ranges.
Finally, parallel collections implement `par` trivially.
2) Methods `toMap`, `toSet`, `toSeq` and `toIterable` have been refined
for parallel collections to switch between collection types, however,
they never switch an implementation from parallel to sequential. They
may or may not copy the elements, as is the case with sequential
variants of these methods.
3) The preferred way to switch between different collection types,
whether maps, sets and seqs, or parallel and sequential, is now via use
of methods `toIterable`, `toSeq`, `toSet` and `toMap` in combination
with `par` and `seq`.
Review by odersky.
|