| Commit message (Collapse) | Author | Age | Files | Lines |
| |
This reverts commit 54133abdce0246f6643a1112a5204afb2c4caa82.
| |
This reverts commit e480bcfbd269ae1d7a6a92cfb50466cf192fe1fb.
| |
This reverts commit 18f062303303824139998e8fc8f4158217b0dbc3.
| |
This reverts commit d08e9604fc9958b7c768e91715c8152db2ed6fd0.
| |
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes #763 from vanzin/netty-dep-hell and squashes the following commits:
dfb6ce2 [Marcelo Vanzin] Fix dep exclusion: avro-ipc, not avro, depends on netty.
| |
This reverts commit 3d0a44833ab50360bf9feccc861cb5e8c44a4866.
| |
This reverts commit 9772d85c6f3893d42044f4bab0e16f8b6287613a.
| |
Following on a few more items from SPARK-1802 --
The first commit touches up a few similar problems remaining with the YARN profile. I think this is worth cherry-picking.
The second commit is more of the same for hadoop-client, although the fix is a little more complex. It may or may not be worth bothering with.
Author: Sean Owen <sowen@cloudera.com>
Closes #746 from srowen/SPARK-1802.2 and squashes the following commits:
52aeb41 [Sean Owen] Add more commons-logging, servlet excludes to avoid conflicts in assembly when building for YARN
(cherry picked from commit 4b31f4ec7efab8eabf956284a99bfd96a58b79f7)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
This initial commit resolves the conflicts in the Hive profiles as noted in https://issues.apache.org/jira/browse/SPARK-1802 .
Most of the fix was to note that Hive drags in Avro, and so if the hive module depends on Spark's version of the `avro-*` dependencies, it will pull in our exclusions as needed too. But I found we need to copy some exclusions between the two Avro dependencies to get this right. And then had to squash some commons-logging intrusions.
This turned up another annoying find, that `hive-exec` is basically an "assembly" artifact that _also_ packages all of its transitive dependencies. This means the final assembly shows lots of collisions between itself and its dependencies, and even other project dependencies. I have a TODO to examine whether that is going to be a deal-breaker or not.
In the meantime I'm going to tack on a second commit to this PR that will also fix some similar, last collisions in the YARN profile.
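One way to gauge how much an assembly-style artifact such as `hive-exec` overlaps with other JARs is to list the class entries that appear in more than one JAR. The following is a minimal sketch and not part of this change: it uses only Python's standard `zipfile` module, and the JAR names and class paths in the demo are made up for illustration.

```python
import io
import zipfile
from collections import defaultdict


def duplicate_classes(jars):
    """Return {class entry: [jar names]} for .class entries found in 2+ jars.

    `jars` maps a display name to a path or a file-like object.
    """
    owners = defaultdict(list)
    for name, source in jars.items():
        with zipfile.ZipFile(source) as jar:
            # set() guards against a jar listing the same entry twice
            for entry in set(jar.namelist()):
                if entry.endswith(".class"):
                    owners[entry].append(name)
    return {entry: names for entry, names in owners.items() if len(names) > 1}


def make_jar(entries):
    """Build a tiny in-memory jar with empty entries, for demonstration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for entry in entries:
            jar.writestr(entry, b"")
    buf.seek(0)
    return buf


# Hypothetical contents: a "fat" artifact that repackages one of its dependencies.
fat = make_jar(["org/example/exec/Driver.class", "org/example/avro/Schema.class"])
avro = make_jar(["org/example/avro/Schema.class"])

collisions = duplicate_classes({"fat-exec.jar": fat, "avro.jar": avro})
print(collisions)
```

Against a real build, the same function could be pointed at the paths of the JARs that end up in the assembly.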
Author: Sean Owen <sowen@cloudera.com>
Closes #744 from srowen/SPARK-1802 and squashes the following commits:
a856604 [Sean Owen] Resolve JAR version conflicts specific to Hive profile
(cherry picked from commit 8586bf564fe010dfc19ef26874472a6f85e355fb)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
Three issues related to temp files that tests generate – these should be touched up for hygiene but are not urgent.
Modules have a log4j.properties which directs the unit-test.log output file to a directory like `[module]/target/unit-test.log`. But this ends up creating `[module]/[module]/target/unit-test.log` instead of the former.
The `work/` directory is not deleted by "mvn clean", in the parent and in modules. Neither is the `checkpoint/` directory created under the various external modules.
Many tests create a temp directory, which is not usually deleted. This can be largely resolved by calling `deleteOnExit()` at creation and trying to call `Utils.deleteRecursively` consistently to clean up, sometimes in an `@After` method.
_If anyone seconds the motion, I can create a more significant change that introduces a new test trait along the lines of `LocalSparkContext`, which provides management of temp directories for subclasses to take advantage of._
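The shape of such a helper can be sketched as follows. This is only a Python analogue of the idea, using the standard `unittest`, `tempfile` and `shutil` modules; it is not the Scala trait proposed here, and the class names `TempDirTestCase` and `ExampleSuite` are hypothetical.

```python
import os
import shutil
import tempfile
import unittest


class TempDirTestCase(unittest.TestCase):
    """Hypothetical base class: each test gets a fresh temp dir that is removed afterwards."""

    def setUp(self):
        self.temp_dir = tempfile.mkdtemp(prefix="spark-test-")
        # Cleanups registered here run after the test whether it passes or fails.
        self.addCleanup(shutil.rmtree, self.temp_dir, ignore_errors=True)


class ExampleSuite(TempDirTestCase):
    def test_writes_file(self):
        path = os.path.join(self.temp_dir, "part-00000")
        with open(path, "w") as f:
            f.write("data")
        self.assertTrue(os.path.exists(path))
```

Subclasses only use `self.temp_dir`; creation and recursive deletion live in one place instead of being repeated in every suite. The example can be run with `python -m unittest`.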
Author: Sean Owen <sowen@cloudera.com>
Closes #732 from srowen/SPARK-1798 and squashes the following commits:
5af578e [Sean Owen] Try to consistently delete test temp dirs and files, and set deleteOnExit() for each
b21b356 [Sean Owen] Remove work/ and checkpoint/ dirs with mvn clean
bdd0f41 [Sean Owen] Remove duplicate module dir in log4j.properties output path for tests
(cherry picked from commit 7120a2979d0a9f0f54a88b2416be7ca10e74f409)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
Enabled Mesos (0.18.1) dependency with shaded protobuf
Why is this needed?
Avoids any protobuf version collision between Mesos and any other
dependency in Spark e.g. Hadoop HDFS 2.2+ or 1.0.4.
Ticket: https://issues.apache.org/jira/browse/SPARK-1806
* Should close https://issues.apache.org/jira/browse/SPARK-1433
Author berngp
Author: Bernardo Gomez Palacio <bernardo.gomezpalacio@gmail.com>
Closes #741 from berngp/feature/SPARK-1806 and squashes the following commits:
5d70646 [Bernardo Gomez Palacio] SPARK-1806: Upgrade Mesos dependency to 0.18.1
| |
failure
TL;DR: there is some JAR hell with Netty that can be mostly resolved, and resolving it fixes a test failure.
I hit the error described at http://apache-spark-user-list.1001560.n3.nabble.com/SparkContext-startup-time-out-td1753.html while running FlumeStreamingSuite, and have for a short while (is it just me?)
velvia notes:
"I have found a workaround. If you add akka 2.2.4 to your dependencies, then everything works, probably because akka 2.2.4 brings in newer version of Jetty."
There are at least 3 versions of Netty in play in the build:
- the new Flume 1.4.0 dependency brings in io.netty:netty:3.4.0.Final, and that is the immediate problem
- the custom version of akka 2.2.3 depends on io.netty:netty:3.6.6.Final
- but, Spark Core directly uses io.netty:netty-all:4.0.17.Final
The POMs try to exclude other versions of netty, but are excluding org.jboss.netty:netty, when in fact older versions of io.netty:netty (not netty-all) are also an issue.
The org.jboss.netty:netty excludes are largely unnecessary. I replaced many of them with io.netty:netty exclusions until everything agreed on io.netty:netty-all:4.0.17.Final.
But this didn't work, since Akka 2.2.3 doesn't work with Netty 4.x. Down-grading to 3.6.6.Final across the board made some Spark code not compile.
If the build *keeps* io.netty:netty:3.6.6.Final as well, everything seems to work. Part of the reason seems to be that Netty 3.x used the old `org.jboss.netty` packages. This is less than ideal, but is no worse than the current situation.
So this PR resolves the issue and improves the JAR hell, even if it leaves the existing theoretical Netty 3-vs-4 conflict:
- Remove org.jboss.netty excludes where possible, for clarity; they're not needed except with Hadoop artifacts
- Add io.netty:netty excludes where needed -- except, let akka keep its io.netty:netty
- Change a bit of test code that actually depended on Netty 3.x, to use 4.x equivalent
- Update SBT build accordingly
A better change would be to update Akka far enough such that it agrees on Netty 4.x, but I don't know if that's feasible.
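The version clash described above can be illustrated with a small Python sketch that groups Maven coordinates by `groupId:artifactId` and reports any artifact seen with more than one version. It is an illustration only: the plain `group:artifact:version` input format is an assumption (real `mvn dependency:tree` output has more fields), and the three coordinates are the ones named in this message.

```python
from collections import defaultdict


def conflicting_versions(coordinates):
    """Map 'group:artifact' to its sorted versions, keeping only artifacts with 2+ versions."""
    versions = defaultdict(set)
    for coord in coordinates:
        # Assumes exactly three colon-separated fields: group, artifact, version.
        group, artifact, version = coord.split(":")
        versions[f"{group}:{artifact}"].add(version)
    return {key: sorted(vs) for key, vs in versions.items() if len(vs) > 1}


deps = [
    "io.netty:netty:3.4.0.Final",       # via Flume 1.4.0
    "io.netty:netty:3.6.6.Final",       # via the custom akka 2.2.3
    "io.netty:netty-all:4.0.17.Final",  # used directly by Spark Core
]

conflicts = conflicting_versions(deps)
print(conflicts)
```

Only `io.netty:netty` is reported, because `netty-all` is a different artifact. That matches the approach taken here: exclude the stray `io.netty:netty` versions, let akka keep its own, and leave `netty-all` alone.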
Author: Sean Owen <sowen@cloudera.com>
Closes #723 from srowen/SPARK-1789 and squashes the following commits:
43661b7 [Sean Owen] Update and add Netty excludes to prevent some JAR conflicts that cause test issues
(cherry picked from commit 2b7bd29eb6ee5baf739eec143044ecfc296b9b1f)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
I think we are hitting this issue in some perf tests: https://github.com/Parquet/parquet-mr/commit/6aed5288fd4a1398063a5a219b2ae4a9f71b02cf
Credit to @aarondav !
Author: Michael Armbrust <michael@databricks.com>
Closes #684 from marmbrus/upgradeParquet and squashes the following commits:
e10a619 [Michael Armbrust] Upgrade parquet library.
(cherry picked from commit 4d6055329846f5e09472e5f844127a5ab5880e15)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
We use org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter in Spark on YARN but do not include it in the assembly jar.
I tested this on a YARN cluster by removing the YARN jars from the classpath, and Spark now runs fine.
Author: Thomas Graves <tgraves@apache.org>
Closes #406 from tgravescs/SPARK-1474 and squashes the following commits:
1548bf9 [Thomas Graves] SPARK-1474: Spark on yarn assembly doesn't include org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
(cherry picked from commit 1e829905c791fbf1dfd8e0c1caa62ead7354605e)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
See related discussion at https://github.com/apache/spark/pull/468
This PR may still overstep what you have in mind, but let me put it on the table to start. Besides fixing the issue, it has one substantive change, and that is to manage Hadoop-specific things only in Hadoop-related profiles. This does _not_ remove `yarn.version`.
- Moves the YARN and Hadoop profiles together in pom.xml. Sorry that this makes the diff a little hard to grok but the changes are only as follows.
- Removes `hadoop.major.version`
- Introduces `hadoop-2.2` and `hadoop-2.3` profiles to control Hadoop-specific changes:
  - like the protobuf version issue - this was only 'solved' now by enabling YARN for 2.2+, which is really an orthogonal issue
  - like the jets3t version issue now
- Hadoop profiles set an appropriate default `hadoop.version`, that can be overridden
- _(YARN profiles in the parent now only exist to add the sub-module)_
- Fixes the jets3t dependency issue
  - and makes it a runtime dependency
  - and centralizes config of this guy in the parent pom
- Updates build docs
- Updates SBT build too
  - and fixes a regex problem along the way
Author: Sean Owen <sowen@cloudera.com>
Closes #629 from srowen/SPARK-1556 and squashes the following commits:
c3fa967 [Sean Owen] Fix hadoop-2.4 profile typo in doc
a2105fd [Sean Owen] Add hadoop-2.4 profile and don't set hadoop.version in profiles
274f4f9 [Sean Owen] Make jets3t a runtime dependency, and bring its exclusion up into parent config
bbed826 [Sean Owen] Use jets3t 0.9.0 for Hadoop 2.3+ (and correct similar regex issue in SBT build)
f21f356 [Sean Owen] Build changes to set up for jets3t fix
(cherry picked from commit 73b0cbcc241cca3d318ff74340e80b02f884acbd)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
...park built for hadoop 2.3.0 , 2.4.0
Author: witgo <witgo@qq.com>
Closes #628 from witgo/SPARK-1693_new and squashes the following commits:
e3af968 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1693_new
dc63905 [witgo] SPARK-1693: Most of the tests throw a java.lang.SecurityException when spark built for hadoop 2.3.0 , 2.4.0
(cherry picked from commit d940e4c16aaa7b60daf1229a99bc4d3455c0240d)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
it's used in ReplSuite, and return to use lang3 utility in Utils.scala
For consideration. This was proposed in related discussion: https://github.com/apache/spark/pull/569
Author: Sean Owen <sowen@cloudera.com>
Closes #635 from srowen/SPARK-1629.2 and squashes the following commits:
a442b98 [Sean Owen] Depend on commons lang3 (already used by tachyon) as it's used in ReplSuite, and return to use lang3 utility in Utils.scala
(cherry picked from commit f5041579ff573f988b673c2506fa4edc32f5ad84)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
This is a part of [PR 590](https://github.com/apache/spark/pull/590)
Author: witgo <witgo@qq.com>
Closes #626 from witgo/yarn_version and squashes the following commits:
c390631 [witgo] restore the yarn dependency declarations
f8a4ad8 [witgo] revert remove the dependency of avro in yarn-alpha
2df6cf5 [witgo] review commit
a1d876a [witgo] review commit
20e7e3e [witgo] review commit
c76763b [witgo] The default value of yarn.version is equal to hadoop.version
(cherry picked from commit fb0543224bcedb8ae3aab4a7ddcc6111a03378fe)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
1, Fix SPARK-1441: compile spark core error with hadoop 0.23.x
2, Fix SPARK-1491: maven hadoop-provided profile fails to build
3, Fix inconsistent dependency versions for org.scala-lang:* and org.apache.avro:*
4, Reformat sql/catalyst/pom.xml, sql/hive/pom.xml and sql/core/pom.xml (four-space indentation changed to two spaces)
Author: witgo <witgo@qq.com>
Closes #480 from witgo/format_pom and squashes the following commits:
03f652f [witgo] review commit
b452680 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
bee920d [witgo] revert fix SPARK-1629: Spark Core missing commons-lang dependence
7382a07 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
6902c91 [witgo] fix SPARK-1629: Spark Core missing commons-lang dependence
0da4bc3 [witgo] merge master
d1718ed [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
e345919 [witgo] add avro dependency to yarn-alpha
77fad08 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
62d0862 [witgo] Fix org.scala-lang: * inconsistent versions dependency
1a162d7 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
934f24d [witgo] review commit
cf46edc [witgo] exclude jruby
06e7328 [witgo] Merge branch 'SparkBuild' into format_pom
99464d2 [witgo] fix maven hadoop-provided profile fails to build
0c6c1fc [witgo] Fix compile spark core error with hadoop 0.23.x
6851bec [witgo] Maintain consistent SparkBuild.scala, pom.xml
(cherry picked from commit 030f2c2126d5075576cd6d83a1ee7462c48b953b)
Conflicts:
sql/catalyst/pom.xml
sql/core/pom.xml
sql/hive/pom.xml
| |
It registers more Scala classes, including things like Ranges that we had to register manually before. See https://github.com/twitter/chill/releases for Chill's change log.
Author: Matei Zaharia <matei@databricks.com>
Closes #543 from mateiz/chill-0.3.6 and squashes the following commits:
a1dc5e0 [Matei Zaharia] Upgrade Chill to 0.3.6 and remove our special registration of Ranges
(cherry picked from commit a24d918c71f6ac4adbe3ae363ef69f4658118938)
Signed-off-by: Matei Zaharia <matei@databricks.com>
| |
Remove the unnecessary lift-json dependency from pom.xml
Author: Sandeep <sandeep@techaddict.me>
Closes #536 from techaddict/FIX-SPARK-1078 and squashes the following commits:
bd0fd1d [Sandeep] Fix [SPARK-1078]: Replace lift-json with json4s-jackson. Remove the Unnecessary lift-json dependency from pom.xml
(cherry picked from commit 095b5182536a43e2ae738be93294ee5215d86581)
Signed-off-by: Reynold Xin <rxin@apache.org>
| |
This reverts commit 6cc698fc378256fee9111f66c691ced27f54e973.
| |
This reverts commit 188f7c3f68e93b3e9347ec02e21f5943874b4741.
| |
This reverts commit c3c6ea05d9d02a38b97388d583828ca82c5181db.
| |
This reverts commit 0b49305297033b70cbb525bef54b70a14deeb238.
| |
A quick fix for https://issues.apache.org/jira/browse/SPARK-1520
By excluding fastutil, we bring the number of files in the assembly jar back under 65536, so Java 7 won't create the assembly jar in zip64 format, which cannot be read by Java 6.
With this change, the assembly jar now has about 60000 entries (58000 files), tested with both sbt and maven.
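A check of this kind can be sketched in Python with the standard `zipfile` module: count the entries in the assembly jar and compare against the threshold quoted above. This is an illustrative sketch, not the procedure used for this change; the tiny in-memory jar and its entry names are made up, and the exact count at which a particular tool switches to zip64 may differ by one from the figure used here.

```python
import io
import zipfile

ENTRY_LIMIT = 65536  # threshold quoted in this commit message


def entry_count(jar):
    """Number of entries (files and directories) in a jar, given a path or file object."""
    with zipfile.ZipFile(jar) as zf:
        return len(zf.infolist())


def under_entry_limit(count, limit=ENTRY_LIMIT):
    """True if the entry count stays below the limit, so zip64 should not be needed."""
    return count < limit


# Demo on a tiny in-memory jar with hypothetical entries.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in ("META-INF/MANIFEST.MF", "org/example/A.class", "org/example/B.class"):
        zf.writestr(name, b"")
buf.seek(0)

n = entry_count(buf)
print(n, under_entry_limit(n))
print(under_entry_limit(60000))  # the approximate entry count reported after this change
```

For a real build, `entry_count` would be given the path of the assembly jar produced by sbt or Maven.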
Author: Xiangrui Meng <meng@databricks.com>
Closes #437 from mengxr/remove-fastutil and squashes the following commits:
00f9beb [Xiangrui Meng] remove fastutil from dependencies
(cherry picked from commit aa17f022c59af02b04b977da9017671ef14d664a)
Signed-off-by: Reynold Xin <rxin@apache.org>
| |
An initial API that exposes SparkSQL functionality in PySpark. A PythonRDD composed of dictionaries with string keys and primitive values (boolean, float, int, long, string) can be converted into a SchemaRDD that supports SQL queries.
```
from pyspark.context import SQLContext
sqlCtx = SQLContext(sc)
rdd = sc.parallelize([{"field1" : 1, "field2" : "row1"}, {"field1" : 2, "field2": "row2"}, {"field1" : 3, "field2": "row3"}])
srdd = sqlCtx.applySchema(rdd)
sqlCtx.registerRDDAsTable(srdd, "table1")
srdd2 = sqlCtx.sql("SELECT field1 AS f1, field2 as f2 from table1")
srdd2.collect()
```
The last line yields `[{"f1" : 1, "f2" : "row1"}, {"f1" : 2, "f2": "row2"}, {"f1" : 3, "f2": "row3"}]`
Author: Ahir Reddy <ahirreddy@gmail.com>
Author: Michael Armbrust <michael@databricks.com>
Closes #363 from ahirreddy/pysql and squashes the following commits:
0294497 [Ahir Reddy] Updated log4j properties to supress Hive Warns
307d6e0 [Ahir Reddy] Style fix
6f7b8f6 [Ahir Reddy] Temporary fix MIMA checker. Since we now assemble Spark jar with Hive, we don't want to check the interfaces of all of our hive dependencies
3ef074a [Ahir Reddy] Updated documentation because classes moved to sql.py
29245bf [Ahir Reddy] Cache underlying SchemaRDD instead of generating and caching PythonRDD
f2312c7 [Ahir Reddy] Moved everything into sql.py
a19afe4 [Ahir Reddy] Doc fixes
6d658ba [Ahir Reddy] Remove the metastore directory created by the HiveContext tests in SparkSQL
521ff6d [Ahir Reddy] Trying to get spark to build with hive
ab95eba [Ahir Reddy] Set SPARK_HIVE=true on jenkins
ded03e7 [Ahir Reddy] Added doc test for HiveContext
22de1d4 [Ahir Reddy] Fixed maven pyrolite dependency
e4da06c [Ahir Reddy] Display message if hive is not built into spark
227a0be [Michael Armbrust] Update API links. Fix Hive example.
58e2aa9 [Michael Armbrust] Build Docs for pyspark SQL Api. Minor fixes.
4285340 [Michael Armbrust] Fix building of Hive API Docs.
38a92b0 [Michael Armbrust] Add note to future non-python developers about python docs.
337b201 [Ahir Reddy] Changed com.clearspring.analytics stream version from 2.4.0 to 2.5.1 to match SBT build, and added pyrolite to maven build
40491c9 [Ahir Reddy] PR Changes + Method Visibility
1836944 [Michael Armbrust] Fix comments.
e00980f [Michael Armbrust] First draft of python sql programming guide.
b0192d3 [Ahir Reddy] Added Long, Double and Boolean as usable types + unit test
f98a422 [Ahir Reddy] HiveContexts
79621cf [Ahir Reddy] cleaning up cruft
b406ba0 [Ahir Reddy] doctest formatting
20936a5 [Ahir Reddy] Added tests and documentation
e4d21b4 [Ahir Reddy] Added pyrolite dependency
79f739d [Ahir Reddy] added more tests
7515ba0 [Ahir Reddy] added more tests :)
d26ec5e [Ahir Reddy] added test
e9f5b8d [Ahir Reddy] adding tests
906d180 [Ahir Reddy] added todo explaining cost of creating Row object in python
251f99d [Ahir Reddy] for now only allow dictionaries as input
09b9980 [Ahir Reddy] made jrdd explicitly lazy
c608947 [Ahir Reddy] SchemaRDD now has all RDD operations
725c91e [Ahir Reddy] awesome row objects
55d1c76 [Ahir Reddy] return row objects
4fe1319 [Ahir Reddy] output dictionaries correctly
be079de [Ahir Reddy] returning dictionaries works
cd5f79f [Ahir Reddy] Switched to using Scala SQLContext
e948bd9 [Ahir Reddy] yippie
4886052 [Ahir Reddy] even better
c0fb1c6 [Ahir Reddy] more working
043ca85 [Ahir Reddy] working
5496f9f [Ahir Reddy] doesn't crash
b8b904b [Ahir Reddy] Added schema rdd class
67ba875 [Ahir Reddy] java to python, and python to java
bcc0f23 [Ahir Reddy] Java to python
ab6025d [Ahir Reddy] compiling
(cherry picked from commit c99bcb7feaa761c5826f2e1d844d0502a3b79538)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
For your consideration: scalac currently notes a number of feature warnings during compilation:
```
[warn] there were 65 feature warning(s); re-run with -feature for details
```
Warnings are like:
```
[warn] /Users/srowen/Documents/spark/core/src/main/scala/org/apache/spark/SparkContext.scala:1261: implicit conversion method rddToPairRDDFunctions should be enabled
[warn] by making the implicit value scala.language.implicitConversions visible.
[warn] This can be achieved by adding the import clause 'import scala.language.implicitConversions'
[warn] or by setting the compiler option -language:implicitConversions.
[warn] See the Scala docs for value scala.language.implicitConversions for a discussion
[warn] why the feature should be explicitly enabled.
[warn] implicit def rddToPairRDDFunctions[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)]) =
[warn] ^
```
scalac is suggesting that it's just best practice to explicitly enable certain language features by importing them where used.
This PR simply adds the imports it suggests (and squashes one other Java warning along the way). This leaves just deprecation warnings in the build.
Author: Sean Owen <sowen@cloudera.com>
Closes #404 from srowen/SPARK-1488 and squashes the following commits:
8598980 [Sean Owen] Quiet scalac warnings about language features by explicitly importing language features.
39bc831 [Sean Owen] Enable -feature in scalac to emit language feature warnings
(cherry picked from commit 0247b5c5467ca1b0d03ba929a78fa4d805582d84)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
(This is for discussion at this point -- I'm not suggesting this should be committed.)
This is what removing fastutil looks like. Much of it is straightforward, like using `java.io` buffered stream classes, and Guava for murmurhash3.
Uses of the `FastByteArrayOutputStream` were a little trickier. In only one case though do I think the change to use `java.io` actually entails an extra array copy.
The rest is using `OpenHashMap` and `OpenHashSet`. These are now written in terms of more scala-like operations.
`OpenHashMap` is where I made three non-trivial changes to make it work, and they need review:
- It is no longer private
- The key type must have a `ClassTag`
- Unless a lot of other code changes, the key type can't enforce being a supertype of `Null`
It all works and tests pass, and I think there is reason to believe it's OK from a speed perspective.
But what about those last changes?
Author: Sean Owen <sowen@cloudera.com>
Closes #266 from srowen/SPARK-1057-alternate and squashes the following commits:
2601129 [Sean Owen] Fix Map return type error not previously caught
ec65502 [Sean Owen] Updates from matei's review
00bc81e [Sean Owen] Remove use of fastutil and replace with use of java.io, spark.util and Guava classes
(cherry picked from commit 165e06a74c3d75e6b7341c120943add8b035b96a)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
| |
This reverts commit 12c077d5aa0b76a808a55db625c9677a52bd43f9.
|