| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
Author: Raymond Liu <raymond.liu@intel.com>
Closes #611 from colorant/curator and squashes the following commits:
7556aa1 [Raymond Liu] Address review comments
af92e1f [Raymond Liu] Fix coding style
964f3c2 [Raymond Liu] Ignore NodeExists exception
6df2966 [Raymond Liu] Rewrite zookeeper client code with curator
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prompted by a recent thread on the mailing list, I tried and failed to see if Spark can be made independent of log4j. There are a few cases where control of the underlying logging is pretty useful, and to do that, you have to bind to a specific logger.
Instead I propose some tidying that leaves Spark's use of log4j, but gets rid of warnings and should still enable downstream users to switch. The idea is to pipe everything (except log4j) through SLF4J, and have Spark use SLF4J directly when logging, and where Spark needs to output info (REPL and tests), bind from SLF4J to log4j.
This leaves the same behavior in Spark. It means that downstream users who want to use something except log4j should:
- Exclude dependencies on log4j, slf4j-log4j12 from Spark
- Include dependency on log4j-over-slf4j
- Include dependency on another logger X, and another slf4j-X
- Recreate any log config that Spark does, that is needed, in the other logger's config
That sounds about right.
Here are the key changes:
- Include the jcl-over-slf4j shim everywhere by depending on it in core.
- Exclude dependencies on commons-logging from third-party libraries.
- Include the jul-to-slf4j shim everywhere by depending on it in core.
- Exclude slf4j-* dependencies from third-party libraries to prevent collision or warnings
- Added missing slf4j-log4j12 binding to GraphX, Bagel module tests
And minor/incidental changes:
- Update to SLF4J 1.7.5, which happily matches Hadoop 2’s version and is a recommended update over 1.7.2
- (Remove a duplicate HBase dependency declaration in SparkBuild.scala)
- (Remove a duplicate mockito dependency declaration that was causing warnings and bugging me)
Author: Sean Owen <sowen@cloudera.com>
Closes #570 from srowen/SPARK-1071 and squashes the following commits:
52eac9f [Sean Owen] Add slf4j-over-log4j12 dependency to core (non-test) and remove it from things that depend on core.
77a7fa9 [Sean Owen] SPARK-1071: Tidy logging strategy and use of log4j
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
#522 got messed after i rewrote the branch hadoop_jar_name. So created a new one.
Author: Bijay Bisht <bijay.bisht@gmail.com>
Closes #584 from bijaybisht/hadoop_jar_name_on_0.9.0 and squashes the following commits:
1b6fb3c [Bijay Bisht] Ported hadoopClient jar for < 1.0.1 fix
(cherry picked from commit 8093de1bb319e86dcf0d6d8d97b043a2bc1aa8f2)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Pt 2
Continuation of PR #557
With this all scala style errors are fixed across the code base !!
The reason for creating a separate PR was to not interrupt an already reviewed and ready to merge PR. Hope this gets reviewed soon and merged too.
Author: Prashant Sharma <prashant.s@imaginea.com>
Closes #567 and squashes the following commits:
3b1ec30 [Prashant Sharma] scala style fixes
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.
Author: Patrick Wendell <pwendell@gmail.com>
Author: Prashant Sharma <scrapcodes@gmail.com>
== Merge branch commits ==
commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
Author: Prashant Sharma <scrapcodes@gmail.com>
Date: Sun Feb 9 17:39:07 2014 +0530
scala style fixes
commit f91709887a8e0b608c5c2b282db19b8a44d53a43
Author: Patrick Wendell <pwendell@gmail.com>
Date: Fri Jan 24 11:22:53 2014 -0800
Adding scalastyle snapshot
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Version number to 1.0.0-SNAPSHOT
Since 0.9.0-incubating is done and out the door, we shouldn't be building 0.9.0-incubating-SNAPSHOT anymore.
@pwendell
Author: Mark Hamstra <markhamstra@gmail.com>
== Merge branch commits ==
commit 1b00a8a7c1a7f251b4bb3774b84b9e64758eaa71
Author: Mark Hamstra <markhamstra@gmail.com>
Date: Wed Feb 5 09:30:32 2014 -0800
Version number to 1.0.0-SNAPSHOT
|
| |
|
|
|
|
|
|
|
|
|
| |
Upgrade junit-interface plugin from 0.9 to 0.10.
I noticed that the JavaAPISuite tests didn't
appear to display any output locally or under
Jenkins, making it difficult to know whether they
were running. This change increases the verbosity
to more closely match the ScalaTest tests.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove Typesafe Config usage and conf files to fix nested property names
With Typesafe Config we had the subtle problem of no longer allowing
nested property names, which are used for a few of our properties:
http://apache-spark-developers-list.1001551.n3.nabble.com/Config-properties-broken-in-master-td208.html
This PR is for branch 0.9 but should be added into master too.
(cherry picked from commit 34e911ce9a9f91f3259189861779032069257852)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
GraphX: Unifying Graphs and Tables
GraphX extends Spark's distributed fault-tolerant collections API and interactive console with a new graph API which leverages recent advances in graph systems (e.g., [GraphLab](http://graphlab.org)) to enable users to easily and interactively build, transform, and reason about graph structured data at scale. See http://amplab.github.io/graphx/.
Thanks to @jegonzal, @rxin, @ankurdave, @dcrankshaw, @jianpingjwang, @amatsukawa, @kellrott, and @adamnovak.
Tasks left:
- [x] Graph-level uncache
- [x] Uncache previous iterations in Pregel
- [x] ~~Uncache previous iterations in GraphLab~~ (postponed to post-release)
- [x] - Describe GC issue with GraphLab
- [ ] Write `docs/graphx-programming-guide.md`
- [x] - Mention future Bagel support in docs
- [ ] - Section on caching/uncaching in docs: As with Spark, cache something that is used more than once. In an iterative algorithm, try to cache and force (i.e., materialize) something every iteration, then uncache the cached things that depended on the newly materialized RDD but that won't be referenced again.
- [x] Undo modifications to core collections and instead copy them to org.apache.spark.graphx
- [x] Make Graph serializable to work around capture in Spark shell
- [x] Rename graph -> graphx in package name and subproject
- [x] Remove standalone PageRank
- [x] ~~Fix amplab/graphx#52 by checking `iter.hasNext`~~
|
| |\ |
|
| | | |
|
| |\ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Conflicts:
README.md
core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala
core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala
pom.xml
project/SparkBuild.scala
repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala
|
| |\ \ \ |
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Conflicts:
project/SparkBuild.scala
|
| |\ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | | |
indexedrdd_graphx
|
| |\ \ \ \ \ \ |
|
| | | | | | | | |
|
| |\ \ \ \ \ \ \ |
|
| |\ \ \ \ \ \ \ \ |
|
| | |\ \ \ \ \ \ \ \
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
Conflicts:
run
run2.cmd
|
| | |\ \ \ \ \ \ \ \ \
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Conflicts:
run
|
| | |\ \ \ \ \ \ \ \ \ \ |
|
| | | | | | | | | | | | | |
|
| | |\ \ \ \ \ \ \ \ \ \ \ |
|
| | |\ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
Conflicts:
run2.cmd
|
| | |\ \ \ \ \ \ \ \ \ \ \ \ \ |
|
| | | | | | | | | | | | | | | | |
|
| |_|_|_|_|_|_|_|_|_|_|_|_|_|/
|/| | | | | | | | | | | | | | |
|
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Upgrade Kafka dependecy to 0.8.0 release version
|
| | |_|_|_|_|_|_|_|_|_|_|_|_|/
| |/| | | | | | | | | | | | | |
|
|\| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
Conflicts:
core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala
pom.xml
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Refactored the streaming project to separate external libraries like Twitter, Kafka, Flume, etc.
At a high level, these are the following changes.
1. All the external code was put in `SPARK_HOME/external/` as separate SBT projects and Maven modules. Their artifact names are `spark-streaming-twitter`, `spark-streaming-kafka`, etc. Both SparkBuild.scala and pom.xml files have been updated. References to external libraries and repositories have been removed from the settings of root and streaming projects/modules.
2. To avail the external functionality (say, creating a Twitter stream), the developer has to `import org.apache.spark.streaming.twitter._` . For Scala API, the developer has to call `TwitterUtils.createStream(streamingContext, ...)`. For the Java API, the developer has to call `TwitterUtils.createStream(javaStreamingContext, ...)`.
3. Each external project has its own scala and java unit tests. Note the unit tests of each external library use classes of the streaming unit tests (`TestSuiteBase`, `LocalJavaStreamingContext`, etc.). To enable this code sharing among test classes, `dependsOn(streaming % "compile->compile,test->test")` was used in the SparkBuild.scala . In the streaming/pom.xml, an additional `maven-jar-plugin` was necessary to capture this dependency (see comment inside the pom.xml for more information).
4. Jars of the external projects have been added to examples project but not to the assembly project.
5. In some files, imports have been rearrange to conform to the Spark coding guidelines.
|
| | |\ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
Conflicts:
examples/src/main/java/org/apache/spark/streaming/examples/JavaFlumeEventCount.java
streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java
streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala
streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala
|
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
repositoris from other poms and sbt.
|
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
their own self-contained scala API, java API, scala unit tests and java unit tests. Updated examples to use the external projects.
|
| | | | | | | | | | | | | | | | | |
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
Add a script to download sbt if not present on the system
As per the discussion on the dev mailing list this script will use the system sbt if present or otherwise attempt to install the sbt launcher. The fall back error message in the event it fails instructs the user to install sbt. While the URLs it fetches from aren't controlled by the spark project directly, they are stable and the current authoritative sources.
|
| | | | | | | | | | | | | | | | | | |
|
| | | | | | | | | | | | | | | | | | |
|
| | | |/ / / / / / / / / / / / / /
| | |/| | | | | | | | | | | | | | |
|
|/ / / / / / / / / / / / / / / / |
|
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
Made java options to be applied during tests so that they become self explanatory.
|
| |/ / / / / / / / / / / / / / /
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
explanatory.
|
|/ / / / / / / / / / / / / / / |
|
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Conflicts:
core/src/test/scala/org/apache/spark/DriverSuite.scala
docs/python-programming-guide.md
|
| | | | | | | | | | | | | | | | |
|
| | | | | | | | | | | | | | | | |
|