path: root/project
Each entry: commit message (Author, Date, Files changed, Lines -deleted/+added)
* Merge branch 'master' into akka-bug-fix (Prashant Sharma, 2013-12-11, 1 file, -7/+23)
    Conflicts:
      core/pom.xml
      core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
      pom.xml
      project/SparkBuild.scala
      streaming/pom.xml
      yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala
| * Use published "org.spark-project.akka-*" in sbt build for Hadoop-2.2 dependencies. (Harvey Feng, 2013-12-03, 1 file, -13/+15)
    This also includes:
      - Change `isNewYarn` to `isNewHadoop`, since the protobuf-2.5 dependency is from Hadoop-2.2 itself.
      - Regexp bugfix
    Credits to @alig for this patch.
| * Merge remote-tracking branch 'origin/master' into yarn-2.2 (Harvey Feng, 2013-11-26, 1 file, -0/+1)
    Conflicts:
      yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
| * Add optional Hadoop 2.2 settings in sbt build. (Harvey Feng, 2013-11-26, 1 file, -9/+23)
    If the Hadoop version in use is 2.2 or derived from it, then Spark will be
    compiled against protobuf-2.5 and a protobuf-2.5 build of Akka 2.0.5.
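For context, a hypothetical build.sbt-style sketch of the switch these two commits describe. The flag name `isNewHadoop` comes from the commit text, but the version wiring and the `org.spark-project.akka` coordinates are assumptions, not the actual SparkBuild.scala code:

```scala
// Sketch only: pick protobuf and Akka builds based on the target Hadoop version.
// Hadoop 2.2 itself depends on protobuf-2.5, which is incompatible with 2.4,
// so Akka must come from rebuilds published against protobuf-2.5.
val hadoopVersion = sys.props.getOrElse("hadoop.version", "1.0.4")
val isNewHadoop = hadoopVersion.startsWith("2.2.")

val protobufVersion = if (isNewHadoop) "2.5.0" else "2.4.1"
val akkaGroup = if (isNewHadoop) "org.spark-project.akka" else "com.typesafe.akka"

libraryDependencies ++= Seq(
  "com.google.protobuf" % "protobuf-java" % protobufVersion,
  akkaGroup % "akka-actor" % "2.0.5"
)
```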
* Merge branch 'master' into scala-2.10-wip (Prashant Sharma, 2013-11-25, 1 file, -1/+2)
    Conflicts:
      core/src/main/scala/org/apache/spark/rdd/RDD.scala
      project/SparkBuild.scala
| * Merge pull request #151 from russellcardullo/add-graphite-sink (Matei Zaharia, 2013-11-24, 1 file, -0/+1)
    Add graphite sink for metrics.
    This adds a metrics sink for Graphite. The sink must be configured with the
    host and port of a Graphite node, and may optionally be configured with a
    prefix that will be prepended to all metrics sent to Graphite.
| | * Add graphite sink for metrics (Russell Cardullo, 2013-11-08, 1 file, -0/+1)
    This adds a metrics sink for Graphite. The sink must be configured with the
    host and port of a Graphite node, and may optionally be configured with a
    prefix that will be prepended to all metrics sent to Graphite.
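The project/ change here is the new dependency; a build.sbt-style sketch of what that one added line plausibly looks like (group and version are guesses for the 2013 timeframe):

```scala
// Coda Hale metrics reporter for Graphite (coordinates assumed).
libraryDependencies += "com.codahale.metrics" % "metrics-graphite" % "3.0.0"
```

The sink itself is then configured at runtime (via Spark's metrics.properties file) with the Graphite host, port, and optional prefix described above.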
* Use Kafka 2.10 (again) (Aaron Davidson, 2013-11-14, 1 file, -2/+3)
* Various merge corrections (Aaron Davidson, 2013-11-14, 1 file, -9/+2)
    I've diffed this patch against my own -- since they were both created
    independently, two sets of eyes have gone over all the merge conflicts that
    were created, so I'm feeling significantly more confident in the resulting
    PR. @rxin has looked at the changes to the repl and is resoundingly
    confident that they are correct.
* Some fixes for previous master merge commits (Raymond Liu, 2013-11-15, 1 file, -0/+1)
* Merge branch 'master' into scala-2.10 (Raymond Liu, 2013-11-14, 2 files, -3/+4)
| * Merge pull request #165 from NathanHowell/kerberos-master (Matei Zaharia, 2013-11-13, 2 files, -2/+2)
    spark-assembly.jar fails to authenticate with YARN ResourceManager.
    The META-INF/services/ sbt MergeStrategy was discarding support for
    Kerberos, among others. This pull request changes to a merge strategy
    similar to sbt-assembly's default. I've also included an update to
    sbt-assembly 0.9.2 and a minor fix to its zip file handling.
| | * Upgrade to sbt-assembly 0.9.2 (Nathan Howell, 2013-11-12, 1 file, -1/+1)
| | * spark-assembly.jar fails to authenticate with YARN ResourceManager (Nathan Howell, 2013-11-12, 1 file, -1/+1)
    sbt-assembly is set up to pick the first
    META-INF/services/org.apache.hadoop.security.SecurityInfo file instead of
    merging them. This causes Kerberos authentication to fail; it manifests
    itself in the "info:null" debug log statements:

      DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:null
      DEBUG SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:null
      ERROR UserGroupInformation: PriviledgedActionException as:foo@BAR (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      DEBUG UserGroupInformation: PrivilegedAction as:foo@BAR (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:583)
      WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      ERROR UserGroupInformation: PriviledgedActionException as:foo@BAR (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]

    Previously the merged service file contained just a single class:

      $ unzip -c assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar META-INF/services/org.apache.hadoop.security.SecurityInfo
      Archive:  assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar
        inflating: META-INF/services/org.apache.hadoop.security.SecurityInfo
      org.apache.hadoop.security.AnnotatedSecurityInfo

    And now it has the full list of classes:

      $ unzip -c assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar META-INF/services/org.apache.hadoop.security.SecurityInfo
      Archive:  assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar
        inflating: META-INF/services/org.apache.hadoop.security.SecurityInfo
      org.apache.hadoop.security.AnnotatedSecurityInfo
      org.apache.hadoop.mapreduce.v2.app.MRClientSecurityInfo
      org.apache.hadoop.mapreduce.v2.security.client.ClientHSSecurityInfo
      org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo
      org.apache.hadoop.yarn.security.ContainerManagerSecurityInfo
      org.apache.hadoop.yarn.security.SchedulerSecurityInfo
      org.apache.hadoop.yarn.security.admin.AdminSecurityInfo
      org.apache.hadoop.yarn.server.RMNMSecurityInfoClass
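A hedged sketch of the kind of fix the pull request describes, in sbt-assembly 0.9.x / sbt 0.12 syntax; the exact pattern Spark adopted may differ:

```scala
import sbtassembly.Plugin._
import AssemblyKeys._

mergeStrategy in assembly <<= (mergeStrategy in assembly) { old =>
  {
    // Service registration files list one implementation class per line, so
    // keep every distinct line instead of taking the first file encountered.
    case x if x.startsWith("META-INF/services/") => MergeStrategy.filterDistinctLines
    case x => old(x)
  }
}
```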
| * Merge pull request #137 from tgravescs/sparkYarnJarsHdfsRebase (Matei Zaharia, 2013-11-12, 1 file, -1/+2)
    Allow Spark on YARN to be run from HDFS. Allows the spark.jar, app.jar, and
    log4j.properties to be put into HDFS. You can specify the files on a
    different HDFS cluster and it will copy them over. It makes sure
    permissions are correct and puts things into the public distributed cache
    so they can be reused among users if their permissions are appropriate.
    Also adds a bit of error handling for missing arguments.
| | * Add mockito to the sbt build (tgravescs, 2013-11-11, 1 file, -1/+2)
| * Add spark-tools assembly to spark-class classpath. (Josh Rosen, 2013-11-09, 1 file, -1/+1)
    This allows the JavaAPICompletenessChecker to be run with Spark 0.8+.
* Merge branch 'master' into scala-2.10 (Raymond Liu, 2013-11-13, 1 file, -5/+30)
| * Exclude jopt from kafka dependency. (Patrick Wendell, 2013-10-25, 1 file, -0/+1)
    Kafka uses an older version of jopt that causes bad conflicts with the
    version used by spark-perf. It's not easy to remove this downstream because
    of the way spark-perf uses Spark (by including a Spark assembly as an
    unmanaged jar). This fixes the problem at its source by simply never
    including it.
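A build.sbt-style sketch of the exclusion described (the Kafka and jopt coordinates are assumptions):

```scala
// Drop Kafka's transitive jopt-simple so it cannot clash with spark-perf's copy.
libraryDependencies += ("org.apache.kafka" % "kafka" % "0.8.0-beta1")
  .exclude("net.sf.jopt-simple", "jopt-simple")
```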
| * Fix Maven build to use MQTT repository (Matei Zaharia, 2013-10-23, 1 file, -3/+3)
| * Merge pull request #64 from prabeesh/master (Matei Zaharia, 2013-10-23, 1 file, -1/+5)
    MQTT Adapter for Spark Streaming

    MQTT is a machine-to-machine (M2M)/Internet of Things connectivity
    protocol. It was designed as an extremely lightweight publish/subscribe
    messaging transport. You may read more about it at http://mqtt.org/

    Message Queue Telemetry Transport (MQTT) is an open message protocol for
    M2M communications. It enables the transfer of telemetry-style data in the
    form of messages from devices like sensors and actuators to mobile phones,
    embedded systems on vehicles, or laptops and full-scale computers. The
    protocol was invented by Andy Stanford-Clark of IBM and Arlen Nipper of
    Cirrus Link Solutions.

    This protocol enables a publish/subscribe messaging model in an extremely
    lightweight way. It is useful for connections with remote locations where
    code footprint and network bandwidth are constraints. MQTT is one of the
    most widely used protocols for the Internet of Things, and it is attracting
    much attention as anything and everything gets connected to the internet
    and produces data. Researchers and companies predict some 25 billion
    devices will be connected to the internet by 2015. Plugins/support for MQTT
    are available in popular message queues like RabbitMQ and ActiveMQ.

    Support for MQTT in Spark will help people with Internet of Things (IoT)
    projects use Spark Streaming for their real-time data processing needs
    (from sensors and other embedded devices, etc.).
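The sbt side of this adapter (see the "added mqtt adapter library dependencies" commit below) plausibly looks like the following sketch; the Eclipse Paho repository URL and client coordinates are assumptions:

```scala
// Paho artifacts were not on Maven Central at the time, so add the Eclipse repo.
resolvers += "Eclipse Paho Releases" at "https://repo.eclipse.org/content/repositories/paho-releases/"

libraryDependencies += "org.eclipse.paho" % "mqtt-client" % "0.4.0"
```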
| | * remove unused dependency (prabeesh, 2013-10-17, 1 file, -2/+0)
| | * added mqtt adapter library dependencies (prabeesh, 2013-10-16, 1 file, -1/+7)
| * Merge pull request #56 from jerryshao/kafka-0.8-dev (Matei Zaharia, 2013-10-21, 1 file, -3/+6)
    Upgrade Kafka 0.7.2 to Kafka 0.8.0-beta1 for Spark Streaming.
    Conflicts:
      streaming/pom.xml
| | * Upgrade Kafka 0.7.2 to Kafka 0.8.0-beta1 for Spark Streaming (jerryshao, 2013-10-12, 1 file, -3/+6)
| * Merge pull request #66 from shivaram/sbt-assembly-deps (Matei Zaharia, 2013-10-18, 1 file, -3/+10)
    Add SBT target to assemble dependencies.
    This pull request is an attempt to address the long assembly build times
    during development. Instead of rebuilding the assembly jar for every Spark
    change, this pull request adds a new SBT target `spark` that packages all
    the Spark modules and builds an assembly of the dependencies. So the
    workflow that should work now would be something like

    ```
    ./sbt/sbt spark # Doing this once should suffice
    ## Make changes
    ./sbt/sbt compile
    ./sbt/sbt test or ./spark-shell
    ```
| | * Rename SBT target to assemble-deps. (Shivaram Venkataraman, 2013-10-16, 1 file, -5/+5)
| | * Merge branch 'master' of https://github.com/apache/incubator-spark into sbt-assembly-deps (Shivaram Venkataraman, 2013-10-15, 2 files, -11/+44)
| | * Add a comment and exclude tools (Shivaram Venkataraman, 2013-10-11, 1 file, -1/+2)
| | * Add new SBT target for dependency assembly (Shivaram Venkataraman, 2013-10-09, 1 file, -1/+7)
| * Fixing spark streaming example and a bug in examples build. (Patrick Wendell, 2013-10-15, 1 file, -0/+1)
    - Examples assembly included a log4j.properties which clobbered Spark's
    - Example had an error where some classes weren't serializable
    - Did some other clean-up in this example
| * Merge pull request #19 from aarondav/master-zk (Matei Zaharia, 2013-10-10, 1 file, -0/+1)
    Standalone Scheduler fault tolerance using ZooKeeper

    This patch implements full distributed fault tolerance for standalone
    scheduler Masters. There is only one master Leader at a time, which is
    actively serving scheduling requests. If this Leader crashes, another
    master will eventually be elected, reconstruct the state from the first
    Master, and continue serving scheduling requests.

    Leader election is performed using the ZooKeeper leader election pattern.
    We try to minimize the use of ZooKeeper and the assumptions about
    ZooKeeper's behavior, so there is a layer of retries and session monitoring
    on top of the ZooKeeper client.

    Master failover follows directly from the single-node Master recovery via
    the file system (patch d5a96fe), save that the Master state is stored in
    ZooKeeper instead.

    Configuration:
      By default, no recovery mechanism is enabled (spark.deploy.recoveryMode = NONE).
      By setting spark.deploy.recoveryMode to ZOOKEEPER and setting
      spark.deploy.zookeeper.url to an appropriate ZooKeeper URL, ZooKeeper
      recovery mode is enabled.
      By setting spark.deploy.recoveryMode to FILESYSTEM and setting
      spark.deploy.recoveryDirectory to an appropriate directory accessible by
      the Master, we keep the behavior from d5a96fe.

    Additionally, places where a Master could be specified by a spark:// URL
    can now take comma-delimited lists to specify backup masters. Note that
    this is only used for registration of NEW Workers and application Clients.
    Once a Worker or Client has registered with the Master Leader, it is "in
    the system" and will never need to register again.
| | * Standalone Scheduler fault tolerance using ZooKeeper (Aaron Davidson, 2013-09-26, 1 file, -0/+1)
    (Same description as pull request #19 above, except that Master failover is
    described relative to the single-node recovery patch 194ba4b8.)

    Forthcoming: Documentation, tests (! - only ad hoc testing has been
    performed so far). I do not intend for this commit to be merged until tests
    are added, but this patch should still be mostly reviewable until then.
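In the Spark 0.8-era configuration style, enabling the recovery modes described above amounts to setting system properties before the Master starts. A minimal sketch; the property names come from the commit message, while the ZooKeeper hosts and directory are placeholders:

```scala
// ZooKeeper-backed recovery: one elected leader; standbys take over on failure.
System.setProperty("spark.deploy.recoveryMode", "ZOOKEEPER")
System.setProperty("spark.deploy.zookeeper.url", "zk1:2181,zk2:2181,zk3:2181")

// Alternatively, single-node recovery through a directory the Master can reach:
// System.setProperty("spark.deploy.recoveryMode", "FILESYSTEM")
// System.setProperty("spark.deploy.recoveryDirectory", "/var/spark/recovery")
```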
* Updating to latest akka 2.2.3, which fixes our only failing Driver Suite (Prashant Sharma, 2013-10-24, 1 file, -4/+4)
* Merge branch 'scala-2.10' of github.com:ScrapCodes/spark into scala-2.10 (Prashant Sharma, 2013-10-10, 1 file, -8/+14)
    Conflicts:
      core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
      project/SparkBuild.scala
| * Merge branch 'master' into wip-merge-master (Prashant Sharma, 2013-10-08, 1 file, -6/+8)
    Conflicts:
      bagel/pom.xml
      core/pom.xml
      core/src/test/scala/org/apache/spark/ui/UISuite.scala
      examples/pom.xml
      mllib/pom.xml
      pom.xml
      project/SparkBuild.scala
      repl/pom.xml
      streaming/pom.xml
      tools/pom.xml

    In Scala 2.10 a shorter representation is used for naming artifacts, so the
    artifacts were changed to use the shorter Scala version string, which is
    also made a property in the POM.
| | * Merge pull request #31 from sundeepn/branch-0.8 (Reynold Xin, 2013-10-07, 1 file, -5/+8)
    Resolving package conflicts with hadoop 0.23.9.
    Hadoop 0.23.9 has a package conflict with easymock's dependencies.
    (cherry picked from commit 023e3fdf008b3194a36985a07923df9aaf64e520)
    Signed-off-by: Reynold Xin <rxin@apache.org>
| * Merge branch 'master' into scala-2.10 (Prashant Sharma, 2013-10-05, 1 file, -0/+3)
    Conflicts:
      core/src/test/scala/org/apache/spark/DistributedSuite.scala
      project/SparkBuild.scala
| | * ask ivy/sbt to check local maven repo under ~/.m2 (Du Li, 2013-10-01, 1 file, -0/+3)
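A sketch of the resolver addition in sbt 0.12 syntax (the exact line in SparkBuild.scala may differ):

```scala
// Let Ivy resolve artifacts installed locally with `mvn install`.
resolvers += "Local Maven Repository" at "file://" + Path.userHome.absolutePath + "/.m2/repository"
```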
| * Merge branch 'master' into scala-2.10 (Prashant Sharma, 2013-10-01, 1 file, -2/+3)
    Conflicts:
      core/src/main/scala/org/apache/spark/ui/jobs/JobProgressUI.scala
      docs/_config.yml
      project/SparkBuild.scala
      repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala
| | * Removed scala -optimize flag. (Reynold Xin, 2013-09-26, 1 file, -1/+1)
| | * Merge pull request #930 from holdenk/master (Reynold Xin, 2013-09-26, 1 file, -1/+1)
    Add mapPartitionsWithIndex
| | | * Fix build on ubuntu (Holden Karau, 2013-09-14, 1 file, -1/+1)
| | * Update build version in master (Patrick Wendell, 2013-09-24, 1 file, -1/+1)
* scala 2.10 requires Java 1.6 (Martin Weindel, 2013-10-05, 1 file, -3/+3)
    Using Scala 2.10.3; resolved maven-scala-plugin warning.
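The corresponding build settings plausibly look like this build.sbt-style sketch (the javac flags are assumed; the Scala version comes from the commit message):

```scala
scalaVersion := "2.10.3"

// Scala 2.10 emits Java 6 bytecode, so compile Java sources to match.
javacOptions ++= Seq("-source", "1.6", "-target", "1.6")
```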
* Sync with master and some build fixes (Prashant Sharma, 2013-09-26, 1 file, -8/+8)
| * Bumping Mesos version to 0.13.0 (Patrick Wendell, 2013-09-15, 1 file, -1/+1)
* fixed maven build for scala 2.10 (Prashant Sharma, 2013-09-26, 1 file, -2/+1)
* Akka 2.2 migration (Prashant Sharma, 2013-09-22, 1 file, -9/+9)
* Merge branch 'master' of git://github.com/mesos/spark into scala-2.10 (Prashant Sharma, 2013-09-15, 2 files, -6/+31)
    Conflicts:
      core/src/main/scala/org/apache/spark/SparkContext.scala
      project/SparkBuild.scala