spark - Mirror of Apache Spark

	Commit message (Collapse)	Author	Age	Files	Lines
...
\| \| \| * \| \| \|	Merge branch 'reorgscripts' into scripts-reorg	shane-huang	2013-09-27	40	-87/+175
\| \| \| \|\ \ \ \
\| \| \| \| * \| \| \|	rm bin/spark.cmd as we don't have windows test environment. Will add it later if needed	shane-huang	2013-09-26	1	-27/+0
\| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| \| \| * \| \| \|	fix paths and change spark to use APP_MEM as application driver memory instead of SPARK_MEM, user should add application jars to SPARK_CLASSPATH	shane-huang	2013-09-26	3	-35/+10
\| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| \| \| * \| \| \|	fix path	shane-huang	2013-09-26	1	-1/+1
\| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| \| \| * \| \| \|	add scripts in bin	shane-huang	2013-09-23	12	-17/+163
\| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| \| \| * \| \| \|	moved user scripts to bin folder	shane-huang	2013-09-23	11	-0/+0
\| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| \| \| * \| \| \|	add admin scripts to sbin	shane-huang	2013-09-23	14	-47/+47
\| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| \| \| * \| \| \|	added spark-class and spark-executor to sbin	shane-huang	2013-09-23	14	-22/+16
\| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| * \| \| \| \| \| \|	Merge pull request #285 from colorant/yarn-refactor	Patrick Wendell	2014-01-02	36	-1226/+189
\| \|\ \ \ \ \ \ \ \| \| \|_\|_\|_\|/ / / \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Yarn refactor
\| \| * \| \| \| \| \|	fix docs for yarn	Raymond Liu	2014-01-03	2	-5/+2
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	minor fix for loginfo	Raymond Liu	2014-01-03	1	-1/+1
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	move duplicate pom config into parent pom	Raymond Liu	2014-01-03	3	-179/+84
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Using name yarn-alpha/yarn instead of yarn-2.0/yarn-2.2	Raymond Liu	2014-01-03	18	-30/+30
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Add yarn/common/src/test dir in building script	Raymond Liu	2014-01-03	1	-0/+7
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Fix yarn/README.md	Raymond Liu	2014-01-03	1	-6/+4
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Clean up unused files for yarn	Raymond Liu	2014-01-03	4	-311/+0
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Fix pom for build yarn/2.x with yarn/common into one jar	Raymond Liu	2014-01-03	4	-36/+202
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Use unmanaged source dir to include common yarn code	Raymond Liu	2014-01-03	1	-11/+15
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	merge yarn/scheduler yarn/common code into one directory	Raymond Liu	2014-01-03	3	-0/+0
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Need to send dummy hello message to actually establish akka connection.	Raymond Liu	2014-01-03	2	-0/+4
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	A few clean up for yarn 2.0 code	Raymond Liu	2014-01-03	2	-8/+7
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Update maven build documentation	Raymond Liu	2014-01-03	2	-8/+4
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Fix yarn/README.md and update docs/running-on-yarn.md	Raymond Liu	2014-01-03	2	-3/+1
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Add README for yarn modules	Raymond Liu	2014-01-03	1	-0/+16
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	some code clean up for Yarn 2.2	Raymond Liu	2014-01-03	2	-3/+3
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Fix pom file for scala binary version	Raymond Liu	2014-01-03	6	-8/+8
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Fix yarn/assemble pom file	Raymond Liu	2014-01-03	2	-0/+75
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Change profile name new-yarn to hadoop2.2-yarn	Raymond Liu	2014-01-03	4	-4/+4
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Fix pom for yarn code reorganize commit	Raymond Liu	2014-01-03	8	-535/+264
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Reorganize yarn related codes into sub projects to remove duplicate files.	Raymond Liu	2014-01-03	30	-957/+337
\| \|/ / / / / /
\| \| \| \| \| * \|	Changes on top of Prashant's patch.	Patrick Wendell	2014-01-03	10	-72/+42
\| \| \| \| \| \| \|	Closes #316
\| \| \| \| \| * \|	Restored the previously removed test	Prashant Sharma	2014-01-03	1	-1/+12
\| \| \| \| \| \| \|
\| \| \| \| \| * \|	fixed review comments	Prashant Sharma	2014-01-03	9	-21/+44
\| \| \| \| \| \| \|
\| \| \| \| \| * \|	Merge branch 'master' into spark-1002-remove-jars	Prashant Sharma	2014-01-03	149	-1221/+2291
\| \| \| \| \| \|\ \ \| \| \|_\|_\|_\|/ / \| \|/\| \| \| \| \|
\| * \| \| \| \| \|	Merge pull request #323 from tgravescs/sparkconf_yarn_fix	Patrick Wendell	2014-01-02	10	-113/+101
\| \|\ \ \ \ \ \	fix spark on yarn after the sparkConf changes
\| \| \| \| \| \| \| \|	This fixes it so that spark on yarn now compiles and works after the sparkConf changes. There are also other issues I discovered along the way that are broken:
\| \| \| \| \| \| \| \|	- mvn builds for yarn don't assemble correctly
\| \| \| \| \| \| \| \|	- unset SPARK_EXAMPLES_JAR isn't handled properly anymore
\| \| \| \| \| \| \| \|	- I'm pretty sure spark.conf doesn't actually work as it's not distributed with yarn
\| \| \| \| \| \| \| \|	those things can be fixed in separate pr unless others disagree.
\| \| * \| \| \| \| \|	fix yarn-client	Thomas Graves	2014-01-02	2	-8/+10
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Fix yarn build after sparkConf changes	Thomas Graves	2014-01-02	10	-109/+95
\| \| \|/ / / / /
\| * \| \| \| \| \|	Merge pull request #320 from kayousterhout/erroneous_failed_msg	Reynold Xin	2014-01-02	2	-12/+15
\| \|\ \ \ \ \ \	Remove erroneous FAILED state for killed tasks.
\| \| \| \| \| \| \| \|	Currently, when tasks are killed, the Executor first sends a status update for the task with a "KILLED" state, and then sends a second status update with a "FAILED" state saying that the task failed due to an exception. The second FAILED state is misleading/unnecessary, and occurs due to a NonLocalReturnControl Exception that gets thrown due to the way we kill tasks. This commit eliminates that problem.
\| \| \| \| \| \| \| \|	I'm not at all sure that this is the best way to fix this problem, so alternate suggestions welcome.
\| \| \| \| \| \| \| \|	@rxin guessing you're the right person to look at this.
\| \| * \| \| \| \| \|	Remove erroneous FAILED state for killed tasks.	Kay Ousterhout	2014-01-02	2	-12/+15
\| \| \|/ / / / /	Currently, when tasks are killed, the Executor first sends a status update for the task with a "KILLED" state, and then sends a second status update with a "FAILED" state saying that the task failed due to an exception. The second FAILED state is misleading/unnecessary, and occurs due to a NonLocalReturnControl Exception that gets thrown due to the way we kill tasks. This commit eliminates that problem.
\| * \| \| \| \| \|	Merge pull request #297 from tdas/window-improvement	Patrick Wendell	2014-01-02	9	-172/+388
\| \|\ \ \ \ \ \	Improvements to DStream window ops and refactoring of Spark's CheckpointSuite
\| \| \| \| \| \| \| \|	- Added a new RDD - PartitionerAwareUnionRDD. Using this RDD, one can take multiple RDDs partitioned by the same partitioner and unify them into a single RDD while preserving the partitioner. So m RDDs with p partitions each will be unified to a single RDD with p partitions and the same partitioner. The preferred location for each partition of the unified RDD will be the most common preferred location of the corresponding partitions of the parent RDDs. For example, location of partition 0 of the unified RDD will be where most of partition 0 of the parent RDDs are located.
\| \| \| \| \| \| \| \|	- Improved the performance of DStream's reduceByKeyAndWindow and groupByKeyAndWindow. Both these operations work by doing per-batch reduceByKey/groupByKey and then using PartitionerAwareUnionRDD to union the RDDs across the window. This eliminates a shuffle related to the window operation, which can reduce batch processing time by 30-40% for simple workloads.
\| \| \| \| \| \| \| \|	- Fixed bugs and simplified Spark's CheckpointSuite. Some of the tests were incorrect and unreliable. Added missing tests for ZippedRDD. I can go into greater detail if necessary.
\| \| \| \| \| \| \| \|	- Added mapSideCombine option to combineByKeyAndWindow.
\| \| * \| \| \| \| \|	Added Apache boilerplate and class docs to PartitionerAwareUnionRDD.	Tathagata Das	2013-12-26	1	-3/+33
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Removed unnecessary options from WindowedDStream.	Tathagata Das	2013-12-26	1	-5/+3
\| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \|	Merge branch 'apache-master' into window-improvement	Tathagata Das	2013-12-26	50	-1548/+1841
\| \| \|\ \ \ \ \ \
\| \| * \ \ \ \ \ \	Merge branch 'master' into window-improvement	Tathagata Das	2013-12-26	37	-123/+465
\| \| \|\ \ \ \ \ \ \
\| \| * \| \| \| \| \| \| \|	Updated groupByKeyAndWindow to be computed incrementally, and added mapSideCombine to combineByKeyAndWindow.	Tathagata Das	2013-12-26	5	-12/+34
\| \| * \| \| \| \| \| \| \|	Fixed bug in PartitionAwareUnionRDD	Tathagata Das	2013-12-26	1	-6/+9
\| \| \| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \| \| \|	Merge branch 'scheduler-update' into window-improvement	Tathagata Das	2013-12-23	4	-5/+32
\| \| \|\ \ \ \ \ \ \ \
\| \| * \| \| \| \| \| \| \| \|	Added tests for PartitionerAwareUnionRDD in the CheckpointSuite. Refactored CheckpointSuite to make the tests simpler and more reliable. Added missing test for ZippedRDD.	Tathagata Das	2013-12-20	3	-170/+231
\| \| * \| \| \| \| \| \| \| \|	Merge branch 'scheduler-update' into window-improvement	Tathagata Das	2013-12-19	306	-4277/+10714
\| \| \|\ \ \ \ \ \ \ \ \	Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/WindowedDStream.scala
\| \| * \| \| \| \| \| \| \| \| \|	Added flag in window operation to use partition aware union.	Tathagata Das	2013-11-21	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|