path: root/core
Commit message [Author, Age, Files, Lines]
* Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts [Prashant Sharma, 2014-01-02, 5 files, -7/+7]
|\
| |     Conflicts:
| |         bin/spark-shell
| |         core/pom.xml
| |         core/src/main/scala/org/apache/spark/SparkContext.scala
| |         core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
| |         core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala
| |         core/src/test/scala/org/apache/spark/DriverSuite.scala
| |         python/run-tests
| |         sbin/compute-classpath.sh
| |         sbin/spark-class
| |         sbin/stop-slaves.sh
| * deprecate "spark" script and SPARK_CLASSPATH environment variable [Andrew xia, 2013-10-12, 2 files, -2/+1]
| |
| * Merge branch 'reorgscripts' into scripts-reorg [shane-huang, 2013-09-27, 5 files, -7/+7]
| |\
| | * fix paths and change spark to use APP_MEM as application driver memory instead of SPARK_MEM, user should add application jars to SPARK_CLASSPATH [shane-huang, 2013-09-26, 1 file, -1/+1]
| | |     Signed-off-by: shane-huang <shengsheng.huang@intel.com>
| | * fix path [shane-huang, 2013-09-26, 1 file, -1/+1]
| | |     Signed-off-by: shane-huang <shengsheng.huang@intel.com>
| | * added spark-class and spark-executor to sbin [shane-huang, 2013-09-23, 4 files, -6/+6]
| | |     Signed-off-by: shane-huang <shengsheng.huang@intel.com>
* | | Fixed two uses of conf.get with no default value in Mesos [Matei Zaharia, 2014-01-01, 2 files, -2/+2]
| | |
* | | Miscellaneous fixes from code review. [Matei Zaharia, 2014-01-01, 44 files, -174/+195]
| | |     Also replaced SparkConf.getOrElse with just a "get" that takes a default value,
| | |     and added getInt, getLong, etc to make code that uses this simpler later on.
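A minimal sketch of the accessor pattern this commit describes: one "get" that accepts a default, with typed helpers layered on top. The Conf class, its method names, and the keys used below are hypothetical Python stand-ins chosen for illustration; they are not Spark's SparkConf API.

```python
class Conf:
    """Hypothetical string-keyed settings holder (illustrative only, not SparkConf)."""

    def __init__(self, settings=None):
        # Values are stored as strings, as they would be in a properties file.
        self._settings = {k: str(v) for k, v in (settings or {}).items()}

    def get(self, key, default=None):
        # Return the stored value; fall back to the default when the key is absent.
        if key in self._settings:
            return self._settings[key]
        if default is None:
            # No value and no default: fail loudly instead of returning None.
            raise KeyError(key)
        return default

    def get_int(self, key, default):
        # Typed helper built on get(), so callers do not repeat the conversion.
        return int(self.get(key, str(default)))

    def get_boolean(self, key, default):
        return self.get(key, str(default)).lower() == "true"


conf = Conf({"app.retries": "3"})       # hypothetical keys
retries = conf.get_int("app.retries", 1)    # key present: stored value wins
timeout = conf.get_int("app.timeout", 30)   # key absent: default is used
```

Calling get without a default on a missing key raises an error in this sketch, which is the kind of call site the "Fixed two uses of conf.get with no default value" commit above would have to guard against.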
* | | Merge remote-tracking branch 'apache/master' into conf2 [Matei Zaharia, 2014-01-01, 11 files, -21/+45]
|\ \ \
| | | |     Conflicts:
| | | |         core/src/main/scala/org/apache/spark/SparkContext.scala
| | | |         core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala
| | | |         core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala
| * \ \ Merge remote-tracking branch 'apache-github/master' into log4j-fix-2 [Patrick Wendell, 2014-01-01, 27 files, -311/+570]
| |\ \ \
| | | | |     Conflicts:
| | | | |         streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
| * | | | Adding outer checkout when initializing logging [Patrick Wendell, 2013-12-31, 1 file, -3/+5]
| | | | |
| * | | | Tiny typo fix [Patrick Wendell, 2013-12-31, 1 file, -2/+2]
| | | | |
| * | | | Removing use in test [Patrick Wendell, 2013-12-31, 1 file, -2/+0]
| | | | |
| * | | | Minor fixes [Patrick Wendell, 2013-12-30, 2 files, -3/+3]
| | | | |
| * | | | Removing initLogging entirely [Patrick Wendell, 2013-12-30, 9 files, -17/+21]
| | | | |
| * | | | Response to Shivaram's review [Patrick Wendell, 2013-12-30, 1 file, -1/+1]
| | | | |
| * | | | SPARK-1008: Logging improvements [Patrick Wendell, 2013-12-29, 2 files, -3/+23]
| | | | |     1. Adds a default log4j file that gets loaded if users haven't specified a log4j file.
| | | | |     2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings
| | | | |        after building with SBT (and I've seen similar warnings on the mailing list).
* | | | | Merge remote-tracking branch 'apache/master' into conf2 [Matei Zaharia, 2014-01-01, 10 files, -210/+450]
|\ \ \ \ \
| | |/ / /
| |/| | |
| | | | |     Conflicts:
| | | | |         project/SparkBuild.scala
| * | | | restore core/pom.xml file modification [liguoqiang, 2014-01-01, 1 file, -1351/+235]
| | | | |
| * | | | Merge pull request #73 from falaki/ApproximateDistinctCount [Reynold Xin, 2013-12-31, 10 files, -232/+1588]
| |\ \ \ \
| | | | | |     Approximate distinct count
| | | | | |
| | | | | |     Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions
| | | | | |     to approximately count distinct number of elements and distinct number of values per
| | | | | |     key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both
| | | | | |     functions take a parameter that controls the trade-off between accuracy and memory
| | | | | |     consumption. Also added Scala docs and test suites for both methods.
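To make the accuracy/memory trade-off concrete, here is a small self-contained HyperLogLog sketch. It is an illustrative reimplementation written for this note, not the stream-lib class the pull request uses, and the class name, the parameter p, and the SHA-1 based hash are assumptions. With m = 2^p registers the relative standard error is roughly 1.04 / sqrt(m), so each extra bit of p doubles memory and shrinks the error by about a factor of 1.41.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog counter (hypothetical; not stream-lib's implementation)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; the alpha formula below assumes m >= 128.
        if p < 7:
            raise ValueError("p must be at least 7 for this sketch")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the item's repr (an assumption made for this sketch).
        digest = hashlib.sha1(repr(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                      # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros > 0:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog(p=12)          # 4096 registers, about 1.6% expected relative error
for i in range(10000):
    hll.add(i)
approx = hll.estimate()
```

Adding the same elements again leaves every register unchanged, which is why the structure counts distinct values rather than total occurrences.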
| | * | | | Made the code more compact and readable [Hossein Falaki, 2013-12-31, 3 files, -23/+8]
| | | | | |
| | * | | | minor improvements [Hossein Falaki, 2013-12-31, 2 files, -4/+5]
| | | | | |
| | * | | | Added Java unit tests for countApproxDistinct and countApproxDistinctByKey [Hossein Falaki, 2013-12-30, 1 file, -0/+32]
| | | | | |
| | * | | | Added Java API for countApproxDistinct [Hossein Falaki, 2013-12-30, 1 file, -0/+11]
| | | | | |
| | * | | | Added Java API for countApproxDistinctByKey [Hossein Falaki, 2013-12-30, 1 file, -0/+36]
| | | | | |
| | * | | | Renamed countDistinct and countDistinctByKey methods to include Approx [Hossein Falaki, 2013-12-30, 5 files, -15/+15]
| | | | | |
| | * | | | Using origin version [Hossein Falaki, 2013-12-30, 194 files, -4956/+8379]
| | |\ \ \ \
| | * | | | | Removed superfluous abs call from test cases. [Hossein Falaki, 2013-12-10, 1 file, -2/+2]
| | | | | | |
| | * | | | | Made SerializableHyperLogLog Externalizable and added Kryo tests [Hossein Falaki, 2013-10-18, 2 files, -5/+10]
| | | | | | |
| | * | | | | Added stream-lib dependency to Maven build [Hossein Falaki, 2013-10-18, 1 file, -0/+4]
| | | | | | |
| | * | | | | Improved code style. [Hossein Falaki, 2013-10-17, 4 files, -15/+19]
| | | | | | |
| | * | | | | Fixed document typo [Hossein Falaki, 2013-10-17, 2 files, -4/+4]
| | | | | | |
| | * | | | | Added countDistinctByKey to PairRDDFunctions that counts the approximate number of unique values for each key in the RDD. [Hossein Falaki, 2013-10-17, 2 files, -0/+81]
| | | | | | |
| | * | | | | Added a countDistinct method to RDD that takes an accuracy parameter and returns the (approximate) number of distinct elements in the RDD. [Hossein Falaki, 2013-10-17, 2 files, -1/+38]
| | | | | | |
| | * | | | | Added a serializable wrapper for HyperLogLog [Hossein Falaki, 2013-10-17, 1 file, -0/+44]
| | | | | | |
* | | | | | | Merge remote-tracking branch 'apache/master' into conf2 [Matei Zaharia, 2013-12-31, 19 files, -107/+120]
|\| | | | | |
| | | | | | |     Conflicts:
| | | | | | |         core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala
| | | | | | |         streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
| | | | | | |         streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
| * | | | | | Merge pull request #238 from ngbinh/upgradeNetty [Patrick Wendell, 2013-12-31, 6 files, -42/+58]
| |\ \ \ \ \ \
| | | | | | | |     upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final
| | | | | | | |
| | | | | | | |     the changes are listed at https://github.com/netty/netty/wiki/New-and-noteworthy
| | * | | | | | Fix failed unit tests [Binh Nguyen, 2013-12-27, 3 files, -13/+24]
| | | | | | | |     Also clean up a bit.
| | * | | | | | Fix imports order [Binh Nguyen, 2013-12-24, 3 files, -5/+2]
| | | | | | | |
| | * | | | | | Remove import * and fix some formatting [Binh Nguyen, 2013-12-24, 2 files, -7/+4]
| | | | | | | |
| | * | | | | | upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final [Binh Nguyen, 2013-12-24, 5 files, -29/+40]
| | | |/ / / /
| | |/| | | |
| * | | | | | Merge pull request #289 from tdas/filestream-fix [Patrick Wendell, 2013-12-31, 5 files, -45/+44]
| |\ \ \ \ \ \
| | | | | | | |     Bug fixes for file input stream and checkpointing
| | | | | | | |
| | | | | | | |     - Fixed bugs in the file input stream that led the stream to fail due to transient
| | | | | | | |       HDFS errors (for example, listing files fails while a background thread is
| | | | | | | |       deleting them).
| | | | | | | |     - Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use
| | | | | | | |       SparkContext.hadoopConfiguration, to allow checkpoints to be written to any
| | | | | | | |       HDFS compatible store requiring special configuration.
| | | | | | | |     - Changed the API of SparkContext.setCheckpointDir() - eliminated the unnecessary
| | | | | | | |       'useExisting' parameter. Now SparkContext will always create a unique
| | | | | | | |       subdirectory within the user specified checkpoint directory. This is to ensure
| | | | | | | |       that previous checkpoint files are not accidentally overwritten.
| | | | | | | |     - Fixed bug where setting checkpoint directory as a relative local path caused the
| | | | | | | |       checkpointing to fail.
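The "unique subdirectory" behaviour can be sketched as follows. This is a local-filesystem illustration only: the function name make_checkpoint_dir is hypothetical, and using a random UUID for the subdirectory name and resolving relative paths with abspath are assumptions of this sketch, not details confirmed by the pull request.

```python
import os
import tempfile
import uuid


def make_checkpoint_dir(base_dir):
    """Create and return a fresh, uniquely named subdirectory under base_dir."""
    # Resolve relative paths up front so later steps never depend on the
    # process's current working directory.
    base = os.path.abspath(base_dir)
    # A random UUID makes a collision with an earlier run's directory very
    # unlikely, so existing checkpoint files are never reused or overwritten.
    unique = os.path.join(base, str(uuid.uuid4()))
    os.makedirs(unique)  # also creates base if missing; raises if unique already exists
    return unique


with tempfile.TemporaryDirectory() as base:
    first = make_checkpoint_dir(base)
    second = make_checkpoint_dir(base)
    both_exist = os.path.isdir(first) and os.path.isdir(second)
    same_parent = os.path.dirname(first) == os.path.dirname(second) == os.path.abspath(base)
```

Because every call yields a new directory, a flag such as 'useExisting' has nothing left to decide, which is consistent with the parameter being removed.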
| | * | | | | | Fixed comments and long lines based on comments on PR 289. [Tathagata Das, 2013-12-31, 1 file, -1/+2]
| | | | | | | |
| | * | | | | | Fixed Python API for sc.setCheckpointDir. Also other fixes based on Reynold's comments on PR 289. [Tathagata Das, 2013-12-24, 3 files, -5/+3]
| | | | | | | |
| | * | | | | | Merge branch 'apache-master' into filestream-fix [Tathagata Das, 2013-12-24, 30 files, -113/+395]
| | |\| | | | |
| | * | | | | | Merge branch 'scheduler-update' into filestream-fix [Tathagata Das, 2013-12-19, 111 files, -632/+740]
| | |\ \ \ \ \ \
| | | | | | | | |     Conflicts:
| | | | | | | | |         core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala
| | | | | | | | |         streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
| | | | | | | | |         streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
| | | | | | | | |         streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
| | | | | | | | |         streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala
| | * | | | | | | Fixed multiple file stream and checkpointing bugs. [Tathagata Das, 2013-12-11, 5 files, -43/+42]
| | | | | | | | |     - Made file stream more robust to transient failures.
| | | | | | | | |     - Changed Spark.setCheckpointDir API to not have the second 'useExisting'
| | | | | | | | |       parameter. Spark will always create a unique directory for checkpointing
| | | | | | | | |       underneath the directory provided to the function.
| | | | | | | | |     - Fixed bug wrt local relative paths as checkpoint directory.
| | | | | | | | |     - Made DStream and RDD checkpointing use SparkContext.hadoopConfiguration, so
| | | | | | | | |       that more HDFS compatible filesystems are supported for checkpointing.
| * | | | | | | | Merge pull request #308 from kayousterhout/stage_naming [Patrick Wendell, 2013-12-30, 7 files, -14/+18]
| |\ \ \ \ \ \ \ \
| | |_|_|_|_|/ / /
| |/| | | | | | |
| | | | | | | | |     Changed naming of StageCompleted event to be consistent
| | | | | | | | |
| | | | | | | | |     The rest of the SparkListener events are named with "SparkListener" as the
| | | | | | | | |     prefix of the name; this commit renames the StageCompleted event to
| | | | | | | | |     SparkListenerStageCompleted for consistency.
| | * | | | | | | Updated code style according to Patrick's comments [Kay Ousterhout, 2013-12-29, 1 file, -4/+2]
| | | | | | | | |
| | * | | | | | | Changed naming of StageCompleted event to be consistent [Kay Ousterhout, 2013-12-27, 7 files, -14/+20]
| | | | | | | | |     The rest of the SparkListener events are named with "SparkListener" as the
| | | | | | | | |     prefix of the name; this commit renames the StageCompleted event to
| | | | | | | | |     SparkListenerStageCompleted for consistency.