| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
| |
|
| |
|
| |
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
spark-915-segregate-scripts
Conflicts:
bin/spark-shell
core/pom.xml
core/src/main/scala/org/apache/spark/SparkContext.scala
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala
core/src/test/scala/org/apache/spark/DriverSuite.scala
python/run-tests
sbin/compute-classpath.sh
sbin/spark-class
sbin/stop-slaves.sh
|
| | |
|
| | |
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | | |
later if needed
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
instead of SPARK_MEM, user should add application jars to SPARK_CLASSPATH
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
SPARK-544. Migrate configuration to a SparkConf class
This is still a work in progress based on Prashant and Evan's code. So far I've done the following:
- Got rid of global SparkContext.globalConf
- Passed SparkConf to serializers and compression codecs
- Made SparkConf public instead of private[spark]
- Improved API of SparkContext and SparkConf
- Switched executor environment vars to be passed through SparkConf
- Fixed some places that were still using system properties
- Fixed some tests, though others are still failing
This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation). But the API at least is hopefully ready for review. Unfortunately there was a lot of global stuff before due to a "SparkContext.globalConf" method that let you set a "default" configuration of sorts, which meant I had to make some pretty big changes.
|
| | | | |
|
| | | | |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Also replaced SparkConf.getOrElse with just a "get" that takes a default
value, and added getInt, getLong, etc to make code that uses this
simpler later on.
|
| |\ \ \
| |/ / /
|/| | |
| | | |
| | | |
| | | |
| | | | |
Conflicts:
core/src/main/scala/org/apache/spark/SparkContext.scala
core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
SPARK-1008: Logging improvments
1. Adds a default log4j file that gets loaded if users haven't specified a log4j file.
2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings
after building with SBT (and I've seen similar warnings on the mailing list).
|
| |\ \ \ \
| |/ / / /
|/| | | |
| | | | |
| | | | | |
Conflicts:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
|
| | | | | |
|
| | | | | |
|
| | | | | |
|
| | | | | |
|
| | | | | |
|
| | | | | |
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
1. Adds a default log4j file that gets loaded if users haven't specified a log4j file.
2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings
after building with SBT (and I've seen similar warnings on the mailing list).
|
| | |\ \ \
| |_|/ / /
|/| | | |
| | | | |
| | | | | |
Conflicts:
project/SparkBuild.scala
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | | |
restore core/pom.xml file modification
|
|/ / / / / |
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Approximate distinct count
Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions to approximately count distinct number of elements and distinct number of values per key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both functions take a parameter that controls the trade-off between accuracy and memory consumption. Also added Scala docs and test suites for both methods.
|
| | | | | | |
|
| | | | | | |
|
| | | | | | |
|
| | | | | | |
|
| | | | | | |
|
| | | | | | |
|
| | | | | | |
|
| |\ \ \ \ \ |
|
| | | | | | | |
|
| | | | | | | |
|
| | | | | | | |
|
| | | | | | | |
|
| | | | | | | |
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
support.
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
number of unique values for each key in the RDD.
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
and returns the (approximate) number of distinct elements in the RDD.
|