Commit message  (Author, Date, Files, Lines -removed/+added)
* Print output from spark-daemon only when it fails to launch  (Matei Zaharia, 2013-08-31, 5 files, -9/+15)
|
* Various web UI improvements:  (Matei Zaharia, 2013-08-31, 16 files, -164/+971)
|   - Use "fluid" layout that can expand to wide browser windows, instead of the old one's limit of 1200 px
|   - Remove unnecessary <hr> elements
|   - Switch back to Bootstrap's default theme and tweak progress bar colors
|   - Make headers more consistent between deploy and app UIs
|   - Replace some inline CSS with stylesheets
* Delete some code that was added back in a merge and print less info in spark-daemon  (Matei Zaharia, 2013-08-31, 2 files, -10/+0)
|
* Merge pull request #861 from AndreSchumacher/pyspark_sampling_function  (Matei Zaharia, 2013-08-31, 2 files, -7/+167)
|\
| |   Pyspark sampling function
| * RDD sample() and takeSample() prototypes for PySpark  (Andre Schumacher, 2013-08-28, 2 files, -7/+167)
| |
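
The commit above adds sampling to PySpark's RDD API. Below is a minimal usage sketch, assuming the sample()/takeSample() signatures as they later stabilized in PySpark (whether the seed argument was optional in this 0.8-era prototype is not confirmed here):

    from pyspark import SparkContext

    sc = SparkContext("local", "sampling-sketch")
    rdd = sc.parallelize(range(100))

    # sample() is a transformation: it returns a new RDD holding roughly 10%
    # of the elements, drawn without replacement; the seed makes it repeatable.
    sampled = rdd.sample(False, 0.1, 42)
    print(sampled.count())

    # takeSample() is an action: it returns exactly 5 elements as a local list.
    print(rdd.takeSample(False, 5, 42))

    sc.stop()
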
* | Merge pull request #870 from JoshRosen/spark-885  (Matei Zaharia, 2013-08-31, 1 file, -1/+5)
|\ \
| | |   Don't send SIGINT / ctrl-c to Py4J gateway subprocess
| * | Don't send SIGINT to Py4J gateway subprocess.  (Josh Rosen, 2013-08-28, 1 file, -1/+5)
| | |   This addresses SPARK-885, a usability issue where PySpark's Java gateway process would be killed if the user hit ctrl-c.
| | |   Note that SIGINT still won't cancel the running s…
| | |   This fix is based on http://stackoverflow.com/questions/5045771
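
The fix referenced above boils down to launching the gateway child process so that it ignores SIGINT, leaving ctrl-c in the parent shell harmless to it. A hedged, standalone sketch of that technique (not the actual PySpark launcher code; the command is a placeholder):

    import signal
    from subprocess import PIPE, Popen

    def launch_child_ignoring_sigint(command):
        def ignore_sigint():
            # Runs in the child between fork() and exec() (POSIX only):
            # the child ignores SIGINT from then on.
            signal.signal(signal.SIGINT, signal.SIG_IGN)

        return Popen(command, stdin=PIPE, stdout=PIPE, preexec_fn=ignore_sigint)

    # Placeholder command; the real Py4J gateway launch line is different.
    proc = launch_child_ignoring_sigint(["sleep", "60"])
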
* | | Merge pull request #869 from AndreSchumacher/subtract  (Matei Zaharia, 2013-08-30, 1 file, -0/+37)
|\ \ \
| | | |   PySpark: implementing subtractByKey(), subtract() and keyBy()
| * | | PySpark: implementing subtractByKey(), subtract() and keyBy()  (Andre Schumacher, 2013-08-28, 1 file, -0/+37)
| | | |
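
A short usage sketch of the three PySpark operations named above, assuming the API as it later stabilized (collect() order is not guaranteed, so results are sorted for readability):

    from pyspark import SparkContext

    sc = SparkContext("local", "set-ops-sketch")

    pairs = sc.parallelize([("a", 1), ("b", 4), ("c", 9)])
    other = sc.parallelize([("a", 7)])

    # subtractByKey(): drop pairs whose key also appears in `other`.
    print(sorted(pairs.subtractByKey(other).collect()))  # [('b', 4), ('c', 9)]

    # subtract(): set difference on whole elements.
    print(sorted(sc.parallelize([1, 2, 3]).subtract(sc.parallelize([2])).collect()))  # [1, 3]

    # keyBy(): build (key, value) pairs by applying a key function to each element.
    print(sorted(sc.parallelize(["fig", "apple"]).keyBy(len).collect()))  # [(3, 'fig'), (5, 'apple')]

    sc.stop()
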
* | | | Merge pull request #876 from mbautin/master_hadoop_rdd_conf  (Reynold Xin, 2013-08-30, 2 files, -1/+6)
|\ \ \ \
| | | | |   Make HadoopRDD's configuration accessible
| * | | | Also add getConf to NewHadoopRDD  (Mikhail Bautin, 2013-08-30, 1 file, -0/+3)
| | | | |
| * | | | Make HadoopRDD's configuration accessible  (Mikhail Bautin, 2013-08-30, 1 file, -1/+3)
|/ / / /
* | | | Merge pull request #875 from shivaram/build-fix  (Reynold Xin, 2013-08-30, 2 files, -11/+8)
|\ \ \ \
| | | | |   Fix broken build by removing addIntercept
| * | | | Fix broken build by removing addIntercept  (Shivaram Venkataraman, 2013-08-30, 2 files, -11/+8)
|/ / / /
* | | | Merge pull request #863 from shivaram/etrain-ridge  (Evan Sparks, 2013-08-29, 12 files, -297/+755)
|\ \ \ \
| | | | |   Adding linear regression and refactoring Ridge regression to use SGD
| * | | | Center & scale variables in Ridge, Lasso.  (Shivaram Venkataraman, 2013-08-25, 10 files, -228/+347)
| | | | |   Also add a unit test that checks if ridge regression lowers cross-validation error.
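
Centering and scaling as described here is plain feature standardization: shift each feature column to zero mean and rescale it to unit standard deviation before fitting. A small NumPy illustration of that preprocessing idea (not MLlib's actual implementation):

    import numpy as np

    def standardize(X):
        """Return (X_std, mean, std): each column shifted to zero mean, unit std."""
        mean = X.mean(axis=0)
        std = X.std(axis=0)
        std[std == 0.0] = 1.0          # leave constant columns unscaled
        return (X - mean) / std, mean, std

    X = np.array([[1.0, 10.0], [2.0, 200.0], [3.0, 30.0]])
    X_std, mu, sigma = standardize(X)
    print(X_std.mean(axis=0))          # approximately [0, 0]
    print(X_std.std(axis=0))           # approximately [1, 1]
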
| * | | | Fixing typos in Java tests, and addressing alignment issues.  (Evan Sparks, 2013-08-18, 4 files, -16/+16)
| | | | |
| * | | | Centralizing linear data generator and mllib regression tests to use it.  (Evan Sparks, 2013-08-18, 9 files, -282/+84)
| | | | |
| * | | | Adding Linear Regression, and refactoring Ridge Regression.  (Evan Sparks, 2013-08-18, 8 files, -176/+713)
| | | | |
* | | | | Merge pull request #819 from shivaram/sgd-cleanup  (Evan Sparks, 2013-08-29, 8 files, -48/+160)
|\ \ \ \ \
| | | | | |   Change SVM to use {0,1} labels
| * | | | | Add an option to turn off data validation, test it.  (Shivaram Venkataraman, 2013-08-25, 5 files, -18/+28)
| | | | | |   Also moves addIntercept to have default true to make it similar to validateData option
| * | | | | Specify label format in LogisticRegression.  (Shivaram Venkataraman, 2013-08-13, 1 file, -0/+6)
| | | | | |
| * | | | | Fix SVM model and unit test to work with {0,1}.  (Shivaram Venkataraman, 2013-08-13, 5 files, -12/+18)
| | | | | |   Also rename validateFuncs to validators.
| * | | | | Change SVM to use {0,1} labels.  (Shivaram Venkataraman, 2013-08-13, 7 files, -26/+116)
| | | | | |   Also add a data validation check to make sure classification labels are always 0 or 1 and add an appropriate test case.
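
The validation check described above amounts to rejecting any classification label that is not exactly 0 or 1 before training. A hedged sketch of that idea in isolation (names and data layout are illustrative, not Spark's):

    def validate_binary_labels(labeled_points):
        """labeled_points: iterable of (label, features) pairs."""
        for label, _ in labeled_points:
            if label not in (0.0, 1.0):
                raise ValueError("classification labels must be 0 or 1, got %r" % (label,))

    # Passes silently for valid data; a -1/+1 labelled point would raise.
    validate_binary_labels([(1.0, [0.5, 2.0]), (0.0, [1.5, -1.0])])
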
* | | | | | Merge pull request #857 from mateiz/assembly  (Matei Zaharia, 2013-08-29, 49 files, -310/+504)
|\ \ \ \ \ \
| | | | | | |   Change build and run instructions to use assemblies
| * | | | | | Update Maven docs  (Matei Zaharia, 2013-08-29, 1 file, -29/+26)
| | | | | | |
| * | | | | | Fix path to assembly in make-distribution.sh  (Matei Zaharia, 2013-08-29, 1 file, -1/+1)
| | | | | | |
| * | | | | | Update some build instructions because only sbt assembly and mvn package are now needed  (Matei Zaharia, 2013-08-29, 5 files, -15/+15)
| | | | | | |
| * | | | | | Update Maven build to create assemblies expected by new scripts  (Matei Zaharia, 2013-08-29, 11 files, -47/+222)
| | | | | | |   This includes the following changes:
| | | | | | |   - The "assembly" package now builds in Maven by default, and creates an assembly containing both hadoop-client and Spark, unlike the old BigTop distribution assembly that skipped hadoop-client
| | | | | | |   - There is now a bigtop-dist package to build the old BigTop assembly
| | | | | | |   - The repl-bin package is no longer built by default since the scripts don't rely on it; instead it can be enabled with -Prepl-bin
| | | | | | |   - Py4J is now included in the assembly/lib folder as a local Maven repo, so that the Maven package can link to it
| | | | | | |   - run-example now adds the original Spark classpath as well because the Maven examples assembly lists spark-core and such as provided
| | | | | | |   - The various Maven projects add a spark-yarn dependency correctly
| * | | | | | Don't use SPARK_LAUNCH_WITH_SCALA in pyspark  (Matei Zaharia, 2013-08-29, 1 file, -5/+0)
| | | | | | |
| * | | | | | Find assembly correctly in pyspark  (Matei Zaharia, 2013-08-29, 1 file, -1/+3)
| | | | | | |
| * | | | | | Fix finding of assembly JAR, as well as some pointers to ./run  (Matei Zaharia, 2013-08-29, 13 files, -17/+18)
| | | | | | |
| * | | | | | Provide more memory for tests  (Matei Zaharia, 2013-08-29, 2 files, -2/+2)
| | | | | | |
| * | | | | | Fix PySpark for assembly run and include it in dist  (Matei Zaharia, 2013-08-29, 6 files, -5/+41)
| | | | | | |
| * | | | | | Change build and run instructions to use assemblies  (Matei Zaharia, 2013-08-29, 25 files, -199/+187)
| | | | | | |   This commit makes Spark invocation saner by using an assembly JAR to find all of Spark's dependencies instead of adding all the JARs in lib_managed.
| | | | | | |   It also packages the examples into an assembly and uses that as SPARK_EXAMPLES_JAR.
| | | | | | |   Finally, it replaces the old "run" script with two better-named scripts: "run-examples" for examples, and "spark-class" for Spark internal classes (e.g. REPL, master, etc).
| | | | | | |   This is also designed to minimize the confusion people have in trying to use "run" to run their own classes; it's not meant to do that, but now at least if they look at it, they can modify run-examples to do a decent job for them.
| | | | | | |   As part of this, Bagel's examples are also now properly moved to the examples package instead of bagel.
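
The mechanism the commit message above describes, sketched hypothetically in Python: a launcher only has to locate the single assembly JAR rather than enumerate every JAR under lib_managed (the search pattern below is an assumption, not the scripts' actual layout):

    import glob
    import os

    def find_assembly_jar(spark_home):
        # Hypothetical path pattern; the real scripts know their own build layout.
        pattern = os.path.join(spark_home, "assembly", "target", "scala-*", "spark-assembly*.jar")
        matches = glob.glob(pattern)
        if not matches:
            raise RuntimeError("No assembly JAR found; build one first (e.g. with sbt assembly)")
        if len(matches) > 1:
            raise RuntimeError("Multiple assembly JARs found; remove all but one")
        return matches[0]
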
* | | | | | | Merge pull request #874 from jerryshao/fix-report-bug  (Reynold Xin, 2013-08-29, 1 file, -2/+2)
|\ \ \ \ \ \ \
| |/ / / / / /
|/| | | | | |   Fix removed block zero size log reporting
| * | | | | | Fix removed block zero size log reporting  (jerryshao, 2013-08-30, 1 file, -2/+2)
|/ / / / / /
* | | | | | Merge pull request #871 from pwendell/expose-local  (Patrick Wendell, 2013-08-28, 1 file, -1/+1)
|\ \ \ \ \ \
| | | | | | |   Expose `isLocal` in SparkContext.
| * | | | | | Make local variable public  (Patrick Wendell, 2013-08-28, 1 file, -1/+1)
| | | | | | |
* | | | | | | Merge pull request #873 from pwendell/master  (Matei Zaharia, 2013-08-28, 1 file, -1/+1)
|\ \ \ \ \ \ \
| |_|_|_|/ / /
|/| | | | | |   Hot fix for command runner
| * | | | | | Adding extra args  (Patrick Wendell, 2013-08-28, 1 file, -1/+1)
| | | | | | |
| * | | | | | Hot fix for command runner  (Patrick Wendell, 2013-08-28, 1 file, -1/+1)
| |/ / / / /
* | | | | | Merge pull request #865 from tgravescs/fixtmpdir  (Matei Zaharia, 2013-08-28, 5 files, -3/+49)
|\ \ \ \ \ \
| | | | | | |   Spark on Yarn should use yarn approved directories for spark.local.dir and tmp
| * | | | | | Change Executor to only look at the env variable SPARK_YARN_MODE  (Y.CORP.YAHOO.COM\tgraves, 2013-08-28, 1 file, -1/+1)
| | | | | | |
| * | | | | | Updated based on review comments.  (Y.CORP.YAHOO.COM\tgraves, 2013-08-27, 2 files, -18/+12)
| | | | | | |
| * | | | | | Allow for Executors to have different directories than the Spark Master for Yarn  (Y.CORP.YAHOO.COM\tgraves, 2013-08-27, 1 file, -0/+25)
| | | | | | |
| * | | | | | Update docs and remove old reference to --user option  (Y.CORP.YAHOO.COM\tgraves, 2013-08-26, 1 file, -3/+1)
| | | | | | |
| * | | | | | Throw exception if the yarn local dirs aren't set  (Y.CORP.YAHOO.COM\tgraves, 2013-08-26, 1 file, -1/+5)
| | | | | | |
| * | | | | | Change to use Yarn appropriate directories rather than /tmp or the user specified spark.local.dir  (Y.CORP.YAHOO.COM\tgraves, 2013-08-26, 3 files, -0/+25)
| |/ / / / /
* | | | | | Merge pull request #867 from tgravescs/yarnenvconfigs  (Matei Zaharia, 2013-08-27, 3 files, -2/+13)
|\ \ \ \ \ \
| |_|_|_|/ /
|/| | | | |   Spark on Yarn allow users to specify environment variables