aboutsummaryrefslogtreecommitdiff
path: root/core/src
Commit message (Collapse)AuthorAgeFilesLines
* [SPARK-15086][CORE][STREAMING] Deprecate old Java accumulator APISean Owen2016-06-123-7/+17
| | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? - Deprecate old Java accumulator API; should use Scala now - Update Java tests and examples - Don't bother testing old accumulator API in Java 8 (too) - (fix a misspelling too) ## How was this patch tested? Jenkins tests Author: Sean Owen <sowen@cloudera.com> Closes #13606 from srowen/SPARK-15086.
* [SPARK-15806][DOCUMENTATION] update doc for SPARK_MASTER_IPbomeng2016-06-121-1/+7
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? SPARK_MASTER_IP is a deprecated environment variable. It is replaced by SPARK_MASTER_HOST according to MasterArguments.scala. ## How was this patch tested? Manually verified. Author: bomeng <bmeng@us.ibm.com> Closes #13543 from bomeng/SPARK-15806.
* [SPARK-15878][CORE][TEST] fix cleanup in EventLoggingListenerSuite and ↵Imran Rashid2016-06-122-4/+4
| | | | | | | | | | | | | | | | ReplayListenerSuite ## What changes were proposed in this pull request? These tests weren't properly using `LocalSparkContext` so weren't cleaning up correctly when tests failed. ## How was this patch tested? Jenkins. Author: Imran Rashid <irashid@cloudera.com> Closes #13602 from squito/SPARK-15878_cleanup_replaylistener.
* [SPARK-15860] Metrics for codegen size and perfEric Liang2016-06-113-5/+56
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? Adds codahale metrics for the codegen source text size and how long it takes to compile. The size is particularly interesting, since the JVM does have hard limits on how large methods can get. To simplify, I added the metrics under a statically-initialized source that is always registered with SparkEnv. ## How was this patch tested? Unit tests Author: Eric Liang <ekl@databricks.com> Closes #13586 from ericl/spark-15860.
* [SPARK-14851][CORE] Support radix sort with nullable longsEric Liang2016-06-116-40/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? This adds support for radix sort of nullable long fields. When a sort field is null and radix sort is enabled, we keep nulls in a separate region of the sort buffer so that radix sort does not need to deal with them. This also has performance benefits when sorting smaller integer types, since the current representation of nulls in two's complement (Long.MIN_VALUE) otherwise forces a full-width radix sort. This strategy for nulls does mean the sort is no longer stable. cc davies ## How was this patch tested? Existing randomized sort tests for correctness. I also tested some TPCDS queries and there does not seem to be any significant regression for non-null sorts. Some test queries (best of 5 runs each). Before change: scala> val start = System.nanoTime; spark.range(5000000).selectExpr("if(id > 5, cast(hash(id) as long), NULL) as h").coalesce(1).orderBy("h").collect(); (System.nanoTime - start) / 1e6 start: Long = 3190437233227987 res3: Double = 4716.471091 After change: scala> val start = System.nanoTime; spark.range(5000000).selectExpr("if(id > 5, cast(hash(id) as long), NULL) as h").coalesce(1).orderBy("h").collect(); (System.nanoTime - start) / 1e6 start: Long = 3190367870952791 res4: Double = 2981.143045 Author: Eric Liang <ekl@databricks.com> Closes #13161 from ericl/sc-2998.
* [SPARK-15879][DOCS][UI] Update logo in UI and docs to add "Apache"Sean Owen2016-06-112-0/+0
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? Use new Spark logo including "Apache" (now, with crushed PNGs). Remove old unreferenced logo files. ## How was this patch tested? Manual check of generated HTML site and Spark UI. I searched for references to the deleted files to make sure they were not used. Author: Sean Owen <sowen@cloudera.com> Closes #13609 from srowen/SPARK-15879.
* [SPARK-15875] Try to use Seq.isEmpty and Seq.nonEmpty instead of Seq.length ↵wangyang2016-06-103-5/+5
| | | | | | | | | | | | | | | | == 0 and Seq.length > 0 ## What changes were proposed in this pull request? In scala, immutable.List.length is an expensive operation so we should avoid using Seq.length == 0 or Seq.lenth > 0, and use Seq.isEmpty and Seq.nonEmpty instead. ## How was this patch tested? existing tests Author: wangyang <wangyang@haizhi.com> Closes #13601 from yangw1234/isEmpty.
* Revert [SPARK-14485][CORE] ignore task finished for executor lostKay Ousterhout2016-06-101-13/+1
| | | | | | | | | | | This reverts commit 695dbc816a6d70289abeb145cb62ff4e62b3f49b. This change is being reverted because it hurts performance of some jobs, and only helps in a narrow set of cases. For more discussion, refer to the JIRA. Author: Kay Ousterhout <kayousterhout@gmail.com> Closes #13580 from kayousterhout/revert-SPARK-14485.
* [SPARK-15866] Rename listAccumulator collectionAccumulatorReynold Xin2016-06-103-14/+19
| | | | | | | | | | | | ## What changes were proposed in this pull request? SparkContext.listAccumulator, by Spark's convention, makes it sound like "list" is a verb and the method should return a list of accumulators. This patch renames the method and the class collection accumulator. ## How was this patch tested? Updated test case to reflect the names. Author: Reynold Xin <rxin@databricks.com> Closes #13594 from rxin/SPARK-15866.
* [SPARK-15794] Should truncate toString() of very wide plansEric Liang2016-06-092-0/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? With very wide tables, e.g. thousands of fields, the plan output is unreadable and often causes OOMs due to inefficient string processing. This truncates all struct and operator field lists to a user configurable threshold to limit performance impact. It would also be nice to optimize string generation to avoid these sort of O(n^2) slowdowns entirely (i.e. use StringBuilder everywhere including expressions), but this is probably too large of a change for 2.0 at this point, and truncation has other benefits for usability. ## How was this patch tested? Added a microbenchmark that covers this case particularly well. I also ran the microbenchmark while varying the truncation threshold. ``` numFields = 5 wide shallowly nested struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ 2000 wide x 50 rows (write in-mem) 2336 / 2558 0.0 23364.4 0.1X numFields = 25 wide shallowly nested struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ 2000 wide x 50 rows (write in-mem) 4237 / 4465 0.0 42367.9 0.1X numFields = 100 wide shallowly nested struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ 2000 wide x 50 rows (write in-mem) 10458 / 11223 0.0 104582.0 0.0X numFields = Infinity wide shallowly nested struct field r/w: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ [info] java.lang.OutOfMemoryError: Java heap space ``` Author: Eric Liang <ekl@databricks.com> Author: Eric Liang <ekhliang@gmail.com> Closes #13537 from ericl/truncated-string.
* [SPARK-15735] Allow specifying min time to run in microbenchmarksEric Liang2016-06-081-37/+72
| | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? This makes microbenchmarks run for at least 2 seconds by default, to allow some time for jit compilation to kick in. ## How was this patch tested? Tested manually with existing microbenchmarks. This change is backwards compatible in that existing microbenchmarks which specified numIters per-case will still run exactly that number of iterations. Microbenchmarks which previously overrode defaultNumIters now override minNumIters. cc hvanhovell Author: Eric Liang <ekl@databricks.com> Author: Eric Liang <ekhliang@gmail.com> Closes #13472 from ericl/spark-15735.
* [SPARK-14485][CORE] ignore task finished for executor lost and removed by driverzhonghaihua2016-06-071-1/+13
| | | | | | | | | | | | | Now, when executor is removed by driver with heartbeats timeout, driver will re-queue the task on this executor and send a kill command to cluster to kill this executor. But, in a situation, the running task of this executor is finished and return result to driver before this executor killed by kill command sent by driver. At this situation, driver will accept the task finished event and ignore speculative task and re-queued task. But, as we know, this executor has removed by driver, the result of this finished task can not save in driver because the BlockManagerId has also removed from BlockManagerMaster by driver. So, the result data of this stage is not complete, and then, it will cause fetch failure. For more details, [link to jira issues SPARK-14485](https://issues.apache.org/jira/browse/SPARK-14485) This PR introduce a mechanism to ignore this kind of task finished. N/A Author: zhonghaihua <793507405@qq.com> Closes #12258 from zhonghaihua/ignoreTaskFinishForExecutorLostAndRemovedByDriver.
* [SPARK-15783][CORE] still some flakiness in these blacklist tests so ignore ↵Imran Rashid2016-06-062-3/+8
| | | | | | | | | | | | | | | | for now ## What changes were proposed in this pull request? There is still some flakiness in BlacklistIntegrationSuite, so turning it off for the moment to avoid breaking more builds -- will turn it back with more fixes. ## How was this patch tested? jenkins. Author: Imran Rashid <irashid@cloudera.com> Closes #13528 from squito/ignore_blacklist.
* [SPARK-14279][BUILD] Pick the spark version from pomDhruve Ashar2016-06-062-3/+59
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? Change the way spark picks up version information. Also embed the build information to better identify the spark version running. More context can be found here : https://github.com/apache/spark/pull/12152 ## How was this patch tested? Ran the mvn and sbt builds to verify the version information was being displayed correctly on executing <code>spark-submit --version </code> ![image](https://cloud.githubusercontent.com/assets/7732317/15197251/f7c673a2-1795-11e6-8b2f-88f2a70cf1c1.png) Author: Dhruve Ashar <dhruveashar@gmail.com> Closes #13061 from dhruve/impr/SPARK-14279.
* [MINOR] Fix Typos 'an -> a'Zheng RuiFeng2016-06-0611-16/+16
| | | | | | | | | | | | | | | ## What changes were proposed in this pull request? `an -> a` Use cmds like `find . -name '*.R' | xargs -i sh -c "grep -in ' an [^aeiou]' {} && echo {}"` to generate candidates, and review them one by one. ## How was this patch tested? manual tests Author: Zheng RuiFeng <ruifengz@foxmail.com> Closes #13515 from zhengruifeng/an_a.
* [SPARK-15723] Fixed local-timezone-brittle test where short-timezone form ↵Brett Randall2016-06-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | "EST" is … ## What changes were proposed in this pull request? Stop using the abbreviated and ambiguous timezone "EST" in a test, since it is machine-local default timezone dependent, and fails in different timezones. Fixed [SPARK-15723](https://issues.apache.org/jira/browse/SPARK-15723). ## How was this patch tested? Note that to reproduce this problem in any locale/timezone, you can modify the scalatest-maven-plugin argLine to add a timezone: <argLine>-ea -Xmx3g -XX:MaxPermSize=${MaxPermGen} -XX:ReservedCodeCacheSize=${CodeCacheSize} -Duser.timezone="Australia/Sydney"</argLine> and run $ mvn test -DwildcardSuites=org.apache.spark.status.api.v1.SimpleDateParamSuite -Dtest=none. Equally this will fix it in an effected timezone: <argLine>-ea -Xmx3g -XX:MaxPermSize=${MaxPermGen} -XX:ReservedCodeCacheSize=${CodeCacheSize} -Duser.timezone="America/New_York"</argLine> To test the fix, apply the above change to `pom.xml` to set test TZ to `Australia/Sydney`, and confirm the test now passes. Author: Brett Randall <javabrett@gmail.com> Closes #13462 from javabrett/SPARK-15723-SimpleDateParamSuite.
* [SPARK-15391] [SQL] manage the temporary memory of timsortDavies Liu2016-06-039-54/+100
| | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? Currently, the memory for temporary buffer used by TimSort is always allocated as on-heap without bookkeeping, it could cause OOM both in on-heap and off-heap mode. This PR will try to manage that by preallocate it together with the pointer array, same with RadixSort. It both works for on-heap and off-heap mode. This PR also change the loadFactor of BytesToBytesMap to 0.5 (it was 0.70), it enables use to radix sort also makes sure that we have enough memory for timsort. ## How was this patch tested? Existing tests. Author: Davies Liu <davies@databricks.com> Closes #13318 from davies/fix_timsort.
* [SPARK-15681][CORE] allow lowercase or mixed case log level string when ↵Xin Wu2016-06-032-7/+24
| | | | | | | | | | | | | | | | calling sc.setLogLevel ## What changes were proposed in this pull request? Currently `SparkContext API setLogLevel(level: String) `can not handle lower case or mixed case input string. But `org.apache.log4j.Level.toLevel` can take lowercase or mixed case. This PR is to allow case-insensitive user input for the log level. ## How was this patch tested? A unit testcase is added. Author: Xin Wu <xinwu@us.ibm.com> Closes #13422 from xwu0226/reset_loglevel.
* [SPARK-15737][CORE] fix jetty warningbomeng2016-06-032-0/+2
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? After upgrading Jetty to 9.2, we always see "WARN org.eclipse.jetty.server.handler.AbstractHandler: No Server set for org.eclipse.jetty.server.handler.ErrorHandler" while running any test cases. This PR will fix it. ## How was this patch tested? The existing test cases will cover it. Author: bomeng <bmeng@us.ibm.com> Closes #13475 from bomeng/SPARK-15737.
* [SPARK-15714][CORE] Fix flaky o.a.s.scheduler.BlacklistIntegrationSuiteImran Rashid2016-06-033-25/+54
| | | | | | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? BlacklistIntegrationSuite (introduced by SPARK-10372) is a bit flaky because of some race conditions: 1. Failed jobs might have non-empty results, because the resultHandler will be invoked for successful tasks (if there are task successes before failures) 2. taskScheduler.taskIdToTaskSetManager must be protected by a lock on taskScheduler (1) has failed a handful of jenkins builds recently. I don't think I've seen (2) in jenkins, but I've run into with some uncommitted tests I'm working on where there are lots more tasks. While I was in there, I also made an unrelated fix to `runningTasks`in the test framework -- there was a pointless `O(n)` operation to remove completed tasks, could be `O(1)`. ## How was this patch tested? I modified the o.a.s.scheduler.BlacklistIntegrationSuite to have it run the tests 1k times on my laptop. It failed 11 times before this change, and none with it. (Pretty sure all the failures were problem (1), though I didn't check all of them). Also the full suite of tests via jenkins. Author: Imran Rashid <irashid@cloudera.com> Closes #13454 from squito/SPARK-15714.
* [SPARK-15736][CORE] Gracefully handle loss of DiskStore filesJosh Rosen2016-06-023-6/+66
| | | | | | | | | | | | If an RDD partition is cached on disk and the DiskStore file is lost, then reads of that cached partition will fail and the missing partition is supposed to be recomputed by a new task attempt. In the current BlockManager implementation, however, the missing file does not trigger any metadata updates / does not invalidate the cache, so subsequent task attempts will be scheduled on the same executor and the doomed read will be repeatedly retried, leading to repeated task failures and eventually a total job failure. In order to fix this problem, the executor with the missing file needs to properly mark the corresponding block as missing so that it stops advertising itself as a cache location for that block. This patch fixes this bug and adds an end-to-end regression test (in `FailureSuite`) and a set of unit tests (`in BlockManagerSuite`). Author: Josh Rosen <joshrosen@databricks.com> Closes #13473 from JoshRosen/handle-missing-cache-files.
* [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocksPete Robbins2016-06-022-1/+9
| | | | | | | | | | | | | ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with only 2 cores ## How was this patch tested? Spark test builds Author: Pete Robbins <robbinspg@gmail.com> Closes #13355 from robbinspg/SPARK-13906.
* [SPARK-15322][SQL][FOLLOWUP] Use the new long accumulator for old int ↵hyukjinkwon2016-06-021-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | accumulators. ## What changes were proposed in this pull request? This PR corrects the remaining cases for using old accumulators. This does not change some old accumulator usages below: - `ImplicitSuite.scala` - Tests dedicated to old accumulator, for implicits with `AccumulatorParam` - `AccumulatorSuite.scala` - Tests dedicated to old accumulator - `JavaSparkContext.scala` - For supporting old accumulators for Java API. - `debug.package.scala` - Usage with `HashSet[String]`. Currently, it seems no implementation for this. I might be able to write an anonymous class for this but I didn't because I think it is not worth writing a lot of codes only for this. - `SQLMetricsSuite.scala` - This uses the old accumulator for checking type boxing. It seems new accumulator does not require type boxing for this case whereas the old one requires (due to the use of generic). ## How was this patch tested? Existing tests cover this. Author: hyukjinkwon <gurwls223@gmail.com> Closes #13434 from HyukjinKwon/accum.
* [SPARK-15671] performance regression CoalesceRDD.pickBin with large #…Thomas Graves2016-06-011-3/+7
| | | | | | | | | | | | | | | | | I was running a 15TB join job with 202000 partitions. It looks like the changes I made to CoalesceRDD in pickBin() are really slow with that large of partitions. The array filter with that many elements just takes to long. It took about an hour for it to pickBins for all the partitions. original change: https://github.com/apache/spark/commit/83ee92f60345f016a390d61a82f1d924f64ddf90 Just reverting the pickBin code back to get currpreflocs fixes the issue After reverting the pickBin code the coalesce takes about 10 seconds so for now it makes sense to revert those changes and we can look at further optimizations later. Tested this via RDDSuite unit test and manually testing the very large job. Author: Thomas Graves <tgraves@prevailsail.corp.gq1.yahoo.com> Closes #13443 from tgravescs/SPARK-15671.
* [SPARK-15495][SQL] Improve the explain output for Aggregation operatorSean Zhong2016-06-011-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? This PR improves the explain output of Aggregator operator. SQL: ``` Seq((1,2,3)).toDF("a", "b", "c").createTempView("df1") spark.sql("cache table df1") spark.sql("select count(a), count(c), b from df1 group by b").explain() ``` **Before change:** ``` *TungstenAggregate(key=[b#8], functions=[count(1),count(1)], output=[count(a)#79L,count(c)#80L,b#8]) +- Exchange hashpartitioning(b#8, 200), None +- *TungstenAggregate(key=[b#8], functions=[partial_count(1),partial_count(1)], output=[b#8,count#98L,count#99L]) +- InMemoryTableScan [b#8], InMemoryRelation [a#7,b#8,c#9], true, 10000, StorageLevel(disk=true, memory=true, offheap=false, deserialized=true, replication=1), LocalTableScan [a#7,b#8,c#9], [[1,2,3]], Some(df1) `````` **After change:** ``` *Aggregate(key=[b#8], functions=[count(1),count(1)], output=[count(a)#79L,count(c)#80L,b#8]) +- Exchange hashpartitioning(b#8, 200), None +- *Aggregate(key=[b#8], functions=[partial_count(1),partial_count(1)], output=[b#8,count#98L,count#99L]) +- InMemoryTableScan [b#8], InMemoryRelation [a#7,b#8,c#9], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas), LocalTableScan [a#7,b#8,c#9], [[1,2,3]], Some(df1) ``` ## How was this patch tested? Manual test and existing UT. Author: Sean Zhong <seanzhong@databricks.com> Closes #13363 from clockfly/verbose3.
* [SPARK-15601][CORE] CircularBuffer's toString() to print only the contents ↵Tejas Patil2016-05-312-25/+43
| | | | | | | | | | | | | | | | | | | | written if buffer isn't full ## What changes were proposed in this pull request? 1. The class allocated 4x space than needed as it was using `Int` to store the `Byte` values 2. If CircularBuffer isn't full, currently toString() will print some garbage chars along with the content written as is tries to print the entire array allocated for the buffer. The fix is to keep track of buffer getting full and don't print the tail of the buffer if it isn't full (suggestion by sameeragarwal over https://github.com/apache/spark/pull/12194#discussion_r64495331) 3. Simplified `toString()` ## How was this patch tested? Added new test case Author: Tejas Patil <tejasp@fb.com> Closes #13351 from tejasapatil/circular_buffer.
* [SPARK-15670][JAVA API][SPARK CORE] ↵WeichenXu2016-05-311-0/+4
| | | | | | | | | | | | | | | | label_accumulator_deprecate_in_java_spark_context ## What changes were proposed in this pull request? Add deprecate annotation for acumulator V1 interface in JavaSparkContext class ## How was this patch tested? N/A Author: WeichenXu <WeichenXu123@outlook.com> Closes #13412 from WeichenXu123/label_accumulator_deprecate_in_java_spark_context.
* [CORE][DOC][MINOR] typos + linksJacek Laskowski2016-05-311-2/+2
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? A very tiny change to javadoc (which I don't mind if gets merged with a bigger change). I've just found it annoying and couldn't resist proposing a pull request. Sorry srowen and rxin. ## How was this patch tested? Manual build Author: Jacek Laskowski <jacek@japila.pl> Closes #13383 from jaceklaskowski/memory-consumer.
* [SPARK-15662][SQL] Add since annotation for classes in sql.catalogReynold Xin2016-05-312-2/+2
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? This patch does a few things: 1. Adds since version annotation to methods and classes in sql.catalog. 2. Fixed a typo in FilterFunction and a whitespace issue in spark/api/java/function/package.scala 3. Added "database" field to Function class. ## How was this patch tested? Updated unit test case for "database" field in Function class. Author: Reynold Xin <rxin@databricks.com> Closes #13406 from rxin/SPARK-15662.
* [CORE][MINOR][DOC] Removing incorrect scaladocJacek Laskowski2016-05-311-3/+0
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? I don't think the method will ever throw an exception so removing a false comment. Sorry srowen and rxin again -- I simply couldn't resist. I wholeheartedly support merging the change with a bigger one (and trashing this PR). ## How was this patch tested? Manual build Author: Jacek Laskowski <jacek@japila.pl> Closes #13384 from jaceklaskowski/blockinfomanager.
* [SPARK-15641] HistoryServer to not show invalid date for incomplete applicationcatapan2016-05-311-1/+2
| | | | | | | | | | | | | ## What changes were proposed in this pull request? For incomplete applications in HistoryServer, the complete column will show "-" instead of incorrect date. ## How was this patch tested? manually tested. Author: catapan <cedarpan86@gmail.com> Author: Ziying Pan <cedarpan@Ziyings-MacBook.local> Closes #13396 from catapan/SPARK-15641_fix_completed_column.
* [SPARK-15638][SQL] Audit Dataset, SparkSession, and SQLContextReynold Xin2016-05-306-10/+11
| | | | | | | | | | | | ## What changes were proposed in this pull request? This patch contains a list of changes as a result of my auditing Dataset, SparkSession, and SQLContext. The patch audits the categorization of experimental APIs, function groups, and deprecations. For the detailed list of changes, please see the diff. ## How was this patch tested? N/A Author: Reynold Xin <rxin@databricks.com> Closes #13370 from rxin/SPARK-15638.
* [SPARK-10530][CORE] Kill other task attempts when one taskattempt belonging ↵Devaraj K2016-05-3017-22/+117
| | | | | | | | | | | | | | | | | | | | | | | | the same task is succeeded in speculation ## What changes were proposed in this pull request? With this patch, TaskSetManager kills other running attempts when any one of the attempt succeeds for the same task. Also killed tasks will not be considered as failed tasks and they get listed separately in the UI and also shows the task state as KILLED instead of FAILED. ## How was this patch tested? core\src\test\scala\org\apache\spark\ui\jobs\JobProgressListenerSuite.scala core\src\test\scala\org\apache\spark\util\JsonProtocolSuite.scala I have verified this patch manually by enabling spark.speculation as true, when any attempt gets succeeded then other running attempts are getting killed for the same task and other pending tasks are getting assigned in those. And also when any attempt gets killed then they are considered as KILLED tasks and not considered as FAILED tasks. Please find the attached screen shots for the reference. ![stage-tasks-table](https://cloud.githubusercontent.com/assets/3174804/14075132/394c6a12-f4f4-11e5-8638-20ff7b8cc9bc.png) ![stages-table](https://cloud.githubusercontent.com/assets/3174804/14075134/3b60f412-f4f4-11e5-9ea6-dd0dcc86eb03.png) Ref : https://github.com/apache/spark/pull/11916 Author: Devaraj K <devaraj@apache.org> Closes #11996 from devaraj-kavali/SPARK-10530.
* [SPARK-15645][STREAMING] Fix some typos of Streaming moduleXin Ren2016-05-301-2/+2
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? No code change, just some typo fixing. ## How was this patch tested? Manually run project build with testing, and build is successful. Author: Xin Ren <iamshrek@126.com> Closes #13385 from keypointt/codeWalkThroughStreaming.
* [SPARK-15633][MINOR] Make package name for Java tests consistentReynold Xin2016-05-271-1/+1
| | | | | | | | | | | | ## What changes were proposed in this pull request? This is a simple patch that makes package names for Java 8 test suites consistent. I moved everything to test.org.apache.spark to we can test package private APIs properly. Also added "java8" as the package name so we can easily run all the tests related to Java 8. ## How was this patch tested? This is a test only change. Author: Reynold Xin <rxin@databricks.com> Closes #13364 from rxin/SPARK-15633.
* [SPARK-15562][ML] Delete temp directory after program exit in DataFrameExampledding32016-05-271-2/+2
| | | | | | | | | | | | | ## What changes were proposed in this pull request? Temp directory used to save records is not deleted after program exit in DataFrameExample. Although it called deleteOnExit, it doesn't work as the directory is not empty. Similar things happend in ContextCleanerSuite. Update the code to make sure temp directory is deleted after program exit. ## How was this patch tested? unit tests and local build. Author: dding3 <ding.ding@intel.com> Closes #13328 from dding3/master.
* [SPARK-15569] Reduce frequency of updateBytesWritten function in Disk…Sital Kedia2016-05-272-8/+7
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? Profiling a Spark job spilling large amount of intermediate data we found that significant portion of time is being spent in DiskObjectWriter.updateBytesWritten function. Looking at the code, we see that the function is being called too frequently to update the number of bytes written to disk. We should reduce the frequency to avoid this. ## How was this patch tested? Tested by running the job on cluster and saw 20% CPU gain by this change. Author: Sital Kedia <skedia@fb.com> Closes #13332 from sitalkedia/DiskObjectWriter.
* [MINOR] Fix Typos 'a -> an'Zheng RuiFeng2016-05-267-8/+8
| | | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? `a` -> `an` I use regex to generate potential error lines: `grep -in ' a [aeiou]' mllib/src/main/scala/org/apache/spark/ml/*/*scala` and review them line by line. ## How was this patch tested? local build `lint-java` checking Author: Zheng RuiFeng <ruifengz@foxmail.com> Closes #13317 from zhengruifeng/a_an.
* [MINOR][CORE] Fixed doc for Accumulator2.addJoseph K. Bradley2016-05-261-4/+4
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? Scala doc used outdated ```+=```. Replaced with ```add```. ## How was this patch tested? N/A Author: Joseph K. Bradley <joseph@databricks.com> Closes #13346 from jkbradley/accum-doc.
* [SPARK-8428][SPARK-13850] Fix integer overflows in TimSortSameer Agarwal2016-05-263-6/+30
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? This patch fixes a few integer overflows in `UnsafeSortDataFormat.copyRange()` and `ShuffleSortDataFormat copyRange()` that seems to be the most likely cause behind a number of `TimSort` contract violation errors seen in Spark 2.0 and Spark 1.6 while sorting large datasets. ## How was this patch tested? Added a test in `ExternalSorterSuite` that instantiates a large array of the form of [150000000, 150000001, 150000002, ...., 300000000, 0, 1, 2, ..., 149999999] that triggers a `copyRange` in `TimSort.mergeLo` or `TimSort.mergeHi`. Note that the input dataset should contain at least 268.43 million rows with a certain data distribution for an overflow to occur. Author: Sameer Agarwal <sameer@databricks.com> Closes #13336 from sameeragarwal/timsort-bug.
* [SPARK-13148][YARN] document zero-keytab Oozie application launch; add ↵Steve Loughran2016-05-261-2/+49
| | | | | | | | | | | diagnostics This patch provides detail on what to do for keytabless Oozie launches of spark apps, and adds some debug-level diagnostics of what credentials have been submitted Author: Steve Loughran <stevel@hortonworks.com> Author: Steve Loughran <stevel@apache.org> Closes #11033 from steveloughran/stevel/feature/SPARK-13148-oozie.
* [SPARK-10372] [CORE] basic test framework for entire spark schedulerImran Rashid2016-05-267-13/+728
| | | | | | | | This is a basic framework for testing the entire scheduler. The tests this adds aren't very interesting -- the point of this PR is just to setup the framework, to keep the initial change small, but it can be built upon to test more features (eg., speculation, killing tasks, blacklisting, etc.). Author: Imran Rashid <irashid@cloudera.com> Closes #8559 from squito/SPARK-10372-scheduler-integs.
* [SPARK-14269][SCHEDULER] Eliminate unnecessary submitStage() call.Takuya UESHIN2016-05-252-25/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? Currently a method `submitStage()` for waiting stages is called on every iteration of the event loop in `DAGScheduler` to submit all waiting stages, but most of them are not necessary because they are not related to Stage status. The case we should try to submit waiting stages is only when their parent stages are successfully completed. This elimination can improve `DAGScheduler` performance. ## How was this patch tested? Added some checks and other existing tests, and our projects. We have a project bottle-necked by `DAGScheduler`, having about 2000 stages. Before this patch the almost all execution time in `Driver` process was spent to process `submitStage()` of `dag-scheduler-event-loop` thread but after this patch the performance was improved as follows: | | total execution time | `dag-scheduler-event-loop` thread time | `submitStage()` | |--------|---------------------:|---------------------------------------:|----------------:| | Before | 760 sec | 710 sec | 667 sec | | After | 440 sec | 14 sec | 10 sec | Author: Takuya UESHIN <ueshin@happy-camper.st> Closes #12060 from ueshin/issues/SPARK-14269.
* [MINOR][CORE] Fix a HadoopRDD log message and remove unused imports in rdd ↵Dongjoon Hyun2016-05-255-7/+4
| | | | | | | | | | | | | | | | | | | | | | | files. ## What changes were proposed in this pull request? This PR fixes the following typos in log message and comments of `HadoopRDD.scala`. Also, this removes unused imports. ```scala - logWarning("Caching NewHadoopRDDs as deserialized objects usually leads to undesired" + + logWarning("Caching HadoopRDDs as deserialized objects usually leads to undesired" + ... - // since its not removed yet + // since it's not removed yet ``` ## How was this patch tested? Manual. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #13294 from dongjoon-hyun/minor_rdd_fix_log_message.
* [SPARK-15345][SQL][PYSPARK] SparkSession's conf doesn't take effect when ↵Jeff Zhang2016-05-251-0/+3
| | | | | | | | | | | | | | | | | | this already an existing SparkContext ## What changes were proposed in this pull request? Override the existing SparkContext is the provided SparkConf is different. PySpark part hasn't been fixed yet, will do that after the first round of review to ensure this is the correct approach. ## How was this patch tested? Manually verify it in spark-shell. rxin Please help review it, I think this is a very critical issue for spark 2.0 Author: Jeff Zhang <zjffdu@apache.org> Closes #13160 from zjffdu/SPARK-15345.
* [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change.Lukasz2016-05-253-2/+19
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? 1. Making 'name' field of RDDInfo mutable. 2. In StorageListener: catching the fact that RDD's name was changed and updating it in RDDInfo. ## How was this patch tested? 1. Manual verification - the 'Storage' tab now behaves as expected. 2. The commit also contains a new unit test which verifies this. Author: Lukasz <lgieron@gmail.com> Closes #13264 from lgieron/SPARK-9044.
* [SPARK-15518] Rename various scheduler backend for consistencyReynold Xin2016-05-2414-107/+123
| | | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? This patch renames various scheduler backends to make them consistent: - LocalScheduler -> LocalSchedulerBackend - AppClient -> StandaloneAppClient - AppClientListener -> StandaloneAppClientListener - SparkDeploySchedulerBackend -> StandaloneSchedulerBackend - CoarseMesosSchedulerBackend -> MesosCoarseGrainedSchedulerBackend - MesosSchedulerBackend -> MesosFineGrainedSchedulerBackend ## How was this patch tested? Updated test cases to reflect the name change. Author: Reynold Xin <rxin@databricks.com> Closes #13288 from rxin/SPARK-15518.
* [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentExceptionDongjoon Hyun2016-05-242-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? Previously, SPARK-8893 added the constraints on positive number of partitions for repartition/coalesce operations in general. This PR adds one missing part for that and adds explicit two testcases. **Before** ```scala scala> sc.parallelize(1 to 5).coalesce(0) java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive. ... scala> sc.parallelize(1 to 5).repartition(0).collect() res1: Array[Int] = Array() // empty scala> spark.sql("select 1").coalesce(0) res2: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [1: int] scala> spark.sql("select 1").coalesce(0).collect() java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive. scala> spark.sql("select 1").repartition(0) res3: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [1: int] scala> spark.sql("select 1").repartition(0).collect() res4: Array[org.apache.spark.sql.Row] = Array() // empty ``` **After** ```scala scala> sc.parallelize(1 to 5).coalesce(0) java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive. ... scala> sc.parallelize(1 to 5).repartition(0) java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive. ... scala> spark.sql("select 1").coalesce(0) java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive. ... scala> spark.sql("select 1").repartition(0) java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive. ... ``` ## How was this patch tested? Pass the Jenkins tests with new testcases. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #13282 from dongjoon-hyun/SPARK-15512.
* [MINOR][CORE][TEST] Update obsolete `takeSample` test case.Dongjoon Hyun2016-05-241-8/+7
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? This PR fixes some obsolete comments and assertion in `takeSample` testcase of `RDDSuite.scala`. ## How was this patch tested? This fixes the testcase only. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #13260 from dongjoon-hyun/SPARK-15481.
* [SPARK-15433] [PYSPARK] PySpark core test should not use SerDe from ↵Liang-Chi Hsieh2016-05-241-1/+1
| | | | | | | | | | | | | | | PythonMLLibAPI ## What changes were proposed in this pull request? Currently PySpark core test uses the `SerDe` from `PythonMLLibAPI` which includes many MLlib things. It should use `SerDeUtil` instead. ## How was this patch tested? Existing tests. Author: Liang-Chi Hsieh <simonh@tw.ibm.com> Closes #13214 from viirya/pycore-use-serdeutil.