path: root/external
Commit message | Author | Age | Files | Lines
* SPARK-2034. KafkaInputDStream doesn't close resources and may prevent JVM shutdown | Sean Owen | 2014-06-22 | 1 | -22/+33
| Tobias noted today on the mailing list:
|
| ========
|
| I am trying to use Spark Streaming with Kafka, which works like a charm – except for shutdown. When I run my program with "sbt run-main", sbt will never exit, because there are two non-daemon threads left that don't die.
|
| I created a minimal example at <https://gist.github.com/tgpfeiffer/b1e765064e983449c6b6#file-kafkadoesntshutdown-scala>. It starts a StreamingContext and does nothing more than connecting to a Kafka server and printing what it receives. Using the `future { ... }` construct, I shut down the StreamingContext after some seconds and then print the difference between the threads at start time and at end time. The output can be found at <https://gist.github.com/tgpfeiffer/b1e765064e983449c6b6#file-output1>. There are a number of threads remaining that will prevent sbt from exiting.
|
| When I replace `KafkaUtils.createStream(...)` with a call that does exactly the same, except that it calls `consumerConnector.shutdown()` in `KafkaReceiver.onStop()` (which it should, IMO), the output is as shown at <https://gist.github.com/tgpfeiffer/b1e765064e983449c6b6#file-output2>.
|
| Does anyone have any idea what is going on here and why the program doesn't shut down properly? The behavior is the same with both kafka 0.8.0 and 0.8.1.1, by the way.
|
| ========
|
| Something similar was noted last year:
| http://mail-archives.apache.org/mod_mbox/spark-dev/201309.mbox/%3C1380220041.2428.YahooMailNeo@web160804.mail.bf1.yahoo.com%3E
|
| KafkaInputDStream doesn't close `ConsumerConnector` in `onStop()`, and does not close the `Executor` it creates. The latter leaves non-daemon threads and can prevent the JVM from shutting down even if streaming is closed properly.
|
| Author: Sean Owen <sowen@cloudera.com>
|
| Closes #980 from srowen/SPARK-2034 and squashes the following commits:
|
| 9f31a8d [Sean Owen] Restore ClassTag to private class because MIMA flags it; is the shadowing intended?
| 2d579a8 [Sean Owen] Close ConsumerConnector in onStop; shutdown() the local Executor that is created so that its threads stop when done; close the Zookeeper client even on exception; fix a few typos; log exceptions that otherwise vanish
| (cherry picked from commit 476581e8c8ca03a5940c404fee8a06361ff94cb5)
|
| Signed-off-by: Patrick Wendell <pwendell@gmail.com>
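The shape of the SPARK-2034 fix (close the connector in `onStop()`, and `shutdown()` the locally created executor so its threads stop when done) can be illustrated with a minimal Java sketch. This is not the Spark code: `ReceiverSketch` and its `Connector` interface are hypothetical stand-ins for `KafkaReceiver` and Kafka's `ConsumerConnector`, and only JDK `java.util.concurrent` calls are used.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a streaming receiver; not Spark's KafkaReceiver.
final class ReceiverSketch {
    // Hypothetical stand-in for Kafka's ConsumerConnector.
    interface Connector {
        void shutdown();
    }

    private final Connector connector;
    private ExecutorService executor;

    ReceiverSketch(Connector connector) {
        this.connector = connector;
    }

    void onStart(int numThreads, Runnable handler) {
        // newFixedThreadPool creates non-daemon threads by default, so an
        // executor that is never shut down can keep the JVM alive.
        executor = Executors.newFixedThreadPool(numThreads);
        try {
            for (int i = 0; i < numThreads; i++) {
                executor.submit(handler);
            }
        } finally {
            // Tasks already submitted still run; once they finish, the
            // worker threads exit instead of idling forever.
            executor.shutdown();
        }
    }

    void onStop() {
        // Release the connection explicitly when the receiver stops.
        connector.shutdown();
    }

    // Waits for the submitted handlers to finish; returns false on timeout
    // or if the waiting thread is interrupted.
    boolean awaitWorkers(long timeoutMillis) {
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling `shutdown()` right after submitting the handlers is one way to make sure the pool's threads end with their work; a receiver whose handlers block indefinitely would additionally need its connector closed first so that those handlers return.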
* [SPARK-1998] SparkFlumeEvent with body bigger than 1020 bytes are not re... | joyyoj | 2014-06-10 | 1 | -2/+2
| flume event sent to Spark will fail if the body is too large and numHeaders is greater than zero
|
| Author: joyyoj <sunshch@gmail.com>
|
| Closes #951 from joyyoj/master and squashes the following commits:
|
| f4660c5 [joyyoj] [SPARK-1998] SparkFlumeEvent with body bigger than 1020 bytes are not read properly
| (cherry picked from commit 29660443077619ee854025b8d0d3d64181724054)
|
| Signed-off-by: Patrick Wendell <pwendell@gmail.com>
* Spark 1916 | David Lemieux | 2014-05-28 | 1 | -1/+1
| The changes could be ported back to 0.9 as well.
|
| Changing in.read to in.readFully to read the whole input stream rather than the first 1020 bytes.
| This should be OK considering that Flume caps the body size to 32K by default.
|
| Author: David Lemieux <david.lemieux@radialpoint.com>
|
| Closes #865 from lemieud/SPARK-1916 and squashes the following commits:
|
| a265673 [David Lemieux] Updated SparkFlumeEvent to read the whole stream rather than the first X bytes.
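The difference between `read` and `readFully` that this change relies on can be shown with a small Java sketch. It assumes a `java.io.DataInputStream` over a stream that hands back at most 1020 bytes per call; the `chunked` helper and the class name are hypothetical and exist only to reproduce a short read, they are not part of Spark or Flume.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ReadFullySketch {
    // Hypothetical source that returns at most `chunk` bytes per read call,
    // mimicking a stream backed by a small internal buffer.
    static InputStream chunked(byte[] data, int chunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    // A single read() may stop early; it returns how many bytes it got.
    static int singleRead(byte[] data, int chunk) {
        byte[] body = new byte[data.length];
        try (DataInputStream in = new DataInputStream(chunked(data, chunk))) {
            return in.read(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // readFully() keeps reading until the buffer is full, or throws
    // EOFException if the stream ends first.
    static byte[] readWhole(byte[] data, int chunk) {
        byte[] body = new byte[data.length];
        try (DataInputStream in = new DataInputStream(chunked(data, chunk))) {
            in.readFully(body);
            return body;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a 4096-byte payload and a 1020-byte chunk, `singleRead` reports only 1020 bytes while `readWhole` returns the complete payload, which is the symptom and the remedy described in the commit message.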
* [maven-release-plugin] prepare for next development iteration | Tathagata Das | 2014-05-26 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc11 (tag: v1.0.0) | Tathagata Das | 2014-05-26 | 5 | -5/+5
|
* Revert "[maven-release-plugin] prepare release v1.0.0-rc11" | Tathagata Das | 2014-05-26 | 5 | -5/+5
| This reverts commit 2f1dc868e5714882cf40d2633fb66772baf34789.
* Revert "[maven-release-plugin] prepare for next development iteration" | Tathagata Das | 2014-05-26 | 5 | -5/+5
| This reverts commit 832dc594e7666f1d402334f8015ce29917d9c888.
* [maven-release-plugin] prepare for next development iteration | Tathagata Das | 2014-05-25 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc11 | Tathagata Das | 2014-05-25 | 5 | -5/+5
|
* Revert "[maven-release-plugin] prepare release v1.0.0-rc10" | Tathagata Das | 2014-05-25 | 5 | -5/+5
| This reverts commit d807023479ce10aec28ef3c1ab646ddefc2e663c.
* Revert "[maven-release-plugin] prepare for next development iteration" | Tathagata Das | 2014-05-25 | 5 | -5/+5
| This reverts commit 67dd53d2556f03ce292e6889128cf441f1aa48f8.
* [maven-release-plugin] prepare for next development iteration | Tathagata Das | 2014-05-20 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc10 | Tathagata Das | 2014-05-20 | 5 | -5/+5
|
* Revert "[maven-release-plugin] prepare release v1.0.0-rc9" | Tathagata Das | 2014-05-19 | 5 | -5/+5
| This reverts commit 920f947eb5a22a679c0c3186cf69ee75f6041c75.
* Revert "[maven-release-plugin] prepare for next development iteration" | Tathagata Das | 2014-05-19 | 5 | -5/+5
| This reverts commit f8e611955096c5c1c7db5764b9d2851b1d295f0d.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-17 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc9 | Patrick Wendell | 2014-05-17 | 5 | -5/+5
|
* Revert "[maven-release-plugin] prepare release v1.0.0-rc8" | Patrick Wendell | 2014-05-16 | 5 | -5/+5
| This reverts commit 80eea0f111c06260ffaa780d2f3f7facd09c17bc.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-16 | 5 | -5/+5
| This reverts commit e5436b8c1a79ce108f3af402455ac5f6dc5d1eb3.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-16 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc8 | Patrick Wendell | 2014-05-16 | 5 | -5/+5
|
* Revert "[maven-release-plugin] prepare release v1.0.0-rc7" | Patrick Wendell | 2014-05-16 | 5 | -5/+5
| This reverts commit 9212b3e5bb5545ccfce242da8d89108e6fb1c464.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-16 | 5 | -5/+5
| This reverts commit c4746aa6fe4aaf383e69e34353114d36d1eb9ba6.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-15 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc7 | Patrick Wendell | 2014-05-15 | 5 | -5/+5
|
* Revert "[maven-release-plugin] prepare release v1.0.0-rc6" | Patrick Wendell | 2014-05-14 | 5 | -5/+5
| This reverts commit 54133abdce0246f6643a1112a5204afb2c4caa82.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-14 | 5 | -5/+5
| This reverts commit e480bcfbd269ae1d7a6a92cfb50466cf192fe1fb.
* Package docs | Prashant Sharma | 2014-05-14 | 10 | -0/+220
| These are a few changes based on the original patch by @scrapcodes.
|
| Author: Prashant Sharma <prashant.s@imaginea.com>
| Author: Patrick Wendell <pwendell@gmail.com>
|
| Closes #785 from pwendell/package-docs and squashes the following commits:
|
| c32b731 [Patrick Wendell] Changes based on Prashant's patch
| c0463d3 [Prashant Sharma] added eof new line
| ce8bf73 [Prashant Sharma] Added eof new line to all files.
| 4c35f2e [Prashant Sharma] SPARK-1563 Add package-info.java and package.scala files for all packages that appear in docs
| (cherry picked from commit 46324279dae2fa803267d788f7c56b0ed643b4c8)
|
| Signed-off-by: Patrick Wendell <pwendell@gmail.com>
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-14 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc6 | Patrick Wendell | 2014-05-14 | 5 | -5/+5
|
* Revert "[maven-release-plugin] prepare release v1.0.0-rc5" | Patrick Wendell | 2014-05-14 | 5 | -5/+5
| This reverts commit 18f062303303824139998e8fc8f4158217b0dbc3.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-14 | 5 | -5/+5
| This reverts commit d08e9604fc9958b7c768e91715c8152db2ed6fd0.
* Fixed streaming examples docs to use run-example instead of spark-submit | Tathagata Das | 2014-05-14 | 1 | -23/+35
| Pretty self-explanatory
|
| Author: Tathagata Das <tathagata.das1565@gmail.com>
|
| Closes #722 from tdas/example-fix and squashes the following commits:
|
| 7839979 [Tathagata Das] Minor changes.
| 0673441 [Tathagata Das] Fixed java docs of java streaming example
| e687123 [Tathagata Das] Fixed scala style errors.
| 9b8d112 [Tathagata Das] Fixed streaming examples docs to use run-example instead of spark-submit.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-13 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc5 | Patrick Wendell | 2014-05-13 | 5 | -5/+5
|
* Revert "[maven-release-plugin] prepare release v1.0.0-rc4" | Patrick Wendell | 2014-05-12 | 5 | -5/+5
| This reverts commit 3d0a44833ab50360bf9feccc861cb5e8c44a4866.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-12 | 5 | -5/+5
| This reverts commit 9772d85c6f3893d42044f4bab0e16f8b6287613a.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-13 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc4 | Patrick Wendell | 2014-05-13 | 5 | -5/+5
|
* Rollback versions for 1.0.0-rc4 | Patrick Wendell | 2014-05-12 | 5 | -5/+5
|
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-12 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc4 | Patrick Wendell | 2014-05-12 | 5 | -5/+5
|
* SPARK-1798. Tests should clean up temp files | Sean Owen | 2014-05-12 | 5 | -5/+5
| Three issues related to temp files that tests generate – these should be touched up for hygiene but are not urgent.
|
| Modules have a log4j.properties which directs the unit-test.log output file to a directory like `[module]/target/unit-test.log`. But this ends up creating `[module]/[module]/target/unit-test.log` instead of the former.
|
| The `work/` directory is not deleted by "mvn clean", in the parent and in modules. Neither is the `checkpoint/` directory created under the various external modules.
|
| Many tests create a temp directory, which is not usually deleted. This can be largely resolved by calling `deleteOnExit()` at creation and trying to call `Utils.deleteRecursively` consistently to clean up, sometimes in an `@After` method.
|
| _If anyone seconds the motion, I can create a more significant change that introduces a new test trait along the lines of `LocalSparkContext`, which provides management of temp directories for subclasses to take advantage of._
|
| Author: Sean Owen <sowen@cloudera.com>
|
| Closes #732 from srowen/SPARK-1798 and squashes the following commits:
|
| 5af578e [Sean Owen] Try to consistently delete test temp dirs and files, and set deleteOnExit() for each
| b21b356 [Sean Owen] Remove work/ and checkpoint/ dirs with mvn clean
| bdd0f41 [Sean Owen] Remove duplicate module dir in log4j.properties output path for tests
| (cherry picked from commit 7120a2979d0a9f0f54a88b2416be7ca10e74f409)
|
| Signed-off-by: Patrick Wendell <pwendell@gmail.com>
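The temp-directory pattern described in SPARK-1798 (register `deleteOnExit()` at creation, then delete recursively during cleanup) might look roughly like the following Java sketch. `TempDirSketch` and both of its methods are hypothetical helpers written against the JDK only; `deleteRecursively` here is a simplified stand-in for Spark's `Utils.deleteRecursively`, not a copy of it.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

final class TempDirSketch {
    // Creates a temp directory and registers it for deletion at JVM exit.
    static File createTempDir(String prefix) {
        try {
            File dir = Files.createTempDirectory(prefix).toFile();
            // Fallback only: at exit the JVM can remove the directory
            // just when it is empty, so explicit cleanup is still needed.
            dir.deleteOnExit();
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes a file or a directory tree, children first.
    // Simplification: symbolic links get no special handling.
    static void deleteRecursively(File file) {
        File[] children = file.listFiles(); // null when not a directory
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        if (!file.delete() && file.exists()) {
            throw new UncheckedIOException(
                new IOException("Failed to delete " + file));
        }
    }
}
```

In a JUnit test, `createTempDir` would typically be called during setup and `deleteRecursively` from an `@After` method, so the directory is removed even when the test body fails.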
* SPARK-1789. Multiple versions of Netty dependencies cause FlumeStreamSuite failure | Sean Owen | 2014-05-10 | 4 | -19/+1
| TL;DR is there is a bit of JAR hell trouble with Netty, that can be mostly resolved and will resolve a test failure.
|
| I hit the error described at http://apache-spark-user-list.1001560.n3.nabble.com/SparkContext-startup-time-out-td1753.html while running FlumeStreamSuite, and have for a short while (is it just me?)
|
| velvia notes:
| "I have found a workaround. If you add akka 2.2.4 to your dependencies, then everything works, probably because akka 2.2.4 brings in newer version of Jetty."
|
| There are at least 3 versions of Netty in play in the build:
|
| - the new Flume 1.4.0 dependency brings in io.netty:netty:3.4.0.Final, and that is the immediate problem
| - the custom version of akka 2.2.3 depends on io.netty:netty:3.6.6.
| - but, Spark Core directly uses io.netty:netty-all:4.0.17.Final
|
| The POMs try to exclude other versions of netty, but are excluding org.jboss.netty:netty, when in fact older versions of io.netty:netty (not netty-all) are also an issue.
|
| The org.jboss.netty:netty excludes are largely unnecessary. I replaced many of them with io.netty:netty exclusions until everything agreed on io.netty:netty-all:4.0.17.Final.
|
| But this didn't work, since Akka 2.2.3 doesn't work with Netty 4.x. Down-grading to 3.6.6.Final across the board made some Spark code not compile.
|
| If the build *keeps* io.netty:netty:3.6.6.Final as well, everything seems to work. Part of the reason seems to be that Netty 3.x used the old `org.jboss.netty` packages. This is less than ideal, but is no worse than the current situation.
|
| So this PR resolves the issue and improves the JAR hell, even if it leaves the existing theoretical Netty 3-vs-4 conflict:
|
| - Remove org.jboss.netty excludes where possible, for clarity; they're not needed except with Hadoop artifacts
| - Add io.netty:netty excludes where needed -- except, let akka keep its io.netty:netty
| - Change a bit of test code that actually depended on Netty 3.x, to use 4.x equivalent
| - Update SBT build accordingly
|
| A better change would be to update Akka far enough such that it agrees on Netty 4.x, but I don't know if that's feasible.
|
| Author: Sean Owen <sowen@cloudera.com>
|
| Closes #723 from srowen/SPARK-1789 and squashes the following commits:
|
| 43661b7 [Sean Owen] Update and add Netty excludes to prevent some JAR conflicts that cause test issues
| (cherry picked from commit 2b7bd29eb6ee5baf739eec143044ecfc296b9b1f)
|
| Signed-off-by: Patrick Wendell <pwendell@gmail.com>
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-04-29 | 5 | -5/+5
|
* [maven-release-plugin] prepare release v1.0.0-rc3 | Patrick Wendell | 2014-04-29 | 5 | -5/+5
|
* Manual revert of rc2 version changes. | Patrick Wendell | 2014-04-28 | 5 | -5/+5
|
* Improved build configuration | witgo | 2014-04-28 | 5 | -70/+0
| 1, Fix SPARK-1441: compile spark core error with hadoop 0.23.x
| 2, Fix SPARK-1491: maven hadoop-provided profile fails to build
| 3, Fix org.scala-lang:*, org.apache.avro:* inconsistent versions dependency
| 4, Reformat sql/catalyst/pom.xml, sql/hive/pom.xml, sql/core/pom.xml (four-space indentation changed to two spaces)
|
| Author: witgo <witgo@qq.com>
|
| Closes #480 from witgo/format_pom and squashes the following commits:
|
| 03f652f [witgo] review commit
| b452680 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
| bee920d [witgo] revert fix SPARK-1629: Spark Core missing commons-lang dependence
| 7382a07 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
| 6902c91 [witgo] fix SPARK-1629: Spark Core missing commons-lang dependence
| 0da4bc3 [witgo] merge master
| d1718ed [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
| e345919 [witgo] add avro dependency to yarn-alpha
| 77fad08 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
| 62d0862 [witgo] Fix org.scala-lang: * inconsistent versions dependency
| 1a162d7 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
| 934f24d [witgo] review commit
| cf46edc [witgo] exclude jruby
| 06e7328 [witgo] Merge branch 'SparkBuild' into format_pom
| 99464d2 [witgo] fix maven hadoop-provided profile fails to build
| 0c6c1fc [witgo] Fix compile spark core error with hadoop 0.23.x
| 6851bec [witgo] Maintain consistent SparkBuild.scala, pom.xml
| (cherry picked from commit 030f2c2126d5075576cd6d83a1ee7462c48b953b)
|
| Conflicts:
| sql/catalyst/pom.xml
| sql/core/pom.xml
| sql/hive/pom.xml
* SPARK-1584: Upgrade Flume dependency to 1.4.0 | tmalaska | 2014-04-24 | 1 | -1/+5
| Updated the Flume dependency in the maven pom file and the scala build file.
|
| Author: tmalaska <ted.malaska@cloudera.com>
|
| Closes #507 from tmalaska/master and squashes the following commits:
|
| 79492c8 [tmalaska] excluded all thrift
| 159c3f1 [tmalaska] fixed the flume pom file issues
| 5bf56a7 [tmalaska] Upgrade flume version
| (cherry picked from commit d5c6ae6cc3305b9aa3185486b5b6ba0a6e5aca90)
|
| Signed-off-by: Patrick Wendell <pwendell@gmail.com>
* SPARK-1586 Windows build fixes | Mridul Muralidharan | 2014-04-24 | 2 | -2/+2
| Unfortunately, this is not exhaustive - particularly hive tests still fail due to path issues.
|
| Author: Mridul Muralidharan <mridulm80@apache.org>
|
| This patch had conflicts when merged, resolved by
| Committer: Matei Zaharia <matei@databricks.com>
|
| Closes #505 from mridulm/windows_fixes and squashes the following commits:
|
| ef12283 [Mridul Muralidharan] Move to org.apache.commons.lang3 for StringEscapeUtils. Earlier version was buggy apparently
| cdae406 [Mridul Muralidharan] Remove leaked changes from > 2G fix branch
| 3267f4b [Mridul Muralidharan] Fix build failures
| 35b277a [Mridul Muralidharan] Fix Scalastyle failures
| bc69d14 [Mridul Muralidharan] Change from hardcoded path separator
| 10c4d78 [Mridul Muralidharan] Use explicit encoding while using getBytes
| 1337abd [Mridul Muralidharan] fix classpath while running in windows
| (cherry picked from commit 968c0187a12f5ae4a696c02c1ff088e998ed7edd)
|
| Signed-off-by: Matei Zaharia <matei@databricks.com>
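Two of the portability fixes listed in SPARK-1586 (using an explicit encoding with `getBytes`, and avoiding a hardcoded path separator) can be illustrated with a short Java sketch. The class and method names are hypothetical and the choice of UTF-8 is an assumption made for the example; the commit itself does not say which encoding or call sites were involved.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class PortableSketch {
    // getBytes() without an argument uses the platform default charset,
    // which differs between machines; naming the charset makes the
    // result the same everywhere.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // File.pathSeparator is ";" on Windows and ":" on Unix-like systems,
    // so the joined classpath is valid on the platform running the code.
    static String joinClasspath(List<String> entries) {
        return String.join(File.pathSeparator, entries);
    }
}
```

For example, `PortableSketch.joinClasspath(Arrays.asList("a.jar", "b.jar"))` yields `a.jar;b.jar` on Windows and `a.jar:b.jar` elsewhere, without the caller hardcoding either form.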