| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I got "java.util.NoSuchElementException: key not found: 1401756085000 ms" exception when using kafka stream and 1 sec batchPeriod.
Investigation showed that the reason is that ReceiverLauncher.startReceivers is asynchronous (started in a thread).
https://github.com/vchekan/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala#L206
In case of slow starting receiver, such as Kafka, it easily takes more than 2sec to start. In result, no single "compute" will be called on ReceiverInputDStream before first batch job is executed and receivedBlockInfo remains empty (obviously). Batch job will cause ReceiverInputDStream.getReceivedBlockInfo call and "key not found" exception.
The patch makes getReceivedBlockInfo more robust by tolerating missing values.
Author: Vadim Chekan <kot.begemot@gmail.com>
Closes #961 from vchekan/branch-1.0 and squashes the following commits:
e86f82b [Vadim Chekan] Fixed indentation
4609563 [Vadim Chekan] Key not found exception: if receiver is slow to start, it is possible that getReceivedBlockInfo will be called before compute has been called
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
#511 and #863 got left out of branch-1.0 since we were really close to the release. Now that they have been tested a little I see no reason to leave them out.
Author: Michael Armbrust <michael@databricks.com>
Author: witgo <witgo@qq.com>
Closes #1078 from marmbrus/branch-1.0 and squashes the following commits:
22be674 [witgo] [SPARK-1841]: update scalatest to version 2.1.5
fc8fc79 [Michael Armbrust] Include #1071 as well.
c5d0adf [Michael Armbrust] Update SparkSQL in branch-1.0 to match master.
|
|
|
|
|
|
|
|
|
|
|
| |
Author: Lars Albertsson <lalle@spotify.com>
Closes #1001 from lallea/contextwaiter_stopped and squashes the following commits:
93cd314 [Lars Albertsson] Mend StreamingContext stop() followed by awaitTermination().
(cherry picked from commit 4d5c12aa1c54c49377a4bafe3bcc4993d5e1a552)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
|
| |
|
| |
|
|
|
|
| |
This reverts commit 2f1dc868e5714882cf40d2633fb66772baf34789.
|
|
|
|
| |
This reverts commit 832dc594e7666f1d402334f8015ce29917d9c888.
|
| |
|
| |
|
|
|
|
| |
This reverts commit d807023479ce10aec28ef3c1ab646ddefc2e663c.
|
|
|
|
| |
This reverts commit 67dd53d2556f03ce292e6889128cf441f1aa48f8.
|
| |
|
| |
|
|
|
|
| |
This reverts commit 920f947eb5a22a679c0c3186cf69ee75f6041c75.
|
|
|
|
| |
This reverts commit f8e611955096c5c1c7db5764b9d2851b1d295f0d.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
JIRA: https://issues.apache.org/jira/browse/SPARK-1878
Author: zsxwing <zsxwing@gmail.com>
Closes #822 from zsxwing/SPARK-1878 and squashes the following commits:
4a47e27 [zsxwing] SPARK-1878: Fix the incorrect initialization order
(cherry picked from commit 1811ba8ccb580979aa2e12019e6a82805f09ab53)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
|
| |
|
| |
|
|
|
|
| |
This reverts commit 80eea0f111c06260ffaa780d2f3f7facd09c17bc.
|
|
|
|
| |
This reverts commit e5436b8c1a79ce108f3af402455ac5f6dc5d1eb3.
|
| |
|
| |
|
|
|
|
| |
This reverts commit 9212b3e5bb5545ccfce242da8d89108e6fb1c464.
|
|
|
|
| |
This reverts commit c4746aa6fe4aaf383e69e34353114d36d1eb9ba6.
|
| |
|
| |
|
|
|
|
| |
This reverts commit 54133abdce0246f6643a1112a5204afb2c4caa82.
|
|
|
|
| |
This reverts commit e480bcfbd269ae1d7a6a92cfb50466cf192fe1fb.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a few changes based on the original patch by @scrapcodes.
Author: Prashant Sharma <prashant.s@imaginea.com>
Author: Patrick Wendell <pwendell@gmail.com>
Closes #785 from pwendell/package-docs and squashes the following commits:
c32b731 [Patrick Wendell] Changes based on Prashant's patch
c0463d3 [Prashant Sharma] added eof new line
ce8bf73 [Prashant Sharma] Added eof new line to all files.
4c35f2e [Prashant Sharma] SPARK-1563 Add package-info.java and package.scala files for all packages that appear in docs
(cherry picked from commit 46324279dae2fa803267d788f7c56b0ed643b4c8)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
|
| |
|
| |
|
|
|
|
| |
This reverts commit 18f062303303824139998e8fc8f4158217b0dbc3.
|
|
|
|
| |
This reverts commit d08e9604fc9958b7c768e91715c8152db2ed6fd0.
|
| |
|
| |
|
|
|
|
| |
This reverts commit 3d0a44833ab50360bf9feccc861cb5e8c44a4866.
|
|
|
|
| |
This reverts commit 9772d85c6f3893d42044f4bab0e16f8b6287613a.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Three issues related to temp files that tests generate – these should be touched up for hygiene but are not urgent.
Modules have a log4j.properties which directs the unit-test.log output file to a directory like `[module]/target/unit-test.log`. But this ends up creating `[module]/[module]/target/unit-test.log` instead of former.
The `work/` directory is not deleted by "mvn clean", in the parent and in modules. Neither is the `checkpoint/` directory created under the various external modules.
Many tests create a temp directory, which is not usually deleted. This can be largely resolved by calling `deleteOnExit()` at creation and trying to call `Utils.deleteRecursively` consistently to clean up, sometimes in an `@After` method.
_If anyone seconds the motion, I can create a more significant change that introduces a new test trait along the lines of `LocalSparkContext`, which provides management of temp directories for subclasses to take advantage of._
Author: Sean Owen <sowen@cloudera.com>
Closes #732 from srowen/SPARK-1798 and squashes the following commits:
5af578e [Sean Owen] Try to consistently delete test temp dirs and files, and set deleteOnExit() for each
b21b356 [Sean Owen] Remove work/ and checkpoint/ dirs with mvn clean
bdd0f41 [Sean Owen] Remove duplicate module dir in log4j.properties output path for tests
(cherry picked from commit 7120a2979d0a9f0f54a88b2416be7ca10e74f409)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- [x] Move all of them into subpackages of org.apache.spark.examples (right now some are in org.apache.spark.streaming.examples, for instance, and others are in org.apache.spark.examples.mllib)
- [x] Move Python examples into examples/src/main/python
- [x] Update docs to reflect these changes
Author: Sandeep <sandeep@techaddict.me>
This patch had conflicts when merged, resolved by
Committer: Matei Zaharia <matei@databricks.com>
Closes #571 from techaddict/SPARK-1637 and squashes the following commits:
47ef86c [Sandeep] Changes based on Discussions on PR, removing use of RawTextHelper from examples
8ed2d3f [Sandeep] Docs Updated for changes, Change for java examples
5f96121 [Sandeep] Move Python examples into examples/src/main/python
0a8dd77 [Sandeep] Move all Scala Examples to org.apache.spark.examples (some are in org.apache.spark.streaming.examples, for instance, and others are in org.apache.spark.examples.mllib)
(cherry picked from commit a000b5c3b0438c17e9973df4832c320210c29c27)
Signed-off-by: Matei Zaharia <matei@databricks.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- SPARK-1558: Updated custom receiver guide to match it with the new API
- SPARK-1504: Added deployment and monitoring subsection to streaming
- SPARK-1505: Added migration guide for migrating from 0.9.x and below to Spark 1.0
- Updated various Java streaming examples to use JavaReceiverInputDStream to highlight the API change.
- Removed the requirement for cleaner ttl from streaming guide
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes #652 from tdas/doc-fix and squashes the following commits:
cb4f4b7 [Tathagata Das] Possible fix for flaky graceful shutdown test.
ab71f7f [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into doc-fix
8d6ff9b [Tathagata Das] Addded migration guide to Spark Streaming.
7d171df [Tathagata Das] Added reference to JavaReceiverInputStream in examples and streaming guide.
49edd7c [Tathagata Das] Change java doc links to use Java docs.
11528d7 [Tathagata Das] Updated links on index page.
ff80970 [Tathagata Das] More updates to streaming guide.
4dc42e9 [Tathagata Das] Added monitoring and other documentation in the streaming guide.
14c6564 [Tathagata Das] Updated custom receiver guide.
(cherry picked from commit a975a19f21e71f448b3fdb2ed4461e28ef439900)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
JavaPairRDDStream.reduceByKeyAndWindow()
It appears that one of these methods doesn't use `org.apache.spark.api.java.function.Function2` like all the others, but uses Scala's `Function2`.
Author: Sean Owen <sowen@cloudera.com>
Closes #633 from srowen/SPARK-1663.2 and squashes the following commits:
1e0232d [Sean Owen] Fix signature of one version of reduceByKeyAndWindow to use Java API Function2, as apparently intended
(cherry picked from commit 0088cede592540f35f9aec0f24dc1d9bd690d878)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1, Fix SPARK-1441: compile spark core error with hadoop 0.23.x
2, Fix SPARK-1491: maven hadoop-provided profile fails to build
3, Fix org.scala-lang: * ,org.apache.avro:* inconsistent versions dependency
4, A modified on the sql/catalyst/pom.xml,sql/hive/pom.xml,sql/core/pom.xml (Four spaces formatted into two spaces)
Author: witgo <witgo@qq.com>
Closes #480 from witgo/format_pom and squashes the following commits:
03f652f [witgo] review commit
b452680 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
bee920d [witgo] revert fix SPARK-1629: Spark Core missing commons-lang dependence
7382a07 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
6902c91 [witgo] fix SPARK-1629: Spark Core missing commons-lang dependence
0da4bc3 [witgo] merge master
d1718ed [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
e345919 [witgo] add avro dependency to yarn-alpha
77fad08 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
62d0862 [witgo] Fix org.scala-lang: * inconsistent versions dependency
1a162d7 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
934f24d [witgo] review commit
cf46edc [witgo] exclude jruby
06e7328 [witgo] Merge branch 'SparkBuild' into format_pom
99464d2 [witgo] fix maven hadoop-provided profile fails to build
0c6c1fc [witgo] Fix compile spark core error with hadoop 0.23.x
6851bec [witgo] Maintain consistent SparkBuild.scala, pom.xml
(cherry picked from commit 030f2c2126d5075576cd6d83a1ee7462c48b953b)
Conflicts:
sql/catalyst/pom.xml
sql/core/pom.xml
sql/hive/pom.xml
|