Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Merge remote-tracking branch 'apache/master' into conf2 | Matei Zaharia | 2014-01-01 | 19 | -37/+52 |
|\ | | | | | | | | | | | | | Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala | ||||
| * | Merge pull request #312 from pwendell/log4j-fix-2 | Patrick Wendell | 2014-01-01 | 19 | -37/+52 |
| |\ | | | | | | | | | | | | | | | | | | | | | | SPARK-1008: Logging improvements 1. Adds a default log4j file that gets loaded if users haven't specified a log4j file. 2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings after building with SBT (and I've seen similar warnings on the mailing list). | ||||
| | * | Merge remote-tracking branch 'apache-github/master' into log4j-fix-2 | Patrick Wendell | 2014-01-01 | 38 | -465/+803 |
| | |\ | | |/ | |/| | | | | | | | Conflicts: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala | ||||
| | * | Adding outer checkout when initializing logging | Patrick Wendell | 2013-12-31 | 1 | -3/+5 |
| | | | |||||
| | * | Tiny typo fix | Patrick Wendell | 2013-12-31 | 1 | -2/+2 |
| | | | |||||
| | * | Removing use in test | Patrick Wendell | 2013-12-31 | 1 | -2/+0 |
| | | | |||||
| | * | Minor fixes | Patrick Wendell | 2013-12-30 | 2 | -3/+3 |
| | | | |||||
| | * | Removing initLogging entirely | Patrick Wendell | 2013-12-30 | 17 | -32/+21 |
| | | | |||||
| | * | Response to Shivaram's review | Patrick Wendell | 2013-12-30 | 2 | -15/+18 |
| | | | |||||
| | * | SPARK-1008: Logging improvements | Patrick Wendell | 2013-12-29 | 4 | -13/+37 |
| | | | | | | | | | | | | | | | | | | 1. Adds a default log4j file that gets loaded if users haven't specified a log4j file. 2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings after building with SBT (and I've seen similar warnings on the mailing list). | ||||
* | | | Merge remote-tracking branch 'apache/master' into conf2 | Matei Zaharia | 2014-01-01 | 12 | -211/+457 |
|\| | | | | | | | | | | | | | | Conflicts: project/SparkBuild.scala | ||||
| * | | Merge pull request #314 from witgo/master | Reynold Xin | 2013-12-31 | 2 | -1356/+240 |
| |\ \ | | | | | | | | | | | | | restore core/pom.xml file modification | ||||
| | * | | restore core/pom.xml file modification | liguoqiang | 2014-01-01 | 2 | -1356/+240 |
| |/ / | |||||
| * | | Merge pull request #73 from falaki/ApproximateDistinctCount | Reynold Xin | 2013-12-31 | 12 | -233/+1595 |
| |\ \ | | | | | | | | | | | | | | | | | | | | Approximate distinct count: Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions to approximately count the number of distinct elements and the number of distinct values per key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both functions take a parameter that controls the trade-off between accuracy and memory consumption. Also added Scala docs and test suites for both methods. | ||||
| | * | | Made the code more compact and readable | Hossein Falaki | 2013-12-31 | 3 | -23/+8 |
| | | | | |||||
| | * | | minor improvements | Hossein Falaki | 2013-12-31 | 2 | -4/+5 |
| | | | | |||||
| | * | | Added Java unit tests for countApproxDistinct and countApproxDistinctByKey | Hossein Falaki | 2013-12-30 | 1 | -0/+32 |
| | | | | |||||
| | * | | Added Java API for countApproxDistinct | Hossein Falaki | 2013-12-30 | 1 | -0/+11 |
| | | | | |||||
| | * | | Added Java API for countApproxDistinctByKey | Hossein Falaki | 2013-12-30 | 1 | -0/+36 |
| | | | | |||||
| | * | | Added stream 2.5.1 jar dependency | Hossein Falaki | 2013-12-30 | 1 | -1/+2 |
| | | | | |||||
| | * | | Renamed countDistinct and countDistinctByKey methods to include Approx | Hossein Falaki | 2013-12-30 | 5 | -15/+15 |
| | | | | |||||
| | * | | Using origin version | Hossein Falaki | 2013-12-30 | 374 | -8424/+19051 |
| | |\ \ | |||||
| | * | | | Removed superfluous abs call from test cases. | Hossein Falaki | 2013-12-10 | 1 | -2/+2 |
| | | | | | |||||
| | * | | | Made SerializableHyperLogLog Externalizable and added Kryo tests | Hossein Falaki | 2013-10-18 | 2 | -5/+10 |
| | | | | | |||||
| | * | | | Added stream-lib dependency to Maven build | Hossein Falaki | 2013-10-18 | 2 | -0/+9 |
| | | | | | |||||
| | * | | | Improved code style. | Hossein Falaki | 2013-10-17 | 4 | -15/+19 |
| | | | | | |||||
| | * | | | Fixed document typo | Hossein Falaki | 2013-10-17 | 2 | -4/+4 |
| | | | | | |||||
| | * | | | Added dependency on stream-lib version 2.4.0 for approximate distinct count support. | Hossein Falaki | 2013-10-17 | 1 | -1/+2 |
| | | | | | |||||
| | * | | | Added countDistinctByKey to PairRDDFunctions that counts the approximate number of unique values for each key in the RDD. | Hossein Falaki | 2013-10-17 | 2 | -0/+81 |
| | | | | | |||||
| | * | | | Added a countDistinct method to RDD that takes an accuracy parameter and returns the (approximate) number of distinct elements in the RDD. | Hossein Falaki | 2013-10-17 | 2 | -1/+38 |
| | | | | | |||||
| | * | | | Added a serializable wrapper for HyperLogLog | Hossein Falaki | 2013-10-17 | 1 | -0/+44 |
| | | | | | |||||
* | | | | | Fix two compile errors introduced in merge | Matei Zaharia | 2013-12-31 | 2 | -2/+2 |
| | | | | | |||||
* | | | | | Merge remote-tracking branch 'apache/master' into conf2 | Matei Zaharia | 2013-12-31 | 30 | -259/+347 |
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala | ||||
| * | | | | Merge pull request #238 from ngbinh/upgradeNetty | Patrick Wendell | 2013-12-31 | 8 | -44/+60 |
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final. The changes are listed at https://github.com/netty/netty/wiki/New-and-noteworthy | ||||
| | * | | | | Fix failed unit tests | Binh Nguyen | 2013-12-27 | 3 | -13/+24 |
| | | | | | | | | | | | | | | | | | | | | | | | | Also clean up a bit. | ||||
| | * | | | | Fix imports order | Binh Nguyen | 2013-12-24 | 3 | -5/+2 |
| | | | | | | |||||
| | * | | | | Remove import * and fix some formatting | Binh Nguyen | 2013-12-24 | 2 | -7/+4 |
| | | | | | | |||||
| | * | | | | upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final | Binh Nguyen | 2013-12-24 | 7 | -31/+42 |
| | | | | | | |||||
| * | | | | | Merge pull request #289 from tdas/filestream-fix | Patrick Wendell | 2013-12-31 | 14 | -196/+269 |
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bug fixes for file input stream and checkpointing - Fixed bugs in the file input stream that led the stream to fail due to transient HDFS errors (for example, listing files failed while a background thread was deleting them). - Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use SparkContext.hadoopConfiguration, to allow checkpoints to be written to any HDFS-compatible store requiring special configuration. - Changed the API of SparkContext.setCheckpointDir(): eliminated the unnecessary 'useExisting' parameter. Now SparkContext will always create a unique subdirectory within the user-specified checkpoint directory. This is to ensure that previous checkpoint files are not accidentally overwritten. - Fixed bug where setting the checkpoint directory as a relative local path caused the checkpointing to fail. | ||||
| | * | | | | | Fixed comments and long lines based on comments on PR 289. | Tathagata Das | 2013-12-31 | 4 | -10/+19 |
| | | | | | | | |||||
| | * | | | | | Minor changes in comments and strings to address comments in PR 289. | Tathagata Das | 2013-12-27 | 1 | -8/+6 |
| | | | | | | | |||||
| | * | | | | | Added warning if filestream adds files with no data in them (file RDDs have 0 partitions). | Tathagata Das | 2013-12-26 | 1 | -0/+7 |
| | | | | | | | |||||
| | * | | | | | Changed file stream to not catch any exceptions related to finding new files (FileNotFound exception is still caught and ignored). | Tathagata Das | 2013-12-26 | 1 | -19/+11 |
| | | | | | | | |||||
| | * | | | | | Removed slack time in file stream and added better handling of failures caused by FileNotFound exceptions. | Tathagata Das | 2013-12-26 | 3 | -50/+21 |
| | | | | | | | |||||
| | * | | | | | Fixed Python API for sc.setCheckpointDir. Also other fixes based on Reynold's comments on PR 289. | Tathagata Das | 2013-12-24 | 7 | -22/+16 |
| | | | | | | | |||||
| | * | | | | | Merge branch 'apache-master' into filestream-fix | Tathagata Das | 2013-12-24 | 37 | -123/+465 |
| | |\ \ \ \ \ | | | | |_|/ / | | | |/| | | | |||||
| | * | | | | | Minor formatting fixes. | Tathagata Das | 2013-12-23 | 3 | -9/+13 |
| | | | | | | | |||||
| | * | | | | | Updated testsuites to work with the slack time of file stream. | Tathagata Das | 2013-12-23 | 3 | -2/+22 |
| | | | | | | | |||||
| | * | | | | | Merge branch 'scheduler-update' into filestream-fix | Tathagata Das | 2013-12-23 | 3 | -4/+26 |
| | |\ \ \ \ \ | |||||
| | * | | | | | | Fixed bug in file stream that prevented some files from being read correctly. | Tathagata Das | 2013-12-23 | 1 | -9/+12 |
| | | | | | | | | |||||
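
The ApproximateDistinctCount commits above say that countApproxDistinct() and countApproxDistinctByKey() use HyperLogLog from stream-lib and take a parameter that trades accuracy for memory. As a rough illustration of that trade-off only, here is a minimal HyperLogLog sketch in Python. It is not stream-lib's or Spark's code: the class name `HyperLogLogSketch`, the register exponent `p`, and the use of SHA-1 as the hash are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Toy HyperLogLog counter (hypothetical name, for illustration only)."""

    def __init__(self, p: int = 12) -> None:
        if not 4 <= p <= 16:
            raise ValueError("p must be in [4, 16]")
        self.p = p                    # more bits -> more registers -> better accuracy
        self.m = 1 << p               # number of registers (memory cost)
        self.registers = [0] * self.m

    def _alpha(self) -> float:
        # Standard bias-correction constants from the HyperLogLog paper.
        if self.m == 16:
            return 0.673
        if self.m == 32:
            return 0.697
        if self.m == 64:
            return 0.709
        return 0.7213 / (1.0 + 1.079 / self.m)

    def add(self, item) -> None:
        # Assumption: a 64-bit slice of SHA-1 stands in for the hash function.
        digest = hashlib.sha1(repr(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)                 # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = position of the first 1-bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def merge(self, other: "HyperLogLogSketch") -> None:
        # Sketches built on separate partitions combine by register-wise max.
        if other.p != self.p:
            raise ValueError("cannot merge sketches with different p")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self) -> float:
        raw = self._alpha() * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        if raw <= 2.5 * self.m:
            zeros = self.registers.count(0)
            if zeros:
                # Small-range correction: fall back to linear counting.
                return self.m * math.log(self.m / zeros)
        return raw
```

With `m = 2**p` registers the relative standard error is roughly `1.04 / sqrt(m)`, so raising `p` buys accuracy at the cost of memory. The `merge` method shows why the structure suits distributed counting: per-partition sketches can be combined without revisiting the data.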