author     Sean Owen <sowen@cloudera.com>            2014-10-09 18:21:59 -0700
committer  Michael Armbrust <michael@databricks.com> 2014-10-09 18:21:59 -0700
commit     363baacaded56047bcc63276d729ab911e0336cf (patch)
tree       c6114d94f6b6f04c386df0414f17d3a0e6c87a33 /mllib/src
parent     2837bf8548db7e9d43f6eefedf5a73feb22daedb (diff)
SPARK-3811 [CORE] More robust / standard Utils.deleteRecursively, Utils.createTempDir
I noticed a few issues with how temp directories are created and deleted:

*Minor*

* Guava's `Files.createTempDir()` plus `File.deleteOnExit()` is used in many tests to make a temp dir, but `Utils.createTempDir()` seems to be the standard Spark mechanism
* The call to `File.deleteOnExit()` could be pushed into `Utils.createTempDir()` as well, along with this replacement
* _I messed up the message in an exception in `Utils` in SPARK-3794; fixed here_

*Bit Less Minor*

* `Utils.deleteRecursively()` fails immediately if any `IOException` occurs, instead of trying to delete any remaining files and subdirectories. I've observed this leave temp dirs around. I suggest changing it to continue in the face of an exception and throw one of the possibly several exceptions that occur at the end.
* `Utils.createTempDir()` adds a JVM shutdown hook every time the method is called, even when the new dir sits inside a dir already registered for deletion, since that check only happens inside the hook itself. However, `Utils` already manages a set of all dirs to delete on shutdown, called `shutdownDeletePaths`, and a single hook can be registered to delete all of these on exit. This is how Tachyon temp paths are cleaned up in `TachyonBlockManager`.

I noticed a few other things that might be changed but wanted to ask first:

* Shouldn't the set of dirs to delete hold `File` objects, not just `String` paths?
* `Utils` manages the set of `TachyonFile` instances registered for deletion, but the shutdown hook is managed in `TachyonBlockManager`. Shouldn't this logic live together, and outside `Utils`? It's specific to Tachyon and looks a bit odd to import in such a generic place.
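The "continue in the face of an exception" strategy described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual patch: the object name `DeleteSketch` is made up, and the logic simply attempts every child, remembers one of the `IOException`s that occurred, and rethrows it at the end instead of aborting on the first failure.

```scala
import java.io.{File, IOException}

// Hypothetical sketch of a deleteRecursively that keeps going on errors.
object DeleteSketch {
  def deleteRecursively(file: File): Unit = {
    var savedIOException: IOException = null
    if (file.isDirectory) {
      // listFiles() can return null (e.g. on an I/O error); treat that as empty.
      val children = Option(file.listFiles()).getOrElse(Array.empty[File])
      for (child <- children) {
        try {
          deleteRecursively(child)
        } catch {
          // Remember one exception, but keep deleting the remaining entries.
          case ioe: IOException => savedIOException = ioe
        }
      }
      // Only after every child was attempted, surface one of the failures.
      if (savedIOException != null) throw savedIOException
    }
    // Delete the file or (now hopefully empty) directory itself.
    if (!file.delete() && file.exists()) {
      throw new IOException("Failed to delete: " + file.getAbsolutePath)
    }
  }
}
```

Under this scheme a single unreadable file no longer leaves all its siblings behind, which matches the leftover-temp-dir symptom described above.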
Author: Sean Owen <sowen@cloudera.com>

Closes #2670 from srowen/SPARK-3811 and squashes the following commits:

071ae60 [Sean Owen] Update per @vanzin's review
da0146d [Sean Owen] Make Utils.deleteRecursively try to delete all paths even when an exception occurs; use one shutdown hook instead of one per method call to delete temp dirs
3a0faa4 [Sean Owen] Standardize on Utils.createTempDir instead of Files.createTempDir
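The single-shutdown-hook scheme mentioned in the commit message (one shared registry of paths, one hook) can be sketched as below. This is an illustrative approximation, not Spark's actual `Utils` code: `ShutdownDeleteSketch` and `deleteQuietly` are invented names, though `shutdownDeletePaths` and `hasShutdownDeleteDir` mirror identifiers the message and the real `Utils` use.

```scala
import java.io.File
import scala.collection.mutable

// Hypothetical sketch: every createTempDir-style call registers its directory
// in one shared set, and exactly one JVM shutdown hook deletes them all.
object ShutdownDeleteSketch {
  private val shutdownDeletePaths = mutable.HashSet.empty[String]
  private var hookRegistered = false

  def registerShutdownDeleteDir(dir: File): Unit = shutdownDeletePaths.synchronized {
    if (!hookRegistered) {
      // Install the single hook lazily, on the first registration only.
      Runtime.getRuntime.addShutdownHook(new Thread {
        override def run(): Unit = shutdownDeletePaths.synchronized {
          shutdownDeletePaths.foreach(p => deleteQuietly(new File(p)))
        }
      })
      hookRegistered = true
    }
    shutdownDeletePaths += dir.getAbsolutePath
  }

  def hasShutdownDeleteDir(dir: File): Boolean = shutdownDeletePaths.synchronized {
    shutdownDeletePaths.contains(dir.getAbsolutePath)
  }

  // Best-effort recursive delete for use inside the hook.
  private def deleteQuietly(f: File): Unit = {
    Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteQuietly)
    f.delete()
  }
}
```

Compared with registering one hook per `createTempDir()` call, this caps the number of hooks at one regardless of how many temp dirs a test suite creates.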
Diffstat (limited to 'mllib/src')
-rw-r--r--  mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala  9
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
index 8ef2bb1bf6..0dbe766b4d 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
@@ -67,8 +67,7 @@ class MLUtilsSuite extends FunSuite with LocalSparkContext {
|0
|0 2:4.0 4:5.0 6:6.0
""".stripMargin
- val tempDir = Files.createTempDir()
- tempDir.deleteOnExit()
+ val tempDir = Utils.createTempDir()
val file = new File(tempDir.getPath, "part-00000")
Files.write(lines, file, Charsets.US_ASCII)
val path = tempDir.toURI.toString
@@ -100,7 +99,7 @@ class MLUtilsSuite extends FunSuite with LocalSparkContext {
LabeledPoint(1.1, Vectors.sparse(3, Seq((0, 1.23), (2, 4.56)))),
LabeledPoint(0.0, Vectors.dense(1.01, 2.02, 3.03))
), 2)
- val tempDir = Files.createTempDir()
+ val tempDir = Utils.createTempDir()
val outputDir = new File(tempDir, "output")
MLUtils.saveAsLibSVMFile(examples, outputDir.toURI.toString)
val lines = outputDir.listFiles()
@@ -166,7 +165,7 @@ class MLUtilsSuite extends FunSuite with LocalSparkContext {
Vectors.sparse(2, Array(1), Array(-1.0)),
Vectors.dense(0.0, 1.0)
), 2)
- val tempDir = Files.createTempDir()
+ val tempDir = Utils.createTempDir()
val outputDir = new File(tempDir, "vectors")
val path = outputDir.toURI.toString
vectors.saveAsTextFile(path)
@@ -181,7 +180,7 @@ class MLUtilsSuite extends FunSuite with LocalSparkContext {
LabeledPoint(0.0, Vectors.sparse(2, Array(1), Array(-1.0))),
LabeledPoint(1.0, Vectors.dense(0.0, 1.0))
), 2)
- val tempDir = Files.createTempDir()
+ val tempDir = Utils.createTempDir()
val outputDir = new File(tempDir, "points")
val path = outputDir.toURI.toString
points.saveAsTextFile(path)