[SPARK-19748][SQL] refresh function has a wrong order to do cache invalidate and regenerate the inmemory var for InMemoryFileIndex with FileStatusCache - spark

author	windpiger <songjun@outlook.com>	2017-02-28 00:16:49 -0800
committer	Wenchen Fan <wenchen@databricks.com>	2017-02-28 00:16:49 -0800
commit	a350bc16d36c58b48ac01f0258678ffcdb77e793 (patch)
tree	5d294d922463cc1e37ef7db78d103913fda25a19 /sql/hive/src/test/resources/golden/groupby6_noskew-2-83c59d378571a6e487aa20217bd87817
parent	73530383538ad72fdc3dd4c670485192f12ebc4e (diff)

## What changes were proposed in this pull request?

If we refresh an InMemoryFileIndex that has a FileStatusCache, it first uses the FileStatusCache to regenerate cachedLeafFiles etc., and only then calls FileStatusCache.invalidateAll. The order of these two actions is wrong: the in-memory variables are rebuilt from the stale cache entries, so the refresh does not take effect.

The code with the wrong order:

```
override def refresh(): Unit = {
  // Wrong order: refresh0() still reads the stale FileStatusCache,
  // because the cache is only invalidated afterwards.
  refresh0()
  fileStatusCache.invalidateAll()
}

private def refresh0(): Unit = {
  val files = listLeafFiles(rootPaths)
  cachedLeafFiles =
    new mutable.LinkedHashMap[Path, FileStatus]() ++= files.map(f => f.getPath -> f)
  cachedLeafDirToChildrenFiles = files.toArray.groupBy(_.getPath.getParent)
  cachedPartitionSpec = null
}
```

## How was this patch tested?

unit test added

Author: windpiger <songjun@outlook.com>

Closes #17079 from windpiger/fixInMemoryFileIndexRefresh.
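
The ordering problem can be reproduced outside Spark with a small, self-contained sketch. Everything here is hypothetical and only mirrors the shape of the real classes: `ToyStatusCache` stands in for FileStatusCache, `ToyFileIndex` for InMemoryFileIndex, and the `invalidateFirst` flag exists purely to compare the two orderings side by side. It is an illustration under those assumptions, not the Spark implementation.

```scala
import scala.collection.mutable

// Hypothetical stand-in for FileStatusCache: directory -> cached listing.
final class ToyStatusCache {
  private val entries = mutable.Map.empty[String, Seq[String]]
  def get(dir: String): Option[Seq[String]] = entries.get(dir)
  def put(dir: String, files: Seq[String]): Unit = entries.update(dir, files)
  def invalidateAll(): Unit = entries.clear()
}

// Hypothetical stand-in for InMemoryFileIndex.
// `listing` plays the role of the file system; `invalidateFirst` selects the ordering.
final class ToyFileIndex(
    dir: String,
    listing: () => Seq[String],
    cache: ToyStatusCache,
    invalidateFirst: Boolean) {

  private var cachedLeafFiles: Seq[String] = Seq.empty
  refresh0() // initial listing, which also populates the cache

  def files: Seq[String] = cachedLeafFiles

  def refresh(): Unit = {
    if (invalidateFirst) {
      cache.invalidateAll() // drop stale entries, then re-list
      refresh0()
    } else {
      refresh0()            // re-list through the still-populated cache
      cache.invalidateAll()
    }
  }

  // Rebuild the in-memory state, consulting the cache before the "file system".
  private def refresh0(): Unit = {
    cachedLeafFiles = cache.get(dir).getOrElse {
      val fresh = listing()
      cache.put(dir, fresh)
      fresh
    }
  }
}

object RefreshOrderDemo {
  // Returns what the index sees after a file is added and refresh() is called.
  def run(invalidateFirst: Boolean): Seq[String] = {
    var onDisk = Seq("part-0")
    val index = new ToyFileIndex("/t", () => onDisk, new ToyStatusCache, invalidateFirst)
    onDisk = Seq("part-0", "part-1") // a new file appears after the index was built
    index.refresh()
    index.files
  }
}
```

In this sketch, `RefreshOrderDemo.run(invalidateFirst = false)` still returns only `part-0`, because the listing is served from the cache before it is cleared, while `RefreshOrderDemo.run(invalidateFirst = true)` returns both `part-0` and `part-1`.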

Diffstat (limited to 'sql/hive/src/test/resources/golden/groupby6_noskew-2-83c59d378571a6e487aa20217bd87817')

0 files changed, 0 insertions, 0 deletions
