author: Tathagata Das <tathagata.das1565@gmail.com> 2016-10-24 17:21:16 -0700
committer: Shixiong Zhu <shixiong@databricks.com> 2016-10-24 17:21:16 -0700
commit: 407c3cedf29a4413339dcde758295dc3225a0054
tree: 84b0724da050ed0b2f981b8dc6d29c70b114c2ee /sql/hive
parent: 81d6933e75579343b1dd14792c18149e97e92cdd
[SPARK-17624][SQL][STREAMING][TEST] Fixed flaky StateStoreSuite.maintenance
## What changes were proposed in this pull request?

The reason for the flakiness was as follows. The test starts the maintenance background thread and then writes 20 versions of the state store. The maintenance thread is expected to create snapshots in the middle and clean up old files that are no longer needed. The earliest delta file (1.delta) is expected to be deleted, because the snapshots ensure that the earliest delta is no longer needed.

However, the default configuration for the maintenance thread is to retain only the files needed to recover the last 2 versions and to delete the rest. While the versions are being generated, the maintenance thread can kick in and create snapshots anywhere between version 10 and version 20 (at least 10 deltas are needed for a snapshot). Later it will choose to retain only versions 20 and 19 (the last 2). There are two cases.

- Common case: One of the versions between 10 and 19 gets snapshotted. For example, if version 19 is snapshotted, recovering versions 19 and 20 needs only 19.snapshot and 20.delta, so 1.delta gets deleted.
- Uncommon case (reason for flakiness): Only version 20 gets snapshotted. Then recovering version 20 requires 20.snapshot, and recovering version 19 requires all the previous deltas, 19.delta through 1.delta. So 1.delta does not get deleted.

This PR rearranges the checks so that the test creates 20 versions, waits until there is at least one snapshot, and then creates another 20 versions. This ensures that the latest 2 versions cannot require anything older than the first snapshot generated, and therefore 1.delta will be deleted.

In addition, I have added more logs and comments that I felt would help future debugging and understanding of what is going on.

## How was this patch tested?

Ran the StateStoreSuite more than 6K times on a heavily loaded machine (10 instances of the tests running in parallel). No failures.

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #15592 from tdas/SPARK-17624.
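The two cases described in the commit message can be checked with a small model of the retention rule. This is only an illustrative sketch in Python, not Spark's actual maintenance code: the function names `files_needed` and `retained_files` are hypothetical, and the model assumes that a version is recovered from the newest snapshot at or before it plus every later delta.

```python
def files_needed(version, snapshots):
    """Files required to rebuild `version` in this simplified model.

    Assumption: recovery starts from the newest snapshot at or before
    `version` and replays every delta after that snapshot. With no usable
    snapshot, every delta from 1.delta onward is replayed.
    """
    base = max((s for s in snapshots if s <= version), default=0)
    needed = set()
    if base > 0:
        needed.add(f"{base}.snapshot")
    needed.update(f"{v}.delta" for v in range(base + 1, version + 1))
    return needed


def retained_files(latest, snapshots, versions_to_retain=2):
    """Union of the files needed to recover the last `versions_to_retain` versions."""
    keep = set()
    for v in range(latest - versions_to_retain + 1, latest + 1):
        keep |= files_needed(v, snapshots)
    return keep


# Common case: version 19 was snapshotted, so 1.delta is not retained.
common = retained_files(20, {19})

# Uncommon case: only version 20 was snapshotted, so version 19 still
# depends on every delta back to 1.delta.
uncommon = retained_files(20, {20})

# After the fix: a snapshot exists somewhere in 10..20 before versions
# 21..40 are written, so versions 39 and 40 never reach back to 1.delta.
after_fix = retained_files(40, {20})

print("1.delta" in common, "1.delta" in uncommon, "1.delta" in after_fix)
```

In this model, `1.delta` is retained only in the uncommon case, which matches the flaky outcome described above; with 40 versions and any first snapshot between versions 10 and 20, it is always eligible for deletion.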
Diffstat (limited to 'sql/hive')
0 files changed, 0 insertions, 0 deletions