[SPARK-3954][Streaming] Optimization to FileInputDStream - spark

author	surq <surq@asiainfo.com>	2014-11-10 17:37:16 -0800
committer	Tathagata Das <tathagata.das1565@gmail.com>	2014-11-10 17:37:16 -0800
commit	ce6ed2abd14de26b9ceaa415e9a42fbb1338f5fa (patch)
tree	ac53a05bc85b0df250ac3887627b36dcc3fca856 /bin
parent	a1fc059b69c9ed150bf8a284404cc149ddaa27d6 (diff)
download	spark-ce6ed2abd14de26b9ceaa415e9a42fbb1338f5fa.tar.gz spark-ce6ed2abd14de26b9ceaa415e9a42fbb1338f5fa.tar.bz2 spark-ce6ed2abd14de26b9ceaa415e9a42fbb1338f5fa.zip

[SPARK-3954][Streaming] Optimization to FileInputDStream

Converting files to RDDs currently loops over the file sequence three times in the Spark source:

1. files.map(...)
2. files.zip(fileRDDs)
3. files-size.foreach

This is very time-consuming when there are many files, so this change collapses the three loops over the file sequence into a single loop.

Author: surq <surq@asiainfo.com>

Closes #2811 from surq/SPARK-3954 and squashes the following commits:

321bbe8 [surq] updated the code style.The style from [for...yield]to [files.map(file=>{})]
88a2c20 [surq] Merge branch 'master' of https://github.com/apache/spark into SPARK-3954
178066f [surq] modify code's style. [Exceeds 100 columns]
626ef97 [surq] remove redundant import(ArrayBuffer)
739341f [surq] promote the speed of convert files to RDDS
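
The loop fusion described in this commit message can be illustrated with a minimal, Spark-free sketch. It is not the actual FileInputDStream code: FakeRdd, load, multiPass and singlePass are hypothetical names, and the rule that a path containing "empty" yields zero partitions is an assumption made only so the example has something to check.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for an RDD: it only records a path and a partition count.
final case class FakeRdd(path: String, partitions: Int)

object FilesToRddSketch {
  // Hypothetical loader. Assumption for illustration only:
  // a path containing "empty" produces an RDD with zero partitions.
  def load(path: String): FakeRdd =
    FakeRdd(path, if (path.contains("empty")) 0 else 1)

  // Multi-pass shape: build the RDDs, then zip them with the files,
  // then walk the pairs again to check each one.
  def multiPass(files: Seq[String]): (Seq[FakeRdd], List[String]) = {
    val rdds = files.map(load)
    val warnings = ArrayBuffer.empty[String]
    files.zip(rdds).foreach { case (file, rdd) =>
      if (rdd.partitions == 0) warnings += s"No partitions for $file"
    }
    (rdds, warnings.toList)
  }

  // Single-pass shape: load and check each file inside one map.
  def singlePass(files: Seq[String]): (Seq[FakeRdd], List[String]) = {
    val warnings = ArrayBuffer.empty[String]
    val rdds = files.map { file =>
      val rdd = load(file)
      if (rdd.partitions == 0) warnings += s"No partitions for $file"
      rdd
    }
    (rdds, warnings.toList)
  }
}
```

Both functions return the same RDDs and the same warnings for a strict sequence such as a List; the single-pass version avoids the intermediate zipped sequence and the extra traversals, which matters most when the file list is long.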

Diffstat (limited to 'bin')

0 files changed, 0 insertions, 0 deletions

