path: root/python/pyspark/ml/classification.py
author: Andrew Or <andrew@databricks.com>, 2015-12-03 11:09:29 -0800
committer: Andrew Or <andrew@databricks.com>, 2015-12-03 11:09:29 -0800
commit: 688e521c2833a00069272a6749153d721a0996f6
tree: 8dca718b9f02b07ad18297cb4b9570579f939857 /python/pyspark/ml/classification.py
parent: 649be4fa4532dcd3001df8345f9f7e970a3fbc65
[SPARK-12108] Make event logs smaller
**Problem.** Event logs in 1.6 were much bigger than 1.5. I ran page rank and the event log size in 1.6 was almost 5x that in 1.5. I did a bisect to find that the RDD callsite added in #9398 is largely responsible for this.

**Solution.** This patch removes the long form of the callsite (which is not used!) from the event log. This reduces the size of the event log significantly.

*Note on compatibility*: if this patch is to be merged into 1.6.0, then it won't break any compatibility. Otherwise, if it is merged into 1.6.1, then we might need to add more backward compatibility handling logic (currently does not exist yet).

Author: Andrew Or <andrew@databricks.com>

Closes #10115 from andrewor14/smaller-event-logs.
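As a rough illustration of why dropping an unused long-form call site shrinks a log, here is a minimal Python sketch. It is not the patch itself: the event layout and the field names `rdd_infos`, `callsite_short`, and `callsite_long` are hypothetical stand-ins, not Spark's actual event log schema.

```python
import json


def strip_long_callsite(event: dict) -> dict:
    """Return a copy of `event` without the (hypothetical) long-form call site.

    The input event is left unmodified.
    """
    slim = dict(event)
    if "rdd_infos" in event:
        slim["rdd_infos"] = [
            {key: value for key, value in info.items() if key != "callsite_long"}
            for info in event["rdd_infos"]
        ]
    return slim


def serialized_size(event: dict) -> int:
    """Size in bytes of the event when written as one JSON line."""
    return len(json.dumps(event).encode("utf-8"))


# Hypothetical event: every RDD carries a short call site plus a long,
# stack-trace-like form that nothing downstream reads.
sample_event = {
    "event": "StageSubmitted",
    "rdd_infos": [
        {
            "rdd_id": i,
            "callsite_short": "map at PageRank.scala:1",
            "callsite_long": "\n".join(
                f"org.example.Frame{j}.call(Frame.scala:{j})" for j in range(20)
            ),
        }
        for i in range(5)
    ],
}

slim_event = strip_long_callsite(sample_event)
before = serialized_size(sample_event)
after = serialized_size(slim_event)
print(f"before={before} bytes, after={after} bytes")
```

Because the long form repeats for every RDD in every event, removing it cuts the per-event size by roughly the length of the stack-like string times the number of RDDs, which is the kind of saving the message describes.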
Diffstat (limited to 'python/pyspark/ml/classification.py')
0 files changed, 0 insertions, 0 deletions