author    Matei Zaharia <matei@databricks.com>  2016-08-03 19:12:55 -0700
committer Matei Zaharia <matei@databricks.com>  2016-08-03 19:12:55 -0700
commit    9700f2f4afe566412bdb73b443b3aad99b375af1 (patch)
tree      34e05bf37c3951c1308db06673e982e0cae20509 /faq.md
parent    612383440bfa727a9980cf83460b8f32df8d666c (diff)
Trademarks page and some FAQ cleanup
Diffstat (limited to 'faq.md')
-rw-r--r--  faq.md  25
1 file changed, 15 insertions, 10 deletions
diff --git a/faq.md b/faq.md
index d8de57546..f5c8565ba 100644
--- a/faq.md
+++ b/faq.md
@@ -33,21 +33,14 @@ Spark is a fast and general processing engine compatible with Hadoop data. It ca
<p class="question">Do I need Hadoop to run Spark?</p>
<p class="answer">No, but if you run on a cluster, you will need some form of shared file system (for example, NFS mounted at the same path on each node). If you have this type of filesystem, you can just deploy Spark in standalone mode.</p>
-<p class="question">How can I access data in S3?</p>
-<p class="answer">Use the <code>s3n://</code> URI scheme (<code>s3n://bucket/path</code>). You will also need to set your Amazon security credentials, either by setting the environment variables <code>AWS_ACCESS_KEY_ID</code> and <code>AWS_SECRET_ACCESS_KEY</code> before your program runs, or by setting <code>fs.s3.awsAccessKeyId</code> and <code>fs.s3.awsSecretAccessKey</code> in <code>SparkContext.hadoopConfiguration</code>.</p>
-
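(For illustration: the removed answer above describes supplying S3 credentials either through environment variables or through <code>SparkContext.hadoopConfiguration</code>. A minimal Scala sketch of the second approach, with placeholder bucket and key values, and not runnable without a Spark installation on the classpath:)

```scala
import org.apache.spark.{SparkConf, SparkContext}

object S3AccessExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("s3-example"))

    // Option 1 (not shown): export AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
    // in the environment before the program starts.

    // Option 2: set the Amazon credentials on the Hadoop configuration directly.
    sc.hadoopConfiguration.set("fs.s3.awsAccessKeyId", "YOUR_ACCESS_KEY")     // placeholder
    sc.hadoopConfiguration.set("fs.s3.awsSecretAccessKey", "YOUR_SECRET_KEY") // placeholder

    // Read using the s3n:// URI scheme described in the answer.
    val lines = sc.textFile("s3n://your-bucket/path")  // placeholder bucket/path
    println(lines.count())
    sc.stop()
  }
}
```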
<p class="question">Does Spark require modified versions of Scala or Python?</p>
<p class="answer">No. Spark requires no changes to Scala and no compiler plugins. The Python API uses the standard CPython implementation, and can call into existing C libraries for Python such as NumPy.</p>
-<p class="question">What are good resources for learning Scala?</p>
-<p class="answer">Check out <a href="http://www.artima.com/scalazine/articles/steps.html">First Steps to Scala</a> for a quick introduction, the <a href="http://www.scala-lang.org/docu/files/ScalaTutorial.pdf">Scala tutorial for Java programmers</a>, or the free online book <a href="http://www.artima.com/pins1ed/">Programming in Scala</a>. Scala is easy to transition to if you have Java experience or experience in a similarly high-level language (e.g. Ruby).</p>
-
-
-<p>In addition, Spark also has <a href="{{site.url}}docs/latest/java-programming-guide.html">Java</a> and <a href="{{site.url}}docs/latest/python-programming-guide.html">Python</a> APIs.</p>
-
<p class="question">I understand Spark Streaming uses micro-batching. Does this increase latency?</p>
-While Spark does use a micro-batch execution model, this does not have much impact on applications, because the batches can be as short as 0.5 seconds. In most applications of streaming big data, the analytics is done over a larger window (say 10 minutes), or the latency to get data in is higher (e.g. sensors collect readings every 10 seconds). The benefit of Spark's micro-batch model is that it enables <a href="http://people.csail.mit.edu/matei/papers/2013/sosp_spark_streaming.pdf">exactly-once semantics</a>, meaning the system can recover all intermediate state and results on failure.
+<p class="answer">
+While Spark does use a micro-batch execution model, this does not have much impact on applications, because the batches can be as short as 0.5 seconds. In most applications of streaming big data, the analytics is done over a larger window (say 10 minutes), or the latency to get data in is higher (e.g. sensors collect readings every 10 seconds). Spark's model enables <a href="http://people.csail.mit.edu/matei/papers/2013/sosp_spark_streaming.pdf">exactly-once semantics and consistency</a>, meaning the system gives correct results despite slow nodes or failures.
+</p>
<p class="question">Where can I find high-resolution versions of the Spark logo?</p>
@@ -60,6 +53,18 @@ While Spark does use a micro-batch execution model, this does not have much impa
in all uses of these logos.
</p>
+<p class="question">Can I provide commercial software or services based on Spark?</p>
+
+<p class="answer">
+Yes, as long as you respect the Apache Software Foundation's
+<a href="https://www.apache.org/licenses/">software license</a>
+and <a href="https://www.apache.org/foundation/marks/">trademark policy</a>.
+In particular, note that there are strong restrictions on how third-party products
+may use the "Spark" name (names based on Spark are generally not allowed).
+Please also refer to our
+<a href="{{site.url}}trademarks.html">trademark policy summary</a>.
+</p>
+
<p class="question">How can I contribute to Spark?</p>
<p class="answer">See the <a href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark">Contributing to Spark wiki</a> for more information.</p>