<?xml version="1.0" encoding="ascii"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>pyspark.mllib.classification.NaiveBayes</title>
<link rel="stylesheet" href="epydoc.css" type="text/css" />
<script type="text/javascript" src="epydoc.js"></script>
</head>
<body bgcolor="white" text="black" link="blue" vlink="#204080"
alink="#204080">
<!-- ==================== NAVIGATION BAR ==================== -->
<table class="navbar" border="0" width="100%" cellpadding="0"
bgcolor="#a0c0ff" cellspacing="0">
<tr valign="middle">
<!-- Home link -->
<th> <a
href="pyspark-module.html">Home</a> </th>
<!-- Tree link -->
<th> <a
href="module-tree.html">Trees</a> </th>
<!-- Index link -->
<th> <a
href="identifier-index.html">Indices</a> </th>
<!-- Help link -->
<th> <a
href="help.html">Help</a> </th>
<!-- Project homepage -->
<th class="navbar" align="right" width="100%">
<table border="0" cellpadding="0" cellspacing="0">
<tr><th class="navbar" align="center"
><a class="navbar" target="_top" href="http://spark.apache.org">Spark 1.0.0 Python API Docs</a></th>
</tr></table></th>
</tr>
</table>
<table width="100%" cellpadding="0" cellspacing="0">
<tr valign="top">
<td width="100%">
</td>
<td>
<table cellpadding="0" cellspacing="0">
<!-- hide/show private -->
<tr><td align="right"><span class="options"
>[<a href="frames.html" target="_top">frames</a
>] | <a href="pyspark.mllib.classification.NaiveBayes-class.html"
target="_top">no frames</a>]</span></td></tr>
</table>
</td>
</tr>
</table>
<!-- ==================== CLASS DESCRIPTION ==================== -->
<h1 class="epydoc">Class NaiveBayes</h1><p class="nomargin-top"><span class="codelink"><a href="pyspark.mllib.classification-pysrc.html#NaiveBayes">source code</a></span></p>
<pre class="base-tree">
object --+
|
<strong class="uidshort">NaiveBayes</strong>
</pre>
<hr />
<!-- ==================== INSTANCE METHODS ==================== -->
<a name="section-InstanceMethods"></a>
<table class="summary" border="1" cellpadding="3"
cellspacing="0" width="100%" bgcolor="white">
<tr bgcolor="#70b0f0" class="table-header">
<td align="left" colspan="2" class="table-header">
<span class="table-header">Instance Methods</span></td>
</tr>
<tr>
<td colspan="2" class="summary">
<p class="indent-wrapped-lines"><b>Inherited from <code>object</code></b>:
<code>__delattr__</code>,
<code>__format__</code>,
<code>__getattribute__</code>,
<code>__hash__</code>,
<code>__init__</code>,
<code>__new__</code>,
<code>__reduce__</code>,
<code>__reduce_ex__</code>,
<code>__repr__</code>,
<code>__setattr__</code>,
<code>__sizeof__</code>,
<code>__str__</code>,
<code>__subclasshook__</code>
</p>
</td>
</tr>
</table>
<!-- ==================== CLASS METHODS ==================== -->
<a name="section-ClassMethods"></a>
<table class="summary" border="1" cellpadding="3"
cellspacing="0" width="100%" bgcolor="white">
<tr bgcolor="#70b0f0" class="table-header">
<td align="left" colspan="2" class="table-header">
<span class="table-header">Class Methods</span></td>
</tr>
<tr>
<td width="15%" align="right" valign="top" class="summary">
<span class="summary-type"> </span>
</td><td class="summary">
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tr>
<td><span class="summary-sig"><a href="pyspark.mllib.classification.NaiveBayes-class.html#train" class="summary-sig-name">train</a>(<span class="summary-sig-arg">cls</span>,
<span class="summary-sig-arg">data</span>,
<span class="summary-sig-arg">lambda_</span>=<span class="summary-sig-default">1.0</span>)</span><br />
Train a Naive Bayes model given an RDD of (label, features) vectors.</td>
<td align="right" valign="top">
<span class="codelink"><a href="pyspark.mllib.classification-pysrc.html#NaiveBayes.train">source code</a></span>
</td>
</tr>
</table>
</td>
</tr>
</table>
<!-- ==================== PROPERTIES ==================== -->
<a name="section-Properties"></a>
<table class="summary" border="1" cellpadding="3"
cellspacing="0" width="100%" bgcolor="white">
<tr bgcolor="#70b0f0" class="table-header">
<td align="left" colspan="2" class="table-header">
<span class="table-header">Properties</span></td>
</tr>
<tr>
<td colspan="2" class="summary">
<p class="indent-wrapped-lines"><b>Inherited from <code>object</code></b>:
<code>__class__</code>
</p>
</td>
</tr>
</table>
<!-- ==================== METHOD DETAILS ==================== -->
<a name="section-MethodDetails"></a>
<table class="details" border="1" cellpadding="3"
cellspacing="0" width="100%" bgcolor="white">
<tr bgcolor="#70b0f0" class="table-header">
<td align="left" colspan="2" class="table-header">
<span class="table-header">Method Details</span></td>
</tr>
</table>
<a name="train"></a>
<div>
<table class="details" border="1" cellpadding="3"
cellspacing="0" width="100%" bgcolor="white">
<tr><td>
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tr valign="top"><td>
<h3 class="epydoc"><span class="sig"><span class="sig-name">train</span>(<span class="sig-arg">cls</span>,
<span class="sig-arg">data</span>,
<span class="sig-arg">lambda_</span>=<span class="sig-default">1.0</span>)</span>
<br /><em class="fname">Class Method</em>
</h3>
</td><td align="right" valign="top"
><span class="codelink"><a href="pyspark.mllib.classification-pysrc.html#NaiveBayes.train">source code</a></span>
</td>
</tr></table>
<p>Train a Naive Bayes model given an RDD of (label, features)
vectors.</p>
<p>This implements multinomial Naive Bayes (<a href="http://tinyurl.com/lsdw6p"
target="_top">http://tinyurl.com/lsdw6p</a>), which handles discrete
feature data such as counts. For example, converting documents into
TF-IDF vectors lets it be used for document classification. If every
feature vector is a 0-1 vector, it can also be used as Bernoulli Naive
Bayes (<a href="http://tinyurl.com/p7c96j6"
target="_top">http://tinyurl.com/p7c96j6</a>).</p>
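<p>As an illustration of the 0-1 conversion mentioned above, here is a
minimal sketch in plain Python. The helper name <code>binarize</code> and
the sample rows are hypothetical and are not part of the PySpark API; the
sketch only assumes the row layout this method expects, with the label in
the first coordinate and the features after it.</p>

```python
def binarize(row):
    """Keep the label (first coordinate); map each feature to 0.0 or 1.0."""
    label, features = row[0], row[1:]
    # Any nonzero count becomes 1.0, a zero count stays 0.0.
    return [label] + [1.0 if value else 0.0 for value in features]


# Hypothetical count vectors: [label, feature_0, feature_1, feature_2]
rows = [[0.0, 3.0, 0.0, 1.0],
        [1.0, 0.0, 2.0, 0.0]]
binary_rows = [binarize(row) for row in rows]
```

<p>With an RDD, the same function could presumably be applied through
<code>map</code> before calling <code>train</code>; that step is left out
here so the sketch runs without a Spark context.</p>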
<dl class="fields">
<dt>Parameters:</dt>
<dd><ul class="nomargin-top">
<li><strong class="pname"><code>data</code></strong> - RDD of NumPy vectors, one per element, where the first coordinate
is the label and the rest is the feature vector (e.g. a count
vector).</li>
<li><strong class="pname"><code>lambda_</code></strong> - The additive smoothing parameter (default 1.0).</li>
</ul></dd>
</dl>
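<p>To show how the smoothing parameter enters the estimates, the sketch
below fits a textbook multinomial Naive Bayes model in plain Python. It
is not the Spark implementation: the function names
<code>train_multinomial_nb</code> and <code>predict</code>, the returned
tuple, and the sample rows are all assumptions made for illustration.
Only the input layout (label first, feature counts after) and the
<code>lambda_</code> default of 1.0 come from the documentation above.</p>

```python
import math
from collections import defaultdict


def train_multinomial_nb(rows, lambda_=1.0):
    """Fit a textbook multinomial NB with additive smoothing.

    rows: lists whose first coordinate is the label and whose remaining
    coordinates are non-negative feature counts.
    """
    num_features = len(rows[0]) - 1
    doc_counts = defaultdict(int)
    feature_sums = defaultdict(lambda: [0.0] * num_features)
    for row in rows:
        label = row[0]
        doc_counts[label] += 1
        sums = feature_sums[label]
        for j, value in enumerate(row[1:]):
            sums[j] += value

    labels = sorted(doc_counts)
    total = len(rows)
    log_prior = {}
    log_theta = {}
    for c in labels:
        # Smoothed class prior: (n_c + lambda) / (N + lambda * num_classes)
        log_prior[c] = math.log(
            (doc_counts[c] + lambda_) / (total + lambda_ * len(labels)))
        # Smoothed feature likelihood: (count + lambda) / (sum + lambda * D)
        denom = sum(feature_sums[c]) + lambda_ * num_features
        log_theta[c] = [math.log((s + lambda_) / denom)
                        for s in feature_sums[c]]
    return labels, log_prior, log_theta


def predict(model, features):
    """Return the label with the highest log-posterior score."""
    labels, log_prior, log_theta = model

    def score(c):
        return log_prior[c] + sum(
            x * t for x, t in zip(features, log_theta[c]))

    return max(labels, key=score)


# Hypothetical training rows: [label, count_0, count_1]
rows = [[0.0, 2.0, 0.0], [0.0, 3.0, 1.0],
        [1.0, 0.0, 2.0], [1.0, 1.0, 3.0]]
model = train_multinomial_nb(rows, lambda_=1.0)
```

<p>A larger <code>lambda_</code> pulls the per-class feature
probabilities toward uniform, which keeps a feature that never occurs
with a class from receiving zero probability.</p>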
</td></tr></table>
</div>
<br />
<table border="0" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td align="left" class="footer">
Generated by Epydoc 3.0.1 on Fri May 30 01:48:46 2014
</td>
<td align="right" class="footer">
<a target="mainFrame" href="http://epydoc.sourceforge.net"
>http://epydoc.sourceforge.net</a>
</td>
</tr>
</table>
<script type="text/javascript">
<!--
// Private objects are initially displayed (because if
// javascript is turned off then we want them to be
// visible); but by default, we want to hide them. So hide
// them unless we have a cookie that says to show them.
checkCookie();
// -->
</script>
</body>
</html>