author	Dhruve Ashar <dhruveashar@gmail.com>	2016-05-04 08:45:43 -0500
committer	Tom Graves <tgraves@yahoo-inc.com>	2016-05-04 08:45:43 -0500
commit	a45647746d1efb90cb8bc142c2ef110a0db9bc9f (patch)
tree	1c6cdb00bce295b2d16a98860848a19c72c4aa30 /docs
parent	abecbcd5e9598471b705a2f701731af1adc9d48b (diff)
[SPARK-4224][CORE][YARN] Support group acls
## What changes were proposed in this pull request?

Currently only a list of users can be specified for view and modify acls. This change enables a group of admins/devs/users to be provisioned for viewing and modifying Spark jobs.

**Changes proposed in the fix**

Three new corresponding config entries have been added where the user can specify the groups to be given access:

```
spark.admin.acls.groups
spark.modify.acls.groups
spark.ui.view.acls.groups
```

New config entries were added because specifying the users and groups explicitly is a better and cleaner way compared to specifying them in the existing config entry using a delimiter. A generic trait has been introduced to provide the user-to-group mapping, which makes it pluggable to support a variety of mapping protocols, similar to the one used in Hadoop. A default Unix shell based implementation has been provided. A custom user-to-group mapping protocol can be specified and configured by the entry ```spark.user.groups.mapping```.

**How the patch was tested**

We ran different Spark jobs setting the config entries in combinations of admin, modify and ui acls. For modify acls we tried killing the job stages from the UI and using YARN commands. For view acls we tried accessing the UI tabs and the logs. Headless accounts were used to launch these jobs and different users tried to modify and view the jobs to ensure that the group mappings applied correctly.

Additional unit tests have been added without modifying the existing ones. They test different ways of setting the acls through configuration and/or API and validate the expected behavior.

Author: Dhruve Ashar <dhruveashar@gmail.com>

Closes #12760 from dhruve/impr/SPARK-4224.
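The pluggable mapping described above can be illustrated with a short Scala sketch. The trait is redeclared locally so the snippet compiles standalone; in Spark the real trait is `org.apache.spark.security.GroupMappingServiceProvider`, and `StaticGroupsMappingProvider` with its user table is a hypothetical example, not part of the patch.

```scala
// Sketch of a custom user-to-group mapping provider. The trait mirrors the
// shape described in the commit; it is redeclared here so the example is
// self-contained (in Spark it lives in org.apache.spark.security).
trait GroupMappingServiceProvider {
  def getGroups(userName: String): Set[String]
}

// Hypothetical provider backed by a static table. A real deployment would
// query LDAP, /etc/group, or another directory service instead.
class StaticGroupsMappingProvider extends GroupMappingServiceProvider {
  private val table: Map[String, Set[String]] = Map(
    "alice" -> Set("admins", "devs"),
    "bob"   -> Set("devs")
  )

  // Unknown users resolve to no groups, so they match no group acl.
  override def getGroups(userName: String): Set[String] =
    table.getOrElse(userName, Set.empty)
}
```

Spark would instantiate the class named by `spark.user.groups.mapping` and call `getGroups` when checking the group acls; the default `ShellBasedGroupsMappingProvider` resolves groups from the Unix shell instead of a static map.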
Diffstat (limited to 'docs')
-rw-r--r--docs/configuration.md55
-rw-r--r--docs/monitoring.md4
-rw-r--r--docs/security.md6
3 files changed, 57 insertions, 8 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 6512e16faf..9191570d07 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1231,7 +1231,7 @@ Apart from these, the following properties are also available, and may be useful
<td><code>spark.acls.enable</code></td>
<td>false</td>
<td>
- Whether Spark acls should are enabled. If enabled, this checks to see if the user has
+ Whether Spark acls should be enabled. If enabled, this checks to see if the user has
access permissions to view or modify the job. Note this requires the user to be known,
so if the user comes across as null no checks are done. Filters can be used with the UI
to authenticate and set the user.
@@ -1243,8 +1243,33 @@ Apart from these, the following properties are also available, and may be useful
<td>
Comma separated list of users/administrators that have view and modify access to all Spark jobs.
This can be used if you run on a shared cluster and have a set of administrators or devs who
- help debug when things work. Putting a "*" in the list means any user can have the privilege
- of admin.
+ help debug when things do not work. Putting a "*" in the list means any user can have the
+ privilege of admin.
+ </td>
+</tr>
+<tr>
+ <td><code>spark.admin.acls.groups</code></td>
+ <td>Empty</td>
+ <td>
+ Comma separated list of groups that have view and modify access to all Spark jobs.
+ This can be used if you have a set of administrators or developers who help maintain and debug
+ the underlying infrastructure. Putting a "*" in the list means any user in any group can have
+ the privilege of admin. The user groups are obtained from the instance of the groups mapping
+ provider specified by <code>spark.user.groups.mapping</code>. Check the entry
+ <code>spark.user.groups.mapping</code> for more details.
+ </td>
+</tr>
+<tr>
+ <td><code>spark.user.groups.mapping</code></td>
+ <td><code>org.apache.spark.security.ShellBasedGroupsMappingProvider</code></td>
+ <td>
+    The list of groups for a user is determined by a group mapping service defined by the trait
+    <code>org.apache.spark.security.GroupMappingServiceProvider</code>, which can be configured by this property.
+    A default Unix shell based implementation, <code>org.apache.spark.security.ShellBasedGroupsMappingProvider</code>,
+    is provided and can be specified to resolve a list of groups for a user.
+    <em>Note:</em> This implementation supports only a Unix/Linux based environment. A Windows environment is
+    currently <b>not</b> supported. However, a new platform/protocol can be supported by implementing
+    the trait <code>org.apache.spark.security.GroupMappingServiceProvider</code>.
</td>
</tr>
<tr>
@@ -1306,6 +1331,18 @@ Apart from these, the following properties are also available, and may be useful
</td>
</tr>
<tr>
+ <td><code>spark.modify.acls.groups</code></td>
+ <td>Empty</td>
+ <td>
+    Comma separated list of groups that have modify access to the Spark job. This can be used if you
+    have a set of administrators or developers from the same team who need access to control the job.
+    Putting a "*" in the list means any user in any group has access to modify the Spark job.
+ The user groups are obtained from the instance of the groups mapping provider specified by
+ <code>spark.user.groups.mapping</code>. Check the entry <code>spark.user.groups.mapping</code>
+ for more details.
+ </td>
+</tr>
+<tr>
<td><code>spark.ui.filters</code></td>
<td>None</td>
<td>
@@ -1328,6 +1365,18 @@ Apart from these, the following properties are also available, and may be useful
have view access to this Spark job.
</td>
</tr>
+<tr>
+ <td><code>spark.ui.view.acls.groups</code></td>
+ <td>Empty</td>
+ <td>
+    Comma separated list of groups that have view access to the Spark web UI to view the Spark job
+    details. This can be used if you have a set of administrators, developers or users who can
+    monitor the Spark jobs submitted. Putting a "*" in the list means any user in any group can view
+    the Spark job details on the Spark web UI. The user groups are obtained from the instance of the
+ groups mapping provider specified by <code>spark.user.groups.mapping</code>. Check the entry
+ <code>spark.user.groups.mapping</code> for more details.
+ </td>
+</tr>
</table>
#### Encryption
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 88002ebdc3..697962ae3a 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -162,8 +162,8 @@ The history server can be configured as follows:
If enabled, access control checks are made regardless of what the individual application had
set for <code>spark.ui.acls.enable</code> when the application was run. The application owner
will always have authorization to view their own application and any users specified via
- <code>spark.ui.view.acls</code> when the application was run will also have authorization
- to view that application.
+      <code>spark.ui.view.acls</code> and groups specified via <code>spark.ui.view.acls.groups</code>
+ when the application was run will also have authorization to view that application.
If disabled, no access control checks are made.
</td>
</tr>
diff --git a/docs/security.md b/docs/security.md
index 32c33d2857..d2708a8070 100644
--- a/docs/security.md
+++ b/docs/security.md
@@ -16,10 +16,10 @@ and by using [https/SSL](http://en.wikipedia.org/wiki/HTTPS) via the `spark.ui.h
### Authentication
-A user may want to secure the UI if it has data that other users should not be allowed to see. The javax servlet filter specified by the user can authenticate the user and then once the user is logged in, Spark can compare that user versus the view ACLs to make sure they are authorized to view the UI. The configs `spark.acls.enable` and `spark.ui.view.acls` control the behavior of the ACLs. Note that the user who started the application always has view access to the UI. On YARN, the Spark UI uses the standard YARN web application proxy mechanism and will authenticate via any installed Hadoop filters.
+A user may want to secure the UI if it has data that other users should not be allowed to see. The javax servlet filter specified by the user can authenticate the user and then once the user is logged in, Spark can compare that user versus the view ACLs to make sure they are authorized to view the UI. The configs `spark.acls.enable`, `spark.ui.view.acls` and `spark.ui.view.acls.groups` control the behavior of the ACLs. Note that the user who started the application always has view access to the UI. On YARN, the Spark UI uses the standard YARN web application proxy mechanism and will authenticate via any installed Hadoop filters.
-Spark also supports modify ACLs to control who has access to modify a running Spark application. This includes things like killing the application or a task. This is controlled by the configs `spark.acls.enable` and `spark.modify.acls`. Note that if you are authenticating the web UI, in order to use the kill button on the web UI it might be necessary to add the users in the modify acls to the view acls also. On YARN, the modify acls are passed in and control who has modify access via YARN interfaces.
-Spark allows for a set of administrators to be specified in the acls who always have view and modify permissions to all the applications. is controlled by the config `spark.admin.acls`. This is useful on a shared cluster where you might have administrators or support staff who help users debug applications.
+Spark also supports modify ACLs to control who has access to modify a running Spark application. This includes things like killing the application or a task. This is controlled by the configs `spark.acls.enable`, `spark.modify.acls` and `spark.modify.acls.groups`. Note that if you are authenticating the web UI, in order to use the kill button on the web UI it might be necessary to add the users in the modify acls to the view acls also. On YARN, the modify acls are passed in and control who has modify access via YARN interfaces.
+Spark allows for a set of administrators to be specified in the acls who always have view and modify permissions to all the applications. This is controlled by the configs `spark.admin.acls` and `spark.admin.acls.groups`. This is useful on a shared cluster where you might have administrators or support staff who help users debug applications.
## Event Logging
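Putting the new entries from this patch together, an operator might wire them up in `spark-defaults.conf` along these lines. The property names are from the commit; the group names (`hadoop-admins`, `spark-devs`, `analysts`) are illustrative only.

```
spark.acls.enable          true
spark.admin.acls.groups    hadoop-admins
spark.modify.acls.groups   spark-devs
spark.ui.view.acls.groups  spark-devs,analysts
spark.user.groups.mapping  org.apache.spark.security.ShellBasedGroupsMappingProvider
```

Per the commit message, these group-based entries were added alongside, not in place of, the existing user-based `spark.admin.acls`, `spark.modify.acls` and `spark.ui.view.acls`, so both user lists and group lists can be set at once.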