path: root/dev/create-release/generate-contributors.py
Commit message | Author | Age | Files | Lines
* [RELEASE] Add more contributors & only show names in release notes. | Reynold Xin | 2015-09-08 | 1 | -8/+12
    Author: Reynold Xin <rxin@databricks.com>
    Closes #8660 from rxin/contrib.
* [Release] Update contributors list format and sort it | Andrew Or | 2014-12-16 | 1 | -4/+4
    Additionally, we now warn the user when a duplicate author name arises, in which case he/she needs to resolve it manually.
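A minimal sketch of the sort-and-warn step described above. The function name and the rule that a "duplicate" means the same name appearing more than once when case is ignored are assumptions for illustration; the commit does not show how the script detects duplicates.

```python
from collections import Counter


def sort_and_find_duplicates(names):
    """Return (sorted_names, duplicate_keys).

    Hypothetical helper: names are sorted case-insensitively, and any name
    that occurs more than once (ignoring case) is reported by its lowercase
    key so the user can resolve it manually.
    """
    cleaned = [name.strip() for name in names if name.strip()]
    counts = Counter(name.lower() for name in cleaned)
    duplicate_keys = sorted(key for key, count in counts.items() if count > 1)
    return sorted(cleaned, key=str.lower), duplicate_keys


if __name__ == "__main__":
    ordered, dupes = sort_and_find_duplicates(["Reynold Xin", "Andrew Or", "andrew or"])
    for key in dupes:
        print("WARNING: duplicate author name: %s" % key)
    print(ordered)
```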
* [Release] Cache known author translations locally | Andrew Or | 2014-12-16 | 1 | -9/+9
    This bypasses unnecessary calls to the Github and JIRA API. Additionally, having a local cache allows us to remember names that we had to manually discover ourselves.
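The caching idea above could look roughly like the following. The `username - Full Name` line format, the function names, and the `remote_lookup` callable are all hypothetical; the commit does not describe the cache file's actual layout.

```python
def load_known_translations(lines):
    """Parse cache lines of the assumed form 'username - Full Name'."""
    known = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        username, sep, full_name = line.partition(" - ")
        if sep:
            known[username.strip()] = full_name.strip()
    return known


def translate(username, known, remote_lookup):
    """Return the cached name if present; otherwise ask remote_lookup once.

    remote_lookup stands in for a GitHub or JIRA API call and is only
    invoked on a cache miss. A successful result is remembered.
    """
    if username in known:
        return known[username]
    full_name = remote_lookup(username)
    if full_name:
        known[username] = full_name
    return full_name
```

Because the lookup is passed in as a callable, the cache can be exercised without network access, and manually discovered names can be added to the same file.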
* [Release] Major improvements to generate contributors script | Andrew Or | 2014-12-16 | 1 | -64/+92
    This commit introduces several major improvements to the script that generates the contributors list for release notes, notably:
    (1) Use release tags instead of a range of commits. Across branches, commits are not actually strictly two-dimensional, and so it is not sufficient to specify a start hash and an end hash. Otherwise, we end up counting commits that were already merged in an older branch.
    (2) Match PR numbers in addition to commit hashes. This is related to the first point in that if a PR is already merged in an older minor release tag, it should be filtered out here. This requires us to do some intelligent regex parsing on the commit description in addition to just relying on the GitHub API.
    (3) Relax author validity check. The old code fails on a name that has many middle names, for instance. The test was just too strict.
    (4) Use GitHub authentication. This allows us to make far more requests through the GitHub API than before (5000 as opposed to 60 per hour).
    (5) Translate from Github username, not commit author name. This is important because the commit author name is not always configured correctly by the user. For instance, the username "falaki" used to resolve to just "Hossein", which was treated as a github username and translated to something else that is completely arbitrary.
    (6) Add an option to use the untranslated name. If there is not a satisfactory candidate to replace the untranslated name with, at least allow the user to not translate it.
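Points (1) and (2) above can be sketched as a filter over two commit lists. The `Closes #<number>` pattern is taken from merge messages like the one in this log ("Closes #8660 from rxin/contrib."); the function names and the `(hash, message)` tuple shape are assumptions, and the real script may parse commit descriptions differently.

```python
import re

# Matches the PR reference found in merge messages such as
# "Closes #8660 from rxin/contrib."
PR_PATTERN = re.compile(r"Closes #(\d+)")


def pr_numbers(message):
    """Return the set of PR numbers referenced in a commit message."""
    return {int(number) for number in PR_PATTERN.findall(message)}


def commits_new_in_release(release_commits, previous_commits):
    """Keep commits whose hash and PR numbers are both absent from the older tag.

    Both arguments are lists of (hash, message) tuples (a hypothetical shape).
    """
    seen_hashes = set()
    seen_prs = set()
    for commit_hash, message in previous_commits:
        seen_hashes.add(commit_hash)
        seen_prs |= pr_numbers(message)
    return [
        (commit_hash, message)
        for commit_hash, message in release_commits
        if commit_hash not in seen_hashes
        and pr_numbers(message).isdisjoint(seen_prs)
    ]
```

Matching on PR numbers catches a change that was cherry-picked into an older branch under a different hash, which a hash comparison alone would count twice.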
* [Release] Correctly translate contributors name in release notes | Andrew Or | 2014-12-03 | 1 | -21/+31
    This commit involves three main changes:
    (1) It separates the translation of contributor names from the generation of the contributors list. This is largely motivated by the Github API limit; even if we exceed this limit, we should at least be able to proceed manually as before. This is why the translation logic is abstracted into its own script translate-contributors.py.
    (2) When we look for candidate replacements for invalid author names, we should look for the assignees of the associated JIRAs too. As a result, the intermediate file must keep track of these.
    (3) This provides an interactive mode with which the user can sit at the terminal and manually pick the candidate replacement that he/she thinks makes the most sense. As before, there is a non-interactive mode that picks the first candidate that the script considers "valid."
    TODO: We should have a known_contributors file that stores known mappings so we don't have to go through all of this translation every time. This is also valuable because some contributors simply cannot be automatically translated.
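The two selection modes in point (3) might be sketched as below. The validity rule (at least two name parts and no digits) and both function names are assumptions made for illustration, not the script's confirmed logic.

```python
def is_valid_author(name):
    """Assumed rule: a full name has two or more parts and contains no digits."""
    if not name:
        return False
    has_two_parts = len(name.split()) >= 2
    has_digit = any(ch.isdigit() for ch in name)
    return has_two_parts and not has_digit


def pick_candidate(candidates, chooser=None):
    """Pick a replacement name from candidates.

    Interactive mode: pass a chooser callable (for example, one that prompts
    at the terminal) and its choice is returned as-is.
    Non-interactive mode: return the first candidate considered valid,
    or None if there is none.
    """
    if chooser is not None:
        return chooser(candidates)
    for candidate in candidates:
        if is_valid_author(candidate):
            return candidate
    return None
```

Passing the chooser as a callable keeps the terminal prompt out of the selection logic, so the non-interactive path can be tested on its own.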
* [Release] Translate unknown author names automatically | Andrew Or | 2014-12-02 | 1 | -18/+18
* [Release] Automate generation of contributors list | Andrew Or | 2014-11-26 | 1 | -0/+206
    This commit provides a script that computes the contributors list by linking the github commits with JIRA issues. Automatically translating github usernames remains a TODO at this point.
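One plausible way to link a commit to JIRA issues is to pull issue keys out of the commit title. The sketch below assumes titles carry keys of the form `SPARK-<digits>`; the commit message does not say how the script performs the link, and the function name is hypothetical.

```python
import re

# Assumed key format: project prefix "SPARK-" followed by digits.
ISSUE_PATTERN = re.compile(r"SPARK-\d+")


def issue_keys(commit_title):
    """Return the JIRA issue keys mentioned in a commit title, in order, without repeats."""
    keys = []
    for key in ISSUE_PATTERN.findall(commit_title):
        if key not in keys:
            keys.append(key)
    return keys
```

Titles without a key, such as the "[Release]" commits in this log, yield an empty list and would need separate handling.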