diff options
author | Brennon York <brennon.york@capitalone.com> | 2015-06-17 12:00:34 -0700 |
---|---|---|
committer | Josh Rosen <joshrosen@databricks.com> | 2015-06-17 12:00:34 -0700 |
commit | 50a0496a43f09d70593419efc38587c8441843bf (patch) | |
tree | 5aadfbd344545107bba76f7b5cc33ce64819bec7 /dev/run-tests | |
parent | 6765ef98dff070768bbcd585d341ee7664fbe76c (diff) | |
download | spark-50a0496a43f09d70593419efc38587c8441843bf.tar.gz spark-50a0496a43f09d70593419efc38587c8441843bf.tar.bz2 spark-50a0496a43f09d70593419efc38587c8441843bf.zip |
[SPARK-7017] [BUILD] [PROJECT INFRA] Refactor dev/run-tests into Python
All, this is a first attempt at refactoring `dev/run-tests` into Python. Initially I merely converted all Bash calls over to Python, then moved to a much more modular approach (more functions, moved the calls around, etc.). What is here is the initial culmination and should provide a great base to various downstream issues (e.g. SPARK-7016, modularize / parallelize testing, etc.). Would love comments / suggestions for this initial first step!
/cc srowen pwendell nchammas
Author: Brennon York <brennon.york@capitalone.com>
Closes #5694 from brennonyork/SPARK-7017 and squashes the following commits:
154ed73 [Brennon York] updated finding java binary if JAVA_HOME not set
3922a85 [Brennon York] removed necessary passed in variable
f9fbe54 [Brennon York] reverted doc test change
8135518 [Brennon York] removed the test check for documentation changes until jenkins can get updated
05d435b [Brennon York] added check for jekyll install
22edb78 [Brennon York] add check if jekyll isn't installed on the path
2dff136 [Brennon York] fixed pep8 whitespace errors
767a668 [Brennon York] fixed path joining issues, ensured docs actually build on doc changes
c42cf9a [Brennon York] unpack set operations with splat (*)
fb85a41 [Brennon York] fixed minor set bug
0379833 [Brennon York] minor doc addition to print the changed modules
aa03d9e [Brennon York] added documentation builds as a top level test component, altered high level project changes to properly execute core tests only when necessary, changed variable names for simplicity
ec1ae78 [Brennon York] minor name changes, bug fixes
b7c72b9 [Brennon York] reverting streaming context
03fdd7b [Brennon York] fixed the tuple () wraps around example lambda
705d12e [Brennon York] changed example to comply with pep3113 supporting python3
60b3d51 [Brennon York] prepend rather than append onto PATH
7d2f5e2 [Brennon York] updated python tests to remove unused variable
2898717 [Brennon York] added a change to streaming test to check if it only runs streaming tests
eb684b6 [Brennon York] fixed sbt_test_goals reference error
db7ae6f [Brennon York] reverted SPARK_HOME from start of command
1ecca26 [Brennon York] fixed merge conflicts
2fcdfc0 [Brennon York] testing targte branch dump on jenkins
1f607b1 [Brennon York] finalizing revisions to modular tests
8afbe93 [Brennon York] made error codes a global
0629de8 [Brennon York] updated to refactor and remove various small bugs, removed pep8 complaints
d90ab2d [Brennon York] fixed merge conflicts, ensured that for regular builds both core and sql tests always run
b1248dc [Brennon York] exec python rather than running python and exiting with return code
f9deba1 [Brennon York] python to python2 and removed newline
6d0a052 [Brennon York] incorporated merge conflicts with SPARK-7249
f950010 [Brennon York] removed building hive-0.12.0 per SPARK-6908
703f095 [Brennon York] fixed merge conflicts
b1ca593 [Brennon York] reverted the sparkR test
afeb093 [Brennon York] updated to make sparkR test fail
1dada6b [Brennon York] reverted pyspark test failure
9a592ec [Brennon York] reverted mima exclude issue, added pyspark test failure
d825aa4 [Brennon York] revert build break, add mima break
f041d8a [Brennon York] added space from commented import to now test build breaking
983f2a2 [Brennon York] comment out import to fail build test
2386785 [Brennon York] Merge remote-tracking branch 'upstream/master' into SPARK-7017
76335fb [Brennon York] reverted rat license issue for sparkconf
e4a96cc [Brennon York] removed the import error and added license error, fixed the way run-tests and run-tests.py report their error codes
56d3cb9 [Brennon York] changed test back and commented out import to break compile
b37328c [Brennon York] fixed typo and added default return is no error block was found in the environment
7613558 [Brennon York] updated to return the proper env variable for return codes
a5bd445 [Brennon York] reverted license, changed test in shuffle to fail
803143a [Brennon York] removed license file for SparkContext
b0b2604 [Brennon York] comment out import to see if build fails and returns properly
83e80ef [Brennon York] attempt at better python output when called from bash
c095fa6 [Brennon York] removed another wait() call
26e18e8 [Brennon York] removed unnecessary wait()
07210a9 [Brennon York] minor doc string change for java version with namedtuple update
ec03bf3 [Brennon York] added namedtuple for java version to add readability
2cb413b [Brennon York] upcased global variables, changes various calling methods from check_output to check_call
639f1e9 [Brennon York] updated with pep8 rules, fixed minor bugs, added run-tests file in bash to call the run-tests.py script
3c53a1a [Brennon York] uncomment the scala tests :)
6126c4f [Brennon York] refactored run-tests into python
Diffstat (limited to 'dev/run-tests')
-rwxr-xr-x | dev/run-tests | 219 |
1 files changed, 1 insertions, 218 deletions
diff --git a/dev/run-tests b/dev/run-tests index d178e2a460..a00d9f0c27 100755 --- a/dev/run-tests +++ b/dev/run-tests @@ -17,224 +17,7 @@ # limitations under the License. # -# Go to the Spark project root directory FWDIR="$(cd "`dirname $0`"/..; pwd)" cd "$FWDIR" -# Clean up work directory and caches -rm -rf ./work -rm -rf ~/.ivy2/local/org.apache.spark -rm -rf ~/.ivy2/cache/org.apache.spark - -source "$FWDIR/dev/run-tests-codes.sh" - -CURRENT_BLOCK=$BLOCK_GENERAL - -function handle_error () { - echo "[error] Got a return code of $? on line $1 of the run-tests script." - exit $CURRENT_BLOCK -} - - -# Build against the right version of Hadoop. -{ - if [ -n "$AMPLAB_JENKINS_BUILD_PROFILE" ]; then - if [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop1.0" ]; then - export SBT_MAVEN_PROFILES_ARGS="-Phadoop-1 -Dhadoop.version=1.2.1" - elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.0" ]; then - export SBT_MAVEN_PROFILES_ARGS="-Phadoop-1 -Dhadoop.version=2.0.0-mr1-cdh4.1.1" - elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.2" ]; then - export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.2" - elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.3" ]; then - export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0" - fi - fi - - if [ -z "$SBT_MAVEN_PROFILES_ARGS" ]; then - export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0" - fi -} - -export SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Pkinesis-asl" - -# Determine Java path and version. -{ - if test -x "$JAVA_HOME/bin/java"; then - declare java_cmd="$JAVA_HOME/bin/java" - else - declare java_cmd=java - fi - - # We can't use sed -r -e due to OS X / BSD compatibility; hence, all the parentheses. - JAVA_VERSION=$( - $java_cmd -version 2>&1 \ - | grep -e "^java version" --max-count=1 \ - | sed "s/java version \"\(.*\)\.\(.*\)\.\(.*\)\"/\1\2/" - ) - - if [ "$JAVA_VERSION" -lt 18 ]; then - echo "[warn] Java 8 tests will not run because JDK version is < 1.8." - fi -} - -# Only run Hive tests if there are SQL changes. -# Partial solution for SPARK-1455. -if [ -n "$AMPLAB_JENKINS" ]; then - target_branch="$ghprbTargetBranch" - git fetch origin "$target_branch":"$target_branch" - - # AMP_JENKINS_PRB indicates if the current build is a pull request build. - if [ -n "$AMP_JENKINS_PRB" ]; then - # It is a pull request build. - sql_diffs=$( - git diff --name-only "$target_branch" \ - | grep -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh" - ) - - non_sql_diffs=$( - git diff --name-only "$target_branch" \ - | grep -v -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh" - ) - - if [ -n "$sql_diffs" ]; then - echo "[info] Detected changes in SQL. Will run Hive test suite." - _RUN_SQL_TESTS=true - - if [ -z "$non_sql_diffs" ]; then - echo "[info] Detected no changes except in SQL. Will only run SQL tests." - _SQL_TESTS_ONLY=true - fi - fi - else - # It is a regular build. We should run SQL tests. - _RUN_SQL_TESTS=true - fi -fi - -set -o pipefail -trap 'handle_error $LINENO' ERR - -echo "" -echo "=========================================================================" -echo "Running Apache RAT checks" -echo "=========================================================================" - -CURRENT_BLOCK=$BLOCK_RAT - -./dev/check-license - -echo "" -echo "=========================================================================" -echo "Running Scala style checks" -echo "=========================================================================" - -CURRENT_BLOCK=$BLOCK_SCALA_STYLE - -./dev/lint-scala - -echo "" -echo "=========================================================================" -echo "Running Python style checks" -echo "=========================================================================" - -CURRENT_BLOCK=$BLOCK_PYTHON_STYLE - -./dev/lint-python - -echo "" -echo "=========================================================================" -echo "Building Spark" -echo "=========================================================================" - -CURRENT_BLOCK=$BLOCK_BUILD - -{ - HIVE_BUILD_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-thriftserver" - echo "[info] Compile with Hive 0.13.1" - [ -d "lib_managed" ] && rm -rf lib_managed - echo "[info] Building Spark with these arguments: $HIVE_BUILD_ARGS" - - if [ "${AMPLAB_JENKINS_BUILD_TOOL}" == "maven" ]; then - build/mvn $HIVE_BUILD_ARGS clean package -DskipTests - else - echo -e "q\n" \ - | build/sbt $HIVE_BUILD_ARGS package assembly/assembly streaming-kafka-assembly/assembly \ - | grep -v -e "info.*Resolving" -e "warn.*Merging" -e "info.*Including" - fi -} - -echo "" -echo "=========================================================================" -echo "Detecting binary incompatibilities with MiMa" -echo "=========================================================================" - -CURRENT_BLOCK=$BLOCK_MIMA - -./dev/mima - -echo "" -echo "=========================================================================" -echo "Running Spark unit tests" -echo "=========================================================================" - -CURRENT_BLOCK=$BLOCK_SPARK_UNIT_TESTS - -{ - # If the Spark SQL tests are enabled, run the tests with the Hive profiles enabled. - # This must be a single argument, as it is. - if [ -n "$_RUN_SQL_TESTS" ]; then - SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-thriftserver" - fi - - if [ -n "$_SQL_TESTS_ONLY" ]; then - # This must be an array of individual arguments. Otherwise, having one long string - # will be interpreted as a single test, which doesn't work. - SBT_MAVEN_TEST_ARGS=("catalyst/test" "sql/test" "hive/test" "hive-thriftserver/test" "mllib/test") - else - SBT_MAVEN_TEST_ARGS=("test") - fi - - echo "[info] Running Spark tests with these arguments: $SBT_MAVEN_PROFILES_ARGS ${SBT_MAVEN_TEST_ARGS[@]}" - - if [ "${AMPLAB_JENKINS_BUILD_TOOL}" == "maven" ]; then - build/mvn test $SBT_MAVEN_PROFILES_ARGS --fail-at-end - else - # NOTE: echo "q" is needed because sbt on encountering a build file with failure - # (either resolution or compilation) prompts the user for input either q, r, etc - # to quit or retry. This echo is there to make it not block. - # NOTE: Do not quote $SBT_MAVEN_PROFILES_ARGS or else it will be interpreted as a - # single argument! - # "${SBT_MAVEN_TEST_ARGS[@]}" is cool because it's an array. - # QUESTION: Why doesn't 'yes "q"' work? - # QUESTION: Why doesn't 'grep -v -e "^\[info\] Resolving"' work? - echo -e "q\n" \ - | build/sbt $SBT_MAVEN_PROFILES_ARGS "${SBT_MAVEN_TEST_ARGS[@]}" \ - | grep -v -e "info.*Resolving" -e "warn.*Merging" -e "info.*Including" - fi -} - -echo "" -echo "=========================================================================" -echo "Running PySpark tests" -echo "=========================================================================" - -CURRENT_BLOCK=$BLOCK_PYSPARK_UNIT_TESTS - -# add path for python 3 in jenkins -export PATH="${PATH}:/home/anaconda/envs/py3k/bin" -./python/run-tests - -echo "" -echo "=========================================================================" -echo "Running SparkR tests" -echo "=========================================================================" - -CURRENT_BLOCK=$BLOCK_SPARKR_UNIT_TESTS - -if [ $(command -v R) ]; then - ./R/install-dev.sh - ./R/run-tests.sh -else - echo "Ignoring SparkR tests as R was not found in PATH" -fi - +exec python -u ./dev/run-tests.py |