Add hive test files to repository. Remove download script.

This PR removes our test dependence on files hosted at Berkeley by checking the test queries and answers into the repository. This should also fix the Maven Jenkins build.

I realize this is a *giant* commit, but size-wise it's actually pretty small: we are only looking at ~1.2 MB compressed (~30 MB uncompressed). Given that we already have a ~80 MB file permanently added to the Spark code lineage, I do not think that this will change the developer experience significantly.

Furthermore, I think it is good engineering practice to consider such test support files as "code", since changes to them would indicate a change in functionality. These files were only excluded from the initial PR as I wanted the diff to be readable.

Author: Michael Armbrust <michael@databricks.com>

Closes #199 from marmbrus/hiveTestFiles and squashes the following commits:

b9b9b17 [Michael Armbrust] Add hive test files to repository. Remove download script.
author: Michael Armbrust <michael@databricks.com> 2014-03-21 15:05:45 -0700
committer: Patrick Wendell <pwendell@gmail.com> 2014-03-21 15:05:45 -0700
commit: 7e17fe69f9c3dc4cac024ea483f5d5f34ee06203 (patch)
tree: bf6235fda03105bb981d64a25819ddb5a49bc19c /sql/hive/src/test/resources/golden/columnstats_partlvl-8-dc5682403f4154cef30860f2b4e37bce
parent: e09139d9ca529a8f983a8b3e2a8158c3f3caa523 (diff)
1 files changed, 129 insertions, 0 deletions
diff --git a/sql/hive/src/test/resources/golden/columnstats_partlvl-8-dc5682403f4154cef30860f2b4e37bce b/sql/hive/src/test/resources/golden/columnstats_partlvl-8-dc5682403f4154cef30860f2b4e37bce
new file mode 100644
index 0000000000..cd72c7efbf
--- /dev/null
+++ b/sql/hive/src/test/resources/golden/columnstats_partlvl-8-dc5682403f4154cef30860f2b4e37bce
@@ -0,0 +1,129 @@
+ABSTRACT SYNTAX TREE:
+  (TOK_ANALYZE (TOK_TAB (TOK_TABNAME Employee_Part) (TOK_PARTSPEC (TOK_PARTVAL employeeSalary 4000.0))) (TOK_TABCOLNAME employeeID))
+
+STAGE DEPENDENCIES:
+  Stage-0 is a root stage
+  Stage-1 is a root stage
+
+STAGE PLANS:
+  Stage: Stage-0
+    Map Reduce
+      Alias -> Map Operator Tree:
+        employee_part 
+          TableScan
+            alias: employee_part
+            GatherStats: false
+            Select Operator
+              expressions:
+                    expr: employeeid
+                    type: int
+              outputColumnNames: employeeid
+              Group By Operator
+                aggregations:
+                      expr: compute_stats(employeeid, 16)
+                bucketGroup: false
+                mode: hash
+                outputColumnNames: _col0
+                Reduce Output Operator
+                  sort order: 
+                  tag: -1
+                  value expressions:
+                        expr: _col0
+                        type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:string,numbitvectors:int>
+      Path -> Alias:
+        file:/private/var/folders/36/cjkbrr953xg2p_krwrmn8h_r0000gn/T/sharkWarehouse7107609744565894054/employee_part/employeesalary=4000.0 [employee_part]
+      Path -> Partition:
+        file:/private/var/folders/36/cjkbrr953xg2p_krwrmn8h_r0000gn/T/sharkWarehouse7107609744565894054/employee_part/employeesalary=4000.0 
+          Partition
+            base file name: employeesalary=4000.0
+            input format: org.apache.hadoop.mapred.TextInputFormat
+            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+            partition values:
+              employeesalary 4000.0
+            properties:
+              bucket_count -1
+              columns employeeid,employeename
+              columns.types int:string
+              field.delim |
+              file.inputformat org.apache.hadoop.mapred.TextInputFormat
+              file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+              location file:/private/var/folders/36/cjkbrr953xg2p_krwrmn8h_r0000gn/T/sharkWarehouse7107609744565894054/employee_part/employeesalary=4000.0
+              name default.employee_part
+              numFiles 1
+              numRows 0
+              partition_columns employeesalary
+              rawDataSize 0
+              serialization.ddl struct employee_part { i32 employeeid, string employeename}
+              serialization.format |
+              serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+              totalSize 105
+              transient_lastDdlTime 1389728706
+            serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+          
+              input format: org.apache.hadoop.mapred.TextInputFormat
+              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+              properties:
+                bucket_count -1
+                columns employeeid,employeename
+                columns.types int:string
+                field.delim |
+                file.inputformat org.apache.hadoop.mapred.TextInputFormat
+                file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                location file:/private/var/folders/36/cjkbrr953xg2p_krwrmn8h_r0000gn/T/sharkWarehouse7107609744565894054/employee_part
+                name default.employee_part
+                numFiles 2
+                numPartitions 2
+                numRows 0
+                partition_columns employeesalary
+                rawDataSize 0
+                serialization.ddl struct employee_part { i32 employeeid, string employeename}
+                serialization.format |
+                serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+                totalSize 210
+                transient_lastDdlTime 1389728706
+              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+              name: default.employee_part
+            name: default.employee_part
+      Truncated Path -> Alias:
+        /employee_part/employeesalary=4000.0 [employee_part]
+      Needs Tagging: false
+      Reduce Operator Tree:
+        Group By Operator
+          aggregations:
+                expr: compute_stats(VALUE._col0)
+          bucketGroup: false
+          mode: mergepartial
+          outputColumnNames: _col0
+          Select Operator
+            expressions:
+                  expr: _col0
+                  type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint>
+            outputColumnNames: _col0
+            File Output Operator
+              compressed: false
+              GlobalTableId: 0
+              directory: file:/var/folders/36/cjkbrr953xg2p_krwrmn8h_r0000gn/T/marmbrus/hive_2014-01-14_11-45-24_849_6968895828655634809-1/-ext-10001
+              NumFilesPerFileSink: 1
+              Stats Publishing Key Prefix: file:/var/folders/36/cjkbrr953xg2p_krwrmn8h_r0000gn/T/marmbrus/hive_2014-01-14_11-45-24_849_6968895828655634809-1/-ext-10001/
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  properties:
+                    columns _col0
+                    columns.types struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint>
+                    escape.delim \
+                    hive.serialization.extend.nesting.levels true
+                    serialization.format 1
+              TotalFiles: 1
+              GatherStats: false
+              MultiFileSpray: false
+
+  Stage: Stage-1
+    Column Stats Work
+      Column Stats Desc:
+          Columns: employeeID
+          Column Types: int
+          Partition: employeesalary=4000.0
+          Table: Employee_Part
+          Is Table Level Stats: false
+