author | Kan Zhang <kzhang@apache.org> | 2014-08-14 19:03:51 -0700 |
---|---|---|
committer | Matei Zaharia <matei@databricks.com> | 2014-08-14 19:03:51 -0700 |
commit | 9422a9b084e3fd5b2b9be2752013588adfb430d0 (patch) | |
tree | 72d21725ac720cb2b796a42e4803d547a6a4514b /examples/src/main/resources | |
parent | 3a8b68b7353fea50245686903b308fa9eb52cb51 (diff) | |
[SPARK-2736] PySpark converter and example script for reading Avro files
JIRA: https://issues.apache.org/jira/browse/SPARK-2736
This patch includes:
1. An Avro converter that converts Avro data types to Python. It handles all 3 Avro data mappings (Generic, Specific and Reflect).
2. An example Python script for reading Avro files using AvroKeyInputFormat and the converter.
3. A fix for a class-loading issue, so that the right class loader is used to find Avro classes.
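The read path described in items 1 and 2 can be sketched roughly as follows. This is a minimal sketch, not the script shipped in this patch. `AvroKeyInputFormat` is named in the commit message; the key class, value class, and converter class name are assumptions based on the Avro and Spark examples packages, and `read_avro_records` and `main` are hypothetical helper names.

```python
# Sketch: read Avro records into Python objects through PySpark's
# newAPIHadoopFile, using a JVM-side key converter.

# Named in the commit message.
AVRO_INPUT_FORMAT = "org.apache.avro.mapreduce.AvroKeyInputFormat"
# Assumptions: AvroKeyInputFormat yields (AvroKey, NullWritable) pairs, and the
# converter from this patch is exposed under the class name below.
AVRO_KEY_CLASS = "org.apache.avro.mapred.AvroKey"
NULL_VALUE_CLASS = "org.apache.hadoop.io.NullWritable"
AVRO_KEY_CONVERTER = (
    "org.apache.spark.examples.pythonconverters.AvroWrapperToJavaConverter"
)


def read_avro_records(sc, path, conf=None):
    """Return an RDD of converted records read from the Avro file at `path`.

    `sc` is a SparkContext; `conf` is an optional dict of Hadoop settings.
    """
    pairs = sc.newAPIHadoopFile(
        path,
        AVRO_INPUT_FORMAT,
        AVRO_KEY_CLASS,
        NULL_VALUE_CLASS,
        keyConverter=AVRO_KEY_CONVERTER,
        conf=conf,
    )
    # Each element is a (key, value) pair; the record is the key, the value is null.
    return pairs.map(lambda kv: kv[0])


def main(argv):
    """Hypothetical entry point: print every record in the file argv[1]."""
    from pyspark import SparkContext  # imported lazily; requires PySpark

    sc = SparkContext(appName="AvroInputFormatSketch")
    try:
        for record in read_avro_records(sc, argv[1]).collect():
            print(record)
    finally:
        sc.stop()
```

For the converter class to resolve, the jar that contains it (presumably the Spark examples jar) would need to be on the driver and executor classpath when the script is submitted.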
cc @MLnick @JoshRosen @mateiz
Author: Kan Zhang <kzhang@apache.org>
Closes #1916 from kanzhang/SPARK-2736 and squashes the following commits:
02443f8 [Kan Zhang] [SPARK-2736] Adding .avsc files to .rat-excludes
f74e9a9 [Kan Zhang] [SPARK-2736] nit: clazz -> className
82cc505 [Kan Zhang] [SPARK-2736] Update data sample
0be7761 [Kan Zhang] [SPARK-2736] Example pyspark script and data files
c8e5881 [Kan Zhang] [SPARK-2736] Trying to work with all 3 Avro data models
2271a5b [Kan Zhang] [SPARK-2736] Using the right class loader to find Avro classes
536876b [Kan Zhang] [SPARK-2736] Adding Avro to Java converter
Diffstat (limited to 'examples/src/main/resources')
-rw-r--r-- | examples/src/main/resources/user.avsc | 8 |
-rw-r--r-- | examples/src/main/resources/users.avro | bin | 0 -> 334 bytes |
2 files changed, 8 insertions, 0 deletions
diff --git a/examples/src/main/resources/user.avsc b/examples/src/main/resources/user.avsc
new file mode 100644
index 0000000000..4995357ab3
--- /dev/null
+++ b/examples/src/main/resources/user.avsc
@@ -0,0 +1,8 @@
+{"namespace": "example.avro",
+ "type": "record",
+ "name": "User",
+ "fields": [
+     {"name": "name", "type": "string"},
+     {"name": "favorite_color", "type": ["string", "null"]}
+ ]
+}
diff --git a/examples/src/main/resources/users.avro b/examples/src/main/resources/users.avro
new file mode 100644
index 0000000000..27c526ab11
--- /dev/null
+++ b/examples/src/main/resources/users.avro
Binary files differ
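To illustrate what the `user.avsc` schema allows, here is a small stdlib-only sketch that checks Python dicts against it. It is not part of the patch and is not an Avro implementation: it handles only the `string` and `null` types and the union used here, and the helper names (`matches_type`, `conforms`) and sample records are hypothetical.

```python
import json

# The schema text as added by this commit.
USER_SCHEMA_JSON = """
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
"""

# Checks for the only two primitive types this schema uses.
_PRIMITIVE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "null": lambda v: v is None,
}


def matches_type(value, avro_type):
    """Return True if `value` fits `avro_type` (a primitive name or a union list)."""
    if isinstance(avro_type, list):
        # A JSON array in a schema denotes a union: any one branch may match.
        return any(matches_type(value, branch) for branch in avro_type)
    if not isinstance(avro_type, str) or avro_type not in _PRIMITIVE_CHECKS:
        raise ValueError("type not handled by this sketch: %r" % (avro_type,))
    return _PRIMITIVE_CHECKS[avro_type](value)


def conforms(record, schema):
    """Return True if `record` has exactly the schema's fields with fitting values."""
    fields = schema["fields"]
    if set(record) != {f["name"] for f in fields}:
        return False
    return all(matches_type(record[f["name"]], f["type"]) for f in fields)


schema = json.loads(USER_SCHEMA_JSON)
```

Under these simplified rules, a record such as `{"name": "Ada", "favorite_color": None}` conforms because `favorite_color` is a union of `string` and `null`, while a record that omits `favorite_color` does not, since the schema declares no default for it.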