Subject: [mongodb-user] Re: Issue reading from existing document with using spark-connector - Type Conversion problem?



Hi Wan,
Yes, it has been a while, but your answer has helped me move forward with this project.
Much appreciated :)
Regards,
Luke
On Monday, 7 August 2017 03:26:30 UTC-4, Wan Bachtiar wrote:

I don’t want to filter out these rows as the rest of the metadata is useful to me, but I also don’t want to have to remap all 900 fields from the schema just to tackle 4 problematic fields - is there an easy way to tell the Type Conversion code to simply make an assumption based on a setting / something I can force?

Hi Luke,

It’s been a while since you posted this question. Have you found an answer to your issue?

I assume that the schema is inferred by Spark rather than explicitly specified, i.e. you have not mapped all 900 fields yourself.
Based on the exception message you posted, the value seems to be coming from the geo field or from sort.
Spark samples documents to infer the schema, and in this case it is likely that every geo value in the sample was null. As a result, the type inferred for the geo field is NullType. When Spark then encounters a document whose geo value is non-null and of type Document, it treats this as a conflict.

A workaround that avoids defining your own map of 900 fields is to let Spark infer the schema and then modify only the selected types.
For example:

>>> df.printSchema()
root
 |-- _id: struct (nullable = true)
 |    |-- oid: string (nullable = true)
 |-- a: null (nullable = true)
 |-- b: string (nullable = true)

# Example of changing a NullType field to StringType (the field is "a" here; in your case it would be "geo")
>>> modified_df = df.withColumn("a", df["a"].cast("string"))
>>> modified_df.printSchema()
root
 |-- _id: struct (nullable = true)
 |    |-- oid: string (nullable = true)
 |-- a: string (nullable = true)
 |-- b: string (nullable = true)

See also pyspark.sql.types

Regards,
Wan.

--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/