feat: Implement mapping from GeoArrow extension types to Polars extension types#258
feat: Implement mapping from GeoArrow extension types to Polars extension types#258kylebarron merged 3 commits intomainfrom
Conversation
| name: &str, | ||
| storage: &DataType, | ||
| metadata: Option<&str>, | ||
| ) -> Box<dyn ExtensionTypeImpl> { |
There was a problem hiding this comment.
@orlp I was surprised that this function is infallible. How should implementors behave when the storage type is incompatible with the declared extension type for the given name?
This corresponds to arrow_schema::ExtensionType::try_new, where they allow an error in case
This should return an error if the given data type is not supported by this extension type.
There was a problem hiding this comment.
For the moment there is no way to register a type-check for storage type compatibility. Like any other overridable behavior there were unresolved questions about multiple extensions registering incompatible type-checks. The rest of the Polars code base also was not yet ready for the concept of "give me the data type of this column" to be fallible (which would be the case when reading from Parquet with an extension type with a fallible constructor).
There was a problem hiding this comment.
The behavior should just be to construct some valid representation. Do your type checking for the correct storage type in your functions handling the data, at least for now.
There was a problem hiding this comment.
Ok, perhaps I'll change this to be
struct PointType(GeoArrowResult<geoarrow_schema::PointType>);but I'll leave that for a future PR
|
|
||
| /// Register all GeoArrow extension types into the Polars extension registry. | ||
| pub fn register_all_extensions() -> PolarsResult<()> { | ||
| let factory = Arc::new(GeoArrowExtensionTypeFactory); |
There was a problem hiding this comment.
We could create a separate factory function for each geometry type, but for simplicity I think it's fine to have a single factory for all types
| define_basic_type!( | ||
| /// A GeoPolars GeometryCollection extension type. | ||
| GeometryCollectionType | ||
| ); | ||
| define_basic_type!( | ||
| /// A GeoPolars Geometry extension type. | ||
| GeometryType | ||
| ); |
There was a problem hiding this comment.
These two types are implemented with Arrow unions, so they're currently incompatible with Polars. They're still defined for completeness, but I assume these code paths will never be reached, because Polars will error at some prior point when seeing union types.
| } | ||
|
|
||
| fn dyn_display(&self) -> Cow<'_, str> { | ||
| Cow::Owned(format!("{:?}", self.0)) |
There was a problem hiding this comment.
In the future we can better define a Display that is improved over Debug
Change list
geopolars-extensioncrate to handle definitions of Polars extension types and the act of registering them with Polarsgeoarrow-schemaFuture work: