feat: Implement mapping from GeoArrow extension types to Polars extension types #258

Merged
kylebarron merged 3 commits into main from kyle/geoarrow-extension-types
Dec 10, 2025
@kylebarron kylebarron commented Dec 10, 2025

Change list

  • Create the geopolars-extension crate, which defines the Polars extension types and registers them with Polars.
  • Create Polars extension types as wrappers around the upstream types in geoarrow-schema.
  • Create a factory for dynamically creating extension types.

Future work:

name: &str,
storage: &DataType,
metadata: Option<&str>,
) -> Box<dyn ExtensionTypeImpl> {
Collaborator Author

@orlp I was surprised that this function is infallible. How should implementors behave when the storage type is incompatible with the declared extension type for the given name?

This corresponds to arrow_schema::ExtensionType::try_new, whose documentation allows for an error in this case:

"This should return an error if the given data type is not supported by this extension type."

Member

For the moment there is no way to register a type-check for storage type compatibility. Like any other overridable behavior there were unresolved questions about multiple extensions registering incompatible type-checks. The rest of the Polars code base also was not yet ready for the concept of "give me the data type of this column" to be fallible (which would be the case when reading from Parquet with an extension type with a fallible constructor).

Member

The behavior should just be to construct some valid representation. Do your type checking for the correct storage type in your functions handling the data, at least for now.

Collaborator Author

Ok, perhaps I'll change this to be

struct PointType(GeoArrowResult<geoarrow_schema::PointType>);

but I'll leave that for a future PR
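For illustration, the deferred-validation idea discussed above might look roughly like the sketch below. It is self-contained and does not use the real Polars or geoarrow-schema APIs: `Storage`, `PointMeta`, `SchemaError`, and `dimension_count` are hypothetical stand-ins, and the rule that a point needs two or three coordinate fields is an assumption made only for this example.

```rust
/// Hypothetical stand-in for a storage data type (not the Polars DataType).
#[derive(Debug, Clone, PartialEq)]
enum Storage {
    /// A struct of f64 coordinate fields with the given names.
    StructF64(Vec<String>),
    /// A string column, which cannot back a point in this sketch.
    Utf8,
}

/// Hypothetical stand-in for the validated upstream type.
#[derive(Debug, Clone, PartialEq)]
struct PointMeta {
    dims: usize,
}

/// Hypothetical error type standing in for a GeoArrow error.
#[derive(Debug, Clone, PartialEq)]
struct SchemaError(String);

/// Always constructible: holds either the validated type or the error
/// that validation produced.
#[derive(Debug, Clone)]
struct PointType(Result<PointMeta, SchemaError>);

impl PointType {
    /// Infallible constructor, mirroring a factory that cannot return an error.
    fn new(storage: &Storage) -> Self {
        let parsed = match storage {
            Storage::StructF64(fields) if fields.len() == 2 || fields.len() == 3 => {
                Ok(PointMeta { dims: fields.len() })
            }
            other => Err(SchemaError(format!(
                "unsupported storage for point: {other:?}"
            ))),
        };
        PointType(parsed)
    }

    /// Surface the stored error at the point where the data is actually used.
    fn validated(&self) -> Result<&PointMeta, &SchemaError> {
        self.0.as_ref()
    }
}

/// Example of a data-handling function that performs the type check itself.
fn dimension_count(ty: &PointType) -> Result<usize, SchemaError> {
    let meta = ty.validated().map_err(|e| e.clone())?;
    Ok(meta.dims)
}
```

With this shape, construction never fails, and an incompatible storage type only becomes an error once a function such as `dimension_count` touches the data.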


/// Register all GeoArrow extension types into the Polars extension registry.
pub fn register_all_extensions() -> PolarsResult<()> {
let factory = Arc::new(GeoArrowExtensionTypeFactory);
Collaborator Author

We could create a separate factory function for each geometry type, but for simplicity I think it's fine to have a single factory for all types.
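As a rough sketch of the single-factory approach, one factory can dispatch on the extension name. The trait and types below are hypothetical stand-ins rather than the Polars or geopolars-extension API; only the `geoarrow.*` extension names come from the GeoArrow specification, and the fallback for unrecognized names is an assumption of this example.

```rust
/// Hypothetical stand-in for an extension type trait object.
trait ExtensionLike {
    fn name(&self) -> &str;
}

struct Point;
struct LineString;
struct Polygon;
/// Fallback used in this sketch for names the factory does not recognize.
struct Unrecognized(String);

impl ExtensionLike for Point {
    fn name(&self) -> &str {
        "geoarrow.point"
    }
}

impl ExtensionLike for LineString {
    fn name(&self) -> &str {
        "geoarrow.linestring"
    }
}

impl ExtensionLike for Polygon {
    fn name(&self) -> &str {
        "geoarrow.polygon"
    }
}

impl ExtensionLike for Unrecognized {
    fn name(&self) -> &str {
        &self.0
    }
}

/// One factory for every geometry type: it branches on the extension name.
struct SingleFactory;

impl SingleFactory {
    fn create(&self, name: &str) -> Box<dyn ExtensionLike> {
        match name {
            "geoarrow.point" => Box::new(Point),
            "geoarrow.linestring" => Box::new(LineString),
            "geoarrow.polygon" => Box::new(Polygon),
            other => Box::new(Unrecognized(other.to_string())),
        }
    }
}
```

The trade-off is one `match` to maintain instead of one factory per geometry type; registering the same shared factory under each name keeps the registration code short.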

Comment on lines +101 to +108
define_basic_type!(
/// A GeoPolars GeometryCollection extension type.
GeometryCollectionType
);
define_basic_type!(
/// A GeoPolars Geometry extension type.
GeometryType
);
Collaborator Author

These two types are implemented with Arrow unions, so they're currently incompatible with Polars. They're still defined for completeness, but I assume these code paths will never be reached, because Polars will error at some prior point when seeing union types.

}

fn dyn_display(&self) -> Cow<'_, str> {
Cow::Owned(format!("{:?}", self.0))
Collaborator Author

In the future we can define a proper Display implementation that improves on the Debug output.
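One possible direction, sketched with hypothetical types: implement `std::fmt::Display` on the wrapped metadata and have `dyn_display` use it instead of the `Debug` formatter. `PointMeta`, its fields, and the `point[...]` output format are assumptions for illustration, not the geoarrow-schema types.

```rust
use std::borrow::Cow;
use std::fmt;

/// Hypothetical stand-in for the wrapped upstream type.
#[derive(Debug)]
struct PointMeta {
    dimension: &'static str,
    crs: Option<String>,
}

impl fmt::Display for PointMeta {
    /// Compact, human-oriented rendering, e.g. `point[xy, EPSG:4326]`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "point[{}", self.dimension)?;
        if let Some(crs) = &self.crs {
            write!(f, ", {crs}")?;
        }
        write!(f, "]")
    }
}

struct PointType(PointMeta);

impl PointType {
    /// Same signature as the snippet under review, but backed by Display.
    fn dyn_display(&self) -> Cow<'_, str> {
        Cow::Owned(self.0.to_string())
    }
}
```

The `Debug` output stays available for diagnostics, while the user-facing string no longer exposes struct internals.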

@kylebarron kylebarron merged commit eebad68 into main Dec 10, 2025
4 checks passed
@kylebarron kylebarron deleted the kyle/geoarrow-extension-types branch December 10, 2025 21:37