Engine & Utilities

Engine is the low-level spatial index that backs every SpatialFrame. Most users interact with it only through SpatialFrame.engine for delta buffer operations or batch convex-hull computations. It can also be used directly when you need index access without the DataFrame layer.

wkb_point_distance and wkb_points_to_xy are standalone utility functions for working with WKB-encoded geometry columns.

pycanopy.Engine

Geospatial query engine with automatic index selection.

Parameters:

Name Type Description Default
geometries

Any of: GeoArrow PyArrow array, geopandas GeoSeries, numpy (N x 2) array, list of shapely Points, or list of (x, y) tuples.

None

delta_len property

Report how many points are currently in the delta buffer.

Returns:

Type Description
int

The number of points in the delta buffer.

extent property

Report the bounding extent of the dataset.

Returns:

Type Description
tuple[float, float, float, float] | None

The extent as (min_x, min_y, max_x, max_y), or None if empty.
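
To make the return shape concrete, here is a minimal pure-Python sketch of a bounding extent over point coordinates. The helper name `bounding_extent` is hypothetical and this is not the library's implementation; it only mirrors the documented contract of a 4-tuple or None for an empty dataset.

```python
from typing import Optional, Sequence, Tuple

Extent = Tuple[float, float, float, float]


def bounding_extent(xs: Sequence[float], ys: Sequence[float]) -> Optional[Extent]:
    """Return (min_x, min_y, max_x, max_y), or None when there are no points."""
    if len(xs) == 0:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


print(bounding_extent([2.0, -1.0, 4.0], [0.5, 3.0, -2.0]))  # (-1.0, -2.0, 4.0, 3.0)
print(bounding_extent([], []))                              # None
```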

index_bytes property

Report the heap bytes of all built indexes, excluding the xs/ys arrays.

Returns:

Type Description
int

Heap bytes used by the built indexes; zero before any index has been built.

n property

Report the number of geometries in the dataset.

Returns:

Type Description
int

The number of geometries in the dataset.

append_delta(xs, ys)

Append new points to the delta buffer (point datasets only).

Parameters:

Name Type Description Default
xs

x coordinates as a float64 array-like.

required
ys

y coordinates as a float64 array-like.

required

build_index()

Build the spatial index ahead of any query (idempotent).

contains(x, y)

Return indices of polygons that contain the point (x, y).

Parameters:

Name Type Description Default
x float

X coordinate of the query point.

required
y float

Y coordinate of the query point.

required

Returns:

Type Description
list[int]

Indices of matching polygons in no guaranteed order.
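
As a rough illustration of what a point-in-polygon query answers, the sketch below uses even-odd ray casting and treats a point inside a hole as not contained. The function names and the `(exterior, holes)` tuple layout are assumptions made for this example; the engine's actual algorithm and its handling of points exactly on a boundary are not described here.

```python
from typing import List, Sequence, Tuple

Ring = Sequence[Tuple[float, float]]


def point_in_ring(x: float, y: float, ring: Ring) -> bool:
    """Even-odd ray casting: count edges crossed by a ray going in +x."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygons_containing(
    x: float, y: float, polygons: Sequence[Tuple[Ring, Sequence[Ring]]]
) -> List[int]:
    """Indices of polygons whose exterior holds (x, y) and whose holes do not."""
    hits = []
    for idx, (exterior, holes) in enumerate(polygons):
        if point_in_ring(x, y, exterior) and not any(
            point_in_ring(x, y, hole) for hole in holes
        ):
            hits.append(idx)
    return hits


# Polygon 0: a 4x4 square with a 2x2 hole. Polygon 1: a square without holes.
polygons = [
    ([(0, 0), (4, 0), (4, 4), (0, 4)], [[(1, 1), (3, 1), (3, 3), (1, 3)]]),
    ([(2, 2), (6, 2), (6, 6), (2, 6)], []),
]
print(polygons_containing(2.5, 2.5, polygons))  # [1]  (falls in polygon 0's hole)
print(polygons_containing(3.5, 3.5, polygons))  # [0, 1]
```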

convex_hull_area(xs, ys) staticmethod

Return the area of the convex hull of a standalone point set.

Parameters:

Name Type Description Default
xs

x coordinates as a float64 array-like.

required
ys

y coordinates as a float64 array-like.

required

Returns:

Type Description
float

Area of the convex hull. Zero for fewer than three points.
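
For reference, the quantity being returned can be reproduced in pure Python with Andrew's monotone chain hull followed by the shoelace formula. This is a sketch under the assumption that the area is the ordinary planar area of the hull; the name `hull_area` is hypothetical and the library's own algorithm may differ.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def hull_area(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Area of the convex hull of the points (xs[i], ys[i])."""
    pts = sorted(set(zip(xs, ys)))
    if len(pts) < 3:
        return 0.0

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(points) -> List[Point]:
        hull: List[Point] = []
        for p in points:
            # Drop the last vertex while it makes a clockwise or straight turn.
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    hull = lower[:-1] + upper[:-1]  # endpoints are shared, so drop one copy each

    # Shoelace formula over the hull vertices.
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# A 2x2 square plus one interior point: the interior point does not change the hull.
print(hull_area([0, 2, 2, 0, 1], [0, 0, 2, 2, 1]))  # 4.0
print(hull_area([0, 1], [0, 1]))                    # 0.0
```

Collinear inputs also give 0.0 in this sketch, because the hull degenerates to a segment.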

flush()

Force the delta buffer to be merged into the main index immediately.

from_coords(xs, ys) classmethod

Construct directly from x and y coordinate sequences.

Parameters:

Name Type Description Default
xs Sequence[float]

Sequence of x coordinates.

required
ys Sequence[float]

Sequence of y coordinates.

required

Returns:

Type Description
Engine

Engine over the point coordinate data.

from_polygons(geometries) classmethod

Construct from a collection of polygon geometries. Interior holes are supported.

Parameters:

Name Type Description Default
geometries

A geopandas GeoSeries or list of shapely Polygon / MultiPolygon objects. Polygons with holes are accepted, and a MultiPolygon is treated as one logical polygon spanning all of its parts.

required

Returns:

Type Description
Engine

Engine object over polygon data.

from_wkb_points(points) classmethod

Construct from a column of WKB point geometries.

Standard 2D little-endian WKB is decoded with a vectorised buffer read; any other variant falls back to shapely.

Parameters:

Name Type Description Default
points

A polars Binary Series, a pyarrow Binary/LargeBinary array, or a numpy object array of WKB byte strings.

required

Returns:

Type Description
Engine

Engine over the decoded point coordinate data.

from_wkb_polygons(column) classmethod

Construct from a WKB Polygon/MultiPolygon column, decoding the bytes in Rust.

Parameters:

Name Type Description Default
column

A polars Binary Series or pyarrow Binary/LargeBinary array of WKB Polygon / MultiPolygon geometries.

required

Returns:

Type Description
Engine

Engine object over polygon data.

group_convex_hull_areas(xs_series, ys_series) staticmethod

Compute the convex hull area for each group in a pair of Polars list Series.

Parameters:

Name Type Description Default
xs_series

A Polars List(Float64) Series of x coordinates, one list per group.

required
ys_series

A Polars List(Float64) Series of y coordinates, one list per group.

required

Returns:

Type Description
ndarray

Float64 numpy array of convex hull areas, one per group.

knn(x, y, k)

Return indices of the k nearest points to (x, y).

Parameters:

Name Type Description Default
x float

X coordinate of the query point.

required
y float

Y coordinate of the query point.

required
k int

Number of neighbours to return.

required

Returns:

Type Description
list[int]

Indices into the original dataset, sorted nearest-first.

knn_from_candidates(x, y, k, survivor_indices)

Return the k nearest indices from a candidate subset of the dataset.

Computes the squared distance from (x, y) to each survivor, then uses a partial sort to select the k nearest. Use this when the M survivors are already known (e.g. after scalar pre-filtering).

Parameters:

Name Type Description Default
x float

X coordinate of the query point.

required
y float

Y coordinate of the query point.

required
k int

Number of neighbours to return.

required
survivor_indices ndarray

Contiguous uint32 array of M row positions in the full dataset.

required

Returns:

Type Description
list[int]

Up to k indices into the original dataset, sorted nearest-first.
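
The approach described for this method (squared distances over the candidates, then a partial selection of the k smallest) can be sketched with the standard library. The function name `knn_among` and the plain-list inputs are assumptions for illustration; the real method works on the engine's own coordinate arrays and a uint32 numpy array.

```python
import heapq
from typing import List, Sequence


def knn_among(
    xs: Sequence[float],
    ys: Sequence[float],
    x: float,
    y: float,
    k: int,
    survivor_indices: Sequence[int],
) -> List[int]:
    """Return up to k survivor indices, nearest to (x, y) first."""

    def sq_dist(i: int) -> float:
        dx = xs[i] - x
        dy = ys[i] - y
        return dx * dx + dy * dy  # squared distance preserves the ordering

    # nsmallest keeps only k items at a time instead of sorting all M survivors.
    return heapq.nsmallest(k, survivor_indices, key=sq_dist)


xs = [0.0, 1.0, 2.0, 3.0, 10.0]
ys = [0.0, 0.0, 0.0, 0.0, 0.0]
print(knn_among(xs, ys, 0.0, 0.0, 2, [4, 2, 3]))  # [2, 3]
print(knn_among(xs, ys, 0.0, 0.0, 5, [4, 2, 3]))  # [2, 3, 4]
```

Rows 0 and 1 are closer to the query point but are never returned, because they are not in the candidate set. When k exceeds the number of survivors, all survivors come back, which matches the documented "up to k".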

points_within_distance_of_polygon(polygon, distance)

Return indices of engine points within distance of a query polygon.

Parameters:

Name Type Description Default
polygon

A shapely Polygon or MultiPolygon (interior holes supported). A point matches when within distance of any part.

required
distance float

Maximum Euclidean point-to-polygon distance for a match.

required

Returns:

Type Description
ndarray

uint64 array of matching point indices.

polygon_areas()

Return the unsigned area of every polygon in dataset order (polygon datasets only).

Returns:

Type Description
ndarray

float64 array of unsigned polygon areas in dataset order.

polygon_intersects_self_join()

Return all intersecting polygon pairs (i, j) with i < j. Polygon datasets only.

Returns:

Type Description
ndarray

Flat uint64 array of shape (2 * M,) for M intersecting pairs, interleaved as [i0, j0, i1, j1, ...].
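
Because the result is a flat interleaved array, callers usually want to regroup it into (i, j) pairs. The sketch below does this on a plain list with made-up values; the helper name `to_pairs` is hypothetical. With a numpy array, `result.reshape(-1, 2)` gives the same grouping as an (M, 2) array.

```python
from typing import List, Sequence, Tuple


def to_pairs(flat: Sequence[int]) -> List[Tuple[int, int]]:
    """Regroup [i0, j0, i1, j1, ...] into [(i0, j0), (i1, j1), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("interleaved pair array must have even length")
    return list(zip(flat[0::2], flat[1::2]))


flat = [0, 3, 0, 7, 2, 5]  # illustrative values, not real engine output
print(to_pairs(flat))      # [(0, 3), (0, 7), (2, 5)]
```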

range_query(min_x, min_y, max_x, max_y)

Return indices of all points inside the bounding box.

Parameters:

Name Type Description Default
min_x float

Minimum x coordinate of the bounding box.

required
min_y float

Minimum y coordinate of the bounding box.

required
max_x float

Maximum x coordinate of the bounding box.

required
max_y float

Maximum y coordinate of the bounding box.

required

Returns:

Type Description
list[int]

Indices of matching geometries in no guaranteed order.
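
For intuition, a bounding-box query is equivalent to the brute-force scan below, which is also a handy reference when checking results. The name `range_scan` is hypothetical, and treating the box edges as inclusive is an assumption of this sketch; the documentation above does not say how boundary points are handled.

```python
from typing import List, Sequence


def range_scan(
    xs: Sequence[float],
    ys: Sequence[float],
    min_x: float,
    min_y: float,
    max_x: float,
    max_y: float,
) -> List[int]:
    """Indices of points inside the box (edges inclusive in this sketch)."""
    return [
        i
        for i, (px, py) in enumerate(zip(xs, ys))
        if min_x <= px <= max_x and min_y <= py <= max_y
    ]


xs = [0.5, 2.0, 3.5, -1.0]
ys = [0.5, 2.0, 0.5, 0.0]
# Sort before comparing, since the real query makes no ordering guarantee.
print(sorted(range_scan(xs, ys, 0.0, 0.0, 3.0, 3.0)))  # [0, 1]
```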

set_index_mode(mode)

Set the index build policy, returning the previous mode.

Modes: "eager" (build an index as soon as an index kind is selected), "none" (always use a brute-force scan), and "auto" (build an index only when the cost model predicts it will beat a scan, and retain indexes that are already built).

Parameters:

Name Type Description Default
mode str

"auto", "eager", or "none".

required

Returns:

Type Description
str

The previous index mode.

stats()

Return a human-readable summary of dataset statistics.

Returns:

Type Description
str

A human-readable summary of dataset statistics.

pycanopy.wkb_point_distance(series_a, series_b)

Compute the Euclidean distance between two WKB point columns in one parallel pass.

Parameters:

Name Type Description Default
series_a

A column of WKB point geometries (first point set).

required
series_b

A column of WKB point geometries (second point set).

required

Returns:

Type Description
ndarray

Float64 numpy array of per-row distances.
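
A serial, pure-Python sketch of the per-row computation follows, assuming every value is a standard 21-byte little-endian 2D WKB point (1 byte-order byte, a 4-byte geometry type, then x and y as float64). The helpers `wkb_point` and `row_distances` are hypothetical; the library function itself runs in one parallel pass and returns a numpy array.

```python
import math
import struct
from typing import List, Sequence


def wkb_point(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (byte order 1, geometry type 1)."""
    return struct.pack("<BIdd", 1, 1, x, y)


def row_distances(col_a: Sequence[bytes], col_b: Sequence[bytes]) -> List[float]:
    """Euclidean distance between the points in matching rows."""
    out = []
    for a, b in zip(col_a, col_b):
        # Coordinates start after the 5-byte header.
        ax, ay = struct.unpack_from("<dd", a, 5)
        bx, by = struct.unpack_from("<dd", b, 5)
        out.append(math.hypot(bx - ax, by - ay))
    return out


col_a = [wkb_point(0.0, 0.0), wkb_point(1.0, 1.0)]
col_b = [wkb_point(3.0, 4.0), wkb_point(1.0, 1.0)]
print(row_distances(col_a, col_b))  # [5.0, 0.0]
```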

pycanopy.wkb_points_to_xy(points)

Decode a column of WKB point geometries to contiguous float64 x and y arrays.

Standard 2D little-endian points use a vectorised buffer read. Other variants (big-endian, Z/M, nulls) fall back to shapely.

Parameters:

Name Type Description Default
points

A column of WKB point geometries in one of the accepted forms.

required

Returns:

Type Description
tuple[ndarray, ndarray]

Pair (xs, ys) of contiguous float64 numpy arrays.
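
The fast path and its fallback condition can be sketched as follows. A value qualifies for the fast path only if it is a 21-byte record starting with the little-endian Point header; anything else (big-endian, Z/M, null) makes this sketch return None to signal that a general decoder such as shapely would be needed. The names `fast_decode`, `LE_POINT_HEADER`, and `RECORD` are hypothetical, and the library's real fast path reads into numpy arrays instead of Python lists.

```python
import struct
from typing import List, Optional, Sequence, Tuple

RECORD = struct.Struct("<BIdd")              # byte order, type, x, y (21 bytes)
LE_POINT_HEADER = struct.pack("<BI", 1, 1)   # 1 = little-endian, 1 = Point


def fast_decode(
    column: Sequence[Optional[bytes]],
) -> Optional[Tuple[List[float], List[float]]]:
    """Decode standard 2D LE WKB points, or return None if any row is non-standard."""
    for value in column:
        if (
            value is None
            or len(value) != RECORD.size
            or value[:5] != LE_POINT_HEADER
        ):
            return None  # caller should fall back to a general WKB decoder

    # All rows share one fixed layout, so one pass over a joined buffer suffices.
    xs: List[float] = []
    ys: List[float] = []
    for _order, _gtype, x, y in RECORD.iter_unpack(b"".join(column)):
        xs.append(x)
        ys.append(y)
    return xs, ys


le_points = [RECORD.pack(1, 1, 1.5, 2.5), RECORD.pack(1, 1, -3.0, 4.0)]
print(fast_decode(le_points))  # ([1.5, -3.0], [2.5, 4.0])

big_endian = struct.pack(">BIdd", 0, 1, 1.5, 2.5)
print(fast_decode(le_points + [big_endian]))  # None
```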