Aggregations
Aggregation specs for use with `.group_by(...).agg(...)`. Each spec reduces over a streamed spatial join, so the full pair frame is never materialised.
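
    import pycanopy as pc

    # `zones` and `trips` are assumed to be frames loaded elsewhere;
    # "lon" and "lat" name the point coordinate columns used by the join.
    result = (
        zones.lazy()
        .within_join(trips, x_col="lon", y_col="lat")
        .group_by(["zone_id"])
        .agg(
            n=pc.agg.count(),
            total_fare=pc.agg.sum("fare"),
            avg_fare=pc.agg.mean("fare"),
            min_fare=pc.agg.min("fare"),
            max_fare=pc.agg.max("fare"),
        )
    )
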
pycanopy.agg
Aggregation specs for the fused aggregate-join (`SpatialGroupBy.agg`). Every spec is associative, so per-morsel partials can be folded over the streamed join without materialising the full pair frame.
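To see why associativity matters here, the following is a minimal pure-Python sketch that does not use pycanopy. It reduces each chunk ("morsel") on its own, merges the partial results, and checks that the outcome matches a single pass over all rows. The data, the chunking, and the `fold_partials` helper are illustrative assumptions, not the library's internals.

```python
from functools import reduce

# Hypothetical fare values, split into morsels as a streamed join might deliver them.
# The values are exactly representable as floats, so the equality checks below are exact.
morsels = [[4.0, 7.5], [12.0], [3.0, 9.0, 1.5]]
all_rows = [v for m in morsels for v in m]


def fold_partials(per_morsel, merge):
    """Reduce each morsel on its own, then merge the partial results pairwise."""
    partials = [per_morsel(m) for m in morsels]
    return reduce(merge, partials)


total = fold_partials(sum, lambda a, b: a + b)
lowest = fold_partials(min, min)
highest = fold_partials(max, max)
n = fold_partials(len, lambda a, b: a + b)

# Each folded result equals the one-pass result over the full data.
assert total == sum(all_rows)
assert lowest == min(all_rows)
assert highest == max(all_rows)
assert n == len(all_rows)
```

Because only one partial per group has to be held between morsels, memory use in a scheme like this scales with the number of groups rather than the number of joined pairs.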
AggSpec (dataclass)
One associative aggregation: a kind and the column it reads (None for count).
inputs (property)
Source columns this spec reads, for the join keep-set.
Returns:
| Type | Description |
|---|---|
| `set[str]` | The set of source column names, empty for count. |
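As a rough illustration of how a keep-set could be assembled from `inputs`, the sketch below uses a hypothetical stand-in class, `SketchSpec`, rather than the real `AggSpec`. The `tip` column and the choice to include the group key in the keep-set are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchSpec:
    """Hypothetical stand-in for AggSpec: a kind and the column it reads."""

    kind: str
    column: Optional[str] = None  # None for count

    @property
    def inputs(self) -> set[str]:
        # count reads no column, so it contributes nothing to the keep-set.
        return set() if self.column is None else {self.column}


specs = {
    "n": SketchSpec("count"),
    "total_fare": SketchSpec("sum", "fare"),
    "avg_tip": SketchSpec("mean", "tip"),
}

# Columns the join would need to carry: the group key (assumed) plus every spec's inputs.
keep = {"zone_id"}.union(*(s.inputs for s in specs.values()))
```

Any column outside `keep` could then be dropped before the join, which is the point of exposing the inputs as a set.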
combine(name)
Build the cross-morsel exprs that re-aggregate this spec's partials.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Output column name this aggregation produces. | required |
Returns:
| Type | Description |
|---|---|
| `list[Expr]` | Exprs re-aggregating this spec's prefixed intermediate columns. |
finalize(name)
Build the expr producing the named output from the combined partials.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Output column name this aggregation produces. | required |
Returns:
| Type | Description |
|---|---|
| `Expr` | Expr yielding the named output column. |
partial(name)
Build the per-morsel aggregation exprs for this spec.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Output column name this aggregation produces. | required |
Returns:
| Type | Description |
|---|---|
| `list[Expr]` | Exprs producing this spec's prefixed intermediate columns. |
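The division of labour between `partial`, `combine` and `finalize` can be sketched for a mean in plain Python. This is an illustration under stated assumptions, not pycanopy's implementation: the real methods return expressions, whereas the hypothetical functions below work on dictionaries, and the (sum, count) intermediate is an assumed representation.

```python
from collections import defaultdict

# Hypothetical (zone_id, fare) pairs, delivered in two morsels.
morsels = [
    [("a", 10.0), ("b", 4.0), ("a", 20.0)],
    [("a", 30.0), ("b", 6.0)],
]


def partial(rows):
    """Per-morsel step: one [sum, count] intermediate per group."""
    acc = defaultdict(lambda: [0.0, 0])
    for key, fare in rows:
        acc[key][0] += fare
        acc[key][1] += 1
    return acc


def combine(partials):
    """Cross-morsel step: re-aggregate the intermediates by adding them up."""
    merged = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for key, (s, c) in part.items():
            merged[key][0] += s
            merged[key][1] += c
    return merged


def finalize(merged):
    """Final step: derive the output value from the combined intermediates."""
    return {key: s / c for key, (s, c) in merged.items()}


avg_fare = finalize(combine(partial(m) for m in morsels))
```

Here `avg_fare` is `{"a": 20.0, "b": 5.0}`. Averaging the per-morsel means for zone `a` (15.0 and 30.0) would give 22.5 instead, which is why a mean has to carry a sum and a count through the combine step and divide only at the end.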
count()
Count rows (pairs) per group, like Polars pl.len().
Returns:
| Type | Description |
|---|---|
| `AggSpec` | An AggSpec for the count aggregation. |
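The "rows (pairs)" wording can be made concrete with a small pure-Python sketch. The pair list below is hypothetical data standing in for the join output; it is not produced by pycanopy.

```python
from collections import Counter

# Hypothetical join output: one (zone_id, trip_id) pair per match.
pairs = [("z1", 101), ("z1", 102), ("z2", 103), ("z1", 104)]

# Every matched pair contributes 1 to its group, whatever its column values are.
n = Counter(zone_id for zone_id, _ in pairs)
```

In this sketch `n["z1"]` is 3 and `n["z2"]` is 1: the count reflects how many pairs landed in each group, not how many distinct zones or trips exist.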
max(column)
Maximum of a column per group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column` | `str` | Name of the column to reduce. | required |
Returns:
| Type | Description |
|---|---|
| `AggSpec` | An AggSpec for the max aggregation. |
mean(column)
Mean of a column per group, ignoring nulls.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column` | `str` | Name of the column to average. | required |
Returns:
| Type | Description |
|---|---|
| `AggSpec` | An AggSpec for the mean aggregation. |
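What "ignoring nulls" means for a mean is easiest to see on a tiny example. The helper below is a hypothetical pure-Python sketch, not the library's code, and returning `None` for an all-null group is an assumption made here rather than documented behaviour.

```python
from typing import Optional


def mean_ignoring_nulls(values: list[Optional[float]]) -> Optional[float]:
    """Average the non-null entries only."""
    present = [v for v in values if v is not None]
    if not present:
        return None  # assumed convention for a group with no non-null values
    return sum(present) / len(present)


fares = [10.0, None, 20.0]
avg = mean_ignoring_nulls(fares)
```

Here `avg` is 15.0: the null is excluded from both the sum and the divisor. Dividing by the full row count of 3 would give 10.0, which is the result this definition avoids.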
min(column)
Minimum of a column per group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column` | `str` | Name of the column to reduce. | required |
Returns:
| Type | Description |
|---|---|
| `AggSpec` | An AggSpec for the min aggregation. |
sum(column)
Sum a column per group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column` | `str` | Name of the column to sum. | required |
Returns:
| Type | Description |
|---|---|
| `AggSpec` | An AggSpec for the sum aggregation. |