Curator doesn’t write every analysis from scratch. It builds on a shared library of vetted building blocks, called primitives, and tells you exactly how much each one has been verified. The library holds 7 primitives so far and is growing.
column_skewness (Reviewed): Sample skewness of a numeric column.
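As a rough illustration of what a sample-skewness primitive might compute, here is a minimal pandas sketch. The function signature and the coercion of non-numeric entries are assumptions, not Curator's actual code. `Series.skew()` returns the bias-adjusted sample skewness, which may or may not be the estimator Curator uses.

```python
import pandas as pd


def column_skewness(df: pd.DataFrame, col: str) -> float:
    """Sample skewness of one column (hypothetical signature)."""
    # Assumption: non-numeric entries are coerced to NaN and then ignored.
    values = pd.to_numeric(df[col], errors="coerce").dropna()
    # pandas returns NaN when fewer than three values remain.
    return float(values.skew())
```

A positive result indicates a longer right tail, a negative result a longer left tail, and a value near zero a roughly symmetric column.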
correlation_pair (Reviewed): Pearson correlation between two numeric columns, with the number of complete (non-null) pairs used.
Drops rows missing either column, then computes the Pearson correlation coefficient on the remaining complete pairs.
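The drop-then-correlate behavior described here can be sketched in a few lines of pandas. This is an illustrative approximation, not the primitive's real source. The function signature and the returned keys `r` and `n_pairs` are assumptions.

```python
import pandas as pd


def correlation_pair(df: pd.DataFrame, col_a: str, col_b: str) -> dict:
    """Pearson r on complete pairs, plus how many pairs were used."""
    # Keep only rows where both columns are present.
    pairs = df[[col_a, col_b]].dropna()
    n_pairs = len(pairs)
    # Assumption: with fewer than two pairs the correlation is reported as NaN.
    if n_pairs < 2:
        return {"r": float("nan"), "n_pairs": n_pairs}
    r = pairs[col_a].corr(pairs[col_b])  # Series.corr defaults to Pearson
    return {"r": float(r), "n_pairs": n_pairs}
```

Reporting the pair count alongside the coefficient shows when a strong-looking correlation rests on only a handful of rows.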
filtered_count (Reviewed): Count rows where a column satisfies a comparison (==, !=, >, >=, <, <=), with the matched percentage.
Builds a boolean mask comparing col against value with the chosen operator (coercing value to float for numeric columns) and counts the True rows.
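One way to picture the mask-and-count step is the sketch below. It maps each operator string to a function from the standard `operator` module. The parameter names `col`, `op` and `value` follow the description. The returned keys and the handling of an empty frame are assumptions.

```python
import operator

import pandas as pd

# Supported comparison operators, keyed by their string form.
OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def filtered_count(df: pd.DataFrame, col: str, op: str, value) -> dict:
    """Count rows where df[col] <op> value, with the matched percentage."""
    series = df[col]
    # Coerce the comparison value to float when the column is numeric.
    if pd.api.types.is_numeric_dtype(series):
        value = float(value)
    mask = OPS[op](series, value)  # boolean Series, one entry per row
    matched = int(mask.sum())
    total = len(df)
    # Assumption: an empty frame reports 0 percent rather than failing.
    percent = 100.0 * matched / total if total else 0.0
    return {"matched": matched, "total": total, "percent": percent}
```

Under pandas comparison rules, missing values never match `==`, `>`, `>=`, `<` or `<=`, but they do match `!=`.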
group_aggregate (Reviewed): Aggregate a numeric column by a grouping column (mean/sum/min/max/median), sorted by the result. With no value column it counts rows per group.
Groups rows by group_col (NaNs kept as their own group) and applies the requested aggregation to value_col, then sorts descending by the aggregate.
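The grouping behavior, including keeping NaN as its own group, might look roughly like this in pandas. `group_col` and `value_col` come from the description. The `agg` parameter name, the output column names, and applying the descending sort to the count case as well are assumptions.

```python
import pandas as pd

ALLOWED_AGGS = {"mean", "sum", "min", "max", "median"}


def group_aggregate(df, group_col, value_col=None, agg="mean"):
    """Aggregate value_col per group, or count rows when value_col is None."""
    # dropna=False keeps rows with a missing group key as their own group.
    grouped = df.groupby(group_col, dropna=False)
    if value_col is None:
        result = grouped.size().rename("count")
    else:
        if agg not in ALLOWED_AGGS:
            raise ValueError(f"unsupported aggregation: {agg}")
        result = grouped[value_col].agg(agg).rename(f"{agg}_{value_col}")
    # Sort descending by the aggregate and return a flat two-column frame.
    return result.sort_values(ascending=False).reset_index()
```

For example, `group_aggregate(df, "region", "sales", "median")` would return one row per region, with the largest median first.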
grouped_linear_regression (Reviewed): Computes linear regression coefficients (slope and intercept) for each group in a grouped DataFrame.
For each group, taken in sorted order of the grouping column, drops rows with missing x or y values, fits a linear regression with scipy.stats.linregress, and collects the slope and intercept.
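A per-group fit along these lines can be sketched with `scipy.stats.linregress`, which the description names. The function signature, the output column names, and the decision to skip groups that cannot support a fit are assumptions rather than documented behavior.

```python
import pandas as pd
from scipy.stats import linregress


def grouped_linear_regression(df, group_col, x_col, y_col):
    """Slope and intercept of y on x for each group (hypothetical signature)."""
    rows = []
    # sort=True iterates over groups in sorted order of the group key.
    for key, group in df.groupby(group_col, sort=True):
        clean = group[[x_col, y_col]].dropna()
        # Assumption: skip groups with fewer than two distinct x values,
        # since no line can be fitted to them.
        if clean[x_col].nunique() < 2:
            continue
        fit = linregress(clean[x_col], clean[y_col])
        rows.append({group_col: key,
                     "slope": float(fit.slope),
                     "intercept": float(fit.intercept)})
    return pd.DataFrame(rows, columns=[group_col, "slope", "intercept"])
```

Dropping missing rows inside each group, rather than once up front, gives the same result here. It also keeps the per-group row counts available if a count column is added later.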
top_n_by (Reviewed): Return the top N rows ranked by a numeric column (descending by default).
Sorts the full frame by sort_col and returns the first N rows.
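The sort-then-slice approach is short enough to show in full. In this sketch `sort_col` comes from the description, while the `n` and `ascending` parameter names and the default of 10 are assumptions.

```python
import pandas as pd


def top_n_by(df: pd.DataFrame, sort_col: str, n: int = 10,
             ascending: bool = False) -> pd.DataFrame:
    """Return the first n rows after sorting the whole frame by sort_col."""
    # sort_values places missing values last by default, so rows with a
    # missing sort_col only appear when n exceeds the non-missing rows.
    return df.sort_values(sort_col, ascending=ascending).head(n)
```

Passing `ascending=True` turns the same call into a "bottom N" query.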
value_distribution (Reviewed): Frequency distribution of a column: each distinct value with its count and percentage of rows.
Counts occurrences of each distinct value (NaNs included), keeps the top values, and adds each value's share of total rows.
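A frequency table of this kind can be approximated with `Series.value_counts`. The `top` parameter name and its default of 20 are hypothetical, since the actual cutoff is not stated, and the output column names are assumptions as well.

```python
import pandas as pd


def value_distribution(df: pd.DataFrame, col: str, top: int = 20) -> pd.DataFrame:
    """Distinct values of col with their counts and share of all rows."""
    # dropna=False counts missing values as their own entry;
    # value_counts already sorts from most to least frequent.
    counts = df[col].value_counts(dropna=False).head(top)
    out = counts.rename("count").rename_axis("value").reset_index()
    # Percentages are relative to every row in the frame, not only the
    # rows that survive the top-values cutoff.
    out["percent"] = 100.0 * out["count"] / len(df)
    return out
```

Because the share is taken over all rows, the percentages add up to less than 100 whenever the cutoff drops some values.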
Every primitive runs in your browser, and its exact code is one click away on any answer that uses it. Try it on your data — or suggest a primitive.