FivetranAccessor.process_ad_reporting

FivetranAccessor.process_ad_reporting(value_columns='impressions', *, date_col='date_day', platform_col='platform', agg='sum', fill_value=0.0, include_missing_dates=False, freq='D', rename_date_to='date')

Process Fivetran Ad Reporting tables into wide, model-ready features.

Compatible with Fivetran’s Ad Reporting schema tables:

ad_reporting__account_report: daily metrics by account
ad_reporting__campaign_report: daily metrics by campaign and account
ad_reporting__ad_group_report: daily metrics by ad group, campaign and account
ad_reporting__ad_report: daily metrics by ad, ad group, campaign and account

The input data must include a date column, a platform column (e.g., vendor name), and one or more metric columns such as spend or impressions. The output is a wide dataframe with one row per date and columns named {platform}_{metric}.
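
The long-to-wide reshaping described above can be sketched in plain pandas. This is only an illustration of the documented behaviour, not the library's implementation: the helper name `to_wide`, the sample data, and the column order of the result are assumptions made for the example.

```python
import pandas as pd

# Hypothetical long-format data shaped like an Ad Reporting table.
long_df = pd.DataFrame(
    {
        "date_day": pd.to_datetime(
            ["2024-01-01", "2024-01-01", "2024-01-01", "2024-01-02"]
        ),
        "platform": ["facebook_ads", "facebook_ads", "google_ads", "google_ads"],
        "spend": [10.0, 5.0, 20.0, 30.0],
        "impressions": [100, 50, 200, 300],
    }
)


def to_wide(
    df,
    value_columns,
    date_col="date_day",
    platform_col="platform",
    agg="sum",
    fill_value=0.0,
    rename_date_to="date",
):
    """Illustrative helper (not part of the library): long -> wide."""
    # Accept a single column name or a sequence of names.
    if isinstance(value_columns, str):
        values = [value_columns]
    else:
        values = list(value_columns)

    # Aggregate each metric per (date, platform) pair.
    grouped = df.groupby([date_col, platform_col])[values].agg(agg)

    # Move the platform level into the columns: (metric, platform) pairs.
    wide = grouped.unstack(platform_col)

    # Flatten the column MultiIndex into "{platform}_{metric}" names.
    wide.columns = [f"{platform}_{metric}" for metric, platform in wide.columns]

    # Fill date/platform combinations that had no rows, unless disabled.
    if fill_value is not None:
        wide = wide.fillna(fill_value)

    wide = wide.reset_index()
    if rename_date_to is not None:
        wide = wide.rename(columns={date_col: rename_date_to})
    return wide


wide = to_wide(long_df, ["spend", "impressions"])
```

Here the two `facebook_ads` rows on the first date are summed into a single cell, and the date on which `facebook_ads` has no rows is filled with `fill_value`.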

This method supports multiple DataFrame backends, including pandas, polars, and PySpark. The output type matches the input type (type preservation), except that polars.LazyFrame input is returned as an eager DataFrame (see Notes).

Parameters:

df (DataFrame-like): Input dataframe in long format with at least the date, platform, and metric columns. Accepts pandas.DataFrame, polars.DataFrame, polars.LazyFrame, or pyspark.sql.DataFrame.
value_columns (str or Sequence[str], default “impressions”): Column name(s) to aggregate and pivot. Example: “spend” or [“spend”, “impressions”].
date_col (str, default “date_day”): Name of the date column.
platform_col (str, default “platform”): Name of the platform (vendor) column.
agg (str, default “sum”): Aggregation method applied during groupby. Supported: ‘sum’, ‘mean’, ‘min’, ‘max’, ‘count’.
fill_value (float or None, default 0.0): Value used to fill missing values in the wide output. If None, missing values are left as NaN.
include_missing_dates (bool, default False): If True, include a continuous date range and fill missing dates using fill_value.
freq (str, default “D”): Frequency used when include_missing_dates is True.
rename_date_to (str or None, default “date”): If provided, rename the date column in the result to this value. If None, keep date_col.
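
To make the interaction between include_missing_dates, freq, and fill_value concrete, the following pandas sketch reindexes a wide frame onto a continuous date range. It is an illustration only: the helper name `fill_missing_dates` and the sample data are hypothetical, and building the range from the earliest to the latest observed date is an assumption rather than documented behaviour.

```python
import pandas as pd

# Hypothetical wide output with a gap between 2024-01-01 and 2024-01-04.
wide_with_gap = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-01", "2024-01-04"]),
        "google_ads_spend": [20.0, 30.0],
    }
)


def fill_missing_dates(wide, date_col="date", freq="D", fill_value=0.0):
    """Illustrative helper (not part of the library)."""
    indexed = wide.set_index(date_col)

    # Assumption: the continuous range spans the observed min and max dates.
    full_range = pd.date_range(
        indexed.index.min(), indexed.index.max(), freq=freq
    )

    # Dates absent from the input appear as rows of NaN after reindexing.
    filled = indexed.reindex(full_range)

    # Replace the gaps with fill_value; None leaves them as NaN.
    if fill_value is not None:
        filled = filled.fillna(fill_value)

    return filled.rename_axis(date_col).reset_index()


filled = fill_missing_dates(wide_with_gap)
unfilled = fill_missing_dates(wide_with_gap, fill_value=None)
```

With daily frequency, the two missing days are added as rows. They hold 0.0 in `filled` and NaN in `unfilled`.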

Returns:

DataFrame-like: A wide-format dataframe with one row per date and columns for each {platform}_{metric} combination. The return type matches the input type, except for polars.LazyFrame input, which is returned as an eager DataFrame.

Notes

Backend-specific implementation notes:

PySpark: Uses native PySpark pivot operations for distributed computing. This is a temporary workaround until narwhals adds LazyFrame.pivot support. See: narwhals-dev/narwhals#1901
pandas/polars: Uses narwhals pivot operations (works on eager DataFrames).
polars.LazyFrame: Input LazyFrames are automatically collected to eager DataFrames for the pivot operation, then results are returned as eager DataFrames.