pyspark.sql.functions.sum
- pyspark.sql.functions.sum(col)
Aggregate function: returns the sum of all values in the expression.
New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
  - col : Column or str
    target column to compute on.
- Returns
  - Column
    the column for computed results.
Examples
Example 1: Calculating the sum of values in a column
>>> from pyspark.sql import functions as sf
>>> df = spark.range(10)
>>> df.select(sf.sum(df["id"])).show()
+-------+
|sum(id)|
+-------+
|     45|
+-------+
Example 2: Summing the result of an addition expression over two columns
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1, 2), (3, 4)], ["A", "B"])
>>> df.select(sf.sum(sf.col("A") + sf.col("B"))).show()
+------------+
|sum((A + B))|
+------------+
|          10|
+------------+
Example 3: Calculating the sum of ages when a value is None
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1982, None), (1990, 2), (2000, 4)], ["birth", "age"])
>>> df.select(sf.sum("age")).show()
+--------+
|sum(age)|
+--------+
|       6|
+--------+
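Example 3 returns 6 because the None in the age column is skipped rather than treated as zero. The following is a minimal pure-Python sketch of that behavior, assuming standard SQL aggregate semantics (NULL inputs are ignored, and the result is NULL when no non-null input exists). The helper name null_skipping_sum is hypothetical and is not part of PySpark; it only models the observable result and says nothing about how Spark computes it.

```python
from typing import Iterable, Optional


def null_skipping_sum(values: Iterable[Optional[int]]) -> Optional[int]:
    """Hypothetical model of how a SQL-style sum treats None (NULL)."""
    # Drop None entries, mirroring how NULL inputs are ignored.
    non_null = [v for v in values if v is not None]
    # With no non-null inputs, the aggregate yields None (NULL), not 0.
    if not non_null:
        return None
    return sum(non_null)


ages = [None, 2, 4]  # the "age" column from Example 3
total = null_skipping_sum(ages)
all_null_total = null_skipping_sum([None, None])
print(total)
print(all_null_total)
```

If a zero is preferred over a null result, one option is to wrap the aggregate in sf.coalesce with a literal 0, for example sf.coalesce(sf.sum("age"), sf.lit(0)).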