pyspark.sql.functions.sum

pyspark.sql.functions.sum(col)

Aggregate function: returns the sum of all values in the expression. Null values are ignored; the result is null when there are no non-null values to sum.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or str

the target column, or column expression, to sum.

Returns
Column

a column holding the computed sum.

Examples

Example 1: Calculating the sum of values in a column

>>> from pyspark.sql import functions as sf
>>> df = spark.range(10)
>>> df.select(sf.sum(df["id"])).show()
+-------+
|sum(id)|
+-------+
|     45|
+-------+

Example 2: Calculating the sum of an addition expression over two columns

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1, 2), (3, 4)], ["A", "B"])
>>> df.select(sf.sum(sf.col("A") + sf.col("B"))).show()
+------------+
|sum((A + B))|
+------------+
|          10|
+------------+

Example 3: Calculating the sum of ages when some values are None

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1982, None), (1990, 2), (2000, 4)], ["birth", "age"])
>>> df.select(sf.sum("age")).show()
+--------+
|sum(age)|
+--------+
|       6|
+--------+