pyspark.sql.functions.array_size
pyspark.sql.functions.array_size(col)
Array function: returns the total number of elements in the array. The function returns null for null input.
New in version 3.5.0.
Parameters
col : Column or str
    The name of the column or an expression that represents the array.
Returns
Column
    A new column that contains the size of each array.
Examples
Example 1: Basic usage with integer array
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([2, 1, 3],), (None,)], ['data'])
>>> df.select(sf.array_size(df.data)).show()
+----------------+
|array_size(data)|
+----------------+
|               3|
|            NULL|
+----------------+
Example 2: Usage with string array
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(['apple', 'banana', 'cherry'],)], ['data'])
>>> df.select(sf.array_size(df.data)).show()
+----------------+
|array_size(data)|
+----------------+
|               3|
+----------------+
Example 3: Usage with mixed type array
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(['apple', 1, 'cherry'],)], ['data'])
>>> df.select(sf.array_size(df.data)).show()
+----------------+
|array_size(data)|
+----------------+
|               3|
+----------------+
Example 4: Usage with array of arrays
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([[2, 1], [3, 4]],)], ['data'])
>>> df.select(sf.array_size(df.data)).show()
+----------------+
|array_size(data)|
+----------------+
|               2|
+----------------+
Example 5: Usage with empty array
>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, IntegerType, StructType, StructField
>>> schema = StructType([
...     StructField("data", ArrayType(IntegerType()), True)
... ])
>>> df = spark.createDataFrame([([],)], schema=schema)
>>> df.select(sf.array_size(df.data)).show()
+----------------+
|array_size(data)|
+----------------+
|               0|
+----------------+
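As noted under Parameters, col may also be given as a column name string rather than a Column object. A minimal sketch of that form, reusing the data from Example 1 and assuming the same spark session:

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([2, 1, 3],), (None,)], ['data'])
>>> df.select(sf.array_size('data')).show()  # column referenced by name
+----------------+
|array_size(data)|
+----------------+
|               3|
|            NULL|
+----------------+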