pyspark.sql.functions.map_from_arrays

pyspark.sql.functions.map_from_arrays(col1, col2)

Map function: Creates a new map from two arrays. The first array supplies the keys and the second supplies the values; elements are paired by position, and the result is returned as a new map column.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col1 : Column or str

Column, or name of the column, containing the array of keys. No key may be null.

col2 : Column or str

Column, or name of the column, containing the array of values.

Returns
Column

A column of map type.

Notes

The key and value arrays must have the same length, and no element of the key array may be null. If either condition is violated, an exception is thrown.
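
The two conditions in this note can be modeled in plain Python. The sketch below only illustrates the documented contract and is not Spark's implementation: `build_map` is a hypothetical helper, and the `ValueError` it raises is an assumption standing in for whatever exception Spark actually throws.

```python
from typing import Any, Dict, Hashable, List, Optional


def build_map(keys: List[Optional[Hashable]], values: List[Any]) -> Dict[Hashable, Any]:
    """Hypothetical plain-Python model of the two conditions in the note.

    This is not Spark code; it only mirrors the documented contract.
    """
    # Condition 1: both arrays must have the same length.
    if len(keys) != len(values):
        raise ValueError(
            f"keys and values must have the same length, "
            f"got {len(keys)} and {len(values)}"
        )
    # Condition 2: no key may be null (None); values may be null.
    if any(k is None for k in keys):
        raise ValueError("map keys must not be null")
    # Pair elements by position. Duplicate keys are not covered by the
    # note and are not modeled here.
    return dict(zip(keys, values))


print(build_map([2, 5], ["a", "b"]))   # {2: 'a', 5: 'b'}
print(build_map([1, 2], ["a", None]))  # {1: 'a', 2: None}
print(build_map([], []))               # {}
```

Calling `build_map([1], ["a", "b"])` or `build_map([None], ["a"])` raises in this model, mirroring the two failure cases the note describes.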

Examples

Example 1: Basic usage of map_from_arrays

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([2, 5], ['a', 'b'])], ['k', 'v'])
>>> df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|     {2 -> a, 5 -> b}|
+---------------------+

Example 2: map_from_arrays with null values

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2], ['a', None])], ['k', 'v'])
>>> df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|  {1 -> a, 2 -> NULL}|
+---------------------+

Example 3: map_from_arrays with empty arrays

>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, StringType, IntegerType, StructType, StructField
>>> schema = StructType([
...   StructField('k', ArrayType(IntegerType())),
...   StructField('v', ArrayType(StringType()))
... ])
>>> df = spark.createDataFrame([([], [])], schema=schema)
>>> df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|                   {}|
+---------------------+