pyspark.sql.functions.map_from_arrays
- pyspark.sql.functions.map_from_arrays(col1, col2)
Map function: Creates a new map from two arrays. This function takes two arrays of keys and values respectively, and returns a new map column.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.
- Parameters
  - col1 – Column or str: name of the column containing the array of keys.
  - col2 – Column or str: name of the column containing the array of values.
- Returns
Column
A column of map type.
Notes
The input arrays for keys and values must have the same length, and no element of the keys array may be null. If either condition is violated, an exception is thrown.
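The two rules above can be sketched in plain Python, without a Spark session. This is only an illustrative analogue of the map semantics, not part of the PySpark API; `map_from_arrays_py` is a hypothetical helper name:

```python
def map_from_arrays_py(keys, values):
    """Plain-Python sketch of map_from_arrays: pair keys with values."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    if any(k is None for k in keys):
        raise ValueError("map keys must not be null")
    # Values may be null (None), mirroring Example 2 below ({1 -> a, 2 -> NULL}).
    return dict(zip(keys, values))
```

As in the examples below, `map_from_arrays_py([2, 5], ['a', 'b'])` yields `{2: 'a', 5: 'b'}`, while a length mismatch or a null key raises an error.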
Examples
Example 1: Basic usage of map_from_arrays
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([2, 5], ['a', 'b'])], ['k', 'v'])
>>> df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|     {2 -> a, 5 -> b}|
+---------------------+
Example 2: map_from_arrays with null values
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2], ['a', None])], ['k', 'v'])
>>> df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|  {1 -> a, 2 -> NULL}|
+---------------------+
Example 3: map_from_arrays with empty arrays
>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, StringType, IntegerType, StructType, StructField
>>> schema = StructType([
...     StructField('k', ArrayType(IntegerType())),
...     StructField('v', ArrayType(StringType()))
... ])
>>> df = spark.createDataFrame([([], [])], schema=schema)
>>> df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|                   {}|
+---------------------+