pyspark.pandas.DataFrame.insert
- DataFrame.insert(loc, column, value, allow_duplicates=False)
Insert column into DataFrame at specified location.
Raises a ValueError if column is already contained in the DataFrame, unless allow_duplicates is set to True.
- Parameters
- loc : int
Insertion index. Must satisfy 0 <= loc <= len(columns).
- column : str, number, or hashable object
Label of the inserted column.
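- value : int, Series, or array-like
Content of the inserted column.
- allow_duplicates : bool, optional
Whether to allow a column label that already exists in the DataFrame. Defaults to False.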
Examples
>>> psdf = ps.DataFrame([1, 2, 3])
>>> psdf.sort_index()
   0
0  1
1  2
2  3
>>> psdf.insert(0, 'x', 4)
>>> psdf.sort_index()
   x  0
0  4  1
1  4  2
2  4  3

>>> from pyspark.pandas.config import set_option, reset_option
>>> set_option("compute.ops_on_diff_frames", True)

>>> psdf.insert(1, 'y', [5, 6, 7])
>>> psdf.sort_index()
   x  y  0
0  4  5  1
1  4  6  2
2  4  7  3

>>> psdf.insert(2, 'z', ps.Series([8, 9, 10]))
>>> psdf.sort_index()
   x  y   z  0
0  4  5   8  1
1  4  6   9  2
2  4  7  10  3

>>> reset_option("compute.ops_on_diff_frames")
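The placement and duplicate-label rules described above can also be illustrated without a Spark session. The following is a minimal sketch in plain Python that tracks only the column labels. `insert_label` is a hypothetical helper, not part of pyspark.pandas, and the exception raised for an out-of-range `loc` is an assumption, since this page does not specify it.

```python
def insert_label(columns, loc, column, allow_duplicates=False):
    """Return a new list of labels with `column` placed at position `loc`.

    Hypothetical helper: it mirrors only the checks documented above and
    does not touch any column data.
    """
    # Documented bound: 0 <= loc <= len(columns).
    if not 0 <= loc <= len(columns):
        # Assumption: the exception type is not stated in the documentation.
        raise IndexError(f"loc must satisfy 0 <= loc <= {len(columns)}")
    # Documented rule: a duplicate label raises ValueError unless allowed.
    if column in columns and not allow_duplicates:
        raise ValueError(f"cannot insert {column!r}, already exists")
    return columns[:loc] + [column] + columns[loc:]


# Same sequence of insertions as in the examples above.
labels = insert_label([0], 0, "x")
labels = insert_label(labels, 1, "y")
labels = insert_label(labels, 2, "z")

# Inserting an existing label is rejected by default.
try:
    insert_label(labels, 0, "x")
except ValueError as exc:
    duplicate_error = str(exc)
```

Here `labels` ends up in the same column order as the final `psdf` in the examples, and `loc == len(columns)` appends the new label at the end.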