pyspark.pandas.DataFrame.drop

DataFrame.drop(labels: Union[Any, Tuple[Any, ...], List[Union[Any, Tuple[Any, ...]]], None] = None, axis: Union[int, str] = 1, columns: Union[Any, Tuple[Any, ...], List[Union[Any, Tuple[Any, ...]]]] = None) → pyspark.pandas.frame.DataFrame

Drop specified labels from columns.
Remove columns either by passing label names together with axis=1, or by passing them through columns. If both labels and columns are given, only labels is used. Removing rows (axis=0) is not yet implemented.
- Parameters
  - labels : single label or list-like
    Column labels to drop.
  - axis : {1 or 'columns'}, default 1
  - columns : single label or list-like
    Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
- Returns
  - dropped : DataFrame
Notes
Currently only axis=1 is supported in this function; axis=0 is not yet implemented.
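The rules stated above (labels wins over columns, and only the column axis is accepted) can be modelled in a few lines of plain Python. This is a rough sketch for illustration, not pyspark's implementation: `resolve_drop_targets` is a hypothetical helper, and the exception types it raises are assumptions rather than documented behaviour.

```python
from typing import Any, List, Optional, Union


def resolve_drop_targets(
    labels: Optional[Any] = None,
    axis: Union[int, str] = 1,
    columns: Optional[Any] = None,
) -> List[Any]:
    """Return the column labels a drop() call with these arguments would target.

    Hypothetical helper that models the documented rules; not pyspark code.
    """
    if labels is not None:
        # labels takes precedence: columns is ignored when both are given.
        if axis not in (1, "columns"):
            # Assumed exception type for the unsupported axis=0 case.
            raise NotImplementedError("only axis=1 is supported")
        targets = labels
    elif columns is not None:
        targets = columns
    else:
        # Assumed behaviour: passing neither argument is treated as an error.
        raise ValueError("specify at least one of 'labels' or 'columns'")
    # A single label (including a tuple label) is wrapped in a list.
    return list(targets) if isinstance(targets, list) else [targets]


print(resolve_drop_targets("x", axis=1))           # ['x']
print(resolve_drop_targets(columns=["y", "z"]))    # ['y', 'z']
print(resolve_drop_targets(["y"], columns=["z"]))  # ['y']
```

In this model, `labels, axis=1` and `columns=labels` resolve to the same targets, which is the equivalence the Parameters section describes.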
Examples
>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({'x': [1, 2], 'y': [3, 4], 'z': [5, 6], 'w': [7, 8]},
...                   columns=['x', 'y', 'z', 'w'])
>>> df
   x  y  z  w
0  1  3  5  7
1  2  4  6  8

>>> df.drop('x', axis=1)
   y  z  w
0  3  5  7
1  4  6  8

>>> df.drop(['y', 'z'], axis=1)
   x  w
0  1  7
1  2  8

>>> df.drop(columns=['y', 'z'])
   x  w
0  1  7
1  2  8
MultiIndex columns are also supported:

>>> import pandas as pd
>>> df = ps.DataFrame({'x': [1, 2], 'y': [3, 4], 'z': [5, 6], 'w': [7, 8]},
...                   columns=['x', 'y', 'z', 'w'])
>>> columns = [('a', 'x'), ('a', 'y'), ('b', 'z'), ('b', 'w')]
>>> df.columns = pd.MultiIndex.from_tuples(columns)
>>> df
   a     b
   x  y  z  w
0  1  3  5  7
1  2  4  6  8
>>> df.drop('a')
   b
   z  w
0  5  7
1  6  8
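
In the MultiIndex example, dropping 'a' removes every column under that top-level label. One way to picture this is as prefix matching on the column tuples. The sketch below models that in plain Python; `remaining_columns` is a hypothetical helper, and the tuple-label case is an assumption inferred from the type signature, not something the examples above demonstrate.

```python
from typing import List, Tuple, Union

Label = Tuple[str, ...]


def remaining_columns(
    column_labels: List[Label], to_drop: Union[str, Label]
) -> List[Label]:
    """Model which MultiIndex columns survive a drop (hypothetical helper)."""
    # Normalise a bare label such as 'a' to the 1-tuple ('a',).
    prefix = to_drop if isinstance(to_drop, tuple) else (to_drop,)
    # Keep every column whose leading levels differ from the prefix.
    return [c for c in column_labels if c[: len(prefix)] != prefix]


columns = [("a", "x"), ("a", "y"), ("b", "z"), ("b", "w")]
print(remaining_columns(columns, "a"))         # [('b', 'z'), ('b', 'w')]
# Assumed: a full tuple label targets a single column.
print(remaining_columns(columns, ("a", "x")))  # [('a', 'y'), ('b', 'z'), ('b', 'w')]
```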