pyspark.sql.DataFrame.select
- DataFrame.select(*cols)
Projects a set of expressions and returns a new DataFrame.
New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
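- Parameters
cols : str, Column, or list
Column names (string) or expressions (Column). If one of the column names is '*', that column is expanded to include all columns in the current DataFrame.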
- Returns
DataFrame
A DataFrame with a subset (or all) of the columns.
Examples
>>> df = spark.createDataFrame([
...     (2, "Alice"), (5, "Bob")], schema=["age", "name"])
Select all columns in the DataFrame.
>>> df.select('*').show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
|  5|  Bob|
+---+-----+
Select a column with other expressions in the DataFrame.
>>> df.select(df.name, (df.age + 10).alias('age')).show()
+-----+---+
| name|age|
+-----+---+
|Alice| 12|
|  Bob| 15|
+-----+---+