org.apache.spark.sql

ColumnName

class ColumnName extends Column

A convenience class used for constructing a schema.

Annotations
@Stable()
Source
Column.scala
Since

1.3.0

Linear Supertypes
Column, Logging, AnyRef, Any

Instance Constructors

  1. new ColumnName(name: String)
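
    As a minimal sketch of how this class can be used to build a schema, the following assumes only that spark-sql is on the classpath; the field names ("name", "age", "tags") are hypothetical. Each type helper returns a StructField named after the ColumnName.

```scala
import org.apache.spark.sql.ColumnName
import org.apache.spark.sql.types._

// Hypothetical field names; each helper yields a StructField with that name.
val schema = StructType(Seq(
  new ColumnName("name").string,
  new ColumnName("age").int,
  new ColumnName("tags").array(StringType)
))

// Print the schema as an indented tree for inspection.
println(schema.treeString)
```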

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. def %(other: Any): Column

    Modulo (a.k.a. remainder) expression.

    Definition Classes
    Column
    Since

    1.3.0
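
    A short sketch of the modulo operator, assuming a local SparkSession and a hypothetical integer column "n":

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("modulo-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: one integer column named "n".
val df = Seq(7, 8).toDF("n")

// Remainder of n divided by 3, read back as plain Ints.
val remainders = df.select(($"n" % 3).as("r")).as[Int].collect()
```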

  4. def &&(other: Any): Column

    Boolean AND.

    // Scala: The following selects people that are in school and employed at the same time.
    people.select( people("inSchool") && people("isEmployed") )
    
    // Java:
    people.select( people.col("inSchool").and(people.col("isEmployed")) );
    Definition Classes
    Column
    Since

    1.3.0

  5. def *(other: Any): Column

    Multiplication of this expression and another expression.

    // Scala: The following multiplies a person's height by their weight.
    people.select( people("height") * people("weight") )
    
    // Java:
    people.select( people.col("height").multiply(people.col("weight")) );
    Definition Classes
    Column
    Since

    1.3.0

  6. def +(other: Any): Column

    Sum of this expression and another expression.

    // Scala: The following selects the sum of a person's height and weight.
    people.select( people("height") + people("weight") )
    
    // Java:
    people.select( people.col("height").plus(people.col("weight")) );
    Definition Classes
    Column
    Since

    1.3.0

  7. def -(other: Any): Column

    Subtraction. Subtract the other expression from this expression.

    // Scala: The following selects the difference between people's height and their weight.
    people.select( people("height") - people("weight") )
    
    // Java:
    people.select( people.col("height").minus(people.col("weight")) );
    Definition Classes
    Column
    Since

    1.3.0

  8. def /(other: Any): Column

    Division of this expression by another expression.

    // Scala: The following divides a person's height by their weight.
    people.select( people("height") / people("weight") )
    
    // Java:
    people.select( people.col("height").divide(people.col("weight")) );
    Definition Classes
    Column
    Since

    1.3.0

  9. def <(other: Any): Column

    Less than.

    // Scala: The following selects people younger than 21.
    people.select( people("age") < 21 )
    
    // Java:
    people.select( people.col("age").lt(21) );
    Definition Classes
    Column
    Since

    1.3.0

  10. def <=(other: Any): Column

    Less than or equal to.

    // Scala: The following selects people aged 21 or younger.
    people.select( people("age") <= 21 )
    
    // Java:
    people.select( people.col("age").leq(21) );
    Definition Classes
    Column
    Since

    1.3.0

  11. def <=>(other: Any): Column

    Equality test that is safe for null values.

    Definition Classes
    Column
    Since

    1.3.0
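
    To illustrate what "safe for null values" means here, the sketch below compares === with <=> on nullable data. It assumes a local SparkSession; the column names "a" and "b" are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("eq-null-safe-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: Option models a nullable integer column.
val df = Seq[(Option[Int], Option[Int])](
  (Some(1), Some(1)),
  (None, None),
  (Some(1), None)
).toDF("a", "b")

val rows = df.select(
  ($"a" === $"b").as("eq"),         // null when either side is null
  ($"a" <=> $"b").as("eqNullSafe")  // always true or false, never null
).collect()
```

    With ===, any comparison involving null yields null; with <=>, two nulls compare as true and a null against a non-null value compares as false.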

  12. def =!=(other: Any): Column

    Inequality test.

    // Scala:
    df.select( df("colA") =!= df("colB") )
    df.select( !(df("colA") === df("colB")) )
    
    // Java:
    import static org.apache.spark.sql.functions.*;
    df.filter( col("colA").notEqual(col("colB")) );
    Definition Classes
    Column
    Since

    2.0.0

  13. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  14. def ===(other: Any): Column

    Equality test.

    // Scala:
    df.filter( df("colA") === df("colB") )
    
    // Java
    import static org.apache.spark.sql.functions.*;
    df.filter( col("colA").equalTo(col("colB")) );
    Definition Classes
    Column
    Since

    1.3.0

  15. def >(other: Any): Column

    Greater than.

    // Scala: The following selects people older than 21.
    people.select( people("age") > 21 )
    
    // Java:
    import static org.apache.spark.sql.functions.*;
    people.select( people.col("age").gt(21) );
    Definition Classes
    Column
    Since

    1.3.0

  16. def >=(other: Any): Column

    Greater than or equal to an expression.

    // Scala: The following selects people aged 21 or older.
    people.select( people("age") >= 21 )
    
    // Java:
    people.select( people.col("age").geq(21) );
    Definition Classes
    Column
    Since

    1.3.0

  17. def alias(alias: String): Column

    Gives the column an alias. Same as the as method.

    // Renames colA to colB in select output.
    df.select($"colA".alias("colB"))
    Definition Classes
    Column
    Since

    1.4.0

  18. def and(other: Column): Column

    Boolean AND.

    // Scala: The following selects people that are in school and employed at the same time.
    people.select( people("inSchool") && people("isEmployed") )
    
    // Java:
    people.select( people.col("inSchool").and(people.col("isEmployed")) );
    Definition Classes
    Column
    Since

    1.3.0

  19. def apply(extraction: Any): Column

    Extracts a value or values from a complex type. The following types of extraction are supported:

    • Given an Array, an integer ordinal can be used to retrieve a single value.
    • Given a Map, a key of the correct type can be used to retrieve an individual value.
    • Given a Struct, a string fieldName can be used to extract that field.
    • Given an Array of Structs, a string fieldName can be used to extract that field from every struct in the array, returning an Array of fields.
    Definition Classes
    Column
    Since

    1.4.0
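
    The four kinds of extraction listed above can be sketched as follows. This assumes a local SparkSession and a hypothetical one-row DataFrame; Scala tuples are encoded as structs with fields _1 and _2.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().master("local[1]").appName("apply-sketch").getOrCreate()
import spark.implicits._

// Hypothetical columns: an array, a map, a struct, and an array of structs.
val df = Seq(
  (Seq(10, 20), Map("k" -> "v"), ("x", 1), Seq(("p", 1), ("q", 2)))
).toDF("arr", "m", "s", "structs")

val row = df.select(
  col("arr")(0),        // array: integer ordinal
  col("m")("k"),        // map: key lookup
  col("s")("_1"),       // struct: field name
  col("structs")("_1")  // array of structs: that field from every struct
).head()
```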

  20. def array(dataType: DataType): StructField

    Creates a new StructField of type array.

    Since

    1.3.0

  21. def as(alias: String, metadata: Metadata): Column

    Gives the column an alias with metadata.

    val metadata: Metadata = ...
    df.select($"colA".as("colB", metadata))
    Definition Classes
    Column
    Since

    1.3.0

  22. def as(alias: Symbol): Column

    Gives the column an alias.

    // Renames colA to colB in select output.
    df.select($"colA".as(Symbol("colB")))

    If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.

    Definition Classes
    Column
    Since

    1.3.0

  23. def as(aliases: Array[String]): Column

    Assigns the given aliases to the results of a table generating function.

    // Names the two columns produced by explode "key" and "value".
    df.select(explode($"myMap").as(Array("key", "value")))
    Definition Classes
    Column
    Since

    1.4.0

  24. def as(aliases: Seq[String]): Column

    (Scala-specific) Assigns the given aliases to the results of a table generating function.

    // Names the two columns produced by explode "key" and "value".
    df.select(explode($"myMap").as("key" :: "value" :: Nil))
    Definition Classes
    Column
    Since

    1.4.0

  25. def as(alias: String): Column

    Gives the column an alias.

    // Renames colA to colB in select output.
    df.select($"colA".as("colB"))

    If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.

    Definition Classes
    Column
    Since

    1.3.0

  26. def as[U](implicit arg0: Encoder[U]): TypedColumn[Any, U]

    Provides a type hint about the expected return value of this column. This information can be used by operations such as select on a Dataset to automatically convert the results into the correct JVM types.

    Definition Classes
    Column
    Since

    1.6.0
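
    A minimal sketch of the type hint in use, assuming a local SparkSession and hypothetical columns "key" and "n". Selecting a typed column yields a Dataset of the hinted JVM type rather than a DataFrame of rows.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

val spark = SparkSession.builder().master("local[1]").appName("typed-column-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data with a string column and an integer column.
val df = Seq(("a", 1), ("b", 2)).toDF("key", "n")

// The Encoder[Int] comes from spark.implicits._; the result is a Dataset[Int].
val ns: Dataset[Int] = df.select($"n".as[Int])
val values = ns.collect()
```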

  27. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  28. def asc: Column

    Returns a sort expression based on ascending order of the column.

    // Scala: sort a DataFrame by age column in ascending order.
    df.sort(df("age").asc)
    
    // Java
    df.sort(df.col("age").asc());
    Definition Classes
    Column
    Since

    1.3.0

  29. def asc_nulls_first: Column

    Returns a sort expression based on ascending order of the column, and null values appear before non-null values.

    // Scala: sort a DataFrame by age column in ascending order and null values appearing first.
    df.sort(df("age").asc_nulls_first)
    
    // Java
    df.sort(df.col("age").asc_nulls_first());
    Definition Classes
    Column
    Since

    2.1.0

  30. def asc_nulls_last: Column

    Returns a sort expression based on ascending order of the column, and null values appear after non-null values.

    // Scala: sort a DataFrame by age column in ascending order and null values appearing last.
    df.sort(df("age").asc_nulls_last)
    
    // Java
    df.sort(df.col("age").asc_nulls_last());
    Definition Classes
    Column
    Since

    2.1.0

  31. def between(lowerBound: Any, upperBound: Any): Column

    True if the current column is between the lower bound and upper bound, inclusive.

    Definition Classes
    Column
    Since

    1.4.0
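
    The inclusive bounds can be seen in a small sketch, assuming a local SparkSession and a hypothetical "age" column:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("between-sketch").getOrCreate()
import spark.implicits._

// Hypothetical ages on both sides of the range.
val ages = Seq(17, 18, 20, 21, 22).toDF("age")

// Both 18 and 21 are kept because the bounds are inclusive.
val inRange = ages.filter($"age".between(18, 21)).as[Int].collect()
```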

  32. def binary: StructField

    Creates a new StructField of type binary.

    Since

    1.3.0

  33. def bitwiseAND(other: Any): Column

    Compute bitwise AND of this expression with another expression.

    df.select($"colA".bitwiseAND($"colB"))
    Definition Classes
    Column
    Since

    1.4.0

  34. def bitwiseOR(other: Any): Column

    Compute bitwise OR of this expression with another expression.

    df.select($"colA".bitwiseOR($"colB"))
    Definition Classes
    Column
    Since

    1.4.0

  35. def bitwiseXOR(other: Any): Column

    Compute bitwise XOR of this expression with another expression.

    df.select($"colA".bitwiseXOR($"colB"))
    Definition Classes
    Column
    Since

    1.4.0

  36. def boolean: StructField

    Creates a new StructField of type boolean.

    Since

    1.3.0

  37. def byte: StructField

    Creates a new StructField of type byte.

    Since

    1.3.0

  38. def cast(to: String): Column

    Casts the column to a different data type, using the canonical string representation of the type. The supported types are: string, boolean, byte, short, int, long, float, double, decimal, date, timestamp.

    // Casts colA to integer.
    df.select(df("colA").cast("int"))
    Definition Classes
    Column
    Since

    1.3.0

  39. def cast(to: DataType): Column

    Casts the column to a different data type.

    // Casts colA to IntegerType.
    import org.apache.spark.sql.types.IntegerType
    df.select(df("colA").cast(IntegerType))
    
    // equivalent to
    df.select(df("colA").cast("int"))
    Definition Classes
    Column
    Since

    1.3.0

  40. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  41. def contains(other: Any): Column

    Contains the other element. Returns a boolean column based on a string match.

    Definition Classes
    Column
    Since

    1.3.0
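
    A brief sketch of the string-match predicates, assuming a local SparkSession and a hypothetical string column "s". It shows contains next to the related startsWith and endsWith methods documented on this page.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("contains-sketch").getOrCreate()
import spark.implicits._

// Hypothetical string data.
val df = Seq("spark sql", "scala").toDF("s")

// Each predicate yields a boolean column.
val rows = df.select(
  $"s".contains("sql"),
  $"s".startsWith("sp"),
  $"s".endsWith("la")
).collect()
```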

  42. def date: StructField

    Creates a new StructField of type date.

    Since

    1.3.0

  43. def decimal(precision: Int, scale: Int): StructField

    Creates a new StructField of type decimal.

    Since

    1.3.0

  44. def decimal: StructField

    Creates a new StructField of type decimal.

    Since

    1.3.0
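
    The two decimal overloads can be compared in a short sketch. The field name "price" is hypothetical, and the sketch assumes the no-argument form falls back to DecimalType.USER_DEFAULT.

```scala
import org.apache.spark.sql.ColumnName
import org.apache.spark.sql.types.DecimalType

// Explicit precision and scale.
val priced = new ColumnName("price").decimal(10, 2)

// No arguments: assumed here to use the default decimal type.
val defaulted = new ColumnName("price").decimal
```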

  45. def desc: Column

    Returns a sort expression based on the descending order of the column.

    // Scala
    df.sort(df("age").desc)
    
    // Java
    df.sort(df.col("age").desc());
    Definition Classes
    Column
    Since

    1.3.0

  46. def desc_nulls_first: Column

    Returns a sort expression based on the descending order of the column, and null values appear before non-null values.

    // Scala: sort a DataFrame by age column in descending order and null values appearing first.
    df.sort(df("age").desc_nulls_first)
    
    // Java
    df.sort(df.col("age").desc_nulls_first());
    Definition Classes
    Column
    Since

    2.1.0

  47. def desc_nulls_last: Column

    Returns a sort expression based on the descending order of the column, and null values appear after non-null values.

    // Scala: sort a DataFrame by age column in descending order and null values appearing last.
    df.sort(df("age").desc_nulls_last)
    
    // Java
    df.sort(df.col("age").desc_nulls_last());
    Definition Classes
    Column
    Since

    2.1.0

  48. def divide(other: Any): Column

    Division of this expression by another expression.

    // Scala: The following divides a person's height by their weight.
    people.select( people("height") / people("weight") )
    
    // Java:
    people.select( people.col("height").divide(people.col("weight")) );
    Definition Classes
    Column
    Since

    1.3.0

  49. def double: StructField

    Creates a new StructField of type double.

    Since

    1.3.0

  50. def dropFields(fieldNames: String*): Column

    An expression that drops fields in StructType by name. This is a no-op if the schema doesn't contain the given field name(s).

    val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
    df.select($"struct_col".dropFields("b"))
    // result: {"a":1}
    
    val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
    df.select($"struct_col".dropFields("c"))
    // result: {"a":1,"b":2}
    
    val df = sql("SELECT named_struct('a', 1, 'b', 2, 'c', 3) struct_col")
    df.select($"struct_col".dropFields("b", "c"))
    // result: {"a":1}
    
    val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
    df.select($"struct_col".dropFields("a", "b"))
    // result: org.apache.spark.sql.AnalysisException: cannot resolve 'update_fields(update_fields(`struct_col`))' due to data type mismatch: cannot drop all fields in struct
    
    val df = sql("SELECT CAST(NULL AS struct<a:int,b:int>) struct_col")
    df.select($"struct_col".dropFields("b"))
    // result: null of type struct<a:int>
    
    val df = sql("SELECT named_struct('a', 1, 'b', 2, 'b', 3) struct_col")
    df.select($"struct_col".dropFields("b"))
    // result: {"a":1}
    
    val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
    df.select($"struct_col".dropFields("a.b"))
    // result: {"a":{"a":1}}
    
    val df = sql("SELECT named_struct('a', named_struct('b', 1), 'a', named_struct('c', 2)) struct_col")
    df.select($"struct_col".dropFields("a.c"))
    // result: org.apache.spark.sql.AnalysisException: Ambiguous reference to fields

    This method supports dropping multiple nested fields directly e.g.

    val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
    df.select($"struct_col".dropFields("a.b", "a.c"))
    // result: {"a":{"a":1}}

    However, if you are going to drop multiple nested fields, it is more efficient to extract the nested struct first and then drop the fields from it, e.g.

    val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
    df.select($"struct_col".withField("a", $"struct_col.a".dropFields("b", "c")))
    // result: {"a":{"a":1}}
    Definition Classes
    Column
    Since

    3.1.0

  51. def endsWith(literal: String): Column

    String ends with another string literal. Returns a boolean column based on a string match.

    Definition Classes
    Column
    Since

    1.3.0

  52. def endsWith(other: Column): Column

    String ends with. Returns a boolean column based on a string match.

    Definition Classes
    Column
    Since

    1.3.0

  53. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  54. def eqNullSafe(other: Any): Column

    Equality test that is safe for null values.

    Definition Classes
    Column
    Since

    1.3.0

  55. def equalTo(other: Any): Column

    Equality test.

    // Scala:
    df.filter( df("colA") === df("colB") )
    
    // Java
    import static org.apache.spark.sql.functions.*;
    df.filter( col("colA").equalTo(col("colB")) );
    Definition Classes
    Column
    Since

    1.3.0

  56. def equals(that: Any): Boolean
    Definition Classes
    Column → AnyRef → Any
  57. def explain(extended: Boolean): Unit

    Prints the expression to the console for debugging purposes.

    Definition Classes
    Column
    Since

    1.3.0

  58. val expr: Expression
    Definition Classes
    Column
  59. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  60. def float: StructField

    Creates a new StructField of type float.

    Since

    1.3.0

  61. def geq(other: Any): Column

    Greater than or equal to an expression.

    // Scala: The following selects people aged 21 or older.
    people.select( people("age") >= 21 )
    
    // Java:
    people.select( people.col("age").geq(21) );
    Definition Classes
    Column
    Since

    1.3.0

  62. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  63. def getField(fieldName: String): Column

    An expression that gets a field by name in a StructType.

    Definition Classes
    Column
    Since

    1.3.0

  64. def getItem(key: Any): Column

    An expression that gets the item at a given ordinal position from an array, or the value for a given key from a MapType.

    Definition Classes
    Column
    Since

    1.3.0
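
    A small sketch of getItem on an array and a map, with getField on a struct for comparison. It assumes a local SparkSession; the column names are hypothetical, and array ordinals are zero-based here.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("get-item-sketch").getOrCreate()
import spark.implicits._

// Hypothetical one-row DataFrame; the tuple becomes a struct with fields _1 and _2.
val df = Seq((Seq("a", "b"), Map("x" -> 1), ("n", 7))).toDF("arr", "m", "s")

val row = df.select(
  $"arr".getItem(1),    // second array element
  $"m".getItem("x"),    // map value for key "x"
  $"s".getField("_2")   // struct field by name
).head()
```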

  65. def gt(other: Any): Column

    Greater than.

    // Scala: The following selects people older than 21.
    people.select( people("age") > lit(21) )
    
    // Java:
    import static org.apache.spark.sql.functions.*;
    people.select( people.col("age").gt(21) );
    Definition Classes
    Column
    Since

    1.3.0

  66. def hashCode(): Int
    Definition Classes
    Column → AnyRef → Any
  67. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  68. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  69. def int: StructField

    Creates a new StructField of type int.

    Since

    1.3.0

  70. def isInCollection(values: Iterable[_]): Column

    A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.

    Note: Because the types of the elements in the collection are inferred only at run time, the elements are up-cast to the most common type for comparison. For example: 1) In the case of "Int vs String", the "Int" is up-cast to "String" and the comparison becomes "String vs String". 2) In the case of "Float vs Double", the "Float" is up-cast to "Double" and the comparison becomes "Double vs Double".

    Definition Classes
    Column
    Since

    2.4.0

  71. def isInCollection(values: Iterable[_]): Column

    A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.

    Note: Because the types of the elements in the collection are inferred only at run time, the elements are up-cast to the most common type for comparison. For example: 1) In the case of "Int vs String", the "Int" is up-cast to "String" and the comparison becomes "String vs String". 2) In the case of "Float vs Double", the "Float" is up-cast to "Double" and the comparison becomes "Double vs Double".

    Definition Classes
    Column
    Since

    2.4.0

  72. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  73. def isNaN: Column

    True if the current expression is NaN.

    Definition Classes
    Column
    Since

    1.5.0
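
    A minimal sketch of isNaN as a filter, assuming a local SparkSession and a hypothetical double column "x":

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("is-nan-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: one ordinary value and one NaN.
val df = Seq(1.0, Double.NaN).toDF("x")

// Keep only the rows whose value is NaN.
val nanCount = df.filter($"x".isNaN).count()
```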

  74. def isNotNull: Column

    True if the current expression is NOT null.

    Definition Classes
    Column
    Since

    1.3.0

  75. def isNull: Column

    True if the current expression is null.

    Definition Classes
    Column
    Since

    1.3.0
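
    The null checks can be sketched as filters, assuming a local SparkSession; the columns "id" and "v" are hypothetical, with Option modelling a nullable string.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("is-null-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: the second row has a null in column "v".
val df = Seq[(Int, Option[String])]((1, Some("a")), (2, None), (3, Some("b"))).toDF("id", "v")

val nullIds = df.filter($"v".isNull).select($"id").as[Int].collect()
val nonNullIds = df.filter($"v".isNotNull).select($"id").as[Int].collect()
```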

  76. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  77. def isin(list: Any*): Column

    A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.

    Note: Because the types of the elements in the list are inferred only at run time, the elements are up-cast to the most common type for comparison. For example: 1) In the case of "Int vs String", the "Int" is up-cast to "String" and the comparison becomes "String vs String". 2) In the case of "Float vs Double", the "Float" is up-cast to "Double" and the comparison becomes "Double vs Double".

    Definition Classes
    Column
    Annotations
    @varargs()
    Since

    1.5.0
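
    A short sketch comparing isin with isInCollection, assuming a local SparkSession and a hypothetical integer column "id". Both filters are expected to keep the same rows.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("isin-sketch").getOrCreate()
import spark.implicits._

// Hypothetical ids.
val df = Seq(1, 2, 3, 4).toDF("id")

// Varargs form.
val viaIsin = df.filter($"id".isin(1, 3)).as[Int].collect()

// Collection form, convenient when the values already live in a Scala collection.
val wanted = Seq(1, 3)
val viaCollection = df.filter($"id".isInCollection(wanted)).as[Int].collect()
```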

  78. def leq(other: Any): Column

    Less than or equal to.

    // Scala: The following selects people aged 21 or younger.
    people.select( people("age") <= 21 )
    
    // Java:
    people.select( people.col("age").leq(21) );
    Definition Classes
    Column
    Since

    1.3.0

  79. def like(literal: String): Column

    SQL like expression. Returns a boolean column based on a SQL LIKE match.

    Definition Classes
    Column
    Since

    1.3.0

  80. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  81. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  82. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  83. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  84. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  85. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  86. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  87. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  88. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  89. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  90. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  91. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  92. def long: StructField

    Creates a new StructField of type long.

    Since

    1.3.0

  93. def lt(other: Any): Column

    Less than.

    // Scala: The following selects people younger than 21.
    people.select( people("age") < 21 )
    
    // Java:
    people.select( people.col("age").lt(21) );
    Definition Classes
    Column
    Since

    1.3.0

  94. def map(mapType: MapType): StructField
  95. def map(keyType: DataType, valueType: DataType): StructField

    Creates a new StructField of type map.

    Since

    1.3.0

  96. def minus(other: Any): Column

    Subtraction. Subtract the other expression from this expression.

    // Scala: The following selects the difference between people's height and their weight.
    people.select( people("height") - people("weight") )
    
    // Java:
    people.select( people.col("height").minus(people.col("weight")) );
    Definition Classes
    Column
    Since

    1.3.0

  97. def mod(other: Any): Column

    Modulo (a.k.a. remainder) expression.

    Definition Classes
    Column
    Since

    1.3.0

  98. def multiply(other: Any): Column

    Multiplication of this expression and another expression.

    // Scala: The following multiplies a person's height by their weight.
    people.select( people("height") * people("weight") )
    
    // Java:
    people.select( people.col("height").multiply(people.col("weight")) );
    Definition Classes
    Column
    Since

    1.3.0

  99. def name(alias: String): Column

    Gives the column a name (alias).

    // Renames colA to colB in select output.
    df.select($"colA".name("colB"))

    If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.

    Definition Classes
    Column
    Since

    2.0.0

  100. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  101. def notEqual(other: Any): Column

    Inequality test.

    // Scala:
    df.select( df("colA") =!= df("colB") )
    df.select( !(df("colA") === df("colB")) )
    
    // Java:
    import static org.apache.spark.sql.functions.*;
    df.filter( col("colA").notEqual(col("colB")) );
    Definition Classes
    Column
    Since

    1.3.0

  102. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  103. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  104. def or(other: Column): Column

    Boolean OR.

    // Scala: The following selects people that are in school or employed.
    people.filter( people("inSchool") || people("isEmployed") )
    
    // Java:
    people.filter( people.col("inSchool").or(people.col("isEmployed")) );
    Definition Classes
    Column
    Since

    1.3.0

  105. def otherwise(value: Any): Column

    Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

    // Example: encoding gender string column into integer.
    
    // Scala:
    people.select(when(people("gender") === "male", 0)
      .when(people("gender") === "female", 1)
      .otherwise(2))
    
    // Java:
    people.select(when(col("gender").equalTo("male"), 0)
      .when(col("gender").equalTo("female"), 1)
      .otherwise(2));
    Definition Classes
    Column
    Since

    1.4.0

  106. def over(): Column

    Defines an empty analytic clause. In this case the analytic function is applied and presented for all rows in the result set.

    df.select(
      sum("price").over(),
      avg("price").over()
    )
    Definition Classes
    Column
    Since

    2.0.0

  107. def over(window: WindowSpec): Column

    Defines a windowing column.

    val w = Window.partitionBy("name").orderBy("id")
    df.select(
      sum("price").over(w.rangeBetween(Window.unboundedPreceding, 2)),
      avg("price").over(w.rowsBetween(Window.currentRow, 4))
    )
    Definition Classes
    Column
    Since

    1.4.0

  108. def plus(other: Any): Column

    Sum of this expression and another expression.

    // Scala: The following selects the sum of a person's height and weight.
    people.select( people("height") + people("weight") )
    
    // Java:
    people.select( people.col("height").plus(people.col("weight")) );
    Definition Classes
    Column
    Since

    1.3.0

  109. def rlike(literal: String): Column

    SQL RLIKE expression (LIKE with Regex). Returns a boolean column based on a regex match.

    Definition Classes
    Column
    Since

    1.3.0
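
    To contrast rlike with like, the sketch below applies a SQL LIKE pattern and a regular expression to the same hypothetical column "lang". It assumes a local SparkSession.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("rlike-sketch").getOrCreate()
import spark.implicits._

// Hypothetical string data.
val df = Seq("spark", "scala", "java").toDF("lang")

// LIKE: % matches any sequence of characters, so this keeps values starting with "s".
val likeMatches = df.filter($"lang".like("s%")).as[String].collect()

// RLIKE: a regular expression; a$ keeps values ending in "a".
val rlikeMatches = df.filter($"lang".rlike("a$")).as[String].collect()
```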

  110. def short: StructField

    Creates a new StructField of type short.

    Since

    1.3.0

  111. def startsWith(literal: String): Column

    String starts with another string literal. Returns a boolean column based on a string match.

    Definition Classes
    Column
    Since

    1.3.0

  112. def startsWith(other: Column): Column

    String starts with. Returns a boolean column based on a string match.
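
    A short sketch contrasting the two startsWith overloads, assuming a local SparkSession. The column names and rows are hypothetical. The literal overload compares every row against the same string, while the Column overload compares each row against the value of another column in that row.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Local session for illustration only; in spark-shell this returns the existing one.
    val spark = SparkSession.builder().master("local[1]").appName("startswith-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical data: a name and a per-row prefix to test against.
    val df = Seq(("spark-sql", "spark"), ("flink-sql", "spark")).toDF("name", "prefix")

    val out = df.select(
      $"name",
      $"name".startsWith("spark").as("literalMatch"),   // startsWith(literal: String)
      $"name".startsWith($"prefix").as("columnMatch")   // startsWith(other: Column)
    )
    out.show()
    ```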

    Definition Classes
    Column
    Since

    1.3.0

  113. def string: StructField

    Creates a new StructField of type string.

    Since

    1.3.0

  114. def struct(structType: StructType): StructField

    Creates a new StructField of type struct.

    Since

    1.3.0

  115. def struct(fields: StructField*): StructField

    Creates a new StructField of type struct.
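
    As a sketch of how the StructField builders on this class can be combined into a schema: the field names below are invented for illustration, and the example only needs spark-sql on the classpath, not a running session.

    ```scala
    import org.apache.spark.sql.ColumnName
    import org.apache.spark.sql.types.StructType

    // Each builder returns a StructField named after the ColumnName.
    val schema = StructType(Seq(
      new ColumnName("id").short,
      new ColumnName("name").string,
      new ColumnName("createdAt").timestamp,
      // struct(fields: StructField*) nests the given fields under "address".
      new ColumnName("address").struct(
        new ColumnName("city").string,
        new ColumnName("zip").string
      )
    ))

    println(schema.treeString)
    ```

    With spark.implicits._ in scope, $"id".short is an equivalent and shorter spelling of new ColumnName("id").short.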

    Since

    1.3.0

  116. def substr(startPos: Int, len: Int): Column

    An expression that returns a substring.

    startPos

    starting position.

    len

    length of the substring.
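
    A minimal sketch of both substr overloads, assuming a local SparkSession. The column name and sample value are made up. Positions follow SQL substring semantics, so the first character is at position 1.

    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{length, lit}

    // Local session for illustration only; in spark-shell this returns the existing one.
    val spark = SparkSession.builder().master("local[1]").appName("substr-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical data: one string column named "s".
    val df = Seq("Spark SQL").toDF("s")

    val out = df.select(
      $"s".substr(1, 5).as("head"),                  // Int overload: 5 characters from position 1
      $"s".substr(lit(7), length($"s")).as("tail")   // Column overload: from position 7 to the end
    )
    out.show()
    ```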

    Definition Classes
    Column
    Since

    1.3.0

  117. def substr(startPos: Column, len: Column): Column

    An expression that returns a substring.

    startPos

    expression for the starting position.

    len

    expression for the length of the substring.

    Definition Classes
    Column
    Since

    1.3.0

  118. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  119. def timestamp: StructField

    Creates a new StructField of type timestamp.

    Since

    1.3.0

  120. def toString(): String
    Definition Classes
    Column → AnyRef → Any
  121. def unary_!: Column

    Inversion of boolean expression, i.e. NOT.

    // Scala: select rows that are not active (isActive === false)
    df.filter( !df("isActive") )
    
    // Java:
    import static org.apache.spark.sql.functions.*;
    df.filter( not(df.col("isActive")) );
    Definition Classes
    Column
    Since

    1.3.0

  122. def unary_-: Column

    Unary minus, i.e. negate the expression.

    // Scala: selects the amount column and negates all values.
    df.select( -df("amount") )
    
    // Java:
    import static org.apache.spark.sql.functions.*;
    df.select( negate(col("amount")) );
    Definition Classes
    Column
    Since

    1.3.0

  123. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  124. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  125. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  126. def when(condition: Column, value: Any): Column

    Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

    // Example: encoding gender string column into integer.
    
    // Scala:
    people.select(when(people("gender") === "male", 0)
      .when(people("gender") === "female", 1)
      .otherwise(2))
    
    // Java:
    people.select(when(col("gender").equalTo("male"), 0)
      .when(col("gender").equalTo("female"), 1)
      .otherwise(2));
    Definition Classes
    Column
    Since

    1.4.0

  127. def withField(fieldName: String, col: Column): Column

    An expression that adds or replaces a field in a StructType by name.

    val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
    df.select($"struct_col".withField("c", lit(3)))
    // result: {"a":1,"b":2,"c":3}
    
    val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
    df.select($"struct_col".withField("b", lit(3)))
    // result: {"a":1,"b":3}
    
    val df = sql("SELECT CAST(NULL AS struct<a:int,b:int>) struct_col")
    df.select($"struct_col".withField("c", lit(3)))
    // result: null of type struct<a:int,b:int,c:int>
    
    val df = sql("SELECT named_struct('a', 1, 'b', 2, 'b', 3) struct_col")
    df.select($"struct_col".withField("b", lit(100)))
    // result: {"a":1,"b":100,"b":100}
    
    val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
    df.select($"struct_col".withField("a.c", lit(3)))
    // result: {"a":{"a":1,"b":2,"c":3}}
    
    val df = sql("SELECT named_struct('a', named_struct('b', 1), 'a', named_struct('c', 2)) struct_col")
    df.select($"struct_col".withField("a.c", lit(3)))
    // result: org.apache.spark.sql.AnalysisException: Ambiguous reference to fields

    This method supports adding or replacing nested fields directly, e.g.

    val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
    df.select($"struct_col".withField("a.c", lit(3)).withField("a.d", lit(4)))
    // result: {"a":{"a":1,"b":2,"c":3,"d":4}}

    However, if you are going to add or replace multiple nested fields, it is more efficient to extract the nested struct first and then add or replace the fields on it, e.g.

    val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
    df.select($"struct_col".withField("a", $"struct_col.a".withField("c", lit(3)).withField("d", lit(4))))
    // result: {"a":{"a":1,"b":2,"c":3,"d":4}}
    Definition Classes
    Column
    Since

    3.1.0

  128. def ||(other: Any): Column

    Boolean OR.

    // Scala: The following selects people that are in school or employed.
    people.filter( people("inSchool") || people("isEmployed") )
    
    // Java:
    people.filter( people.col("inSchool").or(people.col("isEmployed")) );
    Definition Classes
    Column
    Since

    1.3.0

Deprecated Value Members

  1. def !==(other: Any): Column

    Inequality test.

    // Scala:
    df.select( df("colA") !== df("colB") )
    df.select( !(df("colA") === df("colB")) )
    
    // Java:
    import static org.apache.spark.sql.functions.*;
    df.filter( col("colA").notEqual(col("colB")) );
    Definition Classes
    Column
    Annotations
    @deprecated
    Deprecated

    (Since version 2.0.0) !== does not have the same precedence as ===, use =!= instead
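
    The precedence difference comes from Scala's operator rules: an operator that ends in = but does not start with = (other than <=, >= and !=) is parsed like an assignment operator, which has the lowest precedence. The sketch below uses the replacement =!= and assumes a local SparkSession; the column names and rows are made up for illustration.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Local session for illustration only; in spark-shell this returns the existing one.
    val spark = SparkSession.builder().master("local[1]").appName("neq-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical data: two integer columns and a boolean flag.
    val df = Seq((1, 2, true), (3, 3, true), (4, 5, false)).toDF("colA", "colB", "flag")

    // =!= starts with '=', so it binds like ===:
    // this parses as (colA =!= colB) && flag.
    val kept = df.filter($"colA" =!= $"colB" && $"flag")
    kept.show()

    // With the deprecated operator, the same expression without parentheses,
    //   $"colA" !== $"colB" && $"flag"
    // would instead group as $"colA" !== ($"colB" && $"flag").
    ```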

    Since

    1.3.0
