public abstract class PredictionModel<FeaturesType,M extends PredictionModel<FeaturesType,M>> extends Model<M>
Constructor and Description |
---|
PredictionModel() |
Modifier and Type | Method and Description |
---|---|
protected DataType | featuresDataType() Returns the SQL DataType corresponding to the FeaturesType type parameter. |
protected abstract double | predict(FeaturesType features) Predict label for the given features. |
M | setFeaturesCol(java.lang.String value) |
M | setPredictionCol(java.lang.String value) |
DataFrame | transform(DataFrame dataset) Transforms dataset by reading from featuresCol, calling predict(), and storing the predictions as a new column predictionCol. |
protected DataFrame | transformImpl(DataFrame dataset) |
StructType | transformSchema(StructType schema) :: DeveloperApi :: |
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType) Validates and transforms the input schema with the provided param map. |
Methods inherited from class Transformer: transform, transform, transform
Methods inherited from class PipelineStage: transformSchema
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Params: clear, copy, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Identifiable: toString, uid
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
public M setFeaturesCol(java.lang.String value)
public M setPredictionCol(java.lang.String value)
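The setters return M rather than the base type, which is what the self-bounded parameter `M extends PredictionModel<FeaturesType,M>` in the class declaration makes possible. The following is a minimal sketch of that pattern in plain Java. `SketchModel`, `ThresholdModel`, the `self()` hook, and the default column names are hypothetical stand-ins invented for illustration; they are not Spark classes, and Spark's own implementation may differ.

```java
// Hypothetical stand-ins: NOT Spark classes. They only illustrate the
// self-bounded type parameter M that lets setters return the concrete type.
abstract class SketchModel<F, M extends SketchModel<F, M>> {
    private String featuresCol = "features";     // assumed default, for the sketch only
    private String predictionCol = "prediction"; // assumed default, for the sketch only

    // Each concrete subclass returns itself, typed as M.
    protected abstract M self();

    public M setFeaturesCol(String value) {
        this.featuresCol = value;
        return self();
    }

    public M setPredictionCol(String value) {
        this.predictionCol = value;
        return self();
    }

    public String getFeaturesCol() { return featuresCol; }

    public String getPredictionCol() { return predictionCol; }
}

final class ThresholdModel extends SketchModel<double[], ThresholdModel> {
    private double threshold = 0.0;

    @Override
    protected ThresholdModel self() { return this; }

    public ThresholdModel setThreshold(double value) {
        this.threshold = value;
        return this;
    }

    public double getThreshold() { return threshold; }
}

class ChainingDemo {
    public static void main(String[] args) {
        ThresholdModel model = new ThresholdModel()
            .setFeaturesCol("x")      // returns ThresholdModel, not the base type
            .setPredictionCol("yHat")
            .setThreshold(0.5);       // subclass-only setter is still reachable
        System.out.println(model.getFeaturesCol() + " -> " + model.getPredictionCol());
    }
}
```

Because each inherited setter is typed to return the subclass, a chain can mix base-class and subclass setters without casts.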
protected DataType featuresDataType()
Returns the SQL DataType corresponding to the FeaturesType type parameter. This is used by validateAndTransformSchema(). This workaround is needed since SQL has different APIs for Scala and Java.
The default value is VectorUDT, but it may be overridden if FeaturesType is not Vector.
public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
Derives the output schema from the input schema.
Specified by: transformSchema in class PipelineStage
Parameters:
schema - (undocumented)
public DataFrame transform(DataFrame dataset)
Transforms dataset by reading from featuresCol, calling predict(), and storing the predictions as a new column predictionCol.
Specified by: transform in class Transformer
Parameters:
dataset - input dataset
Returns: the transformed dataset, with predictionCol of type Double
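The read, predict, store flow described for transform() can be sketched without Spark. In the hypothetical example below, a dataset is a plain `List` of row maps rather than a DataFrame, and `RowPredictor` and `SumPredictor` are invented names. It shows only the shape of the pattern, under those assumptions, not how Spark implements it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a prediction model. Rows are plain maps here,
// not Spark Rows, and the "dataset" is a List rather than a DataFrame.
abstract class RowPredictor<F> {
    protected String featuresCol = "features";     // assumed column name
    protected String predictionCol = "prediction"; // assumed column name

    // Subclasses supply the per-row prediction.
    protected abstract double predict(F features);

    @SuppressWarnings("unchecked")
    public List<Map<String, Object>> transform(List<Map<String, Object>> dataset) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : dataset) {
            // Copy the row so the input dataset is left unchanged.
            Map<String, Object> copy = new LinkedHashMap<>(row);
            F features = (F) row.get(featuresCol);      // read featuresCol
            copy.put(predictionCol, predict(features)); // store predictionCol
            out.add(copy);
        }
        return out;
    }
}

// Toy model: the "prediction" is just the sum of the feature values.
final class SumPredictor extends RowPredictor<double[]> {
    @Override
    protected double predict(double[] features) {
        double sum = 0.0;
        for (double v : features) {
            sum += v;
        }
        return sum;
    }
}

class TransformDemo {
    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("features", new double[] {1.0, 2.0, 3.5});
        List<Map<String, Object>> dataset = new ArrayList<>();
        dataset.add(row);

        List<Map<String, Object>> result = new SumPredictor().transform(dataset);
        System.out.println(result.get(0).get("prediction")); // prints 6.5
    }
}
```

A concrete model only has to implement predict(); the column handling lives once in the base class.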
protected abstract double predict(FeaturesType features)
Predicts the label for the given features. Used by transform() to produce the output predictionCol.
Parameters:
features - (undocumented)
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
Validates and transforms the input schema with the provided param map.
Parameters:
schema - input schema
fitting - whether this is called during fitting
featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.
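As a rough illustration of what "validates and transforms" can mean for a schema, the sketch below checks the features column type and then appends a prediction column of type double. It is a plain-Java approximation under stated assumptions: the schema is modeled as an ordered map of column name to type name, `SchemaSketch.validateAndAppend` is a hypothetical helper, and the `fitting` flag is left out because this page does not say what it changes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a schema is an ordered map of column name -> type name.
// This is not Spark's StructType and not Spark's validation logic.
final class SchemaSketch {
    private SchemaSketch() {}

    static Map<String, String> validateAndAppend(
            Map<String, String> schema,
            String featuresCol,
            String expectedFeaturesType,
            String predictionCol) {
        // Validate: the features column must exist and have the expected type.
        String actual = schema.get(featuresCol);
        if (actual == null) {
            throw new IllegalArgumentException("Missing features column: " + featuresCol);
        }
        if (!actual.equals(expectedFeaturesType)) {
            throw new IllegalArgumentException(
                "Column " + featuresCol + " has type " + actual
                    + " but " + expectedFeaturesType + " was expected");
        }
        // Validate: the output column must not already be present (sketch assumption).
        if (schema.containsKey(predictionCol)) {
            throw new IllegalArgumentException("Column already exists: " + predictionCol);
        }
        // Transform: return a new schema with the prediction column appended.
        Map<String, String> out = new LinkedHashMap<>(schema);
        out.put(predictionCol, "double");
        return out;
    }
}

class SchemaDemo {
    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("features", "vector");
        Map<String, String> result =
            SchemaSketch.validateAndAppend(schema, "features", "vector", "prediction");
        System.out.println(result.keySet()); // prints [features, prediction]
    }
}
```

Checking the schema before touching any data lets a mismatch in the features column surface as an early, descriptive error.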