public class PowerIterationClustering extends Object implements PowerIterationClusteringParams, DefaultParamsWritable
This class is not yet an Estimator/Transformer; use the assignClusters
method to run the
PowerIterationClustering algorithm.
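To make the algorithm named above more concrete, the following is a minimal plain-Java sketch of the power-iteration step described in the PIC paper. It is not Spark's implementation: the class `PicSketch`, the method `embed`, the dense `double[][]` affinity matrix, and the degree-based start vector are all assumptions made for illustration.

```java
/** Illustrative sketch only; PicSketch is hypothetical and not part of Spark. */
final class PicSketch {

    /**
     * Runs maxIter steps of power iteration on the row-normalized affinity
     * matrix W = D^-1 A and returns the resulting one-dimensional embedding.
     */
    static double[] embed(double[][] affinity, int maxIter) {
        int n = affinity.length;
        double[] degree = new double[n];
        double volume = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                degree[i] += affinity[i][j];
            }
            if (degree[i] <= 0.0) {
                throw new IllegalArgumentException("Vertex " + i + " has no positive similarity");
            }
            volume += degree[i];
        }

        // Assumed start vector: each vertex's degree divided by the total degree.
        double[] v = new double[n];
        for (int i = 0; i < n; i++) {
            v[i] = degree[i] / volume;
        }

        for (int iter = 0; iter < maxIter; iter++) {
            double[] next = new double[n];
            double norm = 0.0;
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int j = 0; j < n; j++) {
                    sum += affinity[i][j] * v[j];
                }
                next[i] = sum / degree[i]; // row i of W applied to v
                norm += Math.abs(next[i]);
            }
            for (int i = 0; i < n; i++) {
                next[i] /= norm; // rescale so the L1 norm stays 1
            }
            v = next;
        }
        return v;
    }
}
```

In this sketch, vertices that belong to the same tightly connected group end up with nearly equal embedding values after a modest number of iterations. A separate clustering step on those values (k-means in the PIC paper) would then produce the k cluster assignments.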
Constructor and Description |
---|
PowerIterationClustering() |
Modifier and Type | Method and Description |
---|---|
Dataset<Row> | assignClusters(Dataset<?> dataset) Runs the PIC algorithm and returns a cluster assignment for each input vertex. |
PowerIterationClustering | copy(ParamMap extra) Creates a copy of this instance with the same UID and some extra params. |
Param<String> | dstCol() Name of the input column for destination vertex IDs. |
Param<String> | initMode() Param for the initialization algorithm. |
IntParam | k() The number of clusters to create (k). |
static PowerIterationClustering | load(String path) |
IntParam | maxIter() Param for maximum number of iterations (>= 0). |
Param<?>[] | params() Returns all params sorted by their names. |
static MLReader<T> | read() |
PowerIterationClustering | setDstCol(String value) |
PowerIterationClustering | setInitMode(String value) |
PowerIterationClustering | setK(int value) |
PowerIterationClustering | setMaxIter(int value) |
PowerIterationClustering | setSrcCol(String value) |
PowerIterationClustering | setWeightCol(String value) |
Param<String> | srcCol() Param for the name of the input column for source vertex IDs. |
String | uid() An immutable unique ID for the object and its derivatives. |
Param<String> | weightCol() Param for weight column name. |
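Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface PowerIterationClusteringParams: getDstCol, getInitMode, getK, getSrcCol
Methods inherited from interface HasMaxIter: getMaxIter
Methods inherited from interface HasWeightCol: getWeightCol
Methods inherited from interface Params: clear, copyValues, defaultCopy, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface Identifiable: toString
Methods inherited from interface DefaultParamsWritable: write
Methods inherited from interface MLWritable: save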
public static PowerIterationClustering load(String path)
public static MLReader<T> read()
public final IntParam k()
Description copied from interface: PowerIterationClusteringParams
The number of clusters to create (k).
Specified by: k in interface PowerIterationClusteringParams
public final Param<String> initMode()
Description copied from interface: PowerIterationClusteringParams
Param for the initialization algorithm.
Specified by: initMode in interface PowerIterationClusteringParams
public Param<String> srcCol()
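Description copied from interface: PowerIterationClusteringParams
Param for the name of the input column for source vertex IDs.
Specified by: srcCol in interface PowerIterationClusteringParams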
public Param<String> dstCol()
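Description copied from interface: PowerIterationClusteringParams
Name of the input column for destination vertex IDs.
Specified by: dstCol in interface PowerIterationClusteringParams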
public final Param<String> weightCol()
Description copied from interface: HasWeightCol
Param for weight column name.
Specified by: weightCol in interface HasWeightCol
public final IntParam maxIter()
Description copied from interface: HasMaxIter
Param for maximum number of iterations (>= 0).
Specified by: maxIter in interface HasMaxIter
public Param<?>[] params()
Description copied from interface: Params
Returns all params sorted by their names.
public String uid()
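Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable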
public PowerIterationClustering setK(int value)
public PowerIterationClustering setInitMode(String value)
public PowerIterationClustering setMaxIter(int value)
public PowerIterationClustering setSrcCol(String value)
public PowerIterationClustering setDstCol(String value)
public PowerIterationClustering setWeightCol(String value)
public Dataset<Row> assignClusters(Dataset<?> dataset)
dataset
- A dataset with columns src, dst, weight representing the affinity matrix,
which is the matrix A in the PIC paper. For a row whose src value is i and
whose dst value is j, the weight value is the similarity s_ij, which must be
nonnegative. The matrix is symmetric, so s_ij = s_ji. For any pair (i, j)
with nonzero similarity, the input should contain either (i, j, s_ij) or
(j, i, s_ji). Rows with i = j are ignored, because the self-similarity s_ii
is assumed to be 0.0.
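The input rules above can be checked before any rows are handed to assignClusters. The following plain-Java sketch shows one way to do that under stated assumptions: `AffinityEdges`, `Edge`, and `canonicalize` are hypothetical names that are not part of Spark, and the choices to reject negative or asymmetric weights with an exception and to store each pair with the smaller ID first are illustrative conventions only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Spark) for preparing (src, dst, weight) rows. */
final class AffinityEdges {

    /** One entry of the affinity matrix: the similarity between src and dst. */
    static final class Edge {
        final long src;
        final long dst;
        final double weight;

        Edge(long src, long dst, double weight) {
            this.src = src;
            this.dst = dst;
            this.weight = weight;
        }
    }

    /**
     * Returns one edge per unordered vertex pair, in first-seen order.
     * Self-pairs (i = i) are dropped; negative or asymmetric weights are rejected.
     */
    static List<Edge> canonicalize(List<Edge> raw) {
        Map<String, Edge> unique = new LinkedHashMap<>();
        for (Edge e : raw) {
            if (e.weight < 0.0) {
                throw new IllegalArgumentException(
                    "Similarity must be nonnegative for (" + e.src + ", " + e.dst + ")");
            }
            if (e.src == e.dst) {
                continue; // rows with i = j would be ignored anyway
            }
            long lo = Math.min(e.src, e.dst);
            long hi = Math.max(e.src, e.dst);
            Edge canonical = new Edge(lo, hi, e.weight);
            Edge previous = unique.putIfAbsent(lo + ":" + hi, canonical);
            if (previous != null && previous.weight != canonical.weight) {
                throw new IllegalArgumentException(
                    "Asymmetric similarity for pair (" + lo + ", " + hi + ")");
            }
        }
        return new ArrayList<>(unique.values());
    }
}
```

The cleaned list could then be turned into a dataset with src, dst, and weight columns. Keeping only one orientation per pair is consistent with the requirement that either (i, j) or (j, i) be present.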
public PowerIterationClustering copy(ParamMap extra)
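Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. See defaultCopy().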