public class RandomRDDs
extends Object

Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.

Constructor and Description |
---|
RandomRDDs() |

Modifier and Type | Method and Description |
---|---|
static JavaDoubleRDD | exponentialJavaRDD(JavaSparkContext jsc, double mean, long size): exponentialJavaRDD(JavaSparkContext, double, long, int, long) with the default number of partitions and the default seed. |
static JavaDoubleRDD | exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions): exponentialJavaRDD(JavaSparkContext, double, long, int, long) with the default seed. |
static JavaDoubleRDD | exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed): Java-friendly version of exponentialRDD(SparkContext, double, long, int, long). |
static JavaRDD<Vector> | exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols): exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int, long) with the default number of partitions and the default seed. |
static JavaRDD<Vector> | exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions): exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int, long) with the default seed. |
static JavaRDD<Vector> | exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed): Java-friendly version of exponentialVectorRDD(SparkContext, double, long, int, int, long). |
static RDD<Object> | exponentialRDD(SparkContext sc, double mean, long size, int numPartitions, long seed): Generates an RDD comprised of i.i.d. samples from the exponential distribution with the input mean. |
static RDD<Vector> | exponentialVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed): Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the exponential distribution with the input mean. |
static JavaDoubleRDD | gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size): gammaJavaRDD(JavaSparkContext, double, double, long, int, long) with the default number of partitions and the default seed. |
static JavaDoubleRDD | gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions): gammaJavaRDD(JavaSparkContext, double, double, long, int, long) with the default seed. |
static JavaDoubleRDD | gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions, long seed): Java-friendly version of gammaRDD(SparkContext, double, double, long, int, long). |
static JavaRDD<Vector> | gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols): gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) with the default number of partitions and the default seed. |
static JavaRDD<Vector> | gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions): gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) with the default seed. |
static JavaRDD<Vector> | gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed): Java-friendly version of gammaVectorRDD(SparkContext, double, double, long, int, int, long). |
static RDD<Object> | gammaRDD(SparkContext sc, double shape, double scale, long size, int numPartitions, long seed): Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input shape and scale. |
static RDD<Vector> | gammaVectorRDD(SparkContext sc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed): Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the gamma distribution with the input shape and scale. |
static JavaDoubleRDD | logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size): logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) with the default number of partitions and the default seed. |
static JavaDoubleRDD | logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions): logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) with the default seed. |
static JavaDoubleRDD | logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions, long seed): Java-friendly version of logNormalRDD(SparkContext, double, double, long, int, long). |
static JavaRDD<Vector> | logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols): logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) with the default number of partitions and the default seed. |
static JavaRDD<Vector> | logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions): logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) with the default seed. |
static JavaRDD<Vector> | logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions, long seed): Java-friendly version of logNormalVectorRDD(SparkContext, double, double, long, int, int, long). |
static RDD<Object> | logNormalRDD(SparkContext sc, double mean, double std, long size, int numPartitions, long seed): Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation. |
static RDD<Vector> | logNormalVectorRDD(SparkContext sc, double mean, double std, long numRows, int numCols, int numPartitions, long seed): Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution. |
static JavaDoubleRDD | normalJavaRDD(JavaSparkContext jsc, long size): normalJavaRDD(JavaSparkContext, long, int, long) with the default number of partitions and the default seed. |
static JavaDoubleRDD | normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions): normalJavaRDD(JavaSparkContext, long, int, long) with the default seed. |
static JavaDoubleRDD | normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed): Java-friendly version of normalRDD(SparkContext, long, int, long). |
static JavaRDD<Vector> | normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols): normalJavaVectorRDD(JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed. |
static JavaRDD<Vector> | normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions): normalJavaVectorRDD(JavaSparkContext, long, int, int, long) with the default seed. |
static JavaRDD<Vector> | normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed): Java-friendly version of normalVectorRDD(SparkContext, long, int, int, long). |
static RDD<Object> | normalRDD(SparkContext sc, long size, int numPartitions, long seed): Generates an RDD comprised of i.i.d. samples from the standard normal distribution. |
static RDD<Vector> | normalVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed): Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution. |
static JavaDoubleRDD | poissonJavaRDD(JavaSparkContext jsc, double mean, long size): poissonJavaRDD(JavaSparkContext, double, long, int, long) with the default number of partitions and the default seed. |
static JavaDoubleRDD | poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions): poissonJavaRDD(JavaSparkContext, double, long, int, long) with the default seed. |
static JavaDoubleRDD | poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed): Java-friendly version of poissonRDD(SparkContext, double, long, int, long). |
static JavaRDD<Vector> | poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols): poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) with the default number of partitions and the default seed. |
static JavaRDD<Vector> | poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions): poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) with the default seed. |
static JavaRDD<Vector> | poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed): Java-friendly version of poissonVectorRDD(SparkContext, double, long, int, int, long). |
static RDD<Object> | poissonRDD(SparkContext sc, double mean, long size, int numPartitions, long seed): Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean. |
static RDD<Vector> | poissonVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed): Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean. |
static <T> JavaRDD<T> | randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size): :: DeveloperApi :: randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int, long) with the default seed and the default number of partitions. |
static <T> JavaRDD<T> | randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions): :: DeveloperApi :: randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int, long) with the default seed. |
static <T> JavaRDD<T> | randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed): :: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator. |
static JavaRDD<Vector> | randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols): :: DeveloperApi :: randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int, long) with the default number of partitions and the default seed. |
static JavaRDD<Vector> | randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions): :: DeveloperApi :: randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int, long) with the default seed. |
static JavaRDD<Vector> | randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed): :: DeveloperApi :: Java-friendly version of randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long). |
static <T> RDD<T> | randomRDD(SparkContext sc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed, scala.reflect.ClassTag<T> evidence$1): :: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator. |
static RDD<Vector> | randomVectorRDD(SparkContext sc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed): :: DeveloperApi :: Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator. |
static JavaDoubleRDD | uniformJavaRDD(JavaSparkContext jsc, long size): uniformJavaRDD(JavaSparkContext, long, int, long) with the default number of partitions and the default seed. |
static JavaDoubleRDD | uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions): uniformJavaRDD(JavaSparkContext, long, int, long) with the default seed. |
static JavaDoubleRDD | uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed): Java-friendly version of uniformRDD(SparkContext, long, int, long). |
static JavaRDD<Vector> | uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols): uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed. |
static JavaRDD<Vector> | uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions): uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) with the default seed. |
static JavaRDD<Vector> | uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed): Java-friendly version of uniformVectorRDD(SparkContext, long, int, int, long). |
static RDD<Object> | uniformRDD(SparkContext sc, long size, int numPartitions, long seed): Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0). |
static RDD<Vector> | uniformVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed): Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0). |
public static RDD<Object> uniformRDD(SparkContext sc, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0). To transform the distribution in the generated RDD from U(0.0, 1.0) to U(a, b), use RandomRDDs.uniformRDD(sc, n, p, seed).map(v => a + (b - a) * v).
Parameters:
sc - SparkContext used to create the RDD.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ U(0.0, 1.0).

public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed)
Java-friendly version of uniformRDD(SparkContext, long, int, long).
Parameters: jsc, size, numPartitions, seed - (undocumented)

public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions)
uniformJavaRDD(JavaSparkContext, long, int, long) with the default seed.
Parameters: jsc, size, numPartitions - (undocumented)

public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size)
uniformJavaRDD(JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
Parameters: jsc, size - (undocumented)
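As a quick illustration of the U(a, b) transformation described above, here is a minimal Java sketch; it assumes a local SparkContext, and the bounds a and b, the sample count, the partition count, and the seed are all illustrative values.

```java
import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.random.RandomRDDs;

public class UniformRDDExample {
  public static void main(String[] args) {
    // Local context for the sketch; in an application the context usually already exists.
    JavaSparkContext jsc = new JavaSparkContext("local[2]", "UniformRDDExample");

    double a = -1.0;
    double b = 1.0;

    // 1000 i.i.d. samples ~ U(0.0, 1.0) in 4 partitions with a fixed seed.
    JavaDoubleRDD u01 = RandomRDDs.uniformJavaRDD(jsc, 1000L, 4, 42L);

    // Shift and scale each sample into U(a, b), as suggested in the method description.
    JavaDoubleRDD uab = u01.mapToDouble(v -> a + (b - a) * v);

    System.out.println("sample mean: " + uab.mean()); // should be close to (a + b) / 2
    jsc.stop();
  }
}
```

Fixing the seed makes the generated data reproducible across runs, which is convenient in tests.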
public static RDD<Object> normalRDD(SparkContext sc, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the standard normal distribution. To transform the distribution in the generated RDD from standard normal to some other normal N(mean, sigma^2), use RandomRDDs.normalRDD(sc, n, p, seed).map(v => mean + sigma * v).
Parameters:
sc - SparkContext used to create the RDD.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ N(0.0, 1.0).

public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed)
Java-friendly version of normalRDD(SparkContext, long, int, long).
Parameters: jsc, size, numPartitions, seed - (undocumented)

public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions)
normalJavaRDD(JavaSparkContext, long, int, long) with the default seed.
Parameters: jsc, size, numPartitions - (undocumented)

public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size)
normalJavaRDD(JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
Parameters: jsc, size - (undocumented)
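The same pattern applies to the N(mean, sigma^2) rescaling mentioned above. A small sketch, assuming an existing JavaSparkContext; the helper name, the values of mu and sigma, and the partition count are illustrative.

```java
import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.random.RandomRDDs;

public class NormalRDDExample {
  // Draws n standard normal samples and rescales them to N(mu, sigma^2),
  // following the map(v => mean + sigma * v) recipe from the method description.
  static JavaDoubleRDD normalSamples(JavaSparkContext jsc, long n, double mu, double sigma, long seed) {
    JavaDoubleRDD standard = RandomRDDs.normalJavaRDD(jsc, n, 4, seed); // 4 partitions, illustrative
    return standard.mapToDouble(v -> mu + sigma * v);
  }
}
```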
public static RDD<Object> poissonRDD(SparkContext sc, double mean, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean, or lambda, for the Poisson distribution.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ Pois(mean).

public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed)
Java-friendly version of poissonRDD(SparkContext, double, long, int, long).
Parameters: jsc, mean, size, numPartitions, seed - (undocumented)

public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions)
poissonJavaRDD(JavaSparkContext, double, long, int, long) with the default seed.
Parameters: jsc, mean, size, numPartitions - (undocumented)

public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size)
poissonJavaRDD(JavaSparkContext, double, long, int, long) with the default number of partitions and the default seed.
Parameters: jsc, mean, size - (undocumented)
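For the Poisson generator, the mean parameter is the rate lambda, so the sample mean of a large RDD should land near it. A minimal sketch; the helper name and the partition count are illustrative.

```java
import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.random.RandomRDDs;

public class PoissonRDDExample {
  // Generates n i.i.d. Pois(lambda) counts; for large n the sample mean approaches lambda.
  static double poissonSampleMean(JavaSparkContext jsc, double lambda, long n, long seed) {
    JavaDoubleRDD counts = RandomRDDs.poissonJavaRDD(jsc, lambda, n, 2, seed); // 2 partitions, illustrative
    return counts.mean();
  }
}
```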
public static RDD<Object> exponentialRDD(SparkContext sc, double mean, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the exponential distribution with the input mean.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean, or 1 / lambda, for the exponential distribution.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ Exp(mean).

public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed)
Java-friendly version of exponentialRDD(SparkContext, double, long, int, long).
Parameters: jsc, mean, size, numPartitions, seed - (undocumented)

public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions)
exponentialJavaRDD(JavaSparkContext, double, long, int, long) with the default seed.
Parameters: jsc, mean, size, numPartitions - (undocumented)

public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size)
exponentialJavaRDD(JavaSparkContext, double, long, int, long) with the default number of partitions and the default seed.
Parameters: jsc, mean, size - (undocumented)
public static RDD<Object> gammaRDD(SparkContext sc, double shape, double scale, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input shape and scale.
Parameters:
sc - SparkContext used to create the RDD.
shape - Shape parameter (> 0) for the gamma distribution.
scale - Scale parameter (> 0) for the gamma distribution.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ Gamma(shape, scale).

public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions, long seed)
Java-friendly version of gammaRDD(SparkContext, double, double, long, int, long).
Parameters: jsc, shape, scale, size, numPartitions, seed - (undocumented)

public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions)
gammaJavaRDD(JavaSparkContext, double, double, long, int, long) with the default seed.
Parameters: jsc, shape, scale, size, numPartitions - (undocumented)

public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size)
gammaJavaRDD(JavaSparkContext, double, double, long, int, long) with the default number of partitions and the default seed.
Parameters: jsc, shape, scale, size - (undocumented)
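Under the shape/scale parameterization documented above, the distribution has mean shape * scale and variance shape * scale^2, which gives a simple sanity check on the generated data. A short sketch; the class and method names and the partition count are illustrative.

```java
import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.random.RandomRDDs;
import org.apache.spark.util.StatCounter;

public class GammaRDDExample {
  // Generates n i.i.d. Gamma(shape, scale) samples and summarizes them; the mean should be
  // near shape * scale and the variance near shape * scale^2.
  static StatCounter gammaStats(JavaSparkContext jsc, double shape, double scale, long n, long seed) {
    JavaDoubleRDD samples = RandomRDDs.gammaJavaRDD(jsc, shape, scale, n, 4, seed);
    return samples.stats();
  }
}
```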
public static RDD<Object> logNormalRDD(SparkContext sc, double mean, double std, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean for the log normal distribution.
std - Standard deviation for the log normal distribution.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples from the log normal distribution with the input mean and standard deviation.

public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions, long seed)
Java-friendly version of logNormalRDD(SparkContext, double, double, long, int, long).
Parameters: jsc, mean, std, size, numPartitions, seed - (undocumented)

public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions)
logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) with the default seed.
Parameters: jsc, mean, std, size, numPartitions - (undocumented)

public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size)
logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) with the default number of partitions and the default seed.
Parameters: jsc, mean, std, size - (undocumented)
public static <T> RDD<T> randomRDD(SparkContext sc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed, scala.reflect.ClassTag<T> evidence$1)
Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
Parameters:
sc - SparkContext used to create the RDD.
generator - RandomDataGenerator used to populate the RDD.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
evidence$1 - (undocumented)
Returns: i.i.d. samples produced by generator.

public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
Parameters:
jsc - JavaSparkContext used to create the RDD.
generator - RandomDataGenerator used to populate the RDD.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples produced by generator.

public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions)
randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int, long) with the default seed.
Parameters: jsc, generator, size, numPartitions - (undocumented)

public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size)
randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int, long) with the default seed and the default number of partitions.
Parameters: jsc, generator, size - (undocumented)
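The randomRDD / randomJavaRDD family is a DeveloperApi hook for plugging in a custom sampler. The sketch below implements the RandomDataGenerator contract (nextValue, setSeed, copy) with a hypothetical generator that draws uniformly from [lo, hi); the class name, the range, and the partition count are illustrative.

```java
import java.util.Random;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.random.RandomDataGenerator;
import org.apache.spark.mllib.random.RandomRDDs;

public class CustomGeneratorExample {

  // Hypothetical generator: i.i.d. values uniform on [lo, hi). RandomDataGenerator already
  // extends Serializable, so instances can be shipped to executors; each partition gets its
  // own copy, seeded via setSeed.
  public static class UniformRange implements RandomDataGenerator<Double> {
    private final double lo;
    private final double hi;
    private Random rng = new Random();

    public UniformRange(double lo, double hi) {
      this.lo = lo;
      this.hi = hi;
    }

    @Override
    public Double nextValue() {
      return lo + (hi - lo) * rng.nextDouble();
    }

    @Override
    public void setSeed(long seed) {
      rng = new Random(seed);
    }

    @Override
    public UniformRange copy() {
      return new UniformRange(lo, hi);
    }
  }

  // 4 partitions, each populated by a seeded copy of the hypothetical generator.
  static JavaRDD<Double> generate(JavaSparkContext jsc, long n, long seed) {
    return RandomRDDs.randomJavaRDD(jsc, new UniformRange(5.0, 10.0), n, 4, seed);
  }
}
```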
public static RDD<Vector> uniformVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0).
Parameters:
sc - SparkContext used to create the RDD.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD.
seed - Seed for the RNG that generates the seed for the generator in each partition.
Returns: i.i.d. samples ~ U(0.0, 1.0).

public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of uniformVectorRDD(SparkContext, long, int, int, long).
Parameters: jsc, numRows, numCols, numPartitions, seed - (undocumented)

public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions)
uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) with the default seed.
Parameters: jsc, numRows, numCols, numPartitions - (undocumented)

public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols)
uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed.
Parameters: jsc, numRows, numCols - (undocumented)
public static RDD<Vector> normalVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.
Parameters:
sc - SparkContext used to create the RDD.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ N(0.0, 1.0).

public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of normalVectorRDD(SparkContext, long, int, int, long).
Parameters: jsc, numRows, numCols, numPartitions, seed - (undocumented)

public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions)
normalJavaVectorRDD(JavaSparkContext, long, int, int, long) with the default seed.
Parameters: jsc, numRows, numCols, numPartitions - (undocumented)

public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols)
normalJavaVectorRDD(JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed.
Parameters: jsc, numRows, numCols - (undocumented)
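The vector variants produce one Vector per row, which makes them handy for fabricating a random data matrix. A sketch, assuming an existing JavaSparkContext; using Statistics.colStats as a sanity check is just one option, and the partition count is illustrative.

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.random.RandomRDDs;
import org.apache.spark.mllib.stat.MultivariateStatisticalSummary;
import org.apache.spark.mllib.stat.Statistics;

public class NormalVectorRDDExample {
  // Builds a numRows x numCols matrix of i.i.d. N(0, 1) entries (one Vector per row) and
  // summarizes it; column means should be near 0 and column variances near 1.
  static MultivariateStatisticalSummary randomMatrixStats(JavaSparkContext jsc, long numRows, int numCols, long seed) {
    JavaRDD<Vector> rows = RandomRDDs.normalJavaVectorRDD(jsc, numRows, numCols, 4, seed);
    return Statistics.colStats(rows.rdd());
  }
}
```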
public static RDD<Vector> logNormalVectorRDD(SparkContext sc, double mean, double std, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean of the log normal distribution.
std - Standard deviation of the log normal distribution.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples.

public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of logNormalVectorRDD(SparkContext, double, double, long, int, int, long).
Parameters: jsc, mean, std, numRows, numCols, numPartitions, seed - (undocumented)

public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions)
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) with the default seed.
Parameters: jsc, mean, std, numRows, numCols, numPartitions - (undocumented)

public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols)
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) with the default number of partitions and the default seed.
Parameters: jsc, mean, std, numRows, numCols - (undocumented)
public static RDD<Vector> poissonVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean, or lambda, for the Poisson distribution.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ Pois(mean).

public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of poissonVectorRDD(SparkContext, double, long, int, int, long).
Parameters: jsc, mean, numRows, numCols, numPartitions, seed - (undocumented)

public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions)
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) with the default seed.
Parameters: jsc, mean, numRows, numCols, numPartitions - (undocumented)

public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols)
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) with the default number of partitions and the default seed.
Parameters: jsc, mean, numRows, numCols - (undocumented)
public static RDD<Vector> exponentialVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the exponential distribution with the input mean.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean, or 1 / lambda, for the exponential distribution.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ Exp(mean).

public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of exponentialVectorRDD(SparkContext, double, long, int, int, long).
Parameters: jsc, mean, numRows, numCols, numPartitions, seed - (undocumented)

public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions)
exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int, long) with the default seed.
Parameters: jsc, mean, numRows, numCols, numPartitions - (undocumented)

public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols)
exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int, long) with the default number of partitions and the default seed.
Parameters: jsc, mean, numRows, numCols - (undocumented)
public static RDD<Vector> gammaVectorRDD(SparkContext sc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the gamma distribution with the input shape and scale.
Parameters:
sc - SparkContext used to create the RDD.
shape - Shape parameter (> 0) for the gamma distribution.
scale - Scale parameter (> 0) for the gamma distribution.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples ~ Gamma(shape, scale).

public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of gammaVectorRDD(SparkContext, double, double, long, int, int, long).
Parameters: jsc, shape, scale, numRows, numCols, numPartitions, seed - (undocumented)

public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions)
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) with the default seed.
Parameters: jsc, shape, scale, numRows, numCols, numPartitions - (undocumented)

public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols)
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) with the default number of partitions and the default seed.
Parameters: jsc, shape, scale, numRows, numCols - (undocumented)
public static RDD<Vector> randomVectorRDD(SparkContext sc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
Parameters:
sc - SparkContext used to create the RDD.
generator - RandomDataGenerator used to populate the RDD.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns: i.i.d. samples produced by generator.

public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long).
Parameters: jsc, generator, numRows, numCols, numPartitions, seed - (undocumented)

public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions)
randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int, long) with the default seed.
Parameters: jsc, generator, numRows, numCols, numPartitions - (undocumented)

public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols)
randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int, long) with the default number of partitions and the default seed.
Parameters: jsc, generator, numRows, numCols - (undocumented)
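As a closing note, the generated Vector RDDs plug directly into MLlib's distributed linear algebra. A minimal sketch of one such use, wrapping uniform random rows in a RowMatrix; the method name, the dimensions, and the partition count are illustrative.

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.distributed.RowMatrix;
import org.apache.spark.mllib.random.RandomRDDs;

public class RandomRowMatrixExample {
  // Wraps a random numRows x numCols data set in a RowMatrix, e.g. as quick test input
  // for routines such as PCA or SVD.
  static RowMatrix randomRowMatrix(JavaSparkContext jsc, long numRows, int numCols, long seed) {
    JavaRDD<Vector> rows = RandomRDDs.uniformJavaVectorRDD(jsc, numRows, numCols, 4, seed);
    return new RowMatrix(rows.rdd());
  }
}
```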