KafkaUtils

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V]](jssc: JavaStreamingContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], kafkaParams: Map[String, String], topics: Set[String]): JavaPairInputDStream[K, V]

Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).
Points to note:
- No receivers: This stream does not use any receiver. It directly queries Kafka
- Offsets: This does not use Zookeeper to store offsets. The consumed offsets are tracked by the stream itself. For interoperability with Kafka monitoring tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself from the streaming application. You can access the offsets used in each batch from the generated RDDs (see org.apache.spark.streaming.kafka.HasOffsetRanges).
- Failure Recovery: To recover from driver failures, you have to enable checkpointing in the StreamingContext. The information on consumed offset can be recovered from the checkpoint. See the programming guide for details (constraints, etc.).
- End-to-end semantics: This stream ensures that every records is effectively received and transformed exactly once, but gives no guarantees on whether the transformed data are outputted exactly once. For end-to-end exactly-once semantics, you have to either ensure that the output operation is idempotent, or use transactions to output records atomically. See the programming guide for more details.
K
type of Kafka message key
V
type of Kafka message value
KD
type of Kafka message key decoder
VD
type of Kafka message value decoder
jssc
JavaStreamingContext object
keyClass
Class of the keys in the Kafka records
valueClass
Class of the values in the Kafka records
keyDecoderClass
Class of the key decoder
valueDecoderClass
Class type of the value decoder
kafkaParams
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form. If not starting from a checkpoint, "auto.offset.reset" may be set to "largest" or "smallest" to determine where the stream starts (defaults to "largest")
topics
Names of the topics to consume
returns
DStream of (Kafka message key, Kafka message value)
def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V], R](jssc: JavaStreamingContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], recordClass: Class[R], kafkaParams: Map[String, String], fromOffsets: Map[TopicAndPartition, Long], messageHandler: Function[MessageAndMetadata[K, V], R]): JavaInputDStream[R]

Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).
Points to note:
- No receivers: This stream does not use any receiver. It directly queries Kafka
- Offsets: This does not use Zookeeper to store offsets. The consumed offsets are tracked by the stream itself. For interoperability with Kafka monitoring tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself from the streaming application. You can access the offsets used in each batch from the generated RDDs (see org.apache.spark.streaming.kafka.HasOffsetRanges).
- Failure Recovery: To recover from driver failures, you have to enable checkpointing in the StreamingContext. The information on consumed offset can be recovered from the checkpoint. See the programming guide for details (constraints, etc.).
- End-to-end semantics: This stream ensures that every records is effectively received and transformed exactly once, but gives no guarantees on whether the transformed data are outputted exactly once. For end-to-end exactly-once semantics, you have to either ensure that the output operation is idempotent, or use transactions to output records atomically. See the programming guide for more details.
K
type of Kafka message key
V
type of Kafka message value
KD
type of Kafka message key decoder
VD
type of Kafka message value decoder
R
type returned by messageHandler
jssc
JavaStreamingContext object
keyClass
Class of the keys in the Kafka records
valueClass
Class of the values in the Kafka records
keyDecoderClass
Class of the key decoder
valueDecoderClass
Class of the value decoder
recordClass
Class of the records in DStream
kafkaParams
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form.
fromOffsets
Per-topic/partition Kafka offsets defining the (inclusive) starting point of the stream
messageHandler
Function for translating each message and metadata into the desired type
returns
DStream of R
def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V]](ssc: StreamingContext, kafkaParams: Map[String, String], topics: Set[String])(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD]): InputDStream[(K, V)]

Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).
Points to note:
- No receivers: This stream does not use any receiver. It directly queries Kafka
- Offsets: This does not use Zookeeper to store offsets. The consumed offsets are tracked by the stream itself. For interoperability with Kafka monitoring tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself from the streaming application. You can access the offsets used in each batch from the generated RDDs (see org.apache.spark.streaming.kafka.HasOffsetRanges).
- Failure Recovery: To recover from driver failures, you have to enable checkpointing in the StreamingContext. The information on consumed offset can be recovered from the checkpoint. See the programming guide for details (constraints, etc.).
- End-to-end semantics: This stream ensures that every records is effectively received and transformed exactly once, but gives no guarantees on whether the transformed data are outputted exactly once. For end-to-end exactly-once semantics, you have to either ensure that the output operation is idempotent, or use transactions to output records atomically. See the programming guide for more details.
K
type of Kafka message key
V
type of Kafka message value
KD
type of Kafka message key decoder
VD
type of Kafka message value decoder
ssc
StreamingContext object
kafkaParams
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form. If not starting from a checkpoint, "auto.offset.reset" may be set to "largest" or "smallest" to determine where the stream starts (defaults to "largest")
topics
Names of the topics to consume
returns
DStream of (Kafka message key, Kafka message value)
def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V], R](ssc: StreamingContext, kafkaParams: Map[String, String], fromOffsets: Map[TopicAndPartition, Long], messageHandler: (MessageAndMetadata[K, V]) ⇒ R)(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD], arg4: ClassTag[R]): InputDStream[R]

Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).
Points to note:
- No receivers: This stream does not use any receiver. It directly queries Kafka
- Offsets: This does not use Zookeeper to store offsets. The consumed offsets are tracked by the stream itself. For interoperability with Kafka monitoring tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself from the streaming application. You can access the offsets used in each batch from the generated RDDs (see org.apache.spark.streaming.kafka.HasOffsetRanges).
- Failure Recovery: To recover from driver failures, you have to enable checkpointing in the StreamingContext. The information on consumed offset can be recovered from the checkpoint. See the programming guide for details (constraints, etc.).
- End-to-end semantics: This stream ensures that every records is effectively received and transformed exactly once, but gives no guarantees on whether the transformed data are outputted exactly once. For end-to-end exactly-once semantics, you have to either ensure that the output operation is idempotent, or use transactions to output records atomically. See the programming guide for more details.
K
type of Kafka message key
V
type of Kafka message value
KD
type of Kafka message key decoder
VD
type of Kafka message value decoder
R
type returned by messageHandler
ssc
StreamingContext object
kafkaParams
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
fromOffsets
Per-topic/partition Kafka offsets defining the (inclusive) starting point of the stream
messageHandler
Function for translating each message and metadata into the desired type
returns
DStream of R
def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V], R](jsc: JavaSparkContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], recordClass: Class[R], kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange], leaders: Map[TopicAndPartition, Broker], messageHandler: Function[MessageAndMetadata[K, V], R]): JavaRDD[R]

Create a RDD from Kafka using offset ranges for each topic and partition.
Create a RDD from Kafka using offset ranges for each topic and partition. This allows you specify the Kafka leader to connect to (to optimize fetching) and access the message as well as the metadata.
K
type of Kafka message key
V
type of Kafka message value
KD
type of Kafka message key decoder
VD
type of Kafka message value decoder
R
type returned by messageHandler
jsc
JavaSparkContext object
kafkaParams
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
offsetRanges
Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition
leaders
Kafka brokers for each TopicAndPartition in offsetRanges. May be an empty map, in which case leaders will be looked up on the driver.
messageHandler
Function for translating each message and metadata into the desired type
returns
RDD of R
def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](jsc: JavaSparkContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange]): JavaPairRDD[K, V]

Create a RDD from Kafka using offset ranges for each topic and partition.
Create a RDD from Kafka using offset ranges for each topic and partition.
K
type of Kafka message key
V
type of Kafka message value
KD
type of Kafka message key decoder
VD
type of Kafka message value decoder
jsc
JavaSparkContext object
keyClass
type of Kafka message key
valueClass
type of Kafka message value
keyDecoderClass
type of Kafka message key decoder
valueDecoderClass
type of Kafka message value decoder
kafkaParams
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
offsetRanges
Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition
returns
RDD of (Kafka message key, Kafka message value)
def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V], R](sc: SparkContext, kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange], leaders: Map[TopicAndPartition, Broker], messageHandler: (MessageAndMetadata[K, V]) ⇒ R)(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD], arg4: ClassTag[R]): RDD[R]

Create a RDD from Kafka using offset ranges for each topic and partition.
Create a RDD from Kafka using offset ranges for each topic and partition. This allows you specify the Kafka leader to connect to (to optimize fetching) and access the message as well as the metadata.
K
type of Kafka message key
V
type of Kafka message value
KD
type of Kafka message key decoder
VD
type of Kafka message value decoder
R
type returned by messageHandler
sc
SparkContext object
kafkaParams
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
offsetRanges
Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition
leaders
Kafka brokers for each TopicAndPartition in offsetRanges. May be an empty map, in which case leaders will be looked up on the driver.
messageHandler
Function for translating each message and metadata into the desired type
returns
RDD of R
def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](sc: SparkContext, kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange])(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD]): RDD[(K, V)]

Create a RDD from Kafka using offset ranges for each topic and partition.
Create a RDD from Kafka using offset ranges for each topic and partition.
K
type of Kafka message key
V
type of Kafka message value
KD
type of Kafka message key decoder
VD
type of Kafka message value decoder
sc
SparkContext object
kafkaParams
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
offsetRanges
Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition
returns
RDD of (Kafka message key, Kafka message value)
def createStream[K, V, U <: Decoder[_], T <: Decoder[_]](jssc: JavaStreamingContext, keyTypeClass: Class[K], valueTypeClass: Class[V], keyDecoderClass: Class[U], valueDecoderClass: Class[T], kafkaParams: Map[String, String], topics: Map[String, Integer], storageLevel: StorageLevel): JavaPairReceiverInputDStream[K, V]

Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers.
K
type of Kafka message key
V
type of Kafka message value
U
type of Kafka message key decoder
T
type of Kafka message value decoder
jssc
JavaStreamingContext object
keyTypeClass
Key type of DStream
valueTypeClass
value type of Dstream
keyDecoderClass
Type of kafka key decoder
valueDecoderClass
Type of kafka value decoder
kafkaParams
Map of kafka configuration parameters, see http://kafka.apache.org/08/configuration.html
topics
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread
storageLevel
RDD storage level.
returns
DStream of (Kafka message key, Kafka message value)
def createStream(jssc: JavaStreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Integer], storageLevel: StorageLevel): JavaPairReceiverInputDStream[String, String]

Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers.
jssc
JavaStreamingContext object
zkQuorum
Zookeeper quorum (hostname:port,hostname:port,..).
groupId
The group id for this consumer.
topics
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread.
storageLevel
RDD storage level.
returns
DStream of (Kafka message key, Kafka message value)
def createStream(jssc: JavaStreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Integer]): JavaPairReceiverInputDStream[String, String]

Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers. Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.
jssc
JavaStreamingContext object
zkQuorum
Zookeeper quorum (hostname:port,hostname:port,..)
groupId
The group id for this consumer
topics
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread
returns
DStream of (Kafka message key, Kafka message value)
def createStream[K, V, U <: Decoder[_], T <: Decoder[_]](ssc: StreamingContext, kafkaParams: Map[String, String], topics: Map[String, Int], storageLevel: StorageLevel)(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[U], arg3: ClassTag[T]): ReceiverInputDStream[(K, V)]

Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers.
K
type of Kafka message key
V
type of Kafka message value
U
type of Kafka message key decoder
T
type of Kafka message value decoder
ssc
StreamingContext object
kafkaParams
Map of kafka configuration parameters, see http://kafka.apache.org/08/configuration.html
topics
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread.
storageLevel
Storage level to use for storing the received objects
returns
DStream of (Kafka message key, Kafka message value)
def createStream(ssc: StreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Int], storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2): ReceiverInputDStream[(String, String)]

Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers.
ssc
StreamingContext object
zkQuorum
Zookeeper quorum (hostname:port,hostname:port,..)
groupId
The group id for this consumer
topics
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread
storageLevel
Storage level to use for storing the received objects (default: StorageLevel.MEMORY_AND_DISK_SER_2)
returns
DStream of (Kafka message key, Kafka message value)
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Doc: package kafka

object KafkaUtils

Value Members

final def !=(arg0: Any): Boolean

final def ##(): Int

final def ==(arg0: Any): Boolean

final def asInstanceOf[T0]: T0

def clone(): AnyRef

def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V]](jssc: JavaStreamingContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], kafkaParams: Map[String, String], topics: Set[String]): JavaPairInputDStream[K, V]

def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V]](ssc: StreamingContext, kafkaParams: Map[String, String], topics: Set[String])(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD]): InputDStream[(K, V)]

def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](jsc: JavaSparkContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange]): JavaPairRDD[K, V]

def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](sc: SparkContext, kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange])(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD]): RDD[(K, V)]

def createStream(jssc: JavaStreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Integer], storageLevel: StorageLevel): JavaPairReceiverInputDStream[String, String]

def createStream(jssc: JavaStreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Integer]): JavaPairReceiverInputDStream[String, String]

def createStream[K, V, U <: Decoder[_], T <: Decoder[_]](ssc: StreamingContext, kafkaParams: Map[String, String], topics: Map[String, Int], storageLevel: StorageLevel)(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[U], arg3: ClassTag[T]): ReceiverInputDStream[(K, V)]

def createStream(ssc: StreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Int], storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2): ReceiverInputDStream[(String, String)]

final def eq(arg0: AnyRef): Boolean

def equals(arg0: Any): Boolean

def finalize(): Unit

final def getClass(): Class[_]

def hashCode(): Int

final def isInstanceOf[T0]: Boolean

final def ne(arg0: AnyRef): Boolean

final def notify(): Unit

final def notifyAll(): Unit

final def synchronized[T0](arg0: ⇒ T0): T0

def toString(): String

final def wait(): Unit

final def wait(arg0: Long, arg1: Int): Unit

final def wait(arg0: Long): Unit

Inherited from AnyRef

Inherited from Any

Ungrouped