Object

org.apache.spark.streaming.kafka

KafkaUtils

Related Doc: package kafka

Permalink

object KafkaUtils

Source
KafkaUtils.scala
Linear Supertypes
AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. KafkaUtils
  2. AnyRef
  3. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Value Members

  1. final def !=(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Permalink
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Permalink
    Definition Classes
    Any
  5. def clone(): AnyRef

    Permalink
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V]](jssc: JavaStreamingContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], kafkaParams: Map[String, String], topics: Set[String]): JavaPairInputDStream[K, V]

    Permalink

    Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.

    Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).

    Points to note:

    • No receivers: This stream does not use any receiver. It directly queries Kafka
    • Offsets: This does not use Zookeeper to store offsets. The consumed offsets are tracked by the stream itself. For interoperability with Kafka monitoring tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself from the streaming application. You can access the offsets used in each batch from the generated RDDs (see org.apache.spark.streaming.kafka.HasOffsetRanges).
    • Failure Recovery: To recover from driver failures, you have to enable checkpointing in the StreamingContext. The information on consumed offset can be recovered from the checkpoint. See the programming guide for details (constraints, etc.).
    • End-to-end semantics: This stream ensures that every records is effectively received and transformed exactly once, but gives no guarantees on whether the transformed data are outputted exactly once. For end-to-end exactly-once semantics, you have to either ensure that the output operation is idempotent, or use transactions to output records atomically. See the programming guide for more details.
    K

    type of Kafka message key

    V

    type of Kafka message value

    KD

    type of Kafka message key decoder

    VD

    type of Kafka message value decoder

    jssc

    JavaStreamingContext object

    keyClass

    Class of the keys in the Kafka records

    valueClass

    Class of the values in the Kafka records

    keyDecoderClass

    Class of the key decoder

    valueDecoderClass

    Class type of the value decoder

    kafkaParams

    Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form. If not starting from a checkpoint, "auto.offset.reset" may be set to "largest" or "smallest" to determine where the stream starts (defaults to "largest")

    topics

    Names of the topics to consume

    returns

    DStream of (Kafka message key, Kafka message value)

  7. def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V], R](jssc: JavaStreamingContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], recordClass: Class[R], kafkaParams: Map[String, String], fromOffsets: Map[TopicAndPartition, Long], messageHandler: Function[MessageAndMetadata[K, V], R]): JavaInputDStream[R]

    Permalink

    Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.

    Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).

    Points to note:

    • No receivers: This stream does not use any receiver. It directly queries Kafka
    • Offsets: This does not use Zookeeper to store offsets. The consumed offsets are tracked by the stream itself. For interoperability with Kafka monitoring tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself from the streaming application. You can access the offsets used in each batch from the generated RDDs (see org.apache.spark.streaming.kafka.HasOffsetRanges).
    • Failure Recovery: To recover from driver failures, you have to enable checkpointing in the StreamingContext. The information on consumed offset can be recovered from the checkpoint. See the programming guide for details (constraints, etc.).
    • End-to-end semantics: This stream ensures that every records is effectively received and transformed exactly once, but gives no guarantees on whether the transformed data are outputted exactly once. For end-to-end exactly-once semantics, you have to either ensure that the output operation is idempotent, or use transactions to output records atomically. See the programming guide for more details.
    K

    type of Kafka message key

    V

    type of Kafka message value

    KD

    type of Kafka message key decoder

    VD

    type of Kafka message value decoder

    R

    type returned by messageHandler

    jssc

    JavaStreamingContext object

    keyClass

    Class of the keys in the Kafka records

    valueClass

    Class of the values in the Kafka records

    keyDecoderClass

    Class of the key decoder

    valueDecoderClass

    Class of the value decoder

    recordClass

    Class of the records in DStream

    kafkaParams

    Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form.

    fromOffsets

    Per-topic/partition Kafka offsets defining the (inclusive) starting point of the stream

    messageHandler

    Function for translating each message and metadata into the desired type

    returns

    DStream of R

  8. def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V]](ssc: StreamingContext, kafkaParams: Map[String, String], topics: Set[String])(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD]): InputDStream[(K, V)]

    Permalink

    Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.

    Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).

    Points to note:

    • No receivers: This stream does not use any receiver. It directly queries Kafka
    • Offsets: This does not use Zookeeper to store offsets. The consumed offsets are tracked by the stream itself. For interoperability with Kafka monitoring tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself from the streaming application. You can access the offsets used in each batch from the generated RDDs (see org.apache.spark.streaming.kafka.HasOffsetRanges).
    • Failure Recovery: To recover from driver failures, you have to enable checkpointing in the StreamingContext. The information on consumed offset can be recovered from the checkpoint. See the programming guide for details (constraints, etc.).
    • End-to-end semantics: This stream ensures that every records is effectively received and transformed exactly once, but gives no guarantees on whether the transformed data are outputted exactly once. For end-to-end exactly-once semantics, you have to either ensure that the output operation is idempotent, or use transactions to output records atomically. See the programming guide for more details.
    K

    type of Kafka message key

    V

    type of Kafka message value

    KD

    type of Kafka message key decoder

    VD

    type of Kafka message value decoder

    ssc

    StreamingContext object

    kafkaParams

    Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form. If not starting from a checkpoint, "auto.offset.reset" may be set to "largest" or "smallest" to determine where the stream starts (defaults to "largest")

    topics

    Names of the topics to consume

    returns

    DStream of (Kafka message key, Kafka message value)

  9. def createDirectStream[K, V, KD <: Decoder[K], VD <: Decoder[V], R](ssc: StreamingContext, kafkaParams: Map[String, String], fromOffsets: Map[TopicAndPartition, Long], messageHandler: (MessageAndMetadata[K, V]) ⇒ R)(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD], arg4: ClassTag[R]): InputDStream[R]

    Permalink

    Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.

    Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).

    Points to note:

    • No receivers: This stream does not use any receiver. It directly queries Kafka
    • Offsets: This does not use Zookeeper to store offsets. The consumed offsets are tracked by the stream itself. For interoperability with Kafka monitoring tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself from the streaming application. You can access the offsets used in each batch from the generated RDDs (see org.apache.spark.streaming.kafka.HasOffsetRanges).
    • Failure Recovery: To recover from driver failures, you have to enable checkpointing in the StreamingContext. The information on consumed offset can be recovered from the checkpoint. See the programming guide for details (constraints, etc.).
    • End-to-end semantics: This stream ensures that every records is effectively received and transformed exactly once, but gives no guarantees on whether the transformed data are outputted exactly once. For end-to-end exactly-once semantics, you have to either ensure that the output operation is idempotent, or use transactions to output records atomically. See the programming guide for more details.
    K

    type of Kafka message key

    V

    type of Kafka message value

    KD

    type of Kafka message key decoder

    VD

    type of Kafka message value decoder

    R

    type returned by messageHandler

    ssc

    StreamingContext object

    kafkaParams

    Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.

    fromOffsets

    Per-topic/partition Kafka offsets defining the (inclusive) starting point of the stream

    messageHandler

    Function for translating each message and metadata into the desired type

    returns

    DStream of R

  10. def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V], R](jsc: JavaSparkContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], recordClass: Class[R], kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange], leaders: Map[TopicAndPartition, Broker], messageHandler: Function[MessageAndMetadata[K, V], R]): JavaRDD[R]

    Permalink

    Create a RDD from Kafka using offset ranges for each topic and partition.

    Create a RDD from Kafka using offset ranges for each topic and partition. This allows you specify the Kafka leader to connect to (to optimize fetching) and access the message as well as the metadata.

    K

    type of Kafka message key

    V

    type of Kafka message value

    KD

    type of Kafka message key decoder

    VD

    type of Kafka message value decoder

    R

    type returned by messageHandler

    jsc

    JavaSparkContext object

    kafkaParams

    Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.

    offsetRanges

    Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition

    leaders

    Kafka brokers for each TopicAndPartition in offsetRanges. May be an empty map, in which case leaders will be looked up on the driver.

    messageHandler

    Function for translating each message and metadata into the desired type

    returns

    RDD of R

  11. def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](jsc: JavaSparkContext, keyClass: Class[K], valueClass: Class[V], keyDecoderClass: Class[KD], valueDecoderClass: Class[VD], kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange]): JavaPairRDD[K, V]

    Permalink

    Create a RDD from Kafka using offset ranges for each topic and partition.

    Create a RDD from Kafka using offset ranges for each topic and partition.

    K

    type of Kafka message key

    V

    type of Kafka message value

    KD

    type of Kafka message key decoder

    VD

    type of Kafka message value decoder

    jsc

    JavaSparkContext object

    keyClass

    type of Kafka message key

    valueClass

    type of Kafka message value

    keyDecoderClass

    type of Kafka message key decoder

    valueDecoderClass

    type of Kafka message value decoder

    kafkaParams

    Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.

    offsetRanges

    Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition

    returns

    RDD of (Kafka message key, Kafka message value)

  12. def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V], R](sc: SparkContext, kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange], leaders: Map[TopicAndPartition, Broker], messageHandler: (MessageAndMetadata[K, V]) ⇒ R)(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD], arg4: ClassTag[R]): RDD[R]

    Permalink

    Create a RDD from Kafka using offset ranges for each topic and partition.

    Create a RDD from Kafka using offset ranges for each topic and partition. This allows you specify the Kafka leader to connect to (to optimize fetching) and access the message as well as the metadata.

    K

    type of Kafka message key

    V

    type of Kafka message value

    KD

    type of Kafka message key decoder

    VD

    type of Kafka message value decoder

    R

    type returned by messageHandler

    sc

    SparkContext object

    kafkaParams

    Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.

    offsetRanges

    Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition

    leaders

    Kafka brokers for each TopicAndPartition in offsetRanges. May be an empty map, in which case leaders will be looked up on the driver.

    messageHandler

    Function for translating each message and metadata into the desired type

    returns

    RDD of R

  13. def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](sc: SparkContext, kafkaParams: Map[String, String], offsetRanges: Array[OffsetRange])(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[KD], arg3: ClassTag[VD]): RDD[(K, V)]

    Permalink

    Create a RDD from Kafka using offset ranges for each topic and partition.

    Create a RDD from Kafka using offset ranges for each topic and partition.

    K

    type of Kafka message key

    V

    type of Kafka message value

    KD

    type of Kafka message key decoder

    VD

    type of Kafka message value decoder

    sc

    SparkContext object

    kafkaParams

    Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.

    offsetRanges

    Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition

    returns

    RDD of (Kafka message key, Kafka message value)

  14. def createStream[K, V, U <: Decoder[_], T <: Decoder[_]](jssc: JavaStreamingContext, keyTypeClass: Class[K], valueTypeClass: Class[V], keyDecoderClass: Class[U], valueDecoderClass: Class[T], kafkaParams: Map[String, String], topics: Map[String, Integer], storageLevel: StorageLevel): JavaPairReceiverInputDStream[K, V]

    Permalink

    Create an input stream that pulls messages from Kafka Brokers.

    Create an input stream that pulls messages from Kafka Brokers.

    K

    type of Kafka message key

    V

    type of Kafka message value

    U

    type of Kafka message key decoder

    T

    type of Kafka message value decoder

    jssc

    JavaStreamingContext object

    keyTypeClass

    Key type of DStream

    valueTypeClass

    value type of Dstream

    keyDecoderClass

    Type of kafka key decoder

    valueDecoderClass

    Type of kafka value decoder

    kafkaParams

    Map of kafka configuration parameters, see http://kafka.apache.org/08/configuration.html

    topics

    Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread

    storageLevel

    RDD storage level.

    returns

    DStream of (Kafka message key, Kafka message value)

  15. def createStream(jssc: JavaStreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Integer], storageLevel: StorageLevel): JavaPairReceiverInputDStream[String, String]

    Permalink

    Create an input stream that pulls messages from Kafka Brokers.

    Create an input stream that pulls messages from Kafka Brokers.

    jssc

    JavaStreamingContext object

    zkQuorum

    Zookeeper quorum (hostname:port,hostname:port,..).

    groupId

    The group id for this consumer.

    topics

    Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread.

    storageLevel

    RDD storage level.

    returns

    DStream of (Kafka message key, Kafka message value)

  16. def createStream(jssc: JavaStreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Integer]): JavaPairReceiverInputDStream[String, String]

    Permalink

    Create an input stream that pulls messages from Kafka Brokers.

    Create an input stream that pulls messages from Kafka Brokers. Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.

    jssc

    JavaStreamingContext object

    zkQuorum

    Zookeeper quorum (hostname:port,hostname:port,..)

    groupId

    The group id for this consumer

    topics

    Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread

    returns

    DStream of (Kafka message key, Kafka message value)

  17. def createStream[K, V, U <: Decoder[_], T <: Decoder[_]](ssc: StreamingContext, kafkaParams: Map[String, String], topics: Map[String, Int], storageLevel: StorageLevel)(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[U], arg3: ClassTag[T]): ReceiverInputDStream[(K, V)]

    Permalink

    Create an input stream that pulls messages from Kafka Brokers.

    Create an input stream that pulls messages from Kafka Brokers.

    K

    type of Kafka message key

    V

    type of Kafka message value

    U

    type of Kafka message key decoder

    T

    type of Kafka message value decoder

    ssc

    StreamingContext object

    kafkaParams

    Map of kafka configuration parameters, see http://kafka.apache.org/08/configuration.html

    topics

    Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread.

    storageLevel

    Storage level to use for storing the received objects

    returns

    DStream of (Kafka message key, Kafka message value)

  18. def createStream(ssc: StreamingContext, zkQuorum: String, groupId: String, topics: Map[String, Int], storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2): ReceiverInputDStream[(String, String)]

    Permalink

    Create an input stream that pulls messages from Kafka Brokers.

    Create an input stream that pulls messages from Kafka Brokers.

    ssc

    StreamingContext object

    zkQuorum

    Zookeeper quorum (hostname:port,hostname:port,..)

    groupId

    The group id for this consumer

    topics

    Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread

    storageLevel

    Storage level to use for storing the received objects (default: StorageLevel.MEMORY_AND_DISK_SER_2)

    returns

    DStream of (Kafka message key, Kafka message value)

  19. final def eq(arg0: AnyRef): Boolean

    Permalink
    Definition Classes
    AnyRef
  20. def equals(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  21. def finalize(): Unit

    Permalink
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  22. final def getClass(): Class[_]

    Permalink
    Definition Classes
    AnyRef → Any
  23. def hashCode(): Int

    Permalink
    Definition Classes
    AnyRef → Any
  24. final def isInstanceOf[T0]: Boolean

    Permalink
    Definition Classes
    Any
  25. final def ne(arg0: AnyRef): Boolean

    Permalink
    Definition Classes
    AnyRef
  26. final def notify(): Unit

    Permalink
    Definition Classes
    AnyRef
  27. final def notifyAll(): Unit

    Permalink
    Definition Classes
    AnyRef
  28. final def synchronized[T0](arg0: ⇒ T0): T0

    Permalink
    Definition Classes
    AnyRef
  29. def toString(): String

    Permalink
    Definition Classes
    AnyRef → Any
  30. final def wait(): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long, arg1: Int): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Inherited from AnyRef

Inherited from Any

Ungrouped