pyspark.RDD.countByValue#

RDD.countByValue()[source]#

Return the count of each unique value in this RDD as a dictionary of (value, count) pairs.

New in version 0.7.0.

Returns
    dict
        a dictionary of (value, count) pairs

See also

RDD.collectAsMap()
RDD.countByKey()

Examples

>>> sorted(sc.parallelize([1, 2, 1, 2, 2], 2).countByValue().items())
[(1, 2), (2, 3)]
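Because countByValue() returns a plain Python dict to the driver, its result behaves exactly like a frequency count of the RDD's elements. A minimal driver-side sketch of that semantics, using collections.Counter on an ordinary list standing in for the RDD's data (no SparkContext required; with Spark the distributed equivalent would be something like rdd.map(lambda x: (x, 1)).reduceByKey(operator.add).collectAsMap()):

```python
from collections import Counter

# The same sample data as the RDD in the example above,
# held as a plain list on the driver.
data = [1, 2, 1, 2, 2]

# Counter tallies occurrences of each unique value, mirroring
# what countByValue() returns as a (value, count) dictionary.
counts = dict(Counter(data))

print(sorted(counts.items()))  # [(1, 2), (2, 3)]
```

As with any action that collects results to the driver, this pattern is only practical when the number of distinct values is small enough to fit in driver memory.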