Kafka-Spark Batch Streaming: WARN clients.NetworkClient: Bootstrap broker disconnected

Kafka-Spark Batch Streaming: WARN clients.NetworkClient: Bootstrap broker disconnected



I am trying to write the rows of a Dataframe into a Kafka topic. The kafka cluster is Kerberized and I am providing the jaas.conf in the --conf arguments to be able to authenticate and connect to the cluster. Below is my code:


Dataframe


Kafka topic


object app
val conf = new SparkConf().setAppName("Kerberos kafka ")
val spark = SparkSession.builder().config(conf).enableHiveSupport().getOrCreate()
System.setProperty("java.security.auth.login.config", "path to jaas.conf")
spark.sparkContext.setLogLevel("ERROR")
def main(args: Array[String]): Unit =
val test= spark.sql("select * from testing.test")
test.show()
println("publishing to kafka...")
val test_final = test.selectExpr("cast(to_json(struct(*)) as string) AS value")
test_final .show()
test_final.write.format("kafka")
.option("kafka.bootstrap.servers","XXXXXXXXX:9093")
.option("topic", "test")
.option("security.protocol", "SASL_SSL")
.option("sasl.kerberos.service.name","kafka")
.save()





When I run the above code, it fails with this error:
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.


org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.



When I looked into the error logs on executors, I see this:


18/08/20 22:06:05 INFO producer.ProducerConfig: ProducerConfig values:
compression.type = none
metric.reporters =
metadata.max.age.ms = 300000
metadata.fetch.timeout.ms = 60000
reconnect.backoff.ms = 50
sasl.kerberos.ticket.renew.window.factor = 0.8
bootstrap.servers = [xxxxxxxxx:9093]
retry.backoff.ms = 100
sasl.kerberos.kinit.cmd = /usr/bin/kinit
buffer.memory = 33554432
timeout.ms = 30000
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
ssl.keystore.type = JKS
ssl.trustmanager.algorithm = PKIX
block.on.buffer.full = false
ssl.key.password = null
max.block.ms = 60000
sasl.kerberos.min.time.before.relogin = 60000
connections.max.idle.ms = 540000
ssl.truststore.password = null
max.in.flight.requests.per.connection = 5
metrics.num.samples = 2
client.id =
ssl.endpoint.identification.algorithm = null
ssl.protocol = TLS
request.timeout.ms = 30000
ssl.provider = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
acks = 1
batch.size = 16384
ssl.keystore.location = null
receive.buffer.bytes = 32768
ssl.cipher.suites = null
ssl.truststore.type = JKS
**security.protocol = PLAINTEXT**
retries = 0
max.request.size = 1048576
value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
ssl.truststore.location = null
ssl.keystore.password = null
ssl.keymanager.algorithm = SunX509
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
send.buffer.bytes = 131072
linger.ms = 0

18/08/20 22:06:05 **INFO utils.AppInfoParser: Kafka version : 0.9.0-kafka-2.0.2**
18/08/20 22:06:05 INFO utils.AppInfoParser: Kafka commitId : unknown
18/08/20 22:06:05 INFO datasources.FileScanRDD: Reading File path: hdfs://nameservice1/user/test5/dt=2017-08-04/5a8bb121-3cab-4bed-a32b-9d0fae4a4e8b.parquet, range: 0-142192, partition values: [2017-08-04]
18/08/20 22:06:05 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 4
18/08/20 22:06:05 INFO memory.MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 33.9 KB, free 5.2 GB)
18/08/20 22:06:05 INFO broadcast.TorrentBroadcast: Reading broadcast variable 4 took 224 ms
18/08/20 22:06:05 INFO memory.MemoryStore: Block broadcast_4 stored as values in memory (estimated size 472.2 KB, free 5.2 GB)
18/08/20 22:06:06 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:06 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:07 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:07 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:07 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:08 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4
18/08/20 22:06:08 INFO executor.Executor: Running task 1.0 in stage 2.0 (TID 4)
18/08/20 22:06:08 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:08 INFO datasources.FileScanRDD: Reading File path: hdfs://nameservice1/user/test5/dt=2017-08-10/2175e5d9-e969-41e9-8aa2-f329b5df06bf.parquet, range: 0-77484, partition values: [2017-08-10]
18/08/20 22:06:08 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:09 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:09 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:10 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected
18/08/20 22:06:10 WARN clients.NetworkClient: Bootstrap broker xxxxxxxxx:9093:9093 disconnected



In this above log, I see three entries that are conflicting:



security.protocol = PLAINTEXT



sasl.kerberos.service.name = null



INFO utils.AppInfoParser: Kafka version : 0.9.0-kafka-2.0.2



I am setting the security.protocol and sasl.kerberos.service.name values in my test_final.write..... Does that mean the configs are not being passed? The Kafka dependency I am using in my jar is:


security.protocol


sasl.kerberos.service.name


test_final.write....


<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.11</artifactId>
<version>0.10.2.1</version>
</dependency>



Does the 0.10.2.1 version conflict with the 0.9.0-kafka-2.0.2 ? and could this be causing the issue?


0.10.2.1


0.9.0-kafka-2.0.2



Here is my jaas.conf:


/* $Id$ */

kinit
com.sun.security.auth.module.Krb5LoginModule required;
;

KafkaClient
com.sun.security.auth.module.Krb5LoginModule required
doNotPrompt=true
useTicketCache=true
useKeyTab=true
principal="user@CORP.COM"
serviceName="kafka"
keyTab="/data/home/keytabs/user.keytab"
client=true;
;



Here is my spark-submit command:


spark2-submit --master yarn --class app --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=path to jaas.conf" --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=path to jaas.conf" --files path to jaas.conf --conf "spark.driver.extraClassPath=path to spark-sql-kafka-0-10_2.11-2.2.0.jar" --conf "spark.executor.extraClassPath=path to spark-sql-kafka-0-10_2.11-2.2.0.jar" --num-executors 2 --executor-cores 4 --executor-memory 10g --driver-memory 5g ./KerberizedKafkaConnect-1.0-SNAPSHOT-shaded.jar



Any help would be appreciated. Thank you!









By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

ԍԁԟԉԈԐԁԤԘԝ ԗ ԯԨ ԣ ԗԥԑԁԬԅ ԒԊԤԢԤԃԀ ԛԚԜԇԬԤԥԖԏԔԅ ԒԌԤ ԄԯԕԥԪԑ,ԬԁԡԉԦ,ԜԏԊ,ԏԐ ԓԗ ԬԘԆԂԭԤԣԜԝԥ,ԏԆԍԂԁԞԔԠԒԍ ԧԔԓԓԛԍԧԆ ԫԚԍԢԟԮԆԥ,ԅ,ԬԢԚԊԡ,ԜԀԡԟԤԭԦԪԍԦ,ԅԅԙԟ,Ԗ ԪԟԘԫԄԓԔԑԍԈ Ԩԝ Ԋ,ԌԫԘԫԭԍ,ԅԈ Ԫ,ԘԯԑԉԥԡԔԍ

How to change the default border color of fbox? [duplicate]

Henj