Get Your Apache Camel™ Kafka Connectors in a Row

A long camel train on the beach in Broome, Western Australia

(Source: Adobe Stock by scottimage)

Recently I wrote a blog called “Running the Apache Camel HTTP Kafka Source Connector on Instaclustr Managed Apache Kafka®” (check it out here). There are a lot of Apache Camel Kafka Connectors available, and I wondered how to “get your Camels in a row”. That is, how do you understand which combinations of Apache Camel Kafka connectors are possible and form a useful “camel train”?

Which technologies are supported by Apache Camel Kafka source connectors only, sink connectors only, or ideally, both source and sink connectors?  

Technologies with both source and sink connectors make it easy to get data into Kafka and out again. In practice, the advantage also runs the other way: it’s easy to use Kafka to write data to the sink/target technology and then read it back into Kafka with a source connector.

Technologies with both types of connectors are more likely to work correctly in a data pipeline, because the connectors take into account the target technology’s native data formats, types, APIs, and so on.

Here’s a reminder of how Kafka Connect works with heterogeneous source and sink technologies: 

Here are my initial results (all errors are mine!)—a table showing Kafka source connectors in the left-hand column and sink connectors in the right-hand column. Technologies (53) with both source and sink connectors are highlighted in green:

Source Sink
aws-cloudtrail-source
aws-cloudwatch-sink
aws-ddb-experimental-sink
aws-ddb-sink
aws-ddb-streams-source
aws-ec2-sink
aws-eventbridge-sink
aws-kinesis-firehose-sink
aws-kinesis-source aws-kinesis-sink
aws-lambda-sink
aws-redshift-source aws-redshift-sink
aws-s3-cdc-source
aws-s3-experimental-source
aws-s3-source aws-s3-sink
aws-s3-streaming-upload-sink
aws-secrets-manager-sink
aws-ses-sink
aws-sns-fifo-sink
aws-sns-sink
aws-sqs-batch-sink
aws-sqs-fifo-sink
aws-sqs-source aws-sqs-sink
aws2-iam
aws2-kms
azure-cosmosdb-source
azure-eventhubs-source azure-eventhubs-sink
azure-functions-sink
azure-servicebus-source azure-servicebus-sink
azure-storage-blob-cdc-source
azure-storage-blob-changefeed-source
azure-storage-blob-source azure-storage-blob-sink
azure-storage-queue-source azure-storage-queue-sink
beer-source
bitcoin-source
cassandra-source cassandra-sink
ceph-source ceph-sink
chuck-norris-source
couchbase-sink
cron-source
cxf cxf
cxfrs cxfrs
dropbox-source dropbox-sink
earthquake-source
elasticsearch-search-source elasticsearch-index-sink
exec-sink
fhir-source fhir-sink
file file
file-watch-source
ftp-source ftp-sink
ftps-source ftps-sink
github-commit-source
github-event-source
github-pullrequest-comment-source
github-pullrequest-source
github-tag-source
google-bigquery-sink
google-calendar-source
google-functions-sink
google-mail-source
google-pubsub-source google-pubsub-sink
google-sheets-source
google-storage-cdc-source
google-storage-source google-storage-sink
hdfs hdfs
http-secured-source http-secured-sink
http-source http-sink
https
infinispan-source infinispan-sink
jdbc
jira-add-comment-sink
jira-add-issue-sink
jira-oauth-source
jira-source
jira-transition-issue-sink
jira-update-issue-sink
jms-amqp-10-source jms-amqp-10-sink
jms-apache-activemq-source jms-apache-activemq-sink
jms-apache-artemis-source jms-apache-artemis-sink
jms-ibm-mq-source jms-ibm-mq-sink
kafka-not-secured-source kafka-not-secured-sink
kafka-source kafka-sink
kafka-ssl-source kafka-ssl-sink
kubernetes-namespaces-source
kubernetes-nodes-source
kubernetes-pods-source
log-sink
mail-imap-source
mail-sink
mariadb-source mariadb-sink
minio-source minio-sink
mongodb-changes-stream-source
mongodb-source mongodb-sink
mqtt-source mqtt-sink
mqtt5-source mqtt5-sink
mysql-source mysql-sink
nats-source nats-sink
netty-http netty-http
netty netty
oracle-database-source oracle-database-sink
postgresql-source postgresql-sink
pulsar-source pulsar-sink
rabbitmq-source
redis-source redis-sink
rest-openapi-sink
salesforce-create-sink
salesforce-delete-sink
salesforce-source salesforce-update-sink
scp-sink
sftp-source sftp-sink
sjms2 sjms2
slack-source slack-sink
solr-source solr-sink
splunk-hec-sink
splunk-source splunk-sink
sqlserver-source sqlserver-sink
ssh-source ssh-sink
syslog syslog
telegram-source telegram-sink
timer-source
twitter-directmessage-source
twitter-search-source
twitter-timeline-source
webhook-source
websocket-source
wttrin-source

I’ve left the -source and -sink suffixes in the names to help with understanding the connector roles. Note that some connectors don’t have source or sink in their names: rather than using a different connector for each role, you configure the same connector as either a source or a sink. Officially there are 172 Camel Connectors, or 179 if you count these dual-role connectors twice.
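The naming convention above can be turned into a quick first-pass classification. The following is a minimal sketch, not the method used to build the table: the function name `classify` and the short `sample` list are illustrative, and it assumes that a name with no suffix is a single connector usable in both roles. Because it matches on exact name prefixes, it will not pair connectors whose names differ between roles (for example, elasticsearch-search-source and elasticsearch-index-sink), so those still need a manual check.

```python
from collections import defaultdict


def classify(names):
    """Group connector names by technology and record which roles exist."""
    roles = defaultdict(set)
    for name in names:
        if name.endswith("-source"):
            roles[name[: -len("-source")]].add("source")
        elif name.endswith("-sink"):
            roles[name[: -len("-sink")]].add("sink")
        else:
            # Assumption: no suffix means one connector configured for either role
            roles[name].update({"source", "sink"})
    return roles


# A handful of names taken from the table above
sample = [
    "aws-kinesis-source", "aws-kinesis-sink", "aws-lambda-sink",
    "beer-source", "cxf",
    "elasticsearch-search-source", "elasticsearch-index-sink",
]

roles = classify(sample)
both = sorted(t for t, r in roles.items() if r == {"source", "sink"})
source_only = sorted(t for t, r in roles.items() if r == {"source"})
sink_only = sorted(t for t, r in roles.items() if r == {"sink"})

print("both:", both)
print("source only:", source_only)
print("sink only:", sink_only)
```

Running the same function over the full list of names gives a starting point for the three groups, which can then be adjusted by hand for the differently named pairs.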

The above table is the traditional Kafka-centric view of the source and sink connectors (writing to Kafka from a source technology, reading from Kafka to a sink technology). But as noted above, it’s also useful to put the non-Kafka technologies in the center, with Kafka writing to them (Kafka sinks) and reading from them (Kafka sources). In theory and in practice, the Kafka systems on the sink and source sides can be different (there are many use cases for multiple Kafka clusters):

So, in this table we’ve reversed the columns with the Sinks on the left:

Write to this technology from Kafka (Sink) Read from this technology to Kafka (Source)
aws-cloudtrail-source
aws-cloudwatch-sink
aws-ddb-experimental-sink
aws-ddb-sink
aws-ddb-streams-source
aws-ec2-sink
aws-eventbridge-sink
aws-kinesis-firehose-sink
aws-kinesis-sink aws-kinesis-source
aws-lambda-sink
aws-redshift-sink aws-redshift-source
aws-s3-cdc-source
aws-s3-experimental-source
aws-s3-sink aws-s3-source
aws-s3-streaming-upload-sink
aws-secrets-manager-sink
aws-ses-sink
aws-sns-fifo-sink
aws-sns-sink
aws-sqs-batch-sink
aws-sqs-fifo-sink
aws-sqs-sink aws-sqs-source
aws2-iam
aws2-kms
azure-cosmosdb-source
azure-eventhubs-sink azure-eventhubs-source
azure-functions-sink
azure-servicebus-sink azure-servicebus-source
azure-storage-blob-cdc-source
azure-storage-blob-changefeed-source
azure-storage-blob-sink azure-storage-blob-source
azure-storage-queue-sink azure-storage-queue-source
beer-source
bitcoin-source
cassandra-sink cassandra-source
ceph-sink ceph-source
chuck-norris-source
couchbase-sink
cron-source
cxf cxf
cxfrs cxfrs
dropbox-sink dropbox-source
earthquake-source
elasticsearch-index-sink elasticsearch-search-source
exec-sink
fhir-sink fhir-source
file file
file-watch-source
ftp-sink ftp-source
ftps-sink ftps-source
github-commit-source
github-event-source
github-pullrequest-comment-source
github-pullrequest-source
github-tag-source
google-bigquery-sink
google-calendar-source
google-functions-sink
google-mail-source
google-pubsub-sink google-pubsub-source
google-sheets-source
google-storage-cdc-source
google-storage-sink google-storage-source
hdfs hdfs
http-secured-sink http-secured-source
http-sink http-source
https
infinispan-sink infinispan-source
jdbc
jira-add-comment-sink
jira-add-issue-sink
jira-oauth-source
jira-source
jira-transition-issue-sink
jira-update-issue-sink
jms-amqp-10-sink jms-amqp-10-source
jms-apache-activemq-sink jms-apache-activemq-source
jms-apache-artemis-sink jms-apache-artemis-source
jms-ibm-mq-sink jms-ibm-mq-source
kafka-not-secured-sink kafka-not-secured-source
kafka-sink kafka-source
kafka-ssl-sink kafka-ssl-source
kubernetes-namespaces-source
kubernetes-nodes-source
kubernetes-pods-source
log-sink
mail-imap-source
mail-sink
mariadb-sink mariadb-source
minio-sink minio-source
mongodb-changes-stream-source
mongodb-sink mongodb-source
mqtt-sink mqtt-source
mqtt5-sink mqtt5-source
mysql-sink mysql-source
nats-sink nats-source
netty-http netty-http
netty netty
oracle-database-sink oracle-database-source
postgresql-sink postgresql-source
pulsar-sink pulsar-source
rabbitmq-source
redis-sink redis-source
rest-openapi-sink
salesforce-create-sink
salesforce-delete-sink
salesforce-update-sink salesforce-source
scp-sink
sftp-sink sftp-source
sjms2 sjms2
slack-sink slack-source
solr-sink solr-source
splunk-hec-sink
splunk-sink splunk-source
sqlserver-sink sqlserver-source
ssh-sink ssh-source
syslog syslog
telegram-sink telegram-source
timer-source
twitter-directmessage-source
twitter-search-source
twitter-timeline-source
webhook-source
websocket-source
wttrin-source

I was also curious to see which of Instaclustr’s Managed Services have matching Apache Camel Kafka connectors (noting that all of them work with Kafka® Connect), and here’s a blue highlighted table with the answers:

Source Sink
aws-cloudtrail-source
aws-cloudwatch-sink
aws-ddb-experimental-sink
aws-ddb-sink
aws-ddb-streams-source
aws-ec2-sink
aws-eventbridge-sink
aws-kinesis-firehose-sink
aws-kinesis-source aws-kinesis-sink
aws-lambda-sink
aws-redshift-source aws-redshift-sink
aws-s3-cdc-source
aws-s3-experimental-source
aws-s3-source aws-s3-sink
aws-s3-streaming-upload-sink
aws-secrets-manager-sink
aws-ses-sink
aws-sns-fifo-sink
aws-sns-sink
aws-sqs-batch-sink
aws-sqs-fifo-sink
aws-sqs-source aws-sqs-sink
aws2-iam
aws2-kms
azure-cosmosdb-source
azure-eventhubs-source azure-eventhubs-sink
azure-functions-sink
azure-servicebus-source azure-servicebus-sink
azure-storage-blob-cdc-source
azure-storage-blob-changefeed-source
azure-storage-blob-source azure-storage-blob-sink
azure-storage-queue-source azure-storage-queue-sink
beer-source
bitcoin-source
cassandra-source cassandra-sink
ceph-source ceph-sink
chuck-norris-source
couchbase-sink
cron-source
cxf cxf
cxfrs cxfrs
dropbox-source dropbox-sink
earthquake-source
elasticsearch-search-source elasticsearch-index-sink
exec-sink
fhir-source fhir-sink
file file
file-watch-source
ftp-source ftp-sink
ftps-source ftps-sink
github-commit-source
github-event-source
github-pullrequest-comment-source
github-pullrequest-source
github-tag-source
google-bigquery-sink
google-calendar-source
google-functions-sink
google-mail-source
google-pubsub-source google-pubsub-sink
google-sheets-source
google-storage-cdc-source
google-storage-source google-storage-sink
hdfs hdfs
http-secured-source http-secured-sink
http-source http-sink
https
infinispan-source infinispan-sink
jdbc
jira-add-comment-sink
jira-add-issue-sink
jira-oauth-source
jira-source
jira-transition-issue-sink
jira-update-issue-sink
jms-amqp-10-source jms-amqp-10-sink
jms-apache-activemq-source jms-apache-activemq-sink
jms-apache-artemis-source jms-apache-artemis-sink
jms-ibm-mq-source jms-ibm-mq-sink
kafka-not-secured-source kafka-not-secured-sink
kafka-source kafka-sink
kafka-ssl-source kafka-ssl-sink
kubernetes-namespaces-source
kubernetes-nodes-source
kubernetes-pods-source
log-sink
mail-imap-source
mail-sink
mariadb-source mariadb-sink
minio-source minio-sink
mongodb-changes-stream-source
mongodb-source mongodb-sink
mqtt-source mqtt-sink
mqtt5-source mqtt5-sink
mysql-source mysql-sink
nats-source nats-sink
netty-http netty-http
netty netty
oracle-database-source oracle-database-sink
postgresql-source postgresql-sink
pulsar-source pulsar-sink
rabbitmq-source
redis-source redis-sink
rest-openapi-sink
salesforce-create-sink
salesforce-delete-sink
salesforce-source salesforce-update-sink
scp-sink
sftp-source sftp-sink
sjms2 sjms2
slack-source slack-sink
solr-source solr-sink
splunk-hec-sink
splunk-source splunk-sink
sqlserver-source sqlserver-sink
ssh-source ssh-sink
syslog syslog
telegram-source telegram-sink
timer-source
twitter-directmessage-source
twitter-search-source
twitter-timeline-source
webhook-source
websocket-source
wttrin-source

This is pretty good coverage (except for OpenSearch®). Interestingly, there are also a few Kafka source and sink connectors. This is a bit “odd”, as Kafka connectors are designed to connect Kafka to some other technology, either writing to it or reading from it!

The Kafka source and Kafka sink connectors will presumably work identically: both read data from Kafka and write data back to Kafka. This could actually be useful, and it is what Kafka stream processors do, too.

There could be some use cases for this (e.g., moving data between topics, applying single message transforms to data, etc.). But watch out for infinite event loops, which is why MirrorMaker 2 (MM2) is normally used for Kafka→Kafka mirroring, as it can be configured to prevent them:

(Source: Adobe Stock)
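One way to reason about event loops before deploying Kafka-to-Kafka connectors is to treat each connector as an edge from the topic it reads to the topic it writes, and check that graph for cycles. The sketch below is purely illustrative and is not part of Kafka Connect or Camel: the function name `find_cycle` and the topic names are hypothetical, and it assumes you can list the (from_topic, to_topic) pair for each connector yourself.

```python
def find_cycle(edges):
    """Return a list of topics forming a loop, or None if there is no loop.

    edges: iterable of (from_topic, to_topic) pairs, one per
    Kafka-to-Kafka connector (hypothetical input format).
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {topic: WHITE for topic in graph}
    path = []

    def visit(topic):
        colour[topic] = GREY
        path.append(topic)
        for nxt in graph[topic]:
            if colour[nxt] == GREY:
                # Reached a topic already on the current path: a loop
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        colour[topic] = BLACK
        path.pop()
        return None

    for topic in graph:
        if colour[topic] == WHITE:
            found = visit(topic)
            if found:
                return found
    return None


# Two connectors that feed each other: events would circulate forever
print(find_cycle([("orders", "enriched"), ("enriched", "orders")]))
# A simple one-way chain is fine
print(find_cycle([("raw", "clean"), ("clean", "archive")]))
```

A check like this only covers the connectors you list. Loops that pass through another system, or through a stream processor, would need those hops added as edges too.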

I was trying to think of a nice visualization for this but still haven’t found it—any suggestions are welcome. Also, note that I haven’t tried most of these connectors out yet, so you will need to do your own testing and evaluation.

And just a reminder, when configuring Apache Camel Kafka Connectors, you need to look in 3 places:

  1. Camel component documentation
  2. Camel Kafka Connect basic configuration documentation, and
  3. Camel Kafka Connect specific documentation!

Ready to try Instaclustr Managed Apache Kafka with Camel connectors yourself? Click here for a free 30-day trial!