A long camel train on the beach in Broome, Western Australia
(Source: Adobe Stock by scottimage)
Recently I wrote a blog called “Running the Apache Camel™ HTTP Kafka Source Connector on Instaclustr Managed Apache Kafka®” (check it out here). There are a lot of Apache Camel Kafka connectors available, and I wondered how to “get your Camels in a row”. That is, how do you work out which combinations of Apache Camel Kafka connectors are possible and form a useful “camel train”?
Which technologies are supported by Apache Camel Kafka source connectors only, sink connectors only, or ideally, both source and sink connectors?
Technologies with both source and sink connectors make it easy to get data into and out of Kafka. In practice, the advantage is that you can use Kafka to write data to the sink (target) technology and then read it back into Kafka with a source connector.
Technologies with both types of connectors are also more likely to work correctly in a data pipeline, because the connectors take into account the target technology’s native data formats, types, APIs, and so on.
Here’s a reminder of how Kafka Connect works with heterogeneous source and sink technologies:
Here are my initial results (all errors are mine!)—a table showing Kafka source connectors in the left-hand column and sink connectors in the right-hand column. Technologies (53) with both source and sink connectors are highlighted in green:
Source | Sink |
aws-cloudtrail-source | |
 | aws-cloudwatch-sink |
 | aws-ddb-experimental-sink |
 | aws-ddb-sink |
aws-ddb-streams-source | |
 | aws-ec2-sink |
 | aws-eventbridge-sink |
 | aws-kinesis-firehose-sink |
aws-kinesis-source | aws-kinesis-sink |
 | aws-lambda-sink |
aws-redshift-source | aws-redshift-sink |
aws-s3-cdc-source | |
aws-s3-experimental-source | |
aws-s3-source | aws-s3-sink |
 | aws-s3-streaming-upload-sink |
 | aws-secrets-manager-sink |
 | aws-ses-sink |
 | aws-sns-fifo-sink |
 | aws-sns-sink |
 | aws-sqs-batch-sink |
 | aws-sqs-fifo-sink |
aws-sqs-source | aws-sqs-sink |
 | aws2-iam |
 | aws2-kms |
azure-cosmosdb-source | |
azure-eventhubs-source | azure-eventhubs-sink |
 | azure-functions-sink |
azure-servicebus-source | azure-servicebus-sink |
azure-storage-blob-cdc-source | |
azure-storage-blob-changefeed-source | |
azure-storage-blob-source | azure-storage-blob-sink |
azure-storage-queue-source | azure-storage-queue-sink |
beer-source | |
bitcoin-source | |
cassandra-source | cassandra-sink |
ceph-source | ceph-sink |
chuck-norris-source | |
 | couchbase-sink |
cron-source | |
cxf | cxf |
cxfrs | cxfrs |
dropbox-source | dropbox-sink |
earthquake-source | |
elasticsearch-search-source | elasticsearch-index-sink |
 | exec-sink |
fhir-source | fhir-sink |
file | file |
file-watch-source | |
ftp-source | ftp-sink |
ftps-source | ftps-sink |
github-commit-source | |
github-event-source | |
github-pullrequest-comment-source | |
github-pullrequest-source | |
github-tag-source | |
 | google-bigquery-sink |
google-calendar-source | |
 | google-functions-sink |
google-mail-source | |
google-pubsub-source | google-pubsub-sink |
google-sheets-source | |
google-storage-cdc-source | |
google-storage-source | google-storage-sink |
hdfs | hdfs |
http-secured-source | http-secured-sink |
http-source | http-sink |
 | https |
infinispan-source | infinispan-sink |
 | jdbc |
 | jira-add-comment-sink |
 | jira-add-issue-sink |
jira-oauth-source | |
jira-source | |
 | jira-transition-issue-sink |
 | jira-update-issue-sink |
jms-amqp-10-source | jms-amqp-10-sink |
jms-apache-activemq-source | jms-apache-activemq-sink |
jms-apache-artemis-source | jms-apache-artemis-sink |
jms-ibm-mq-source | jms-ibm-mq-sink |
kafka-not-secured-source | kafka-not-secured-sink |
kafka-source | kafka-sink |
kafka-ssl-source | kafka-ssl-sink |
kubernetes-namespaces-source | |
kubernetes-nodes-source | |
kubernetes-pods-source | |
 | log-sink |
mail-imap-source | |
 | mail-sink |
mariadb-source | mariadb-sink |
minio-source | minio-sink |
mongodb-changes-stream-source | |
mongodb-source | mongodb-sink |
mqtt-source | mqtt-sink |
mqtt5-source | mqtt5-sink |
mysql-source | mysql-sink |
nats-source | nats-sink |
netty-http | netty-http |
netty | netty |
oracle-database-source | oracle-database-sink |
postgresql-source | postgresql-sink |
pulsar-source | pulsar-sink |
rabbitmq-source | |
redis-source | redis-sink |
 | rest-openapi-sink |
 | salesforce-create-sink |
 | salesforce-delete-sink |
salesforce-source | salesforce-update-sink |
 | scp-sink |
sftp-source | sftp-sink |
sjms2 | sjms2 |
slack-source | slack-sink |
solr-source | solr-sink |
 | splunk-hec-sink |
splunk-source | splunk-sink |
sqlserver-source | sqlserver-sink |
ssh-source | ssh-sink |
syslog | syslog |
telegram-source | telegram-sink |
timer-source | |
twitter-directmessage-source | |
twitter-search-source | |
twitter-timeline-source | |
webhook-source | |
websocket-source | |
wttrin-source | |
I’ve left the -source and -sink suffixes in the names to help with understanding the connector roles. Note that some connectors don’t have source or sink in their names: rather than using a different connector for each role, you configure the same connector as either a source or a sink connector. Officially there are 172 Camel connectors, or 179 if you count these dual-role connectors twice.
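The naming convention above can be used to group connectors by technology. Here is a minimal Python sketch, using a handful of names from the table; the function names are made up for illustration. It only understands the plain -source and -sink suffixes, so pairs with different stems (such as elasticsearch-search-source and elasticsearch-index-sink) and names without a suffix would still need checking by hand.

```python
from collections import defaultdict

# A small sample of connector names taken from the table above.
CONNECTORS = [
    "aws-kinesis-source", "aws-kinesis-sink",
    "aws-lambda-sink",
    "beer-source",
    "cassandra-source", "cassandra-sink",
    "file",  # no suffix: one connector, configured as source or sink
    "jdbc",  # no suffix: the role cannot be inferred from the name
]


def split_role(name):
    """Return (technology, role); role is 'source', 'sink' or 'unknown'."""
    for role in ("source", "sink"):
        suffix = "-" + role
        if name.endswith(suffix):
            return name[: -len(suffix)], role
    return name, "unknown"


def group_by_technology(names):
    """Map each technology to the set of roles seen for it."""
    roles = defaultdict(set)
    for name in names:
        technology, role = split_role(name)
        roles[technology].add(role)
    return dict(roles)


def technologies_with_both(names):
    """Technologies that have both a -source and a -sink connector."""
    grouped = group_by_technology(names)
    return sorted(t for t, r in grouped.items() if {"source", "sink"} <= r)


print(technologies_with_both(CONNECTORS))  # ['aws-kinesis', 'cassandra']
```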
The above table is the traditional Kafka-centric view of the source and sink connectors (writing to Kafka from a source technology, reading from Kafka to a sink technology). But as noted above, it’s also useful to put the non-Kafka technologies in the center, with Kafka writing to them (Kafka sinks) and reading from them (Kafka sources). In theory, and in practice, the Kafka clusters on the sink and source sides can be different (there are many use cases for multiple Kafka clusters):
So, in this table we’ve reversed the columns with the Sinks on the left:
Write to this technology from Kafka (Sink) | Read from this technology to Kafka (Source) |
 | aws-cloudtrail-source |
aws-cloudwatch-sink | |
aws-ddb-experimental-sink | |
aws-ddb-sink | |
 | aws-ddb-streams-source |
aws-ec2-sink | |
aws-eventbridge-sink | |
aws-kinesis-firehose-sink | |
aws-kinesis-sink | aws-kinesis-source |
aws-lambda-sink | |
aws-redshift-sink | aws-redshift-source |
 | aws-s3-cdc-source |
 | aws-s3-experimental-source |
aws-s3-sink | aws-s3-source |
aws-s3-streaming-upload-sink | |
aws-secrets-manager-sink | |
aws-ses-sink | |
aws-sns-fifo-sink | |
aws-sns-sink | |
aws-sqs-batch-sink | |
aws-sqs-fifo-sink | |
aws-sqs-sink | aws-sqs-source |
aws2-iam | |
aws2-kms | |
 | azure-cosmosdb-source |
azure-eventhubs-sink | azure-eventhubs-source |
azure-functions-sink | |
azure-servicebus-sink | azure-servicebus-source |
 | azure-storage-blob-cdc-source |
 | azure-storage-blob-changefeed-source |
azure-storage-blob-sink | azure-storage-blob-source |
azure-storage-queue-sink | azure-storage-queue-source |
 | beer-source |
 | bitcoin-source |
cassandra-sink | cassandra-source |
ceph-sink | ceph-source |
 | chuck-norris-source |
couchbase-sink | |
 | cron-source |
cxf | cxf |
cxfrs | cxfrs |
dropbox-sink | dropbox-source |
 | earthquake-source |
elasticsearch-index-sink | elasticsearch-search-source |
exec-sink | |
fhir-sink | fhir-source |
file | file |
 | file-watch-source |
ftp-sink | ftp-source |
ftps-sink | ftps-source |
 | github-commit-source |
 | github-event-source |
 | github-pullrequest-comment-source |
 | github-pullrequest-source |
 | github-tag-source |
google-bigquery-sink | |
 | google-calendar-source |
google-functions-sink | |
 | google-mail-source |
google-pubsub-sink | google-pubsub-source |
 | google-sheets-source |
 | google-storage-cdc-source |
google-storage-sink | google-storage-source |
hdfs | hdfs |
http-secured-sink | http-secured-source |
http-sink | http-source |
https | |
infinispan-sink | infinispan-source |
jdbc | |
jira-add-comment-sink | |
jira-add-issue-sink | |
 | jira-oauth-source |
 | jira-source |
jira-transition-issue-sink | |
jira-update-issue-sink | |
jms-amqp-10-sink | jms-amqp-10-source |
jms-apache-activemq-sink | jms-apache-activemq-source |
jms-apache-artemis-sink | jms-apache-artemis-source |
jms-ibm-mq-sink | jms-ibm-mq-source |
kafka-not-secured-sink | kafka-not-secured-source |
kafka-sink | kafka-source |
kafka-ssl-sink | kafka-ssl-source |
 | kubernetes-namespaces-source |
 | kubernetes-nodes-source |
 | kubernetes-pods-source |
log-sink | |
 | mail-imap-source |
mail-sink | |
mariadb-sink | mariadb-source |
minio-sink | minio-source |
 | mongodb-changes-stream-source |
mongodb-sink | mongodb-source |
mqtt-sink | mqtt-source |
mqtt5-sink | mqtt5-source |
mysql-sink | mysql-source |
nats-sink | nats-source |
netty-http | netty-http |
netty | netty |
oracle-database-sink | oracle-database-source |
postgresql-sink | postgresql-source |
pulsar-sink | pulsar-source |
 | rabbitmq-source |
redis-sink | redis-source |
rest-openapi-sink | |
salesforce-create-sink | |
salesforce-delete-sink | |
salesforce-update-sink | salesforce-source |
scp-sink | |
sftp-sink | sftp-source |
sjms2 | sjms2 |
slack-sink | slack-source |
solr-sink | solr-source |
splunk-hec-sink | |
splunk-sink | splunk-source |
sqlserver-sink | sqlserver-source |
ssh-sink | ssh-source |
syslog | syslog |
telegram-sink | telegram-source |
 | timer-source |
 | twitter-directmessage-source |
 | twitter-search-source |
 | twitter-timeline-source |
 | webhook-source |
 | websocket-source |
 | wttrin-source |
I was also curious to see which of Instaclustr’s Managed Services have matching Apache Camel Kafka connectors (noting that all of them work with Kafka® Connect), and here’s a blue highlighted table with the answers:
Source | Sink |
aws-cloudtrail-source | |
 | aws-cloudwatch-sink |
 | aws-ddb-experimental-sink |
 | aws-ddb-sink |
aws-ddb-streams-source | |
 | aws-ec2-sink |
 | aws-eventbridge-sink |
 | aws-kinesis-firehose-sink |
aws-kinesis-source | aws-kinesis-sink |
 | aws-lambda-sink |
aws-redshift-source | aws-redshift-sink |
aws-s3-cdc-source | |
aws-s3-experimental-source | |
aws-s3-source | aws-s3-sink |
 | aws-s3-streaming-upload-sink |
 | aws-secrets-manager-sink |
 | aws-ses-sink |
 | aws-sns-fifo-sink |
 | aws-sns-sink |
 | aws-sqs-batch-sink |
 | aws-sqs-fifo-sink |
aws-sqs-source | aws-sqs-sink |
 | aws2-iam |
 | aws2-kms |
azure-cosmosdb-source | |
azure-eventhubs-source | azure-eventhubs-sink |
 | azure-functions-sink |
azure-servicebus-source | azure-servicebus-sink |
azure-storage-blob-cdc-source | |
azure-storage-blob-changefeed-source | |
azure-storage-blob-source | azure-storage-blob-sink |
azure-storage-queue-source | azure-storage-queue-sink |
beer-source | |
bitcoin-source | |
cassandra-source | cassandra-sink |
ceph-source | ceph-sink |
chuck-norris-source | |
 | couchbase-sink |
cron-source | |
cxf | cxf |
cxfrs | cxfrs |
dropbox-source | dropbox-sink |
earthquake-source | |
elasticsearch-search-source | elasticsearch-index-sink |
 | exec-sink |
fhir-source | fhir-sink |
file | file |
file-watch-source | |
ftp-source | ftp-sink |
ftps-source | ftps-sink |
github-commit-source | |
github-event-source | |
github-pullrequest-comment-source | |
github-pullrequest-source | |
github-tag-source | |
 | google-bigquery-sink |
google-calendar-source | |
 | google-functions-sink |
google-mail-source | |
google-pubsub-source | google-pubsub-sink |
google-sheets-source | |
google-storage-cdc-source | |
google-storage-source | google-storage-sink |
hdfs | hdfs |
http-secured-source | http-secured-sink |
http-source | http-sink |
 | https |
infinispan-source | infinispan-sink |
 | jdbc |
 | jira-add-comment-sink |
 | jira-add-issue-sink |
jira-oauth-source | |
jira-source | |
 | jira-transition-issue-sink |
 | jira-update-issue-sink |
jms-amqp-10-source | jms-amqp-10-sink |
jms-apache-activemq-source | jms-apache-activemq-sink |
jms-apache-artemis-source | jms-apache-artemis-sink |
jms-ibm-mq-source | jms-ibm-mq-sink |
kafka-not-secured-source | kafka-not-secured-sink |
kafka-source | kafka-sink |
kafka-ssl-source | kafka-ssl-sink |
kubernetes-namespaces-source | |
kubernetes-nodes-source | |
kubernetes-pods-source | |
 | log-sink |
mail-imap-source | |
 | mail-sink |
mariadb-source | mariadb-sink |
minio-source | minio-sink |
mongodb-changes-stream-source | |
mongodb-source | mongodb-sink |
mqtt-source | mqtt-sink |
mqtt5-source | mqtt5-sink |
mysql-source | mysql-sink |
nats-source | nats-sink |
netty-http | netty-http |
netty | netty |
oracle-database-source | oracle-database-sink |
postgresql-source | postgresql-sink |
pulsar-source | pulsar-sink |
rabbitmq-source | |
redis-source | redis-sink |
 | rest-openapi-sink |
 | salesforce-create-sink |
 | salesforce-delete-sink |
salesforce-source | salesforce-update-sink |
 | scp-sink |
sftp-source | sftp-sink |
sjms2 | sjms2 |
slack-source | slack-sink |
solr-source | solr-sink |
 | splunk-hec-sink |
splunk-source | splunk-sink |
sqlserver-source | sqlserver-sink |
ssh-source | ssh-sink |
syslog | syslog |
telegram-source | telegram-sink |
timer-source | |
twitter-directmessage-source | |
twitter-search-source | |
twitter-timeline-source | |
webhook-source | |
websocket-source | |
wttrin-source | |
This is pretty good coverage (except for OpenSearch®). Interestingly, there are also a couple of Kafka source and sink connectors. This is a bit “odd”, as Kafka connectors are designed to write to, or read from, Kafka and some other technology!
The Kafka source and sink connectors will presumably work in the same way as each other, reading data from Kafka and writing data back to Kafka. This could actually be useful, and it is what Kafka stream processors do, too.
There could be some use cases for this (e.g., moving data between topics, applying single message transforms to data, etc.). But watch out for infinite event loops, which is why MirrorMaker 2 (MM2) is normally used for Kafka→Kafka mirroring, as it can be configured to prevent them:
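To make the loop risk concrete, here is a minimal Python sketch that checks a set of planned Kafka-to-Kafka pipelines for cycles before you deploy them. It is not part of Kafka Connect or MM2; the function name and the (from_topic, to_topic) pair format are made up for illustration, and it assumes all topics live in a single cluster.

```python
def find_topic_loop(pipelines):
    """Return a list of topics that form a loop, or None if there is none.

    pipelines: iterable of (from_topic, to_topic) pairs, one per
    hypothetical Kafka-to-Kafka connector pipeline.
    """
    # Build a directed graph: topic -> topics it is copied to.
    graph = {}
    for src, dst in pipelines:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    visiting = []  # topics on the current depth-first path
    done = set()   # topics already proven loop-free

    def visit(topic):
        if topic in visiting:
            # We came back to a topic on the current path: that is a loop.
            return visiting[visiting.index(topic):] + [topic]
        if topic in done:
            return None
        visiting.append(topic)
        for nxt in sorted(graph[topic]):
            loop = visit(nxt)
            if loop:
                return loop
        visiting.pop()
        done.add(topic)
        return None

    for topic in sorted(graph):
        loop = visit(topic)
        if loop:
            return loop
    return None


# One pipeline cleans "orders" into "orders-clean"; a second, misconfigured
# pipeline writes "orders-clean" straight back into "orders".
print(find_topic_loop([("orders", "orders-clean"), ("orders-clean", "orders")]))
```

A check like this only catches loops you have declared up front; it does not replace the loop prevention that a mirroring tool provides at runtime.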
(Source: Adobe Stock)
I was trying to think of a nice visualization for this but still haven’t found it—any suggestions are welcome. Also, note that I haven’t tried most of these connectors out yet, so you will need to do your own testing and evaluation.
And just a reminder, when configuring Apache Camel Kafka Connectors, you need to look in 3 places:
- Camel component documentation
- Camel Kafka Connect basic configuration documentation, and
- Camel Kafka Connect specific documentation!
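Because the settings come from three different places, it can help to keep them in separate groups and merge them only when you build the request for the Kafka Connect REST API. The Python sketch below shows one way to do that. The helper name and layer names are made up, and the connector class and the camel.kamelet.* keys are assumptions for an HTTP source connector that you should verify against the documentation for your connector version.

```python
import json


def build_connector_request(name, layers):
    """Merge option groups into one Kafka Connect create-connector payload.

    layers: dict mapping a label for where the options were documented
    to a dict of options. A key set in two layers raises ValueError.
    """
    config = {}
    origin = {}
    for layer_name, options in layers.items():
        for key, value in options.items():
            if key in config:
                raise ValueError(
                    f"{key!r} is set in both {origin[key]!r} and {layer_name!r}"
                )
            config[key] = str(value)  # Kafka Connect config values are strings
            origin[key] = layer_name
    return {"name": name, "config": config}


EXAMPLE_LAYERS = {
    # Options described in the Camel component documentation (none set here).
    "camel-component": {},
    # Options from the Camel Kafka Connect basic configuration documentation.
    "camel-kafka-connect-basic": {
        "tasks.max": 1,
        "topics": "http-source-topic",  # assumed topic name
        "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    },
    # Options from the connector-specific documentation (assumed names).
    "camel-kafka-connect-specific": {
        "connector.class": "org.apache.camel.kafkaconnector.httpsource.CamelHttpsourceSourceConnector",
        "camel.kamelet.http-source.url": "https://example.com/data",
        "camel.kamelet.http-source.period": 10000,
    },
}

request = build_connector_request("camel-http-source-example", EXAMPLE_LAYERS)
print(json.dumps(request, indent=2))
```

The duplicate-key check is the main point of the design: if the same option turns up in two of the three documents with different values, you find out before the connector is created rather than after.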
Ready to try Instaclustr Managed Apache Kafka with Camel connectors yourself? Click here for a free 30-day trial!