
I have to load CSV data from an object store (which can be accessed only via a URL) into a Kafka topic. How can I load this CSV data into the topic? Please explain the steps. Also, is there a way to set a time interval for loading? I tried the spooldir connector via the REST API but could not figure out how to provide the URL.

spooldir config:

{"connector.class":"com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
"topic":"spooldir-testing-topic",
"input.path":"", // how to provide a URL here instead of a path?
"finished.path": "<finished path>",
"error.path": "<error path>",
"input.file.pattern":".*\\.csv",
"schema.generation.enabled":"true",
"csv.first.row.as.header":"true",
"tasks.max":"1",
"halt.on.error":"false"}

1) How do I provide a URL instead of a path? 2) How do I set a time interval between rows of the CSV, e.g. n rows in n seconds?

  • file://192.168.10.20/f$/MyDir/SubDir/text.doc, where f:/ is the drive. Just throwing this out; I don't know if it works. Can you update, please? – Ran Lupovich Jun 09 '21 at 05:57
  • @RanLupovich The CSV file is actually stored as an object inside a bucket in a cloud service – Aishwarya B Jun 09 '21 at 09:16
  • An S3 bucket? Confluent has an S3 source connector – OneCricketeer Jun 09 '21 at 11:57
  • @OneCricketeer It is Oracle Cloud Object Storage. Is there any connector available for Oracle Cloud Object Storage? – Aishwarya B Jun 09 '21 at 22:42
  • I have no experience with Oracle products, so none that I am aware of – OneCricketeer Jun 09 '21 at 22:42
  • @AishwaryaB, you can have a look at Kafka Connect FilePulse: https://github.com/streamthoughts/kafka-connect-file-pulse . Maybe you can extend it to support Oracle Cloud Object Storage. The connector already supports AWS S3, Azure Blob Storage and GCP Cloud Storage. – fhussonnois Jun 15 '21 at 09:18
  • I don't know if this helps in your case, but Oracle Object Storage has a fully compatible S3 endpoint, so you can use an S3-compatible connector to move your data from an OCI bucket to Streaming. I'm sorry I don't have an exact example, but here is one, "Publishing To Object Storage From Oracle Streaming Service", that uses the Kafka Connect S3 Sink Connector: https://blogs.oracle.com/developers/publishing-to-object-storage-from-oracle-streaming-service. At least the concept is there: rely on S3 compatibility. – Dario Jun 22 '21 at 06:09

1 Answer


The Spooldir connector assumes local filesystem access only, so `input.path` cannot be a URL.

You're going to need a different connector or solution, possibly including writing your own Connector class.
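As one possible "different solution", here is a minimal sketch of a standalone script (not a Kafka Connect connector) that downloads the CSV over HTTP and publishes each row at a fixed rate. The function names `parse_csv`, `fetch_csv_rows` and `publish_rows` are hypothetical, as is the choice of JSON-encoding each row. It assumes the object URL is readable with a plain GET (for example a pre-authenticated URL) and that the file is small enough to read into memory.

```python
import csv
import io
import json
import time
import urllib.request
from typing import Callable, Dict, Iterable, Iterator


def parse_csv(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row, using the first row as the header."""
    yield from csv.DictReader(io.StringIO(text, newline=""))


def fetch_csv_rows(url: str) -> Iterator[Dict[str, str]]:
    """Download the whole CSV from `url` (assumed UTF-8) and parse it."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    yield from parse_csv(text)


def publish_rows(
    rows: Iterable[Dict[str, str]],
    send: Callable[[bytes], object],
    rows_per_second: float = 0.0,
    sleep: Callable[[float], object] = time.sleep,
) -> int:
    """Pass each row to `send` as JSON bytes, pausing between rows.

    `send` is any callable taking the encoded record; a rate of 0
    disables throttling. Returns the number of rows sent.
    """
    delay = 1.0 / rows_per_second if rows_per_second > 0 else 0.0
    count = 0
    for row in rows:
        send(json.dumps(row).encode("utf-8"))
        count += 1
        if delay:
            sleep(delay)
    return count


# Possible wiring, assuming the kafka-python package and a local broker:
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   publish_rows(fetch_csv_rows("<object url>"),
#                lambda value: producer.send("spooldir-testing-topic", value),
#                rows_per_second=1)
#   producer.flush()
```

Keeping the producer behind a plain `send` callable means the throttling logic can be checked without a running broker, and any Kafka client can be swapped in.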

OneCricketeer