Package com.azure.messaging.eventhubs



Azure Event Hubs is a highly scalable publish-subscribe service that can ingest millions of events per second and stream them to multiple consumers. This lets you process and analyze the massive amounts of data produced by your connected devices and applications. Once Event Hubs has collected the data, you can retrieve, transform, and store it by using any real-time analytics provider or with batching/storage adapters.

The Azure Event Hubs client library allows Java developers to interact with Azure Event Hubs. It provides a set of clients that enable Java developers to publish events to and consume events from an Event Hub.

Key Concepts

  • Event Hub producer: A source of telemetry data, diagnostics information, usage logs, or other data, as part of an embedded device solution, a mobile device application, a game title running on a console or other device, a client- or server-based business solution, or a website.
  • Event Hub consumer: Fetches events published to an Event Hub and processes them. Processing may involve aggregation, complex computation, and filtering. Processing may also involve distribution or storage of the information in a raw or transformed fashion. Event Hub consumers are often robust and high-scale platform infrastructure parts with built-in analytics capabilities, like Azure Stream Analytics, Apache Spark, or Apache Storm.
  • Partition: An ordered sequence of events that is held in an Event Hub. Azure Event Hubs provides message streaming through a partitioned consumer pattern in which each consumer only reads a specific subset, or partition, of the message stream. As newer events arrive, they are added to the end of this sequence. The number of partitions is specified at the time an Event Hub is created and cannot be changed.
  • Consumer group: A view of an entire Event Hub. Consumer groups enable multiple consuming applications to each have a separate view of the event stream, and to read the stream independently at their own pace and from their own position. There can be at most 5 concurrent readers on a partition per consumer group; however, it is recommended that there be only one active consumer for a given partition and consumer group pairing. Each active reader receives all the events from its partition, so multiple readers on the same partition will each receive the same events.
  • Stream offset: The position of an event within an Event Hub partition. It is a client-side cursor that specifies the point in the stream where the event is located. The offset of an event can change as events expire from the stream.
  • Stream sequence number: A number assigned to the event when it was enqueued in the associated Event Hub partition. This is unique for every message received in the Event Hub partition.
  • Checkpointing: A process by which readers mark or commit their position within a partition event sequence. Checkpointing is the responsibility of the consumer and occurs on a per-partition basis within a consumer group. This responsibility means that for each consumer group, each partition reader must keep track of its current position in the event stream, and can inform the service when it considers the data stream complete.
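The relationship between partitions, sequence numbers, consumer groups, and checkpoints can be pictured with a small in-memory model. This is only a sketch under simplifying assumptions: PartitionSketch and its methods are hypothetical names that are not part of the client library, the sequence number is simply the list index, and checkpoints live in a map rather than in durable storage.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of a single partition. Hypothetical; not a library class. */
class PartitionSketch {
    // Events in arrival order. In this model the list index doubles as the sequence number.
    private final List<String> events = new ArrayList<>();
    // Last checkpointed sequence number per consumer group (absent means nothing checkpointed).
    private final Map<String, Long> checkpoints = new HashMap<>();

    /** Appends an event to the end of the partition and returns its sequence number. */
    long append(String body) {
        events.add(body);
        return events.size() - 1L;
    }

    /** Records that the consumer group has processed everything up to and including sequenceNumber. */
    void checkpoint(String consumerGroup, long sequenceNumber) {
        checkpoints.put(consumerGroup, sequenceNumber);
    }

    /** Returns, in order, the events that come after the consumer group's checkpoint. */
    List<String> readFromCheckpoint(String consumerGroup) {
        long last = checkpoints.getOrDefault(consumerGroup, -1L);
        return new ArrayList<>(events.subList((int) (last + 1), events.size()));
    }

    public static void main(String[] args) {
        PartitionSketch partition = new PartitionSketch();
        partition.append("a");
        partition.append("b");
        partition.append("c");

        // The "analytics" group has processed "a" and "b"; the "archive" group has not checkpointed.
        partition.checkpoint("analytics", 1L);

        System.out.println(partition.readFromCheckpoint("analytics")); // prints [c]
        System.out.println(partition.readFromCheckpoint("archive"));   // prints [a, b, c]
    }
}
```

Each consumer group keeps its own position, so checkpointing in one group does not affect what another group reads.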

Getting Started

Service clients are the point of interaction for developers using Azure Event Hubs. EventHubProducerClient and EventHubProducerAsyncClient are the sync and async clients for publishing events to an Event Hub. Similarly, EventHubConsumerClient and EventHubConsumerAsyncClient are the sync and async clients for consuming events from an Event Hub. In production scenarios, we recommend EventProcessorClient because it consumes events from all Event Hub partitions, balances load across multiple instances of EventProcessorClient, and can perform checkpointing.

The examples shown in this document use a credential object named DefaultAzureCredential for authentication, which is appropriate for most scenarios, including local development and production environments. Additionally, we recommend using managed identity for authentication in production environments. You can find more information on different ways of authenticating and their corresponding credential types in the Azure Identity documentation.

Publishing events

This library provides several ways to publish events to Azure Event Hubs. There is a producer client that sends events immediately to Azure Event Hubs, and a buffered producer that batches events together in the background and publishes them later. Both clients have synchronous and asynchronous versions. The samples below demonstrate simple scenarios; more snippets can be found in the class documentation for EventHubProducerClient, EventHubProducerAsyncClient, EventHubBufferedProducerClient, and EventHubBufferedProducerAsyncClient.

In the following snippets, fullyQualifiedNamespace is the Event Hubs Namespace's host name. It is listed under the "Essentials" panel after navigating to the Event Hubs Namespace via the Azure portal. The credential used is DefaultAzureCredential because it combines credentials commonly used in deployment and development, and chooses the credential to use based on its running environment.

Sample: Construct a synchronous producer and publish events

The following code sample demonstrates the creation of the synchronous client EventHubProducerClient.

 TokenCredential credential = new DefaultAzureCredentialBuilder().build();

 EventHubProducerClient producer = new EventHubClientBuilder()
     .credential("<<fully-qualified-namespace>>", "<<event-hub-name>>",
         credential)
     .buildProducerClient();

 List<EventData> allEvents = Arrays.asList(new EventData("Foo"), new EventData("Bar"));
 EventDataBatch eventDataBatch = producer.createBatch();

 for (EventData eventData : allEvents) {
     if (!eventDataBatch.tryAdd(eventData)) {
         producer.send(eventDataBatch);
         eventDataBatch = producer.createBatch();

         // Try to add that event that couldn't fit before.
         if (!eventDataBatch.tryAdd(eventData)) {
             throw new IllegalArgumentException("Event is too large for an empty batch. Max size: "
                 + eventDataBatch.getMaxSizeInBytes());
         }
     }
 }

 // send the last batch of remaining events
 if (eventDataBatch.getCount() > 0) {
     producer.send(eventDataBatch);
 }

 // Clients are expected to be long-lived objects.
 // Dispose of the producer to close any underlying resources when we are finished with it.
 producer.close();
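
The control flow of the loop above (try to add, send the full batch, start a new batch, retry the event) can be exercised without a live Event Hub. The following is a minimal sketch, assuming a hypothetical SizeLimitedBatch that counts only the UTF-8 bytes of each event body; the real EventDataBatch is created by the producer and its size accounting is not modelled here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-ins that illustrate the batching loop; not library classes. */
class BatchingSketch {
    /** A batch that accepts events until its byte budget is exhausted. */
    static final class SizeLimitedBatch {
        private final int maxSizeInBytes;
        private final List<String> events = new ArrayList<>();
        private int sizeInBytes;

        SizeLimitedBatch(int maxSizeInBytes) {
            this.maxSizeInBytes = maxSizeInBytes;
        }

        /** Adds the event if it fits; otherwise returns false and leaves the batch unchanged. */
        boolean tryAdd(String event) {
            int eventSize = event.getBytes(StandardCharsets.UTF_8).length;
            if (sizeInBytes + eventSize > maxSizeInBytes) {
                return false;
            }
            events.add(event);
            sizeInBytes += eventSize;
            return true;
        }

        int getCount() {
            return events.size();
        }

        List<String> getEvents() {
            return events;
        }
    }

    /** Groups events into batches; each inner list stands for one "send" call. */
    static List<List<String>> toBatches(List<String> allEvents, int maxSizeInBytes) {
        List<List<String>> sent = new ArrayList<>();
        SizeLimitedBatch batch = new SizeLimitedBatch(maxSizeInBytes);

        for (String event : allEvents) {
            if (!batch.tryAdd(event)) {
                // The current batch is full: "send" it and start a new one.
                sent.add(batch.getEvents());
                batch = new SizeLimitedBatch(maxSizeInBytes);

                // Retry the event that did not fit. If it fails again, it can never be sent.
                if (!batch.tryAdd(event)) {
                    throw new IllegalArgumentException(
                        "Event is too large for an empty batch. Max size: " + maxSizeInBytes);
                }
            }
        }

        // "Send" the last, partially filled batch.
        if (batch.getCount() > 0) {
            sent.add(batch.getEvents());
        }
        return sent;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("aaaa", "bbbb", "cc", "dddd");
        System.out.println(toBatches(events, 8)); // prints [[aaaa, bbbb], [cc, dddd]]
    }
}
```

The final check matters: without it, events that were added after the last full batch was sent would never be published.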
 

Sample: Creating an EventHubBufferedProducerClient and enqueuing events

The following code sample demonstrates the creation of the synchronous client EventHubBufferedProducerClient as well as enqueueing events. The producer is set to publish events every 60 seconds with a buffer size of 1500 events for each partition.

 TokenCredential credential = new DefaultAzureCredentialBuilder().build();

 // "<<fully-qualified-namespace>>" will look similar to "{your-namespace}.servicebus.windows.net"
 // "<<event-hub-name>>" will be the name of the Event Hub instance you created inside the Event Hubs namespace.
 EventHubBufferedProducerClient client = new EventHubBufferedProducerClientBuilder()
     .credential("<<fully-qualified-namespace>>", "<<event-hub-name>>", credential)
     .onSendBatchSucceeded(succeededContext -> {
         System.out.println("Successfully published events to: " + succeededContext.getPartitionId());
     })
     .onSendBatchFailed(failedContext -> {
         System.out.printf("Failed to publish events to %s. Error: %s%n",
             failedContext.getPartitionId(), failedContext.getThrowable());
     })
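     // Publish a partially full batch after waiting at most 60 seconds.
     .maxWaitTime(Duration.ofSeconds(60))
     // Buffer at most 1500 events for each partition.
     .maxEventBufferLengthPerPartition(1500)
     .buildClient();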

 List<EventData> events = Arrays.asList(new EventData("maple"), new EventData("aspen"),
     new EventData("oak"));

 // Enqueues the events to be published.
 client.enqueueEvents(events);

 // Seconds later, enqueue another event.
 client.enqueueEvent(new EventData("bonsai"));

 // Causes any buffered events to be flushed before closing underlying connection.
 client.close();
 

Consuming events

This library provides several ways to consume events from Azure Event Hubs. The consumer clients, EventHubConsumerClient and EventHubConsumerAsyncClient, fetch events from either a single partition or all partitions in an Event Hub. For production, we recommend EventProcessorClient, whose checkpoints are backed by durable storage such as Azure Blob Storage. The samples below demonstrate simple scenarios; more snippets can be found in the class documentation for EventHubConsumerClient, EventHubConsumerAsyncClient, and EventProcessorClient.

In the following snippets, fullyQualifiedNamespace is the Event Hubs Namespace's host name. It is listed under the "Essentials" panel after navigating to the Event Hubs Namespace via the Azure portal. The credential used is DefaultAzureCredential because it combines credentials commonly used in deployment and development, and chooses the credential to use based on its running environment. The consumerGroup is found by navigating to the Event Hub instance and selecting "Consumer groups" under the "Entities" panel. The consumerGroup is required for creating consumer clients.

Sample: Construct a synchronous consumer and receive events

The following code sample demonstrates the creation of the synchronous client EventHubConsumerClient. In addition, it receives up to 100 events from a single partition, starting with those enqueued 12 hours ago. If fewer than 100 events are available, the ones fetched within the maxWaitTime of 30 seconds are returned.

 TokenCredential credential = new DefaultAzureCredentialBuilder().build();

 // "<<fully-qualified-namespace>>" will look similar to "{your-namespace}.servicebus.windows.net"
 // "<<event-hub-name>>" will be the name of the Event Hub instance you created inside the Event Hubs namespace.
 EventHubConsumerClient consumer = new EventHubClientBuilder()
     .credential("<<fully-qualified-namespace>>", "<<event-hub-name>>",
         credential)
     .consumerGroup(EventHubClientBuilder.DEFAULT_CONSUMER_GROUP_NAME)
     .buildConsumerClient();

 Instant twelveHoursAgo = Instant.now().minus(Duration.ofHours(12));
 EventPosition startingPosition = EventPosition.fromEnqueuedTime(twelveHoursAgo);
 String partitionId = "0";

 // Reads events from partition '0' and returns once 100 events are received or 30 seconds have elapsed.
 IterableStream<PartitionEvent> events = consumer.receiveFromPartition(partitionId, 100,
     startingPosition, Duration.ofSeconds(30));

 Long lastSequenceNumber = -1L;
 for (PartitionEvent partitionEvent : events) {
     // For each event, perform some sort of processing.
     System.out.println("Event received: " + partitionEvent.getData().getSequenceNumber());
     lastSequenceNumber = partitionEvent.getData().getSequenceNumber();
 }

 // Figure out what the next EventPosition to receive from is based on last event we processed in the stream.
 // If lastSequenceNumber is -1L, then we didn't see any events the first time we fetched events from the
 // partition.
 if (lastSequenceNumber != -1L) {
     EventPosition nextPosition = EventPosition.fromSequenceNumber(lastSequenceNumber, false);

     // Gets the next set of events from partition '0' to consume and process.
     IterableStream<PartitionEvent> nextEvents = consumer.receiveFromPartition(partitionId, 100,
         nextPosition, Duration.ofSeconds(30));
 }
 

Sample: Construct an EventProcessorClient

The following code sample demonstrates the creation of the processor client. The processor client is recommended for production scenarios because it can load balance between multiple running instances, can perform checkpointing, and reconnects on transient failures such as network outages. The sample below uses an in-memory CheckpointStore, but the azure-messaging-eventhubs-checkpointstore-blob package provides a checkpoint store backed by Azure Blob Storage.

 TokenCredential credential = new DefaultAzureCredentialBuilder().build();

 // "<<fully-qualified-namespace>>" will look similar to "{your-namespace}.servicebus.windows.net"
 // "<<event-hub-name>>" will be the name of the Event Hub instance you created inside the Event Hubs namespace.
 EventProcessorClient eventProcessorClient = new EventProcessorClientBuilder()
     .consumerGroup(EventHubClientBuilder.DEFAULT_CONSUMER_GROUP_NAME)
     .credential("<<fully-qualified-namespace>>", "<<event-hub-name>>",
         credential)
     .processEvent(eventContext -> {
         System.out.printf("Partition id = %s and sequence number of event = %s%n",
             eventContext.getPartitionContext().getPartitionId(),
             eventContext.getEventData().getSequenceNumber());
     })
     .processError(errorContext -> {
         System.out.printf("Error occurred in partition processor for partition %s, %s%n",
             errorContext.getPartitionContext().getPartitionId(),
             errorContext.getThrowable());
     })
     .checkpointStore(new SampleCheckpointStore())
     .buildEventProcessorClient();

 eventProcessorClient.start();

 // Continue to perform other tasks while the processor is running in the background.
 //
 // Finally, stop the processor client when application is finished.
 eventProcessorClient.stop();
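
To picture what load balancing between processor instances aims for, the sketch below spreads partition IDs across instances so that no instance owns more than one partition more than any other. It is an illustration only: LoadBalanceSketch, distribute, and the instance names are hypothetical, and round-robin assignment is an assumption made for simplicity. It is not the algorithm EventProcessorClient uses to claim partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of an evenly balanced partition ownership; not a library class. */
class LoadBalanceSketch {
    /** Assigns partitions to instances round-robin, so ownership counts differ by at most one. */
    static Map<String, List<String>> distribute(List<String> partitionIds, List<String> instanceIds) {
        if (instanceIds.isEmpty()) {
            throw new IllegalArgumentException("At least one instance is required.");
        }

        // LinkedHashMap keeps the instances in the order they were supplied.
        Map<String, List<String>> ownership = new LinkedHashMap<>();
        for (String instanceId : instanceIds) {
            ownership.put(instanceId, new ArrayList<>());
        }

        for (int i = 0; i < partitionIds.size(); i++) {
            String owner = instanceIds.get(i % instanceIds.size());
            ownership.get(owner).add(partitionIds.get(i));
        }
        return ownership;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("0", "1", "2", "3", "4");
        List<String> instances = Arrays.asList("processor-a", "processor-b");

        // prints {processor-a=[0, 2, 4], processor-b=[1, 3]}
        System.out.println(distribute(partitions, instances));
    }
}
```

In this picture, each partition has exactly one owner, which matches the recommendation of a single active consumer per partition and consumer group pairing.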
 