In Exercise 3.16 you provisioned a Basic tier Event Hubs namespace, and in Exercise 3.17 you configured it as an Azure Stream Analytics input. An important aspect of Azure Event Hubs is partitioning (refer to Figure 3.77), specifically the partition count you choose when provisioning the event hub. The number of partitions allocated to an event hub dictates the amount of data it can handle. How partitioning works, its limitations, and many other features depend on the Event Hubs namespace tier chosen during provisioning. In most cases, features such as the number of partitions cannot be changed once provisioned, so it is important to know the constraints and options before you provision your production event hub. Table 7.5 compares the most important features so that you can choose the tier that fits your current and future stream ingestion requirements.
TABLE 7.5 Azure Event Hubs tiers
Feature | Basic | Standard | Premium | Dedicated |
Dynamic partition scale out | No | No | Yes | Yes |
Maximum number of partitions | 32 | 32 | 100 | 1,024 |
Multitenant | Yes | Yes | Isolated | No |
Private link | No | Yes | Yes | Yes |
Maximum message size | 256 KB | 1 MB | 1 MB | 1 MB |
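The comparison in Table 7.5 can also be expressed as a small lookup, which is sometimes handy when you want to check requirements against the tiers programmatically. The following is only a minimal sketch: the `Tier` class and the `candidate_tiers` function are hypothetical names introduced for illustration, and the values are transcribed from the table rather than queried from Azure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One column of Table 7.5 (hypothetical helper, not an Azure SDK type)."""
    name: str
    dynamic_partition_scale_out: bool
    max_partitions: int
    private_link: bool


# Values transcribed from Table 7.5, ordered from lowest to highest tier.
TIERS = [
    Tier("Basic", False, 32, False),
    Tier("Standard", False, 32, True),
    Tier("Premium", True, 100, True),
    Tier("Dedicated", True, 1024, True),
]


def candidate_tiers(partitions, private_link=False, dynamic_scale_out=False):
    """Return the names of the tiers that satisfy the stated requirements."""
    return [
        t.name
        for t in TIERS
        if t.max_partitions >= partitions
        and (t.private_link or not private_link)
        and (t.dynamic_partition_scale_out or not dynamic_scale_out)
    ]


# Two partitions plus a private link rules out only the Basic tier.
print(candidate_tiers(2, private_link=True))
# Needing more than 32 partitions leaves Premium and Dedicated.
print(candidate_tiers(64))
```

Because the list is ordered from lowest to highest tier, the first name returned is the lowest tier that meets the requirements.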
Once you create an event hub in the Basic or Standard tier, you cannot change its partition count. The Premium and Dedicated tiers support dynamic partition scale out, which removes the constraint of being bound to a fixed number of partitions. This flexibility comes at a higher price, as the higher tiers offer more features and throughput capacity. Every tier has a maximum number of partitions. In Exercise 3.16 you left the default of two partitions for the event hub, which cannot be changed. If you needed a different number of partitions, you would have to provision another event hub and repoint your data producers and consumers to it.

Multitenancy is a common cloud model. It does not mean that you share the same VM with other customers; it means that the grouping of compute resources is shared by multiple customers. For example, there is no network hardware separating the customers doing their work in a given tenant. If you are required to isolate your workloads from other customers, you can provision a Premium tier Event Hubs namespace, which separates your partitions from other customers using networking features like NSGs and firewalls. Alternatively, you can provision the Dedicated tier, an entire tenant that is completely independent from all other cloud customers, but this is costly.

If you do not want the Event Hubs namespace to exist in public DNS, choose a tier that supports a private link. Otherwise, your Event Hubs endpoint will be globally discoverable. Even so, clients must be authorized to access the endpoint; it is not usable anonymously.

Lastly, notice the maximum allowed message sizes. The supported message sizes are large enough when you consider the massive scale at which an Event Hubs endpoint is capable of operating.
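To get a feel for the message size limits, you can compare the size of a serialized event against the values in Table 7.5. The sketch below makes several assumptions: the `fits_tier` function is a hypothetical helper, the sample reading and its field names are invented for illustration, and 1 KB is taken to be 1,024 bytes. It measures only the event body, so treat the result as an approximation, because properties and protocol overhead may also count toward the limit.

```python
import json

# Limits from Table 7.5, converted to bytes assuming 1 KB = 1,024 bytes.
MAX_MESSAGE_BYTES = {
    "Basic": 256 * 1024,
    "Standard": 1024 * 1024,
    "Premium": 1024 * 1024,
    "Dedicated": 1024 * 1024,
}


def fits_tier(payload: bytes, tier: str) -> bool:
    """Return True if the event body is within the tier's maximum message size."""
    return len(payload) <= MAX_MESSAGE_BYTES[tier]


# Hypothetical brain wave reading; the field names are illustrative only.
reading = {"electrode": "AF3", "frequency": "THETA", "value": 44.254}
payload = json.dumps(reading).encode("utf-8")

print(f"{len(payload)} bytes, fits Basic: {fits_tier(payload, 'Basic')}")
```

A single reading like this one is tiny compared to the 256 KB Basic tier limit, which is why the limit tends to matter mainly when many readings are combined into one event.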
Azure Stream Analytics
The work you have performed thus far in this book has been building up to the next few exercises. You have ingested brain wave data, transformed it into many formats, used many Azure data analytics products and features, and performed some very sophisticated exploratory data analysis. Now it is time to use the insights in a real‐world application. Complete Exercise 7.3, where you will analyze brain wave readings in real time using Azure Stream Analytics.