Azure Data Centers

Azure provides more than 100 redundant and secure facilities worldwide, linked by a global network.
- Allows you to
  - gain global reach with local presence
  - keep your data secure and compliant with local laws
You can pick the region and sometimes availability zone you want resources deployed into.
- ❗You can't select a specific datacenter or location within a datacenter.

Regions

Region = A geographical area containing at least one, but often multiple, datacenters that are nearby and networked together with a low-latency network.
- Azure assigns and controls the resources within each region to ensure workloads are appropriately balanced.
- E.g. West US, Canada Central, West Europe, Australia East, and Japan West.
❗Some services or virtual machine features are only available in certain regions, such as specific virtual machine sizes or storage types.
[Figure: map of Azure regions as of February 2020]
💡Regions provide better scalability and redundancy, and preserve data residency for your services.
Read more: Azure regions

Special regions

For compliance or legal purposes.
Azure Government
- US DoD Central, US Gov Virginia, US Gov Iowa and more
- 📝 Physical and logical network-isolated instances of Azure for US government agencies and partners.
China East, China North and more
- Unique partnership between Microsoft and 21Vianet
- Microsoft does not directly maintain the datacenters.

Geographies

Each region belongs to a single geography
Defined by geopolitical boundaries or country borders.
Has specific service availability, compliance, and data residency/sovereignty rules applied to it
Fault-tolerant to withstand complete region failure through their connection to dedicated networking infrastructure
- 📝 Fault tolerance: An application's ability to self-detect and correct all types of problems in its environment
Data residency
- Defines the legal or regulatory requirements imposed on data
- Based on the country or region in which it resides
- 💡 An important consideration when planning out your application data storage.
Geographies are broken up into the following areas
- Americas
- Europe
- Asia Pacific
- Middle East and Africa
Read more: Azure geographies

Availability Zones

📝 Physically separate datacenters within an Azure region.
💡 Allows you to make applications highly available through redundancy.
- Replicate your compute, storage, networking, and data resources in other zones.
- Costs more
- Primarily for VMs, managed disks, load balancers, and SQL databases
- Zonal services: Pin resource to a specific zone.
- Zone-redundant services: Replicates automatically across zones.
Have independent power, cooling, and networking
Set up to be an isolation boundary
- If one zone goes down, the other continues working
Identified by number: 1, 2, and 3
- Logically mapped to the actual physical zones for each subscription independently.
- Availability Zone 1 in a given subscription might refer to a different physical zone than Availability Zone 1 in a different subscription.
Connected through high-speed, private fiber-optic networks.
❗There are regions that do not support (multiple) availability zones

Region Pairs

Each Azure region is paired with another region, within the same geography where possible
- E.g. West US paired with East US, and Southeast Asia paired with East Asia
Paired regions are at least 300 miles (≈ 480 km) apart.
Allows for the replication of resources, e.g. virtual machine storage
- Some services offer automatic geo-redundant storage using region pairs.
Reduce the likelihood of interruptions to both regions
- E.g. natural disasters, civil unrest, power outages, or physical network outages
If one region fails, services automatically fail over to the other region in its region pair.
Data continues to reside within the same geography as its pair (except for Brazil South) for tax and law enforcement jurisdiction purposes.
If there's an extensive Azure outage =>
- One region out of every pair is prioritized to make sure at least one is restored as quickly as possible.
Planned Azure updates are rolled out to paired regions one region at a time to minimize downtime and risk of application outage.

Service-level Agreements (SLA)

Formal documents that define the performance standards that apply to Azure.
They also specify what happens if a service or product fails to perform to its governing SLA's specification.
There are SLAs for individual Azure products and services.
❗ Azure does not provide SLAs for most services under the Free or Shared tiers
- e.g. Azure Advisor
Three key characteristics of SLAs for Azure products and services:
1. Performance Targets
  - Specific to each Azure product and service.
  - E.g. uptime guarantees or connectivity rates
2. Uptime and Connectivity Guarantees
  - 📝 Monthly Uptime % = (Maximum Available Minutes − Downtime) / Maximum Available Minutes × 100
  - 📝 Typically range from 99.9% ("three nines") to 99.999% ("five nines") for paid tier services.
    - In other words, the typical minimum SLA for non-free Azure services is 99.9%
  - E.g. Azure Cosmos DB (Database) service SLA offers 99.999 percent uptime
    - meaning it allows for about 5 minutes of total downtime per year.
    - also includes low-latency commitments of less than 10 milliseconds on DB read + write operations.
3. 📝 Service credits
  - Given to paying Azure customers if uptime percentage is lower than given in SLA.
  - Describe how Microsoft will respond if an Azure product or service fails to perform to its governing SLA's specification.
  - E.g. customers may have a discount applied to their Azure bill, as compensation for an under-performing Azure product or service.
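The uptime formula and the "about 5 minutes per year" figure above can be checked with a few lines of Python. This is a minimal sketch, not an official Azure calculation: the function names are made up for illustration, the year is assumed to have 365 days, and the 45 minutes of downtime is an invented example value.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def monthly_uptime_percent(max_available_minutes: float, downtime_minutes: float) -> float:
    """Monthly Uptime % = (Maximum Available Minutes - Downtime) / Maximum Available Minutes x 100."""
    return (max_available_minutes - downtime_minutes) / max_available_minutes * 100


def allowed_downtime_minutes(sla_percent: float, period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Downtime budget (in minutes) that an SLA percentage leaves over a period."""
    return (1 - sla_percent / 100) * period_minutes


if __name__ == "__main__":
    # A 30-day month has 43,200 minutes; 45 minutes of downtime is a hypothetical value.
    print(f"{monthly_uptime_percent(30 * 24 * 60, 45):.3f}%")
    # Yearly downtime budget for three, four, and five nines.
    for sla in (99.9, 99.99, 99.999):
        print(f"{sla}% -> {allowed_downtime_minutes(sla):.1f} min/year")
```

With these assumptions, 99.999% leaves roughly 5.3 minutes of downtime per year, which matches the Cosmos DB example.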
Read more: SLA Summary for Azure Services

Composite SLA

Result of combining SLAs across different service offerings.
📝 Calculating downtime
- E.g. web app (99.95% SLA from Azure) writes to SQL database (99.99% SLA from Azure)
  - Composite SLA = 99.95 percent × 99.99 percent ≈ 99.94 percent
    - = 0.9995 × 0.9999 ≈ 0.9994
  - Means the combined probability of failure is higher than that of either service on its own
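The multiplication above generalizes to any chain of services that must all be up at the same time. A minimal sketch, assuming the services fail independently; `composite_sla` is a hypothetical helper name, not an Azure API:

```python
from math import prod


def composite_sla(*availabilities: float) -> float:
    """Composite availability of services that must ALL be up.

    Each argument is a fraction (e.g. 0.9995 for 99.95%).
    Assumes the services fail independently of each other.
    """
    return prod(availabilities)


web_app = 0.9995  # 99.95% SLA
sql_db = 0.9999   # 99.99% SLA

combined = composite_sla(web_app, sql_db)
print(f"{combined:.4%}")  # 99.9400%
```

Every dependency added to the chain can only lower the result, since each factor is at most 1.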
You can improve the composite SLA by creating independent fallback paths.
- E.g. if the SQL Database is unavailable, you can put transactions into a queue for processing at a later time.
  - Web app (99.95%) writes to either SQL Database (99.99%) or queue (99.9%)
  - Application is still available even if it can't connect to the database.
    - ❗But it fails if both the database and the queue fail simultaneously.
  - If the expected percentage of time for a simultaneous failure is 0.0001 × 0.001
    - the composite SLA for this combined path of a database or queue would be:
      - 1.0 − (0.0001 × 0.001) = 99.99999 percent
  - If we add the queue to our web app, the total composite SLA is:
    - 99.95 percent × 99.99999 percent = ~99.95 percent
  - Improves SLA but application logic gets more complicated
    - You are paying more to add the queue support and there may be data-consistency issues you'll have to deal with due to retry behavior.
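The fallback calculation above can be reproduced as follows. This is a sketch under the same assumption the text makes, namely that the database and the queue fail independently; `fallback_availability` is a hypothetical helper name.

```python
def fallback_availability(*availabilities: float) -> float:
    """Availability of redundant paths: the group is down only if
    every path is down at the same time (independent failures assumed)."""
    all_down = 1.0
    for availability in availabilities:
        all_down *= 1.0 - availability
    return 1.0 - all_down


web_app = 0.9995  # 99.95% SLA
sql_db = 0.9999   # 99.99% SLA
queue = 0.999     # 99.9% SLA

# The web app needs the database OR the queue to be available.
db_or_queue = fallback_availability(sql_db, queue)
total = web_app * db_or_queue

print(f"database or queue: {db_or_queue:.5%}")
print(f"total composite:   {total:.4%}")
```

The web app itself becomes the limiting factor: no fallback behind it can lift the total above its own 99.95%.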

Application SLA

By creating your own SLAs, you can set performance targets to suit your specific Azure application.
💡 >= four 9's (99.99%) SLA performance targets =>
- manual recovery from failures may not be enough (it is hard to respond quickly enough)
- should have self-diagnosing & self-healing solutions.

Resiliency

Resiliency is the ability of a system to recover from failures and continue to function.
High availability and disaster recovery are two crucial components of resiliency
- 📝 Disaster recovery: When Godzilla destroys your data center, you do have alternative locations to keep providing your service and protocols/means for the other location to know how to keep delivering the service.
Failure Mode Analysis (FMA)
- Goal:
  - Identify possible points of failure.
  - Define how the application will respond to those failures.
Read more: Designing resilient applications for Azure

High availability

📝 Availability is often given as percentage uptime
Refers to the time that a system is functional and working.
Most providers prefer to maximize the availability of their Azure solutions by minimizing downtime.
- ❗ As you increase availability, you also increase the cost and complexity of your solution.
- As your solution grows in complexity, you will have more services depending on each other.
  - You might overlook possible failure points in your solution if you have any interdependent services.
  - 💡E.g. a workload that requires 99.99 percent uptime shouldn't depend upon a service with a 99.9 percent SLA.
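One way to catch such a mismatch early is to compare each dependency's SLA against the workload's target. The sketch below is illustrative only: the helper name, the service names, and the SLA figures are all hypothetical, and real values would have to come from the published SLA for each service.

```python
def weakest_links(target_percent: float, dependency_slas: dict[str, float]) -> list[str]:
    """Return the names of dependencies whose SLA is below the workload's target."""
    return sorted(
        name for name, sla in dependency_slas.items() if sla < target_percent
    )


# Hypothetical dependencies and SLA percentages, for illustration only.
dependencies = {"database": 99.99, "queue": 99.9, "cache": 99.9}

print(weakest_links(99.99, dependencies))  # ['cache', 'queue']
```

Any name this returns is a dependency that would need a fallback path or a higher service tier before the target is realistic.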
Read more: Availability choices for Azure compute

Day 05: What is azure data centers and Service-level Agreements (SLA)
