AWS EKS Architecture Reference


Overview

This reference guide provides detailed architecture diagrams and deployment patterns for Orion's Unified Compute Plane on AWS EKS.

Architecture Components

Core Orion Platform Components

All deployment architectures include these essential Orion platform components:

Juno Workload Nodes

  • Workloads: Primary compute orchestration for user workstations and applications
  • Helios: Resource optimization and GPU utilization management (92% GPU utilization)
  • Purpose: Customer-facing compute workloads achieving industry-leading resource utilization

Juno Support Nodes

  • Kuiper: Workload management system for project namespaces
  • Terra: Application marketplace and ecosystem management
  • Titan: Authorization service and user management
  • Hubble: Project portal and workload management dashboard
  • NGINX: Ingress controller and load balancing
  • Genesis: Platform initialization and configuration management

AWS Infrastructure Components

Amazon EKS Control Plane

  • Managed Kubernetes API server with high availability
  • Automatic updates and security patches
  • Multi-AZ control plane for production deployments
  • Integrated with AWS IAM for authentication and authorization

Compute Resources

  • CPU Nodes: General Orion services and control plane components
  • GPU Nodes: High-performance workloads with optimized resource utilization
  • Auto Scaling Groups: Dynamic scaling based on workload demands

Networking Components

  • VPC: Isolated network environment with customizable CIDR blocks
  • Subnets: Public and private subnet configurations per availability zone
  • Security Groups: Stateful, instance-level firewall rules (protocol and port filtering)
  • Network Load Balancer: High-performance Layer 4 load balancing
  • Internet Gateway: External connectivity for public deployments
  • NAT Gateway: Secure internet access for private deployments

Deployment Architecture Patterns

1. Private Single AZ Deployment

Recommended for: High-security environments, development, compliance requirements

Architecture Characteristics

  • Security Model: Private subnets with no direct internet access
  • Internet Connectivity: NAT Gateway provides secure outbound access
  • Load Balancing: Network Load Balancer within VPC
  • Availability: Single AZ with cost optimization
  • Compliance: Suitable for regulated environments

Component Layout

AWS Account
└── VPC (10.0.0.0/16)
    └── Availability Zone
        ├── Private Subnet (10.0.11.0/24)
        │   ├── Juno Workload Nodes
        │   │   ├── Workloads (GPU optimized)
        │   │   └── Helios (Resource optimization)
        │   └── Juno Support Nodes
        │       ├── Kuiper (Platform core)
        │       ├── Terra & Titan (Ecosystem)
        │       ├── Hubble (Monitoring)
        │       ├── NGINX (Ingress)
        │       └── Genesis (Configuration)
        ├── NAT Gateway
        ├── Network Load Balancer
        └── Internet Gateway

Security Features

  • All worker nodes in private subnets
  • No direct internet access to compute resources
  • Encrypted data in transit and at rest
  • VPC-level network isolation

Cost Considerations

  • NAT Gateway: AWS hourly charge plus per-GB data processing fees
  • Reduced complexity compared to multi-AZ
  • Single AZ reduces cross-AZ data transfer costs
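The NAT Gateway line above combines a time-based charge with a data processing charge. A minimal sketch of that arithmetic follows. The function name and both rate parameters are hypothetical, and the sample values are placeholders rather than AWS prices, so substitute the current rates for your region.

```python
def nat_gateway_monthly_cost(hourly_rate_usd: float,
                             per_gb_rate_usd: float,
                             gb_processed: float,
                             hours: float = 730.0) -> float:
    """Estimate a monthly NAT Gateway bill in USD.

    Both rates are inputs, not real prices: look them up for your region.
    730 hours approximates one month of continuous operation.
    """
    if min(hourly_rate_usd, per_gb_rate_usd, gb_processed, hours) < 0:
        raise ValueError("rates, traffic, and hours must be non-negative")
    # Time-based charge plus data processing charge.
    return hours * hourly_rate_usd + gb_processed * per_gb_rate_usd


# Placeholder rates chosen only to make the arithmetic easy to follow.
estimate = nat_gateway_monthly_cost(hourly_rate_usd=0.05,
                                    per_gb_rate_usd=0.05,
                                    gb_processed=1000)
print(f"Estimated NAT Gateway cost: ${estimate:.2f}/month")
```

Because the data processing term grows with traffic, workloads that pull large images or datasets through the NAT Gateway can make that term dominate the estimate.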

2. Public Multi-AZ Deployment

Recommended for: Production environments, high availability, enterprise deployments

[Figure: AWS diagram]

Architecture Characteristics

  • High Availability: Multiple availability zones with automatic failover
  • Fault Tolerance: Zone-level isolation and redundancy
  • Load Distribution: Workloads spread across AZs
  • Scalability: Independent scaling per availability zone
  • Production Ready: Enterprise-grade reliability

Component Layout

AWS Account
└── VPC (10.0.0.0/16)
    ├── Availability Zone A
    │   └── Public Subnet (10.0.1.0/24)
    │       ├── Juno Workload Nodes (AZ-A)
    │       └── Juno Support Nodes (AZ-A)
    ├── Availability Zone B  
    │   └── Public Subnet (10.0.2.0/24)
    │       ├── Juno Workload Nodes (AZ-B)
    │       └── Juno Support Nodes (AZ-B)
    ├── Network Load Balancer (Multi-AZ)
    └── Internet Gateway
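Before provisioning, it can help to confirm that the subnet CIDRs in the layout above sit inside the VPC range and do not overlap. The sketch below uses Python's standard `ipaddress` module. The subnet labels and the `validate_layout` helper are hypothetical names introduced for illustration.

```python
import ipaddress
from itertools import combinations

# CIDR blocks taken from the Public Multi-AZ layout; labels are illustrative.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = {
    "az-a-public": ipaddress.ip_network("10.0.1.0/24"),
    "az-b-public": ipaddress.ip_network("10.0.2.0/24"),
}


def validate_layout(vpc, subnets):
    """Return a list of human-readable problems (empty if the layout is sane)."""
    problems = []
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside the VPC {vpc}")
    for (a, net_a), (b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} ({net_a}) overlaps {b} ({net_b})")
    return problems


for name, net in subnets.items():
    # AWS reserves five addresses in every subnet.
    print(f"{name}: {net} -> {net.num_addresses - 5} usable addresses")

print("Problems:", validate_layout(vpc, subnets) or "none")
```

The same check applies to the other layouts in this guide by swapping in their subnet CIDRs.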

High Availability Features

  • Cross-AZ workload distribution
  • Automatic node replacement
  • Multi-AZ load balancer
  • Zero-downtime updates

Cost Considerations

  • Higher compute costs due to redundancy
  • Cross-AZ data transfer charges
  • Enhanced availability justifies premium

3. Public Single AZ Deployment

Recommended for: Development, testing, cost-sensitive deployments, proof-of-concept

Architecture Characteristics

  • Cost Optimized: Minimal infrastructure overhead
  • Simplified Networking: Direct internet access for all nodes
  • EKSCTL Ready: Simplified deployment configuration
  • Development Focus: Ideal for non-production environments
  • Quick Setup: Fastest deployment option

Component Layout

AWS Account
└── VPC (10.0.0.0/16)
    └── Availability Zone
        └── Public Subnet (10.0.1.0/24)
            ├── Juno Workload Nodes
            │   ├── Workloads (Cost optimized)
            │   └── Helios (Resource optimization)
            └── Juno Support Nodes
                ├── Kuiper (Platform core)
                ├── Terra & Titan (Ecosystem)
                ├── Hubble (Monitoring)
                ├── NGINX (Ingress)
                └── Genesis (Configuration)

Cost Optimization Features

  • No NAT Gateway required
  • Minimal networking complexity
  • Single AZ reduces infrastructure costs
  • Direct internet connectivity

Development Benefits

  • Rapid deployment and teardown
  • Simplified debugging and troubleshooting
  • Cost-effective for temporary environments
  • Easy EKSCTL configuration
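To make the EKSCTL configuration mentioned above more concrete, here is a minimal sketch that assembles an eksctl-style `ClusterConfig` as a Python dictionary and prints it as JSON. It is not one of Orion's templates. The cluster name, region, zone names, node group names, instance types, and node counts are all assumptions chosen for illustration. Note that EKS itself expects cluster subnets in at least two availability zones, so this sketch lists two zones at the cluster level and pins the node groups to one of them.

```python
import json


def build_eksctl_config(cluster_name, region, cluster_zones, node_zone,
                        private_networking=False):
    """Return an eksctl-style ClusterConfig as a plain dict (illustrative only)."""
    if node_zone not in cluster_zones:
        raise ValueError("node_zone must be one of cluster_zones")

    def node_group(name, instance_type, capacity):
        return {
            "name": name,
            "instanceType": instance_type,
            "desiredCapacity": capacity,
            # Pin the nodes to a single zone for the single-AZ patterns.
            "availabilityZones": [node_zone],
            # True places nodes in private subnets behind a NAT Gateway.
            "privateNetworking": private_networking,
        }

    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": cluster_name, "region": region},
        "availabilityZones": list(cluster_zones),
        "vpc": {"cidr": "10.0.0.0/16"},
        "managedNodeGroups": [
            # Hypothetical sizing: adjust instance types and counts to your needs.
            node_group("juno-support", "m5.xlarge", 2),
            node_group("juno-workload", "g4dn.xlarge", 1),
        ],
    }


config = build_eksctl_config(
    cluster_name="orion-dev",           # assumed name
    region="us-east-1",                 # assumed region
    cluster_zones=["us-east-1a", "us-east-1b"],
    node_zone="us-east-1a",
)
print(json.dumps(config, indent=2))
```

Passing `private_networking=True` approximates the Private Single AZ pattern instead. JSON is a subset of YAML, so the output should be readable wherever a YAML config is expected, but validate it against the eksctl schema before use.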

Architecture Selection Decision Matrix

Requirement       | Private Single AZ | Public Multi-AZ    | Public Single AZ
High Security     | ✅ Excellent      | ⚠️ Good            | ⚠️ Basic
High Availability | ❌ Single Point   | ✅ Excellent       | ❌ Single Point
Cost Optimization | ⚠️ Moderate       | ❌ Highest         | ✅ Lowest
Development       | ⚠️ Suitable       | ❌ Overprovisioned | ✅ Ideal
Production        | ⚠️ Limited HA     | ✅ Recommended     | ❌ Not Recommended
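One way to apply the matrix is to turn its ratings into numbers and weight them by how much each requirement matters to you. The sketch below is an assumption-laden illustration, not an official selection tool. The 2/1/0 scoring for ✅/⚠️/❌ is invented here, it flattens distinctions such as "Good" versus "Basic", and `rank_architectures` is a hypothetical helper.

```python
# Ratings transcribed from the matrix: 2 = ✅, 1 = ⚠️, 0 = ❌ (assumed scale).
ARCHITECTURES = ("Private Single AZ", "Public Multi-AZ", "Public Single AZ")
MATRIX = {
    "High Security":     (2, 1, 1),
    "High Availability": (0, 2, 0),
    "Cost Optimization": (1, 0, 2),
    "Development":       (1, 0, 2),
    "Production":        (1, 2, 0),
}


def rank_architectures(weights):
    """Rank architectures by weighted score, best first.

    weights maps a requirement name from MATRIX to a non-negative importance.
    """
    unknown = set(weights) - set(MATRIX)
    if unknown:
        raise KeyError(f"unknown requirements: {sorted(unknown)}")
    scores = {arch: 0 for arch in ARCHITECTURES}
    for requirement, weight in weights.items():
        for arch, rating in zip(ARCHITECTURES, MATRIX[requirement]):
            scores[arch] += weight * rating
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Example: availability matters most, production readiness second.
for arch, score in rank_architectures({"High Availability": 3, "Production": 2}):
    print(f"{arch}: {score}")
```

A weighted sum is only a starting point. A hard constraint, such as a compliance rule that forbids public subnets, should exclude an option outright rather than merely lower its score.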

Network Security Considerations

Security Groups Configuration

Orion Platform Security Groups

  • Workload Nodes: Ports 80, 443, 22 (SSH), custom application ports
  • Support Nodes: Kubernetes API (6443), etcd (2379-2380), kubelet (10250)
  • Load Balancer: HTTP (80), HTTPS (443), health check ports
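The port lists above can be captured as data so that a proposed rule change is easy to check. The following sketch models only the ports named in this section. The group keys, the `PortRule` class, and `allowed_service` are hypothetical, and the unspecified "custom application ports" and "health check ports" are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PortRule:
    label: str
    first: int
    last: int

    def matches(self, port: int) -> bool:
        return self.first <= port <= self.last


# Ports transcribed from the list above; group names are illustrative.
SECURITY_GROUPS = {
    "workload-nodes": [
        PortRule("HTTP", 80, 80),
        PortRule("HTTPS", 443, 443),
        PortRule("SSH", 22, 22),
    ],
    "support-nodes": [
        PortRule("Kubernetes API", 6443, 6443),
        PortRule("etcd", 2379, 2380),
        PortRule("kubelet", 10250, 10250),
    ],
    "load-balancer": [
        PortRule("HTTP", 80, 80),
        PortRule("HTTPS", 443, 443),
    ],
}


def allowed_service(group: str, port: int) -> Optional[str]:
    """Return the label of the rule that admits this port, or None if denied."""
    for rule in SECURITY_GROUPS[group]:
        if rule.matches(port):
            return rule.label
    return None


print(allowed_service("support-nodes", 2380))
print(allowed_service("load-balancer", 22))
```

A real security group rule also specifies a protocol and a traffic source, which this section does not list, so both are omitted here rather than guessed.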

Network ACLs

  • Subnet-level security controls
  • Default deny with explicit allow rules
  • Separate ACLs for public and private subnets

Data Flow and Communication

East-West Traffic (Inter-service communication)

  • Service mesh for secure service-to-service communication
  • Network policies for pod-to-pod communication
  • Internal DNS resolution via CoreDNS
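As an illustration of the network policies mentioned above, the sketch below builds a Kubernetes `NetworkPolicy` manifest that admits ingress to every pod in one namespace only from pods in another namespace. The namespace names and port are hypothetical, and this is not an Orion-supplied policy. Enforcement also depends on the cluster's network plugin supporting network policies.

```python
import json


def namespace_ingress_policy(namespace, allowed_namespace, port):
    """Build a NetworkPolicy dict allowing TCP ingress from one namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"allow-from-{allowed_namespace}",
            "namespace": namespace,
        },
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    # Label Kubernetes sets on each namespace.
                                    "kubernetes.io/metadata.name": allowed_namespace
                                }
                            }
                        }
                    ],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical example: only the ingress namespace may reach project pods.
policy = namespace_ingress_policy("project-a", "ingress-nginx", 8080)
print(json.dumps(policy, indent=2))
```

Once a pod is selected by a policy that lists `Ingress`, any ingress traffic not matched by a rule is dropped, which gives the pod-to-pod isolation the bullet describes.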

North-South Traffic (External access)

  • Application Load Balancer for HTTPS termination
  • Network Load Balancer for high-performance TCP/UDP
  • WAF integration for application-layer security

Resource Optimization by Architecture

Performance Characteristics

Private Single AZ

  • GPU Utilization: 92% (same as all architectures)
  • CPU Utilization: 87% (optimized for single-zone placement)
  • RAM Utilization: 85% (efficient resource allocation)
  • Network Latency: Lowest (single AZ, no cross-AZ traffic)

Public Multi-AZ

  • GPU Utilization: 92% (maintained across zones)
  • CPU Utilization: 87% (distributed load balancing)
  • RAM Utilization: 85% (cross-AZ optimization)
  • Network Latency: Slightly higher (cross-AZ communication)

Public Single AZ

  • GPU Utilization: 92% (full optimization maintained)
  • CPU Utilization: 87% (simplified networking overhead)
  • RAM Utilization: 85% (single-zone efficiency)
  • Network Latency: Lowest (direct internet, single AZ)
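For capacity planning, the utilization figures above can be translated into productive hours. The sketch below takes the 92% GPU figure as its default. The function name and the fleet size in the example are assumptions for illustration.

```python
def effective_gpu_hours(gpu_count: int, hours: float,
                        utilization: float = 0.92) -> float:
    """Return the GPU-hours doing useful work at a given utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if gpu_count < 0 or hours < 0:
        raise ValueError("gpu_count and hours must be non-negative")
    return gpu_count * hours * utilization


# Hypothetical fleet: 8 GPUs running for one day.
provisioned = 8 * 24
productive = effective_gpu_hours(gpu_count=8, hours=24)
print(f"{productive:.2f} of {provisioned} provisioned GPU-hours are productive")
```

Since the stated utilization is the same for all three architectures, this estimate does not change with the deployment pattern. Only the cost per provisioned hour does.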

Deployment Planning Worksheet

Architecture Selection Checklist

Security Requirements:

  • Compliance requirements (SOC2, FedRAMP, etc.)
  • Network isolation needs
  • Data residency requirements
  • Access control complexity

Availability Requirements:

  • Acceptable downtime tolerance
  • Multi-zone fault tolerance needs
  • Disaster recovery requirements
  • Maintenance window flexibility

Cost Constraints:

  • Infrastructure budget limits
  • Development vs. production environment
  • Temporary vs. long-term deployment
  • Cost optimization priority level

Performance Requirements:

  • User count and concurrency
  • GPU workload intensity
  • Network latency sensitivity
  • Storage performance needs

Use Case              | Recommended Architecture | Rationale
Production Enterprise | Public Multi-AZ          | High availability, fault tolerance
Regulated Industries  | Private Single AZ        | Security, compliance, network isolation
Development Teams     | Public Single AZ         | Cost efficiency, rapid deployment
Proof of Concept      | Public Single AZ         | Quick setup, minimal complexity
Hybrid Cloud          | Private Single AZ        | Integration with on-premises
Global Deployment     | Public Multi-AZ          | Geographic distribution, redundancy

Implementation Resources

EKSCTL Configuration Templates

Each architecture can be adjusted in these EKSCTL configuration files:

Deployment Guides

Support Resources


Note: All architectures deliver Orion's signature resource optimization performance. Architecture choice should be based on security, availability, and cost requirements rather than performance considerations.

Disclaimer: All cost estimates and deployment times are based on current AWS pricing and may vary by region and usage patterns. Contact our sales team for precise pricing and customized cost analysis based on your specific requirements.