Up-scaling

Autoscaler Overview #

The Stackbooster Autoscaler enhances Kubernetes cluster management by ensuring optimal pod placement and resource utilization. Built around a pod-driven approach, it turns Kubernetes scaling into a precision tool: it improves system availability, achieves more with fewer resources, and keeps costs from escalating. Through the functions described below, Stackbooster offers a sophisticated, reliable, and cost-effective way to manage cloud resources.

Core Functions of Stackbooster Autoscaler #

  • Efficient Pod Scheduling: Actively schedules pods that previously could not run because existing nodes lacked sufficient resources. By continually monitoring the cluster’s state, it dynamically allocates resources to accommodate new or pending pods; a simplified sketch of this pending-pod demand calculation follows this list.

  • Intelligent Node Selection and Scaling: Calculates the aggregate resource demand and selects the optimal types and sizes of nodes needed, influenced by various factors including the specific resource requirements of the pods and the cost-effectiveness of available spot or preemptible VMs.

  • Advanced Predictive Capabilities: Analyzes trends and usage patterns within the cluster to predict when scale-ups will soon be required and adds resources preemptively. This reduces wait times for pods that need resources immediately and improves overall system responsiveness and efficiency.

  • Optimization of Cluster Resources: Focuses on actual pod requirements to prevent scenarios where new nodes are added without hosting any pods or, conversely, where nodes critical to system operations are inadvertently removed.
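
As a rough illustration of the first two functions, the following sketch uses the Kubernetes Python client to list pending pods and sum their resource requests into an aggregate demand figure. It is a minimal approximation written for this page, not Stackbooster’s actual controller logic, and the quantity parsing covers only the common m/Mi/Gi suffixes.

    from kubernetes import client, config

    def parse_cpu(value: str) -> float:
        """Convert a Kubernetes CPU quantity ("500m" or "2") to cores."""
        return float(value[:-1]) / 1000 if value.endswith("m") else float(value)

    def parse_memory(value: str) -> float:
        """Convert a memory quantity ("128Mi", "2Gi") to MiB (common suffixes only)."""
        units = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}
        for suffix, factor in units.items():
            if value.endswith(suffix):
                return float(value[: -len(suffix)]) * factor
        return float(value) / (1024 ** 2)  # plain bytes

    def pending_pod_demand() -> tuple[float, float]:
        """Aggregate CPU (cores) and memory (MiB) requested by pending pods."""
        config.load_kube_config()
        v1 = client.CoreV1Api()
        pods = v1.list_pod_for_all_namespaces(field_selector="status.phase=Pending")
        cpu, mem = 0.0, 0.0
        for pod in pods.items:
            for container in pod.spec.containers:
                res = container.resources
                requests = res.requests if res and res.requests else {}
                cpu += parse_cpu(requests.get("cpu", "0"))
                mem += parse_memory(requests.get("memory", "0"))
        return cpu, mem

    if __name__ == "__main__":
        cores, mib = pending_pod_demand()
        print(f"Pending demand: {cores:.2f} cores, {mib:.0f} MiB")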

Cost Efficiency Highlights #

  • Spot and Preemptible VM Integration: Significantly reduces the cost of running Kubernetes clusters by leveraging lower-cost spot or preemptible VMs, ensuring the most cost-effective resources are utilized without compromising performance; a simplified selection sketch follows this list.

  • Dynamic Resource Allocation: Prevents over-provisioning by predicting and scaling exactly when needed using the most appropriate resources, minimizing waste and reducing expenditure.

  • Optimal Node Utilization: Ensures every node is optimally utilized, avoiding both underutilization and overcrowding, each of which increases costs through wasted resources or degraded performance.
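
To make the cost reasoning concrete, the sketch below picks the cheapest candidate instance type that covers a given aggregate demand, preferring a spot price whenever one is offered. The instance types and prices are illustrative placeholders, not Stackbooster data or real AWS pricing.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class InstanceType:
        name: str
        cpu: float                          # cores
        memory_mib: float
        on_demand_price: float              # $/hour, illustrative only
        spot_price: Optional[float] = None  # $/hour when spot capacity is offered

    # Hypothetical catalogue; real prices vary by region and over time.
    CATALOGUE = [
        InstanceType("m5.large",   2,  8192, 0.096, 0.035),
        InstanceType("m5.xlarge",  4, 16384, 0.192, 0.070),
        InstanceType("c5.2xlarge", 8, 16384, 0.340, None),
    ]

    def cheapest_fit(cpu_demand: float, mem_demand: float) -> InstanceType:
        """Pick the lowest-cost instance type that covers the aggregate demand,
        preferring the spot price whenever one is offered."""
        candidates = [
            it for it in CATALOGUE
            if it.cpu >= cpu_demand and it.memory_mib >= mem_demand
        ]
        if not candidates:
            raise ValueError("no single instance type covers the demand")
        return min(candidates, key=lambda it: it.spot_price or it.on_demand_price)

    print(cheapest_fit(3.0, 6144).name)  # -> m5.xlarge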

Data Collection and Transmission #

Scaling decisions within Stackbooster are driven by real-time data collected from the Kubernetes cluster. This data is gathered by the Stackbooster Controller, a dedicated component running within the client’s Kubernetes cluster. The controller is responsible for continuously monitoring the cluster’s state and transmitting this data to the Stackbooster server API every 10 seconds. This frequent update cycle ensures that our scaling decisions are based on the most current information, allowing for timely and precise adjustments to the cluster topology.
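
The sketch below shows the general shape of that collect-and-transmit loop: snapshot the node and pod state, serialize it, and POST it to the server API every 10 seconds. The endpoint URL and payload layout are invented for illustration; the real controller is installed as a packaged component and its wire format is not documented here.

    import time

    import requests
    from kubernetes import client, config

    # Placeholder endpoint; the real Stackbooster server API address is set
    # when the controller is installed in the cluster.
    STACKBOOSTER_API = "https://api.stackbooster.example/v1/cluster-state"

    def snapshot(v1: client.CoreV1Api) -> dict:
        """Collect a minimal view of cluster state: nodes and pending pods."""
        nodes = v1.list_node().items
        pending = v1.list_pod_for_all_namespaces(
            field_selector="status.phase=Pending"
        ).items
        return {
            "nodes": [
                {"name": n.metadata.name, "allocatable": n.status.allocatable}
                for n in nodes
            ],
            "pending_pods": [
                {"name": p.metadata.name, "namespace": p.metadata.namespace}
                for p in pending
            ],
        }

    def run() -> None:
        config.load_kube_config()  # a real in-cluster controller would use load_incluster_config()
        v1 = client.CoreV1Api()
        while True:
            requests.post(STACKBOOSTER_API, json=snapshot(v1), timeout=5)
            time.sleep(10)  # matches the controller's 10-second reporting interval

    if __name__ == "__main__":
        run()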

Decision Making Process #

Upon receiving data, the Stackbooster system analyzes the current cluster topology to determine immediate scaling needs, such as scaling up or down. This analysis includes not only real-time data but also historical metrics which aid in forecasting future capacity requirements. By predicting these needs, Stackbooster can proactively adjust the cluster’s capacity to handle anticipated changes in load, ensuring consistent performance and resource availability.
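
As a toy example of that forecasting step, the sketch below applies an exponentially weighted moving average to recent CPU-demand samples and adds headroom to produce a proactive capacity target. Stackbooster’s actual predictive models are not published; this only shows how historical metrics can feed a forward-looking target.

    def forecast_demand(samples: list[float], alpha: float = 0.3) -> float:
        """Exponentially weighted moving average of recent demand samples
        (for example, requested CPU cores observed over the last few minutes)."""
        estimate = samples[0]
        for value in samples[1:]:
            estimate = alpha * value + (1 - alpha) * estimate
        return estimate

    def capacity_target(samples: list[float], headroom: float = 1.2) -> float:
        """Provision slightly above the forecast so pods arriving before the
        next scaling cycle do not have to wait for new nodes."""
        return forecast_demand(samples) * headroom

    # Demand has been trending upward, so the target sits above the forecast.
    history = [4.0, 4.5, 5.0, 5.5, 6.0]
    print(f"target capacity: {capacity_target(history):.1f} cores")
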
To optimize the placement of workloads, Stackbooster employs a sophisticated, ML-powered bin-packing algorithm. This algorithm evaluates various node placement options to find the most cost-effective and efficient configuration that guarantees high availability. The algorithm considers multiple factors, including cost, resource needs, and predefined constraints, ensuring that each pod is scheduled on the ideal node within the cluster.
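
The ML-powered algorithm itself is proprietary, but the underlying bin-packing idea can be shown with a plain first-fit-decreasing sketch: sort pods by their requests and place each on the first candidate node with room, opening a new node only when nothing fits. The node size used here is an arbitrary example.

    from dataclasses import dataclass, field

    @dataclass
    class PodRequest:
        name: str
        cpu: float     # requested cores
        memory: float  # requested MiB

    @dataclass
    class Node:
        cpu_capacity: float
        mem_capacity: float
        pods: list = field(default_factory=list)

        def fits(self, pod: PodRequest) -> bool:
            used_cpu = sum(p.cpu for p in self.pods)
            used_mem = sum(p.memory for p in self.pods)
            return (used_cpu + pod.cpu <= self.cpu_capacity
                    and used_mem + pod.memory <= self.mem_capacity)

    def first_fit_decreasing(pods: list, cpu: float, mem: float) -> list:
        """Pack pods onto as few identical nodes as possible (FFD heuristic)."""
        nodes = []
        for pod in sorted(pods, key=lambda p: (p.cpu, p.memory), reverse=True):
            target = next((n for n in nodes if n.fits(pod)), None)
            if target is None:
                target = Node(cpu, mem)  # "scale up": provision another node
                nodes.append(target)
            target.pods.append(pod)
        return nodes

    pods = [PodRequest("web", 1.5, 2048), PodRequest("api", 1.0, 1024),
            PodRequest("job", 0.5, 512), PodRequest("cache", 1.0, 4096)]
    for i, node in enumerate(first_fit_decreasing(pods, cpu=2.0, mem=8192)):
        print(i, [p.name for p in node.pods])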

Scaling Decision Flow #

                        +--------------------------+
                        | Stackbooster Controller  |
                        | - Collects cluster state |
                        +------------+-------------+
                                     |
                                     | Data transmission
                                     | every 10 seconds
                                     v
                        +--------------------------+
                        | Stackbooster Server API  |
                        | - Analyzes data          |
                        | - Applies ML-powered     |
                        |   bin-packing algorithms |
                        +------------+-------------+
                                     |
                                     | Scaling decisions
                                     |
           +-------------------------+-------------------------+
           |                         |                         |
+----------+----------+   +----------+----------+   +----------+----------+
| Scale Up            |   | Optimal Node        |   | Scale Down          |
| - Add nodes based   |   |   Placement         |   | - Remove nodes      |
|   on demand         |   | - Ensures cost      |   |   based on          |
|                     |   |   effectiveness     |   |   underutilization  |
+---------------------+   | - High availability |   +---------------------+
                          +---------------------+

Customization Through Logical Node Templates #

Stackbooster provides customers with the flexibility to define logical node templates, or groups, which dictate custom parameters for nodes and the pods scheduled on them. These templates enable clients to align their technical and business objectives more closely with the infrastructure provisioned by Stackbooster. When these templates are specified, Stackbooster adjusts its node provisioning mechanisms accordingly to meet these custom requirements effectively.
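
The exact template schema is defined through the Stackbooster console and API; the sketch below uses an invented structure purely to show the kind of constraints a logical node template can express and how a provisioning decision might be checked against it.

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class NodeTemplate:
        """Hypothetical shape of a logical node template; the field names are
        illustrative, not Stackbooster's actual schema."""
        name: str
        instance_families: list                    # e.g. ["m5", "c5"]
        lifecycle: str = "spot"                    # "spot" or "on-demand"
        max_price_per_hour: Optional[float] = None
        labels: dict = field(default_factory=dict)
        taints: list = field(default_factory=list)

    def allowed(template: NodeTemplate, instance_type: str, price: float) -> bool:
        """Check whether an instance type satisfies the template's constraints."""
        family = instance_type.split(".")[0]
        if family not in template.instance_families:
            return False
        if template.max_price_per_hour is not None and price > template.max_price_per_hour:
            return False
        return True

    batch_workers = NodeTemplate(
        name="batch-workers",
        instance_families=["m5", "c5"],
        lifecycle="spot",
        max_price_per_hour=0.10,
        labels={"workload-class": "batch"},
    )
    print(allowed(batch_workers, "m5.xlarge", 0.07))   # True
    print(allowed(batch_workers, "p3.2xlarge", 3.06))  # False: family not allowed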

Proactive Spot Resource Management #

Stackbooster actively monitors the stability of preemptible spot resources on AWS, using advanced AI algorithms to predict likely interruptions. This proactive monitoring allows Stackbooster to replace machines that are likely to be interrupted before the interruption occurs, minimizing disruption and maintaining continuity in resource availability.
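
The interruption predictions themselves come from Stackbooster’s models and are not exposed in this document. Assuming a per-node risk score is available, the sketch below shows the proactive half of the workflow: cordon any node whose predicted interruption risk crosses a threshold so that replacement capacity can be provisioned and workloads rescheduled before the reclaim happens.

    from kubernetes import client, config

    RISK_THRESHOLD = 0.7  # illustrative cut-off, not a documented Stackbooster value

    def cordon_risky_nodes(predicted_risk: dict) -> list:
        """Mark nodes with a high predicted interruption risk as unschedulable.

        `predicted_risk` maps node names to an interruption probability; here it
        is an assumed input standing in for Stackbooster's prediction model.
        Draining and replacement provisioning would follow as separate steps.
        """
        config.load_kube_config()
        v1 = client.CoreV1Api()
        cordoned = []
        for name, risk in predicted_risk.items():
            if risk >= RISK_THRESHOLD:
                v1.patch_node(name, {"spec": {"unschedulable": True}})
                cordoned.append(name)
        return cordoned

    if __name__ == "__main__":
        print(cordon_risky_nodes({"ip-10-0-1-23": 0.85, "ip-10-0-2-41": 0.12}))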