June 23, 2019

How reward functions affect our agents' decisions - Pedantic Reinforcement Learning (pt 5)

This post is part of a series. If you haven’t read the introduction yet, you might want to go back and read it so you can understand why we are doing this. So far, we’ve created an environment that simulates a cluster attempting to handle HTTP requests that come in varying volumes throughout the day. Our agent is responsible for adding and removing servers from our cluster to handle the traffic. Read more
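A reward function for an autoscaling agent like this typically trades off the cost of running servers against the penalty for failing requests. The sketch below is purely illustrative — the function name, weights, and exact penalty structure are assumptions, not the series' actual implementation:

```python
def reward(num_servers, dropped_requests,
           server_cost=1.0, drop_penalty=10.0):
    """Penalize both running servers (cost) and dropped requests.

    The weights are illustrative: a higher drop_penalty pushes the
    agent toward over-provisioning, while a higher server_cost
    pushes it toward running lean.
    """
    return -(num_servers * server_cost + dropped_requests * drop_penalty)
```

Tuning the ratio between these two weights is exactly the kind of decision that shapes the agent's behavior, which is what this post explores.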

June 22, 2019

Feature engineering and creating the environment - Pedantic Reinforcement Learning (pt 4)

This post is part of a series. If you haven’t read the introduction yet, you might want to go back and read it so you can understand why we are doing this. Feature engineering: so far, we’ve decided that our agent, at each timestep, can choose from one of three options:

- Add a new server to our cluster (up to a limit)
- Remove a server from our cluster (once all the requests that server is handling are done)
- Do nothing

And we’ve built out all the necessary tools for simulating both our stream of requests and our cluster responsible for handling those requests. Read more
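The three options above map naturally onto a small discrete action space. As a minimal sketch (the names, limits, and helper function here are assumptions for illustration, not the series' code):

```python
from enum import IntEnum

class Action(IntEnum):
    ADD_SERVER = 0     # spin up a new server (up to a limit)
    REMOVE_SERVER = 1  # drain and remove a server
    DO_NOTHING = 2     # leave the cluster as-is

def apply_action(num_servers, action, max_servers=10):
    """Return the new server count after applying a discrete action.

    Bounds are enforced here: the cluster can't exceed max_servers
    and always keeps at least one server running.
    """
    if action == Action.ADD_SERVER:
        return min(num_servers + 1, max_servers)
    if action == Action.REMOVE_SERVER:
        return max(num_servers - 1, 1)
    return num_servers
```

Keeping the action space this small makes the agent's policy easy to learn and to inspect, at the cost of only ever scaling one server per timestep.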

June 21, 2019

Simulating a cluster of servers - Pedantic Reinforcement Learning (pt 3)

This post is part of a series. If you haven’t read the introduction yet, you might want to go back and read it so you can understand why we are doing this. Creating the environment - choosing an action space: as with any reinforcement learning environment, we need to define our observation space, action space, and reward function. This post deals only with the action space. There are many ways to model the action space, and for this project, I experimented with a few before settling on the one I wanted. Read more

June 20, 2019

Simulating HTTP Traffic - Pedantic Reinforcement Learning (pt 2)

This post is part of a series. If you haven’t read the introduction yet, you might want to go back and read it so you can understand why we are doing this. Modeling random request volumes and durations: the first piece of our environment that we will tackle is simulating HTTP traffic. We will assume that the rate of HTTP requests varies throughout the day, but has some peak time periods and some slow time periods. Read more
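One simple way to get a daily cycle with peaks and troughs is a cosine over the hour of day plus multiplicative noise. This is a hedged sketch under assumed parameters (base/peak rates, a single peak at hour 14, ±10% noise) — not the distribution the series actually uses:

```python
import math
import random

def request_rate(hour, base=50.0, peak=200.0, noise=0.1):
    """Requests per minute at a given hour of day (0-24).

    A cosine centered on hour 14 gives one daily peak and an
    overnight trough; multiplicative noise adds random variation.
    """
    # cycle is 1.0 at hour 14 (peak) and 0.0 twelve hours away (trough)
    cycle = 0.5 * (1 + math.cos((hour - 14) / 24 * 2 * math.pi))
    rate = base + (peak - base) * cycle
    return rate * random.uniform(1 - noise, 1 + noise)
```

Sampling this rate each simulated minute, and drawing request durations from some distribution of your choice, is enough to drive a cluster simulation.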

June 19, 2019

Auto-Scaling a Cluster with RL - Pedantic Reinforcement Learning (pt 1)

When I was a kid, I played basketball at about the level you’d expect an unathletic future engineer to play - poorly and awkwardly. In our first game, my coach asked me to stay under our basket and wait for the ball, which I did. When the other team had the ball, I waited carefully on the other side of the court by our basket, confused about why everyone was yelling at me to leave my post. Read more