Working with ECS

My experience working with ECS and some tips.

Lately I've had the chance to dive headfirst into the container world. I had done a few small projects before, but some automation requirements meant I needed to build and deploy faster. Enter AWS ECS.

What is ECS?

In my words, AWS ECS is a container orchestrator for small projects. If you have ever worked with Docker, think of it as something like Docker Compose.

Non-tech definition: It's an easy and automated way to deploy a small set of code.

ECS handles most of the work for you. All you need is the URL of your container image. One note: I used ECS on EC2 with Amazon Linux 2. There will be nothing in here about Fargate.

What happened?

What didn't happen is the better question. ECS is easy to use (I think), but the issues I faced involved incorrect IAM roles, container logs, agent/VM logs, and coding problems. The IAM roles were the easiest to fix, but as a fairly new container person I was very confused about where to look for everything else. Here are some commands I used to verify what was going on.

curl -v http://<container-ip>:<container-port> : I used this command to make sure my container code was running. A 200 means success and anything else means you should look deeper.
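If you only care about the status code and not the full verbose output, the same check can be scripted. This is a minimal sketch, not the exact thing I ran: the function names http_status and is_ok are made up for illustration, and the 5 second timeout is an arbitrary choice.

```shell
# Print only the HTTP status code for a URL.
# curl prints 000 when it cannot connect at all.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Succeed only on a 200, matching the rule of thumb above.
is_ok() {
  [ "$1" = "200" ]
}

# Usage (fill in your own address and port):
#   code=$(http_status "http://<container-ip>:<container-port>")
#   if is_ok "$code"; then echo "container is up"; else echo "got $code, look deeper"; fi
```

A 000 usually means nothing is listening on that address or port, which points at networking rather than at your code.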

docker logs <container-id> : I used this command to view all the logging coming from my container. In production I do suggest you trim this down for security reasons, but logging "all the things" is very helpful in a development or staging environment.

PS: to get the container ID, run docker ps -a and grab the leftmost value (the CONTAINER ID column) from the row for your container, which is usually the first row.
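To skip the copy and paste, the ID can be pulled out directly. A rough sketch, assuming you know part of the container's name; the helper names are hypothetical, and it relies on docker ps listing the newest container first, which is its default ordering.

```shell
# List every container (running or stopped) with just the useful columns.
list_containers() {
  docker ps -a --format '{{.ID}}  {{.Image}}  {{.Status}}  {{.Names}}'
}

# Print the ID of the newest container whose name contains the given text.
newest_container_id() {
  docker ps -aq --filter "name=$1" | head -n 1
}

# Usage:
#   docker logs "$(newest_container_id <part-of-container-name>)"
```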

vi /var/log/ecs/ecs-agent.log : When everything looks fine in the docker logs but your tasks are still failing, look here. This is how I found out my container was low on memory. The agent also logs everything it does with timestamps, so you can follow the changes as a timeline.
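Scrolling through the whole agent log in vi gets old fast, so here is a sketch of how the search could be narrowed. The function names and the search pattern are my own assumptions, not anything ECS defines; adjust the pattern to whatever your failures look like. The second helper asks Docker whether the container was killed for running out of memory.

```shell
# Show the last 20 agent log lines that mention an error or memory.
# Defaults to the agent log path used above.
agent_errors() {
  log_file="${1:-/var/log/ecs/ecs-agent.log}"
  grep -iE 'error|memory' "$log_file" | tail -n 20
}

# Succeed if Docker recorded an out-of-memory kill for this container.
was_oom_killed() {
  [ "$(docker inspect --format '{{.State.OOMKilled}}' "$1")" = "true" ]
}

# Usage:
#   agent_errors
#   was_oom_killed <container-id> && echo "raise the memory limit in the task definition"
```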

docker exec -it <container-id> bash : When you need to get into your container and maybe dig deeper into the code, this is how you do it. If the image doesn't include bash, try sh instead.
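Slim images often ship without bash, so a small wrapper can try bash first and fall back to sh. This is only a sketch: shell_into is a made-up name, and it assumes the image has at least sh, which very minimal images may not.

```shell
# Open an interactive shell in a container: bash if present, otherwise sh.
shell_into() {
  docker exec -it "$1" sh -c 'command -v bash >/dev/null 2>&1 && exec bash || exec sh'
}

# Usage:
#   shell_into <container-id>
```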

docker container prune : This is just a cleanup command. It removes all stopped containers, which takes care of all those failed ones 😅

What I learned

I learned multiple things here and I just want to highlight a few.

  1. Networking: I'll be honest, I realized that I didn't know enough about networking while completing this task. I copied the structure of another project just to get set up fast and I somewhat regret it. Please take the time to understand the network configuration first.
  2. AWS autocreation: This might just be me, but don't let AWS create anything other than service roles for you. Even then, go read those service roles and make sure everything you need is there. This held me up for a day or two because it also sent me digging through networking.
  3. Running Docker locally isn't the same as running in ECS: With ECS the server acts as your computer. That means the 1001 scripts you created for a smooth manual process will not work. Sit down and map out the key parts, then refactor your Dockerfile and ECS task to work the way you want.
  4. Take more breaks: This has been the hardest concept to wrap my head around since my grad school nanoengineering classes. Why? I believe containers really do expose any gaps you might have in your networking background. It's fine though. Take breaks, talk it out with others, sleep, and everything will be OK. You'd be surprised how clear your Google searches get once you have taken a step back.

Resources

Service Event Error Meanings: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-event-messages.html

Agent Logs: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/logs.html

General Page: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/troubleshooting.html