How to manage my docker-based containers on root360 cloud platform?




Learn the commands to manage and analyze your containers on root360 cloud platform

...


Terminology

  • Container: a docker container

  • Container type: a term describing a set of containers that have the same role, e.g. api, core, booking-engine. It is defined by a corresponding Task Definition.

  • ECS Service: manages the containers of a certain container type.

  • Image: a docker image

  • Docker Cluster: A logical entity consisting of a number of docker hosts

  • Task Definition: an extended, ECS-specific docker compose file (JSON)

  • ECS Task: Instance of a Task Definition, usually represents a single container

  • Desired Tasks: A number of containers based on the same image (horizontal scale)

  • Revision: Version of a Task Definition that has a set of properties, such as an image-URL pointing to an image in a registry

  • Registry: a system for storing and managing images

  • Repository: a logical space separator within the registry to separate images, e.g. by type or purpose

The deployment process

Docker Container Deployment is the process of updating a fleet of running docker containers that are operated via Amazon EC2 Container Service (ECS).

...

The main difference between replacing and rolling updates is that deployments run with zero downtime only in the latter case. However, additional capacity has to be provided permanently, which increases the cost of the permanently running docker hosts in the cluster.

Process example

  • Initiation of deployment from the Jump-Server via CLI for a Container Type providing a new image-URL

  • Creation of a new ECS Task Definition with the new image-URL, which has its own version number (revision)

  • Update of the respective ECS Service managing the Containers of the Container Type of consideration

  • Creation of new Containers based on the new image-URL in the Docker Cluster

  • All new Containers are taken into service by the Loadbalancer

  • Connections to all old containers are drained by the Loadbalancer, meaning established ones are allowed to finish, but no new ones are established to them

  • Shutdown of all old containers

  • Finish
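The second step of this process, creating a new Task Definition revision that carries the new image-URL, can be sketched in Python as follows. This is only an illustration and not the platform's actual implementation: the function name `with_new_image` and the sample input are hypothetical, and the sketch assumes the Task Definition follows the standard ECS JSON layout with a `containerDefinitions` list whose entries have an `image` field.

```python
import copy


def with_new_image(task_definition: dict, image_url: str) -> dict:
    """Return a copy of a task definition with every container image replaced.

    Assumption: standard ECS JSON layout, i.e. a "containerDefinitions"
    list whose entries carry an "image" key.
    """
    updated = copy.deepcopy(task_definition)  # leave the caller's dict untouched
    for container in updated.get("containerDefinitions", []):
        container["image"] = image_url
    return updated


# Hypothetical input; only the family name and image-URL appear in this document.
old = {
    "family": "fbill-fbservice-stage-account",
    "containerDefinitions": [{"name": "app", "image": "example/old-image:1"}],
}
new = with_new_image(old, "amazon/amazon-ecs-sample")
print(new["containerDefinitions"][0]["image"])
```

Working on a copy keeps the previous definition available, which matters for the roll-back scenario where the last known stable state has to be deployed again.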

Safeguards

Several mechanisms are in place to recognize success and to handle failure and roll-back:

  • Container state verification

    • A deployment is successful if all new containers are recognized as running and the Loadbalancer has taken them into service successfully

  • Image-URL verification

    • Checks that the given image-URL is a valid string

    • If the docker registry that the images are pulled from is part of your environment (within the same AWS account), we try to locate the given image

  • Roll-back by Roll-forward in case of Timeout

    • If a deployment cannot finish for whatever reason and runs into a timeout, a roll-back is initiated

  • Roll-back by Roll-forward in case of failed deployment

    • If a deployment fails to bring up the new docker containers, a roll-back is initiated

  • Circuit Breaker

    • If a deployment fails to bring up the new docker containers, AWS ECS changes the interval at which it tries to start new containers from as soon as possible to every 5 minutes. Ultimately it stops the deployment process after around 1.5 hours.
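As an illustration of the image-URL verification safeguard, the following Python sketch checks whether a string looks like a docker image reference. The exact rules applied by the platform are not documented here, so the pattern below is an assumption: a deliberately simplified form of `[registry[:port]/]repository[:tag]` that ignores digests and other details of the full docker reference grammar. The name `is_valid_image_url` is hypothetical.

```python
import re

# Simplified, assumed pattern: optional registry host, repository path, optional tag.
IMAGE_URL = re.compile(
    r"(?:[a-z0-9.-]+(?::[0-9]+)?/)?"            # optional registry host[:port]
    r"[a-z0-9]+(?:[._-][a-z0-9]+)*"             # first repository path component
    r"(?:/[a-z0-9]+(?:[._-][a-z0-9]+)*)*"       # further path components
    r"(?::[A-Za-z0-9_][A-Za-z0-9_.-]{0,127})?"  # optional tag
)


def is_valid_image_url(value) -> bool:
    """Return True if value is a string that matches the simplified pattern."""
    return isinstance(value, str) and IMAGE_URL.fullmatch(value) is not None


print(is_valid_image_url("amazon/amazon-ecs-sample"))
print(is_valid_image_url("not a valid image"))
```

A syntactic check like this only catches malformed input early; it cannot tell whether the image actually exists in the registry.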

The roll-back process

If a deployment process is not successful it must be dealt with. However, AWS ECS does not offer a mechanism to just stop the deployment process (and undo steps taken so far).

To cope with this, three options exist:

  • If an issue with the docker cluster or the loadbalancer is reported during the deployment process, solve it (fast)

  • If the image to be deployed is broken, initiate a new deployment with another one

  • Initiate a deployment process based on the last known stable state

The latter two are called roll-forward deployments.

The phrase Roll-back by Roll-forward used in the Safeguards section above thus refers to:

  • recognize the running deployment as failed

  • identify the last known stable state (which is described by the Task Definition, which was associated with the ECS Service before we initiated the deployment)

  • initiate a new deployment referencing this last known stable state
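The decision behind Roll-back by Roll-forward can be sketched as a small Python function. This is a simplified model, not the platform's code: the outcome labels `"success"`, `"timeout"` and `"failed"`, the function name `rollback_target`, and the revision number in the example ARN are assumptions made for illustration.

```python
from typing import Optional


def rollback_target(outcome: str, last_stable_arn: str) -> Optional[str]:
    """Return the Task Definition to deploy next, or None if nothing is needed.

    outcome is a hypothetical label for the result of the running deployment.
    """
    if outcome == "success":
        return None  # the new revision stays in place
    if outcome in ("timeout", "failed"):
        # Roll back by rolling forward: deploy the last known stable state again.
        return last_stable_arn
    raise ValueError(f"unknown deployment outcome: {outcome!r}")


# Hypothetical ARN of the Task Definition that was active before the deployment.
stable = "arn:aws:ecs:eu-west-1:123456789:task-definition/CONTAINER-TYPE-Name:29"
print(rollback_target("timeout", stable))
```

The key point is that the last known stable Task Definition has to be recorded before the deployment starts, because it is the only input the roll-forward needs.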

Native docker commands

The following commands can be called via "sudo" by default, meaning in every environment:

  • docker ps, docker ps -a

  • docker images, docker images -a

  • docker logs [OPTION] CONTAINERID

  • docker exec -it CONTAINERID bash, docker exec -it CONTAINERID sh

  • docker stats, docker stats CONTAINERID

  • docker inspect CONTAINERID

For non-productive environments root360 can also enable access to the following commands e.g. to support debugging and analysis:

  • docker run

  • docker pull

  • docker start

  • docker create

You can find further details about the individual commands in docker cli reference.

...

root360 Cloud Platform Management Suite commands


Constraints

All commands operating on docker hosts and containers and the tasks to manipulate them are available in the "r3 container" subcommand.

Aliases are not supported.

...

What are the general command line options?

...

...

r3 container overview
USER@JUMPSERVER:~$ r3 container -h

# Command Response:
Manage Docker Hosts and Containers on root360 Cloud Platform

optional arguments:
  -h, --help          show this help message and exit

Command Overview:
  {list,deploy,show}
    list              List Container by Hosts
    deploy            Deploy Containers to Hosts
    show              Show detailed Information

Which container is currently running on which host, and which resources are reserved?

...

r3 container list
USER@JUMPSERVER:~$ r3 container list 

# Command Response:
  host: 10.12.57.129  | CPU (free/max): 1024/1024  | memory (free/max): 2000/2000
    container name                      	  image                    	  reserved CPU   	  reserved Memory	  state    	  launched         
    -                                   	  -                        	  -              	  -              	  -        	  -
  host: ...
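If the host summary lines need to be processed further, for example to spot hosts without free capacity, they can be parsed with a regular expression. The Python sketch below assumes the output format stays exactly as shown in the example above; the function name `parse_host_line` and the result keys are hypothetical, and the units of the CPU and memory figures are not interpreted.

```python
import re

# Pattern for one "host:" summary line as printed by "r3 container list".
HOST_LINE = re.compile(
    r"host:\s*(?P<host>\S+)\s*\|"
    r"\s*CPU \(free/max\):\s*(?P<cpu_free>\d+)/(?P<cpu_max>\d+)\s*\|"
    r"\s*memory \(free/max\):\s*(?P<mem_free>\d+)/(?P<mem_max>\d+)"
)


def parse_host_line(line: str):
    """Return the host and its free/max figures as a dict, or None if the line does not match."""
    match = HOST_LINE.search(line)
    if match is None:
        return None
    result = {"host": match.group("host")}
    for key in ("cpu_free", "cpu_max", "mem_free", "mem_max"):
        result[key] = int(match.group(key))
    return result


line = "  host: 10.12.57.129  | CPU (free/max): 1024/1024  | memory (free/max): 2000/2000"
print(parse_host_line(line))
```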

Which container types can be deployed?

...

r3 container list --container-types
USER@JUMPSERVER:~$ r3 container list --container-types

# Command Response:
['container-type-a', 'container-type-b', ...]

How to start a new deployment for container-type x?

...

r3 container deploy --container-type CONTAINER-TYPE-Name --image-url Registry/Repository:Tag
USER@JUMPSERVER:~$ r3 container deploy --container-type fbill-fbservice-stage-account --image-url amazon/amazon-ecs-sample

# Command Response:
Replacing amazon/amazon-ecs-sample for image in /usr/local/etc/update-ecs-service/fbill-fbservice-stage-account.json.
Updated image with amazon/amazon-ecs-sample.
Registered task definition arn:aws:ecs:eu-central-1:385026310112:task-definition/fbill-fbservice-stage-account:3.
Initiated Service update.
Waiting for completion of service update.
Service update complete.
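The line starting with "Registered task definition" in the response above carries the new revision number after the last colon of the ARN. A minimal Python sketch for extracting it, assuming the wording of the output stays as shown (the function name `registered_revision` is hypothetical):

```python
import re

# Matches the ".../task-definition/<family>:<revision>" tail of a Task Definition ARN.
REGISTERED = re.compile(r"task-definition/(?P<family>[^:\s/]+):(?P<revision>\d+)")


def registered_revision(output: str):
    """Return (family, revision) found in the deploy output, or None if absent."""
    match = REGISTERED.search(output)
    if match is None:
        return None
    return match.group("family"), int(match.group("revision"))


line = (
    "Registered task definition "
    "arn:aws:ecs:eu-central-1:385026310112:task-definition/fbill-fbservice-stage-account:3."
)
print(registered_revision(line))
```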

Which image are the currently running containers of type X based on?

...


r3 container show --container-type CONTAINER-TYPE-Name --image
USER@JUMPSERVER:~$ r3 container show --container-type CONTAINER-TYPE-Name --image

# Command Response:
[
  "amazon/amazon-ecs-sample"
]

Show the latest events for a given container type

...

...

r3 container show --container-type CONTAINER-TYPE-Name --events
USER@JUMPSERVER:~$ r3 container show --container-type CONTAINER-TYPE-Name --events


# Command Response:
[
  {
    "events": [
      {
        "message": "(service CONTAINER-TYPE-Name) has reached a steady state.",
        "id": "4c23824b-ae16-4c33-aeb1-f4f9fd0b713b",
        "createdAt": 1448643717.709
      },
      {
        "message": "(service CONTAINER-TYPE-Name) stopped 1 running tasks.",
        "id": "55bd871a-ab68-4f1b-87fe-77f8f0135e59",
        "createdAt": 1448643675.905
      },
      {
        "message": "(service CONTAINER-TYPE-Name) has begun draining connections on 1 tasks.",
        "id": "5608da62-6697-44f0-891e-9a5911b36bdc",
        "createdAt": 1448643665.745
      },
      {
        "message": "(service CONTAINER-TYPE-Name) deregistered 1 instances in (elb port-r3WebELB-TOKW867P8Q3N)",
        "id": "ff828587-a1d8-4a64-bed2-b935db28f2ac",
        "createdAt": 1448643665.744
      },
      {
        "message": "(service CONTAINER-TYPE-Name) registered 1 instances in (elb port-r3WebELB-TOKW867P8Q3N)",
        "id": "04226066-b501-42a8-a9cf-118cbc076a62",
        "createdAt": 1448643643.593
      },
      {
        "message": "(service CONTAINER-TYPE-Name) deregistered 1 instances in (elb port-r3WebELB-TOKW867P8Q3N)",
        "id": "341d0d15-1d9e-4a19-bb8f-5be2ddea5ed7",
        "createdAt": 1448643465.881
      },
      {
        "message": "(service CONTAINER-TYPE-Name) has started 1 tasks: (task 5d1c1443-0dff-4378-bce3-ed116db70b96).",
        "id": "6a85e881-2514-43c3-b039-7ba9daa20cab",
        "createdAt": 1448643465.881
      },
...
      {
        "message": "(service CONTAINER-TYPE-Name) has reached a steady state.",
        "id": "428eaa0f-2c0b-484b-883d-782fa5af65a7",
        "createdAt": 1448534247.066
      }
    ]
  }
]
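The `createdAt` values in the event log are numeric timestamps. Assuming they are Unix epoch seconds, the events can be put into chronological order and made readable with a few lines of Python. The function name `format_events` is hypothetical; the sample data is copied from the first two events above.

```python
from datetime import datetime, timezone


def format_events(events):
    """Return (UTC timestamp, message) pairs, oldest event first."""
    ordered = sorted(events, key=lambda event: event["createdAt"])
    return [
        (
            datetime.fromtimestamp(event["createdAt"], tz=timezone.utc).strftime(
                "%Y-%m-%d %H:%M:%S"
            ),
            event["message"],
        )
        for event in ordered
    ]


sample = [
    {
        "message": "(service CONTAINER-TYPE-Name) has reached a steady state.",
        "id": "4c23824b-ae16-4c33-aeb1-f4f9fd0b713b",
        "createdAt": 1448643717.709,
    },
    {
        "message": "(service CONTAINER-TYPE-Name) stopped 1 running tasks.",
        "id": "55bd871a-ab68-4f1b-87fe-77f8f0135e59",
        "createdAt": 1448643675.905,
    },
]
for timestamp, message in format_events(sample):
    print(timestamp, message)
```

Sorting oldest first turns the log into a timeline of the deployment: tasks started, instances registered, connections drained, old tasks stopped, steady state reached.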

Show the deployment status for a container type

...

sudo show-ecs-service-status.py -s CONTAINER-TYPE-Name
# Not yet supported by r3 cli! Alternative:
USER@JUMPSERVER:~$ r3 container show --container-type CONTAINER-TYPE-Name


# Command Response:
[
  {
    "status": "ACTIVE",
    "desired": 1,
    "running": 1,
    "taskDefinition": "arn:aws:ecs:eu-west-1:123456789:task-definition/CONTAINER-TYPE-Name:30",
    "deploys": {
      "status": "PRIMARY",
      "pendingCount": 0,
      "desiredCount": 1,
      "runningCount": 1,
      "createdAt": 1448643341.737,
      "updatedAt": 1448643341.737,
      "id": "ecs-svc/9223370588211434070",
      "taskDefinition": "arn:aws:ecs:eu-west-1:123456789:task-definition/CONTAINER-TYPE-Name:30"
    },
    "pending": 0
  }
]

The important points here are:

  • The service is ACTIVE, which means that containers are started in the cluster

  • "desired" specifies the number of container instances that should run

  • "running" indicates how many container instances actually run

  • "pending" indicates how many container instances are still in the start process

  • "deploys" describes existing deployment processes or the current state

    • status: "PRIMARY" specifies which container revision ("taskDefinition") is currently running

  • if there is more than one block under "deploys", a migration process is displayed as part of an ongoing deployment (rolling update)
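The points above can be combined into a simple check of whether a deployment has settled. The Python sketch below works on one entry of the status response shown earlier. It is an interpretation of the fields, not an official rule: the function name `deployment_settled` is hypothetical, and the sketch assumes "deploys" may be either a single block (as in the example) or a list of blocks during a rolling update.

```python
def deployment_settled(service: dict) -> bool:
    """Return True if the service looks stable according to the status fields."""
    deploys = service.get("deploys", [])
    if isinstance(deploys, dict):
        deploys = [deploys]  # the example response shows a single block
    return (
        service.get("status") == "ACTIVE"
        and service.get("running") == service.get("desired")
        and service.get("pending") == 0
        and len(deploys) == 1  # more than one block means a deployment is ongoing
        and deploys[0].get("status") == "PRIMARY"
    )


# Reduced copy of the example response above.
status = {
    "status": "ACTIVE",
    "desired": 1,
    "running": 1,
    "deploys": {"status": "PRIMARY", "pendingCount": 0, "desiredCount": 1, "runningCount": 1},
    "pending": 0,
}
print(deployment_settled(status))
```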

Show detailed information about all containers of container-type X (see following example for explanation)


container-status Value

The value for the attribute --container-status can be either lookup or a valid ECS TaskID. The ECS TaskID is the last portion of the TaskARN.

You can obtain the value from:

  • an unspecific lookup as shown below

  • the event log, e.g. task 5d1c1443-0dff-4378-bce3-ed116db70b96 shown in the example above

Example:

  • TaskARN = "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069"

  • TaskID = 29dce5e3-16d8-45fe-acc2-0da74d93f069
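Deriving the TaskID from a TaskARN is a plain string operation: take everything after the last slash. A minimal Python sketch (the function name `task_id_from_arn` is hypothetical):

```python
def task_id_from_arn(task_arn: str) -> str:
    """Return the last portion of a TaskARN, i.e. the text after the final '/'."""
    return task_arn.rsplit("/", 1)[-1]


arn = "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069"
print(task_id_from_arn(arn))
```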


r3 container show --container-type CONTAINER-TYPE-Name --container-status lookup
USER@JUMPSERVER:~$ r3 container show --container-type CONTAINER-TYPE-Name --container-status lookup

# Command Response:
[
  [
    {
      "taskArn": "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069", 
      "group": "service:company-project-prod-monitor", 
      "containers": [
        {
          "containerArn": "arn:aws:ecs:eu-west-1:XXXX:container/0844bc9c-ff24-459e-b230-715cc72df95b", 
          "taskArn": "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069", 
          "lastStatus": "RUNNING", 
          "name": "company-project-prod-monitor", 
          "networkBindings": [
            {
              "bindIP": "0.0.0.0", 
              "protocol": "tcp", 
              "containerPort": 9100, 
              "hostPort": 9100
            }, 
            {
              "bindIP": "0.0.0.0", 
              "protocol": "tcp", 
              "containerPort": 9126, 
              "hostPort": 9126
            }, 
            {
              "bindIP": "0.0.0.0", 
              "protocol": "tcp", 
              "containerPort": 8080, 
              "hostPort": 8085
            }
          ]
        }
      ], 
      "overrides": {
        "containerOverrides": [
          {
            "name": "company-project-prod-monitor"
          }
        ]
      }, 
      "lastStatus": "RUNNING", 
      "containerInstanceArn": "arn:aws:ecs:eu-west-1:XXXX:container-instance/f5f5d9ac-ded3-4780-9453-a07ea3108350", 
      "version": 3, 
      "clusterArn": "arn:aws:ecs:eu-west-1:XXXX:cluster/company-project-prod", 
      "desiredStatus": "RUNNING", 
      "startedAt": 1508248278.273, 
      "taskDefinitionArn": "arn:aws:ecs:eu-west-1:XXXX:task-definition/company-project-prod-monitor:24", 
      "startedBy": "ecs-svc/9223370528606525657", 
      "createdAt": 1508248275.161
    }
  ], 
  [
    {
      "taskArn": "arn:aws:ecs:eu-west-1:XXXX:task/2c12b253-3f2f-4277-8b9a-4b8d26a5a75e", 
      "group": "service:company-project-prod-monitor", 
      ...
    }
  ], 
  [
    {
      "taskArn": "arn:aws:ecs:eu-west-1:XXXX:task/83ba896d-582e-487c-bc94-19b3e02b8742", 
      "group": "service:company-project-prod-monitor", 
    ...    
    }
  ]
]

Show detailed information of a dedicated container of container type X


container-status Value

The value for the attribute --container-status can be either lookup or a valid ECS TaskID. The ECS TaskID is the last portion of the TaskARN.

You can obtain the value from:

  • an unspecific lookup as shown above

  • the event log, e.g. task 5d1c1443-0dff-4378-bce3-ed116db70b96 shown in the example above

Example:

  • TaskARN = "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069"

  • TaskID = 29dce5e3-16d8-45fe-acc2-0da74d93f069



r3 container show --container-type CONTAINER-TYPE-Name --container-status TaskID
USER@JUMPSERVER:~$ r3 container show --container-type company-project-prod-monitor --container-status 29dce5e3-16d8-45fe-acc2-0da74d93f069

# Command Response:
[
  [
    {
      "taskArn": "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069", 
      "group": "service:company-project-prod-monitor", 
      "containers": [
        {
          "containerArn": "arn:aws:ecs:eu-west-1:XXXX:container/0844bc9c-ff24-459e-b230-715cc72df95b", 
          "taskArn": "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069", 
          "lastStatus": "RUNNING", 
          "name": "company-project-prod-monitor", 
          "networkBindings": [
            {
              "bindIP": "0.0.0.0", 
              "protocol": "tcp", 
              "containerPort": 9100, 
              "hostPort": 9100
            }, 
            {
              "bindIP": "0.0.0.0", 
              "protocol": "tcp", 
              "containerPort": 9126, 
              "hostPort": 9126
            }, 
            {
              "bindIP": "0.0.0.0", 
              "protocol": "tcp", 
              "containerPort": 8080, 
              "hostPort": 8085
            }
          ]
        }
      ], 
      "overrides": {
        "containerOverrides": [
          {
            "name": "company-project-prod-monitor"
          }
        ]
      }, 
      "lastStatus": "RUNNING", 
      "containerInstanceArn": "arn:aws:ecs:eu-west-1:XXXX:container-instance/f5f5d9ac-ded3-4780-9453-a07ea3108350", 
      "version": 3, 
      "clusterArn": "arn:aws:ecs:eu-west-1:XXXX:cluster/company-project-prod", 
      "desiredStatus": "RUNNING", 
      "startedAt": 1508248278.273, 
      "taskDefinitionArn": "arn:aws:ecs:eu-west-1:XXXX:task-definition/company-project-prod-monitor:24", 
      "startedBy": "ecs-svc/9223370528606525657", 
      "createdAt": 1508248275.161
    }
  ]
]

The important points here are:

  • "lastStatus" describes the current status

  • "desiredStatus" describes the target state

  • "stoppedReason" is only present for stopped containers and describes in detail why the container was stopped
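To pick out the containers that need attention from a longer lookup response, the three fields can be evaluated together. The Python sketch below assumes the nested list structure shown in the examples above (an outer list with one inner list per task). The function name `tasks_needing_attention`, the second sample task and its stoppedReason text are made up for illustration.

```python
def tasks_needing_attention(lookup_result):
    """Return tasks whose lastStatus differs from desiredStatus or that carry a stoppedReason."""
    findings = []
    for task_group in lookup_result:  # outer list of the response
        for task in task_group:       # inner list holding the task description
            last = task.get("lastStatus")
            desired = task.get("desiredStatus")
            reason = task.get("stoppedReason")
            if last != desired or reason is not None:
                findings.append(
                    {
                        "taskId": task["taskArn"].rsplit("/", 1)[-1],
                        "lastStatus": last,
                        "desiredStatus": desired,
                        "stoppedReason": reason,
                    }
                )
    return findings


# First task reduced from the example above; second task is hypothetical.
response = [
    [{"taskArn": "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069",
      "lastStatus": "RUNNING", "desiredStatus": "RUNNING"}],
    [{"taskArn": "arn:aws:ecs:eu-west-1:XXXX:task/00000000-0000-0000-0000-000000000000",
      "lastStatus": "STOPPED", "desiredStatus": "STOPPED",
      "stoppedReason": "hypothetical reason text"}],
]
for finding in tasks_needing_attention(response):
    print(finding)
```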