Friday, December 6, 2019

Quarkus - game changer for Java developers

Recently I came across the Quarkus framework. With Quarkus you can significantly (even by an order of magnitude) reduce Java application startup time and RAM consumption. This makes it very attractive for running Java applications in containers on Kubernetes or OpenShift.

Quarkus Java applications can be executed on the OpenJDK JVM. However, to get the highest memory and startup time reduction you should compile your Java application code into a native executable using GraalVM.

Here is an example, based on the OpenJDK s2i image, of how you can build a Quarkus Java application image:

# In this example I'm using the OpenJDK 11 image from the Red Hat image registry, which requires authentication, hence in the first step you might need to get your image pull secret. In general you'll need an s2i OpenJDK image containing Maven >= 3.5.3, otherwise the build might fail.


# get image pull secret from 
# https://access.redhat.com/terms-based-registry/#/accounts 
# and save it to my-pull-secret.yaml file
$ oc create -f my-pull-secret.yaml
$ oc secrets link builder my-pull-secret

$ oc new-app \
registry.redhat.io/openjdk/openjdk-11-rhel8~https://github.com/jstakun/hello-quarkus.git --name=java-quarkus

# or first import OpenJDK image to your cluster
$ oc import-image registry.redhat.io/openjdk/openjdk-11-rhel8 --confirm \
--all=true -n openshift
$ oc new-app \
openshift/openjdk-11-rhel8~https://github.com/jstakun/hello-quarkus.git --name=quarkus-java

Here is a single-command example of building a native executable for the same Java application:

$ oc new-app \
quay.io/quarkus/ubi-quarkus-native-s2i:19.2.1~https://github.com/jstakun/hello-quarkus.git --name=native-quarkus

This command produces a very large image:

$ sudo podman images
REPOSITORY                                                                                         TAG                                                                       IMAGE ID       CREATED          SIZE
image-registry.openshift-image-registry.svc:5000/my-quarkus-app   sha256:4e0feed735fdee1403b8f23a35ad0b9b0ae42baec41e324824f72fd45af2a424   ab444dc9a48f   3 minutes ago    1.38 GB


If you would like to build a much smaller image, you can build the native executable locally and then use an s2i binary build with the minimal version of the Universal Base Image:

#Build native executable locally:
$ MAVEN_OPTS="-Xmx4G -Xss128M \
-XX:MetaspaceSize=1G -XX:MaxMetaspaceSize=2G \
-XX:+CMSClassUnloadingEnabled" \
mvn clean package -Pnative -DskipTests

#create binary s2i build
$ oc new-build --name=hello-quarkus \
--dockerfile=$'FROM registry.access.redhat.com/ubi7/ubi-minimal:latest\nCOPY *-runner /application\nRUN chgrp 0 /application && chmod +x /application\nCMD /application\nEXPOSE 8080'


#--from-file specifies native executable file created above
$ oc start-build hello-quarkus --from-file=/projects/hello-quarkus/target/hello-1.0.0-SNAPSHOT-runner 

#run pod after image build finishes
$ oc new-app hello-quarkus

This should produce a much smaller image:

$ sudo podman images
REPOSITORY                                                                                         TAG                                                                       IMAGE ID       CREATED          SIZE
image-registry.openshift-image-registry.svc:5000/my-quarkus-app/hello-quarkus-ubi-minimal                  sha256:169396aa1199bcf7d8bfab444357ccffe35a01f366a572e3f1486a0214271c35   bc050c6c6ccd   27 minutes ago   129 MB


You can try to decrease the size of your image even further using the Father Linux UBI micro image, which I built and pushed to my Quay registry:

$ oc new-build --name=hello-quarkus --dockerfile=$'FROM quay.io/jstakun/ubi8-micro:0.1\nCOPY *-runner /application\nRUN mkdir /vertx && chgrp -R 0 /vertx && chmod -R g=u /vertx && chgrp 0 /application && chmod +x /application\nCMD /application -Djava.io.tmpdir=/vertx\nEXPOSE 8080'

This should produce an even smaller image:

$ sudo podman images
REPOSITORY                                                                                         TAG                                                                       IMAGE ID       CREATED          SIZE
image-registry.openshift-image-registry.svc:5000/my-quarkus-app/hello-quarkus-ubi-micro                      sha256:a2f6e6c81487ccf174f568223e65477c07736370331d77cebd1122c862eddd33   5044755e8456   32 minutes ago   91.2 MB 


If you want to build the smallest possible image, you can try building it from scratch, adding only the libraries referenced by your executable and by sh:

#check what libraries are referenced by your native executable:
$ ldd /projects/hello-quarkus/target/hello-1.0.0-SNAPSHOT-runner
        linux-vdso.so.1 (0x00007ffc25b54000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f298e977000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f298e757000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f298e553000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f298e33c000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f298e133000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f298dd70000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f298ecf9000)


#and sh
$ ldd /bin/sh
    linux-vdso.so.1 =>  (0x00007ffc4dfb4000)
    libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007f0ec5407000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f0ec5203000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f0ec4e36000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f0ec5631000)


#build container from scratch using buildah
container=$(buildah from scratch)
mnt=$(buildah mount $container)
mkdir $mnt/bin
mkdir $mnt/lib64
buildah config --workingdir /bin $container
buildah copy $container /projects/hello-quarkus/target/hello-1.0.0-SNAPSHOT-runner /bin/application
buildah copy $container /bin/sh /bin/sh
buildah copy $container /lib64/libtinfo.so.5 /lib64
buildah copy $container /lib64/ld-linux-x86-64.so.2 /lib64
buildah copy $container /lib64/libm.so.6 /lib64
buildah copy $container /lib64/libpthread.so.0 /lib64
buildah copy $container /lib64/libdl.so.2 /lib64
buildah copy $container /lib64/libz.so.1 /lib64
buildah copy $container /lib64/librt.so.1 /lib64
buildah copy $container /lib64/libc.so.6 /lib64
buildah config --port 8080 $container
buildah config --entrypoint /bin/application $container
buildah commit --format docker $container hello-quarkus-minimal:latest


$ podman images
REPOSITORY                        TAG      IMAGE ID       CREATED         SIZE
localhost/hello-quarkus-minimal   latest   2dd6b83e8432   8 seconds ago   28 MB

#run container
$ podman run localhost/hello-quarkus-minimal:latest
2019-12-06 08:48:11,158 INFO  [io.quarkus] (main) hello 1.0.0-SNAPSHOT (running on Quarkus 1.0.0.Final) started in 0.009s. Listening on: http://0.0.0.0:8080
2019-12-06 08:48:11,158 INFO  [io.quarkus] (main) Profile prod activated.
2019-12-06 08:48:11,158 INFO  [io.quarkus] (main) Installed features: [cdi, resteasy]


Now check the startup time in the logs and the RAM consumption of both running pods. You should see a big difference. If you compare with the simplest Spring Boot example, the difference should be even bigger in terms of startup time, RAM consumption and image size.
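
For a quick, rough comparison you can use the commands below (oc adm top requires cluster metrics to be available; the pod name is illustrative, and ps may not be present in very minimal images):

$ oc adm top pods
# or check the resident set size (in KB) of the native executable inside the pod
$ oc rsh native-quarkus-1-abcde ps -o rss,comm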

Quarkus is a community project, but it will soon be productized as part of OpenShift Container Platform. Enjoy!

Monday, November 4, 2019

Resize storage in your CodeReady Containers virtual machine

CodeReady Containers (CRC) provides a pre-built development environment based on Red Hat Enterprise Linux and OpenShift Container Platform for quick container-based application development, which you can run on your workstation. After installation you can easily configure the CRC virtual machine using the crc config command to adjust CPU and memory settings according to your needs. However, if you would like to resize the storage in your CRC virtual machine (which is by default only 30 GB), you'll need to use external tools. Here is how I did it on my RHEL workstation with KVM virtualization using the gparted live ISO:

1. Increase CRC virtual machine disk size on your host.

For example, add an additional 10 GB of storage to the CRC virtual machine:

$ qemu-img resize ~/.crc/machines/crc/crc +10G

Check actual virtual disk size:

$ qemu-img info ~/.crc/machines/crc/crc | grep 'virtual size'

Now your virtual disk size is increased, but you still need to resize the filesystem inside your CRC virtual machine.

If you are running CRC 1.6+ it is as easy as:

$ crc start

$ ssh -i ~/.crc/machines/crc/id_rsa core@192.168.130.11

$ sudo xfs_growfs /sysroot

$ df -h

If you are running CRC 1.5 or earlier, here is my procedure:

2. Download the gparted live ISO from https://gparted.org/download.php

3. Configure the gparted live ISO as a cd-rom device in your CRC virtual machine


4. Set the gparted live ISO cd-rom device first (top of the list) in Boot Options

 
5. Boot the CRC virtual machine


6. Press Enter 4 times until the gparted GUI opens as below


7. Select the /dev/vda3 partition in the list. Click the Resize/Move button in the toolbar at the top. Resize the partition using the unallocated disk space as below


8. Shut down the virtual machine

9. Move the gparted live ISO cd-rom device down in Boot Options


10. Start your CRC virtual machine with the crc start command
 
Now you can ssh to your CRC virtual machine and check that the partition size has been increased!

$ ssh -i ~/.crc/machines/crc/id_rsa core@192.168.130.11
Red Hat Enterprise Linux CoreOS 42.80.20191010.0
---
[core@crc-847lc-master-0 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         12G     0   12G   0% /dev
tmpfs            12G   84K   12G   1% /dev/shm
tmpfs            12G   28M   12G   1% /run
tmpfs            12G     0   12G   0% /sys/fs/cgroup
/dev/vda3        40G   27G   14G  66% /sysroot
/dev/vda2       976M   76M  833M   9% /boot



Wednesday, October 2, 2019

Switching context between clusters using the oc command line tool

When using the OpenShift oc command line tool you might want to connect to multiple OpenShift clusters over time. To easily switch between different clusters you can use the oc config context subcommands.

Let's create two contexts for different OpenShift clusters. The first context is for an AWS cluster:

$ oc login -u admin https://my_aws_cluster

$ oc config rename-context $(oc config current-context) aws

and the second one for an OpenShift Online Pro cluster:

$ oc login -u my_oso_user https://api.pro-us-east-1.openshift.com

$ oc config rename-context $(oc config current-context) oso
 
Now we've got two oc contexts: aws and oso. Contexts are saved in the kubeconfig file (typically ~/.kube/config). You can check which contexts are currently set in your environment using the command:

$ oc config get-contexts

You can switch between contexts using the oc config use-context command. For example, let's switch to the oso context:

$ oc config use-context oso

Now we can check our current user and the project we are in:

$ oc whoami && oc project -q
my_oso_user
my_oso_project

You can find more subcommands for managing contexts using the command:

$ oc config --help
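
For example, when a context is no longer needed you can delete it:

$ oc config delete-context aws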

Context management commands work in the oc command line tool for both OpenShift 3 and OpenShift 4.


Tuesday, September 3, 2019

Alerting on Elasticsearch data in OpenShift

OpenShift Container Platform contains a built-in container log aggregation service based on the Elasticsearch, Fluentd and Kibana (EFK) stack. However, it lacks a built-in framework for alerting on anomalies, spikes, or other patterns of interest in Elasticsearch data. To deliver this functionality I've created container images based on this Dockerfile and an application template to deploy it to OpenShift in just a few steps. For detailed instructions on how to deploy this template to OpenShift please refer to my git repo.
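
For illustration, if your rule engine is ElastAlert-style (the feature description above matches ElastAlert), a minimal frequency rule could look like the sketch below; the index pattern, query and e-mail address are assumptions, not part of my template:

# hypothetical rule: alert when 10+ error lines appear within 5 minutes
name: app-error-spike
type: frequency
index: app-*
num_events: 10
timeframe:
  minutes: 5
filter:
- query:
    query_string:
      query: "level:error"
alert:
- "email"
email:
- "alerts@example.com"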

This template has been tested on OpenShift 4.1, 4.2 and 4.4.

Friday, August 16, 2019

Etcd cluster database recovery


Etcd is the cluster database in Kubernetes and OpenShift. It is a critical component for keeping your cluster up and running. By design it is fault tolerant, but of course some failures might require administrator intervention. In OpenShift, Etcd runs on the master nodes, co-located with the API servers and controllers.

To keep Etcd fully operational you need more than half of the cluster members running. If fewer than half of the Etcd cluster members are running, the cluster switches to read-only mode and in practice you won't be able to manage your cluster with kubectl/oc or the web/admin console. In this situation you must recover cluster nodes or add new cluster members until more than half of the cluster members are running.

Here are the Etcd failure scenarios and how you can recover in an OpenShift environment:

  • The majority of masters are up, a minority of masters are down.
When the failed masters come back, they will automatically recover and join the cluster.

  • Network partition.
If there is a side where the majority of members is running, that side will remain fully operational. Once the network partition clears, the minority side automatically recognizes the leader from the majority side and recovers its state.

  • A minority of masters are up, the majority of masters are down.
First create a single-node etcd cluster following the Restoring etcd quorum for static pods procedure.
Then add more cluster members following the Adding etcd nodes after restoring procedure. Clusters of 3 masters are recommended.

  • All masters are down.
If all masters are down, first check whether you can get any of them up and running reasonably quickly. If yes, you can proceed with the previous scenario. From my experience, if there is no file system corruption, this works pretty well in most cases. Otherwise you'll need to recover your Etcd cluster from a backup:

First, Restore etcd from snapshot, which will create a single-node etcd cluster.

Then add more cluster members following the Adding etcd nodes after restoring procedure. Clusters of 3 masters are recommended.
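
Needless to say, the last scenario assumes you have a recent Etcd backup. A snapshot can be taken with etcdctl (v3 API); the endpoint and certificate paths below are illustrative and depend on your installation:

$ export ETCDCTL_API=3
$ etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.crt --cert=/etc/etcd/peer.crt --key=/etc/etcd/peer.key \
  snapshot save /var/lib/etcd/snapshot.db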

---

Read more about Etcd failures


Friday, July 19, 2019

Dealing with RWO storage limitations


Recently I came across the following issue in OpenShift: when a pod has an RWO persistent volume attached and the node where this pod is running goes down for whatever reason, the persistent volume is never detached from the node and the pod never gets automatically evicted to another node.

Here is how this looks in the CLI:

A PV-backed pod is running on the node:

$ kubectl get pods -o wide

NAME                 READY     STATUS    RESTARTS   AGE       IP            NODE                         
postgresql-1-6ppzs   1/1       Running   1          6d        10.129.2.141   ip-10-0-12-219.ec2.internal


$ kubectl get nodes

NAME                          STATUS     ROLES     AGE       VERSION
ip-10-0-1-12.ec2.internal     Ready      master    280d      v1.11.0+d4cacc0
ip-10-0-12-219.ec2.internal   Ready      compute   246d      v1.11.0+d4cacc0
ip-10-0-8-129.ec2.internal    Ready      compute   280d      v1.11.0+d4cacc0
ip-10-0-9-236.ec2.internal    Ready      infra     280d      v1.11.0+d4cacc0


When the node went down, the pod status changed to "Unknown" and the new replica remains in ContainerCreating status forever:

$ kubectl get nodes

NAME                          STATUS     ROLES     AGE       VERSION
ip-10-0-1-12.ec2.internal     Ready      master    280d      v1.11.0+d4cacc0
ip-10-0-12-219.ec2.internal   NotReady   compute   246d      v1.11.0+d4cacc0
ip-10-0-8-129.ec2.internal    Ready      compute   280d      v1.11.0+d4cacc0
ip-10-0-9-236.ec2.internal    Ready      infra     280d      v1.11.0+d4cacc0

$ kubectl get pods -o wide

NAME                 READY     STATUS              RESTARTS   AGE       IP             NODE                         
postgresql-1-4jtfz   0/1       ContainerCreating   0          6m        <none>         ip-10-0-8-129.ec2.internal
postgresql-1-6ppzs   1/1       Unknown             1          7d        10.129.2.141   ip-10-0-12-219.ec2.internal


You can also see the following events in the event log: Multi-Attach error for volume "pvc-53cd2ba8-a496-11e9-b701-0ea4b5a6d9c6" Volume is already used by pod(s) postgresql-1-6ppzs. This is because RWO (ReadWriteOnce) volumes can be mounted as read-write by only a single node at a time. RWO is the most common storage access mode and is provided by many popular storage technologies including AWS EBS or VMware vmdk disks. You can find a detailed list of RWO volumes here. If you are using RWO storage, your stateful pod won’t be automatically evicted to another node. This problem is identified and tracked in this Kubernetes issue.

Manual Failover


Fortunately there is a quite straightforward manual procedure to fail over from this situation. You simply need to force delete the pod in "Unknown" status, without any grace period, using the following command:

$ kubectl delete pod postgresql-1-6ppzs --grace-period=0 --force

After this command is executed the pod will be immediately deleted, and after 6 minutes (this value is hardcoded in Kubernetes) the persistent volume will be detached from the failed node and attached to the node where the new pod replica has been scheduled.

$ kubectl get pods -o wide

NAME                 READY     STATUS    RESTARTS   AGE       IP             NODE                         
postgresql-1-qd7q2   1/1       Running   0          10m       10.129.2.145   ip-10-0-8-129.ec2.internal 


Automated Failover


With Kubernetes self-healing in mind, we would like to automate this procedure so that all pods using RWO persistent volumes are automatically evicted in case of node failure or a maintenance window. Here is the proposed solution:

1. Implement shutdown taint for nodes

2. Write an external controller (it could even be a cron-scheduled shell/Python/Ruby script) which watches node objects with the shutdown taint and force deletes pods stuck in the "Unknown" state on those nodes, as sketched below.
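
Here is a minimal sketch of such a cleanup script, assuming nodes get tainted with the key node.kubernetes.io/shutdown (the taint key and the column positions in the kubectl output are assumptions, adjust them to your environment):

#!/bin/bash
# force delete pods stuck in Unknown state on nodes carrying the shutdown taint
TAINT_KEY="node.kubernetes.io/shutdown"
for node in $(kubectl get nodes -o name); do
  # skip nodes which don't carry the shutdown taint
  kubectl get "$node" -o jsonpath='{.spec.taints[*].key}' | grep -q "$TAINT_KEY" || continue
  name=${node#node/}
  # match the STATUS ($4) and NODE ($8) columns of 'kubectl get pods -o wide'
  kubectl get pods --all-namespaces -o wide --no-headers \
    | awk -v n="$name" '$4 == "Unknown" && $8 == n {print $1, $2}' \
    | while read ns pod; do
        kubectl delete pod "$pod" -n "$ns" --grace-period=0 --force
      done
done

Run it periodically, e.g. from a CronJob, until the upstream issue is resolved.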

Any other options?


This issue is specific to RWO storage. Another solution would be to use RWX (ReadWriteMany) volumes, where each volume can be mounted on multiple nodes at the same time. You can check again here which storage technologies support the RWX access mode. As you can see, there are only a few RWX storage technologies available. From my experience, a very good choice is a Software Defined Storage technology like CephFS or GlusterFS. On the other hand, the easiest option, NFS, doesn’t offer enough quality of service in at least some use cases, such as database storage or storage for systems with a large number of small-file read/write operations, e.g. Prometheus or Elasticsearch.

With Red Hat OpenShift Container Storage you can take it a step further and leverage OpenShift nodes to run an RWX storage cluster based on GlusterFS or CephFS, depending on which OpenShift version you use. You can learn more about OpenShift Container Storage here.

Sunday, June 16, 2019

Managing cluster node configuration in OpenShift v4


OpenShift v4 introduces a new set of APIs for managing cluster node configuration, called Machine Config. Machine Config Pools manage a cluster of nodes and their corresponding Machine Configs. Machine Configs contain configuration information for a cluster, including node configuration files. You can check which Machine Configs and Machine Config Pools exist in your cluster by calling:

$ oc get machineconfigpools
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-e192851e43f1ab347b3a565c9c71d2b8   True      False      False
worker   rendered-worker-5c35596867d37b22ca2daac46351cda5   True      False      False


$ oc get machineconfig
NAME                                                      
00-master                                                  
00-worker                                                  
01-master-container-runtime                                
01-master-kubelet                                          
01-worker-container-runtime                                
01-worker-kubelet                                          
50-worker-container-registries                             
99-master-4d4106e2-8c0f-11e9-a9ed-02607282474a-registries  
99-master-ssh                                              
99-worker-4d668e54-8c0f-11e9-a9ed-02607282474a-registries  
99-worker-ssh                                              
rendered-master-e192851e43f1ab347b3a565c9c71d2b8           
rendered-worker-5c35596867d37b22ca2daac46351cda5           
rendered-worker-da1dff08dd891cb37c0ffc31e2276fe0     


By default you should see two Machine Config Pools, for master and worker nodes, and a bunch of Machine Configs in each pool. At this point I encourage you to have a look at each Machine Config to learn which configuration files it manages. For example:

$ oc describe machineconfig 01-worker-container-runtime

Node configuration is managed by the Machine Config Operator. One important thing you should know is how Machine Configs are applied by the Operator to the nodes. The Machine Configs are read in order (from 00* to 99*). Labels inside the Machine Configs identify the type of node they belong to (master or worker). If the same file appears in multiple Machine Config files, the last one wins. So, for example, any file that appears in a 99* file would replace the same file that appeared in a 00* file. The input Machine Config objects are unioned into a "rendered" Machine Config object, which is used as a target by the operator and is the value you can see in the Machine Config Pool.

To see which files are managed by a Machine Config, look for “Path:” inside a particular Machine Config. For example:

$ oc describe machineconfigs 01-worker-container-runtime | grep Path:
            Path:            /etc/containers/registries.conf
            Path:            /etc/containers/storage.conf
            Path:            /etc/crio/crio.conf

Now let's try a simple example: I'd like to add the quay.io image registry to the list of search registries on my worker nodes. This configuration is stored in the /etc/containers/registries.conf file. As you can see above, this file is configured in the 01-worker-container-runtime Machine Config object.

The first thing you need to do is create a Machine Config object which will contain your augmented version of /etc/containers/registries.conf:

cat <<EOF > 50-worker-container-registries.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-worker-container-registries
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:,%5Bregistries.search%5D%0Aregistries%20%3D%20%5B'registry.access.redhat.com'%2C%20'docker.io'%2C%20'quay.io'%5D%0A%0A%5Bregistries.insecure%5D%0Aregistries%20%3D%20%5B%5D%0A%0A%5Bregistries.block%5D%0Aregistries%20%3D%20%5B%5D%0A
          verification: {}
        filesystem: root
        mode: 420
        path: /etc/containers/registries.conf
EOF

As you can see, the content of the registries.conf file is URL-encoded. You can use any URL encoding tool or online service to encode/decode your configuration file data.
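
For example, this quick one-liner URL-encodes a file (assuming python3 is installed; it is just one of many ways to do it):

$ python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.stdin.read()))' < registries.conf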

There are two important pieces of metadata in this object. The first is the labels section, where you specify which Machine Config Pool this configuration should be added to (master or worker); in our example this is machineconfiguration.openshift.io/role: worker. The second is name: 50-worker-container-registries, which should start with a higher number than the Machine Config you want to override (if you want to override an existing configuration file rather than create a new one).

Now you can create this Machine Config in your cluster:

$ oc create -f 50-worker-container-registries.yaml -n openshift-config

This should automatically trigger a rolling upgrade of your worker nodes. You should see worker nodes being restarted one by one. You can also check whether your Machine Config Pool is in updating status:

$ oc get machineconfigpools
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-e192851e43f1ab347b3a565c9c71d2b8   True      False      False
worker   rendered-worker-5c35596867d37b22ca2daac46351cda5   True      True       False
 

After your nodes are upgraded you can check whether the new configuration has been applied successfully:

$ oc get nodes

NAME                           STATUS                     ROLES          AGE   VERSION
ip-10-0-135-132.ec2.internal   Ready                      worker         15h   v1.13.4+cb455d664
ip-10-0-139-98.ec2.internal    Ready,SchedulingDisabled   worker         47h   v1.13.4+cb455d664
ip-10-0-140-77.ec2.internal    Ready                      infra,worker   46h   v1.13.4+cb455d664
ip-10-0-143-102.ec2.internal   Ready                      master         47h   v1.13.4+cb455d664
ip-10-0-154-138.ec2.internal   Ready                      worker         47h   v1.13.4+cb455d664
ip-10-0-159-103.ec2.internal   Ready                      master         47h   v1.13.4+cb455d664
ip-10-0-160-135.ec2.internal   Ready                      worker         47h   v1.13.4+cb455d664
ip-10-0-172-230.ec2.internal   Ready                      master         47h   v1.13.4+cb455d664

$ oc debug node/ip-10-0-135-132.ec2.internal
 
Starting pod/ip-10-0-135-132ec2internal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
 
sh-4.2# chroot /host
 
sh-4.4# cat /etc/containers/registries.conf
[registries.search]
registries = ['registry.access.redhat.com', 'docker.io', 'quay.io']

[registries.insecure]
registries = []

[registries.block]
registries = []

That's it! You have learned how to apply custom configuration to your OpenShift v4 cluster nodes.

Thursday, May 30, 2019

Getting started with OpenShift v4 - part 2 - installing on vSphere

After a successful installation of OpenShift v4 on AWS, I had an opportunity to install OpenShift v4 on vSphere. This time the installation had to be done following the User Provisioned Infrastructure procedure. This means that before the cluster could be bootstrapped I had to prepare the required infrastructure myself, following the documentation.

The first thing you need to prepare is networking: two load balancers, for master and worker nodes, plus the DNS configuration for your cluster.

Secondly, you need to download and run openshift-install, but this time only to generate your machines' ignition config files. Don't forget to create the install-config.yaml file first!

The last thing you need to do is create the virtual machines in your vSphere cluster. You'll need to download the RHCOS OVA image and create 1 bootstrap, 3 master and 3 worker machines.

Now you are almost done!

At this point I decided to slightly disobey the documentation and boot the master and worker nodes separately. To do that, I first started only the bootstrap node and the 3 master nodes and waited until the bootstrap process finished. This takes a while; you can either ssh to the bootstrap machine and follow the journal log, or call openshift-install with the wait-for bootstrap-complete parameters to monitor bootstrap progress.
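
For example (the --dir value is illustrative and should point to your installation assets directory):

$ openshift-install wait-for bootstrap-complete --dir=./mycluster --log-level=info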

To confirm the master bootstrap is done, you can call oc get nodes and verify that all 3 masters are in Ready status.

$ oc get nodes

NAME      STATUS    ROLES   AGE  VERSION
master-0  Ready     master  63m  v1.13.4+b626c2fe1
master-1  Ready     master  63m  v1.13.4+b626c2fe1
master-2  Ready     master  64m  v1.13.4+b626c2fe1

At this point you should delete the bootstrap virtual machine and remove it from the load balancer.
Now you can simply start the worker nodes (one by one, or in parallel) and wait a while until they appear in the nodes list in Ready status, by calling the oc get nodes command again.

The next thing you'll need to do is approve the pending CSRs for your worker machines.
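
For example (approving everything at once is fine in a lab, but review the CSRs first in production):

$ oc get csr
$ oc get csr -o name | xargs oc adm certificate approve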

Last but not least, don't forget to configure storage for your image registry operator.

That's it. Your OpenShift v4 cluster is up and running on vSphere!

In part 3 I'll show you how to proceed with cluster configuration.

Friday, May 10, 2019

Getting started with OpenShift v4 - part 1 - Installing on AWS

We've just announced the release of the new Red Hat OpenShift 4, which will be available in a few weeks. However, if you are interested in trying OpenShift 4 earlier, there is already a beta version available for you. Please refer to the OpenShift 4 documentation for more details.

OpenShift 4 installation is currently possible only on AWS, vSphere and bare metal. Support for other platforms, including Azure, GCP, OpenStack and Red Hat Virtualization, will follow in subsequent minor releases within the next few months.

I first tried installing OpenShift 4 on AWS with Installer Provisioned Infrastructure, which means the installer provisioned the OpenShift 4 cluster nodes for me, as well as the platform itself.

I must admit this was a very pleasant experience. I only had to set up my AWS account and register or transfer an internet domain at Route53. The rest was done automatically by the installer.

After about 30 minutes I had my cluster of 3 master and 3 worker nodes up and running.

What is equally important (at least for me), it was also very straightforward to uninstall the cluster using the same installer. In fact, I've already installed and uninstalled the cluster a couple of times without any issues.
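
For the record, the uninstall is a single command (again, --dir points to the directory with your installation assets):

$ openshift-install destroy cluster --dir=./mycluster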

Nice!

In part 2 I'll share my experience with OpenShift 4 installation on vSphere, so stay tuned...

Sunday, April 14, 2019

Becoming Red Hat Certified Architect in Enterprise Applications

Are you an application architect or developer working with Red Hat JBoss, OpenShift, J2EE, MicroProfile or Camel technologies? Would you like to become a Red Hat Certified Architect in Enterprise Applications? You can achieve it by taking the Enterprise Applications RHCA certification track (go to "Certification details" and select the tab "For RHCEMDs & RHCJDs").

For an example list of the OpenShift and JBoss exams you need to pass, check my exam transcript.

Wednesday, March 6, 2019

Prune all evicted or failed Pods

Sometimes you might have a lot of pods in Evicted or another state which requires manual deletion by the cluster administrator. Here is how you can do that with a single command:

for n in $(oc get projects --no-headers=true | awk '{print $1}'); do echo $n; for pod in $(oc get pods -n $n --no-headers=true | grep "Evicted" | awk '{print $1}'); do oc delete pod ${pod} -n ${n}; done; done;

Similarly, you can delete all pods in CrashLoopBackOff state with a single command:

for n in $(oc get projects --no-headers=true | awk '{print $1}'); do echo $n; for pod in $(oc get pods -n $n --no-headers=true | grep "CrashLoopBackOff" | awk '{print $1}'); do oc delete pod ${pod} -n ${n}; done; done;


Friday, February 8, 2019

Automatic updates of Images

OpenShift Container Platform has a nice feature called Image Streams, which aggregates all tags of a container image in a single object. This makes it easier to reference images from other configuration objects in OpenShift, e.g. Deployment Configs. Another nice feature of Image Streams is that you can schedule automatic updates of existing and new tags from the Image Stream's source image registry. To make your Image Stream schedulable for updates, add the --scheduled=true parameter to the oc import-image command, which creates a new Image Stream from an existing container image:

$ oc import-image ruby --from=registry.redhat.io/rhscl/ruby-25-rhel7 --confirm --all --scheduled=true
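
If the Image Stream already exists, you can also enable scheduled imports on a given tag with oc tag:

$ oc tag registry.redhat.io/rhscl/ruby-25-rhel7:latest ruby:latest --scheduled=true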

The Image Stream update scheduler is enabled by default and is configured in master-config.yaml in the imagePolicyConfig section:

imagePolicyConfig:
  maxScheduledImageImportsPerMinute: 10
  scheduledImageImportMinimumIntervalSeconds: 1800
  disableScheduledImport: false
  maxImagesBulkImportedPerRepository: 3


You can adjust this configuration according to your needs, but remember to first refer to the OpenShift docs, and restart the master API and controller processes after you make any changes.

Friday, January 18, 2019

Garbage Collector

When your nodes have been running for some time, you might find a lot of terminated containers and container images unreferenced by any pods consuming your container storage. In a Docker environment you would execute docker rm and docker rmi to get rid of these objects. In OpenShift you can instead leverage the built-in Garbage Collector, which can do this for you automatically in the background.

The Garbage Collector is enabled by default in OpenShift, however with no limit on the number of terminated containers and very high limits for unreferenced image storage space. You can adjust these defaults in the kubeletArguments section of node-config.yaml:

kubeletArguments:

  ...
  maximum-dead-containers-per-container:
  - '1'
  maximum-dead-containers:
  - '50'
  minimum-container-ttl-duration:
  - 10s
  image-gc-high-threshold:
  - '70'
  image-gc-low-threshold:
  - '60'

You can learn more about this useful functionality in the OpenShift docs.


Thursday, January 3, 2019

Deleting projects stuck in Terminating state

Sometimes you'll experience OpenShift projects stuck in the Terminating state. One of the reasons could be that your project has orphaned serviceinstance or servicebinding objects. You can list these objects by calling the oc get command explicitly:

$ oc get serviceinstance -n your_project_name
$ oc get servicebinding -n your_project_name

If these commands return any objects, you can get rid of them, and likely get the project deleted automatically by OpenShift, with the following commands (the sed invocation strips the kubernetes-incubator finalizer from the objects, which is what blocks deletion):

$ for i in $(oc get projects | grep Terminating | awk '{print $1}'); do echo $i; oc get serviceinstance -n $i -o yaml | sed "/kubernetes-incubator/d" | oc apply -f - ; done

$ for i in $(oc get projects | grep Terminating | awk '{print $1}'); do echo $i; oc get servicebinding -n $i -o yaml | sed "/kubernetes-incubator/d" | oc apply -f - ; done