
System live monitoring with Apache Sling health checks tools


To automate live checking and monitoring of the current status, performance and configuration of an AEM application environment, we can use the out-of-the-box (OOTB) Apache Sling Health Check tools. Below I explain the basic idea behind these tools and walk through simple configuration and customisation examples. You will also see how easy it is to build a UI on top of them.

Problems and goals

Before resolving any AEM application issue, we should check the following points:

  • Required bundles are up and running;
  • Related web service endpoints are available;
  • Required resources and appropriate content structure exist;
  • And so on …

All of the above steps can be done manually from time to time, but running these validations automatically lets us monitor the AEM application at a glance.

If we need to build such automation for live checking and monitoring of the environment, we should not reinvent the wheel but use the OOTB Apache Sling Health Check tools (below simply HC).

Health Checks at a glance

An HC instance is just an OSGi service that implements the HealthCheck interface and returns a Result according to the conditions it validates:

public interface HealthCheck {
    public Result execute();
}

Result is a simple immutable class that provides a Status (OK, WARN, CRITICAL, etc.) and one or more log-like messages for additional information.

Note: if you set a log message on a Result, AEM will treat it as WARN. We should therefore define the status explicitly, like this:

new Result(Result.Status.OK, "Some Message")

Health Checks Execution

AEM is a modular system, so every piece of it has its own place for configuration. One of these is the HC Executor service, which can be configured from "/system/console/configMgr/":

[Image: Apache Sling Health Check Executor configuration]

Note: every HC is run by the HC Executor within a Sling thread pool, which guarantees that only a single instance of it is running at a time.

Custom Health Checks

To implement an individual HC, we first implement the HealthCheck interface and specify the service options with the Felix SCR annotations, as below:

@Component(metatype = true)
@Properties({
        @Property(name = HealthCheck.NAME, value = "HCName"),
        @Property(name = HealthCheck.TAGS, value = {"meetup"}),
        @Property(name = HealthCheck.MBEAN_NAME, value = "HCName")
})
@Service(value = {HealthCheck.class})
public class IncorrectLocalhostHC implements HealthCheck {
    public Result execute() {
        return new Result(Result.Status.OK, "Some Message");
    }
}
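
The article does not show the body of the check itself. Going only by the class name, it may flag endpoints that still point at localhost. The following is a minimal sketch of that idea under that assumption. The Status enum and Result class here are simplified stand-ins for the Sling types so that the example compiles on its own, and the check method and its endpointUrl parameter are hypothetical.

```java
import java.net.URI;

class LocalhostCheckSketch {

    // Simplified stand-in for the Sling Result.Status enum (assumption: only three levels needed here).
    enum Status { OK, WARN, CRITICAL }

    // Simplified stand-in for the Sling Result class: an immutable status plus one message.
    static final class Result {
        final Status status;
        final String message;

        Result(Status status, String message) {
            this.status = status;
            this.message = message;
        }
    }

    // Hypothetical check logic: warn when the configured endpoint points at localhost.
    static Result check(String endpointUrl) {
        String host;
        try {
            host = URI.create(endpointUrl).getHost();
        } catch (IllegalArgumentException e) {
            return new Result(Status.CRITICAL, "Malformed endpoint URL: " + endpointUrl);
        }
        if (host == null) {
            return new Result(Status.CRITICAL, "No host in endpoint URL: " + endpointUrl);
        }
        if (host.equalsIgnoreCase("localhost") || host.equals("127.0.0.1")) {
            return new Result(Status.WARN, "Endpoint points at localhost: " + endpointUrl);
        }
        return new Result(Status.OK, "Endpoint host looks fine: " + host);
    }
}
```

In a real HC the same logic would sit inside execute() and return the Sling Result type, with the endpoint read from the component configuration rather than passed in as an argument.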

If the HC needs to be executed by a scheduler (once a day, for example), we can specify the schedule within the properties:

@Property(name = HealthCheck.ASYNC_CRON_EXPRESSION, value = "0 0 12 1/1 * ? *")
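
The expression above appears to follow the Quartz cron format, which has six required fields plus an optional year. Read that way, "0 0 12 1/1 * ? *" fires every day at 12:00:00. As a reading aid, the small helper below labels each field of such an expression. The CronFields class and its field names are made up for this illustration and are not part of Sling or Quartz; it only splits the string and does not validate the values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CronFields {

    // Field order of a Quartz-style cron expression; the seventh field (year) is optional.
    private static final String[] NAMES = {
        "seconds", "minutes", "hours", "dayOfMonth", "month", "dayOfWeek", "year"
    };

    // Splits the expression on whitespace and pairs each part with its field name.
    static Map<String, String> describe(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length < 6 || parts.length > 7) {
            throw new IllegalArgumentException("Expected 6 or 7 fields, got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            fields.put(NAMES[i], parts[i]);
        }
        return fields;
    }
}
```

Calling CronFields.describe("0 0 12 1/1 * ? *") maps hours to 12 and dayOfMonth to 1/1 (every day, starting on the 1st), while the "?" in dayOfWeek means "no specific value".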

Also, starting with Sling HC core 1.2.6 there is a new property, “hc.resultCacheTtlInMs”, which overrides the global default TTL for HC responses that is configured in the HC executor.

An HC can also be configured with the @SlingHealthCheck annotation, but this does not work OOTB in AEM 6.1:
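
To make the idea of a per-check TTL overriding a global default more concrete, here is a rough, self-contained sketch of result caching. It is an illustration only and not how the Sling HC executor is implemented; the TtlCachedCheck class, its constructor parameters and the injectable clock are all assumptions made for this example.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class TtlCachedCheck<T> {

    private final Supplier<T> check;   // the expensive check to run
    private final long ttlMs;          // effective time-to-live for a cached result
    private final LongSupplier clock;  // current time in ms, injectable for testing

    private T cached;
    private long cachedAt;
    private boolean hasValue;

    // perCheckTtlMs may be null, in which case the global default applies.
    TtlCachedCheck(Supplier<T> check, Long perCheckTtlMs, long globalTtlMs, LongSupplier clock) {
        this.check = check;
        this.ttlMs = perCheckTtlMs != null ? perCheckTtlMs : globalTtlMs;
        this.clock = clock;
    }

    // Returns the cached result while it is fresh; otherwise re-runs the check.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!hasValue || now - cachedAt >= ttlMs) {
            cached = check.get();
            cachedAt = now;
            hasValue = true;
        }
        return cached;
    }
}
```

With a per-check TTL of 1000 ms and a global default of 5000 ms, two calls within the same second run the check once, and a call made 1000 ms later runs it again.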

@SlingHealthCheck(
    name="Health Check Name For Felix Console",
    mbeanName="JMX Name",
    description="Health Check Description"
)

These simple steps are enough to implement an HC that can be executed from the Felix console at the path “/system/console/healthcheck”:

[Image: Sling health check]

Health Checks User Interface

If we open Tools -> Operations -> Dashboard -> Console -> Health Reports (or the path “/libs/granite/operations/content/healthreports.html”), we will see cards with HCs.

[Image: Health check UI]

To add a custom HC as a card on this dashboard, we need to create a node under the path /apps/granite/operations/config/hc with the following properties:

  • resource{String} - /system/sling/monitoring/mbeans/org/apache/sling/healthcheck/HealthCheck/[ ]
  • sling:resourceType{String} - granite/operations/components/mbean

[Image: Health check folder structure]

The Operations dashboard also allows HC cards to be merged into groups (as is already done for “System Checks” and “Security Checks”). All composite HC configurations are located within a factory, so to add our custom HC there we should add a new configuration to this factory, either from the Felix console or from a configuration file. When configuring from the Felix console, we need to specify a “Name” for the HC group and “Filter Tags”; all HCs with these tags will then be available under this composite HC card on the Operations dashboard.

[Image: Sling health check Operations dashboard]

More detailed documentation: