Batch processing content in AEM with Groovy Console

Problems?

One of the issues an AEM developer may face is processing JCR content. Sometimes you need to update, modify, package, or analyze a really large amount of data, and it is doubly challenging if you need to do it on a live environment. There are some typical cases:

  • Change content structure (for example news/feb/event -> news/2015-2016/feb/event)
  • Modify/fix content (for example, you need to change ‘colour’ to ‘color’ on every page)
  • Analyze content (with MSM, editors use assets from another website, and you need to find all wrong usages)
  • Prepare and package content (for example, during migration from CQ5.x to AEM6.x you need to analyze the content first and create a valid package for its migration)

Solutions

There may be several solutions and workarounds for each case:

  • Servlets/JSP/Scriptlets
  • AEM Util packages
  • External tools
  • Groovy Console

Let's focus on Groovy Console, which is a real silver bullet for this kind of task. Advantages of Groovy Console:

  • does not clutter the project code
  • short and clear scripts
  • predefined services/methods

Let’s start

The AEM Groovy Console is hosted at https://github.com/Citytechinc/cq-groovy-console and is available for AEM/CQ5 versions starting from CQ5.4. Once Groovy Console is installed on an AEM/CQ5 instance, go to http://host:port/etc/groovyconsole.html, using the host and port of your instance.

Every Groovy script has the following variables already defined:

  • session - javax.jcr.Session
  • pageManager - com.day.cq.wcm.api.PageManager
  • resourceResolver - org.apache.sling.api.resource.ResourceResolver
  • slingRequest - org.apache.sling.api.SlingHttpServletRequest
  • queryBuilder - com.day.cq.search.QueryBuilder
  • bundleContext - org.osgi.framework.BundleContext
  • log - org.slf4j.Logger
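
As a rough illustration of how these predefined variables can be used, here is a minimal sketch. The helper name describeResource is hypothetical, and it takes the console objects as parameters only so that it can be tried outside AEM as well. It assumes the objects behave like ResourceResolver and javax.jcr.Session.

```groovy
// Hypothetical helper: summarise what the console's bound objects report about a path.
// resolver is expected to behave like ResourceResolver, jcrSession like javax.jcr.Session.
def describeResource(resolver, jcrSession, String path) {
    // ResourceResolver.getResource returns null when nothing exists at the path
    def resource = resolver.getResource(path)
    def found = resource != null ?
        "${resource.path} (${resource.resourceType})" :
        "nothing at ${path}"
    "user=${jcrSession.userID}, found=${found}".toString()
}

// In the console, pass the predefined variables straight in:
// println describeResource(resourceResolver, session, "/content/geometrixx")
```

The path /content/geometrixx is only a sample; any path on your instance will do.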

Some of the predefined methods:

  • getPage(String path) - Get the Page for the given path, or null if it does not exist.
  • getNode(String path) - Get the Node for the given path. Throws javax.jcr.RepositoryException if it does not exist.
  • activate(String path) - Activate the node at the given path.
  • deactivate(String path) - Deactivate the node at the given path.
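
The difference between getPage (returns null) and getNode (throws) matters in batch scripts. A minimal sketch of two small wrappers is below. The names childPagePaths and nodeOrNull are hypothetical, and the script is meant to run inside the console, where getPage, getNode and session are provided.

```groovy
// Hypothetical helper: list the paths of a page's direct children, or an empty list
// when the page is missing. getPage(path) returns null for a non-existent page,
// so no exception handling is needed here.
def childPagePaths(String path) {
    def page = getPage(path)
    if (page == null) {
        return []
    }
    // Page.listChildren() returns an Iterator over the direct child pages
    page.listChildren().collect { it.path }
}

// Hypothetical helper: getNode(path) throws for a missing node, so check first
// with Session.nodeExists and return null instead.
def nodeOrNull(String path) {
    session.nodeExists(path) ? getNode(path) : null
}

// In the console:
// childPagePaths("/content/geometrixx").each { println it }
```

Once the printed list has been reviewed, activate(path) could be called for each path in the same loop.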

The following packages are also imported by default:

  • com.day.cq.search
  • com.day.cq.tagging
  • com.day.cq.wcm.api
  • com.day.cq.replication
  • javax.jcr
  • org.apache.sling.api
  • org.apache.sling.api.resource

A script history and a scripts archive are also available. These features make solving the kinds of issues mentioned at the beginning of the post easy and elegant.

Example:

Suppose we need to find pages that have “Baking” in the title and replace it with “Banking”.

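import com.day.cq.commons.jcr.JcrConstants

def search = "Baking"
def replace = "Banking"
def path = "/content/geometrixx"
def property = JcrConstants.JCR_TITLE

// Find every page content node under the path whose title contains the search term
def query = createSQL2Query(path, search, property)
def result = query.execute()

result.nodes.each { node ->
    // get() and set() are helper methods the console adds to javax.jcr.Node
    def title = node.get(property)
    // replace() substitutes literal text; replaceAll() would treat the term as a regex
    node.set(property, title.replace(search, replace))
    println node.path
}

// Persist all changes made in the current session
save()

// Build a JCR-SQL2 query for cq:PageContent nodes under the path
// whose property contains the term
def createSQL2Query(path, term, property) {
    def queryManager = session.workspace.queryManager
    def statement = "SELECT * FROM [cq:PageContent] AS s WHERE ISDESCENDANTNODE([${path}]) AND s.[${property}] LIKE '%${term}%'"
    queryManager.createQuery(statement, "JCR-SQL2")
}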
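
Because createSQL2Query builds the statement by string interpolation, a search term containing a single quote would break the query. A minimal sketch of a safer statement builder is below. It assumes the usual JCR-SQL2 rule that a single quote inside a string literal is written as two single quotes. The names escapeLiteral and buildStatement are hypothetical, and the LIKE wildcards % and _ in the term are not escaped here.

```groovy
// Hypothetical helper: escape a value for use inside a single-quoted JCR-SQL2 literal
def escapeLiteral(String value) {
    value.replace("'", "''")
}

// Hypothetical helper: build the same statement as createSQL2Query, with the term escaped
def buildStatement(String path, String term, String property) {
    "SELECT * FROM [cq:PageContent] AS s " +
        "WHERE ISDESCENDANTNODE([${path}]) " +
        "AND s.[${property}] LIKE '%${escapeLiteral(term)}%'"
}

// Plain Groovy, so it can be checked outside the console before running a batch job
println buildStatement("/content/geometrixx", "Baker's", "jcr:title")
```

Printing the statement first is also a cheap dry run: you can paste it into a query tool and review the matches before any node is modified.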

As you can see, the solution using Groovy Console is pretty short and straightforward.

Contributor

  • Viktor Kadol
  • Adobe Marketing Cloud Solution Lead