Batch processing content in AEM with Groovy Console

Tasks

One task AEM developers often face is processing JCR content. Sometimes you need to update, modify, package, or analyze a large amount of data, and doing this in a live environment makes it twice as difficult.

Here are some typical cases:

  • Changing the content structure (for example, news/feb/event -> news/2015-2016/feb/event);
  • Modifying/fixing content (for example, you need to change ‘colour’ to ‘color’ on every page);
  • Analyzing content (for example, with MSM, editors use assets from another website, and you need to find all such incorrect usages);
  • Preparing and packaging content (e.g., during migration from CQ5.x to AEM6.x, you need to analyze content first and create a valid package for the migration).
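
The first case above is essentially a path mapping. A minimal Groovy sketch of that mapping, assuming a hypothetical section root (/content/site/news) and season folder name; the helper name restructuredPath is also made up for illustration, and nothing here touches the repository:

```groovy
// Hypothetical helper: insert a season folder directly below a section root.
// Paths outside the section, or already below the season folder, are returned unchanged.
String restructuredPath(String path, String sectionRoot, String season) {
    def prefix = sectionRoot + "/"
    if (!path.startsWith(prefix)) {
        return path
    }
    def rest = path.substring(prefix.length())
    if (rest == season || rest.startsWith(season + "/")) {
        return path // already restructured, so running the script twice is harmless
    }
    return prefix + season + "/" + rest
}

// Illustrative input paths (assumptions, not taken from a real site).
["/content/site/news/feb/event", "/content/site/about"].each { path ->
    println path + " -> " + restructuredPath(path, "/content/site/news", "2015-2016")
}
```

Printing the old and new paths first gives you a list to review before any content is actually moved.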

Solutions

There are several solutions and workarounds for each case:

  • Servlets/JSP/Scriptlets
  • AEM Util packages
  • External tools
  • Groovy Console

Let's focus on the Groovy Console. It’s close to a silver bullet when it comes to solving issues like those mentioned above.

Here are a few advantages of Groovy Console:

  • Does not mess up project code;
  • Short and clear scripts;
  • Predefined services/methods.

Getting Started

The AEM Groovy Console is hosted at https://github.com/Citytechinc/cq-groovy-console and is available for AEM/CQ5 versions starting from CQ5.4. Once the Groovy Console is installed on a local AEM/CQ5 instance, go to http://:/etc/groovyconsole.html

The following predefined variables are immediately available to any Groovy script:

  • session - javax.jcr.Session
  • pageManager - com.day.cq.wcm.api.PageManager
  • resourceResolver - org.apache.sling.api.resource.ResourceResolver
  • slingRequest - org.apache.sling.api.SlingHttpServletRequest
  • queryBuilder - com.day.cq.search.QueryBuilder
  • bundleContext - org.osgi.framework.BundleContext
  • log - org.slf4j.Logger
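
A quick way to check which of these variables a given console version actually exposes is to inspect the script binding. This is a minimal sketch; the helper name describeBindings is an assumption for illustration, and outside the console every name is simply reported as not bound:

```groovy
// Hypothetical helper: report each expected variable with the class of the bound object.
List<String> describeBindings(Map vars) {
    ["session", "pageManager", "resourceResolver", "slingRequest",
     "queryBuilder", "bundleContext", "log"].collect { name ->
        vars.containsKey(name) ? name + ": " + vars[name].getClass().name : name + ": not bound"
    }
}

// binding is the standard Groovy script binding; the console fills it with the variables above.
describeBindings(binding.variables).each { println it }
```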

Some of the predefined methods:

  • getPage(String path) - Get the Page for the given path, or null if it does not exist.
  • getNode(String path) - Get the Node for the given path. Throws javax.jcr.RepositoryException if it does not exist.
  • activate(String path) - Activate the node at the given path.
  • deactivate(String path) - Deactivate the node at the given path.

And imports:

  • com.day.cq.search
  • com.day.cq.tagging
  • com.day.cq.wcm.api
  • com.day.cq.replication
  • javax.jcr
  • org.apache.sling.api
  • org.apache.sling.api.resource

You can also use the script history and script archives. Together, these features let you solve the tasks above in an easy and elegant way.

Example

Let’s say we need to find pages with the word “Baking” in the title and replace it with “Banking.”

import com.day.cq.commons.jcr.JcrConstants

def search = "Baking"
def replace = "Banking"
def path = "/content/geometrixx"
def property = JcrConstants.JCR_TITLE

// Find every page content node under the path whose title contains the search term
def query = createSQL2Query(path, search, property)
def result = query.execute()

result.nodes.each { node ->
    // get/set are the property shortcuts the console adds to JCR nodes
    def title = node.get(property)
    node.set(property, title.replaceAll(search, replace))
    println node.path
}

// Persist all changes made in the current session
save()

// Build a JCR-SQL2 query matching cq:PageContent nodes below the given path
def createSQL2Query(path, term, property) {
    def queryManager = session.workspace.queryManager
    def statement = "SELECT * FROM [cq:PageContent] AS s WHERE ISDESCENDANTNODE([${path}]) AND s.[${property}] LIKE '%${term}%'"
    queryManager.createQuery(statement, "JCR-SQL2")
}

As you can see, the Groovy Console solution is straightforward and logical.
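
On a live environment it can help to preview the changes before calling save(). Below is a minimal sketch of such a dry run, assuming a hypothetical helper previewTitle and made-up sample titles standing in for query results. It uses String.replace, which matches the search term literally, while replaceAll interprets it as a regular expression:

```groovy
// Hypothetical helper: compute the new title without writing anything.
String previewTitle(String title, String search, String replacement) {
    title.replace(search, replacement)
}

// Made-up sample titles standing in for the titles returned by the query.
def titles = ["Baking services", "Online Baking (beta)", "About us"]

titles.each { title ->
    def updated = previewTitle(title, "Baking", "Banking")
    if (updated != title) {
        println title + " -> " + updated
    }
}
```

Inside the console, the same helper could be called in the node loop, with node.set and save() added only after the printed list has been reviewed.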

Contributor

Viktor Kadol
  • Adobe Marketing Cloud Solution Lead