Hosting AEM on Azure with Ubuntu

The two primary components of AEM are the author and publisher instances. Both can be downloaded from Adobe as either a WAR or a JAR file.

This guide sets up a dual-server, non-production instance for testing or development.

AEM comes in two flavours: self-hosted or the Cloud SDK. Both have a quickstart jar file and both can be used with this guide. The differences are:

  1. Self-hosted requires a license file; the Cloud SDK does not.
  2. Self-hosted comes with a default WKND website and replication agents out of the box; with the Cloud SDK you have to set these up yourself if required.

Step 1: Install Java on your local machine.

Skip this step if you already have the license.properties file, you are using the Cloud SDK, or you already have Java installed.

In theory you need Oracle Java JRE version 11 to create the license file. However, it also works with Oracle Java 8. It may or may not work with non-Oracle Java.

To test your local installation, open a terminal window or command prompt and type "java -version".

Step 2: Get hold of the jar and generate the license file.

This is the hardest step, unless you happen to be a premium Adobe partner. You will need to write to Adobe with a business case for being allowed to set up a dev environment in order to learn AEM basics. They will send you a link to the jar file, along with a key.

Once you have the quickstart jar (e.g. AEM_6.5_Quickstart.jar), you need to run it as-is on your local Windows, Mac or Linux machine. Do this by double clicking on it. It will take a few minutes to start up, then it will prompt you for your key. It will then generate a license.properties file in the same directory as the jar.

Note: running the jar will create a directory called crx-quickstart, which will be around 2GB. Delete this when you are finished.
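
If double clicking does nothing (for example, there is no file association for jar files), running it from a terminal behaves the same way:

$ java -jar AEM_6.5_Quickstart.jar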

Step 3: Provision 2 VMs in Azure.

One will be for author and one for publish, but you can also put them on the same box and give it more RAM.

  1. Log in to your subscription (or create a free one) at https://portal.azure.com/
  2. Create a new resource
  3. Search the marketplace for ubuntu
  4. Select "Ubuntu Server 20.04 LTS" from Canonical, or your preferred distro and version.
    1. Create a new resource group or use an existing one (just for reporting)
    2. Give it a name like "author1" and "publish1"
    3. Choose a cheap region close to you. Some regions are twice the price of others. See https://azureprice.net/Region for a price comparison.
    4. Leave the availability zone at the default
    5. For spot instance, go with No if you want it available at any time.
    6. For size, the cheapest viable server is B2s with 2 vCPU and 4GB RAM at £25/m. B instances are for servers which are idle most of the time, i.e. when no devs are using the served pages.
    7. For administration type I would always select SSH public key over password.
    8. Choose your username (the same one for both boxes). Don't use a generic name such as "admin" or a first name such as "bob" or "david"; these are too easily guessed.
    9. If you don't have an SSH key handy, you can let Azure generate the pair for the first box, then re-use the generated one for the second.
    10. Inbound port rules: unless you have a fixed-IP VPN or a VLAN, you will probably want to open SSH (22) to the world so you can administer the box. We can change the ports later. Note: this is Azure's firewall, not Ubuntu's firewall.
  5. Disks
    1. Select standard SSD with default encryption.
    2. Hit "create and attach a new disk"
    3. Default name
    4. Source type = None
    5. Select Standard SSD (again), 32GB or more, hit OK.
  6. Networking
    1. Choose the same or a new virtual network.
    2. Default subnet
    3. New public IP for each
    4. NIC security group: basic
    5. Inbound ports: SSH (22)
    6. Accelerated networking / load balancing: not applicable.
  7. Management (leave all defaults)
  8. Advanced: you can choose Gen 2 (UEFI), but Gen 1 is arguably more common.
  9. Tags: add any tags you like, e.g. who created it, what it's for, when it should be killed etc.
  10. Hit Create!
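
If you prefer scripting the provisioning, an Azure CLI sketch along these lines should create an equivalent VM. This assumes the CLI is installed and you are logged in with "az login"; the image URN, names and region here are illustrative, so check the URN with "az vm image list" before running:

$ az group create --name aem-dev --location uksouth
$ az vm create --resource-group aem-dev --name author1 \
    --image Canonical:0001-com-ubuntu-server-focal:20_04-lts:latest \
    --size Standard_B2s --admin-username youruser --generate-ssh-keys

Repeat with --name publish1, re-using the same SSH key.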

Step 4: Set up DNS (optional)

You will now have 2 IPs allocated by Azure for your two VMs. These are shown in Azure on the info page for each VM.

If you own a domain, you can now create two A records, such as author.yourdomain.com and publish.yourdomain.com, pointing to those IPs.
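
For example, in your DNS provider's zone (the IP addresses here are hypothetical):

author.yourdomain.com.   3600  IN  A  20.10.20.30
publish.yourdomain.com.  3600  IN  A  20.10.20.31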

Step 5: Import the key.

Windows

If you are on Windows and you allowed Azure to generate the keys, you need to load the downloaded key into PuTTYgen and convert it into a .ppk private key, then load this key into Pageant.

Create a new PuTTY profile with a host name of yourname@publish.yourdomain.com or yourname@your-server-ip, and one for author following the same pattern.

Linux / Mac

If you are on Linux or Mac, you need to copy the private key file into your ~/.ssh/id_rsa file and make sure it has the right permissions. There are many guides on this process.
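
For example, assuming the key you downloaded from Azure is called author1_key.pem:

$ mkdir -p ~/.ssh
$ cp author1_key.pem ~/.ssh/id_rsa
$ chmod 700 ~/.ssh && chmod 600 ~/.ssh/id_rsa
$ ssh youruser@author.yourdomain.com

You can also keep the key under its own name and connect with "ssh -i ~/.ssh/author1_key.pem youruser@yourserver".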

Step 6: Download Java

Go to https://www.oracle.com/java/technologies/javase-jdk11-downloads.html and download the deb package for Linux x64 (the tar.gz would also work). You will have to accept the license etc. Don't install OpenJDK, which is easy but is not supported by Adobe. If you try it and it works, let me know.

Step 7: Upload everything to both servers.

You will need to upload the Java deb package (or tar.gz), the quickstart jar, and license.properties (if using the non-cloud SDK).

The easiest way to do this is with FileZilla. Create a new FileZilla site with the following settings:

Host: your server DNS name or IP
Protocol: SFTP
Logon type: Key File
User: the username you entered in the Azure host config
Key file: the .ppk (Windows) or private key RSA/PEM file (Linux/macOS)

Open the new site and you should see the home directory of your user. Upload the files to both of your servers.
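
If you prefer the command line, scp does the same job (a sketch; adjust the key path and file names to match yours):

$ scp -i ~/.ssh/id_rsa jdk-11.0.9_linux-x64_bin.deb AEM_6.5_Quickstart.jar license.properties youruser@author.yourdomain.com:~/
$ scp -i ~/.ssh/id_rsa jdk-11.0.9_linux-x64_bin.deb AEM_6.5_Quickstart.jar license.properties youruser@publish.yourdomain.com:~/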

Step 8: Set up Java on your new instance(s)

SSH into each of your new servers and do the following:

  1. $ sudo apt-get update
  2. $ sudo apt-get dist-upgrade -y
  3. Reboot if necessary.
  4. $ sudo dpkg -i jdk-11.0.9_linux-x64_bin.deb
  5. Either edit /etc/profile (to set up Java for all users) or a specific user's ~/.profile and add the following at the end (change the path to match your Java version):
    1. export JAVA_HOME="/usr/lib/jvm/jdk-11.0.9/"
    2. PATH=$JAVA_HOME/bin:$PATH
  6. Open a new shell and test Java with "java -version"; you should see something like this:

java version "11.0.9" 2020-10-20 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.9+7-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.9+7-LTS, mixed mode)

Step 9: Set up and run AEM

  1. Create a new user (you can also create it the regular way with just "sudo adduser aem"):
    1. $ sudo adduser --disabled-password --gecos "" --shell /bin/bash aem
  2. Switch to the new user:
    1. $ sudo su - aem
  3. If you didn't add the Java paths to the global /etc/profile, then you need to add them to this user's /home/aem/.profile
  4. Copy the jar and license to aem:
    1. $ mkdir author (publish on the second server)
    2. $ cd author (publish on the second server)
    3. $ cp /home/youruser/AEM*.jar .
    4. $ cp /home/youruser/*.properties .
  5. Rename the jar. Note: some guides say it should start with "aem", some say it should start with "cqx" where x is a number.
    1. $ mv AEM_6.5_Quickstart.jar aem-author-p4502.jar (on the author server)
    2. $ mv AEM_6.5_Quickstart.jar aem-publish-p4503.jar (on the publish server)
  6. In theory, you should now be able to run the jar with this:
    1. $ java -XX:MaxPermSize=256m -Xmx2048M -jar aem-author-p4502.jar
    2. $ java -XX:MaxPermSize=256m -Xmx2048M -jar aem-publish-p4503.jar
  7. You can also change the port (via the filename) and start it in the background:
    1. $ nohup java -XX:MaxPermSize=256m -Xmx2048M -jar aem-author-p8080.jar &
    2. $ nohup java -XX:MaxPermSize=256m -Xmx2048M -jar aem-publish-p8080.jar &
  8. Check what is happening:
    1. $ tail -f /home/aem/publish/crx-quickstart/logs/error.log
    2. $ tail -f /home/aem/publish/crx-quickstart/logs/stdout.log
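
Once the logs show the instance has started (the first start takes several minutes), a quick way to check it is listening on its port is curl (install it with "sudo apt install curl" if needed). You should get an HTTP response, typically a redirect to the login page, rather than connection refused:

$ curl -I http://localhost:4502/   # author; use 4503 on the publish box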

Alternative method to run the jar.

There are several scenarios where running the jar fails, but unpacking and running the start script works.

  1. $ java -jar *.jar -unpack
  2. $ /home/aem/publish/crx-quickstart/bin/start

Step 10: View the console/site

You have three options:

  1. open up ports 4502/4503 to the world on the author and publisher Azure firewall respectively (insecure)
  2. open the above ports but only to your fixed IP (secure)
  3. use a ssh tunnel (secure)

By default, both author and publish have no SSL and only run HTTP, not HTTPS, so any passwords are sent in clear text. A later article will look at how to enable SSL and port 443.

To open port 4502 (author) and 4503 (publish) to the world, add inbound rules for those ports to each VM's network security group in the Azure portal.

If you want to expose the ports only to your local machine, you can use an SSH tunnel instead. In PuTTY this is configured under Connection > SSH > Tunnels: set the source port to 4512 and the destination to localhost:4502 (a "Local" tunnel).

Here I am mapping the remote port 4502 to port 4512 on my local machine.

Do the same for the publish server, mapping 4503 to a local port such as 4513.

Now you can hit the UI on http://localhost:4512 and http://localhost:4513
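
On Linux or macOS, the equivalent of the PuTTY tunnels is a local port forward with ssh, e.g. (assuming the DNS names from step 4):

$ ssh -L 4512:localhost:4502 youruser@author.yourdomain.com
$ ssh -L 4513:localhost:4503 youruser@publish.yourdomain.com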

However, replication will not work unless you set up a rule in the Azure firewall to allow traffic between the two servers, or use the private virtual-network IP if you put both servers on the same virtual network.

Next steps:

  1. configure SSL in both instances, and run them on port 443.
  2. turn the start/stop scripts into services which start on boot (see the sketch below).
  3. set up and test replication between author and publish.
  4. set up monitoring to alert you if either goes down.
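
For item 2, a minimal systemd unit along these lines is a reasonable starting point. This is a sketch only: the paths, user and Java location assume the layout used above, and depending on how the start script behaves you may need a PIDFile or a different Type.

[Unit]
Description=AEM author instance
After=network.target

[Service]
Type=forking
User=aem
WorkingDirectory=/home/aem/author
Environment="PATH=/usr/lib/jvm/jdk-11.0.9/bin:/usr/bin:/bin"
ExecStart=/home/aem/author/crx-quickstart/bin/start
ExecStop=/home/aem/author/crx-quickstart/bin/stop

[Install]
WantedBy=multi-user.target

Save it as /etc/systemd/system/aem-author.service, then run "sudo systemctl enable --now aem-author"; create an equivalent aem-publish unit on the publish box.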

AEM: getting servlets to respond with UTF-8 instead of ISO-8859-1

Interestingly, setting the right request and response headers in the Sling servlet did not help. The problem was that data returned via QueryBuilder was in ISO-8859-1, so whether it was written to a local log file or returned via the response writer, it had the wrong encoding.

The solution was to create an OSGi config file thus:

/your-web.ui.config/src/main/content/jcr_root/apps/your-web/osgiconfig/config/org.apache.sling.engine.impl.SlingMainServlet.cfg.json

{
  "sling.additional.response.headers": [
    "X-Content-Type-Options=nosniff",
    "Content-Type=text/html;charset=utf-8",
    "X-Frame-Options=SAMEORIGIN"
  ]
}

Setting the content type here solved the problem, although it's not an obvious place for it.

M1 Mac Monitor compatibility

LG 34WN750 UltraWide 3440×1440 + StarTech TB3CDK2DPUE TB 3 dock + DP 1.4 cable

  • Works at 60Hz at 3440×1440 resolution without needing to use switchResX or option-scaling.
  • The monitor auto-powers off even when in use. When this happens, switching it back on does not cause the screen to be recognised, even if the DP cable is unplugged and replugged. Recovering requires unplugging the TB3 cable from the dock, rebooting, or powering the dock off and on.
  • Remembers monitor arrangement between reboots.
  • Excellent image quality.
  • No text scaling option in display preferences.

LG 34N650 2560×1080 W + USB-C to HDMI dongle.

  • Works with option-scaling to choose the correct scaling.
  • Excellent image quality.

LG 34" 4K + USB-C to HDMI dongle.

  • Defaults to 1080p. Needs SwitchResX to change the resolution to 4K @ 60Hz.
  • Due to the high DPI of 4K monitors, by default the text is too small to read. To fix this, Apple lets you set a scaling factor in the "Displays" system preferences. However, this does not increase the font and icon size as it does in Windows; instead it lowers the effective monitor resolution. The result is using the monitor at less than its native resolution, which is a very poor solution. It means that if you use Photoshop, for example, you are using it at the lower resolution. The low resolution produces poorer quality images and text than native resolution with font scaling would. There is little point in buying a 4K monitor for use with the M1; you will see better quality using a lower resolution monitor at its native resolution, without scaling. The same monitor on Windows is noticeably better, especially with Photoshop and text.

LG 32UN880-B 4K HDR direct USB-C (with charging)

  • Defaults to 4k @60Hz. Charges.
  • Same 4K scaling problem as the 34" 4K monitor above: macOS "fixes" the small text by lowering the effective resolution rather than scaling fonts, so you lose native-resolution quality compared with Windows.

Dell Ultrasharp U3014 2560×1600 – Fail

This monitor supports HDMI, DP or Mini DP.

However, it does not work with the M1 due to Apple incorrectly using YPbPr mode (analog) instead of RGB (digital). This results in blurred text and incorrect colours. This is an old macOS "bug" which has existed for more than 5 years; with Intel-based Macs there was a workaround: generate an EDID display profile file and edit it to remove the YPbPr mode. Apple have closed this workaround on the M1, so it is no longer possible to use Dell Ultrasharp monitors, which are in direct competition with Apple's own displays costing significantly more.

If anyone has the same issue, please file a report through macOS's Feedback Assistant with the monitor attached, quoting FB8946046 as the reference number for this issue.

Dell Ultrasharp 3007WFP 2560×1600

Luckily, this 15-year-old monitor has Dual Link DVI. DVI does not support YPbPr mode, so Apple can't cripple the display. To connect it to the Mac, I am using a BizLink active Dual Link DVI to DisplayPort adapter, then an active DP to TB3 adapter, which is then plugged into the Anker TB dock. Yes, that's about £500 worth of adapters chained together to get the M1 to work with the Dell.

Adobe AEM: An error occurred while calculating the baseline

If you follow any of the AEM tutorials which create models, or create models for your components yourself, you will probably see the following error when you run "mvn clean install -PautoInstallSinglePackage" or similar:

org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal biz.aQute.bnd:bnd-baseline-maven-plugin:5.1.2:baseline (baseline) on project aem-guides-wknd.core: An error occurred while calculating the baseline

If you look up, you will hopefully see the offending package:

[ERROR] Baseline mismatch for package com.adobe.aem.guides.wknd.core.models, MINOR change. Current is 2.0.0, repo is 2.0.0, suggest 2.1.0 or –

The solution is to update the package-info.java file.

e.g. core/src/main/java/com/adobe/aem/guides/wknd/core/models/package-info.java

Change this:

@Version("2.0.0")

to this:

@Version("2.1.0")

AEM Cloud vs EPiServer DXP

This is a comparison between Adobe AEM Cloud (Adobe Experience Manager as a Cloud Service) and EPiServer's cloud offering (called DXC, or more recently DXP), from a developer's and DevOps point of view.

I am not an expert in either CMS, but I have used both in anger for a year each. They are very different. Below are my observations.

Point by point:

Monitoring of cloud envs
  AEM: None. Used to supply New Relic; the only option now is to download log files. No info on server or service load or availability, and no indexing or searching of logs. Non-cloud AEM has excellent JMX support, but access to this is removed in their cloud offering.
  EPiServer: Full Azure Application Insights with custom event logging etc. Can monitor the health of all services and servers, and query log files dynamically. The only missing feature is the ability to set alerts.

Cost
  AEM: Cheaper than Epi for 10 million page views a month (around 30% after haggling).
  EPiServer: More expensive than AEM, but better value considering the built-in monitoring and support.

Editor experience
  AEM: Dialog based, no WYSIWYG editing.
  EPiServer: On-page WYSIWYG.

Non-prod whitelisting
  AEM: Only by IP, set in the Cloud Manager tool. Deployment not required to add new IPs.
  EPiServer: Set in the project's config.xml file; can be by IP, header or any other config item. Much more flexible, as headers can be used to "whitelist" access to external tools. Deployment required.

CDN
  AEM: No access or control. Uses Fastly.
  EPiServer: Can request changes via tickets, e.g. to block specific countries or override default caching. Can flush caches via UI or API. Uses Cloudflare Enterprise.

Ease of setting up a local dev env
  AEM: Very easy (java -jar aem-author-p4502.jar).
  EPiServer: Easy.

Language
  AEM: Java.
  EPiServer: C#.

OS
  AEM: Linux/Mac/Windows.
  EPiServer: Windows only (although moving to .NET Core). Mac via Parallels (not M1).

Env params
  AEM: Need to write an OSGi service and deployment bundle, and hand edit params per env. Much more coding and complexity than Epi.
  EPiServer: config.xml + transforms for prod envs (standard .NET). Secrets easily pulled in from Azure Key Vault per env.

Model
  AEM: Sling resource models.
  EPiServer: MVC.

Business logic
  AEM: In the model (yikes!).
  EPiServer: In the service layer or in the controller.

Forum support
  AEM: Poor, 1 in 4 questions responded to.
  EPiServer: Poor, 1 in 4 questions responded to.

Technical support ticket resolution (medium priority)
  AEM: 7 days on average.
  EPiServer: 5 days on average.

Cloud version limitations
  AEM: Can't use the excellent i18n translation tools in cloud, as these have recently been removed from the cloud version. Thus you have to write your own system and workflow to translate non-content strings.
  EPiServer: None found.

Dev learning curve
  AEM: Steep. A lot of complex boilerplate configuration is required. Sling resources and the templating language are not as intuitive as MVC.
  EPiServer: Basically MVC and .NET. Relatively little "magic" code.

Local dev env difference to cloud env
  AEM: Significantly different to the cloud env. E.g. local replication "simulates" the cloud's "Adobe pipeline" replication using a legacy agent; builds frequently fail on cloud but work locally, for example because cloud enforces a different set of immutable files; the translation tools, config manager etc. work locally, but on cloud they simply give an access error, as Adobe have removed access with no alternative, even on dev cloud envs.
  EPiServer: Have not noticed any difference in normal development. Both work on MS SQL Server and IIS.

Cloud dev env
  AEM: None. All cloud envs are the same as production – you have no access to debug any of them. The build and deploy process on cloud is slow (30 minutes to 1 hour) and log file download is slow (10 to 30 minutes), so the change-a-line-of-code, deploy, test cycle is around 1 hour, resulting in a lot of wait time.
  EPiServer: Provides 3 envs by default. The Integration env is considered dev, so developers have full access to it, including being able to debug, SSH to a console, change configs etc. It is also easy to set up your own version of the cloud env in Azure, containing most of the components and services used in the hosted version. If you purchase additional non-prod envs, these also have dev access.

Although I have been a Java developer for 15 years, and prefer (but am not married to) developing on the Mac, I would still choose EPiServer over AEM due to its superior tooling, stability, better dev environment and more logical code structure (MVC). AEM is a hodgepodge of open source tools with at least 6 different UIs, whereas EPiServer is a cohesive whole with a single UI. Interestingly, AEM seems to be going the headless + Content Fragment route, in which case you should consider cheaper, easier alternatives such as Contentful.

Copying files from one onedrive account to another without a computer

MS Office 365 offers 1TB per user. The home version offers 5 users.

What happens if you have, say, 800GB of data in one OneDrive user's account and you want to move it to another? This happened to me recently. I had filled my 1TB with photos, leaving no room for my normal files, but it could also happen because you change accounts or plans.

Downloading 1TB of files and then uploading them to the new account would take months, and require 1-2TB of disk space. As a MacBook Pro user, I have 10GB free. Clearly not an option.

This is where mover.io comes in. Recently purchased by Microsoft, it offers the ability to transfer from many different storage providers (S3, Google, Dropbox etc.) onto OneDrive. It also offers OneDrive to OneDrive, and is completely free. It works server to server, in the cloud, and your computer doesn't need to be on. It took around 5 days to transfer 800GB without error.

Using mover.io is easy. Just create an account with one of your OneDrive accounts (I used the "source" one), then add the second OneDrive account as the destination. It shows a tree of files on the source, and you can pick all of it or a subdirectory. One tip: if you want to copy from a directory called, say, "files" on the source to "files" on the destination, first create a folder called files on the destination and copy to that, as the transfer only copies the contents of a directory, not the directory itself.

I made a couple of test transfers before doing anything large; I don't want to waste Microsoft's bandwidth given that it's free.

Mover then shows each transfer in progress or completed.

Creating a fixed IP “VPN” using ssh tunnel on Mac

If you need a fixed IP to access some web resource and don't have one at home, there is a quick and easy solution. First, you need a cheap server. I suggest a $5 Linode instance. See https://www.linode.com/pricing/

Once you can SSH to your server, you can create your "VPN". On the Mac, simply run this command:

$ ssh -D 8123 -f -C -q -N you@yourserver

replacing "you" with your SSH login username and "yourserver" with your server's IP or DNS name.

Now you need to set up a proxy which your browsers will use.

System Preferences -> Network -> Advanced -> Proxies

Enable "SOCKS Proxy", set the server to localhost and the port to 8123 (the port used in the ssh command above), then hit "OK" followed by "Apply".
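
Alternatively, the same SOCKS proxy can be set from the terminal (replace "Wi-Fi" with the name of your network service):

$ networksetup -setsocksfirewallproxy "Wi-Fi" localhost 8123
$ networksetup -setsocksfirewallproxystate "Wi-Fi" on
$ networksetup -setsocksfirewallproxystate "Wi-Fi" off   # to turn it off again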

Open a new browser tab and you should be using your new external IP. If you go to https://www.whatismyip.com/ you should see your server's IP.

Updating the database in EPiServer

When you update the EPiServer NuGet packages and then run the site, you may get something like this:

The database schema for ‘EPiServer.Find.Cms’ has not been updated to version ‘13.2.5’, current database version is ‘13.0.4’. Update the database manually by running the cmdlet ‘Update-EPiDatabase’ in the package manager console or set updateDatabaseSchema=”true” on episerver.framework configuration element.

One solution is to edit your web.config in the root of your project. Change this:

  <episerver.framework>
    <appData basePath="App_Data" />
    <scanAssembly forceBinFolderScan="true" />
    <virtualRoles addClaims="true">

to this:

  <episerver.framework updateDatabaseSchema="true">
    <appData basePath="App_Data" />
    <scanAssembly forceBinFolderScan="true" />
    <virtualRoles addClaims="true">

and restart your project. The database will be updated. You probably don't want this setting enabled in production.

Nashorn: using javascript “modules” from java/groovy

Nashorn, which is built into Java (deprecated in Java 11 and removed in Java 15), allows Java to run JavaScript, and JavaScript to call Java. The main use case for this is to avoid duplicating client-side JavaScript as Java on the server side: you can run the same JavaScript on both sides.

There are plenty of basic tutorials which explain how to call a simple function contained in a JavaScript string or file, for example:

engine.eval("function composeGreeting(name) {" +
  "return 'Hello ' + name" +
  "}");
Invocable invocable = (Invocable) engine;
 
Object funcResult = invocable.invokeFunction("composeGreeting", "baeldung");

However, most JavaScript is written as modules, which do not pollute the global namespace and hide private functions and variables, e.g.:

example2.js

var myModule = (function() {
    var privateVar = 42;
    function composeGreeting() {return 'Hello';};
    return {
       composeGreeting: composeGreeting
    };
})(); 

To call composeGreeting in the above example, we need to first "extract" myModule, then call composeGreeting on that (note: we are using Grails/Groovy, but the Java is almost identical):

// obtain the engine (imports from javax.script: ScriptEngineManager, ScriptEngine, Invocable, ScriptException)
ScriptEngine scriptEngine = new ScriptEngineManager().getEngineByName("nashorn")

File file = new File("example2.js")
Reader reader = file.newReader()
scriptEngine.eval(reader)          // evaluates the module, defining the global "myModule"
Invocable invocable = (Invocable) scriptEngine
Object funcResult
try {
    // fetch the module object from the engine, then invoke the function on it
    funcResult = invocable.invokeMethod(scriptEngine.get("myModule"), "composeGreeting")
} catch (NoSuchMethodException e) {
    e.printStackTrace()
} catch (ScriptException e) {
    e.printStackTrace()
}
System.out.println(funcResult) // "Hello"

Spring and Groovy

Grails is the standard for rapid enterprise application development, but sometimes you need to fall back on plain old Java/Spring. However, you don't need to sacrifice the productivity gains of Groovy: you can use Groovy classes and code interchangeably with your Java in a Spring project.

Here we are using Gradle, which is arguably more powerful than Maven (a scripting language vs XML) and faster.

We don't even need to install Groovy; Gradle handles this for us.

Simply add the following to your Spring project's build.gradle:

plugins {
    id 'groovy'
}
dependencies {
    compile 'org.codehaus.groovy:groovy-all:2.3.11'
}

Here is an example of a basic Spring boot web project’s build.gradle file which was generated with Intellij’s Spring Initializr feature:

buildscript {
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath("org.springframework.boot:spring-boot-gradle-plugin:2.1.6.RELEASE")
    }
}

plugins {
    id 'groovy'
    id 'java'
}

apply plugin: 'idea'
apply plugin: 'org.springframework.boot'
apply plugin: 'io.spring.dependency-management'

bootJar {
    baseName = 'gs-spring-boot'
    version =  '0.1.0'
}

sourceCompatibility = 1.8
targetCompatibility = 1.8

repositories {
    mavenCentral()
}

dependencies {
    compile("org.springframework.boot:spring-boot-starter-web")
    compile 'org.codehaus.groovy:groovy-all:2.3.11'
    testCompile group: 'junit', name: 'junit', version: '4.12'
}

This adds a groovy source folder under src/main (create the directory if your IDE doesn't do it for you). So you will have:

<projectDir>/src/main/groovy/
<projectDir>/src/main/java/

Now you can create your spring components either as groovy or java.

e.g. create src/main/groovy/hello/MyGroovyController.groovy

package hello
import org.springframework.web.bind.annotation.RequestMapping
import org.springframework.web.bind.annotation.RestController

@RestController
class MyGroovyController  {
    @RequestMapping("/")
    public String index() {
        "Hello from Groovy!"
    }
}

And create a Java controller in the file src/main/java/hello/MyJavaController.java:

package hello;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.bind.annotation.RequestMapping;

@RestController
public class MyJavaController {
    @RequestMapping("/java")
    public String index() {
        return "Hello from Java!";
    }
}

For completeness, here is the Application class which makes it all work (src/main/java/hello/Application.java)

package hello;

import java.util.Arrays;

import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }

    @Bean
    public CommandLineRunner commandLineRunner(ApplicationContext ctx) {
        return args -> {
            String[] beanNames = ctx.getBeanDefinitionNames();
            Arrays.sort(beanNames);
            for (String beanName : beanNames) {
                System.out.println(beanName);
            }
        };
    }
}

To run your app, enter the following command:

./gradlew build && java -jar build/libs/gs-spring-boot-0.1.0.jar

Alternatively, you can run the app from the IDE. In IntelliJ, open the Gradle tab on the right, expand Tasks, and select application/bootRun.

Now you can hit http://localhost:8080 with your browser and see “Hello from Groovy!” or hit http://localhost:8080/java and see “Hello from Java!”