13-10-2011

Logging the cloud with SimpleDB

In a previous article, our chief architect Marcel Panse talked you
through minimizing downtime on Amazon AWS. We showed how to deploy an
application to a bunch of servers without using SSH or Windows Remote
Desktop. This awesome approach saves us lots of time and money, but
there is another area we have yet to explore and optimize.

Dude, where are my logs?


Let's assume you run an elastic cluster of EC2 nodes. Something bad
happens, and boom, a server dies. Of course, you prepared for the worst,
but I bet you would like to know what the hell happened. However, in an
elastic infrastructure, you might not even know on which server the
error occurred.

Now what? Normally, you would need to check the logs for those ugly
stack traces. To view the server logs, you would have to log into the
remote server. If you run a large number of servers, you might even have
to check all of them to find the right stack trace. That is a lot
of work.

A common approach is to use an SMTP appender. This appender
will email you a little report for every error in the log, immediately
when the bad stuff occurs. Usually, this really draws the attention of
the programmers, but it lacks critical information. You would still need
to log in to gather more information. To analyse the problem you would
need to read the rest of the log file and find out what happened before
the problem occurred.

So far, so good. However, servers on AWS are not persistent.
Servers also tend to crash, even those at Amazon. When such a server
instance dies or gets terminated on purpose, you will lose all of your
logging from that instance from the Beginning of Time. So, you really
need to store your logs somewhere more permanent.

Sweet! They are in the cloud.


The solution? Store logs in SimpleDB. Amazon SimpleDB is a highly
available, flexible and scalable non-relational data store. It is
perfect for this situation: it is eventually consistent, write-optimized
and extremely durable. It can hold very large tables, which suits
logging data well, and it can query and filter logs. Oh, and it is also
really cheap.

Logging abstraction


I switched from Log4j to Slf4j. Slf4j is a simple logging facade. It
serves as an abstraction over various logging frameworks, like
Commons-logging, Log4j and Logback. To switch, you first remove all
direct dependencies on logging frameworks and add the Slf4j jars. Then,
you add the logging engine of your choice. I used Logback. Finally, add
the bridges to make it all work.

To wrap it all up:


  • use Logback as logging engine;

  • use Jcl-over-Slf4j for Spring;

  • use Log4j-over-Slf4j for Cobertura.


Logging engine


Logback is intended as a successor to the popular Log4j project, picking
it up where Log4j stopped. It is built by the same people who built
Log4j. It has some very interesting improvements over Log4j: it is
faster and boasts more appenders, filters and conditional processing. We
also have a SimpleDB appender for it. The SimpleDB appender logs
everything to a SimpleDB table. We can also add some extra information
to it, like host information, IP address and application version. Let's
look at the configuration of our logback.xml.

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property file="${catalina.home}/conf/platform/deployment.properties" />

  <contextName>printcloud</contextName>

  <appender name="roll" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${catalina.home}/logs/tomcat.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>tomcat.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
      <maxHistory>30</maxHistory>
      <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
        <maxFileSize>100MB</maxFileSize>
      </timeBasedFileNamingAndTriggeringPolicy>
    </rollingPolicy>
    <encoder>
      <pattern>%date %contextName %level [%thread] %logger{10} [%file:%line] %msg%n</pattern>
    </encoder>
  </appender>

  <appender name="stdout" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d %contextName [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <appender name="simpledb" class="com.kikini.logging.simpledb.SimpleDBAppender">
    <domainName>peecho_logging</domainName>
    <accessId>ENTER_ACCESS_KEY</accessId>
    <secretKey>ENTER_SECRET_KEY</secretKey>
    <server>sdb.eu-west-1.amazonaws.com</server>
    <componentName>printcloud (${system.version})</componentName>
    <host>${HOSTNAME}</host>
  </appender>

  <root level="info">
    <appender-ref ref="roll" />
    <appender-ref ref="stdout" />
    <appender-ref ref="simpledb" />
  </root>
</configuration>


This configuration logs to stdout, a file and SimpleDB. To log to file,
we used the RollingFileAppender, which starts a new file every day, or
sooner when the current file reaches 100 MB, and keeps 30 days of
history. It automatically cleans up old log files. You can also specify
a different configuration file for (JUnit) tests, which in our case only
logs to stdout. The SimpleDB appender has a 'server' property, which is
not supported in the original simpledb-appender project but was needed
to switch the host to our European region. You can use the original
project if you are located in the US; otherwise use this custom
download, or patch it yourself.
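
The host and componentName values in the configuration above are filled in from ${HOSTNAME} and a property file. As a rough sketch of how the same metadata could be gathered in plain Java, the snippet below falls back to asking the operating system when the environment variable is missing. The class and method names (LogMetadata, resolveHost, describe) are hypothetical and not part of Logback or the SimpleDB appender; only the attribute names host and component come from this article.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class LogMetadata {

    // Prefer the HOSTNAME environment value; otherwise ask the OS; otherwise "unknown".
    static String resolveHost(String envHostname) {
        if (envHostname != null && !envHostname.trim().isEmpty()) {
            return envHostname.trim();
        }
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown";
        }
    }

    // Builds the extra attributes in the same shape as the appender configuration:
    // component = "<context> (<version>)", host = resolved host name.
    static Map<String, String> describe(String contextName, String version, String envHostname) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("component", contextName + " (" + version + ")");
        attributes.put("host", resolveHost(envHostname));
        return attributes;
    }

    public static void main(String[] args) {
        // The version string is a placeholder; the article reads it from deployment.properties.
        System.out.println(describe("printcloud", "1.0.0", System.getenv("HOSTNAME")));
    }
}
```

On EC2 the HOSTNAME variable is not guaranteed to be set for every process, which is why the sketch includes a fallback.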

Querying SimpleDB


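import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.amazonaws.services.simpledb.model.Attribute;
import com.amazonaws.services.simpledb.model.Item;
import com.amazonaws.services.simpledb.model.SelectRequest;

// Returns the 500 most recent log records written by the given host.
@Override
public List<LogRow> getLogging(String hostname) {
    List<LogRow> result = new ArrayList<LogRow>();
    // Double any single quotes so the host name cannot break out of the quoted value.
    String safeHost = hostname.replace("'", "''");
    // SimpleDB only sorts on an attribute that also appears in a predicate, hence `time` != ''.
    String findExpression = "select * from `" + domain + "` where `host` = '" + safeHost
            + "' and `time` != '' order by time desc limit 500";
    // Second argument false: an eventually consistent read is good enough for logs.
    SelectRequest selectRequest = new SelectRequest(findExpression, false);
    for (Item item : simpleDb.select(selectRequest).getItems()) {
        // Flatten the item's attribute list into a name -> value map.
        Map<String, String> attributes = new HashMap<String, String>();
        for (Attribute attr : item.getAttributes()) {
            attributes.put(attr.getName(), attr.getValue());
        }
        result.add(new LogRow(attributes));
    }
    LOGGER.info("Found " + result.size() + " log records.");
    return result;
}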


LogRow.java:

import java.util.Map;

import org.joda.time.DateTime;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

// One log record, built from the attributes of a SimpleDB item.
public class LogRow {
    private String msg;
    private String host;
    private String component;
    private String level;
    private String time;

    public LogRow(Map<String, String> attributes) {
        this.msg = attributes.get("msg");
        this.host = attributes.get("host");
        this.component = attributes.get("component");
        this.level = attributes.get("level");
        this.time = attributes.get("time");
    }

    // Formats the record as a single display line, e.g. for an HTML page.
    public String getFull() {
        DateTime dateTime = new DateTime(time);
        DateTimeFormatter fmt = DateTimeFormat.forPattern("dd-MM-yyyy HH:mm:ss");
        return String.format("%s %s %s - %s", dateTime.toString(fmt), level, component, msg);
    }

    // getters and setters
}


This piece of code queries the log table by host name. You can use the
LogRows to print the records on your HTML page. Another fancy thing you
can do is add filtering by level (info/warn/error) to the query. You can
also show the logs aggregated over all hosts instead of filtered by host
name. You can query SimpleDB any way you like to search the logs.
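
To make the filtering idea concrete, here is a minimal sketch of a query builder that makes both the host and the level filter optional. It only builds the select expression as a string, so it runs without the AWS SDK. The class name LogQuery and its methods are hypothetical, and the exact level strings stored by the appender (for example ERROR versus error) are an assumption you should check against your own data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class LogQuery {

    // SimpleDB select syntax: a single quote inside a quoted value is written twice.
    static String quoteValue(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // A backtick inside a quoted domain or attribute name is written twice.
    static String quoteName(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Pass null for host to aggregate over all hosts, and null for level to include every level.
    static String build(String domain, String host, String level, int limit) {
        List<String> predicates = new ArrayList<>();
        if (host != null) {
            predicates.add("`host` = " + quoteValue(host));
        }
        if (level != null) {
            predicates.add("`level` = " + quoteValue(level));
        }
        // The sort attribute has to appear in a predicate, so this one is always added.
        predicates.add("`time` != ''");
        return "select * from " + quoteName(domain)
                + " where " + String.join(" and ", predicates)
                + " order by time desc limit " + limit;
    }

    public static void main(String[] args) {
        // Errors from one host, then the latest records from all hosts.
        System.out.println(build("peecho_logging", "web-01", "ERROR", 500));
        System.out.println(build("peecho_logging", null, null, 500));
    }
}
```

The resulting string can be passed to a SelectRequest exactly like the expression in getLogging. Quoting the values also keeps a host name containing a single quote from breaking the query.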

Happy logging!


So, log to SimpleDB. At all times, you can view all logs from all
instances without the need to log in, even after an instance has been
terminated. Spend less time on managing your infrastructure, and more
time on the stuff that really matters.

Jolien
