Tuesday, June 9, 2009

AspectJ Through Bytecode - Examining The Woven Class

In the previous post we had a look at the Aspect class. Now let's go through the woven class itself. So we do a javap of the woven Test class and have a look at the output.

The static initializer
In our aspects we refer to the join point object in two places - in ExceptionTracer.aj and in FactoryIntercept.aj. Corresponding to these references, there are two private static final fields of type org.aspectj.lang.JoinPoint$StaticPart injected into the class.

private static final org.aspectj.lang.JoinPoint$StaticPart ajc$tjp_0;
private static final org.aspectj.lang.JoinPoint$StaticPart ajc$tjp_1;
These fields are initialized in the static initializer of the class, which calls helper methods in the AspectJ runtime to construct the JoinPoint$StaticPart objects.

Code injection
  • Public methods and variable names are retained after being injected. So they can be accessed later using reflection.
  • The injected methods are just wrappers that call the actual body - a static method in the aspect class.
    public int getCalls();
    Code:
    0: aload_0
    1: invokestatic #110; //Method ajtest/aspects/AroundAndInject.ajc$interMethod$ajtest_aspects_AroundAndInject$ajtest_java_Test$getCalls:(Lajtest/java/Test;)I
    4: ireturn

    public void incCalls();
    Code:
    0: aload_0
    1: invokestatic #102; //Method ajtest/aspects/AroundAndInject.ajc$interMethod$ajtest_aspects_AroundAndInject$ajtest_java_Test$incCalls:(Lajtest/java/Test;)V
    4: return
  • Private injected variables are declared as public, but with a mangled name. So they cannot be accessed with their original names through reflection. Why public? Because we have an aspect on the field access join point and the aspect needs to access this field from within the aspect code!
    public int ajc$interField$ajtest_aspects_AroundAndInject$nCalls;
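The reflection behaviour described above can be sketched in plain Java. This is a hand-written stand-in, not weaver output: WovenTestSketch, AspectBodySketch and the helper methods are hypothetical names, and only the mangled field name and the getCalls/incCalls method names are taken from the javap listing.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the woven Test class.
class WovenTestSketch {
    // Private inter-type field: public after weaving, but under a mangled name.
    public int ajc$interField$ajtest_aspects_AroundAndInject$nCalls;

    // Public inter-type methods keep their declared names and delegate
    // to static bodies that live in the aspect class.
    public int getCalls() {
        return AspectBodySketch.getCallsBody(this);
    }

    public void incCalls() {
        AspectBodySketch.incCallsBody(this);
    }
}

// Hypothetical stand-in for the aspect class holding the real method bodies.
class AspectBodySketch {
    static int getCallsBody(WovenTestSketch target) {
        return target.ajc$interField$ajtest_aspects_AroundAndInject$nCalls;
    }

    static void incCallsBody(WovenTestSketch target) {
        target.ajc$interField$ajtest_aspects_AroundAndInject$nCalls++;
    }
}

class InjectedMemberReflectionDemo {
    // True if the class exposes a public field with exactly this name.
    static boolean hasPublicField(Class<?> type, String name) {
        try {
            type.getField(name);
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    // Calls incCalls twice and then getCalls, all by name through reflection.
    static int callsViaReflection() {
        try {
            WovenTestSketch target = new WovenTestSketch();
            Method inc = WovenTestSketch.class.getMethod("incCalls");
            inc.invoke(target);
            inc.invoke(target);
            Method get = WovenTestSketch.class.getMethod("getCalls");
            return (Integer) get.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasPublicField(WovenTestSketch.class, "nCalls"));
        System.out.println(hasPublicField(WovenTestSketch.class,
                "ajc$interField$ajtest_aspects_AroundAndInject$nCalls"));
        System.out.println(callsViaReflection());
    }
}
```

The lookup by the source-level name nCalls fails, while the mangled name and the two public method names resolve normally.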
Advices 'around' a field access
We test the FieldAccess aspect around the join points involving get of fld1 in our testFieldAccessAspect method. Look at the source code and you can see that we read fld1 thrice in the testFieldAccessAspect method - once to print it, then to increment it by 1 and then again to print it. Now take a look at the modified bytecode of the woven testFieldAccessAspect method in the javap output.
  • At the first instance where we read the field, instead of directly fetching the field, now there is a call to the around advice, a method fld1_aroundBody1$advice, to get the value.
    invokestatic    #144; //Method fld1_aroundBody1$advice:(Lajtest/aspects/FieldAccess;Lorg/aspectj/runtime/internal/AroundClosure;)I
  • The advice method, in turn, invokes another method "private static final int fld1_aroundBody0()" when it needs to access the field value. This method accesses the field directly through a getstatic instruction.

  • At the second and third instances where we read the field again, the same happens, but to a different set of methods.
    invokestatic    #150; //Method fld1_aroundBody3$advice:(Lajtest/aspects/FieldAccess;Lorg/aspectj/runtime/internal/AroundClosure;)I
    invokestatic #152; //Method fld1_aroundBody2:()I

    and

    invokestatic #156; //Method fld1_aroundBody5$advice:(Lajtest/aspects/FieldAccess;Lorg/aspectj/runtime/internal/AroundClosure;)I
    invokestatic #158; //Method fld1_aroundBody4:()I
  • The three sets of methods are identical. And they are copies from the FieldAccess aspect class bytecode method ajc$around$ajtest_aspects_FieldAccess$1$32f71218. So the weaver has been picking up the bytecode from the aspect class method and injecting new advice methods into the woven class.

  • The "ajc$around$ajtest_aspects_FieldAccess$1$32f71218proceed" method in the FieldAccess aspect class is however ignored in this case. It would have been used to chain aspects if I had multiple aspects on the same join point.

  • The reason behind multiple identical methods generated for the advice however beats me. If you have any explanations/suggestions, I'll be glad to hear.
Therefore, having an aspect around a field access may look innocent, but may be an excessive overhead in terms of code generation and execution. If you can, consider a different design like having an accessor method and having an advice around the execution of the accessor method.
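To make the cost concrete, here is a rough hand-written Java equivalent of what the javap output suggests for the three reads, next to the accessor-based alternative. It is a sketch under assumptions: the advice body (counting reads) is hypothetical, the AroundClosure and aspect-instance parameters are left out, and the getFld1 accessor is an illustrative name that is not in the test code.

```java
// Hand-written approximation of the woven shape; not actual weaver output.
class FieldAccessWovenSketch {
    static int fld1;
    static int adviceRuns; // hypothetical advice behaviour: count the reads

    // Read site 1: the raw field read plus its own copy of the advice body.
    private static int fld1_aroundBody0() { return fld1; }
    private static int fld1_aroundBody1$advice() {
        adviceRuns++;
        return fld1_aroundBody0(); // stands in for proceed()
    }

    // Read site 2: an identical pair under different names.
    private static int fld1_aroundBody2() { return fld1; }
    private static int fld1_aroundBody3$advice() {
        adviceRuns++;
        return fld1_aroundBody2();
    }

    // Read site 3: and again.
    private static int fld1_aroundBody4() { return fld1; }
    private static int fld1_aroundBody5$advice() {
        adviceRuns++;
        return fld1_aroundBody4();
    }

    // Mirrors testFieldAccessAspect: read, increment by 1, read again.
    static int readIncrementRead() {
        fld1 = 10;
        adviceRuns = 0;
        fld1_aroundBody1$advice();
        fld1 = fld1_aroundBody3$advice() + 1;
        return fld1_aroundBody5$advice();
    }

    // Alternative design: one accessor advised at its execution join point,
    // so a single generated pair serves every caller.
    static int getFld1() { return getFld1_aroundBody1$advice(); }
    private static int getFld1_aroundBody0() { return fld1; }
    private static int getFld1_aroundBody1$advice() {
        adviceRuns++;
        return getFld1_aroundBody0();
    }

    // Same three reads, all routed through the accessor.
    static int readIncrementReadViaAccessor() {
        fld1 = 10;
        adviceRuns = 0;
        getFld1();
        fld1 = getFld1() + 1;
        return getFld1();
    }
}
```

Both variants run the advice three times, but the first needs six generated methods for three reads while the accessor variant needs two, however many callers there are.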

Advices 'around' a method call
We test an aspect around a method call in the call to the "doSyso" method in main. The story here is very similar to the behavior above.
  • There are two methods injected into the Test class for each instance of the call to doSyso. The methods injected for the first instance of the call are:
    private static final void doSyso_aroundBody7$advice(ajtest.java.Test, java.lang.String, ajtest.aspects.AroundAndInject, ajtest.java.Test, java.lang.String, org.aspectj.runtime.internal.AroundClosure);

    private static final void doSyso_aroundBody6(ajtest.java.Test, java.lang.String);
  • The method call at the join point is replaced to call the advice doSyso_aroundBody7$advice.

  • The weaver copies code from
    public void ajc$around$ajtest_aspects_AroundAndInject$1$38b5b4f8(ajtest.java.Test, java.lang.String, org.aspectj.runtime.internal.AroundClosure);
    in the AroundAndInject aspect class into
    private static final void doSyso_aroundBody7$advice(ajtest.java.Test, java.lang.String, ajtest.aspects.AroundAndInject, ajtest.java.Test, java.lang.String, org.aspectj.runtime.internal.AroundClosure);
    in the Test class.

  • The advice method in turn calls the second generated method doSyso_aroundBody6 to actually call the method.
    invokespecial   #21; //Method doSyso:(Ljava/lang/String;)V
  • Again, the reason behind multiple identical methods generated for the advice beats me.

Advices 'around' a method execution

We test an aspect around a method execution with the "doSysoExec" method called from main. This is also similar to the behavior above, except:
  • The call to doSysoExec method is retained as it is.
  • The body of the doSysoExec method is replaced with a call to an injected advice method:
    invokestatic    #306; //Method doSysoExec_aroundBody13$advice:(Lajtest/java/Test;Ljava/lang/String;Lajtest/aspects/AroundAndInject;Lajtest/java/Test;Ljava/lang/String;Lorg/aspectj/runtime/internal/AroundClosure;)V
  • The injected advice method in turn calls another injected method
    invokestatic    #308; //Method doSysoExec_aroundBody12:(Lajtest/java/Test;Ljava/lang/String;)V
    which actually contains what the original doSysoExec method had
    private static final void doSysoExec_aroundBody12(ajtest.java.Test, java.lang.String);
    Code:
    0: getstatic #3; //Field java/lang/System.out:Ljava/io/PrintStream;
    3: aload_1
    4: invokevirtual #10; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
    7: return
  • Multiple advice methods are not injected for multiple calls to the method - obviously since we are not interested in the 'calls' (which are at multiple places), but in execution (which is in one method).
Therefore, if your advice can do what it needs to do equally well around both the method call and method execution, prefer the execution join point as it is much more efficient in terms of code generation and execution.
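The execution case can be approximated by hand in Java as follows. This is a sketch, not weaver output: the logging advice is hypothetical, the extra aspect-instance and AroundClosure parameters seen in the javap signature are dropped, and the original body writes to a list instead of System.out so the order of events can be inspected.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written approximation of a method woven at its execution join point.
class ExecutionWovenSketch {
    static final List<String> log = new ArrayList<>();

    // Callers are left untouched; only this method's body is replaced.
    void doSysoExec(String msg) {
        doSysoExec_aroundBody1$advice(this, msg);
    }

    // Single copy of the advice body (hypothetical behaviour: log around the call).
    private static void doSysoExec_aroundBody1$advice(ExecutionWovenSketch target, String msg) {
        log.add("advice-enter");
        doSysoExec_aroundBody0(target, msg); // stands in for proceed()
        log.add("advice-exit");
    }

    // Holds what the original doSysoExec body did.
    private static void doSysoExec_aroundBody0(ExecutionWovenSketch target, String msg) {
        log.add("body:" + msg);
    }

    // Two separate call sites share the same single advice/body pair.
    static List<String> run(String first, String second) {
        log.clear();
        ExecutionWovenSketch t = new ExecutionWovenSketch();
        t.doSysoExec(first);
        t.doSysoExec(second);
        return new ArrayList<>(log);
    }
}
```

Adding more callers of doSysoExec adds no further generated methods, which is the difference from the call join point above.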

Advices 'before' and 'after'
The testBeforeAfterAspect method in the Test class tests the before and after aspects.
  • The advice code is a method in the aspect class.
  • Calls are made to the advice methods before and after the join point.
    invokevirtual   #176; //Method ajtest/aspects/BeforeAfterIntercept.ajc$before$ajtest_aspects_BeforeAfterIntercept$1$218e91dd:()V

    and

    invokevirtual #179; //Method ajtest/aspects/BeforeAfterIntercept.ajc$after$ajtest_aspects_BeforeAfterIntercept$2$218e91dd:()V
  • Since our advice was a plain "after" advice, it implicitly meant both "after returning" and "after throwing". And the weaver injected an exception handler to do the job.
Therefore, if your advice can do equally well before and after the join point, consider 'before' to avoid any unnecessary exception handling.
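The difference between the two can be sketched in plain Java. This is an approximation of the woven control flow, not decompiled output: the class, the before/after method names and the trace buffer are all hypothetical.

```java
// Hand-written approximation of how before and after advice wrap a join point.
class BeforeAfterWovenSketch {
    static final StringBuilder trace = new StringBuilder();

    static void before() { trace.append("B"); } // stands in for the ajc$before$... method
    static void after()  { trace.append("A"); } // stands in for the ajc$after$... method

    // Before advice: a single extra call ahead of the join point.
    static void withBefore(Runnable joinPoint) {
        before();
        joinPoint.run();
    }

    // Plain after advice: must run on normal return and when an exception
    // escapes, so an exception handler is needed around the join point.
    static void withAfter(Runnable joinPoint) {
        try {
            joinPoint.run();
        } catch (Throwable t) {
            after();
            throw t; // rethrow the original exception unchanged
        }
        after();
    }

    // Runs a join point under after advice and returns the recorded trace.
    static String runAfter(boolean fail) {
        trace.setLength(0);
        try {
            withAfter(() -> {
                trace.append("J");
                if (fail) {
                    throw new IllegalStateException("boom");
                }
            });
        } catch (IllegalStateException e) {
            trace.append("!");
        }
        return trace.toString();
    }
}
```

In both the normal and the failing run the after call happens right after the join point; the before variant needs no handler at all.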

We also used a before advice for the exception handler advice and the code generation is similar.

We got some insights into what really happens when AspectJ weaves our aspects into the code. Hopefully, it will help us design our aspects better. In the next post we'll see what happens under the hood when we use different aspect instantiation models.

Sunday, June 7, 2009

AspectJ Through Bytecode - Anatomy of an Aspect Class

We have been using AspectJ in our product for some time now. I thought it would be interesting to examine what the AspectJ compiler and weaver actually do at the bytecode level. I made a few simple test classes and a few aspects to test out different types of pointcuts and join points, particularly:

  • Field access
  • Exceptions
  • Code injection
  • Before, After and Around constructs
  • Intercepting and completely replacing method calls
You can find the source code of the classes and aspects here.

I compiled the test classes and the aspects into separate jar files and used the compile time weaver to create a woven jar file separately. My intention was to examine the Java bytecode before and after weaving to get a better understanding of AspectJ code generation. Knowing what happens under the hood helps in creating better designs. Let me take you through what I went through. I have included the javap outputs and compiled classes along with the source code, but you may want to download the source code and compile it once yourself before we start.

Examining the Aspects Themselves
First, let's examine the aspect bytecode. We pick up one of the simplest aspects - the FieldAccess aspect, and disassemble its bytecode with javap. Here's what we see:

  • It is a public class
    (public class ajtest.aspects.FieldAccess extends java.lang.Object)
  • There is a singleton instance of the aspect stored as ajc$perSingletonInstance and initialized in a static block. So only one instance of the aspect is created when the aspect class loads.

    This is an important lesson which novices tend to overlook. It implies that aspects must be coded to be thread safe. Otherwise, remember to modify the aspect declaration with a per... (perthis, pertarget, ...) modifier.

  • In case there is an exception during initialization of the aspect, there is a private static Throwable named ajc$initFailureCause declared in the class which is initialized in the static block of the class with the exception.
  • Since the aspect was used 'around' the pointcut, there is a method for around and a corresponding method for proceed which is called from within the around method.

    public int ajc$around$ajtest_aspects_FieldAccess$1$32f71218(org.aspectj.runtime.internal.AroundClosure);
    static int ajc$around$ajtest_aspects_FieldAccess$1$32f71218proceed(org.aspectj.runtime.internal.AroundClosure) throws java.lang.Throwable;

  • The proceed method is static and does not simply access the field. Instead, it calls the run method of the AroundClosure object. That is to further chain any other aspects that may need to be run.

    invokevirtual #67; //Method org/aspectj/runtime/internal/AroundClosure.run:([Ljava/lang/Object;)Ljava/lang/Object;

  • Note the strange naming convention of the methods, ending with $1$32f71218. We will take it up later and cover another interesting fact of the AspectJ weaver.
  • Then there are other generated methods like aspectOf and hasAspect.
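The singleton skeleton described in this list can be written out by hand roughly as below. It is a sketch under assumptions: the class name and the hit counter are hypothetical, the field modifiers are not copied from the javap output, and IllegalStateException stands in for the exception type the AspectJ runtime uses when no aspect instance is bound.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hand-written approximation of a singleton aspect class.
class SingletonAspectSketch {
    private static Throwable ajc$initFailureCause;
    private static SingletonAspectSketch ajc$perSingletonInstance;

    static {
        try {
            // One instance, created when the aspect class is loaded.
            ajc$perSingletonInstance = new SingletonAspectSketch();
        } catch (Throwable t) {
            // Remember why initialization failed so aspectOf() can report it.
            ajc$initFailureCause = t;
        }
    }

    // Every thread shares this one instance, so advice state must be
    // thread safe; an atomic counter is one way to achieve that.
    private final AtomicInteger hits = new AtomicInteger();

    static SingletonAspectSketch aspectOf() {
        if (ajc$perSingletonInstance == null) {
            throw new IllegalStateException("aspect not bound", ajc$initFailureCause);
        }
        return ajc$perSingletonInstance;
    }

    static boolean hasAspect() {
        return ajc$perSingletonInstance != null;
    }

    // Hypothetical advice body: count how often it ran, safely across threads.
    int recordHit() {
        return hits.incrementAndGet();
    }
}
```

A plain int field in place of the AtomicInteger would lose updates under concurrent use, which is the trap the thread safety note above warns about.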
The other interesting aspect would be the one that does the code injection. So we disassemble the AroundAndInject aspect class using javap. Apart from the regular artifacts that we saw earlier, here are few new ones in this class:

  • For each injected field, the aspect has initializer, getter and setter methods

    public static void ajc$interFieldInit$ajtest_aspects_AroundAndInject$ajtest_java_Test$nCalls(ajtest.java.Test);
    public static int ajc$interFieldGetDispatch$ajtest_aspects_AroundAndInject$ajtest_java_Test$nCalls(ajtest.java.Test);
    public static void ajc$interFieldSetDispatch$ajtest_aspects_AroundAndInject$ajtest_java_Test$nCalls(ajtest.java.Test, int);

  • For each injected method, the aspect has the code that goes into the method body. What is injected into the class are methods that in turn call these methods in the aspect.

    public static void ajc$interMethod$ajtest_aspects_AroundAndInject$ajtest_java_Test$incCalls(ajtest.java.Test);
    public static int ajc$interMethod$ajtest_aspects_AroundAndInject$ajtest_java_Test$getCalls(ajtest.java.Test);

  • For each injected method, there are local dispatcher methods in the aspect that in turn call the method of the instrumented class.

    public static int ajc$interMethodDispatch1$ajtest_aspects_AroundAndInject$ajtest_java_Test$getCalls(ajtest.java.Test);
    public static void ajc$interMethodDispatch1$ajtest_aspects_AroundAndInject$ajtest_java_Test$incCalls(ajtest.java.Test);

  • The aspect itself used the local dispatch methods to access the injected methods or variables. So calling an injected method from within the aspect goes through the following path:
    dispatcher method in aspect --> injected method in class --> method body in aspect.

All this seems like a lot of overhead, but it is required to handle complex situations like multiple aspects overlapping at a join point and weaving the same code multiple times with different aspects. So, if you are thinking of using aspects to just increment an integer in a class, think twice; there might be better ways of doing it. Use aspects for incorporating complex concerns; that is what they are meant for.

In the next post we'll go through a few woven classes and see what interesting things we can see there.

Tuesday, May 12, 2009

JMX for distributed application monitoring and rule based auto healing

One of our applications is a distributed application consisting of multiple machines spanning across different networks. We needed a framework to be able to configure and monitor the applications from a central location. While we were at it, we also dreamt of having some amount of rule based auto healing as well, essentially tuning through re-configuration.

Most of our applications were in Java, though there were external native components like a database, memcached, httpd and some python components. We also needed to monitor operating system statistics, memory, disk space, CPU (user, system & wait times and context switches). JMX with a few additional components fitted the bill nicely and satisfied most of our requirements.

The configuration of each process was also published as an MBean. This allowed us to view the configuration each process was running with, and modify it at run time as well. Notification handlers in the application would dynamically reconfigure the application when the configuration changed.
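A minimal sketch of such a configuration MBean, using only the standard javax.management API, could look like the following. The PoolConfig bean, its PoolSize attribute and the ObjectName sketch:type=PoolConfig are hypothetical names chosen for illustration, and the listener only records the new value where a real handler would resize a pool or reload settings.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.Attribute;
import javax.management.AttributeChangeNotification;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;
import javax.management.ObjectName;

class ConfigMBeanSketch {
    // Standard MBean contract: a public interface named <ImplClass>MBean.
    public interface PoolConfigMBean {
        int getPoolSize();
        void setPoolSize(int poolSize);
    }

    // Publishes one configuration value and notifies listeners when it changes.
    public static class PoolConfig extends NotificationBroadcasterSupport
            implements PoolConfigMBean {
        private int poolSize = 4;
        private long sequence;

        @Override
        public synchronized int getPoolSize() {
            return poolSize;
        }

        @Override
        public synchronized void setPoolSize(int newSize) {
            int oldSize = poolSize;
            poolSize = newSize;
            sendNotification(new AttributeChangeNotification(this, ++sequence,
                    System.currentTimeMillis(), "PoolSize changed",
                    "PoolSize", "int", oldSize, newSize));
        }
    }

    // Registers the bean, changes the attribute through the MBean server as a
    // monitoring node would, and returns the value the listener saw.
    static int applyViaJmx(int newSize) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        AtomicInteger applied = new AtomicInteger(-1);
        try {
            ObjectName name = new ObjectName("sketch:type=PoolConfig");
            server.registerMBean(new PoolConfig(), name);
            try {
                NotificationListener onChange = (notification, handback) -> {
                    if (notification instanceof AttributeChangeNotification) {
                        Object value = ((AttributeChangeNotification) notification).getNewValue();
                        applied.set((Integer) value); // a real handler would reconfigure here
                    }
                };
                server.addNotificationListener(name, onChange, null, null);
                server.setAttribute(name, new Attribute("PoolSize", newSize));
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
        return applied.get();
    }
}
```

The same attribute can be changed remotely through a JMX connector, which is how a monitoring node would push a new value into a running process.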

Each machine in our network ran a process that used Hyperic Sigar to collect operating system statistics and publish it as MBeans. Sigar is a cross platform library that uses JNI underneath to get the job done.

Instead of having a central monitoring node looking at all the machines, we broke it up into an expandable fractal kind of structure. A cluster of machines, all lying in the same network, was assigned a JMX monitoring node. Such a monitoring node would know about and connect to all the processes running on all the machines in its cluster, including the process hosting the Sigar library. This cluster monitoring node would also embed the Drools engine to be able to run rules locally. The rules go through the MBean data and attempt to reconfigure the systems through the configuration MBeans to correct any correctable errors. The rules also publish cluster specific compact (summary) statistics into a summary MBean in the JMX monitoring node.

A cluster of clusters would be similarly monitored and summarized by another larger (and remote) JMX monitoring node. Since each of the smaller clusters publishes only summarized data, accessing that remotely is not a major problem. Such clusters would typically be decided upon based on the administrative boundaries of system administrators. This model also lends itself well to federated application administration at different granularities.

The network and monitoring statistics get reduced and compacted till all data converges at the central NOC. The user interface at the central NOC displays the status of the next level clusters and any alarms therein. Each cluster can be drilled into in stages till the last process.

Sunday, May 10, 2009

Modeling and Prediction Using R - Simplifying Data

Imagine you are an advertising agency, having the responsibility of managing ad inventory of a media house. The media house owns multiple television channels. Your agency has placed metering devices in a sample of homes spread across the geography, with a people meter that also tracks who is watching the channel. Data is fed into a central system where you collate them. I have taken this example of a television media house just to be able to illustrate the problem. This approach may not be appropriate to this exact scenario, but may be suitable to other similar scenarios.

So if there are:

  • 10 different channels
  • 200 different regions - demographics
  • 20 different psychographic profiles
  • 24 hours per day
  • 7 days a week

Then you have 10*200*20*24*7 = 6,720,000 different combinations across which you collate the data. You need to store that volume of data indexed on most of the columns.

Let's say an advertiser wants to reach a certain number of eyeballs in a certain psychographic segment in a certain set of regions and at a certain time of day and day of week. You have to respond to this requirement with whether you have that many ad slots in the required category and whether they are still available (i.e., not yet booked by someone else).

First of all, to calculate the total slots available, you need to break the requirement into the above 5 dimensions, do a lookup for each dimension and sum up the result. For an ad suitable for everybody, you need to do 6,720,000 lookups!


Simplifying the problem:
Let's simplify the problem. Though we have 24 hours a day, the number of slots available may not be varying every hour. For example, the trend could be that the available groups are 7AM-9AM, 9AM-5PM, 5PM-11PM, 11PM-7AM. We need to be able to discover this trend. And the hourly trend may be different in urban and rural regions. In rural regions, viewership during the day might be more and may fall off earlier in the evening compared to urban regions. All urban regions may have a similar pattern and all rural regions may have a different common pattern. The day of week may matter for a certain psychography and may not for another. On top of this, all patterns may not be statistically significant.

Though it seems daunting, once we find all significant patterns, we can reduce the number of dimensions of our data and have a much simpler set to query.
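As a small worked example of that reduction, the sketch below collapses the 24 hours into the four example bands mentioned above and recomputes the number of combinations. The class and method names are hypothetical, and the bands are the illustrative ones from the text, not discovered patterns.

```java
// Counts lookup combinations before and after collapsing the hour dimension.
class SlotCombinationSketch {
    static final int CHANNELS = 10;
    static final int REGIONS = 200;
    static final int PROFILES = 20;
    static final int HOURS = 24;
    static final int DAYS = 7;
    static final int HOUR_BANDS = 4;

    // Maps an hour of day (0-23) to one of the four example bands.
    static int hourBand(int hour) {
        if (hour >= 7 && hour < 9) return 0;   // 7AM-9AM
        if (hour >= 9 && hour < 17) return 1;  // 9AM-5PM
        if (hour >= 17 && hour < 23) return 2; // 5PM-11PM
        return 3;                              // 11PM-7AM
    }

    // Number of distinct cells when the hour dimension has this many buckets.
    static long combinations(int hourBuckets) {
        return (long) CHANNELS * REGIONS * PROFILES * hourBuckets * DAYS;
    }

    public static void main(String[] args) {
        System.out.println(combinations(HOURS));      // hourly cells
        System.out.println(combinations(HOUR_BANDS)); // cells after banding
    }
}
```

With the hours banded, the 6,720,000 cells shrink to 1,120,000, a sixfold reduction from this one dimension alone; merging regions with similar patterns would shrink it further.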

Implementing a solution:
This solution can be implemented using the R statistical package to do part of the heavy lifting.

To capture seasonal trends in viewership, and to smoothen out aberrations, we can do a trend analysis of the data across multiple weeks. So if viewership increases towards the festive season, we should be able to capture it through the trend line. R has linear regression packages to help do this.
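The idea behind the trend line can be illustrated outside R with an ordinary least-squares fit over weekly totals. This is a plain Java sketch of the technique, not the R workflow itself; the class name and the sample viewership figures in main are made up.

```java
// Ordinary least-squares line through weekly values, with week index 0..n-1 as x.
class TrendLineSketch {
    // Returns {intercept, slope} of y = intercept + slope * week.
    static double[] fit(double[] weekly) {
        int n = weekly.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two weeks of data");
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (double v : weekly) {
            meanY += v;
        }
        meanY /= n;

        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (i - meanX) * (weekly[i] - meanY);
            sxx += (i - meanX) * (i - meanX);
        }
        double slope = sxy / sxx;
        return new double[] { meanY - slope * meanX, slope };
    }

    // Value of the fitted line at a given week index (may lie beyond the data).
    static double predict(double[] line, int week) {
        return line[0] + line[1] * week;
    }

    public static void main(String[] args) {
        double[] viewers = { 100, 110, 120, 130 }; // made-up weekly totals
        double[] line = fit(viewers);
        System.out.println(predict(line, 4)); // projected value for the next week
    }
}
```

A steadily rising series yields a positive slope, and single-week aberrations pull the line only slightly, which is the smoothing effect described above.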

R has packages to do recursive partitioning (rpart, randomForest, etc.). Recursive partitioning can be used to simplify the data and come up with a model that partitions the data, with each partition representing one significant segment of our population. The partitions will be split only across the most significant parameters. Variations that are random in nature or not statistically significant will be ignored and averaged across.
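The core step of recursive partitioning, choosing the split that best separates the data, can be sketched in a few lines of Java. This is a simplified stand-in for what packages like rpart do, not their algorithm: it handles one ordered dimension only, and the minGain threshold is a hypothetical substitute for a proper significance or complexity test.

```java
// Finds the single best split of an ordered series by squared-error reduction.
class BestSplitSketch {
    // Sum of squared deviations from the mean over values[from, to).
    static double sse(double[] values, int from, int to) {
        double mean = 0;
        for (int i = from; i < to; i++) {
            mean += values[i];
        }
        mean /= (to - from);
        double total = 0;
        for (int i = from; i < to; i++) {
            total += (values[i] - mean) * (values[i] - mean);
        }
        return total;
    }

    // Returns the index k that splits values into [0, k) and [k, n) with the
    // lowest combined error, or -1 if no split improves the error by more
    // than minGain (the variation is then treated as noise and averaged).
    static int bestSplit(double[] values, double minGain) {
        int n = values.length;
        double bestCost = sse(values, 0, n) - minGain;
        int best = -1;
        for (int k = 1; k < n; k++) {
            double cost = sse(values, 0, k) + sse(values, k, n);
            if (cost < bestCost) {
                bestCost = cost;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] slotsByHour = { 5, 5, 5, 20, 20, 20 }; // made-up values
        System.out.println(bestSplit(slotsByHour, 1.0));
    }
}
```

Applying the same step again to each resulting half, and across the other dimensions, is what makes the partitioning recursive.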

R is single threaded and would not scale to a large number of records. To be able to work on the large volume of data that we may have, we'll have to split the data and have R work on each split in a distributed manner.

Friday, May 8, 2009

LED TVs from Samsung - A new breed

LED? For TV display? That was my reaction when I first heard about the Samsung LED TVs. Reading about the technology behind it cleared things up quite a bit.

In regular LCD TVs a CCFL lamp is used behind the screen to provide light and brightness to the screen. CCFL lamps use lead and mercury (environmental issues) and consume more energy (almost 40% more energy). Moreover, the contrast of a CCFL backlit screen is lower. Since the lamp is continuously on, even behind the dark surfaces, the dark surfaces actually have a glow that makes them appear brighter than they should be. The LED TVs, on the other hand, use an array of LEDs as the backlight, which solves this problem. With the LED array, the backlight at each portion of the screen can be controlled. So if the image processor detects a dark area on the screen, it can switch off the backlight of that area to show the dark area as real dark! Nifty, isn't it?

Samsung LED DLP systems are mentioned in this Wikipedia article on DLP, and Samsung seems to be the leader in LED technology. Samsung has brought out the following models in its Series 6 and 7 LED TVs:

7 series
- 46” – UA46B7000
- 40” - UA40B7000

6 Series
- 46” – UA46B6000
- 40” - UA40B6000

The WS1 sound bar system produces dynamic, full-bodied sound that far surpasses the quality of the tiny speakers in regular flat TVs. The 100Hz Motion Plus technology produces sharp motion pictures, further enhanced by the high contrast LED backlight. The crystal design blends naturally into a piano black frame, and the ultra slim profile adds to the classy look.

It is also claimed to be one of the greenest, owing to its lower power consumption, even in stand-by mode. Being thinner (just 2.99 cm), more units can be packed together during transport, thereby saving transport cost and energy. How thoughtful!

So next time you bump into an LED TV in some electronics shop, do give it a good look!