Friday, December 10, 2010

Oracle, Javacle and the ASF: my view of the problem.

So The Apache Software Foundation has decided to leave the JCP Executive Committee, in line with the position it clearly stated in November.

The reason? Oracle unilaterally decided not to honor the terms of the contract that binds it to the ASF (the famous JSPA), and more specifically paragraph 5.C:

"Other than as set forth above, the Spec Lead agrees not to impose any contractual condition or covenant that would limit or restrict the right of any licensee to create or distribute such Independent Implementations."

Put plainly, refusing to give the ASF access to the TCK unless a field-of-use (FOU) restriction is attached is a violation of this contract.

One can of course argue that OpenJDK covers the need for a free Java, since it is GPL-licensed and comes with a TCK without a FOU restriction.
Except that if you fork OpenJDK, Oracle reserves the right to sue for patent infringement (of course, Oracle is not stupid enough to attack a foundation like the ASF, which would gain it nothing, when it is enough to spread FUD on a large scale by implicitly threatening the users of that fork).

All of this is well summarized in this post: http://skife.org/java/jcp/2010/12/07/the-tck-trap.html.

So, is OpenJDK an emergency exit? No. It is a decoy, a fig leaf. As it stands, OpenJDK is indeed a temporary solution for those who work on a Mac, for example. The problem is that there is no long-term guarantee that Oracle and its allies will not let OpenJDK wither away in favor of a version of the language that is, of course, more powerful, but paid.

A witch hunt? Certainly not. We need to open our eyes: Oracle is not a philanthropic company. It does not follow any rules, it makes them! Why refrain from exercising your power when there is no sheriff in town?

This gets to the root of the problem: the confusion these companies maintain about what Open Source means. For them, Open Source = Source - IP. You can look, use, possibly contribute, but all the profits go to the company that runs the project.
It is the privatization of profit and the pooling of labor.

Open source is first and foremost a question of governance, and that is what the ASF is fighting for. There is no freedom without shared governance. In politics, that is called democracy, as opposed to dictatorship, plutocracy, oligarchy or any other system for capturing power. It is no accident that the ASF's motto is:

Community over code

What does this mean for Java? For now, not much. Anyone can use it, but that will not last. It is time to think about what comes next, and what comes next will have to be totally independent of companies like Oracle.

Can the ASF put forward proposals? Can Harmony become the universal, truly open-source foundation that Java should have been? Without a doubt. But there is work to do.

So let the community act:

3000 Apache committers can't be wrong!

Sunday, October 31, 2010

Bye bye summer time...

So we switched (in Europe) from summer time to winter time. At 3 am, it's 2 am again. So what ?

Well, I was running some tests on my computer before crashing, and I was quite surprised to get an error in a part I hadn't modified today and which was running fine this afternoon. What was wrong ?

We use a class to generate CSNs (Change Sequence Numbers), and obviously we have some tests for this class. One of them failed for one hour...

Here is the test :

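import static org.junit.Assert.assertEquals;

import java.text.SimpleDateFormat;
import java.util.Date;

import org.junit.Test;

// Csn is the project class under test; its import is omitted here.
public class CsnTest
{
    // No time zone is set, so the formatter uses the JVM default zone.
    private SimpleDateFormat sdf = new SimpleDateFormat( "yyyyMMddHHmmss.123456'Z'" );

    @Test
    public void testCSN()
    {
        long ts = System.currentTimeMillis();

        Csn csn = new Csn( sdf.format( new Date( ts ) ) + "#123456#abc#654321" );

        // <<---- This assert fails.
        assertEquals( ts/1000, csn.getTimestamp()/1000 );
    }
}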

Why did I get an error ? Because the way we create the CSN is simply wrong : we don't take into account the fact that the computer is not necessarily always using the same time zone, and that some operations assume they got a GMT-based time, while others use the local time zone.
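To make the mismatch concrete, here is a minimal sketch, assuming a simplified timestamp pattern. ZuluTimestamps and its methods are hypothetical names used for illustration only, this is not the real Csn class. It formats an instant in one time zone and parses the digits back as UTC.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not the real Csn class: it only isolates the time zone issue.
final class ZuluTimestamps
{
    // The trailing 'Z' is a quoted literal: it is printed and matched as plain text
    // and does NOT switch the formatter to UTC.
    private static final String PATTERN = "yyyyMMddHHmmss'Z'";

    /** Formats the instant using the given time zone. */
    static String format( long millis, TimeZone zone )
    {
        SimpleDateFormat sdf = new SimpleDateFormat( PATTERN );
        sdf.setTimeZone( zone );
        return sdf.format( new Date( millis ) );
    }

    /** Parses the text, reading the digits as a UTC wall-clock time. */
    static long parseAsUtc( String text )
    {
        SimpleDateFormat sdf = new SimpleDateFormat( PATTERN );
        sdf.setTimeZone( TimeZone.getTimeZone( "UTC" ) );
        try
        {
            return sdf.parse( text ).getTime();
        }
        catch ( ParseException e )
        {
            throw new IllegalArgumentException( "Not a timestamp: " + text, e );
        }
    }
}
```

When both sides use UTC, the round trip gives back the original instant. When the text is formatted in a zone two hours ahead of UTC and then parsed as UTC, the result is two hours off, which is the kind of disagreement that can make a test like the one above fail.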

Be extremely cautious when dealing with dates and time zones :
you may get a very bad surprise in production, instead of stumbling on those errors by chance, just because you are running tests at 2:30 am on a Saturday night before going to bed !

Monday, September 20, 2010

The Maven community is sometimes a strange world ...

3 years ago, I submitted a patch for the Maven antlr plugin. Something quite simple that took me 1 hour to whip up, test and submit as a JIRA [1].

It took 6 months for this patch to be applied, something I can understand.

However, 3 years after the code was patched, I can't get the plugin from the Apache repository, where it used to be. Why ? Because the project has been moved from Apache to Mojo. The reason ?

"A release could be done shortly but I would like to move the maven-antlr-plugin and maven-antlr3-plugin (in sandbox) to Mojoproject.

During the last year, I am the main committer on this project. Recently, David Holroyd provided a new plugin that supports Antlr v3, and submitted some patches. Unfornately, he is not an ASF committer. I could take care of David's patches but I think it should be good to give a new life of this project in the Mojo land. It would be more easy to give access to David, so he could maintain it as he wants. " [2]

What's wrong in the Maven community if they can't make a committer of someone who is obviously submitting patches and wants to be one, and instead have to move the project out of Apache to get this guy working on it ? Is the voting process too complex ?

Seriously, I don't get it ...

PS : of course, I can get the plugin from the Apache repo, but not at the same place. It's now available on [3].


[1] http://jira.codehaus.org/browse/MANTLR-14
[2] http://maven.40175.n5.nabble.com/vote-Move-the-maven-antlr-plugin-to-the-mojo-project-td204608.html#a204608
[3] http://repo1.maven.org/maven2/org/codehaus/mojo/antlr-maven-plugin/2.1/

Tuesday, June 22, 2010

Twitter failures...

It seems that Twitter reached its limits a few weeks ago, with daily failures since then.

I'm just wondering if Twitter's developers are French, with a project manager named Domenech, and a top developer named Anelka.

Or maybe I'm mixing up failures : in any case, if we use the French soccer team as a baseline to measure other teams' failures, then Twitter is just experiencing small bumps in the road atm...

Tuesday, June 01, 2010

Microsoft is shooting itself in the foot

So today I received a phone call from the Microsoft 'client support' (so-called) entity.

It's pretty clear that they are more into following a process, even if it's totally stupid, than helping customers (reminds me of Dr Strangelove, when the guy hesitates to break a Coca-Cola machine to get the 10 cents he needs to make a phone call that would save the world ...).

Instead of solving my simple issue with a simple solution (namely, providing me with the KEY associated with the installed product, which is different from the product we bought, a Windows 7 Premium, probably because the DVD was mislabeled before being put into the box), they keep insisting that we install a new version, losing 2 more hours plus having to reinstall all the side products.

Seriously, it's time to think about switching to a friendlier system, people. Ubuntu is quite usable, so is Mac OS X (but you might face the exact same problem). In any case, making a 25% margin is just a shame when the UQOS (unquality of service) provided is so high : basically, you are on your own.

Coupled with the fact that they have delegated first-level support to external companies in low-cost countries (profits, it's all about profits), it makes it pretty obvious that Microsoft is now what IBM was back in the 1990s.


The butterfly is now just an elephant, not a dancing one... Sea elephant on the shore.

Monday, May 31, 2010

Maven release failures

So today I had to generate the MINA 2.0.0 packages in order to launch a vote. We have a full page on the MINA web site explaining how to cut a release : http://mina.apache.org/developer-guide.html#DeveloperGuide-ReleasingaPointRelease%2528CommittersOnly%2529

I must say that the Maven Release plugin is either totally dumb, or what it does is totally counter-intuitive, and broken, IMHO.

What is the problem ? The maven release:prepare goal follows these steps :

1) Check that there are no uncommitted changes in the sources : OK

2) Check that there are no SNAPSHOT dependencies : OK

3) Change the version in the POMs from x-SNAPSHOT to a new version (you will be prompted for the versions to use) : OK
(here, we went from 2.0.0-RC2-SNAPSHOT to 2.0.0)

4) Transform the SCM information in the POM to include the final destination of the tag : OK
(the SCM info is now scm:svn:http://svn.apache.org/repos/asf/mina/tags/2.0.0, as expected)

5) Run the project tests against the modified POMs to confirm everything is in working order : OK

6) Commit the modified POMs : KO!!!

What's wrong here ? Everything has been committed to the trunk instead of the expected mina/tags/2.0.0 !

Why does the Maven release plugin modify the SCM tag if it then commits everything to a place which should hold the next version, i.e. 2.0.1-SNAPSHOT ?

I could understand that this is done on purpose (I don't see a single reason for that, but who knows...), but at least, can't it *ask* the user before messing with the trunk ?

Sometimes, the Maven Way Of Doing Things (tm) is completely broken, and this explains the complaints found in the blogosphere...
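As a rough illustration of steps 3 and 4 only, here is a minimal sketch of the two string rewrites involved. The class and method names are hypothetical, and this is not the plugin's implementation. It only reproduces the values quoted in the steps above, and stripping the suffix gives 2.0.0-RC2, which we overrode with 2.0.0 at the prompt.

```java
// Hypothetical sketch of steps 3 and 4; this is not the Maven Release plugin's code.
final class ReleaseNames
{
    private static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";

    /** Drops the -SNAPSHOT suffix, e.g. 2.0.0-RC2-SNAPSHOT becomes 2.0.0-RC2. */
    static String stripSnapshot( String version )
    {
        if ( !version.endsWith( SNAPSHOT_SUFFIX ) )
        {
            throw new IllegalArgumentException( "Not a SNAPSHOT version: " + version );
        }
        return version.substring( 0, version.length() - SNAPSHOT_SUFFIX.length() );
    }

    /** Builds the SCM URL of the tag from the tag base URL and the release version. */
    static String tagUrl( String tagBase, String releaseVersion )
    {
        // Tolerate a trailing slash on the tag base.
        String base = tagBase.endsWith( "/" ) ? tagBase.substring( 0, tagBase.length() - 1 ) : tagBase;
        return "scm:svn:" + base + "/" + releaseVersion;
    }
}
```

Calling tagUrl with the base http://svn.apache.org/repos/asf/mina/tags and the version 2.0.0 yields the SCM value shown in step 4, which is why I expected the commit in step 6 to land there.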

Friday, May 28, 2010

Why I will find a Windows 7 key on the internet to install it on a computer

Microsoft forces me to be a pirate today.

From time to time, I help the cheese shop owner down my street, because he is a great guy and he knows almost nothing about computers, but uses one every day to manage his stock of cheeses and his bank account.

Today, he told me that he had an issue with the computer he bought last September, with Vista(ss) on it. As he is using an account management system which is not compatible with Vista(ss), he decided to buy an upgrade to Windows 7.

So far, so good. He paid 119€ for the Windows 7 Home Premium upgrade from Vista(ss) Home Premium. Quite expensive, but, hey, you have to pay the price when you don't know that Linux is a pretty decent alternative.

After having installed the POS^H^H^Hsoft, he had to activate the key. But then a message said 'the key is not valid for this version of the product'. WTF ??? So he called me.

One hour on the internet, no help. I have to say that Microsoft web sites are probably some of the worst ever when it comes to finding a valuable piece of information. In fact, there is *no* information available.

So I decided to call the Activation Center, because, hey, it's The Activation, Stupid !

I spent almost 2 hours being thrown like a ball from one person to another, from Microsoft to HP and back, by the useless call center somewhere in North Africa, because it's probably cheaper to pay people there than in France (poor guys, being paid around 300€ a month to pick up the phone and hear someone like me yelling ...).

At least, we were able to discover that the package labeled "Windows Premium" in fact contains a DVD labeled Premium but containing the Extended edition. Of course, the key is the Premium one, not the Extended one.

At this very moment, I thought that the solution was damn easy : they just had to give me the Extended edition key, and voilà.

Fuck me ! No way ! Those guys are so stuck in the middle of a huge swamp of bureaucracy, combined with a large dose of idiocy and fear about the consequences of breaking the rules to help a CUSTOMER (I emphasize this word, because, hey, you know what Microsoft, we are not only USERS, we also are CUSTOMERS !) that they can't provide me with those 25 chars and 4 hyphens (a UUID, for those who know what it's all about).

So I told the guy (the manager of the manager of the technical guy I was talking to) that I had a better solution : go on the internet, get a hacked key and that will do the trick.

"But that would be piracy !!!" he replied.

You know what ? *yes*. And you forced me to do that, Microsoft.

Last thing : my fellow Cheese seller will buy a Mac next time. He saw mine, and found it quite wonderful.



Losers !


PS : If anyone from Microsoft reads this, feel free to contact me to help me help this poor guy. I won't charge you more than 125 € an hour to get it working. I already wasted 2 hours of my precious time btw. Prove to me that you are smarter than the system you are now stuck in. Remember the USSR ?

PS2: feel free to re-post.

Monday, April 19, 2010

10 years of Glénans!

(This post was originally written in French, as an exception, because I think it only concerns French-speaking readers.)

So it has been 10 years since my first sailing course at Les Glénans. It was a two-week semi-liveaboard cruising course (half ashore, discovering sailing on a small keelboat, and one week aboard a comfortable Dufour 30).

Let's bust the myths: no, Les Glénans is not a military sailing school, nor, as you still hear, the 'fascists of sailing'... In fact, I don't think I have ever had a better holiday (though at the time I needed a change of scenery), and since then I have repeated the experience every year (or even as many times as possible).

I have now been an instructor for 7 years and have led 14 different courses (that is, 2 weeks a year) on boats such as the Glénans 5.7, Open 5.70, Sprinto, Dufour 30, Sun Fast 43, Sun Fast 32, Dufour 325 and Elan 31.

I have had the chance to discover magical places, such as the Gulf of Morbihan at sunrise, the Aber Wrac'h at sunset, the Raz de Sein in a flat calm, Belle-Île, Houat and its beach, Hoëdic, Groix, Bréhat the magnificent, Batz, the Île d'Yeu, the Odet and the Bélon, Port Blanc, Porquerolles, the Frioul islands, the calanques of Cassis, Sète, the Strait of Messina, the Canaries, Cascais and the Isles of Scilly.

I have barely touched bottom twice: once at anchor (a botched tide calculation, redone 3 times nonetheless, with a 2:59 am wake-up to re-set the 'pick' 50 meters further out, and a very sticky stranding in the mud from staying in the pink sector of a sector light...)

No blown spinnaker, a few minor injuries, a few wipeouts (spinnaker flying like a flag from the masthead, a broach with the spinnaker in an hourglass and the mainsail coming down, a cleat undone by a wandering spinnaker sheet, and a big mess with the headsail unfurled in a force 8...) but nothing serious.

A few scares too, like that sudden 50-knot blow in the bay of Loctudy (7 knots of boat speed under bare poles with 43 knots of true wind...). "Were you scared? No, you were super calm, we just found it odd that you took the helm and never let go of it..." (said my trainees).

And some rather surprising encounters: an opera singer, a film director, a former café waitress sitting the agrégation in literature at 50, a goose farmer, and even a priest! (Yes, the ways of the Lord are more impenetrable than the hull of his ship, which led him to study navigation and tides!)

It also gave me the chance to meet Romain, a skipper (http://www.globe-skipper.com/fr/index.html), to whom I say hello in passing, and who kindly invites me on delivery trips that cut me off from the world for a few days: France to the Canaries or France to Greece, plus a few passages in the Mediterranean on a 27-meter sailboat (a former racing boat, quite monstrous with its 37-meter mast...)

For those who are interested: http://www.glenans.asso.fr/

Feel free to contact me for more information!

Tuesday, February 16, 2010

Some new LDAP browsers

Today, Stefan Seelmann pointed out that a couple of free LDAP browsers have been launched.

The first one is delivered by Symlabs (no clue about the license though) and the other one is a NetBeans plugin.

I can understand that the NetBeans team develops a specific plugin, but I don't get why a private company develops something that has already been available for years, is actively developed, and is used by a hundred thousand people. I mean, isn't it a waste of time and resources ?

Guys, there is room for you to join The Apache Software Foundation if you think you can lend a hand, instead of playing in your own sandbox !

PS: neither of those two projects comes close to the functionality we deliver with Apache Directory Studio.

Useful tool

From time to time, I need to get a clue about what some code is doing. Of course, I can - and do - read the code. But when it's highly concurrent, and the stack is deep, I would like to get an immediate vision of the stack trace for a complete execution.

I'm using a small tool called JIP : Java Interactive Profiler. I'm not sure why they added interactive in the name, because there is nothing interactive once you have launched your program, but anyway...

So how does it work ? Simple : you add an option to your command line, run your program, and get back a trace. Let's see with an example. Yesterday evening, I did a debugging session with MINA and I wanted the trace generated for the server initialization and a simple client call. I added this argument to the server command line :

-javaagent:/Users/elecharny/jip-1.1.1/profile/profile.jar -Dprofile.properties=/Users/elecharny/jip-1.1.1/profile/profile.properties

and ran my server. The profile.properties file has just been modified slightly to fit my needs :

profiler=on
remote=off
port=15599
ClassLoaderFilter.1=com.mentorgen.tools.profile.instrument.clfilter.StandardClassLoaderFilter
thread-depth=-1
thread.compact.threshold.ms=1
max-method-count=cw-1compact
method.compact.threshold.ms=1
file=/Users/elecharny/jip-1.1.1/profile.txt
exclude=org.eclipse,org.junit,org.slf4j,org.apache.log4j,org.apache.commons
track.object.alloc=on
output=text
output-method-signatures=yes
clock-resolution=ms

Basically, I just set a file name to store the trace, filtered out some classes, and extended the thread depth to infinite. The property file itself is quite well documented, and it should not be a problem to run the tool with a very minimum of RTFMing.

Here is a sample of the trace I got :
+------------------------------
| Thread: 1
+------------------------------
Time Percent
----------------- ---------------
Count Total Net Total Net Location
===== ===== === ===== === =========
1 422.1 96.1 100.0 22.8 +--Server:main (mina.test2)
1 106.8 23.9 25.3 5.7 | +--NioSocketAcceptor: (org.apache.mina.transport.socket.nio)
1 82.7 16.1 19.6 3.8 | | +--AbstractPollingIoAcceptor: (org.apache.mina.core.polling)
1 24.4 0.0 5.8 | | | +--SimpleIoProcessorPool: (org.apache.mina.core.service)
1 24.4 11.1 5.8 2.6 | | | | +--SimpleIoProcessorPool: (org.apache.mina.core.service)
1 13.1 6.4 3.1 1.5 | | | | | +--NioProcessor: (org.apache.mina.transport.socket.nio)
1 6.7 6.5 1.6 1.5 | | | | | | +--AbstractPollingIoProcessor: (org.apache.mina.core.polling)
1 30.9 0.8 7.3 0.2 | | | +--AbstractPollingIoAcceptor: (org.apache.mina.core.polling)
1 30.0 0.1 7.1 | | | | +--AbstractIoAcceptor: (org.apache.mina.core.service)
1 29.9 11.9 7.1 2.8 | | | | | +--AbstractIoService: (org.apache.mina.core.service)
2 15.7 13.8 3.7 3.3 | | | | | | +--NioSocketAcceptor:getTransportMetadata (org.apache.mina.transport.socket.nio)
1 1.8 1.7 0.4 0.4 | | | | | | | +--DefaultTransportMetadata: (org.apache.mina.core.service)
1 16.1 0.1 3.8 | +--IoBuffer:wrap (org.apache.mina.core.buffer)
1 16.1 0.0 3.8 | | +--IoBuffer:wrap (org.apache.mina.core.buffer)
1 16.0 15.9 3.8 3.8 | | | +--SimpleBufferAllocator:wrap (org.apache.mina.core.buffer)
1 8.3 8.1 2.0 1.9 | +--BufferCodec: (mina.test2)
1 1.3 1.1 0.3 0.3 | +--DefaultIoFilterChainBuilder:addLast (org.apache.mina.core.filterchain)
1 17.3 0.0 4.1 | +--AbstractIoAcceptor:bind (org.apache.mina.core.service)
1 17.3 0.1 4.1 | | +--AbstractIoAcceptor:bind (org.apache.mina.core.service)
1 16.9 1.0 4.0 0.2 | | | +--AbstractPollingIoAcceptor:bindInternal (org.apache.mina.core.polling)
1 4.2 1.5 1.0 0.4 | | | | +--AbstractPollingIoAcceptor:startupAcceptor (org.apache.mina.core.polling)
1 2.6 0.1 0.6 | | | | | +--AbstractIoService:executeWorker (org.apache.mina.core.service)
1 2.6 2.3 0.6 0.5 | | | | | | +--AbstractIoService:executeWorker (org.apache.mina.core.service)
1 11.1 11.1 2.6 2.6 | | | | +--NioSocketAcceptor:wakeup (org.apache.mina.transport.socket.nio)
+------------------------------
| Thread: 11
+------------------------------

This first stack trace is the server bind() method execution.

Here is the stack trace for a message being processed :


+------------------------------
| Thread: 14
+------------------------------
Time Percent
----------------- ---------------
Count Total Net Total Net Location
===== ===== === ===== === =========
1 1115.2 0.1 100.0 +--NamePreservingRunnable:run (org.apache.mina.util)
1 1115.1 0.6 100.0 | +--AbstractPollingIoProcessor$Processor:run (org.apache.mina.core.polling)
6 1048.9 1048.9 94.1 94.1 | | +--NioProcessor:select (org.apache.mina.transport.socket.nio)
6 2.1 0.1 0.2 | | +--AbstractPollingIoProcessor:access$6 (org.apache.mina.core.polling)
6 2.0 0.0 0.2 | | | +--AbstractPollingIoProcessor:handleNewSessions (org.apache.mina.core.polling)
1 2.0 0.1 0.2 | | | | +--AbstractPollingIoProcessor:addNow (org.apache.mina.core.polling)
1 1.2 0.6 0.1 | | | | | +--IoServiceListenerSupport:fireSessionCreated (org.apache.mina.core.service)
6 24.2 0.1 2.2 | | +--AbstractPollingIoProcessor:access$9 (org.apache.mina.core.polling)
6 24.2 0.1 2.2 | | | +--AbstractPollingIoProcessor:flush (org.apache.mina.core.polling)
3 20.8 0.0 1.9 | | | | +--NioProcessor:getState (org.apache.mina.transport.socket.nio)
3 20.8 20.7 1.9 1.9 | | | | | +--NioProcessor:getState (org.apache.mina.transport.socket.nio)
3 3.2 0.2 0.3 | | | | +--AbstractPollingIoProcessor:flushNow (org.apache.mina.core.polling)
2 2.3 0.1 0.2 | | | | | +--AbstractPollingIoProcessor:writeBuffer (org.apache.mina.core.polling)
6 2.5 0.1 0.2 | | +--AbstractPollingIoProcessor:access$10 (org.apache.mina.core.polling)
6 2.4 0.0 0.2 | | | +--AbstractPollingIoProcessor:removeSessions (org.apache.mina.core.polling)
1 2.3 0.0 0.2 | | | | +--AbstractPollingIoProcessor:removeNow (org.apache.mina.core.polling)
1 1.6 0.1 0.1 | | | | | +--IoServiceListenerSupport:fireSessionDestroyed (org.apache.mina.core.service)
1 1.5 0.0 0.1 | | | | | | +--DefaultIoFilterChain:fireSessionClosed (org.apache.mina.core.filterchain)
1 1.2 0.0 0.1 | | | | | | | +--DefaultIoFilterChain:callNextSessionClosed (org.apache.mina.core.filterchain)
1 1.1 0.0 0.1 | | | | | | | | +--IoFilterAdapter:sessionClosed (org.apache.mina.core.filterchain)
1 1.1 0.0 0.1 | | | | | | | | | +--DefaultIoFilterChain$EntryImpl$1:sessionClosed (org.apache.mina.core.filterchain)
1 1.1 0.0 0.1 | | | | | | | | | | +--DefaultIoFilterChain:access$2 (org.apache.mina.core.filterchain)
1 1.1 0.0 0.1 | | | | | | | | | | | +--DefaultIoFilterChain:callNextSessionClosed (org.apache.mina.core.filterchain)
1 1.0 0.0 0.1 | | | | | | | | | | | | +--ProtocolCodecFilter:sessionClosed (org.apache.mina.filter.codec)
4 36.0 0.1 3.2 | | +--AbstractPollingIoProcessor:access$8 (org.apache.mina.core.polling)
4 35.9 0.2 3.2 | | | +--AbstractPollingIoProcessor:process (org.apache.mina.core.polling)
4 34.7 0.1 3.1 | | | | +--AbstractPollingIoProcessor:process (org.apache.mina.core.polling)
2 34.3 0.2 3.1 | | | | | +--AbstractPollingIoProcessor:read (org.apache.mina.core.polling)
3 2.3 0.0 0.2 | | | | | | +--NioProcessor:read (org.apache.mina.transport.socket.nio)
3 2.2 2.2 0.2 0.2 | | | | | | | +--NioProcessor:read (org.apache.mina.transport.socket.nio)
1 31.2 0.0 2.8 | | | | | | +--DefaultIoFilterChain:fireMessageReceived (org.apache.mina.core.filterchain)
1 31.0 0.0 2.8 | | | | | | | +--DefaultIoFilterChain:callNextMessageReceived (org.apache.mina.core.filterchain)
1 30.9 0.0 2.8 | | | | | | | | +--IoFilterAdapter:messageReceived (org.apache.mina.core.filterchain)
1 30.9 0.0 2.8 | | | | | | | | | +--DefaultIoFilterChain$EntryImpl$1:messageReceived (org.apache.mina.core.filterchain)
1 30.9 0.0 2.8 | | | | | | | | | | +--DefaultIoFilterChain:access$5 (org.apache.mina.core.filterchain)
1 30.9 0.0 2.8 | | | | | | | | | | | +--DefaultIoFilterChain:callNextMessageReceived (org.apache.mina.core.filterchain)
1 30.8 0.1 2.8 | | | | | | | | | | | | +--ProtocolCodecFilter:messageReceived (org.apache.mina.filter.codec)
1 1.3 1.2 0.1 0.1 | | | | | | | | | | | | | +--ProtocolCodecFilter:getDecoderOut (org.apache.mina.filter.codec)
1 28.6 0.0 2.6 | | | | | | | | | | | | | +--ProtocolCodecFilter$ProtocolDecoderOutputImpl:flush (org.apache.mina.filter.codec)
1 28.6 0.0 2.6 | | | | | | | | | | | | | | +--DefaultIoFilterChain$EntryImpl$1:messageReceived (org.apache.mina.core.filterchain)
1 28.5 0.0 2.6 | | | | | | | | | | | | | | | +--DefaultIoFilterChain:access$5 (org.apache.mina.core.filterchain)
1 28.5 0.0 2.6 | | | | | | | | | | | | | | | | +--DefaultIoFilterChain:callNextMessageReceived (org.apache.mina.core.filterchain)
1 28.5 0.0 2.6 | | | | | | | | | | | | | | | | | +--DefaultIoFilterChain$TailFilter:messageReceived (org.apache.mina.core.filterchain)
1 28.3 0.3 2.5 | | | | | | | | | | | | | | | | | | +--Server$MyIOHandler:messageReceived (mina.test2)
1 27.9 0.0 2.5 | | | | | | | | | | | | | | | | | | | +--AbstractIoSession:write (org.apache.mina.core.session)
1 27.9 1.4 2.5 0.1 | | | | | | | | | | | | | | | | | | | | +--AbstractIoSession:write (org.apache.mina.core.session)
1 26.1 0.0 2.3 | | | | | | | | | | | | | | | | | | | | | +--DefaultIoFilterChain:fireFilterWrite (org.apache.mina.core.filterchain)
1 26.1 0.0 2.3 | | | | | | | | | | | | | | | | | | | | | | +--DefaultIoFilterChain:callPreviousFilterWrite (org.apache.mina.core.filterchain)
1 26.0 0.0 2.3 | | | | | | | | | | | | | | | | | | | | | | | +--DefaultIoFilterChain$TailFilter:filterWrite (org.apache.mina.core.filterchain)
1 26.0 0.0 2.3 | | | | | | | | | | | | | | | | | | | | | | | | +--DefaultIoFilterChain$EntryImpl$1:filterWrite (org.apache.mina.core.filterchain)
1 26.0 0.0 2.3 | | | | | | | | | | | | | | | | | | | | | | | | | +--DefaultIoFilterChain:access$7 (org.apache.mina.core.filterchain)
1 25.9 0.0 2.3 | | | | | | | | | | | | | | | | | | | | | | | | | | +--DefaultIoFilterChain:callPreviousFilterWrite (org.apache.mina.core.filterchain)
1 25.9 21.3 2.3 1.9 | | | | | | | | | | | | | | | | | | | | | | | | | | | +--ProtocolCodecFilter:filterWrite (org.apache.mina.filter.codec)
1 2.3 2.2 0.2 0.2 | | | | | | | | | | | | | | | | | | | | | | | | | | | | +--ProtocolCodecFilter:getEncoderOut (org.apache.mina.filter.codec)
1 1.2 0.5 0.1 | | | | | | | | | | | | | | | | | | | | | | | | | | | | +--ProtocolCodecFilter$ProtocolEncoderOutputImpl:flushWithoutFuture (org.apache.mina.filter.codec)
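With traces this deep, it can help to let a few lines of code point at the hot spot. Here is a minimal sketch that picks the row with the largest net time. It assumes the whitespace-separated layout shown above (count, total ms, net ms, percentages, then the tree and the location). JipTrace is a hypothetical helper, not part of JIP.

```java
import java.util.List;

// Hypothetical helper: assumes the whitespace-separated layout of the trace shown above.
final class JipTrace
{
    /** One parsed row: call count, net time in ms, and the Class:method location. */
    static final class Row
    {
        final int count;
        final double netMs;
        final String location;

        Row( int count, double netMs, String location )
        {
            this.count = count;
            this.netMs = netMs;
            this.location = location;
        }
    }

    /** Parses one row, or returns null for headers, separators and blank lines. */
    static Row parse( String line )
    {
        int marker = line.indexOf( "+--" );
        if ( marker < 0 )
        {
            return null;
        }
        // Columns before the tree drawing: Count, Total ms, Net ms, then percentages.
        String[] cols = line.substring( 0, marker ).replace( '|', ' ' ).trim().split( "\\s+" );
        if ( cols.length < 3 )
        {
            return null; // e.g. the "+-----" separator lines
        }
        String location = line.substring( marker + 3 ).trim();
        int paren = location.indexOf( " (" );
        if ( paren >= 0 )
        {
            location = location.substring( 0, paren ); // drop the package name
        }
        return new Row( Integer.parseInt( cols[0] ), Double.parseDouble( cols[2] ), location );
    }

    /** Returns the row with the largest net time, or null if no row could be parsed. */
    static Row hottest( List<String> lines )
    {
        Row best = null;
        for ( String line : lines )
        {
            Row row = parse( line );
            if ( row != null && ( best == null || row.netMs > best.netMs ) )
            {
                best = row;
            }
        }
        return best;
    }
}
```

Fed with the rows of thread 14, this points at NioProcessor:select, with 1048.9 ms of net time over 6 calls. That is presumably mostly time spent waiting for I/O, so the next candidates in the list are the ones worth a look.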

Tuesday, January 26, 2010

Pick good names for your methods/data structures...

I had a hard time today, trying to get a clue about what is what in the Apache Directory Server project.

When Alex Karasulu started to work on the project back in 2001, he had to design a brand new ASN.1 codec, producing LDAP messages. He named those messages and this codec "Snickers", just because he had used the "Snacc" ASN.1 compiler before (Snacc couldn't be used by the project, because it was not free to use).

So we had our Snickers codec, and SnickersXXX messages.

When I joined the project in 2005, I was crazy enough to think that I could improve the performance of this portion of the code. Sure I did, but I picked some other funny names, "Twix" Codec and TwixMessageXXX messages.

Great ! It was funny, we laughed a lot, and congratulated each other, thinking how funny we were ... What I didn't say is that I had to write a converter from Snickers to Twix messages, as Twix messages are used in the front end, and Snickers messages are manipulated all over the back end. So converting back and forth was mandatory.

Damn asses ...

5 years later, when I come back to this crap, I have *no* bloody idea about what is what. Is Twix for the frontend or the backend ?

When you pick a name, and when you think it's funny, just think about those, and probably you, who will not have fun at all when it's time to fix some code in this area, with no clue about what Twix and Snickers are...

A Pale is a Pale. Don't name it Chair because it sounds funny.

Friday, January 22, 2010

Fixing performance issues in your application

I was lucky enough to attend a presentation given by Kirk Pepperdine last Wednesday. I won't introduce Kirk (you can check his résumé here ); it's enough to say that he is well known as a Java performance guru.

The presentation was in two parts : the first was a Q&A session, the second was a live debugging of an application that had been carefully slowed down by introducing bugs into it.

Preamble
Let me talk about the application first : the guys who invited Kirk (AFAIK, it was an 'extra', provided as a bonus following an internal presentation they paid for. Thanks to Xebia for sharing this presentation with external people) had prepared the application (the well-known and useless Pet Clinic) by adding some of the anti-patterns they have met when consulting for many of their clients. Kirk had no clue about the bugs that had been injected.


Q&A
We were asked to provide some questions when we registered, and Kirk answered them extensively. Here are some of the Q&As I remember :

Q : Which GC should we use ?
A : The one which works. Usually, just focus on your application, you won't need to pick a specific GC.

Q : What do you think about other languages like Groovy, Scala, wrt performance ?
A : It's irrelevant. Picking a language to develop your application should not be a matter of performance only. 'Whatever works' is the way to go. If you want to build an application fast, and if it's not expected to be heavily loaded, then even PHP is a good choice.

Q : What tools do you use to check for performance bottlenecks ?
A : A few : a system monitor, HP-JMeter, and VisualVM

Q : How can you best write an application which depends heavily on concurrent code ?
A : Don't use any synchronization. There are ways to avoid synchronization, based on state machine theory. (pointers needed here ...)

Q : What is the ratio of GC problems you have to deal with when working for a client ?
A : Around 40%. Given that I'm the last hope for many of my clients, it may be an irrelevant number. Usually, people successfully fix the easier issues themselves.

Q : Do you check the code when you start tracking some performance issue ?
A : Never. I'm not a coder, I don't have time to go through thousands of lines of code. I just spot the place in the code which has a problem.

Q : Managers don't let me add traces to the application in production… What should I tell them ?
A : Managers know the difference between a slow application and a dead application. Do what you have to do, or find another client. (in other words : you don't cure cancer with aspirin...)

Q : Which profiler do you use, or prefer ?
A : YourKit: it's simple and efficient.

The most interesting presentation I have seen in years. I actually learned things in an area I thought I was proficient in…
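Kirk's answer on concurrent code came without pointers, so what follows is only one possible reading of it, not his recommendation : keep the shared state in a small explicit state machine and move between states with an atomic compare-and-set instead of a synchronized block. ConnectionState and its states are hypothetical names picked for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical example: a life cycle driven by compare-and-set, with no synchronized block.
final class ConnectionState
{
    enum State { NEW, OPEN, CLOSED }

    private final AtomicReference<State> state = new AtomicReference<State>( State.NEW );

    /** Moves NEW -> OPEN; returns false if the state was not NEW any more. */
    boolean open()
    {
        return state.compareAndSet( State.NEW, State.OPEN );
    }

    /** Moves OPEN -> CLOSED; only one caller can win this transition. */
    boolean close()
    {
        return state.compareAndSet( State.OPEN, State.CLOSED );
    }

    /** Returns the state as seen at the time of the call. */
    State current()
    {
        return state.get();
    }
}
```

If two threads call close() at the same time, exactly one gets true and does the cleanup, and the other gets false and simply moves on. Nobody blocks, which is the point of the exercise.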

Live demo

What was the crux about this part was the processes Kirk adopted to point out the problems in the code.

Step 1

First, he asked for a baseline to work on. Namely, you should have a scenario which demonstrates the kind of real performances issues a real client perceives. Improving some application which is already perceived as working is a waste of time, energy and money. Without a base line, you also have no way to check that you have improved the application. Last, not least, define your expectations, otherwise, you won't meet them ! So here, the team has defined a JMeter test, and defined the expected response time for each page.

Step 2

Second, run the baseline scenario, and measure the response time, plus a few other counters :
- CPU (user and system)

That's it, nothing more. At this point, the code had not even been looked at. The only thing Kirk did was to remove all the JVM tuning, like the minimum and maximum memory sizes, and every other premature setting.

The rationale is that you have no idea at this point whether those parameters help, but they certainly have an impact, and probably pollute the results.
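If you are not sure which tuning flags a JVM was started with, the JVM can tell you. The sketch below is my own illustration, not something shown during the demo: it lists the non-standard options (those starting with -X, which covers both -Xms/-Xmx and the -XX: family) of the JVM it runs in. The class name is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the tuning options a JVM was started with. */
final class JvmTuningFlags {

    /** Keeps only the non-standard options (-X... and -XX:...), where heap and GC tuning lives. */
    static List<String> tuningFlags(List<String> jvmArgs) {
        List<String> flags = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-X")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Tuning flags in effect: " + tuningFlags(jvmArgs));
    }
}
```

An empty list is what you want before taking the first measurement.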

Looking at the CPU consumption and response time (90% CPU, around 5% system), with an average of 10s per page, it was clear the application had a performance issue, but there was no clue yet about what was going on.

Step 3

Then he checked how the GC was behaving. He added some options to the JVM settings to generate GC traces, ran the application for a few minutes, then checked the logs ("You have to be patient ! Memory leaks may take a while to be noticed.")

A quick look at the metrics showed that the GC was eating 13% of the whole CPU. Way too much.
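I did not note which JVM options Kirk used to produce the GC traces (-verbose:gc is one standard way to get them). As a rough stand-in, the sketch below estimates the GC overhead from inside the JVM, using the standard management beans. Beware that it measures the share of elapsed time spent collecting, which is only an approximation of the CPU share read from the logs. The class name is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: rough GC overhead of the current JVM. */
final class GcOverhead {

    /** Share of elapsed time spent in GC, as a percentage. */
    static double percent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    /** Sums the accumulated collection time of every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead: %.1f%%%n", percent(totalGcMillis(), uptime));
    }
}
```

With 130 ms of collection over 1000 ms of uptime, percent returns 13, the kind of figure that was judged way too high here.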

Step 4

Kirk then decided to connect to the running application using VisualVM. The idea was to check how objects were allocated. After a few minutes of tests, the allocated objects graph showed a linear increase over time, which means a memory leak.
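Kirk read the trend straight off the VisualVM graph. For those who prefer numbers, the same idea can be sketched in a few lines: sample the used heap at regular intervals and compute the slope of the line that best fits the samples. This is only my illustration, with a hypothetical class name, and it is cruder than the graph: used heap goes up and down with each collection, so the samples are only meaningful over a long enough run.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

/** Hypothetical helper: trend of a series of heap samples. */
final class HeapTrend {

    /** Least-squares slope of the samples, in units per sample interval. */
    static double slope(long[] samples) {
        int n = samples.length;
        if (n < 2) {
            return 0.0;
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (long s : samples) {
            meanY += s;
        }
        meanY /= n;
        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (samples[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long[] used = new long[5];
        for (int i = 0; i < used.length; i++) {
            used[i] = memory.getHeapMemoryUsage().getUsed();
            Thread.sleep(200); // sampling interval, far too short for a real diagnosis
        }
        System.out.printf("Heap growth: %.0f bytes per sample%n", slope(used));
    }
}
```

A slope that stays clearly positive run after run, under a steady load, is the numeric version of the linear increase seen on the graph.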

Finding the memory leak was a matter of minutes : find an application object (no need to check a Java object like byte[] or String : "Java collection objects don't leak…"). The key for Kirk is the number of generations an object has survived : the higher this number, the more likely this object is leaking. Very new to me.

As a side note, he also said that many of the existing tools don't provide this generation number. They base the detection of leaking objects on the delta between snapshots. Not convenient.

Then you can find where the object was allocated by checking the stack trace, and now, look at the code.

At this point, the important lesson is : only look at the code once you know which method has the problem.

(the application had another memory leak, which he found too, using the very same approach)

Another lesson : he asked to remove the caches in the code, instead of blindly guessing what was wrong with them (they were leaking). His motto was : "Why would you optimize your code by adding cache when you have no idea about what's going wrong in your code ?"

Step 5

Once this initial problem was fixed, he re-ran the test, and saw that the CPU was not going any higher than 50%. Very wrong when the response time was still awful. In this case, the system CPU was high (the ratio between system and user should be around 5-10% / 90-95%).

What does it mean ? Contention. How to find where we have contention ? Easy : generating a thread-dump.

No fancy profiler, no long source reading, just a thread-dump.

It immediately showed that only two threads were used to serve the 50 concurrent clients hitting the application.
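I did not note how the dump was taken during the demo (jstack on the process id is one common way). If you cannot attach anything to the JVM, a dump can also be produced from inside the application with the standard Thread API. The sketch below, with a hypothetical class name, prints every thread with its stack and a count of threads per state, which is enough to spot a pool that is far too small for the load.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper: in-process thread dump with a per-state summary. */
final class ThreadStateSummary {

    /** Counts the given threads per state (NEW, RUNNABLE, BLOCKED, WAITING...). */
    static Map<Thread.State, Integer> countByState(Iterable<Thread> threads) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : threads) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Snapshot of all live threads and their stacks.
        Map<Thread, StackTraceElement[]> dump = Thread.getAllStackTraces();
        System.out.println("Threads per state: " + countByState(dump.keySet()));
        for (Map.Entry<Thread, StackTraceElement[]> e : dump.entrySet()) {
            System.out.println("\"" + e.getKey().getName() + "\" " + e.getKey().getState());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```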

A quick tuning of Tomcat (the number of threads accepting requests), and we moved to the next step.

Step 6

One last measurement now showed much better performance, but with a very high system CPU usage : around 20%.

Same action here : thread dump, look at the blocked threads, go to the portion of code where the thread was waiting. A bad Thread.sleep( 100 ) was found in the code.
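In the demo the sleep was spotted by reading the dump. The same search can be sketched in code: walk the stacks of all threads, keep those whose top frames are inside the sleep code of java.lang.Thread, and report the first frame outside that class, which is the caller to go and look at. This is my own illustration with hypothetical names; the poller thread in main only exists to give the example something to find.

```java
import java.util.Map;

/** Hypothetical helper: finds threads parked in Thread.sleep and who put them there. */
final class SleepFinder {

    /** True if one of the top frames is a sleep method of java.lang.Thread. */
    static boolean isSleeping(StackTraceElement[] stack) {
        int depth = Math.min(stack.length, 4); // the sleep frames sit at the very top
        for (int i = 0; i < depth; i++) {
            if ("java.lang.Thread".equals(stack[i].getClassName())
                    && stack[i].getMethodName().startsWith("sleep")) {
                return true;
            }
        }
        return false;
    }

    /** First frame outside java.lang.Thread: the code that called sleep, or null if none. */
    static StackTraceElement caller(StackTraceElement[] stack) {
        for (StackTraceElement frame : stack) {
            if (!"java.lang.Thread".equals(frame.getClassName())) {
                return frame;
            }
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread poller = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(100); // the kind of polling delay found in the demo
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "poller");
        poller.setDaemon(true);
        poller.start();
        Thread.sleep(50); // give the poller time to reach its sleep

        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (isSleeping(e.getValue())) {
                System.out.println(e.getKey().getName() + " sleeps, called from " + caller(e.getValue()));
            }
        }
    }
}
```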

And that was it for the demonstration : 2 hours to fix bugs that would have taken days and days for most of us !

Conclusion

In two hours, he made the application run way faster, simply by using a couple of tools, and without reading the code.

Impressive.

Thanks to Kirk Pepperdine, Cyrille Le Clerc and Xebia !

Follow up
I have forgotten a few things :
  • at some point after step 4, GC went up to 65%. Kirk suspected that some part of the code was calling the GC explicitly. You bet !
  • after the presentation, Kirk said that the very first step is really to catch all the GC problems, as they will probably hide other problems.

Saturday, January 09, 2010