J2EE Connector Architecture: Easing EA
Automatic Memory Management in the Java HotSpot Virtual Machine
Understanding the Java 2 Platform, Standard Edition Privileged Code: A Practical Approach
Optimizing for the Java Servlet API and Java DataBase Connectivity (JDBC) Technology
A Crash Course in Dynamic Class Generation
The talk was presented by David Chappell. Back in the early COM days he wrote one
of the best high-level overview books on COM. He's now the Chief Technology Evangelist
for Sonic Software having switched camps a long time ago. He has authored O'Reilly's Java
Message Service and Java Web Services books and Professional ebXML Foundations for Wrox Press.
Rosetta Stone for this talk:
- EAI - Enterprise Application Integration
- EIS - Enterprise Information System(s)
- JCA - J2EE Connector Architecture; the acronym is also used for the Java Cryptography Architecture
The J2EE Connector Architecture defines a standard for connecting the J2EE platform
to enterprise information systems. It defines connection management, transaction management,
and security. It's nothing more than a common API on top of resource adapters supplied by
vendors that then talk to your EIS. It can manage transactions across multiple resource
managers.
The service provider interfaces are in javax.resource.spi and the common client
interfaces are in javax.resource.cci.
Here's some code:
ConnectionSpec connSpec =
    new com.sun.connector.cciblackbox.CciConnectionSpec(username, password);
javax.resource.cci.ConnectionFactory factory =
    (javax.resource.cci.ConnectionFactory) conn.getConnectionFactory();
(... more code that I didn't get typed in ...)
ix = m_connection.createInteraction();
CciInteractionSpec iSpec = new CciInteractionSpec();
iSpec.setFunctionName(storedProcedureName);
IndexedRecord iRec = m_recordFactory.createIndexedRecord(...);
(... some more code that I didn't get typed in ...)
JCA is currently defined so that the communication paths are one-way. In the future,
you will be able to subscribe to events.
The speaker then went into a long, boring non-technical discussion of the challenges with
and problems of EAI. Now back into some semi-technical stuff. The speaker doesn't advocate
using an EJB as an integration point into the application. In his words, "That's stupid."
He first proposes using remotable resource adapters through JCA. The next alternative is
to use a JMS enterprise message bus. His final solution is to use a JMS enterprise message
bus which feeds into a remotable JCA container as the entry point to each application server.
This way by the time you figure out how to use JMS after buying his JMS book, you'll then be
ready to buy his JCA book which will no doubt be out by then.
The summary for this talk is that Chappell writes half-decent books but doesn't really
have enough technical knowledge to give a decent talk. In the Q&A session he brings one of
his "colleagues" on stage to help answer the hard stuff.
Earlier in the week I walked out of a garbage collection talk before it even started
after coming to the realization that I've never been to a session on garbage collection
that ended up being worthwhile. Now I find myself back at one again. I seem to have a
fatal attraction with garbage collection. I suppose it's a type-A personality thing.
Bob Marley is playing over the speakers and I take it that might be a good omen.
The speaker is Y. Srinivas Ramakrishna. With a name like that, it makes you wonder what
the "Y" is an abbreviation of. He's a staff engineer at Sun.
"As a result of this presentation, you will be able to write better programs in Java."
That's nice. Garbage collection is the reclamation of unused memory. If you've ever
witnessed homeless people in San Francisco rummaging through garbage cans you can use
that as a frame of reference. If that's not enough visualization for you, memory
management (the politically correct term) also helps prevent "dangling pointers".
There are different algorithms and policies. There is a trade-off between use of heap
space and collector speed. There is a trade-off between pause times and collector throughput.
Orthogonal dimensions of the garbage collection design space:
- Accurate vs. Conservative
- Handleless vs. Handle-based
- Partial collection vs. Full collection
- Stop-the-World vs. Concurrent
- Single-threaded vs. Multi-threaded
- Free Wal-Mart bags vs. 30lb. Hefty bags
In accurate GC there are no accidental memory leaks. In conservative GC there are
accidental memory leaks. It makes you wonder why they didn't just call it "GC with
memory leaks" and "GC without memory leaks". Accurate GC enables object migration.
Conservative GC follows the zoo design pattern and there is no object migration.
Accurate GC is hard to implement because it has to be accurate. Conservative GC is simple.
On the next slide, A has a reference to B. B has a reference to C. C has a reference to
A and B. This is a handleless system. In a handle-based system we have a, b, and c with
even more little arrows going in different directions.
Partial GC involves independently collected regions similar to the garbage collection
in the suburbs. In full GC the entire heap is collected each time. This is similar to a
rainstorm in 18th Century Europe. As long as everyone dumps their trash from their window
onto the street beforehand, then everything is fine. Perhaps a better analogy is Star Wars
Episode V where the Imperial Star Destroyer dumps its heap in someone else's quadrant before
making the jump to hyperspace.
Stop-the-world GC involves freezing your object graph. This is slow but it helps to
mask the smell. In multi-threaded GC synchronization is required. No mad rushes to the
compactor.
Java HotSpot is accurate, handleless, partial, stop-the-world (default), and single-threaded
(default).
To do GC we need reflection via java.lang.Object.getClass(), synchronization, and an
immutable hash code.
Object Layout in the Java HotSpot Virtual Machine. There is a 2-word object header
consisting of GC status bits, hash code bits, synchronization bits, and class pointer.
In HotSpot stop-the-world GC only threads executing bytecode are stopped. If a thread
is already stopped then it's not stopped. Before threads are stopped there is a security
checkpoint where the VM tracks the thread's state, otherwise known as "peeking under the
threads" (speaker's words, not mine).
After a "peeking" episode, new objects are allocated in the "nursery". The nursery
can be thread-local. An allocation updates a single pointer, usually in-lined by
compilers. The statement new java.lang.Object(…) usually takes about 10 separate
instructions to the native people.
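None of this was on the slides, but the "allocation updates a single pointer" idea is easy to sketch. The following is a toy model under my own assumptions (the class name BumpAllocator and the byte-array "nursery" are made up for illustration), not HotSpot's actual allocator:

```java
// Toy model of nursery allocation: the idea only, not HotSpot code.
final class BumpAllocator {
    private final byte[] nursery;   // stand-in for a thread-local chunk of heap
    private int top = 0;            // the single pointer that allocation updates

    BumpAllocator(int sizeInBytes) {
        nursery = new byte[sizeInBytes];
    }

    /** Returns the offset of the new object, or -1 if the nursery is full. */
    int allocate(int sizeInBytes) {
        if (sizeInBytes < 0 || sizeInBytes > nursery.length - top) {
            return -1;              // a real VM would trigger a collection here
        }
        int address = top;
        top += sizeInBytes;         // the "bump"
        return address;
    }

    /** Number of bytes handed out so far. */
    int used() {
        return top;
    }
}
```

If each thread owns its own nursery, nothing here needs a lock, which is presumably why the thread-local version is attractive.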
The default collector in HotSpot is the "mark, sweep, compact" collector. Then there
is the "incremental", aka "train" collector, and concurrent collectors will probably be
added in the future. The "mark, sweep, compact" collector is efficient when the garbage
ratio is low. It pauses proportional to the heap size. First, all objects on the heap are
painted gray. Birds which have been fed gray soy earlier fly over the heap until this process
is complete. Then the GC walks on the heap until it finds objects that should be compacted.
Then it jumps up and down until the objects are blackened. Objects move from gray to black
while "Paint it Black" blares out in the background thread.
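Birds and soy aside, the marking part boils down to a reachability walk from the roots. Here is a minimal sketch of a textbook mark phase using a gray worklist and a black "done" set. The names ToyMarker and Node are my own inventions, and this is an illustration of the general technique rather than what HotSpot does internally:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyMarker {
    /** A heap object that only knows which other objects it references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Returns every node reachable from the roots (the "black" set). */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> black = new HashSet<>();            // fully scanned objects
        Deque<Node> gray = new ArrayDeque<>(roots);   // found but not yet scanned
        while (!gray.isEmpty()) {
            Node n = gray.pop();
            if (black.add(n)) {                       // first visit only
                for (Node child : n.refs) {
                    if (!black.contains(child)) {
                        gray.push(child);
                    }
                }
            }
        }
        return black;
    }
}
```

Anything not in the returned set is garbage and can be swept. Cycles such as A, B, and C pointing at each other are handled because a node is only scanned the first time it turns black.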
In the train algorithm all of your trash is put onto trains. There are many trains. Cars
are collected individually. The "mostly-concurrent" collector is under development. It is
mostly concurrent. There are no real-time guarantees. In the speaker's words, "Mutators run
while the collector sweeps up unmarked objects, optionally coalescing adjacent free blocks."
I think that makes it crystal clear.
Future GC algorithms that have been proposed include "parallel scavenge", "parallel mark,
sweep, compact", "mix-and-match some combinations", and "thrash and burn".
There are no questions about death in Java. When you die, you become unreachable.
Your finalize() method is invoked. Then you're reincarnated to the heap.
The speaker then showed an interesting demo using the mother of all application design
patterns, the screensaver. In the first version, there is no GC; hence, no pauses in the
screensaver animation. With GC you get periodic pauses.
In summary, there is nothing hot about HotSpot. If it seems like your program stops
running at regular intervals while the garbage collector does its thing, it's because
the garbage collector stops your program at regular intervals while it does its thing.
The concurrent GC looks cool, but the speaker made no mention of how to turn this on. Is
this in Client HotSpot or Server HotSpot or both? Is it a command-line flag? I doubt the
speaker even knows. All the concepts behind garbage collection are cool, and if they would
let those of us who know how to write usable code work on it for a while, perhaps the world
would be a cleaner place.
In the post-Sept. 11th world you need to secure your Java code. Terrorists are prevalent
in the JVM and suicide-bombers give a new meaning to "bytecode".
The speakers are from the IBM T.J. Watson Research Center and are currently authoring
Enterprise Java Security to be published by Addison-Wesley in 2002.
There are a lot of permissions: SocketPermission, RuntimePermission, FilePermission,
AWTPermission, and SecurityPermission. Then there's the concept of CodeSource. This is a
set of signers (certificates) and a CodeBase URL. The default implementation uses a policy
file associated with a CodeSource. A ProtectionDomain is a CodeSource and a set of Permissions.
Classes loaded via a ClassLoader are assigned to a ProtectionDomain. Classes are in
the same ProtectionDomain when the classes are signed by the same keys and are from the
same URL, in other words they're from the same country and their passports are signed by
the same passport signer dude. Classes are in different ProtectionDomains when the
CodeSources are different. The SecureClassLoader assigns a ProtectionDomain to each class.
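To see which domain a class landed in, you can ask the class itself. This is my own small sketch, not something from the slides. The class name DomainPeek is made up; Class.getProtectionDomain() and ProtectionDomain.getCodeSource() are standard API:

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

final class DomainPeek {
    /** Describes where a class came from, according to its ProtectionDomain. */
    static String describe(Class<?> c) {
        ProtectionDomain pd = c.getProtectionDomain();
        CodeSource cs = pd.getCodeSource();
        if (cs == null) {
            // Classes loaded by the bootstrap loader report no CodeSource.
            return c.getName() + ": no CodeSource (typically a runtime class)";
        }
        return c.getName() + ": loaded from " + cs.getLocation();
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        System.out.println(describe(DomainPeek.class));
    }
}
```

Running it against one runtime class and one of your own classes should show the system/application split fairly directly.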
The default Policy implementation reads from a flat file like this:
grant signedBy "mykey",
      codeBase "file:/application/*"
{
    permission java.lang.RuntimePermission "queuedPrintJob";
    permission java.io. (.. switched slides too fast ..)
There are two domains. The system domain has the Java runtime classes, and there are
no constraints on them. They're special classes, much like the "special class" you remember
from grade school where the retarded people were sent periodically. The application domain
has application classes and libraries.
From this point on the speakers increased their trend of covering material too quickly,
too fast for me to understand it and flipping slides too fast for me to copy it down to look
at later. I got up and left hoping to spend twenty minutes on the trade show floor before it
closes for the week only to find that it had already closed.
The best session I went to last year was given by a consultant. This talk is too. Tim
Kientzle, Independent Consultant, www.kientzle.com. He might have given the talk I went to
last year. I don't remember. Anyway, this will be a good test for my theory that the best
talks are given by consultants and in general those with real-world experience as opposed
to lab rats from Sun and IBM.
Yes! This is the same speaker, exact same talk too. It should be worthwhile though.
The talk starts off with the question of how do we make the HelloWorld servlet three
times faster:
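import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

class HelloWorld extends HttpServlet
{
    private String html = " ... ";
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws IOException
    {
        PrintWriter socket;
        socket = response.getWriter();
        socket.print(html);   // chars get converted to bytes on every request
    }
}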
The answer is given further below.
Techniques:
- Avoid creating objects (much of your Java code can be eliminated if you don't create objects)
- Synchronize carefully (i.e. use protection)
- Beware of char/byte conversions (It's a dangerous world.)
A servlet container creates a new thread for each simultaneous request. A heavily loaded
web server can easily have 200-300 threads running at a given instant. Servlets are long-lived,
possibly even weeks and months or even Millennia, so memory management matters. Most of your
created objects within a Servlet are short-lived. Heap thrashing can kill performance.
Basic Measurements: Pages/Second. Test from outside your program. How many pages can your
server spit out within a minute? How far can it spit? The Apache ab program is a good start.
Browse the website during a test to look for synchronization bugs. Also grab thread dumps
(use gloves) while the test is running.
Basic Measurements: Seconds/Page. Test from within your program. Check
System.currentTimeMillis() at the start and end of critical sections and report to System.err,
but not too frequently. It's better to keep counts for time ranges instead. How many requests
are getting answered in less than a tenth of a millisecond, less than a millisecond, etc.
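The "keep counts for time ranges" advice can be sketched in a few lines. The speaker didn't show code for this, so the class name LatencyBuckets and the bucket boundaries below are my own assumptions (System.currentTimeMillis() can't resolve anything below a millisecond, so I start there):

```java
import java.util.concurrent.atomic.AtomicLongArray;

final class LatencyBuckets {
    // Upper bounds in milliseconds; the last bucket catches everything slower.
    private static final long[] LIMITS_MS = {1, 10, 100, 1000};
    private final AtomicLongArray counts = new AtomicLongArray(LIMITS_MS.length + 1);

    /** Files one measurement under the first limit it is strictly below. */
    void record(long elapsedMs) {
        int i = 0;
        while (i < LIMITS_MS.length && elapsedMs >= LIMITS_MS[i]) {
            i++;
        }
        counts.incrementAndGet(i);
    }

    long count(int bucket) {
        return counts.get(bucket);
    }

    /** Times a critical section and records how long it took. */
    void time(Runnable criticalSection) {
        long start = System.currentTimeMillis();
        try {
            criticalSection.run();
        } finally {
            record(System.currentTimeMillis() - start);
        }
    }
}
```

The counters are atomic so request threads can record without a synchronized block, and you only print the handful of totals every so often instead of one line per request.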
Other Useful Measurements. The database requests per page. How heavily loaded is the
system? Heap usage is determined by:
Runtime.totalMemory() - Runtime.freeMemory()
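Note that totalMemory() and freeMemory() are instance methods, so you reach them through Runtime.getRuntime(). A minimal sketch (the class name HeapUsage is mine):

```java
final class HeapUsage {
    /** Bytes of heap currently in use: total minus free. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.err.println("heap in use: " + (usedBytes() / 1024) + " KB");
    }
}
```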
JVM Thread Dump. On UNIX "kill -QUIT", under Windows "Ctrl-Break", will cause the JVM
to bend over and dump a stack. Look at it to see how many threads are in the same method.
(You'll need good lighting.) Are threads progressing with successive dumps? Turn off JIT
to get line numbers in stack dumps.
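If you'd rather count than squint, the "how many threads are in the same method" question can also be answered from inside the program. This sketch is my own addition and relies on Thread.getAllStackTraces(), which arrived in Java 5, after this talk was given. The class name ThreadCensus is made up:

```java
import java.util.Map;
import java.util.TreeMap;

final class ThreadCensus {
    /** Counts live threads by the method at the top of each stack. */
    static Map<String, Integer> topFrames() {
        Map<String, Integer> counts = new TreeMap<>();
        for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
            if (stack.length == 0) {
                continue;   // no frames available for this thread
            }
            String where = stack[0].getClassName() + "." + stack[0].getMethodName();
            counts.merge(where, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> e : topFrames().entrySet()) {
            System.err.println(e.getValue() + " thread(s) in " + e.getKey());
        }
    }
}
```

A big number next to a single method, especially one that is waiting on a lock, is the smell you're looking for.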
When you look in a stack dump, look for the number of threads waiting on a monitor.
Are system monitors a bottleneck? Should we buy more monitors? Flatscreen?
A synchronized block creates a monitor and a write barrier in a servlet. Instance data
for a servlet must be synchronized, if not avoided altogether.
This example creates three heap allocations:
String buildForm(arg)
{
    StringBuffer b = new StringBuffer();   // two heap allocations
    b.append(...);
    b.append(...);
    b.append(...);
    return b.toString();                   // one heap allocation
}
You can avoid the heap allocations by passing the buffer around instead:
void buildForm(StringBuffer b, arg)
{
    b.append(...);
    b.append(...);
    b.append(...);
}
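Filled in with something concrete, the pass-the-buffer pattern might look like this. The form fields and the names FormBuilder and appendField are my own invention for illustration, and there's no HTML escaping, so treat it as a sketch of the allocation pattern only:

```java
final class FormBuilder {
    /** Appends one field to the caller's buffer; allocates no buffer of its own. */
    static void appendField(StringBuffer b, String name, String value) {
        b.append("<input name=\"").append(name)
         .append("\" value=\"").append(value).append("\">");
    }

    /** Top level: one buffer for the whole page, one final toString(). */
    static String buildForm(String[][] fields) {
        StringBuffer b = new StringBuffer(256);   // presized to avoid regrowth
        b.append("<form>");
        for (String[] f : fields) {
            appendField(b, f[0], f[1]);
        }
        b.append("</form>");
        return b.toString();
    }
}
```

However many helper methods get involved, there is still only one buffer and one String created per page.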
Tips for reducing heap activity:
- Use long instead of java.util.Date. The Date class is nothing but a long with a bunch of deprecated methods.
- Use several scalars instead of a small array
- Use arguments and locals instead of member variables
- Use char array instead of StringBuffer if you're appending individual characters.
- Pass StringBuffers around.
- Design methods to modify pre-existing objects, not create new ones.
- Feed your servlets lots of fiber.
Be careful to set the HTTP response headers correctly, both for pages you want cached and
for those you don't.
Cache-Control Headers
| Header        | Cacheable | Non-Cacheable |
| Date          | Now       | Now           |
| Expires       | Time      | -             |
| Last-Modified | Time      | Time          |
| Cache-Control | max-age=# | no-cache      |
| Pragma        | -         | no-cache      |
Be sure to set the Expires header on anything that is cacheable.
Look at the HTTP/1.1 specification for details.
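Here's how I'd turn the header table into code. This is my own sketch, not the speaker's: the class CacheHeaders and its methods are hypothetical, and it just builds a map of header names to values that you would then copy onto the response. Dates use the fixed GMT format HTTP expects:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

final class CacheHeaders {
    /** Formats a timestamp the way HTTP date headers expect (always GMT). */
    static String httpDate(long millis) {
        SimpleDateFormat f =
            new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("GMT"));
        return f.format(new Date(millis));
    }

    /** Headers for a page that may be cached until the given expiration time. */
    static Map<String, String> cacheable(long now, long lastModified, long expires) {
        Map<String, String> h = new LinkedHashMap<>();
        h.put("Date", httpDate(now));
        h.put("Expires", httpDate(expires));
        h.put("Last-Modified", httpDate(lastModified));
        h.put("Cache-Control", "max-age=" + Math.max(0, (expires - now) / 1000));
        return h;
    }

    /** Headers for a page that must not be cached. */
    static Map<String, String> nonCacheable(long now, long lastModified) {
        Map<String, String> h = new LinkedHashMap<>();
        h.put("Date", httpDate(now));
        h.put("Last-Modified", httpDate(lastModified));
        h.put("Cache-Control", "no-cache");
        h.put("Pragma", "no-cache");
        return h;
    }
}
```

Yes, this creates a formatter and a map per call, which rather goes against the rest of the talk. In a real servlet you'd compute the cacheable set once per page and reuse it.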
I missed all the code for the following FastServlet example, but I think I have it in my
notes from last year.
class FastServlet extends HttpServlet
{
    void doGet(request, response)
    {
        String uri = request.getRequestURI();
        Servable servable;
        if (!cache.contains(uri))
        {
            ...
A Servable is something that can be sent as an HTTP response and implements this interface:
interface Servable
{
    void emit(OutputStream out);
    String getContentType();
    long getExpiration();
    long getLastModified();
}
For designing an effective cache, keep lots in memory but don't blow the heap. You've
never smelt anything worse than a heap on fire. Use weak and soft references for proper
gastro-intestinal throughput. Try WeakHashMap with a most recently used list. You can go
to stuff you're most likely to need quickly, but the GC can clean up stuff whenever it needs
to.
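I didn't get the speaker's cache code, so here is my own guess at the shape of it. One deliberate swap: WeakHashMap holds its keys weakly, so for a page cache I've used SoftReference values instead, plus a small strongly-held most-recently-used map. The class name SoftCache is made up:

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class SoftCache<K, V> {
    // Everything cached, held softly so the GC may reclaim it under pressure.
    private final Map<K, SoftReference<V>> all = new HashMap<>();
    // The most recently used entries, held strongly so they stay put.
    private final Map<K, V> recent;

    SoftCache(final int recentSize) {
        // Access-ordered LinkedHashMap that drops its eldest entry when full.
        recent = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > recentSize;
            }
        };
    }

    synchronized void put(K key, V value) {
        all.put(key, new SoftReference<>(value));
        recent.put(key, value);
    }

    /** Returns the cached value, or null if it was never cached or was collected. */
    synchronized V get(K key) {
        V v = recent.get(key);
        if (v != null) {
            return v;
        }
        SoftReference<V> ref = all.get(key);
        v = (ref == null) ? null : ref.get();
        if (v == null) {
            all.remove(key);      // collected or never cached
        } else {
            recent.put(key, v);   // promote back into the MRU set
        }
        return v;
    }
}
```

A null from get() just means "go rebuild the page", so callers never need to know whether the GC ate it or it was never there.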
A page is a servable that holds a text response. Create a page at the top level, mostly a
StringBuffer, and pass it down into methods that add text. Use placeholder objects for dynamic
elements.
class Page implements Servable
{
...more code that I missed
Back to making HelloWorld 3 times faster. Converting chars to bytes is very slow. Always
use OutputStream, never PrintWriter. It lets you write characters if you want. It also lets
you write bytes if you want.
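class HelloWorld extends HttpServlet
{
    private String html = " ... ";
    private byte[] b = html.getBytes();   // convert chars to bytes once, up front
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws IOException
    {
        OutputStream socket;
        socket = response.getOutputStream();
        ...
    }
}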
I think I have the above in its entirety from last year's notes if you need it.
To speed up your cache, individually cache small components of your pages that are
repeated often, preferably as bytes. Store expiration times with every single piece of
data you cache. As an example, if your cache fails to get a message saying that the data
has been changed in the database, then it should automatically expire that data at some point.
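A sketch of what "store expiration times with every piece of data" could look like. Again this is mine, not the speaker's: ExpiringCache is a hypothetical name, and I pass the current time in as a parameter purely so the behavior is easy to test:

```java
import java.util.HashMap;
import java.util.Map;

final class ExpiringCache<K, V> {
    /** A cached value together with the time it stops being trustworthy. */
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();

    synchronized void put(K key, V value, long ttlMillis, long nowMillis) {
        entries.put(key, new Entry<>(value, nowMillis + ttlMillis));
    }

    /** Returns the value, or null if it is missing or past its expiration. */
    synchronized V get(K key, long nowMillis) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (nowMillis >= e.expiresAtMillis) {
            entries.remove(key);   // stale: force a reload from the database
            return null;
        }
        return e.value;
    }
}
```

Even if the "data changed" message never arrives, the worst case is that a stale value lives for one time-to-live period.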
A good database connection pool should expire connections at regular intervals because
servlets are much longer lived than most databases like. Also make sure your connection pool
allows on-the-fly parameter changes and multiple databases.
For partitioning your database, separate read-only data from read/write data.
This session will cover how to generate Java classes on the fly within your code in
case you forgot to write a class for something when you developed it. The code can finish
writing itself.
This session is in one of the largest rooms for this year's conference and it's
completely packed.
The easiest way to generate a class is to use the Java compiler. (Note to self: Start
using the compiler.) The J2EE reference implementation builds all your EJB stubs this way.
This is OK in cases when you can afford to wait several hours for your code to compile.
Invoking com.sun.tools.javac.Main() is much faster than Runtime.exec(). Load the class in
its own class loader so it can be garbage collected when you're done with it.
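For the record, here is roughly what the compile-and-load approach looks like. I've swapped in the javax.tools API (standard since Java 6, so later than this talk) rather than calling com.sun.tools.javac.Main directly, and the names CompileOnTheFly and GeneratedHello are made up. It needs a JDK, not just a JRE:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class CompileOnTheFly {
    /** Compiles one source file into a temp directory and loads it in its own loader. */
    static Class<?> compileAndLoad(String className, String source) throws Exception {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no compiler available; run on a JDK");
        }
        Path dir = Files.createTempDirectory("generated");
        Path file = dir.resolve(className + ".java");
        Files.write(file, source.getBytes(StandardCharsets.UTF_8));
        int status = javac.run(null, null, null, "-d", dir.toString(), file.toString());
        if (status != 0) {
            throw new IllegalStateException("compilation failed");
        }
        // A dedicated loader: drop it and the class, and both can be collected.
        URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() });
        return loader.loadClass(className);
    }

    /** Generates a tiny Callable, compiles it, and invokes it through the interface. */
    static String demo() {
        String src =
            "public class GeneratedHello implements java.util.concurrent.Callable<String> {\n"
          + "    public String call() { return \"hello\"; }\n"
          + "}\n";
        try {
            Class<?> c = compileAndLoad("GeneratedHello", src);
            Callable<?> hello = (Callable<?>) c.getDeclaredConstructor().newInstance();
            return String.valueOf(hello.call());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The generated class implements an interface the calling code already knows (Callable here), which is what lets you use it without reflection once it's loaded.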
Building It In Five Easy Steps
- Build a prototype by hand.
- Write the unit test: Does the prototype solve the problem?
- Disassemble the prototype with javap -c
- Use a bytecode toolkit to generate the prototype.
- Generalize your code for any solution.
Define an interface first so you can program against it. Classes implementing the
interfaces will be generated so you can't compile against them. Define a class implementing
the interface and make sure that it works. Ensure that your prototype isn't naïve.
Now we're at Step 3. Run javap -c ClassNameHere. The speaker then did a straightforward
walkthrough of some Java bytecode showing which bytecode is executed as you step through the
Java code. It's quite easy and not hard at all. You can generally infer what instructions
do by looking at the Java source code.
We have two steps remaining. Write Java code that outputs the bytecode on the fly.
Then generalize it for any number of fields, loops, etc. Utilize a bytecode toolkit to
make your job a lot easier. There are a lot of half-baked bytecode toolkits on the web.
The speaker suggests using these toolkits:
Your toolkit will expose an API for you to build your class file. The speaker's slides
showed some examples for this and it was quite easy to do, but he went through them too fast
for me to dictate. The API has methods that correspond directly to bytecode instructions. You
create a class object, then create a method object for it, then make API calls to add bytecode
instructions using your decompiled class as an example.
You have to write a custom classloader to load your class. (Look at his slides on the web
for this.)
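I haven't seen his slides yet, so this is my own minimal take on such a loader. The class BytesClassLoader and its reload helper are hypothetical. Since I have no bytecode toolkit output to hand, the demo helper just re-reads an existing class file and defines it a second time in a fresh loader; generated bytes would go through define() the same way:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class BytesClassLoader extends ClassLoader {
    BytesClassLoader(ClassLoader parent) {
        super(parent);
    }

    /** Turns a finished class file image into a Class object. */
    Class<?> define(String name, byte[] classFile) {
        return defineClass(name, classFile, 0, classFile.length);
    }

    /** Demo only: re-reads an existing class file and defines it in a new loader. */
    static Class<?> reload(Class<?> original) {
        String resource = "/" + original.getName().replace('.', '/') + ".class";
        try (InputStream in = original.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("no class file found for " + original);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            BytesClassLoader loader = new BytesClassLoader(original.getClassLoader());
            return loader.define(original.getName(), out.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The copy has the same name as the original but is a different Class object, because a class's identity is its name plus its loader. That is also why throwing away the loader lets the generated class be unloaded.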
To do all of this you want to pick a simple version of your most complex case as the example
to start with.
Reflection was made much faster in 1.4 by generating small classes for method.invoke calls
on the fly and letting the JIT compiler optimize them.
Java Virtual Machine Specification
http://java.sun.com/docs/books/vmspec/
Example code
http://dynclass-eg.sourceforge.net (It may not be until after the conference that the example code is put here.)
A question during the Q&A session asked about bytecode injection. The speaker and
one of his cohorts didn't know of anyone doing this. However, the CTO of VisiVue
mentioned in his BOF session on Tuesday night (see my notes for the New Debugging and
Visualization Technologies for Java) that their debugger injects bytecode into a program
in order to know everything that's going on.
This was a very interesting talk and I look forward to making my code write itself.