Thursday, May 12, 2011

Web Application and App Server Performance Tuning Tips

Source content is from Java Performance Tuning

The following pages have their detailed tips extracted below

•J2EE Application server performance
•Tuning IBM's WebSphere product
•WebSphere V3 Performance Tuning Guide
•Weblogic tuning (generally applicable Java tips extracted)
•Overview of common application servers
•Web application scalability.
•J2EE Application servers
•Load Balancing Web Applications
•J2EE clustering
•"EJB2 clustering with application servers"
•Choosing an application server
•Choosing a J2EE application server, emphasizing the importance of performance issues
•Implementing clustering on a J2EE web server (JBoss+Jetty)
•Tuning tips intended for Sun's "Web Server" product, but actually generally applicable.
•Various tips
•iPlanet Web Server guide to servlets, with a section at the end on "Maximizing Servlet Performance".
•Sun community chat on iPlanet
•Article on high availability architecture

--------------------------------------------------------------------------------

The following detailed tips have been extracted from the raw tips page

J2EE Application server performance (Page last updated April 2001, Added 2001-04-20, Author Misha Davidson, Publisher Java Developers Journal). Tips:
•Good performance has sub-second latency (response time) and hundreds of (e-commerce) transactions per second.
•Avoid n-way database joins: every join has a multiplicative effect on the amount of work the database has to do. The performance degradation may not be noticeable until large datasets are involved.
•Avoid bringing back thousands of rows of data: this can use a disproportionate amount of resources.
•Cache data when reuse is likely.
•Avoid unnecessary object creation.
•Minimize the use of synchronization.
•Avoid using the SingleThreadModel interface for servlets: write thread-safe code instead.
•ServletRequest.getRemoteHost() is very inefficient, and can take seconds to complete the reverse DNS lookup it performs.
•OutputStream can be faster than PrintWriter. JSPs are generally only slower than servlets when returning binary data, since JSPs always use a PrintWriter, whereas servlets can take advantage of a faster OutputStream.
•Excessive use of custom tags may create unnecessary processing overhead.
•Using multiple levels of BodyTags combined with iteration will likely slow down the processing of the page significantly.
•Use optimistic transactions: write to the database using WHERE clauses containing the old data, to check that the data has not already been overwritten by another transaction (see the JDBC sketch after this list). However, note that optimistic transactions can lead to worse performance if many transactions fail.
•Use lazy-loading of dependent objects.
•For read-only queries involving large amounts of data, avoid EJB objects and use JavaBeans as an intermediary to access, manipulate, and store the data for JSP access.
•Use stateless session EJBs to cache and manage infrequently changed data. Update the EJB occasionally.
•Use a dedicated session bean to perform and cache all JNDI lookups in a minimum number of requests.
•Minimize interprocess communication.
•Use clustering (multiple servers) to increase scalability.
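
To illustrate the optimistic-transaction tip above, here is a minimal JDBC sketch; the table and column names are hypothetical. The UPDATE succeeds only if the row still holds the value that was originally read, so a count of zero updated rows means another transaction changed the data first and the caller should retry or abort.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class OptimisticUpdate {
    /**
     * Updates a product price only if the row still contains the price this
     * transaction originally read (the optimistic check).
     */
    public static boolean updatePrice(Connection con, long productId,
                                      double oldPrice, double newPrice) throws SQLException {
        String sql = "UPDATE product SET price = ? WHERE id = ? AND price = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setDouble(1, newPrice);
            ps.setLong(2, productId);
            ps.setDouble(3, oldPrice);   // the old value read earlier acts as the optimistic check
            return ps.executeUpdate() == 1;
        }
    }
}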

Tuning IBM's WebSphere product. White paper: "Methodology for Production Performance Tuning". Only non-product specific Java tips have been extracted here. (Page last updated September 2000, Added 2001-01-19, Author Gennaro (Jerry) Cuomo, Publisher IBM). Tips:
•A size restricted queue (closed queue) allows system resources to be more tightly managed than an open queue.
•The network provides a front-end queue. A server should be configured to use the network queue as its bottleneck, i.e. only accept a request from the network when there are sufficient resources to process the request. This reduces the load on an app server. However, sufficient requests should be accepted to ensure that the app server is working at maximum capacity, i.e. try not to let a component sit idle while there are still requests that could be accepted, even if other components are fully loaded.
•Try to balance the workload of the various components.
•[Paper shows a nice throughput curve giving recommended scaling behavior for a server]
•The desirable target bottleneck is the CPU, i.e. a server should be tuned until the CPU is the remaining bottleneck. Adding CPUs is a simple remedy to this.
•Use connection pools and cached prepared statements for database access.
•Object memory management is particularly important for server applications. Typically garbage collection could take between 5% and 20% of the server execution time. Garbage collection statistics provide a useful monitor to determine the server's "health". Use the verbosegc flag to collect basic GC statistics.
•GC statistics to monitor are: total time spent in GC (target less than 15% of execution time); average time per GC; average memory collected per GC; average objects collected per GC.
•For long lived server processes it is particularly important to eliminate memory leaks (references retained to objects and never released).
•Use -ms and -mx to tune the JVM heap. Bigger means more space, but GC takes longer. Use the GC statistics to determine the optimal setting, i.e. the setting which provides the minimum average overhead from GC.
•The ability to reload classes is typically achieved by testing a filesystem timestamp. This check should be done at set intervals, not on every request, as the filesystem check is an expensive operation (a minimal sketch follows).
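
A minimal sketch of the interval-throttled reload check described in the last tip; the class, field names, and interval handling are assumptions for illustration, not WebSphere's actual mechanism.

import java.io.File;

/** Checks a class file's timestamp at most once per interval, not on every request. */
public class ReloadChecker {
    private final File classFile;
    private final long checkIntervalMillis;
    private long lastCheckTime;
    private long lastSeenTimestamp;

    public ReloadChecker(File classFile, long checkIntervalMillis) {
        this.classFile = classFile;
        this.checkIntervalMillis = checkIntervalMillis;
        this.lastSeenTimestamp = classFile.lastModified();
    }

    /** Returns true if the file changed, touching the filesystem only when the interval has elapsed. */
    public synchronized boolean needsReload() {
        long now = System.currentTimeMillis();
        if (now - lastCheckTime < checkIntervalMillis) {
            return false;                       // skip the expensive filesystem call
        }
        lastCheckTime = now;
        long current = classFile.lastModified();
        if (current != lastSeenTimestamp) {
            lastSeenTimestamp = current;
            return true;
        }
        return false;
    }
}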

WebSphere V3 Performance Tuning Guide (Page last updated March 2000, Added 2001-01-19, Authors Ken Ueno, Tom Alcott, Jeff Carlson, Andrew Dunshea, Hajo Kitzhöfer, Yuko Hayakawa, Frank Mogus, Colin D. Wordsworth, Publisher IBM). Tips:
•[The Red book lists and discusses tuning parameters available to Websphere]
•Run an application server and any database servers on separate server machines.
•JVM heap size: -mx, -ms [-Xmx, -Xms]. As a starting point for a server based on a single JVM, consider setting the maximum heap size to 1/4 the total physical memory on the server and setting the minimum to 1/2 of the maximum heap. Sun recommends that ms be set to somewhere between 1/10 and 1/4 of the mx setting. They do not recommend setting ms and mx to be the same. Bigger is not always better for heap size. In general increasing the size of the Java heap improves throughput to the point where the heap no longer resides in physical memory. Once the heap begins swapping to disk, Java performance drastically suffers. Therefore, the mx heap setting should be set small enough to contain the heap within physical memory. Also, large heaps can take several seconds to fill up, so garbage collection occurs less frequently which means that pause times due to GC will increase. Use verbosegc to help determine the optimum size that minimizes overall GC.
•In some cases turning off asynchronous garbage collection ("-noasyncgc", not always available to all JVMs) can improve performance.
•Setting the JVM stack and native thread stack size (-oss and -ss) too large (e.g. greater than 2MB) can significantly degrade performance.
•When security is enabled (e.g. SSL, password authentication, security contexts and access lists, encryption, etc) performance is degraded by significant amounts.
•One of the most time-consuming procedures of a database application is establishing a connection to the database. Use connection pooling to minimize this overhead (a minimal sketch follows).
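
As an illustration of connection pooling, here is a minimal sketch using a container-managed javax.sql.DataSource; the JNDI name, class, and table are hypothetical. getConnection() borrows a connection from the pool, and close() returns it to the pool instead of tearing it down.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class CustomerDao {
    private final DataSource dataSource;

    public CustomerDao() throws NamingException {
        // Hypothetical JNDI name; use whatever name the pool is bound under in your container.
        this.dataSource = (DataSource) new InitialContext().lookup("java:comp/env/jdbc/AppDataSource");
    }

    public String findName(long customerId) throws SQLException {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement("SELECT name FROM customer WHERE id = ?")) {
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }   // closing the connection here returns it to the pool
    }
}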

Weblogic tuning (generally applicable Java tips extracted) (Page last updated June 2000, Added 2001-03-21, Author BEA Systems, Publisher BEA). Tips:
•Response time is affected by: contention and wait times, particularly for shared resources; and software and hardware component performance, i.e. the amount of time that resources are needed.
•A well-designed application can increase performance by simply adding more resources (for instance, an extra server).
•Use clustered or multi-processing machines; use a JIT-enabled JVM; use Java 2 rather than JDK 1.1;
•Use -noclassgc. Use the maximum possible heap size that is also small enough to prevent the JVM from swapping (e.g. 80% of the RAM left over after other required processes). Consider starting with the minimum initial heap size so that the garbage collector doesn't suddenly encounter a full heap with lots of garbage. Benchmarkers sometimes like to set the heap as high as possible to completely avoid GC for the duration of the benchmark.
•Distributing the application over several server JVMs means that GC impact will be spread in time, i.e. the various JVMs will most likely GC at different times from each other.
•On Java 1.1 the most effective heap size is that which limits the longest GC incurred pause to the longest acceptable pause in processing time. This will typically require a reduction in the maximum heap size.
•Too many threads cause too much context switching; too few threads may underutilize the system. If n = number of threads and k = number of CPUs, then: n < k results in underutilized CPUs; n == k is theoretically ideal, but in practice each CPU will probably still be underutilized; n > k by a moderate number of threads is practically ideal; n > k by many threads can lead to significant performance degradation from context switching. Blocked threads count for less in these guidelines (see the pool-sizing sketch after this list).
•Symptoms of too few threads: the CPU is waiting to do work, but there is work that could be done; you cannot get 100% CPU utilization; all threads are blocked [on I/O] and runnable when you take an execution snapshot.
•Symptoms of too many threads: An execution snapshot shows that there is a lot of context switching going on in your JVM; Your performance increases as you decrease the number of threads.
•If many client connections are dropped or refused, the TCP listen queue may be too short.
•Try to avoid excessive cycling (creation/deletion or activation/passivation) of beans.
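
A minimal sketch of sizing a worker pool from the CPU count, following the "n > k by a moderate number of threads" guideline above. It uses the modern java.util.concurrent API rather than WebLogic's own thread configuration, and the factor of two is an assumption to be tuned empirically under load.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WorkerPool {
    /** Creates a fixed pool sized relative to the number of CPUs. */
    public static ExecutorService createPool() {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Moderate oversubscription keeps CPUs busy while some threads block on I/O,
        // without incurring excessive context switching.
        return Executors.newFixedThreadPool(cpus * 2);
    }
}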

Overview of common application servers. I've extracted the performance related features (Page last updated October 2001, Added 2001-10-22, Author Pieter Van Gorp, Publisher Van Gorp). Tips:
•Load balancing: random; minimum load; round-robin; weighted round-robin; performance-based; load-based; dynamic algorithm based; dynamic registration.
•Clustering. Additionally: distributed transaction management; in-memory replication of session state information; no single point of failure.
•Connection pooling.
•Caching. JNDI caching. Distributed caching with synchronization.
•Thread pooling.
•Configurable user Quality of Service.
•Analysis tools.
•Low system/memory requirements.
•Optimized subsystems (RMI, JMS, JDBC drivers, JSP tags & cacheable page fragments).
•Optimistic transaction support.

Web application scalability. (Page last updated June 2000, Added 2001-05-21, Author Billie Shea, Publisher STQE Magazine). Tips:
•Web application scalability is the ability to sustain the required number of simultaneous users and/or transactions, while maintaining adequate response times to end users.
•The first solution built with new skills and new technologies will always have room for improvement.
•Avoid deploying an application server that will cause embarrassment, or that could weaken customer confidence and business reputation [because of bad response times or lack of scalability].
•Consider application performance throughout each phase of development and into production.
•Performance testing must be an integral part of designing, building, and maintaining Web applications.
•There appears to be a strong correlation between the use of performance testing tools and the likelihood that a site would scale as required.
•Automated performance tests must be planned for and iteratively implemented to identify and remove bottlenecks.
•Validate the architecture: decide on the maximum scaling requirements and then performance test to validate the necessary performance is achievable. This testing should be done on the prototype, before the application is built.
•Have a clear understanding of how easily your configurations of Web, application, and/or database servers can be expanded.
•Factor in load-balancing software and/or hardware in order to efficiently route requests to the least busy resource.
•Consider the effects security will have on performance: adding a security layer to transactions will impact response times. Dedicate specific server(s) to handle secure transactions.
•Select performance benchmarks and use them to quantify the scalability and determine performance targets and future performance improvements or degradations. Include all user types such as "information-gathering" visitors or "transaction" visitors in your benchmarks.
•Perform "Performance Regression Testing": continuously re-test and measure against the established benchmark tests to ensure that application performance hasn?t been degraded because of the changes you?ve made.
•Performance testing must continue even after the application is deployed. For applications expected to perform 24/7 inconsequential issues like database logging can degrade performance. Continuous monitoring is key to spotting even the slightest abnormality: set performance capacity thresholds and monitor them.
•When application transaction volumes reach 40% of maximum expected volumes, it is time to start executing plans to expand the system

J2EE Application servers (Page last updated April 2001, Added 2001-04-20, Authors Christopher G. Chelliah and Sudhakar Ramakrishnan, Publisher Java Developers Journal). Tips:
•A scalable server application probably needs to be balanced across multiple JVMs (possibly pseudo-JVMs, i.e. multiple logical JVMs running in the same process).
•Performance of an application server hinges on caching, load balancing, fault tolerance, and clustering.
•Application server caching should include web-page caches and data access caches. Other caches include caching servers which "guard" the application server, intercepting requests and either returning those that do not need to go to the server, or rejecting or delaying those that may overload the app server.
•Application servers should use connection pooling and database caching to minimize connection overheads and round-trips.
•Load balancing mechanisms include: round-robin DNS (alternating different IP-addresses assigned to a server name); and re-routing mechanisms to distribute requests across multiple servers. By maintaining multiple re-routing servers and a client connection mechanism that automatically checks for an available re-routing server, fault tolerance is added.
•Using one thread per user can become a bottleneck if there are a large number of concurrent users.
•Distributed components should consider the proximity of components to their data (i.e., avoid network round-trips) and how to distribute any resource bottlenecks (i.e., CPU, memory, I/O) across the different nodes.

Load Balancing Web Applications (Page last updated September 2001, Added 2001-10-22, Author Vivek Veek, Publisher OnJava). Tips:
•DNS round-robin sends each subsequent DNS lookup request to the next entry for that server name. This provides a simple machine-level load-balancing mechanism, but is only appropriate for session independent or shared-session servers.
•DNS round-robin has no server load measuring mechanisms, so requests can still go to overloaded servers, i.e. the load balancing can be very unbalanced.
•Hardware load-balancers solve many of the problems of DNS round-robin, but introduce a single point of failure.
•A web server proxy can also provide load-balancing by redirecting requests to multiple backend webservers.
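
A minimal sketch of the round-robin idea behind such a proxy: requests are handed to backend servers in rotation. The backend addresses are hypothetical, and no load measurement is attempted, which is exactly the weakness noted above for plain round-robin schemes.

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Picks backend servers in round-robin order. */
public class RoundRobinSelector {
    private final List<String> backends;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinSelector(List<String> backends) {
        this.backends = backends;
    }

    /** Returns the next backend in rotation; counter overflow is handled by floorMod. */
    public String nextBackend() {
        return backends.get(Math.floorMod(next.getAndIncrement(), backends.size()));
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector(
                Arrays.asList("http://app1:8080", "http://app2:8080", "http://app3:8080"));
        for (int i = 0; i < 6; i++) {
            System.out.println(selector.nextBackend());   // cycles app1, app2, app3, app1, ...
        }
    }
}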

J2EE clustering (Page last updated August 2001, Added 2001-08-20, Author Abraham Kang, Publisher JavaWorld). Tips:
•Consider cluster-related and load balancing programming issues from the beginning of the development process.
•Load balancing has two non-application options: DNS (Domain Name System) round-robin or hardware load balancers. [Article discusses the pros and cons].
•To support distributed sessions, make sure all session-referenced objects are serializable, and store session state changes in a central repository (see the sketch after this list).
•Try to keep multiple copies of objects to a minimum.
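
A minimal sketch of session state suitable for distribution: the attribute class implements Serializable and holds only serializable fields, so the container can replicate it or store it centrally. The class, field names, and usage are hypothetical.

import java.io.Serializable;

/** Session attribute designed for distributed HttpSessions. */
public class CartItem implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String productId;
    private final int quantity;

    public CartItem(String productId, int quantity) {
        this.productId = productId;
        this.quantity = quantity;
    }

    public String getProductId() { return productId; }
    public int getQuantity() { return quantity; }
}

// Hypothetical servlet usage: changes are written back with setAttribute so the
// container notices them and can propagate the new state to the other nodes:
//   request.getSession().setAttribute("cartItem", new CartItem("SKU-42", 2));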

"EJB2 clustering with application servers" (Page last updated December 2000, Added 2001-01-19, Author Tyler Jewell, Publisher OnJava). Tips:
•[Article discusses multiple independent ways to load balance EJBs]

Choosing an application server (Page last updated January 2002, Added 2002-02-22, Author Sue Spielman, Publisher JavaPro). Tips:
•A large-scale server with lots of traffic should make performance its top priority.
•Performance factors to consider include: connection pooling; types of JDBC drivers; caching features and their configurability; CMP support.
•Inability to scale with reliable performance means lost customers.
•Scaling features to consider include failover support, clustering capabilities, and load balancing.

Choosing a J2EE application server, emphasizing the importance of performance issues (Page last updated February 2001, Added 2001-02-21, Author Steve Franklin, Publisher DevX). Tips:
•Application server performance is affected by: the JDK version; connection pooling availability; JDBC version and optimized driver support; caching support; transactional efficiency; EJB component pooling mechanisms; efficiency of webserver-appserver connection; efficiency of persistence mechanisms.
•Your application server needs to be load tested with scaling, to determine suitability.
•Always validate the performance of the app server on the target hardware with peak expected user numbers.
•Decide on what is acceptable downtime for your application, and ensure the app server can deliver the required robustness. High availability may require: transparent fail-over; clustering; load balancing; efficient connection pooling; caching; duplicated servers; scalable CPU support.

Implementing clustering on a J2EE web server (JBoss+Jetty) (Page last updated September 2001, Added 2001-10-22, Author Bill Burke, Publisher OnJava). Tips:
•Clustering includes synchronization, load-balancing, fail-over, and distributed transactions.
•[article discusses implementing clustering in an environment where clustering was not previously present].
•The different EJB commit options affect database traffic and performance. Option 'A' (read-only local caching) has the smallest overhead.
•Hardware load balancers are a simple and fast solution to distributing HTTP requests to clustered servers.

Tuning tips intended for Sun's "Web Server" product, but actually generally applicable. (Page last updated 1999, Added 2000-10-23, Author ? - a Sun document, Publisher Aikido). Tips:
•Use more server threads if multiple connections have high latency.
•Use keep-alive sockets for higher throughput.
•Increase server listen queues for high load or high latency servers.
•Avoid or reduce logging.
•Buffer logging output: aim for fewer than one physical write per log entry, i.e. batch multiple entries into each write.
•Avoid reverse DNS lookups.
•Write time stamps rather than formatted date-times (a buffered, raw-timestamp logging sketch appears after this list).
•Separate paging and application files.
•A high VM heap size may result in paging, but could avoid some garbage collections.
•Occasional very long GCs make the VM hang for that time, leading to variability in service quality.
•Doing GC fairly often and avoiding paging is more efficient.
•Security checks consume CPU resources. You will get better performance if you can turn security checking off.
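
A minimal sketch combining two of the tips above: log output is buffered so that many entries share one physical write, and each entry carries a raw timestamp instead of a formatted date-time. The class name, buffer size, and file handling are assumptions for illustration.

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

/** Buffers log output and records raw timestamps instead of formatted date-times. */
public class FastLogger {
    private final BufferedWriter out;

    public FastLogger(String path) throws IOException {
        // A 64 KB buffer lets many log entries accumulate before one physical write occurs.
        this.out = new BufferedWriter(new FileWriter(path, true), 64 * 1024);
    }

    public synchronized void log(String message) throws IOException {
        // A numeric timestamp avoids date formatting on every entry; it can be
        // converted to a readable date-time offline when the log is analyzed.
        out.write(System.currentTimeMillis() + " " + message);
        out.newLine();
    }

    public synchronized void close() throws IOException {
        out.close();   // flushes any buffered entries
    }
}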

Various tips. For web servers? (Page last updated 2000, Added 2000-10-23, Author ?, Publisher ?). Tips:
•Test multiple VMs.
•Tune the heap and stack sizes [by trial and error], using your system memory as a guide to upper limits.
•Keep the system file cache large. [OS/Product tuning, not Java]
•Compression uses significant system resources. Don't use it on a server unless necessary.
•Monitor thread utilization (see the sketch after this list). Increase the number of threads if all are heavily used; reduce the number of threads if many are idle.
•Empirically test for the optimal number of database connections.
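
A minimal sketch of monitoring thread utilization with the standard ThreadMXBean. The interpretation comments reflect the rule of thumb in the tip above; real tuning should be based on repeated samples taken under representative load.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Prints a rough breakdown of thread states as a starting point for thread tuning. */
public class ThreadUtilizationMonitor {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int runnable = 0, blocked = 0, waiting = 0;
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            switch (info.getThreadState()) {
                case RUNNABLE:      runnable++; break;
                case BLOCKED:       blocked++;  break;
                case WAITING:
                case TIMED_WAITING: waiting++;  break;
                default:            break;
            }
        }
        // Many waiting threads suggest the pool may be too large; mostly runnable
        // threads with spare CPU capacity suggest it may be too small.
        System.out.println("runnable=" + runnable + " blocked=" + blocked + " waiting=" + waiting);
    }
}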

iPlanet Web Server guide to servlets, with a section at the end on "Maximizing Servlet Performance". (Page last updated July 2000, Added 2001-02-21, Author ?, Publisher Sun). Tips:

•Try to optimize the servlet loading mechanism, e.g. by listing the servlet first in loading configurations.
•Tune the heap size.
•Keep the classpath short.

Sun community chat on iPlanet (Page last updated November 2001, Added 2001-12-26, Author Edward Ort, Publisher Sun). Tips:

•Optimal result caching (caching pages which have been generated) needs tuning, especially the timeout setting. Make sure the timeout is not too short.
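
A minimal sketch of result caching with a timeout, along the lines of the tip above; the data structure and timeout handling are assumptions for illustration rather than iPlanet's actual implementation.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Caches generated pages (or page fragments) for a configurable timeout. */
public class ResultCache {
    private static class Entry {
        final String content;
        final long expiresAt;
        Entry(String content, long expiresAt) { this.content = content; this.expiresAt = expiresAt; }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<String, Entry>();
    private final long timeoutMillis;

    public ResultCache(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;   // too short a timeout forces pages to be regenerated constantly
    }

    /** Returns the cached page, or null if it is missing or has expired. */
    public String get(String key) {
        Entry entry = cache.get(key);
        if (entry == null || entry.expiresAt < System.currentTimeMillis()) {
            return null;   // the caller regenerates the page and calls put()
        }
        return entry.content;
    }

    public void put(String key, String content) {
        cache.put(key, new Entry(content, System.currentTimeMillis() + timeoutMillis));
    }
}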

Article on high availability architecture. If the system isn't up when you need it, it's not performing. (Page last updated November 1998, Added 2000-10-23, Author Sam Wong, Publisher Sun). Tips:

•Eliminate all potential single-points-of-failure, basically with redundancy and automatic fail-over.
•Consider using the redundant components to improve performance, with a component failure causing decreased performance rather than system failure.
