[Performance] Debugging Load and Memory Leak Issues


Debugging a problem in production is important but difficult, because many factors can contribute to it. Users may need to isolate a few parameters at a time to identify the contributing factor.

Possible causes

a/ Third-party code deployed with the server, such as JDBC drivers, or custom code such as JavaScript
b/ Java VM version; check whether the relevant patches are applied
c/ Load from report rendering
d/ Incorrect memory configuration, i.e. the total Java memory configured exceeds the physical RAM of the production server. This can cause a slowdown in production.
e/ Improper sizing
f/ Defects in our own code, etc.


The cause could be a combination of several of these factors.

1) Environment information


To help us investigate further, customers should send the following information to us (support@elixirtech.com):

Elixir Repertoire Server
- Version
- How the server is started (startServer.bat, startServer.sh, or as a system service)
- Configuration, such as the bat/sh file (heap memory settings) or wrapper.conf for the system service configuration
- Type of rendering when the incident occurred (PDF, Excel, etc.)
- Log files from the time of the incident (server.log, activity.log, etc.)
- The template files (and the corresponding data sources, if any)

Other information
- Java VM version (1.5, 1.6, etc.)
- Database server and drivers

System architecture
- CPU
- Operating system (32-bit or 64-bit)
- Number of servers and load balancers, and their configuration


Type of load during the incident, such as
- Number of concurrent clients
- Number of pages per rendered report, etc.

2) Tips for checking the logs


- The load at the time of rendering and any errors can be identified from the logs.
- Check which report was being rendered, and see whether the issue can be reproduced with the same report.

Resolution
The findings may point to a need to increase the JVM memory, etc.

3) Review the current system/network architecture for potential bottlenecks

- Check whether there is a bottleneck in the current system or network design.
- Check whether sufficient memory is given to the system. Typically you should reserve 20% - 30% of the memory for operating system functions.
- Tuning may be needed:
http://java.sun.com/performance/reference/whitepapers/tuning.html
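The 20% - 30% rule above can be turned into a quick check. The following is a minimal sketch, not part of the product: the class name HeapBudget and the default of 2048MB physical RAM are assumptions for illustration, and the physical RAM is passed in by hand rather than detected.

```java
class HeapBudget {
    // Largest total Java heap (in MB) that still leaves osReservePercent
    // of the physical RAM to the operating system.
    static long maxHeapMb(long physicalMb, int osReservePercent) {
        return physicalMb * (100 - osReservePercent) / 100;
    }

    public static void main(String[] args) {
        // Assumption: physical RAM in MB is given as the first argument.
        long physicalMb = args.length > 0 ? Long.parseLong(args[0]) : 2048;
        // Maximum heap of the JVM running this check (its own -Xmx setting).
        long configuredHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);

        System.out.println("Heap budget with 30% reserved: " + maxHeapMb(physicalMb, 30) + "MB");
        System.out.println("Heap budget with 20% reserved: " + maxHeapMb(physicalMb, 20) + "MB");
        System.out.println("Configured max heap of this JVM: " + configuredHeapMb + "MB");
    }
}
```

If several Java processes run on the same machine, the sum of their heap settings is what should be compared against the budget.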

4) Get heap and thread dumps


http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/memleaks.html

The following JVM option (described in the above link) may be useful:
-XX:+HeapDumpOnOutOfMemoryError

You can collect the heap dump and pass it to development for analysis.
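As an illustration only, the standard Java management API can report whether the option is active and produce dumps on demand. This sketch assumes a HotSpot-based JVM (the com.sun.management classes are HotSpot-specific), it inspects the JVM it runs in rather than a remote server, and the class name DumpSketch is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

class DumpSketch {
    static HotSpotDiagnosticMXBean hotspot() {
        return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    }

    // True if this JVM was started with -XX:+HeapDumpOnOutOfMemoryError.
    static boolean heapDumpOnOomEnabled() {
        return Boolean.parseBoolean(
                hotspot().getVMOption("HeapDumpOnOutOfMemoryError").getValue());
    }

    // Text dump of all live threads. ThreadInfo.toString() prints only a
    // limited number of stack frames per thread.
    static String threadDump() {
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            sb.append(info);
        }
        return sb.toString();
    }

    // Writes a heap dump of live objects. The file must not already exist;
    // recent JDKs also expect the name to end in ".hprof".
    static void heapDump(String file) throws IOException {
        hotspot().dumpHeap(file, true);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("HeapDumpOnOutOfMemoryError enabled: " + heapDumpOnOomEnabled());
        System.out.print(threadDump());
        if (args.length > 0) {
            heapDump(args[0]); // e.g. a path ending in .hprof
        }
    }
}
```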

- Migration issue.
In the older ER4 version, the server is multi-process, whereas ER7 is a single process with many rendering threads. The memory calculation is therefore different, and it is recommended to resize the servers when migrating.
  
Other useful tools for monitoring memory

http://java.sun.com/developer/technicalArticles/J2SE/monitoring/
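For a rough idea of what such monitoring reports, the sketch below prints heap usage and garbage collection counters using the standard java.lang.management API. It is a minimal example with a hypothetical class name (MemoryWatch) and only observes the JVM it runs in; the tools in the link above can attach to a running server instead.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class MemoryWatch {
    // One-line summary of heap usage and GC activity for this JVM.
    static String snapshot() {
        long mb = 1024 * 1024;
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        StringBuilder sb = new StringBuilder();
        sb.append("heap used=").append(heap.getUsed() / mb).append("MB");
        sb.append(" committed=").append(heap.getCommitted() / mb).append("MB");
        // getMax() returns -1 when no maximum is defined.
        sb.append(" max=").append(heap.getMax() < 0 ? "undefined" : (heap.getMax() / mb) + "MB");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(" | ").append(gc.getName())
              .append(" count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples one second apart.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000);
        }
    }
}
```

Heap usage that keeps climbing across samples even after collections have run is one common sign of a leak worth investigating with a heap dump.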


5) Some suggested resolutions

1/ Check whether the real cause is a misconfigured server. It is possible that proper sizing was not carried out against the production requirements.

2/ Tune the server and apply Java best practices.
http://java.sun.com/performance/reference/whitepapers/tuning.html

3/ Modify the architecture (after you have done 1).


4/ Tune the report or other configurable items such as cubes.
- For reports, you can turn off page count to keep memory usage low if you do not need the page count.
- Turn off Table mode and have the SQL pre-sort the data to match the grouping order in the template, with sorting on the grouping set to None, if you do not need the report engine to do the sorting and grouping.
This should cut the memory requirement down to the theoretical minimum, i.e. just 1 record at a time, as there is no need to cache data in memory for sorting.
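To show why pre-sorted data needs only one record at a time, here is a generic sketch of group subtotals computed over rows that already arrive in group order. It is not the report engine's implementation; the class name StreamingGroups and the {groupKey, amount} row layout are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class StreamingGroups {
    // Rows are {groupKey, amount} and must already be sorted by groupKey.
    // Only the current key and running sum are held, never the whole data set.
    static List<String> subtotals(Iterator<String[]> sortedRows) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        long sum = 0;
        while (sortedRows.hasNext()) {
            String[] row = sortedRows.next();
            if (currentKey != null && !currentKey.equals(row[0])) {
                out.add(currentKey + "=" + sum); // group changed: emit subtotal
                sum = 0;
            }
            currentKey = row[0];
            sum += Long.parseLong(row[1]);
        }
        if (currentKey != null) {
            out.add(currentKey + "=" + sum); // emit the last group
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"A", "1"},
                new String[] {"A", "2"},
                new String[] {"B", "5"});
        System.out.println(subtotals(rows.iterator()));
    }
}
```

If the rows were not pre-sorted, all of them would have to be cached and sorted before the first subtotal could be emitted, which is where the extra memory goes.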

5/ As a temporary solution, restart the server so that the memory issue is cleared while we try to debug the problem.


6) Mapping Version 4 memory to Version 7 for reporting

A) Version 4 consists of 

i) One listener process, whose memory is set in the batch file.
ii) N child processes, where the memory per child process is set in the configuration file.

Total memory needed is 1 x L (listener process) + N x P (child processes), where L is the memory set for the listener and P is the memory set per child process.

Though we need X amount of memory for the whole report server, an individual Java process will never exceed the memory value set for that process. If the largest child process is set at 256MB, that child process will never grow beyond 256MB even at full load; memory management across the processes is left to the OS to handle natively. We also do not have to worry too much about Java tuning parameters and Garbage Collection (GC).

It is easier to estimate the memory needed per report, as only one report is rendered in each process.

So typically for an ER4 dual-engine server

A) 1 listener process == 128MB
B) 2 engines == 256 x 2 MB


The total memory is 128 + 256 x 2 = 640MB. But it is still "safe" if you have only 512MB on your hardware server, as a single Java process does not exceed 256MB.
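The ER4 calculation can be written out as a small sketch. The class and method names are hypothetical; the figures are the ones from the dual-engine example above.

```java
class Er4Memory {
    // Total configured memory: one listener plus N child processes.
    static int totalMb(int listenerMb, int children, int childMb) {
        return listenerMb + children * childMb;
    }

    // Upper bound on what any single Java process can grow to.
    static int largestProcessMb(int listenerMb, int childMb) {
        return Math.max(listenerMb, childMb);
    }

    public static void main(String[] args) {
        System.out.println("Total configured: " + totalMb(128, 2, 256) + "MB");
        System.out.println("Largest single process: " + largestProcessMb(128, 256) + "MB");
    }
}
```

The second figure is the one that matters for the "safe" argument: no single process can grow past its own limit, and the OS manages memory between the processes.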


B) Version 7 consists of only threads to handle the incoming requests.

i) N rendering threads for reports (only). There is also a thread for the queue, but it can be ignored for now.

Total memory needed  is  N x P (Reporting threads)

For a threaded application the situation is different. At full load, if the server is set for 512MB (per thread 512MB), the full 512MB will need to be committed for rendering the reports concurrently. So if the hardware server has only 512MB, it will run out of physical memory to allocate. Garbage collection will kick in and compete with the OS for resources to free memory.

The only approach in this case is to fine-tune the Java VM garbage collection options of the server. We may be able to lower the overall memory to 80% of the total, because with in-process rendering some of the memory is shared between the threads. But this should only be done after load testing.

If, besides Version 7 reporting, there are other services running, such as scheduling, they will also affect the server memory requirement. If the server is used only for plain reports, such services can be disabled. Complications arise when running Ad hoc Cube or Perceptive, as there are more things to consider.


Example

To translate a dual-engine Version 4 server into a Version 7 server with 2 rendering threads, i.e. from

A) A dual ER4 server, which has 1 listener (128MB) + 2 child processes (256MB per process), which works out to 640MB

to

B) Version 7, which has 2 threads == 256 x 2 MB == 512MB

This means that, effectively, the 1 Java process will need 512MB. You will have to allocate actual physical memory for it; otherwise you will run into resource contention problems. We can only hope that there will be a mixture of small and large reports being rendered. But it is possible to have a long-running report which grabs all the resources.

So this will be an issue when very large reports are being rendered, e.g. 512MB per report rendering. For Version 4, it is safe to have a 2GB server with 5 child processes of 512MB each. But for Version 7, this means that you will allocate 5 x 512MB of memory, i.e. 2560MB is needed. Two problems are now encountered:
a) It exceeds the hardware memory.
b) If the OS is 32-bit and has only 2GB of RAM, then you cannot increase your hardware RAM beyond that.

The only way is to reduce the number of concurrent threads so that the total memory stays at around 70% of the hardware memory or below, i.e. 3 reporting threads instead of 5.
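The thread count can be estimated with a short calculation. This is a sketch under stated assumptions: the class name ThreadSizing is hypothetical, and the sharedPercent parameter applies the 80% factor for memory shared between threads mentioned earlier, which should only be relied on after load testing. Counting the full 512MB per thread, only 2 threads fit within 70% of 2GB; the figure of 3 threads is reached when the 80% factor is applied.

```java
class ThreadSizing {
    // Number of rendering threads that fit in budgetPercent of the hardware
    // memory, where each thread effectively needs sharedPercent of perThreadMb.
    static int maxThreads(int hardwareMb, int budgetPercent, int perThreadMb, int sharedPercent) {
        int budgetMb = hardwareMb * budgetPercent / 100;
        int effectivePerThreadMb = perThreadMb * sharedPercent / 100;
        return budgetMb / effectivePerThreadMb;
    }

    public static void main(String[] args) {
        // 5 threads at 512MB each would need 2560MB, more than the 2GB server has.
        System.out.println("Needed for 5 threads: " + (5 * 512) + "MB");
        // Full 512MB per thread within 70% of 2048MB.
        System.out.println("Threads, no sharing: " + maxThreads(2048, 70, 512, 100));
        // Assuming the 80% sharing factor holds under load.
        System.out.println("Threads, 80% factor: " + maxThreads(2048, 70, 512, 80));
    }
}
```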