There is simply no substitute for the experience of writing and tuning your own parallel programs.
Well, as you will very soon find out, the material presented here is a little unorganized. My apologies! For one thing, I've been dumping things here that I thought useful as I came across them; hopefully I will find some time in the future to sort them out. For another, I feel the experience and knowledge behind this material has drifted into the realm of "tacit knowledge", which is notoriously hard to explain (imagine teaching someone how to swim or ride a bike). However, I did come across a pretty well-organized blog on the same topic, in case you are interested.
Multi-threading and parallel programming
Since CoJava uses the Java fork/join framework, this post is mostly concerned with efficiently parallelizing fork/join computations in Java. The fork/join framework uses a thread pool in which a fixed number of threads are created. Each thread has a queue of tasks that are awaiting a chance to execute. When a task is started (forked), it is added to the queue of the thread that is executing its parent task. Because each thread can execute only one task at a time, each thread's queue can accumulate tasks that are not currently executing. Threads that have no tasks will attempt to steal a task from a thread whose queue has at least one task; this is called work stealing. By this mechanism, tasks are distributed to all of the threads in the thread pool. By using a thread pool with work stealing, a fork/join framework allows a relatively fine-grained division of the problem while creating only the minimum number of threads needed to fully exploit the available CPU cores. Typically, the thread pool will have one thread per available CPU core.
There are some rules of thumb for parallel programming in general (credit goes to the course website of CMU 15-418/618):
- Want at least as much work as parallel execution capability (e.g., the program should probably spawn at least as many tasks as there are cores)
- Want more independent work than execution capability to allow for good balance of all the work across the cores. "Parallel slack" = ratio of independent work to the machine's parallel execution capability (in practice, ~8 is a good ratio)
- But not so much independent work that the granularity becomes too small (too much slack incurs the overhead of managing fine-grained work)

The amount of work generated by a fork/join algorithm depends on a sequential cutoff: a condition (e.g., the array whose maximum value we are looking for has shrunk to 1000 elements) at which the algorithm switches from parallel processing to sequential processing (i.e., checking array elements one by one); a minimal sketch of such a task is shown below. When running time- and resource-intensive algorithms, it becomes important to tune this threshold for the system on which the algorithm runs, and to determine the amount of resources (e.g., CPU cores, memory) that you should request and allocate. But how? Let's first take a closer look at the fork/join framework.
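To make the cutoff idea concrete, here is a minimal sketch (my own example, not from CHAT/CoJava) of a RecursiveTask that finds the maximum of an array and falls back to a sequential scan once the subrange is at or below a threshold; the class name and threshold value are illustrative assumptions.
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class MaxTask extends RecursiveTask<Integer> {
    private static final int CUTOFF = 1000;   // sequential cutoff (tune per system)
    private final int[] data;
    private final int lo, hi;

    public MaxTask(int[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Integer compute() {
        if (hi - lo <= CUTOFF) {              // small enough: scan sequentially
            int max = Integer.MIN_VALUE;
            for (int i = lo; i < hi; i++) max = Math.max(max, data[i]);
            return max;
        }
        int mid = (lo + hi) >>> 1;
        MaxTask left = new MaxTask(data, lo, mid);
        MaxTask right = new MaxTask(data, mid, hi);
        left.fork();                          // queue the left half so idle threads can steal it
        int rightMax = right.compute();       // keep working on the right half in this thread
        return Math.max(left.join(), rightMax);
    }

    public static void main(String[] args) {
        int[] data = new java.util.Random(42).ints(10_000_000).toArray();
        ForkJoinPool pool = new ForkJoinPool(); // defaults to one worker per available core
        System.out.println(pool.invoke(new MaxTask(data, 0, data.length)));
    }
}
Raising or lowering CUTOFF is exactly the tuning knob discussed above: too small and you drown in task-management overhead, too large and some cores sit idle.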
Big data structures
The fastutil library provides memory-efficient, type-specific collections. Code example: http://marxsoftware.blogspot.com/2016/01/fastutil-java-collections.html
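As a small, hedged illustration (not taken from the post above), a primitive-keyed fastutil map avoids boxing every int into an Integer; the Int2IntOpenHashMap class comes from the fastutil library, while the surrounding code is just a sketch.
import it.unimi.dsi.fastutil.ints.Int2IntOpenHashMap;

public class FastutilDemo {
    public static void main(String[] args) {
        // A primitive int -> int hash map: no Integer boxing, much smaller footprint
        Int2IntOpenHashMap counts = new Int2IntOpenHashMap();
        counts.defaultReturnValue(0);             // value returned for missing keys
        int[] data = {1, 2, 2, 3, 3, 3};
        for (int x : data) {
            counts.put(x, counts.get(x) + 1);     // count occurrences without autoboxing
        }
        System.out.println(counts);               // e.g., {1=>1, 2=>2, 3=>3}
    }
}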
More about Fork-Join
Fork/join is usually used to handle recursive tasks. It was first introduced in Java 7 and has served its purpose well since then, probably because many large tasks are naturally recursive. The central piece of the framework is the ForkJoinPool, which is an ExecutorService and thus can accept asynchronous tasks and return a Future object. With the Future object, you can keep track of the status of the computation. What makes ForkJoinPool different from other ExecutorServices is that its worker threads keep checking each other's queues and "stealing" work from one another. Work stealing is a decentralized way to manage workload that aims to improve efficiency.
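Here is a minimal sketch of that ExecutorService view of ForkJoinPool: submit() hands off an asynchronous task and returns a Future that can be polled or joined. The task itself is a throwaway example, not anything from CHAT.
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;

public class SubmitDemo {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        ForkJoinPool pool = new ForkJoinPool();          // defaults to one worker per core
        Future<Long> sum = pool.submit((Callable<Long>) () -> {
            long s = 0;
            for (long i = 1; i <= 1_000_000; i++) s += i;
            return s;
        });
        while (!sum.isDone()) {                          // the Future lets us track the computation's status
            Thread.sleep(10);
        }
        System.out.println("Result: " + sum.get());      // retrieve the value once the task has finished
        pool.shutdown();
    }
}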
Memory handling in Java
Java handles its memory in the heap and the stack. In the heap the Java Virtual Machine (JVM) stores all objects created by the Java application; memory for new objects is allocated on the heap at run time. Instance variables live inside the object in which they are declared. Once your application has no reference to an object any more (i.e., no other object refers to it), the garbage collector of the JVM is allowed to delete the object and release its memory so that your application can use that memory again. If other objects still hold references to the object, the garbage collector cannot release it. A memory leak occurs when object references that are no longer needed are unnecessarily maintained.

The stack is where method invocations and local variables are stored. When a method is called, its stack frame is put onto the top of the call stack. The stack frame holds the state of the method, including which line of code is executing and the values of all local variables. The method at the top of the stack is always the currently running method for that stack. Each thread in a Java application has its own stack, which is used to hold return addresses, function/method call arguments, and so on. When a thread processes large structures via recursive algorithms, it may need a large stack for all those return addresses and such. With the Sun/Oracle JVM you can set the thread stack size via the -Xss parameter when you launch a Java application, e.g., java -Xss2000M -Xmx15g -jar ...jar. Exactly what value you should assign needs to be tuned/tested. Increase the -Xss value if you keep getting StackOverflowError, which means a thread needs more stack space than the setting allows. Decrease the value if you get "OutOfMemoryError: unable to create new native thread", which may happen when -Xss is set too large, i.e., there are too many threads, each with a large stack. Note that this error is different from "OutOfMemoryError: Java heap space": stack space is not allocated from the heap (which is controlled by the -Xms and -Xmx switches). In fact, the memory used by your JVM is more than what the -Xmx parameter specifies; the JVM uses more memory than just the heap. For example, Java methods, thread stacks, and native handles are allocated in memory separate from the heap, as are JVM internal data structures. The default thread stack size varies with the JVM (generally larger for 64-bit JVMs), the OS, and environment variables. You can check yours using the command
java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
or, to see the heap and stack defaults together:
java -XX:+PrintFlagsFinal -version | grep -iE 'HeapSize|ThreadStackSize'
Suppose the value is 512k; that means if your app uses 150 threads, 75 MB will be used for thread stacks. In some environments the default stack may be as large as 2 MB. With a large number of threads, this can consume a significant amount of memory that could otherwise be used by your application or the OS. The situation becomes even more complicated when you are working on a cluster and need to decide how many nodes and CPUs to request. To monitor the resource usage of your program, you can log on to the compute node (ssh username@computeNodeName) and then use the "htop" command. So far I've only monitored the use of memory, swap (ideally 0), and CPU load.

You can also use the Runtime class to monitor memory usage inside a Java program. This class provides some useful API functions:
- totalMemory() : Returns the total amount of memory in the Java virtual machine. The value returned by this method may vary over time, depending on the host environment. Note that the amount of memory required to hold an object of any given type may be implementation-dependent.
- maxMemory() : Returns the maximum amount of memory that the Java virtual machine will attempt to use (equal to the -Xmx value if you set it up). If there is no inherent limit then the value Long.MAX_VALUE will be returned.
- freeMemory() : Returns the amount of free memory in the Java Virtual Machine. Calling the gc method (garbage collection) may result in increasing the value returned by freeMemory.
For example, suppose you launch:
java -Xms64m -Xmx1024m -jar chat.jar
Your process starts with 64 MB of memory, and if and when it needs more (up to 1024 MB), it will allocate it. totalMemory() corresponds to the amount of memory currently allocated (and thus available) to the JVM for chat.jar. If the JVM needs more memory, it will lazily allocate it up to the maximum. If you run with -Xms1024m -Xmx1024m, the values you get from totalMemory() and maxMemory() will be equal.
Note that freeMemory() returns the currently allocated free memory, i.e., currently allocated space ready for new objects. It is NOT the total available free memory. To calculate the latter, first compute usedMemory = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory(), then totalFreeMemory = Runtime.getRuntime().maxMemory() - usedMemory.
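Putting those calls together, a small sketch for logging memory usage from inside a program might look like this (the class and method names are mine, not from CHAT):
public class MemoryReport {
    // Prints a one-line summary of JVM heap usage using the Runtime API.
    public static void print() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();   // memory occupied by objects not yet collected
        long totalFree = rt.maxMemory() - used;           // remaining headroom: free space plus room to grow
        System.out.printf("max=%dMB allocated=%dMB used=%dMB totalFree=%dMB%n",
                rt.maxMemory() >> 20, rt.totalMemory() >> 20, used >> 20, totalFree >> 20);
    }

    public static void main(String[] args) {
        print();
    }
}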
Use jconsole
jconsole is installed on the RENCI cluster, so I can launch it simply by typing "jconsole" at the command line, which opens its GUI. If you launched the Java application (using java -jar #.jar) with the sbatch command, you need to choose "Remote Process" and enter the hostname of the machine, the port of the application, and probably your credentials (i.e., username and password). To get the hostname, first run "squeue" or "scontrol show job" to see which machine the application is running on. Suppose it is "croatan-0-1": log onto that machine using "ssh username@machineName" and run "hostname" at the command line. The port number was specified when launching the application, as part of the Java command. To get jconsole connected to a JVM, you must start the JVM with JMX enabled by specifying the following options in the Java command (note the port number). The last option does not seem to be required in general, but it may be necessary on Ubuntu: without it an IOException may occasionally occur.
-Dcom.sun.management.jmxremote.port=7009
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.local.only=false
Below is an example of including the above options in the java launch command:
java -Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=7009 \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.local.only=false \
-Xmx6g -jar /home/linly/bin/chatV3_best.jar /projects/sequence_analysis/vol4/CHAT_simGWAS/newSimGWASData/CHATResources/CHAT_prep.xml
Run Java-based jobs on HPC
One of the senior members of our group, Dr. Chris Bizon, is helping me optimize CHAT. He told me that every job I submit to the cluster should aim for hour-ish wall time. Otherwise, with so many CHAT jobs (on the order of millions) finishing so quickly (in minutes), the cluster scheduler struggles to keep up. A symptom of this problem is that when you use "squeue" to check job status, you either get a "time-out" error or see many jobs in the "CG" (COMPLETING) state.

An HPC machine is shared by multiple users and usually has many cores, which ironically can sometimes cause the JVM to fail. You get an error message typically saying "there is insufficient memory for the Java Runtime Environment to continue. Cannot create GC thread. Out of system resources", and the system dumps an hs_err_pid#.log file to the directory containing your shell script. The ironic part is that this may actually happen because the system has too many resources: the JVM sizes its pool of parallel GC threads based on the number of cores it sees, not the number your job was granted. See this post. Now suppose you have a similar problem and would like to decrease the number of parallel GC threads. How do you figure out the appropriate number? Take a look at this post; the count can be set explicitly with -XX:ParallelGCThreads=<n>.
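For instance, an illustrative command (the jar name and thread count are placeholders, not the exact values used for CHAT) that caps the parallel GC threads might look like:
java -XX:ParallelGCThreads=4 -Xmx6g -jar chat.jar
Four is just a placeholder; a reasonable choice is tied to the number of cores your job actually requested from the scheduler rather than the total core count of the node.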
Another message you may run into on a cluster is this warning about the JVM's shared memory statistics file:
Java HotSpot(TM) 64-Bit Server VM warning: Insufficient space for shared memory file:
/tmp/hsperfdata_linly/22248
Try using the -Djava.io.tmpdir= option to select an alternate temp location.
The file /tmp/hsperfdata_linly/22248 was created by the JVM, which by default exports its statistics by memory-mapping a file under /tmp/hsperfdata_username. There are cases where these statistics have caused garbage collection pauses: the JVM updates them during garbage collection and safepoints via that memory-mapped file, and the update can block for hundreds of milliseconds until disk I/O completes. For details, take a look at this article. Based on this post, we can use "-XX:+PerfDisableSharedMem" to stop the JVM from memory-mapping the statistics to a file and thus suppress the creation of the hsperfdata_username directories. This option is said to cause fewer issues than turning off performance data entirely with "-XX:-UsePerfData".
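A hedged example of a launch command using these workarounds (the temp path and jar name are placeholders):
java -XX:+PerfDisableSharedMem -Djava.io.tmpdir=/path/with/more/space -Xmx6g -jar chat.jar
-Djava.io.tmpdir moves the JVM's temp files to a location with more space, while -XX:+PerfDisableSharedMem keeps the statistics in ordinary memory instead of a memory-mapped file, so the hsperfdata file is never written.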
Integrate other languages into a Java program
Set up java.library.path in Eclipse IDE
Take a look at this post.
With R
JRI: if you are compiling your Java program with Maven, you must see this post. I copied its contents below in case the post gets lost in the future. Strictly speaking, both R and rJava need to be installed on the machine to resolve the "Cannot find JRI native library!" issue when instantiating your org.rosuda.REngine.REngine object; this cannot be done solely by adding the JRIEngine dependency to your pom.xml. Steps:
- Install R
- Install rJava from within R: install.packages("rJava")
- Add the rJava/jri directory to java.library.path:
-Djava.library.path="/usr/local/lib/R/3.3/site-library/rJava/jri/"
- Add R_HOME (where you installed R) to the environment variables. Note that if you're running this from your IDE, it won't inherit the path you set in ~/.bash_profile; you need to set it in your run configuration. In Eclipse, select the run or debug configuration of your project, click the Environment tab, and add a new environment variable such as R_HOME=/usr/local/Cellar/r/3.3.1_2/
- Ensure Maven has the JRIEngine dependency in pom.xml:
<dependency>
  <groupId>com.github.lucarosellini.rJava</groupId>
  <artifactId>JRIEngine</artifactId>
  <version>0.9-7</version>
</dependency>
- Instantiate the REngine (I need this version in order to pass a data frame to R from Java)
What you end up with looks something like this at runtime if you instantiate with the callback argument (new REngineStdOutput()); if you instead instantiate with just the class name, engineForClass("org.rosuda.REngine.JRI.JRIEngine"), you won't get the R startup output shown below (whether you want it depends on your use case):
String[] args = {"--vanilla"};
REngine engine = REngine.engineForClass("org.rosuda.REngine.JRI.JRIEngine", args, new REngineStdOutput(), false);
R version 3.3.1 (2016-06-21) --
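To round this out, here is a hedged sketch (my own, not from the original post) of evaluating R code and pushing a data frame from Java through the REngine API; it assumes R_HOME and java.library.path are set up as described above, and the variable names are illustrative.
import org.rosuda.REngine.REXP;
import org.rosuda.REngine.REXPDouble;
import org.rosuda.REngine.REngine;
import org.rosuda.REngine.REngineStdOutput;
import org.rosuda.REngine.RList;

public class RBridgeDemo {
    public static void main(String[] args) throws Exception {
        String[] rArgs = {"--vanilla"};
        REngine engine = REngine.engineForClass(
                "org.rosuda.REngine.JRI.JRIEngine", rArgs, new REngineStdOutput(), false);

        // Evaluate a simple R expression and pull the result back into Java
        REXP mean = engine.parseAndEval("mean(rnorm(100))");
        System.out.println("mean = " + mean.asDouble());

        // Build a one-column data frame in Java and hand it to R
        RList cols = new RList(new REXP[]{ new REXPDouble(new double[]{1.0, 2.5, 4.0}) },
                               new String[]{ "x" });
        engine.assign("df", REXP.createDataFrame(cols));
        engine.parseAndEval("print(summary(df$x))");

        engine.close();
    }
}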
Useful links:
- Using Jconsole
- http://jagadesh4java.blogspot.com/2014/09/analyzing-jvm-crash.html
- Eclipse Memory Analyzer (MAT) - Tutorial
- Beginner's Introduction to Java's ForkJoin Framework
install.packages("rJava")
Figure out the logging system
Kirk's old code has been using apache.commons.logging, and my current modification involves Apache Flink, which depends on log4j. I got the following error message:
log4j:WARN No appenders could be found for logger (org.apache.flink.api.java.ExecutionEnvironment).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
First of all, what is the difference between log4j and apache.commons.logging? According to this post, log4j is a logging framework, i.e., it provides the code that actually logs messages. Commons-logging is an abstraction layer over logging frameworks; it doesn't log anything itself. In other words, Apache Commons Logging can delegate to log4j (or whatever logging implementation is present and configured).
Then, why did I get the error message? I went to the website mentioned in the error message, where I found the following.
Why do I see a warning about "No appenders found for logger" and "Please configure log4j properly"?
This occurs when the default configuration files log4j.properties and log4j.xml can not be found and the application performs no explicit configuration. log4j uses Thread.getContextClassLoader().getResource() to locate the default configuration files and does not directly check the file system. Knowing the appropriate location to place log4j.properties or log4j.xml requires understanding the search strategy of the class loader in use. log4j does not provide a default configuration since output to the console or to the file system may be prohibited in some environments. Also see FAQ: Why can't log4j find my properties in a J2EE or WAR application?.
I took that as a request to set up a log4j.properties or log4j.xml file if I want to keep using Apache Flink. OK, I have no clue how to do that. Shame on me. Let's take a look at the log4j manual, or this one, which looks like an easy tutorial for dummies like me, or an even better one, right to the point! Yuan, all you need to do is figure out Google search terms that are as specific as possible...
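For the record, a minimal log4j 1.x configuration sketch (my own guess at a sensible default, not taken from the tutorials above); dropping it on the classpath as log4j.properties makes the "No appenders" warning go away:
# Send everything at INFO and above to the console
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{HH:mm:ss} %-5p %c{1} - %m%n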