Useful logs to collect if FCSDK becomes unresponsive or has high CPU

Occasionally the FAS servers and Media Brokers become unresponsive to service traffic.

Such problems are often due to threading or memory retention issues. It is therefore worth gathering the following information, in the order suggested, before restarting the services.

1. Capture the start time of the servers

Run the command:

ps -ef | grep java > java_processes.txt

and send us the resulting java_processes.txt file. The STIME column shows when each Java process was started.
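Depending on how long a process has been running, the STIME column of ps -ef may show only a date or only a time. As an optional supplement, and assuming a Linux box with the procps version of ps, the lstart output field records the full start date and time. The file name java_start_times.txt is only an illustration, not a file support has asked for.

```shell
# Assumes procps ps (Linux): lstart prints the full start date and time.
# java_start_times.txt is an illustrative file name.
# "|| true" keeps the exit status clean when no Java process is running,
# because grep returns non-zero when it finds no match.
ps -eo pid,lstart,args | grep '[j]ava' > java_start_times.txt || true
```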

2. Gather thread dumps

For the app server run the command:

kill -3 `ps -ef | grep '[a]ppserver' | awk '{print $2}'`

For the load balancer run the command:

kill -3 `ps -ef | grep '[l]oadbalancer' | awk '{print $2}'`

Both thread dumps are written to /opt/cafex/FAS-X.X.X/domain/log/console.log

For the Media Broker controller run the command:

kill -3 `ps -ef | grep '[m]ulti-rtp-proxy-manager' | awk '{print $2}'`

The thread dump is written to /opt/cafex/FCSDK-X.X.X/media_broker/console.log

Note: Take a copy of the console.log files before restarting FAS and the Media Broker, as the files are overwritten on restart.
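One way to take those copies is sketched below. The helper name save_console_log and the naming of the copied files are illustrative assumptions, not part of FAS or FCSDK. It copies a console.log to a timestamped file so that a later restart cannot overwrite it.

```shell
# Copy a console.log to a timestamped file so a restart cannot overwrite it.
# Usage: save_console_log <path-to-console.log> <label> [dest-dir]
# (hypothetical helper; label is just a prefix for the copied file name)
save_console_log() {
    src="$1"
    label="$2"
    dest_dir="${3:-.}"
    stamp=$(date +%Y%m%d-%H%M%S)
    if [ ! -f "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi
    # -p preserves the original modification time of the log.
    cp -p "$src" "$dest_dir/${label}_console_${stamp}.log"
}

# Substitute the installed versions for X.X.X before running:
# save_console_log /opt/cafex/FAS-X.X.X/domain/log/console.log fas
# save_console_log /opt/cafex/FCSDK-X.X.X/media_broker/console.log mb
```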

3. Gather output from top

The per-thread top output is important to collect, as it can be correlated with the thread dump to identify spinning threads.

For the app server box run the following command:

top -H -d 2 -n 30 -b > fas_top.txt

This takes about 60 seconds to complete (30 samples at 2-second intervals). Then send us the fas_top.txt file.

For the Media Broker box run the following command:

top -H -d 2 -n 30 -b > mb_top.txt

and send us the mb_top.txt file.
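The correlation between the two outputs can be sketched as follows, assuming a HotSpot-style JVM whose thread dump reports each thread's native ID in hexadecimal as nid=0x... The PID column of top -H holds the same thread ID in decimal. The helper name tid_to_nid and the thread ID 12345 are made up for illustration.

```shell
# Convert a decimal thread ID from the PID column of `top -H`
# into the hexadecimal nid=0x... form used in HotSpot thread dumps.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}

# Example: show the stack of thread 12345 (a made-up ID) in a copied console.log.
# The trailing space stops nid=0x3039 from also matching nid=0x30391.
# grep -A 15 "$(tid_to_nid 12345) " console.log
```

A thread that stays at the top of the CPU column across several samples, and shows the same stack in the thread dump, is a likely spinning thread.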

4. Gather full logs

In addition, gathering the full logs is advisable (see https://support.cafex.com/hc/en-us/articles/200513801-Capturing-Logs#Manual_Server_Log_Retrieval ).

If the logs are rolling over too quickly due to heavy traffic, and disk space is available, it may be worth increasing the size and number of the app server server.log files. A setting of 20@200M (20 files of 200 MB each) gives 8 times more logging than the default, which may be enough to capture the problem, and takes only 4 GB of disk. This would also provide better logging if the problem recurs. If you do alter the logging, a FAS restart is required.
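The disk figure follows from simple arithmetic: 20 files of 200 MB come to 4000 MB, or roughly 4 GB. The sketch below works that out and shows one way to check free space first. The helper name log_disk_mb is hypothetical, and /opt/cafex is assumed from the paths used in this article.

```shell
# Disk needed for a rolling-log setting of <files> logs at <size_mb> MB each.
log_disk_mb() {
    echo $(( $1 * $2 ))
}

needed_mb=$(log_disk_mb 20 200)   # the 20@200M setting
echo "20@200M needs ${needed_mb} MB"

# Free space (in MB) on the filesystem holding the logs. /opt/cafex is an
# assumption taken from the paths in this article; adjust it if needed.
# df -Pk /opt/cafex | awk 'NR==2 {print int($4/1024) " MB available"}'
```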

 

5. Take heap dumps

For the app server run the following command:

`which jmap` -J-d64 -dump:format=b,file=as_heap.bin `ps -ef | grep '[a]ppserver' | awk '{print $2}'`

For the load balancer run the following command:

`which jmap` -J-d64 -dump:format=b,file=lb_heap.bin `ps -ef | grep '[l]oadbalancer' | awk '{print $2}'`

For the Media Broker controller run the following command:

`which jmap` -J-d64 -dump:format=b,file=mbc_heap.bin `ps -ef | grep '[m]ulti-rtp-proxy-manager' | awk '{print $2}'`

(Remove the -J-d64 flag if running on a 32-bit box.)

Each command normally takes a few seconds to complete and produces a file of 512 MB or more to send to us, probably via the ftp.cafex.com server. If the file is smaller than this, there is no memory problem and the file can be discarded. The above commands can also be run with the -F (force) option if the process does not respond.
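The size check described above can be sketched as a small helper. The name heap_dump_verdict is hypothetical, and the 512 MB threshold is taken here as 512 * 1024 * 1024 bytes, which is an assumption about how the figure is meant.

```shell
# Decide whether a heap dump is large enough to be worth sending.
# Usage: heap_dump_verdict <file> [min_bytes]
# min_bytes defaults to 512 MB (assumed to mean 512 * 1024 * 1024 bytes).
heap_dump_verdict() {
    file="$1"
    min_bytes="${2:-$((512 * 1024 * 1024))}"
    if [ ! -f "$file" ]; then
        echo "not found: $file" >&2
        return 1
    fi
    size=$(wc -c < "$file")
    size=$((size))   # normalise any padding some wc versions add
    if [ "$size" -ge "$min_bytes" ]; then
        echo "send"
    else
        echo "discard"
    fi
}

# Example: heap_dump_verdict as_heap.bin
```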

Important Note: In certain situations the heap dump may take so long to capture that it causes a timeout in FAS, which triggers a restart. Make sure you have a copy of the console.log files containing the thread dumps, and the start time of the servers, before taking the heap dump.

Comments are disabled on these articles. If you require help, contact support@cafex.com.
