Basic troubleshooting on a Linux server machine

Ashutosh Kumar
May 29, 2021

Below are a few commands for a quick health check and for monitoring the server.

Check for the RAM usage

free -m
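If a single number is easier to track than the full table, the used percentage can be derived from /proc/meminfo. This is a minimal sketch, assuming a kernel that exposes MemAvailable (3.14 or later); mem_used_pct is a made-up helper name, not a standard tool.

```shell
#!/bin/sh
# Hypothetical helper: print the percentage of RAM in use.
# Reads /proc/meminfo-formatted text on stdin, so it can be fed sample data.
mem_used_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }'
}

# On a live Linux system, read the real file.
if [ -r /proc/meminfo ]; then
    mem_used_pct < /proc/meminfo
fi
```

The value is "total minus available", so page cache that the kernel can reclaim is not counted as used.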

Check for the disk usage

df -h <path>
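To spot the filesystems that need attention, the same data can be filtered by a usage threshold. A rough sketch, assuming the POSIX output format of df -P; fs_over is an illustrative name and the 80% limit is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: list mount points whose usage is at or above a limit.
# $1 is the limit in percent; df -P formatted text is read from stdin.
fs_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)            # strip the trailing percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# Report every filesystem that is at least 80% full.
df -P | fs_over 80
```

Mount points that contain spaces would need more careful parsing than this.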

Check for the inode usage
- if the application creates many small files, the filesystem can run out of inodes while free disk space remains, and writes then fail, so this can be a bottleneck.

df -ih <path>
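When inode usage is high, the next question is usually which directory holds all the files. The sketch below counts entries under each immediate subdirectory; inode_hogs is a hypothetical helper, and the count is only an approximation of inode usage (hard links, for instance, are counted once per name).

```shell
#!/bin/sh
# Hypothetical helper: show the five subdirectories of $1 with the most entries.
inode_hogs() {
    dir=$1
    for d in "$dir"/*/; do
        [ -d "$d" ] || continue      # skip the literal pattern when nothing matches
        # -xdev keeps find on the same filesystem as the starting directory
        printf '%s %s\n' "$(find "$d" -xdev | wc -l | tr -d ' ')" "$d"
    done | sort -rn | head -n 5
}

# Run only when a directory is passed as the first argument.
if [ "$#" -gt 0 ]; then
    inode_hogs "$1"
fi
```

Each count includes the subdirectory itself, so an empty directory shows up as 1.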

Check for the heap memory usage

Identify the minimum (-Xms) and maximum (-Xmx) heap size in the case of a Java application
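One way to see these limits is to read them off the command line of the running JVM. A small sketch, assuming the flags were passed directly as arguments (not through an environment variable such as JAVA_TOOL_OPTIONS); heap_flags is an illustrative name.

```shell
#!/bin/sh
# Hypothetical helper: pick the -Xms / -Xmx flags out of a command line on stdin.
heap_flags() {
    tr ' ' '\n' | grep -E '^-Xm[sx]'
}

# Run only when a PID is passed as the first argument.
if [ "$#" -gt 0 ]; then
    ps -o args= -p "$1" | heap_flags
fi
```

If nothing is printed, the JVM is running with its default heap sizes or the flags are set elsewhere.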

Check for the running application

top

Check for the context switching

cat /proc/<pid>/status
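The status file is long, and only two of its fields describe context switching: voluntary_ctxt_switches and nonvoluntary_ctxt_switches. A short sketch that pulls out just those two; ctxt_switches is a made-up helper name, and the live example reads the status of the current shell only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: extract the context switch counters from
# /proc/<pid>/status formatted text on stdin.
ctxt_switches() {
    awk '/^voluntary_ctxt_switches:/    { v = $2 }
         /^nonvoluntary_ctxt_switches:/ { n = $2 }
         END { printf "voluntary=%d nonvoluntary=%d\n", v, n }'
}

# Example: counters of the current shell process.
if [ -r "/proc/$$/status" ]; then
    ctxt_switches < "/proc/$$/status"
fi
```

A nonvoluntary count that grows quickly suggests the process is being preempted, which often means more runnable threads than available CPU.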

Check the thread count of the Java process (this assumes a single java process, since pidof prints every matching PID)

ls /proc/$(pidof java)/task/ | wc -l

To get the number of threads for a given pid

ps -o nlwp <pid>

To get the sum of all threads running in the system

ps -eo nlwp | tail -n +2 | awk '{ num_threads += $1 } END { print num_threads }'
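Alongside the total, it can help to see which processes own the most threads. A possible sketch, assuming the procps version of ps (the trailing = on each column suppresses the header); top_threads is an illustrative name.

```shell
#!/bin/sh
# Hypothetical helper: keep the $1 rows with the highest first column
# (default 5). Expects "nlwp pid command" rows on stdin.
top_threads() {
    sort -k1,1nr | head -n "${1:-5}"
}

# List the five processes with the most threads.
if command -v ps >/dev/null 2>&1; then
    ps -eo nlwp=,pid=,comm= | top_threads 5
fi
```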

On Linux specifically, here is one way to do it per-process:

#!/bin/sh
# Print the thread count of the process whose PID is given as $1.
while read -r name val; do
    if [ "$name" = "Threads:" ]; then
        printf '%s\n' "$val"
        break
    fi
done < /proc/"$1"/status

You may then invoke this script with a PID as an argument, and it will report the number of threads owned by that process.

To get the thread count for the whole system, this suffices:

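#!/bin/sh
# Print the number of arguments, i.e. how many paths the glob expanded to.
count() {
    printf '%s\n' "$#"
}
# Each /proc/<pid>/task/<tid> directory corresponds to one thread.
count /proc/[0-9]*/task/[0-9]*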

Count the number of threads in a process
https://www.golinuxcloud.com/check-threads-per-process-count-processes/

More about inodes
https://www.2daygeek.com/linux-check-count-inode-usage/

What to do next?

For the server machine – monitor the following

CPU utilization – if the CPU is underused, we can let this go higher by increasing the number of threads.

Disk space utilization – See the trend in disk space utilization.

Memory utilization

For the dependencies – monitor the following

DB instance – MySQL, PostgreSQL, etc. (for example, on RDS)

Connection limit

Memory utilization

CPU utilization

Disk utilization

Check whether a read replica of the DB can take the load generated by GET APIs.

Cloud Storage like S3

Check the average time being spent

New Relic gives us a good idea of this.

Identify the read and write operations – I/O or disk reads/writes

Inode usage during peak time will convey this.

Find the asynchronous and synchronous request points and measure the time taken by them

Identify the different thread pools used by the application, and the thread count of each.
