Root Filesystem Full – No Space Left on Device due to open files

Here’s an interesting scenario I was asked to look into recently: the root file system on an Oracle database server was filling up. Normally, cleaning up disk space is straightforward: find the large and/or old files and delete them. In this case, however, there was a difference in the space usage reported by df and du, and the find utility could not locate any file over 1G in size.

Here’s the status of the root file system which was causing the “No Space Left on Device” error message.

# df -h .
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VGExaDb-LVDbSys1
                       30G   30G     0 100% /

After deleting around 2G of old files and logs, the error went away, but the output of df -h showed the root file system slowly filling up again. The directory sizes hardly changed at all, differing only by megabytes. From the “/” directory, here are all the directories that reside on the “/” file system (confirmed with df -h for each):

# du -sh *
7.7M     bin
67M     boot
3.5M     dev
8.5M     etc
1.2G     home
440M     lib
28M     lib64
16K     lost+found
4.0K     media
1.2G     mnt
6.8G     opt
1.1G     root
41M     sbin
4.0K     selinux
4.0K     srv
0     sys
12M     tmp
3.0G     usr
260M     var

Notice here that the sum of these directories only adds up to around 15G, leaving roughly half of the used space unaccounted for, while the file system usage was still increasing.

The next step was to look at open files. It is worth mentioning here that even if a file is deleted, its space may not be reclaimed while the process that created it, or is still using it, keeps running. The lsof (list open files) utility will show these files.

# lsof | grep deleted
...
expdp      7271  oracle    1w      REG              253,0 16784060416    2475867 /home/oracle/nohup.out (deleted)
expdp      7271  oracle    2w      REG              253,0 16784060416    2475867 /home/oracle/nohup.out (deleted)
…
#
# ps -ef | grep 7271
oracle    7271     1 99 May31 ?        3-10:43:36 expdp               directory=DP_DIR dumpfile=exp_schema.dmp logfile=exp_schema.log schemas=schema

The above shows an export Data Pump job (pid 7271) whose process was still running at the OS level, although it was no longer running in the database. The job was probably canceled for some reason but never cleaned up; although the nohup.out file had been deleted, the background process still held it open, so the file continued to consume space and fill up the “/” partition. It is worth mentioning here that the use of nohup is NOT needed with Data Pump. The Data Pump utilities are server-side processes; if you kick off a job and then lose your terminal for whatever reason, the Data Pump job keeps running on the server.

Once the expdp process 7271 was killed at the OS level, the space was reclaimed.

# df -h .
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VGExaDb-LVDbSys1
                       30G   13G   16G  45% /
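Killing the owning process is the cleanest fix, but when the process must keep running, the space can also be reclaimed on Linux by truncating the deleted file through its /proc file descriptor entry. A minimal sketch, with the PID and FD as hypothetical arguments to be matched against the PID and FD columns of the lsof output:

```shell
#!/bin/sh
# Sketch: zero out the file behind an open descriptor without killing
# the owning process (Linux-specific, relies on /proc/PID/fd/FD).
# For the lsof output above, "7271" and "1w" would map to:
#   truncate_fd 7271 1
# If the process keeps writing at its old offset afterwards, the file
# becomes sparse, but the dead space is freed immediately.
truncate_fd() {
    : > "/proc/$1/fd/$2"
}
```

This avoids losing the work of a long-running process at the cost of leaving a sparse, truncated file behind the descriptor.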
