Ubuntu 16.04 - "IOException: Too many open files"

Posted: Fri Oct 06, 2017 4:50 pm
by gilthanaz
[Problem]
Starting a service with "sudo service <servicename> start", or having it start automatically after a reboot, fails or only partially succeeds. The partial success is especially nasty: I ran a "7 Days to Die" server that appeared to be up, but any players who tried to connect got a message like "Server is still initializing, please try again later". Studying the server log after trying a restart showed an "IOException: Too many open files".

Code:

$ ulimit -n
This showed that Ubuntu Server 16.04.2 LTS has a fairly low default limit of 1024 open files. That is obviously not enough, so we tried to raise it in /etc/security/limits.conf:

Code:

* soft nofile 16384
* hard nofile 16384
Changes to this file only take effect after logging out and back in! We did that, checked again with ulimit -n, and everything looked fine. Unfortunately, the error happened again when executing

Code:

$ sudo service 7days start


We also checked the limits of the running process itself:

Code:

$ cat /proc/<PID>/limits

Code:

Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             47903                47903                processes
Max open files            1024                 1024                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       47903                47903                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Max open files was still set to 1024 for the process! A reboot did not change this either, so it was clear that the limits.conf file might help with other issues, but not with this specific one.
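
To see the mismatch side by side, the limit of the login shell can be compared with the limit of the running process. The following is only a minimal sketch: nofile_limits is a hypothetical helper name (not part of any tool), and the PID of the server process has to be filled in by hand.

```shell
#!/bin/sh
# nofile_limits FILE: print the soft and hard "Max open files" values
# from a limits file such as /proc/<PID>/limits.
# (hypothetical helper name, chosen for this example)
nofile_limits() {
    # Columns of that line: "Max" "open" "files" <soft> <hard> "files"
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Example usage on the server:
#   ulimit -n                          # limit of the current login shell
#   nofile_limits /proc/<PID>/limits   # limit of the running server process
#   ls /proc/<PID>/fd | wc -l          # descriptors the process has open right now
```

If the shell reports the raised value while the process still shows 1024, the process did not get its limits from the login session.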


[Solution]
In the end, the problem was that systemd applies its own resource limits when it starts a service and does not read limits.conf, which is only applied to login sessions through PAM (pam_limits). The solution was to edit the /lib/systemd/system/7days.service unit file and add "LimitNOFILE=1024000" to the [Service] section (note: a lower value, e.g. 16384, probably would have been sufficient). Below is the now working version as an example:

/lib/systemd/system/7days.service:

Code:

[Unit]
Description=7 Days to Die THC Private Server
After=network.target nss-lookup.target

[Service]
User=7days
Group=7days
Type=simple
PIDFile=/run/7days.pid
ExecStart=/home/7days/7days_server/startserver.sh -configfile=/home/7days/config/serverconfig.xml
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
Restart=always
LimitNOFILE=1024000

[Install]
WantedBy=multi-user.target
After changing a service unit file, systemd must reload its configuration:

Code:

$ sudo systemctl daemon-reload
Then restart the 7 Days to Die server, stopping it first to make sure it is not running:

Code:

$ sudo service 7days stop
$ sudo service 7days start
This time, the server should come up nicely and accept connections.
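
To confirm that systemd really picked up the new value, the unit's configured limit and the limit of its main process can be queried. This is a sketch under assumptions: unit_property is a hypothetical helper name, and "7days" is the unit name used in this post. It avoids the --value option of systemctl, since that option is not available in the systemd version shipped with 16.04.

```shell
#!/bin/sh
# unit_property UNIT NAME: print the value of one property of a unit.
# `systemctl show -p NAME UNIT` prints "NAME=value"; cut strips the key.
# (hypothetical helper name, chosen for this example)
unit_property() {
    systemctl show -p "$2" "$1" | cut -d= -f2-
}

# Example usage (after daemon-reload and a restart of the service):
#   unit_property 7days LimitNOFILE             # limit systemd has configured
#   pid=$(unit_property 7days MainPID)          # PID of the service's main process
#   grep 'Max open files' "/proc/$pid/limits"   # limit the process actually got
```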


[Cause]
The most likely reason the server ran without issues for many months and then suddenly could not start anymore is that additional region files are created as players discover more of the map. To make exploration less laggy, I used the command "visitmap -10000 -10000 10000 10000" in the command shell of the 7days telnet access. This causes the whole map to be generated in the background while people are playing on the server. The process takes about 36 hours and creates 1600 files, so on the next restart the default file limit of 1024 for this process was too low.
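
A rough way to spot this in advance is to count the files in the save directory and compare the number against the limit. This is only a sketch: count_files and check_file_count are hypothetical helper names, the save directory has to be passed in because its location depends on the installation, and it rests on the assumption above that the server opens roughly one descriptor per region file.

```shell
#!/bin/sh
# count_files DIR: number of regular files below DIR (recursive).
# (hypothetical helper name, chosen for this example)
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# check_file_count DIR LIMIT: warn and return 1 if DIR holds LIMIT or more files.
# (hypothetical helper name, chosen for this example)
check_file_count() {
    dir="$1"
    limit="$2"
    count=$(count_files "$dir")
    if [ "$count" -ge "$limit" ]; then
        echo "WARNING: $count files in $dir, limit is $limit"
        return 1
    fi
    echo "OK: $count files in $dir, limit is $limit"
}

# Example usage, with the save directory of your own installation:
#   check_file_count /path/to/save/directory 1024
```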