Long Running Processes

ARC allows users to run long-term jobs, also known as long running processes (LRPs), on the research compute servers. If you intend to run jobs longer than 24 hours, please contact ARC.

Be aware that servers may be taken offline for upgrades over winter and summer breaks. Typically ARC notifies affected users one week in advance of scheduled downtime.

Controlling Your LRPs

The research servers are a shared resource. ARC reserves the right to kill or renice processes that become a detriment to other users. The following four steps will help you be mindful of other users and control your LRPs.

Determine the Amount of Free Resources

SSH to research.pdx.edu and run the command "top". This will provide you with a listing of currently running processes and available system resources, similar to Windows Task Manager. As shown in the image below, the load average should be low and the free memory high. A load average greater than 8 generally means the system is at capacity. Low free memory may cause your process to crash. Press Q on the keyboard to exit "top".
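If you would rather check the load from a script than read it off "top", the following is a minimal sketch. It assumes a Linux server, where the first field of /proc/loadavg is the 1-minute load average. The helper names load_is_low and current_load are hypothetical, and the default threshold of 8 is simply the capacity figure mentioned above.

```shell
# Hypothetical helper: succeeds when a load average is below a limit.
# Usage: load_is_low LOAD [LIMIT]   (LIMIT defaults to 8)
load_is_low() {
    awk -v load="$1" -v limit="${2:-8}" 'BEGIN { exit !(load + 0 < limit + 0) }'
}

# Assumes Linux: the first field of /proc/loadavg is the 1-minute load average.
current_load() {
    cut -d ' ' -f 1 /proc/loadavg
}
```

For example, running load_is_low "$(current_load)" && echo "OK to start" prints the message only when the 1-minute load is under 8. The command "free -m" gives a similar non-interactive view of free memory.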

Determine the Fully Qualified Domain Name of the Server

Run the command "echo $HOST". This will return the fully qualified domain name (FQDN) of the server as shown in the image below.
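Note that $HOST is set by tcsh and zsh, while bash sets $HOSTNAME instead. As a hedged sketch for shells where $HOST is empty, the hypothetical helper below falls back to "hostname -f", which prints the FQDN on Linux systems.

```shell
# Hypothetical helper: print this server's name.
# Prefers $HOST when the shell provides it, otherwise asks hostname for the FQDN.
server_name() {
    if [ -n "${HOST:-}" ]; then
        echo "$HOST"
    else
        hostname -f
    fi
}
```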

Keep Your Process Running After You Close Your SSH Session

  1. The Unix utility "screen" allows users to keep processes running on the server after they log out. Screen creates a virtual terminal that users can connect to and disconnect from at will. Note that your screen runs only on the server on which you started it.
  2. After typing the command "screen", you will be inside a virtual terminal. Any commands you run will be executed inside this virtual terminal.
  3. To disconnect from your screen, press Ctrl+A, then D. To stop your virtual terminal, run the command "exit" inside it.
  4. To reconnect to your screen, you must SSH to the fully qualified domain name found in step 2 and then run the command "screen -dr". Once inside your screen, you can check on the progress of your process.
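If you keep more than one screen on a server, naming each session makes it easier to reconnect to the right one. The sketch below wraps three standard screen invocations in hypothetical helper functions; the function names are illustrative only.

```shell
# Hypothetical helper names, for illustration only.

# Start an interactive session with a name you can recognize later.
start_session() {
    screen -S "$1"
}

# List your sessions on this server.
list_sessions() {
    screen -ls
}

# Reattach to a named session, detaching it from any other terminal first.
reattach_session() {
    screen -dr "$1"
}
```

For example, start_session myjob opens a session called "myjob" (a made-up name). After detaching and logging back in to the same server, reattach_session myjob returns you to it.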

Make Your Processes Play Nice

  1. The nice and renice commands adjust the priority of processes. ARC recommends that users nice all of their processes with a value of at least 5. To nice a process when it is started, type the command "nice +5 {other_commands}". The "other_commands" will be run with a lower priority. The "+5" form is the syntax of the csh and tcsh built-in; in bash and other POSIX shells, the equivalent is "nice -n 5 {other_commands}".
  2. To change the priority of a currently running process, you must first determine the process ID (PID). This can be accomplished by running the command "ps -u". As shown in the image below, this lists all of the processes you have on the server. Now that you have the PID, run the command "renice +5 {PID}".
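The two steps above can be sketched as small shell functions. This is an illustration under stated assumptions, not ARC's own tooling: the function names are hypothetical, and run_nicely uses the POSIX "nice -n" form, which runs the external nice utility and so works regardless of your login shell.

```shell
# Hypothetical helper names, for illustration only.

# Start a command at niceness +5 relative to the current shell (POSIX syntax).
run_nicely() {
    nice -n 5 "$@"
}

# List your processes with their PID and nice value.
list_my_processes() {
    ps -u "$(id -un)" -o pid,nice,comm
}

# Set the nice value of a running process (default 5).
# Usage: make_nicer PID [VALUE]
# Without special privileges you can generally only raise the value.
make_nicer() {
    renice "${2:-5}" -p "$1"
}
```

For example, run_nicely ./my_job.sh starts a (hypothetical) job script at lower priority, and make_nicer 12345 does the same for an already running process whose PID is 12345.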

Putting It All Together

  1. SSH to research.pdx.edu. This will connect you to the research server with the lowest average load.
  2. Run the command "top" and verify that the server is not under heavy load.
  3. Run the command "echo $HOST" to determine the FQDN of the server.
  4. Run the command "screen" to keep your processes running after you close your connection.
  5. Run the command "nice +5 {start_my_job}".
  6. Press Ctrl+A, then D, and close your session.
  7. SSH to the FQDN of the server you started your screen on and run the command "screen -dr".
  8. Check on your LRP.
  8. Check on your LRP.
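Steps 4 through 6 can also be combined into a single non-interactive command, since screen can start a session already detached. The sketch below is one possible way to do that, not an ARC-provided tool. launch_lrp and LRP_DRY_RUN are hypothetical names, and you should still check the load and note the server's FQDN first.

```shell
# Sketch: start a job at niceness +5 inside a detached, named screen session.
# launch_lrp and LRP_DRY_RUN are hypothetical names used for illustration.
# Usage: launch_lrp SESSION_NAME COMMAND [ARGS...]
launch_lrp() {
    session="$1"
    shift
    if [ -n "${LRP_DRY_RUN:-}" ]; then
        # Print the command instead of running it.
        echo "screen -dmS $session nice -n 5 $*"
        return 0
    fi
    # -d -m starts screen detached; -S gives the session a name.
    screen -dmS "$session" nice -n 5 "$@"
}
```

For example, launch_lrp myjob ./my_job.sh starts a made-up job script in a session called "myjob". To check on it later, SSH to the same server and run "screen -dr myjob". A session started this way ends when the job exits.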

Further Resources

For more information on long running processes, contact Academic & Research Computing.

You can also contact the Helpdesk for additional assistance.