How to Troubleshoot SSH Shell Environment Issues

SSH is the primary method available for managing DigitalOcean Droplets. Dealing with SSH errors or failures can be frustrating because the errors themselves often prohibit you from accessing your servers.

There are two prerequisites to troubleshooting SSH issues:

  1. Should I troubleshoot SSH? Determine whether troubleshooting is the right decision or if migration/redeployment is more appropriate.
  2. What should I do before troubleshooting SSH? Make sure the issue is truly with SSH, then review the information and skills necessary to resolve SSH issues, like having root access to the server and understanding how to access and edit files.

When to Consider Migration or Redeployment

To resolve your issue quickly, first determine whether troubleshooting the connection is the right solution for your problem or if you should instead focus on recovering your data for redeployment.

Some issues, such as an accidental recursive rm or chmod command or incorrect network configuration, can lock you out of a Droplet permanently. Other issues may seem like connection problems, but are actually more complex issues with no clear resolution, like corrupted file systems, incorrect file permissions and ownership, and broken system packages and required libraries.

You can typically identify boot errors through the Droplet console startup output. File system issues and startup failures that prevent a working console login session are signs that troubleshooting your network configuration may not be the best option. In situations like this, the best approach is to salvage what you can. In some cases, a good backup or snapshot strategy is the fastest way back to your previous working environment.

What to Do Before Troubleshooting

If you’ve decided that troubleshooting is right for your situation, go through the following steps:

  1. Check the control panel. Before anything else, make sure there are no ongoing issues, like an outage in the region impacting your Droplet.

  2. Check if your Droplet is disabled because of abuse. Droplets are sometimes disabled when abusive activity is detected. If your Droplet has been disabled, an email titled Networking Disabled: <your-droplet-name> has been sent to the email address linked to your DigitalOcean account. You can also log in to the support portal to see if any support tickets have been created for your resources.

    If your Droplet has been disabled due to suspected abuse, contact our support team for further information.

  3. Recover root access. If you do not have the current root password, reset it using the reset root password function in the control panel.

  4. Access the Recovery Console. If you cannot log in to the Droplet, the Recovery Console is another way to gain access (as long as your Droplet is running and you have a working root password).

  5. Reboot your Droplet. Many connectivity problems can be resolved after a reboot. If you’re experiencing connectivity issues, try rebooting the Droplet and see if this resolves the issue.

    Before rebooting your Droplet, we highly recommend taking a snapshot of it. This allows you to redeploy your Droplet in its current configuration if rebooting the Droplet causes more serious problems.

    To reboot your Droplet, log in to it and run the following command:

    sudo reboot
    
  6. Review file management and permissions. Some of these solutions may require you to review or edit files on the system or manage permissions.

  7. Check logs. Once you can get into the Droplet, check the system’s log files for more information to identify the error so you can then look up a solution.

    You can learn more about the logs on your server with this Linux logging tutorial and this journalctl and systemd logging tutorial.

  8. Use verbose SSH output. SSH clients generally provide little detail about the SSH session by default. More information is helpful when debugging an issue.

    For the OpenSSH client, you can use the -v option and repeat the v to increase the verbosity of the output, as in ssh -v user@host, where user and host stand for your own username and Droplet address. While most issues are revealed with a single v, some issues may benefit from -vvv.

    The PuTTY client supports an Event Log accessible from the context icon in the application window bar. There’s also an option for configuring session logging from the settings page when initiating the connection.
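
    As a minimal sketch of the verbose option described above, the following shell functions build the -v flag and save the client's debug output for later review. The function names ssh_verbosity_flag and ssh_debug and the log file name ssh-debug.log are hypothetical, chosen only for this illustration. OpenSSH writes its debug messages to standard error, which is why the sketch redirects that stream.

```shell
# Build an OpenSSH verbosity flag: 1 -> -v, 2 -> -vv, 3 -> -vvv (the maximum).
ssh_verbosity_flag() {
  case "$1" in
    1) echo "-v" ;;
    2) echo "-vv" ;;
    3) echo "-vvv" ;;
    *) echo "usage: ssh_verbosity_flag 1|2|3" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper: connect with the chosen verbosity and save the debug
# output, which ssh writes to stderr, to ssh-debug.log in the current directory.
ssh_debug() {
  flag=$(ssh_verbosity_flag "$1") || return 1
  shift
  ssh "$flag" "$@" 2> ssh-debug.log
}
```

    For example, ssh_debug 3 user@host would connect with -vvv and leave the debug output in ssh-debug.log, where user and host stand for your own username and Droplet address.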

After you decide to troubleshoot an SSH issue instead of migrating or redeploying, you can identify and resolve specific SSH errors based on which phase of a successful SSH connection you need to debug.

Once your SSH connection is established and you are authenticated, the server starts the remote shell environment. A couple of issues can occur at this point. They are described below, followed by actions you can take to address them.

Errors

Could Not chdir To Home Directory

In some cases, damaged directory ownership or permissions prevent access to the home directory. This can result in errors like the following:

Could not chdir to home directory /home/user: Permission denied
Could not chdir to home directory /home/user: Input/output error
Could not chdir to home directory /home/user: No such file or directory

Some issues might stem from the user home directory not existing, its ownership being incorrect, or its permissions being too restrictive. This also might happen when filesystem issues have corrupted the home directory.

To troubleshoot this issue, try checking the home directory’s existence, permissions, and ownership.

This Account Is Currently Not Available

In some cases, users are configured without a login shell. This can manifest in several ways, such as the shell not responding. You might see an error like this:

This account is currently not available.

Here are some potential causes of this issue:

  • The user is a system user and not intended for shell access.
  • The user's shell is set to nologin, true, false, or another non-shell binary. In this case, you can update the user shell.

Resource Temporarily Unavailable

The SSH service, like any service, requires system resources to operate. This means that when your Droplet is under resource-constrained conditions, the service may fail to open a working shell environment. These conditions include exhausting the system memory, reaching the system’s open file limit, or crashing the runtime environment.

You might see an error message like this:

ssh: connect to host example.com port 22: Resource temporarily unavailable

Resource issues can be difficult to debug, and your options depend on the kind of access you have to your Droplet. See Dealing With Resource Issues below for how to handle them.

Solutions

Below are some troubleshooting methods and solutions to common SSH environment errors.

Checking The Home Directory

In some cases, you may need to use the Recovery Console to log in as root to evaluate the home directory with sufficient permissions to address any issues. Verify that /home and the path for the user’s home directory exist using stat or a similar utility.

If the directories exist, verify that the user’s home directory has appropriate permissions (at least 700) and ownership (the user, not root).
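
The checks described above can be sketched as a small shell function. This is an illustration under stated assumptions, not a definitive tool: the function name check_home_dir is hypothetical, and the -c format option assumes GNU stat, which is what Linux distributions typically ship. It treats "at least 700" as meaning the owner has full read, write, and execute access.

```shell
# check_home_dir DIR OWNER: report problems with a home directory.
# Hypothetical helper; relies on GNU stat's -c format option.
check_home_dir() (
  dir=$1
  owner=$2
  if [ ! -d "$dir" ]; then
    echo "missing: $dir does not exist or is not a directory"
    exit 1
  fi
  actual_owner=$(stat -c '%U' "$dir")
  mode=$(stat -c '%a' "$dir")
  status=0
  if [ "$actual_owner" != "$owner" ]; then
    echo "owner: $dir is owned by $actual_owner, expected $owner"
    status=1
  fi
  # The owner's permission bits are the third octal digit from the right.
  owner_bits=$(( (0$mode >> 6) & 7 ))
  if [ "$owner_bits" -ne 7 ]; then
    echo "mode: $dir has mode $mode; the owner lacks full rwx access"
    status=1
  fi
  if [ "$status" -eq 0 ]; then
    echo "ok: $dir (owner $actual_owner, mode $mode)"
  fi
  exit "$status"
)
```

For example, check_home_dir /home/user user checks the home directory shown in the error messages earlier. If it reports a problem, running chown user /home/user or chmod 700 /home/user as root is the usual kind of fix, depending on which check failed.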

Updating The User Shell

From the Recovery Console, log in as root or a user with sudo access. You can review the /etc/passwd file directly or use the getent command to list the details:

getent passwd user

The output looks like this, with /usr/sbin/nologin at the end.

user:x:1000:1000::/home/user:/usr/sbin/nologin

To update this, use the system command usermod and specify the correct shell to use, like /bin/bash.

usermod -s /bin/bash user

Run the getent command again to see the change reflected in the output:

user:x:1000:1000::/home/user:/bin/bash

You can then try logging in again.
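
To check an account before changing it, the lookup above can be sketched as two small helpers. The function names passwd_shell and blocks_login are hypothetical, and the list of placeholder shells covers only the ones named in this article (nologin, true, and false). The sample entry is the one shown above, so the sketch runs without needing a real user.

```shell
# Print the login shell, which is the 7th colon-separated field of a
# passwd-format line read from standard input.
passwd_shell() {
  cut -d: -f7
}

# Succeed if the shell's file name is one of the placeholders named above.
blocks_login() {
  case "${1##*/}" in
    nologin|true|false) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample entry from this article; on a Droplet, pipe `getent passwd user` instead.
entry='user:x:1000:1000::/home/user:/usr/sbin/nologin'
shell=$(printf '%s\n' "$entry" | passwd_shell)
if blocks_login "$shell"; then
  echo "$shell does not allow interactive logins"
fi
```

Matching on the file name rather than the full path is a design choice for this sketch, because the location of nologin differs between distributions.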

Dealing With Resource Issues

How you deal with resource issues depends heavily on the context.

If resource contention is caused by network requests (like a DDoS attack against a web application), you may be able to disable the service or block traffic at the firewall from the Recovery Console. This may allow enough room for you to assess the impact of the situation and implement mitigation strategies or consider scaling your deployment.

If you cannot log in from the Recovery Console, the last resort is to power cycle or reboot the Droplet. Depending on the cause of the resource exhaustion, the Droplet may return to the same exhausted state, or it may initially accept a connection that then gives an Unable to fork process error when you attempt to run a command. Connecting through the Recovery Console or SSH after a reboot but before the Droplet becomes unresponsive is key to troubleshooting the root cause.
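
If you do get a working session, a quick look at memory and open file usage can help narrow down the cause. The following is a minimal sketch that assumes a Linux system with the standard /proc files and a kernel recent enough to report MemAvailable. The function names mem_available_mib and file_handles_pct are hypothetical.

```shell
# Print available memory in MiB from a meminfo-format file (values are in kB).
mem_available_mib() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Print the percentage of the system-wide file handle limit in use.
# The file holds three numbers: allocated handles, unused handles, maximum.
file_handles_pct() {
  awk '{ printf "%d\n", ($1 - $2) * 100 / $3 }' "${1:-/proc/sys/fs/file-nr}"
}

# Report both values when the /proc files are readable (Linux only).
if [ -r /proc/meminfo ] && [ -r /proc/sys/fs/file-nr ]; then
  echo "Available memory: $(mem_available_mib) MiB"
  echo "File handles in use: $(file_handles_pct)%"
fi
```

These figures are system-wide. The per-process open file limit for the current shell is a separate value, which ulimit -n prints.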

Conclusion

If you need further help, you can open a support ticket. Make sure to include the following information:

  • The username, host, and port you are using to connect.
  • The authentication mechanism you expect to use.
  • The full output of the errors from the stage where the connection fails, including verbose output from the SSH client.
  • All of the information you’ve gathered from troubleshooting so far.
  • Anything you were unclear about while referencing this article.

Including all of the above diagnostic information and clarifying where in the connection process you encounter the issue helps us get up to speed quickly.

Problems with SSH authentication include permission denied errors with SSH keys and passwords.
Problems with SSH connectivity include hostname resolution errors and connections being refused or timing out.
Problems during SSH protocol initiation include the client suddenly getting dropped or closed, the client returning errors about cipher negotiation, or issues with an unknown or changed remote host.