Troubleshooting

The sections below cover common issues that may arise on a system and how to address them.

System

The table below provides guidance for resolving certain system-related issues that may arise.

Symptom Reason Verification Next steps

Application shuts down regularly

Nginx TLS config and logrotate conflict

  • Check if logrotate is enabled.

  • Check if the Nginx TLS certificate requires a password on startup.

Choose one:

  • Disable logrotate

  • Remove password from Nginx TLS certificate

Unable to install or update rpm/deb packages using yum/apt

System is not properly subscribed to SDE update server

  • Check to ensure system is properly subscribed

  • sudo subscription-manager status should return:

+-------------------------------------------+
   System Status Details
+-------------------------------------------+
Overall Status: Current
  • sudo yum repolist should return something similar to the following:

Loaded plugins: fastestmirror, presto, product-id, search-disabled-repos, *subscription-manager*
Setting up Update Process
Determining fastest mirrors
Default_Organization_Centos_7_centosplus_x86_64
Default_Organization_Centos_7_extras_x86_64
Default_Organization_Centos_7_os_x86_64
Default_Organization_Centos_7_updates_x86_64
Default_Organization_Custom_7_custom_x86_64
Default_Organization_EPEL_6Server_x86_64
Default_Organization_EPEL_6_x86_64
Default_Organization_IUS_https_dl_iuscommunity_org_pub_ius_stable_CentOS_7_x86_64
Default_Organization_Katello_latest_client_el6_x86_64
Default_Organization_NGINX_centos_7_x86_64
Default_Organization_PostgreSQL_9_4_redhat_rhel-7-x86_64
Default_Organization_PostgreSQL_9_4_redhat_rhel-7Server-x86_64
Default_Organization_Puppet_el_6_dependencies_x86_64
Default_Organization_Puppet_el_6_pc1_x86_64
Default_Organization_Puppet_el_6_products_x86_64
  • Manually subscribe the system using the following commands. The activation key must also be entered in the custom system config as described in Supported system configuration.

  • The Acme key shown below is an example. Replace it with the actual key provided by SDE Support.

  • sudo subscription-manager remove --all && sudo subscription-manager unregister && sudo subscription-manager clean

  • sudo subscription-manager register --org="Default_Organization" --activationkey="Acme-rhel7-81ab92bb-54ff-4aaaa-9122-ab4ac1e1241e"

Error running sde reprovision

No outbound access to updates.sdelements.com or anvil.sdelements.com

Run in offline mode: sde reprovision --offline-mode

Nginx TLS/SSL error

Check whether the Nginx certificate specifies the server’s fully qualified domain name (FQDN).

Update the certificate to match the server’s FQDN. Re-run reprovision.

Malformed /etc/hosts file

Check if localhost is defined in /etc/hosts

Update the hosts file. Re-run reprovision.

Limited disk space

Lingering PostgreSQL temp files

Check the pgsql_tmp directory for large temp files

Remove the temp files
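
The check for large PostgreSQL temp files can be scripted. The sketch below is one possible approach, not an SD Elements tool: find_large_files is a hypothetical helper, the directory to scan is passed in because the location of pgsql_tmp is not given here, and the size threshold is an arbitrary choice.

```python
import os


def find_large_files(root, min_bytes):
    """Return (path, size) pairs for files under root that are at least
    min_bytes in size, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # A temp file can disappear while the database is running.
                continue
            if size >= min_bytes:
                found.append((path, size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Call it with the path of the pgsql_tmp directory on your server and a threshold such as 100 * 1024 * 1024 (100 MB), then review the listed files before removing anything.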

Upgrades

The table below provides steps to resolve upgrade issues.

Symptom Reason Verification Next steps

Connection problem to updates.sdelements.com or anvil.sdelements.com

No outbound network access from the server to updates.sdelements.com or anvil.sdelements.com

Run commands:

ping updates.sdelements.com
ping anvil.sdelements.com
  • If ping succeeds, basic network access is confirmed, but a firewall exception may still be required.

  • Request a firewall exception from your IT team or configure an HTTP proxy.

Network access limited by local firewall

Run command:

iptables -nL

Network access limited by firewall

Run commands:

tracepath updates.sdelements.com/443
tracepath anvil.sdelements.com/443
  • If tracepath returns successfully, the update servers should be accessible.

  • Request a firewall exception from your IT team or configure an HTTP proxy.

Update fails

An unexpected issue was encountered in the updater.

Upgrade the SD Elements updater and try the upgrade again.

Check /docs/sde/log/deploy.log for errors and reach out to SD Elements Support with the details.
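
A successful ping only shows that ICMP traffic gets through. It does not show that a TCP connection on the port used by the update servers can be opened. The sketch below tests the TCP connection directly. It assumes the servers are reached on port 443, as the tracepath commands above suggest, and it does not go through an HTTP proxy. can_connect is a hypothetical helper name.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS resolution failures, refused connections, and timeouts.
        return False
```

For example, can_connect("updates.sdelements.com") returning False while ping succeeds points to a firewall rule that blocks the port rather than the host.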

Issue Tracker, Scanner & LDAP integration

The table below provides troubleshooting guidance for issues regarding integration with an Issue Tracker, scanner or LDAP server.

Symptom Reason Verification Next steps

Invalid server or server unreachable

Connection details are invalid

Verify the connection details are correct.

Update the connection with the correct information and retry.

No network access to server.host.name

Run command:

ping server.host.name
  • If ping is successful then network access is confirmed: only a firewall exception may be required.

  • If ping fails then DNS cannot resolve the IP of the other server or the server is unreachable.

    • Request a firewall exception from your IT team or configure an HTTP proxy.

    • Update the system so that server.host.name resolves to an IP.

Network access limited by local firewall

Run command:

iptables -nL

Network access limited by firewall

Run command:

tracepath server.host.name/port
  • If tracepath returns successfully, the server should be accessible. Try the integration again.

  • If tracepath fails, request a firewall exception from your IT team or configure an HTTP proxy.

Network access is limited by a transparent proxy

A transparent proxy may be at issue if outbound network access is already confirmed for other external systems but not for this server.

Transparent proxies allow companies to control traffic without burdening systems with configuration. Contact the IT team for details and request a whitelist to the desired endpoint, if needed.

Network access is limited by an IPS (Intrusion Prevention System)

Check with the IT team whether traffic to server.host.name:port is filtered by an IPS rule.

Investigate why the rule is being triggered. Request an exception for the specific server.

TLS/SSL validation error

HTTPS connection fails certificate validation

Verify the TLS/SSL connection

Add the certificate to the system

Connection relies on a proxy that rewrites TLS/SSL certificates or its own certificate is untrusted.

Check that the proxy’s certificate or its CA certificate is trusted by the system: Validate TLS/SSL connection to the proxy.

Add the HTTPS proxy certificate to the system

TLS/SSL connection error

HTTPS connection to server requires Server Name Indication (SNI) support

Contact the SD Elements product team to prioritize SNI support

HTTPS connection fails due to cipher or protocol error

Validate a TLS/SSL connection.

Investigate whether the target server supports minimum TLS security settings. For example, SSLv3 is not supported.

Jobs stuck or not working

Celery needs a restart

Application shows jobs stuck in status "Waiting…" for more than 10 minutes

On the SSH console run:

sde supervisor restart all

Inconsistent connection

DNS issue

If connection to a server fails intermittently, the problem may be due to a flaky DNS lookup.

Add an entry to /etc/hosts for the server and retry.

Job unexpectedly fails

Integration issue or unsupported server

Examine celery logs for the error.

Timeout reached

Examine celery logs for a SoftTimeLimitExceeded() or similar error.

Extend the job timeout

Integration server error

Examine celery logs.

  • If the error is of the form "HTTP 5xx" it is a server or gateway error.

    • Contact the integration server owner or network team to investigate.

    • Verify that the integration server is patched and up-to-date.

    • If the problem persists and no resolution is found, open a support case about the error.

LDAP SSO error

Use the in-app troubleshooting mechanism.

  • Use ldapsearch if the web interface cannot surface the underlying issue.

Missing LDAP configuration

Examine /docs/sde/local_settings for LDAP configuration

Update local_settings with the appropriate configuration. Restart Apache.
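
When examining celery logs for the "Job unexpectedly fails" cases above, a small filter can separate timeouts from server errors. The sketch below makes assumptions: the exact log format is not documented here, so it only searches for the SoftTimeLimitExceeded name and for text of the form "HTTP 5xx" as described in the table. classify_log_lines is a hypothetical helper.

```python
import re

# Assumed patterns, based only on the error names mentioned in the table.
TIMEOUT_PATTERN = re.compile(r"SoftTimeLimitExceeded")
SERVER_ERROR_PATTERN = re.compile(r"\bHTTP 5\d\d\b")


def classify_log_lines(lines):
    """Group log lines into 'timeout' and 'server_error' buckets."""
    findings = {"timeout": [], "server_error": []}
    for line in lines:
        text = line.rstrip("\n")
        if TIMEOUT_PATTERN.search(text):
            findings["timeout"].append(text)
        if SERVER_ERROR_PATTERN.search(text):
            findings["server_error"].append(text)
    return findings
```

Pass it an open log file, for example classify_log_lines(open(path)), where path is the location of the celery log on your server. Lines in the timeout bucket suggest extending the job timeout. Lines in the server_error bucket suggest contacting the owner of the integration server.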

Capture detailed integration logs

Diagnosing integration issues is aided greatly by detailed logs between SD Elements and the other server. Follow the steps below to collect verbose logs for a problematic integration.

Prerequisites:
  • SSH credentials for sde_admin or sudo access.

  • Application Super User access.

Steps:
  1. Log in to the SD Elements web application as a Super User.

  2. Open the problematic integration connection.

  3. Enable the Debug Mode option.

  4. Access the SD Elements server SSH console as sde_admin.

  5. Run command:

    sde manage_django run_session_capture_server 2> /docs/sde/log/debug_integration.log
  6. Run the problematic integration until it completes.

  7. Disable the Debug Mode option on the integration.

  8. Cancel the run_session_capture_server command by pressing Ctrl-C.

The full integration logs are captured in the file debug_integration.log.

Caution
Credentials are stored as cleartext in the log file. Remove the file from the system as soon as possible.
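
Because the log contains cleartext credentials, it helps to limit who can read it while it is being analyzed and to delete it afterwards. The sketch below shows one way to do both with the Python standard library. The helper names are hypothetical, and os.remove only unlinks the file: it does not overwrite the data on disk.

```python
import os
import stat


def restrict_to_owner(path):
    """Limit the file to owner read/write (mode 0600) while it is still needed."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def remove_if_present(path):
    """Delete the file. Return True if it existed, False otherwise."""
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True
```

For the capture above, the path would be /docs/sde/log/debug_integration.log.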

Modify the application job timeout

Integrations with Issue Tracker systems, scanning tools, and LDAP servers are run by the Celery process. By default these jobs time out after 10 minutes.

To modify the job timeout to 15 minutes, for example, follow the steps below.

Prerequisites:
  • SSH credentials for sde_admin

Steps:
  1. Access the SD Elements server SSH console as sde_admin.

  2. Update the file /docs/sde/local_settings and set:

    CELERY_JOB_TASK_SOFT_TIME_LIMIT = 15 * 60
  3. Save the file and run:

    sde supervisor restart all
    sde apache restart

New jobs now time out after 15 minutes.
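
The setting is expressed in seconds, so 15 * 60 evaluates to 900. To confirm the value in the file without importing it, the sketch below parses the text and evaluates only integer literals and products of them. It assumes local_settings uses plain Python assignment syntax, as the example line suggests. read_int_setting is a hypothetical helper.

```python
import ast


def _eval_int(node):
    """Evaluate an integer literal or a product of integer literals."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult):
        return _eval_int(node.left) * _eval_int(node.right)
    raise ValueError("unsupported expression for an integer setting")


def read_int_setting(source, name):
    """Return the last integer assigned to name at the top level of source,
    or None if name is never assigned."""
    value = None
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    value = _eval_int(node.value)
    return value
```

Reading the file contents and calling read_int_setting(text, "CELERY_JOB_TASK_SOFT_TIME_LIMIT") should give 900 for the example above.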

Modify the GitHub API delay rate

The default delay between GitHub issue creation and updates is 20 seconds. This delay is necessary to adhere to GitHub’s secondary rate limiting for certain content creation endpoints.

If you are a GitHub Enterprise client and have rate limiting disabled, or if the default delay is insufficient for syncing all project tasks, follow the steps below to modify the delay value.

Prerequisites:
  • SSH credentials for sde_admin.

Steps:
  1. Access the SD Elements server SSH console as sde_admin.

  2. Open the file /docs/sde/local_settings and add or update the following line:

    SDETOOLS_GITHUB_API_REQUEST_DELAY_SECONDS = 30
  3. Save the file and run:

    sde supervisor restart all
    sde apache restart

The example increases the delay from 20 seconds to 30 seconds. Use a value that works best for your use case.
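
A longer delay also lengthens each sync job, so it can be worth comparing the delay with the job timeout described earlier. The sketch below is a rough estimate only. It assumes one delay is applied per GitHub request and ignores the time each request takes, neither of which is confirmed here. Both function names are hypothetical.

```python
def estimated_sync_seconds(request_count, delay_seconds=20):
    """Rough estimate of sync duration: one delay per request,
    ignoring the time spent on the requests themselves."""
    if request_count < 0 or delay_seconds < 0:
        raise ValueError("request_count and delay_seconds must be non-negative")
    return request_count * delay_seconds


def fits_in_timeout(request_count, delay_seconds=20, job_timeout_seconds=10 * 60):
    """Return True if the estimated duration stays within the job timeout."""
    return estimated_sync_seconds(request_count, delay_seconds) <= job_timeout_seconds
```

Under these assumptions, 40 requests at the default 20-second delay take about 800 seconds, which exceeds the default 10-minute job timeout. In that case the job timeout may need to be extended as well.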

Open a support case using the support portal

If the system is acting abnormally, reach out to SD Elements Support with details about the issue. You will receive a response within 3 business hours.

Prerequisites:
Steps:
  1. Open the support portal at https://support.sdelements.com

  2. Click Submit a request.

  3. Enter your details:

    • Email address: A way for the support team to respond to you directly.

    • Subject: A brief description of the issue.

    • Description: The issue you experienced and the steps to reproduce it. Include your system versions.

    • Attachments: Screenshots, log files, or other information helpful to better understand and diagnose your issue.

  4. Click Submit.

A new support ticket is created for your issue. You will receive a confirmation email shortly.

Open a support case using email

If the system is acting abnormally, reach out to SD Elements Support with details about the issue. You will receive a response within 3 business hours.

Prerequisites:
Steps:
  1. Compose a new email to support@sdelements.com

    • Subject: A brief description of the issue.

    • Body: The issue you experienced and the steps to reproduce it. Include your system versions.

    • Attachments: Screenshots, log files, or other information helpful to better understand and diagnose your issue.

  2. Click Send.

A new support ticket is automatically created for your issue. You will receive a confirmation email shortly.
