November 2020 Release Notes

New User Interface

The July 2020 Release made it possible for users to opt in to a preview of our new user interface (UI). With this release, the new UI is now the default for all users. The page at https://trustgrid.atlassian.net/wiki/spaces/DOCS/pages/476643332 gives a full breakdown of all the changes and improvements, but here is a quick list:

  • A new top navigation bar with “breadcrumbs” that makes moving between pages easier

  • Tabbed subpages have been moved to a left-side navigation bar

  • A new omnibar search option at the top of the page can help you quickly jump to the resource you need

  • Numerous changes to the Node details page including consolidating all graphs to a single page and adding visual indicators for connectivity status

  • Support for the WireGuard VPN protocol was added

  • The Alerts table was renamed Events and links were added to the Events table so you can jump straight to the impacted Node

New Alarms System

The previous Alarm system operated as follows:

  1. An Event would be generated by a node or the Trustgrid cloud based on a change in status (e.g. a node disconnects).

  2. If the Event matched the conditions of a configured Filter then an Alert would be generated and sent to the configured notification integrations (e.g. email, PagerDuty, OpsGenie, Slack or Teams)

The new system introduces a few new terms. It operates like this:

  1. An Event is generated by a node or the Trustgrid cloud based on a change in status. This is the same as before.

  2. The Event will now generate an Alert.

  3. The Alert is compared to the conditions defined within an Alarm.

  4. Matching Alarms will send the details to any selected Channel.

  5. The Channel will then push the information through the configured notification integrations. (e.g. email, PagerDuty, OpsGenie, Slack or Teams)
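The five steps above can be sketched in a few lines of Python. This is only an illustrative model of the flow, not Trustgrid's implementation: every class, function, and event type name here (Event, Alert, Alarm, Channel, handle_event, "Node Disconnect") is hypothetical, and each integration is a plain callable standing in for email, Slack, and so on.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical names throughout: this models the flow described in the
# release notes, not Trustgrid's actual implementation or API.

@dataclass
class Event:
    node: str
    event_type: str  # e.g. "Node Disconnect" (illustrative label)

@dataclass
class Alert:
    event: Event

@dataclass
class Channel:
    name: str
    # Each integration is a callable standing in for email, Slack, etc.
    integrations: List[Callable[[Alert], None]] = field(default_factory=list)

    def push(self, alert: Alert) -> None:
        for send in self.integrations:
            send(alert)

@dataclass
class Alarm:
    name: str
    matches: Callable[[Alert], bool]
    channels: List[Channel] = field(default_factory=list)

def handle_event(event: Event, alarms: List[Alarm]) -> Alert:
    alert = Alert(event)                    # step 2: the Event generates an Alert
    for alarm in alarms:
        if alarm.matches(alert):            # step 3: compare Alert to Alarm conditions
            for channel in alarm.channels:  # step 4: send to each selected Channel
                channel.push(alert)         # step 5: Channel pushes to integrations
    return alert

sent: List[str] = []
ops = Channel("ops", [lambda a: sent.append("email: " + a.event.node)])
disconnects = Alarm(
    "disconnects",
    lambda a: a.event.event_type == "Node Disconnect",
    [ops],
)

handle_event(Event("edge-1", "Node Disconnect"), [disconnects])  # matches, notifies
handle_event(Event("edge-1", "Node Connect"), [disconnects])     # no match
print(sent)
```

Only the first event reaches the channel, because the second does not match the Alarm's condition.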

As part of the release deployment, Trustgrid will convert all existing Filters to Alarms and Channels automatically. No action is required to maintain the current behavior.

Resolvable Alerts

The new system introduces the idea of alerts being resolved. While an alert is in the unresolved state any repeat of that event will not generate a new alert. This should reduce the noise of repetitive alerts.

There are three ways to resolve alerts:

  1. Manually in the portal.

    1. This can be done for all Alerts across the organization via the Alarms > Events page.

    2. Or, to resolve alerts for a specific node you can click the Info panel at the top right.

  2. Alerts will resolve themselves after 24 hours.

  3. Alerts may resolve if a corresponding event indicates the status has changed. An example would be the node connectivity events.

    1. When a node disconnects it would generate an alarm. If you have a system such as PagerDuty configured this would create an incident within that system.

    2. When the node reconnects the alarm would be resolved within the Trustgrid system. In the PagerDuty system the incident would automatically be resolved.
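To make the resolution rules concrete, here is a minimal sketch of how repeat suppression and the three resolution paths could interact. It is a model under stated assumptions, not the actual Trustgrid logic: the AlertTracker class, the RESOLVED_BY mapping, the event type names, and the choice that a resolving event never opens an alert of its own are all hypothetical. Only the 24-hour auto-resolve period comes from the notes above.

```python
from datetime import datetime, timedelta

AUTO_RESOLVE_AFTER = timedelta(hours=24)  # automatic resolution, per the notes

# Hypothetical mapping: event type -> the alert type it resolves.
RESOLVED_BY = {"Node Connect": "Node Disconnect"}

class AlertTracker:
    def __init__(self):
        # (node, event_type) -> time the unresolved alert was raised
        self.unresolved = {}

    def resolve(self, node, event_type):
        """Manual resolution, as done from the portal. True if an alert was open."""
        return self.unresolved.pop((node, event_type), None) is not None

    def on_event(self, node, event_type, now):
        # Drop alerts that have been open for 24 hours or more.
        self.unresolved = {
            key: raised for key, raised in self.unresolved.items()
            if now - raised < AUTO_RESOLVE_AFTER
        }
        if event_type in RESOLVED_BY:
            # Assumption: a resolving event never opens an alert itself.
            cleared = self.resolve(node, RESOLVED_BY[event_type])
            return "resolved" if cleared else "ignored"
        if (node, event_type) in self.unresolved:
            return "duplicate"  # repeat of an unresolved alert: no new alert
        self.unresolved[(node, event_type)] = now
        return "new"

tracker = AlertTracker()
t0 = datetime(2020, 11, 1, 12, 0)
results = [
    tracker.on_event("edge-1", "Node Disconnect", t0),
    tracker.on_event("edge-1", "Node Disconnect", t0 + timedelta(hours=1)),
    tracker.on_event("edge-1", "Node Connect", t0 + timedelta(hours=2)),
    tracker.on_event("edge-1", "Node Disconnect", t0 + timedelta(hours=3)),
    tracker.on_event("edge-1", "Node Disconnect", t0 + timedelta(hours=28)),
]
print(results)
```

The second disconnect is treated as a repeat, the reconnect resolves the open alert, and the last disconnect raises a new alert because the previous one had been open for more than 24 hours.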

Alarms

The Alarm configuration is very similar to the previous Filter configuration but has been enhanced to allow for greater flexibility.

  • You can now select multiple Nodes, Event Types or Tag matches. Previously you had to create a separate filter for each.

  • You can choose whether All, Any or None of the criteria must match to trigger the alarm. Previously all criteria had to be true for a match to occur.

  • The notification integration settings have been moved to Channels.

Also, the “Extra Info to send with alert” section has been renamed Description but the contents are still included in the Alert details sent to any integration.
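The All, Any and None options can be read as three ways of combining the same list of criteria. The sketch below illustrates that reading only. The alarm_matches function, the dictionary layout of the alert, and the node, event type and tag values are all hypothetical and do not reflect the portal's internal data model.

```python
from typing import Callable, Dict, List

Alert = Dict[str, object]  # hypothetical alert representation

def alarm_matches(mode: str, criteria: List[Callable[[Alert], bool]], alert: Alert) -> bool:
    """Combine individual criteria according to the selected mode."""
    results = [criterion(alert) for criterion in criteria]
    if mode == "all":
        return all(results)
    if mode == "any":
        return any(results)
    if mode == "none":
        return not any(results)
    raise ValueError("unknown mode: " + mode)

alert = {
    "node": "edge-1",
    "event_type": "Node Disconnect",
    "tags": {"site": "austin"},
}

criteria = [
    lambda a: a["node"] in {"edge-1", "edge-2"},       # multiple Nodes
    lambda a: a["event_type"] in {"Node Disconnect"},  # Event Types
    lambda a: a["tags"].get("site") == "dallas",       # Tag match (fails here)
]

for mode in ("all", "any", "none"):
    print(mode, alarm_matches(mode, criteria, alert))
```

With this alert, two of the three criteria hold, so only the Any mode triggers. Under the old Filter behavior (equivalent to All), the same alert would not have matched.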

Channels

In the old system, you had to define the notification integration details (email addresses, API keys, and webhooks) for each filter you defined. If one of these changed, you had to update each filter manually.

Channels separate the integration information from the Alarm configuration. Within a channel, you define one or more notification integrations. Then from the Alarm, you can select one or more channels.

Testing Alarms & Channels

After you’ve configured your Alarms and Channels you can test them to confirm they behave as expected.

  1. Navigate to Alarms > Events

  2. Find an Event that matches your Alarm criteria (or that you think should match)

  3. Click the Test button to the right of the event

Alert Suppression

A brand new feature is the ability to configure suppression windows that will stop any Alarms from triggering. This should enable performing maintenance without generating extraneous notifications to your integrated systems.
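As a rough illustration of the idea, a suppression window can be thought of as a time range during which a matching Alarm does not notify. The sketch below assumes simple one-off windows with an exclusive end time. The function names, the window format, and the dates are hypothetical, and details such as recurrence, time zones or per-node scoping are not modeled.

```python
from datetime import datetime
from typing import List, Tuple

Window = Tuple[datetime, datetime]  # (start, end); end treated as exclusive here

def is_suppressed(now: datetime, windows: List[Window]) -> bool:
    """True if 'now' falls inside any suppression window."""
    return any(start <= now < end for start, end in windows)

def should_notify(alarm_matched: bool, now: datetime, windows: List[Window]) -> bool:
    """A matching Alarm only notifies outside of suppression windows."""
    return alarm_matched and not is_suppressed(now, windows)

# Hypothetical two-hour maintenance window.
maintenance = [(datetime(2020, 11, 14, 2, 0), datetime(2020, 11, 14, 4, 0))]

during = should_notify(True, datetime(2020, 11, 14, 3, 0), maintenance)
after = should_notify(True, datetime(2020, 11, 14, 4, 0), maintenance)
print(during, after)
```

An alarm that matches at 03:00 is held back, while one that matches at 04:00, once the window has closed, notifies as usual.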

Speed Test

Finally, we’ve added the ability to test the WAN connection of a node. This will test the bandwidth between the Node and the Trustgrid control plane.

To run the tool:

  1. Navigate to the desired Node’s page

  2. Select Network > Interfaces

  3. Under “Interface Tools” click the Speed Test button

  4. Click the GO button to start the test. It will test both upload and download speeds.

  5. After it completes it will display the average results.