November 2020 Release Notes
New User Interface
The July 2020 Release made it possible for users to opt in to a preview of our new user interface (UI). With this release, the new UI is now the default for all users. The July 2020 Release Notes (https://trustgrid.atlassian.net/wiki/spaces/DOCS/pages/476643332/July+2020+Release+Notes) give a full breakdown of all the changes and improvements, but here is a quick list:
A new top navigation bar with “breadcrumbs” that makes moving between pages easier
Subpage tabs have been moved to a left-side navigation bar
A new omnibar search option at the top of the page can help you quickly jump to the resource you need
Numerous changes to the Node details page, including consolidating all graphs onto a single page and adding visual indicators for connectivity status
Support for the WireGuard VPN protocol was added
The Alerts table was renamed Events and links were added to the Events table so you can jump straight to the impacted Node
New Alarms System
The previous Alarm system operated as follows:
An Event would be generated by a node or the Trustgrid cloud based on a change in status (e.g. a node disconnects).
If the Event matched the conditions of a configured Filter, an Alert would be generated and sent to the configured notification integrations (e.g. email, PagerDuty, OpsGenie, Slack or Teams).
The new system introduces a few new terms. It operates like this:
An Event is generated based on a change in status. This is the same as before.
The Event will now generate an Alert.
The Alert is compared to the conditions defined within an Alarm.
Matching Alarms will send the details to any selected Channel.
The Channel will then push the information through the configured notification integrations (e.g. email, PagerDuty, OpsGenie, Slack or Teams).
As part of the release deployment, Trustgrid will convert all existing Filters to Alarms and Channels automatically. No action is required to maintain the current behavior.
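The Event, Alert, Alarm and Channel flow described above can be sketched in a few lines of Python. This is only an illustrative model, not Trustgrid's implementation or API: the class names mirror the terms in these notes, while the fields, the event type strings and the callables standing in for integrations are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Event:
    node: str
    event_type: str  # hypothetical type names, e.g. "Node Disconnect"


@dataclass
class Alert:
    event: Event


@dataclass
class Channel:
    name: str
    # Each callable stands in for one integration (email, Slack, ...).
    integrations: List[Callable[[Alert], None]] = field(default_factory=list)

    def push(self, alert: Alert) -> None:
        for send in self.integrations:
            send(alert)


@dataclass
class Alarm:
    name: str
    matches: Callable[[Alert], bool]  # the Alarm's conditions
    channels: List[Channel] = field(default_factory=list)


def handle_event(event: Event, alarms: List[Alarm]) -> Alert:
    alert = Alert(event)  # every Event now generates an Alert
    for alarm in alarms:
        if alarm.matches(alert):  # Alert compared to Alarm conditions
            for channel in alarm.channels:
                channel.push(alert)  # Channel pushes to its integrations
    return alert


sent = []
ops = Channel("ops", [lambda a: sent.append(("email", a.event.node))])
disconnects = Alarm(
    "disconnects",
    lambda a: a.event.event_type == "Node Disconnect",
    [ops],
)
handle_event(Event("edge-1", "Node Disconnect"), [disconnects])
handle_event(Event("edge-1", "Node Connect"), [disconnects])
```

In this sketch only the first event matches the Alarm, so the single stand-in integration is called once.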
Resolvable Alerts
The new system introduces the idea of alerts being resolved. While an alert is in the unresolved state, any repeat of that event will not generate a new alert. This should reduce the noise of repetitive alerts.
There are three ways to resolve alerts:
Manually in the portal.
This can be done for all Alerts across the organization via the Alarms > Events page.
Or, to resolve alerts for a specific node, you can click the Info panel at the top right.
Alerts will resolve themselves after 24 hours.
Alerts may resolve if a corresponding event indicates the status has changed. An example would be the node connectivity events.
When a node disconnects, it would generate an alert. If you have a system such as PagerDuty configured, this would create an incident within that system.
When the node reconnects, the alert would be resolved within the Trustgrid system, and the corresponding PagerDuty incident would be resolved automatically.
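The resolution rules above (repeats are ignored while an alert is unresolved, alerts expire after 24 hours, and a corresponding event can resolve an alert) can be modelled roughly as follows. This is a sketch under stated assumptions, not the actual Trustgrid logic: the `AlertTracker` class, the event type names, the `RESOLVES` mapping and the choice to key alerts by node and event type are all hypothetical.

```python
from datetime import datetime, timedelta

AUTO_RESOLVE_AFTER = timedelta(hours=24)

# Hypothetical mapping: clearing event type -> problem event type it resolves.
RESOLVES = {"Node Connect": "Node Disconnect"}


class AlertTracker:
    def __init__(self):
        # (node, event_type) -> time the unresolved alert was raised
        self.unresolved = {}

    def resolve(self, node, event_type):
        """Manual resolution, as you would do in the portal."""
        self.unresolved.pop((node, event_type), None)

    def on_event(self, node, event_type, now):
        """Return True if this event raises a new unresolved alert."""
        # Alerts resolve themselves once they are 24 hours old.
        for key, raised in list(self.unresolved.items()):
            if now - raised >= AUTO_RESOLVE_AFTER:
                del self.unresolved[key]

        # A corresponding event resolves the earlier alert. For simplicity,
        # this sketch does not track the clearing event itself.
        problem = RESOLVES.get(event_type)
        if problem is not None:
            self.unresolved.pop((node, problem), None)
            return False

        key = (node, event_type)
        if key in self.unresolved:
            return False  # repeat of an unresolved alert: no new alert
        self.unresolved[key] = now
        return True


t0 = datetime(2020, 11, 1, 12, 0)
tracker = AlertTracker()
first = tracker.on_event("edge-1", "Node Disconnect", t0)
repeat = tracker.on_event("edge-1", "Node Disconnect", t0 + timedelta(minutes=5))
tracker.on_event("edge-1", "Node Connect", t0 + timedelta(minutes=10))
after_reconnect = tracker.on_event(
    "edge-1", "Node Disconnect", t0 + timedelta(minutes=15)
)
after_expiry = tracker.on_event(
    "edge-1", "Node Disconnect", t0 + timedelta(hours=25)
)
```

Here the second disconnect is suppressed as a repeat, while the disconnects after the reconnect and after the 24-hour expiry each raise a new alert.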
Alarms
The Alarm configuration is very similar to the previous Filter configuration but has been enhanced to allow for greater flexibility.
You can now select multiple Nodes, Event Types or Tag matches. Previously you’d have to create a filter for each.
You can choose whether All, Any or None of the criteria must match to trigger the alarm. Previously all criteria had to be true for a match to occur.
The notification integration has been moved to Channels.
Also, the “Extra Info to send with alert” section has been renamed Description, but the contents are still included in the Alert details sent to any integration.
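The All, Any and None options can be thought of as three ways of combining the individual criteria checks. The sketch below is an assumption-laden illustration, not the portal's real evaluation logic: `alarm_matches`, the dictionary shape of the alert, the node names, the tag and the event type are all hypothetical, and it assumes that selecting several values within one criterion means "any of these values".

```python
def alarm_matches(mode, criteria, alert):
    """Combine per-criterion results according to the chosen mode."""
    results = [check(alert) for check in criteria]
    if mode == "all":
        return all(results)
    if mode == "any":
        return any(results)
    if mode == "none":
        return not any(results)
    raise ValueError(f"unknown mode: {mode!r}")


alert = {
    "node": "edge-1",
    "event_type": "Node Disconnect",
    "tags": {"site": "dallas"},
}

criteria = [
    lambda a: a["node"] in {"edge-1", "edge-2"},        # multiple Nodes
    lambda a: a["event_type"] in {"Node Disconnect"},   # Event Types
    lambda a: a["tags"].get("site") == "austin",        # Tag match
]

match_all = alarm_matches("all", criteria, alert)
match_any = alarm_matches("any", criteria, alert)
match_none = alarm_matches("none", criteria, alert)
```

With this alert the node and event type checks pass but the tag check fails, so only the Any mode would trigger the alarm.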
Channels
In the old system, you had to define the notification integration information (email addresses, API keys and webhooks) for each filter you defined. If one of these changed, you had to update each filter manually.
Channels separate the integration information from the Alarm configuration. Within a channel, you define one or more notification integrations. Then from the Alarm, you can select one or more channels.
Testing Alarms & Channels
After you’ve configured your Alarms and Channels, you can test them to confirm they behave as expected.
Navigate to Alarms > Events
Find an Event that matches your Alarm criteria (or that you expect to match)
Click the Test button to the right of the event
Alert Suppression
A brand new feature is the ability to configure suppression windows that will stop any Alarms from triggering. This lets you perform maintenance without generating extraneous notifications in your integrated systems.
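Conceptually, a suppression window is a time range during which alarm evaluation is skipped. The following minimal sketch assumes each window is a simple start and end pair with the end excluded; the `is_suppressed` helper and the example times are hypothetical and do not reflect how the portal stores or evaluates windows.

```python
from datetime import datetime


def is_suppressed(now, windows):
    """Return True if `now` falls inside any (start, end) window."""
    return any(start <= now < end for start, end in windows)


# Hypothetical maintenance window: 02:00 to 04:00 on 14 November 2020.
windows = [(datetime(2020, 11, 14, 2, 0), datetime(2020, 11, 14, 4, 0))]

during = is_suppressed(datetime(2020, 11, 14, 3, 0), windows)
after = is_suppressed(datetime(2020, 11, 14, 4, 30), windows)
```

An alarm pipeline would check `is_suppressed` before triggering, so an event at 03:00 sends nothing while one at 04:30 is handled normally.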
Speed Test
Finally, we’ve added the ability to test the WAN connection of a node. The test measures the bandwidth between the Node and the Trustgrid control plane.
To run the tool:
Navigate to the desired Node’s page
Select Network > Interfaces
Under “Interface Tools” click the Speed Test button
Click the GO button to start the test. It will test both upload and download speeds.
After the test completes, the average results are displayed.