Using vRealize Log Insight to Manage and Troubleshoot NSX

 
I recently wrote a post on how to Deploy vRealize Log Insight with the NSX Content Pack. That post outlined the initial installation, the redirection of NSX component logging to vRLI, and the installation of the NSX Content Pack. In this post, I want to show the cool features of the content pack, how to use it for day-to-day operations, and how to use it to troubleshoot NSX. Let’s start with creating and managing dashboards.
 

Creating and Managing Dashboards

Custom Dashboard

 
To create a custom dashboard, navigate to Interactive Analytics at the top. From here, you can build your own filter logic to surface configuration changes, issues, or anything else in your NSX logs that matters to you. As an example, I created a dashboard that looks for any “ERROR” or “warning” logs on all three of my NSX controllers, using a custom time range of “All Time.”
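To make the filter logic concrete, here is a minimal Python sketch of what that query expresses: keep an event only if it came from one of the three controllers and its text mentions one of the keywords. This is not how vRLI evaluates filters internally. The hostnames, the `LogEvent` type, and the case-insensitive matching are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical controller hostnames; substitute the names your controllers log under.
CONTROLLERS = {"nsx-controller-1", "nsx-controller-2", "nsx-controller-3"}
KEYWORDS = ("error", "warning")


@dataclass
class LogEvent:
    hostname: str
    text: str


def matches_filter(event: LogEvent) -> bool:
    """True when the event came from a controller and mentions a keyword."""
    if event.hostname not in CONTROLLERS:
        return False
    lowered = event.text.lower()  # assumption: match keywords case-insensitively
    return any(keyword in lowered for keyword in KEYWORDS)


# Made-up sample events, only to exercise the filter.
events = [
    LogEvent("nsx-controller-1", "ERROR failed to connect to peer"),
    LogEvent("nsx-controller-2", "INFO heartbeat ok"),
    LogEvent("esx-host-01", "warning: disk latency high"),
    LogEvent("nsx-controller-3", "warning: cluster majority lost"),
]
matching = [event for event in events if matches_filter(event)]
```

Only the first and last sample events pass: the second has no keyword, and the third comes from a host that is not a controller.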
 
 

Modifying Existing Dashboards

 
If you like the pre-created dashboards but want to modify them slightly, click the magnifying glass icon on the top right of the dashboard.
 
 
It will bring you to Interactive Analytics with the logic already populated. You can add, change, or remove any logic that you would like. Once you are happy with your changes, click Add to Dashboard at the top right. You will need to provide a name and any relevant notes, and choose which dashboard to save the configuration to.
 
 

Renaming Dashboards, or Creating new Dashboards

 
You will find your new Dashboard by clicking the top left pane, and selecting My Dashboards. If you would like to rename your Dashboard, or create more Dashboards, you can find the menu option here. Alternatively, you can create a new Dashboard by clicking New Dashboard at the bottom left of the same pane.
 
 

Favorites, Alerts, Exporting and Sharing Queries

 
On the right side of the filter logic is a group of icons. From here, you can add the current query to favorites, add it to a dashboard, create an alert, or export and share the query.
 
 

Adding a query to Favorites

 
If you don’t want to add the query to a dashboard and would rather just store it as a favorite so that you can come back and review the output or logic, click the Star icon. Be sure to provide a name and notes to make it easier to find if you end up favoriting a lot of queries.
 
 
To find your favorites, click the Star Icon in the logic pane, and select it from the dropdown.
 
 

Creating and Managing Alerts

 
To create an alert from a query, click the alert icon in the group of icons to the right of the filter logic.
 
 
Configure the name, any notes, and the email address that the alert should be sent to, then click Save.
 
 
If you want to receive emails for the pre-populated dashboards, click Alerts, then Manage Alerts, and check the boxes for the email alerts that you would like to receive. Then click Enable and provide an email address. The next time an alert is logged for the dashboard you have chosen, an email will be sent with the log entries.
 
 

Exporting event results, chart data, or sharing queries

 
To export event results, click the export icon in the same group of icons. The events export uses a .txt file, and the chart data export lets you choose .json or .csv. One thing that I thought was pretty awesome was being able to share the query. If you want to share the query with a colleague, click the Share Query button and it will give you a link that you can pass along.
 
 

Troubleshooting NSX with vRLI

 
Next, I want to show everyone how easy it can be to troubleshoot NSX issues with vRLI. I will reproduce a couple of problems in my lab, and use the dashboards and queries to determine what is causing a specific problem, whether it is a configuration issue, a service crashing, or something else entirely.
 

Using Dashboards to troubleshoot and identify issues

 
I won’t go over all of the dashboards, or every troubleshooting or configuration issue, as this post would be too long. However, I do want to give a couple of real-world examples of using the dashboards in vRLI to resolve configuration or health issues.
 

Example One: Quickly Identify Health Issues

 
I think one of the best dashboards is the NSX-vSphere – Infrastructure tab, because it shows control plane, management plane, and data plane connectivity. I will start by powering off an NSX controller to see what is reported. You can see the controller is disconnected from NSX Manager.
 
 
In vRLI, there are errors being reported from the manager. To view what errors are being reported, hover over the bar graph, click, and select Interactive Analysis.
 
 
It will bring you to the log filter logic and display the appropriate logs. As you can see, it is showing that Controller-1 is failing to connect.
 
 
If you are experiencing any issues with NSX, whether in the control plane, the data plane, or the management plane, these dashboards will give you a graph of the number of errors reported and show you which errors are being reported in the logs. I highly recommend that this be one of the first places to go when troubleshooting!
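If it helps to picture what those bar graphs represent, the sketch below counts error events per fixed time window, which is the general idea behind a chart of errors over time. It is plain Python, not vRLI code. The five-minute window, the `bucket_counts` helper, and the sample timestamps are all assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def bucket_counts(timestamps, bucket=timedelta(minutes=5)):
    """Count events per fixed-width time bucket, keyed by the bucket's start time."""
    width = int(bucket.total_seconds())
    counts = Counter()
    for ts in timestamps:
        epoch = int(ts.timestamp())
        # Round down to the start of the bucket this event falls into.
        start = datetime.fromtimestamp(epoch - epoch % width, tz=timezone.utc)
        counts[start] += 1
    return dict(sorted(counts.items()))


# Made-up error timestamps (UTC), only to exercise the helper.
sample = [
    datetime(2016, 1, 1, 12, 1, tzinfo=timezone.utc),
    datetime(2016, 1, 1, 12, 3, tzinfo=timezone.utc),
    datetime(2016, 1, 1, 12, 7, tzinfo=timezone.utc),
    datetime(2016, 1, 1, 12, 14, tzinfo=timezone.utc),
]
counts = bucket_counts(sample)
for start, count in counts.items():
    print(start.strftime("%H:%M"), "#" * count)
```

Each printed row corresponds to one bar: the taller the bar, the more errors were logged in that window, and clicking a bar in vRLI is what takes you to the underlying events.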
 

Example Two: Troubleshooting Configuration Issues

 
A common troubleshooting scenario is the configuration of dynamic routing, or specifically the setup of OSPF. If you configure OSPF and the devices are not showing up as neighbors, not reaching the full state, or not distributing routes, you can use vRLI to determine what is causing the problem.
 
The first thing that you need to do is enable logging for OSPF. Using the vSphere Web Client, navigate to your Edge or DLR, then go to Manage -> Routing -> Global Configuration -> Dynamic Routing Configuration and click Edit. Check the box to enable logging, and select your logging level. I kept the default of INFO. Make sure to Publish Changes.
 
 
Next, go back to vRLI and navigate to the appropriate dashboard. In this case it is Logical Router Alerts.
 
 
Just looking at the Dashboard I see that there are alerts being triggered under OSPF Area ID Mismatch. There have been several errors posted in the last 5 minutes. The other dashboards don’t seem to be logging any errors, so it is pretty clear there is an Area ID mismatch.
 
 
If I click on the bar graph and select Interactive Analysis, I get more information on the area mismatch. This brings me to the section where I can review the actual logged error messages. In this case, I can easily tell that the area received is “1”, but I configured the interface to be in area “3.” I now know that I need to go back to the Edge or DLR and change the area in order for OSPF to work.
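One thing to watch for when comparing area IDs by eye: an OSPF area ID is a 32-bit value that can be written either as a plain number (3) or in dotted form (0.0.0.3). The small Python sketch below normalizes both notations before comparing them. The helper names are hypothetical, and I am assuming nothing about which notation NSX uses in its logs.

```python
import ipaddress


def normalize_area(area) -> int:
    """Return an OSPF area ID as an integer, accepting 3, "3", or "0.0.0.3"."""
    text = str(area).strip()
    if "." in text:
        # Dotted notation: reuse the IPv4 parser to get the 32-bit value.
        return int(ipaddress.IPv4Address(text))
    return int(text)


def area_mismatch(received, configured) -> bool:
    """True when the two area IDs differ after normalization."""
    return normalize_area(received) != normalize_area(configured)


# The values from the example above: received area 1, configured area 3.
example_mismatch = area_mismatch("1", "3")
# The same area written two ways is not a mismatch.
same_area = area_mismatch("0.0.0.3", 3)
print(example_mismatch, same_area)
```

With the values from my lab, the first comparison reports a mismatch, which lines up with what the dashboard flagged.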
 
 

Final Thoughts

In summary, there are a TON of awesome features and pre-built dashboards in the NSX Content Pack for vRLI. I definitely suggest playing around with the GUI to get a feel for everything you can do with the product. I was going to include all of the Distributed Firewall dashboards, but this post is getting long, and that material is so important and powerful that I decided to split it out into its own post. More to come in the next few days on using vRLI with the Distributed Firewall dashboards. Stay tuned.
 
 

Posted by:

Sean Whitney
