Domain controllers are a vital part of your network. They allow you to organize your network, and manage your users, computers, and other resources. So it’s important that you regularly check the health of your domain controllers. But how do you do that, and what signals do you need to look for?
Especially when you have multiple domain controllers running, it’s important to keep an eye on the replication between the domain controllers.
In this article, I will explain how you check the domain controllers’ health, and I have written a script that will automatically check the health of all your domain controllers.
Check Domain Controller Health
Domain controllers are the backbone of your network. Any issue with a domain controller can leave users unable to log in, and that isn’t a good way to start the day. To spot potential issues early, we need to check the health of our domain controllers regularly.
Your Windows servers come with a couple of built-in tools that allow you to check the health of your domain controllers and Active Directory. The most important one is the DCDiag tool, which will check your servers on 21 points with the default test.
To check the replication between your domain controllers you can use the RepAdmin tool. This tool allows you to quickly check if the replication is running without any errors. (You can also test the replication with DCDiag, but RepAdmin gives you more info.)
If you run these two tools regularly, then you can quickly spot potential issues.
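Run by hand, the two tools look roughly like this. It is only a minimal sketch: `DC01` is a placeholder name that you need to replace with one of your own domain controllers.

```powershell
# Placeholder name: replace DC01 with one of your own domain controllers.
$dc = 'DC01'

# Run the default DCDiag tests against a single domain controller.
dcdiag "/s:$dc"

# Summarize the replication health across all domain controllers.
repadmin /replsummary

# Show the inbound replication partners and the last result for one server.
repadmin /showrepl $dc
```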
But running these tools manually every week to check your domain controller health isn’t really going to work. The output also doesn’t show potential issues at a glance.
So I created a PowerShell script that will check the health of all your domain controllers and Active Directory. It will run all the important tests, including DCDiag, and format the results nicely in the console, or you can export it to an HTML file and email it.
The script checks the following important parts:
- DNS – Are the DNS Records and DNS Server addresses configured correctly
- Latency – Check the latency of each domain controller
- Uptime – Reads out the uptime of the domain controller
- Domain Roles – Verify each FSMO holder is reachable
- Free space – Check the free space on the system drive
- AD Database size – Reads out the AD database size
- Services – Check if all the important services (13) are up and running
- DCDiag – Runs the default DCDiag tests
- Replication – Checks last replication attempt, success, and the delta
- NTP Time offset – Checks if the NTP offset isn’t too big
Checking DNS
One of the first tests we are doing in the script is checking the DNS settings. We want to make sure that the DNS records for each domain controller are registered in the DNS.
We also want to make sure that the DNS Servers in the network adapter are configured correctly.
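Both DNS checks can be sketched with the built-in DnsClient cmdlets. The function names below are hypothetical helpers for illustration, not the names used in the script.

```powershell
# Hypothetical helper: returns $true when the name resolves to at least one A record.
function Test-DcDnsRecord {
    param([Parameter(Mandatory)][string]$ComputerName)

    $records = Resolve-DnsName -Name $ComputerName -Type A -ErrorAction SilentlyContinue
    [bool]($records | Where-Object { $_.Type -eq 'A' })
}

# Hypothetical helper: lists the IPv4 DNS servers configured on each local network adapter.
function Get-LocalDnsServer {
    Get-DnsClientServerAddress -AddressFamily IPv4 |
        Where-Object { $_.ServerAddresses } |
        Select-Object InterfaceAlias, ServerAddresses
}
```

For example, `Test-DcDnsRecord -ComputerName 'DC01'` (a placeholder name) returns `$false` when the record is missing, and `Get-LocalDnsServer` lets you verify that the adapters point to the DNS servers you expect.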
Latency
The ping test will not only check if the server is reachable but also show a warning if the latency is too high. The latency in a local network should be below 10 ms, but when you have multiple domain controllers across different sites, then the latency is fine as long as it’s below 100 ms.
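A latency check along these lines could look like the sketch below. The helper names are hypothetical, and the default threshold of 10 ms is simply the local-network value mentioned above. Pass 100 for domain controllers in other sites.

```powershell
# Hypothetical helper: classifies a round-trip time (ms) against a threshold.
function Get-LatencyStatus {
    param(
        [Parameter(Mandatory)][double]$LatencyMs,
        [double]$ThresholdMs = 10   # 10 ms for a local network, 100 ms across sites
    )
    if ($LatencyMs -lt $ThresholdMs) { 'OK' } else { 'Warning' }
}

# Hypothetical helper: pings a server once and returns its latency status.
function Test-DcLatency {
    param(
        [Parameter(Mandatory)][string]$ComputerName,
        [double]$ThresholdMs = 10
    )
    $ping = Test-Connection -ComputerName $ComputerName -Count 1 -ErrorAction SilentlyContinue
    if (-not $ping) { return 'Unreachable' }

    # PowerShell 7 returns an object with a Status property even when the ping fails.
    if ($ping.PSObject.Properties['Status'] -and "$($ping.Status)" -ne 'Success') {
        return 'Unreachable'
    }

    # Windows PowerShell 5.1 exposes ResponseTime, PowerShell 7 exposes Latency.
    $ms = if ($ping.PSObject.Properties['Latency']) { $ping.Latency } else { $ping.ResponseTime }
    Get-LatencyStatus -LatencyMs $ms -ThresholdMs $ThresholdMs
}
```

Call it with `Test-DcLatency -ComputerName 'DC01' -ThresholdMs 100` for a server in another site (`DC01` is a placeholder).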
Uptime
The uptime is mostly informational, but the script will show a warning when the server has been running for less than 12 hours.
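As a rough sketch, the uptime can be derived from the last boot time that Windows reports through CIM. The function name is a hypothetical helper, and the 12-hour limit is the value mentioned above.

```powershell
# Hypothetical helper: calculates the uptime and flags a recent restart.
function Get-UptimeStatus {
    param(
        [Parameter(Mandatory)][datetime]$LastBootUpTime,
        [datetime]$Now = (Get-Date),
        [int]$MinimumHours = 12
    )
    $uptime = $Now - $LastBootUpTime
    $status = if ($uptime.TotalHours -lt $MinimumHours) { 'Warning' } else { 'OK' }

    [pscustomobject]@{
        Uptime = $uptime
        Status = $status
    }
}
```

On a domain controller you could feed it with `Get-UptimeStatus -LastBootUpTime (Get-CimInstance -ClassName Win32_OperatingSystem).LastBootUpTime`.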
Domain Roles
To keep the information in your Active Directory in sync across all domain controllers, there are 5 FSMO roles. For each role, there is only one master. In most cases, all the roles are performed by one domain controller. But in some cases, you might have split the roles across multiple domain controllers.
It’s important that the server for each of those roles is available. In the script, we check if the domain controllers hold one or more of the roles, and check if we can reach the server for the other roles. When one is offline, an alert will be shown in the export.
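One way to look up the five role holders and test if they are reachable is sketched below. It assumes the ActiveDirectory module is available (it is on a domain controller), and the function names are hypothetical.

```powershell
# Hypothetical helper: returns the holder of each of the five FSMO roles.
# Requires the ActiveDirectory module (RSAT or a domain controller).
function Get-FsmoRoleHolder {
    $domain = Get-ADDomain
    $forest = Get-ADForest

    [ordered]@{
        SchemaMaster         = $forest.SchemaMaster
        DomainNamingMaster   = $forest.DomainNamingMaster
        PDCEmulator          = $domain.PDCEmulator
        RIDMaster            = $domain.RIDMaster
        InfrastructureMaster = $domain.InfrastructureMaster
    }
}

# Hypothetical helper: pings every role holder once and reports the result.
function Test-FsmoRoleHolder {
    foreach ($role in (Get-FsmoRoleHolder).GetEnumerator()) {
        $reachable = Test-Connection -ComputerName $role.Value -Count 1 -Quiet -ErrorAction SilentlyContinue

        [pscustomobject]@{
            Role      = $role.Key
            Holder    = $role.Value
            Reachable = $reachable
        }
    }
}
```

Running `Test-FsmoRoleHolder | Where-Object { -not $_.Reachable }` then lists only the roles whose holder didn't respond.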
Services
The domain controller requires a couple of services to run. Now these are also checked by default with the DCDiag test. But for the reporting it’s easier to check the services separately, so we can easily show the services that are not running.
The following services are checked:
- Active Directory Domain Services
- AD Web Services
- DFS Replication
- DHCP client
- DNS Client
- DNS Server
- Intersite Messaging
- Kerberos Key Distribution Center
- Netlogon
- Remote Procedure Call (RPC)
- Server (Lanmanserver)
- Security Accounts Manager
- Workstation (Lanmanworkstation)
- Windows Event log
- Windows Time
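A separate service check can be as small as the sketch below. The short service names are my own mapping of the display names listed above, so treat them as an assumption and verify them against your servers. The function name is hypothetical as well.

```powershell
# Short service names, mapped from the display names listed above (assumed mapping).
$dcServiceNames = @(
    'NTDS', 'ADWS', 'DFSR', 'Dhcp', 'Dnscache', 'DNS', 'IsmServ', 'Kdc',
    'Netlogon', 'RpcSs', 'LanmanServer', 'SamSs', 'LanmanWorkstation',
    'EventLog', 'W32Time'
)

# Hypothetical helper: returns the listed services that are not running on the local server.
function Get-StoppedDcService {
    param([string[]]$Name = $dcServiceNames)

    Get-Service -Name $Name -ErrorAction SilentlyContinue |
        Where-Object { $_.Status -ne 'Running' } |
        Select-Object Name, DisplayName, Status
}
```

An empty result from `Get-StoppedDcService` means that all the listed services that exist on the server are running.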
DCDiag
The script will run the default DCDiag tests, except the services tests. If one of the checks fails, then it will be pointed out which of the tests failed.
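To illustrate the idea, the sketch below picks the failed tests out of the DCDiag output by looking for the "failed test" result lines. The function name is hypothetical, and the parsing assumes the English output format of DCDiag.

```powershell
# Hypothetical helper: returns the names of the tests that DCDiag reports as failed.
# Assumes English result lines such as "......... DC01 failed test SystemLog".
function Get-DcDiagFailedTest {
    param([string[]]$Output)

    foreach ($line in $Output) {
        if ($line -match 'failed test (\S+)') {
            $Matches[1]
        }
    }
}
```

You could use it as `Get-DcDiagFailedTest -Output (dcdiag "/s:DC01" "/skip:Services")`, where `DC01` is a placeholder and `/skip:Services` leaves out the services test.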
Replication
For good domain controller health, it’s important that the replication between your controllers is running smoothly. To monitor this, we will look at three metrics:
- Last replication attempt
- Last successful replication
- Replication delta
The report will also show the replication partner of the server.
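These metrics can be read with `Get-ADReplicationPartnerMetadata` from the ActiveDirectory module. The sketch below uses hypothetical function names and returns no delta when a server has never replicated successfully, instead of failing on the subtraction.

```powershell
# Hypothetical helper: time since the last successful replication, or $null when unknown.
function Get-ReplicationDelta {
    param(
        [Nullable[datetime]]$LastReplicationSuccess,
        [datetime]$Now = (Get-Date)
    )
    if ($null -eq $LastReplicationSuccess) { return $null }
    return $Now - $LastReplicationSuccess
}

# Hypothetical helper: collects the replication metrics for one domain controller.
# Requires the ActiveDirectory module.
function Get-DcReplicationStatus {
    param([Parameter(Mandatory)][string]$ComputerName)

    Get-ADReplicationPartnerMetadata -Target $ComputerName | ForEach-Object {
        [pscustomobject]@{
            Partner     = $_.Partner
            LastAttempt = $_.LastReplicationAttempt
            LastSuccess = $_.LastReplicationSuccess
            Delta       = Get-ReplicationDelta -LastReplicationSuccess $_.LastReplicationSuccess
        }
    }
}
```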
Time Offset
The last thing we are going to check is the time difference between the servers. This should be within plus or minus one second. If the offset is larger, then you can get replication issues with your other domain controllers or authentication issues in your network.
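One way to measure the offset is `w32tm /stripchart`, which prints the offset in seconds for each sample. The sketch below parses that output. The function name is hypothetical, and it assumes the offset is printed with a decimal point, as in `+00.0015432s`.

```powershell
# Hypothetical helper: extracts the time offset in seconds from w32tm /stripchart output.
# Assumes sample lines such as "14:30:12, +00.0015432s".
function Get-TimeOffsetSeconds {
    param([string[]]$Output)

    foreach ($line in $Output) {
        if ($line -match '([+-]\d+\.\d+)s\s*$') {
            return [double]::Parse($Matches[1], [cultureinfo]::InvariantCulture)
        }
    }
    return $null
}
```

For example, `Get-TimeOffsetSeconds -Output (w32tm /stripchart "/computer:DC01" /samples:1 /dataonly)` returns the offset for the placeholder server `DC01`. Compare the absolute value with `[math]::Abs()` against the one-second limit.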
Domain Controller Health Script
The PowerShell script to check the health of your domain controllers allows you to easily and regularly check the health of your Active Directory and servers. You can schedule the script to run every week and save the results to an HTML file. Or you can run the script manually and view the results directly in the console.
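A weekly schedule could be set up with the ScheduledTasks cmdlets, as in the sketch below. The script path, task name, day, time, and the SYSTEM account are all assumptions for the example, so adjust them to your environment.

```powershell
# Hypothetical path: point this at the location where you saved the script.
$scriptPath = 'C:\Scripts\DCHealthCheck.ps1'

# Start PowerShell with the health check script.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument "-NoProfile -File `"$scriptPath`""

# Run every Monday morning (example day and time).
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Monday -At '07:00'

# Register the task under the SYSTEM account (an assumption, use the account that fits your setup).
Register-ScheduledTask -TaskName 'DC Health Check' -Action $action -Trigger $trigger -User 'SYSTEM' -RunLevel Highest
```

Keep in mind that a scheduled run only makes sense when the output to HTML is enabled in the script settings.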
You can download the latest version of the script here from my GitHub Repository.
At the beginning of the script, you will find a couple of settings that you can set. These include the file name and location and the desired output format.
# Set variables
$reportDate = Get-Date -Format "dd-MM-yyyy"
$reportFileName = "LazyDCHealthCheck-$reportDate.html"
$reportPath = "c:\"
$outputToConsole = $true
$outputToHtml = $false
You will need to run the script on a domain controller. Using a PSSession doesn’t work, because that will give errors when the script tries to connect to the other domain controllers.
When you run the script with the output to console on, you will get a result similar to the one below (hopefully with fewer errors 😉 ). The error count at the end of the script counts all DCDiag errors as one, so that is why you only see two errors in the summary in the example below.
We can also export the domain controller health results to HTML. You can do this in two formats, depending on the number of domain controllers in your network. If you only have one or two domain controllers, then each domain controller will be listed vertically, making it easy to view the results in an email for example.
But if you have more domain controllers, then it’s easier to read the results of the Active Directory health check if they are listed horizontally.
Around line 551 in the script, you will find the if statement below. This statement determines whether you get the horizontal or the vertical table. You can change this number to switch between the two layouts.
if ($allDomainControllers.length -gt 2) {
Wrapping Up
It’s important to check your domain controller’s health regularly to spot any potential issues early on. This script checks the most important parts of your server and Active Directory and allows you to easily view the result in the PowerShell console or in an HTML report.
Make sure that you check out this article to read more on scheduling PowerShell tasks.
If you have any suggestions for the scripts or find any errors, please let me know in the comments below or here on GitHub.
line 148 should be:
if (!(Test-Connection -ComputerName $_.Value -Count 1 -Quiet -ErrorAction SilentlyContinue)) {
Indeed, thanks
Hi Ruud,
Awesome script – thanks for sharing! Is it typical to have the “DCDiag – SystemLog” fail for all DCs? Not finding much info when searching online.
You can check the Event Viewer (eventvwr) for more details or try this command:
dcdiag /v /test:systemlog
I got these errors.
line 150 !Test-Connection
lines 280, 290 and 308: Cannot find an overload for “op_Subtraction” and the argument count: “2”.
I have uploaded version 1.2, which should fix/catch these errors.
Line 150, script isn’t able to test connection to one of the FSMO role holders
And the others are all related to retrieving replication data, script seems to be unable to retrieve the data.
Hi Ruud,
First Great Job.
I noted some comments, improvements, and suggestions for a next release.
Hope this could help, later