Symptom
In a Director environment with multiple discovery engines, the discovery job hangs shortly before it completes.
Cause
Each discovery engine is configured with the subnets/networks that it has access to. However, your discovery definition configures the job to scan a network that none of the discovery engines are configured to scan.
When a discovery job is submitted, it is broken up into many small work segments. Discovery engines pick up the segments that they are zoned to scan. If a segment covers a network/subnet that no discovery engine is configured to handle, that segment stays in the Discovery Job table and is never picked up.
Resolution
Review your discovery list and compare it to the zones configured under your discovery engines (you will find these configurations in your Platform Tree). Either remove the offending subnet from your discovery definition or add the subnet to the appropriate discovery engine.
If your discovery definition is too large to search and compare manually, you can run a SQL query to find out which segment of your discovery job is not being picked up by the discovery engines.
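If you can export both lists as CIDR networks, the comparison can be scripted. The following is a minimal sketch, not a product feature: the function name, the engine names, and the sample networks are all hypothetical, and it assumes IPv4 networks throughout. It reports every network in the discovery list that is not fully contained in any single engine zone.

```python
import ipaddress

def uncovered_networks(discovery_networks, engine_zones):
    """Return the discovery networks that no single engine zone fully contains.

    discovery_networks: list of CIDR strings from the discovery definition (hypothetical export).
    engine_zones: dict mapping an engine name to its list of zone CIDR strings (hypothetical export).
    """
    zones = [
        ipaddress.ip_network(zone)
        for zone_list in engine_zones.values()
        for zone in zone_list
    ]
    missing = []
    for cidr in discovery_networks:
        network = ipaddress.ip_network(cidr)
        # subnet_of() is True when the network lies entirely inside the zone
        if not any(network.subnet_of(zone) for zone in zones):
            missing.append(cidr)
    return missing

# Hypothetical sample data for illustration only
discovery_list = ["192.168.1.0/24", "10.0.0.0/24"]
zones_by_engine = {
    "engine-a": ["10.0.0.0/16"],
    "engine-b": ["172.16.0.0/12"],
}
print(uncovered_networks(discovery_list, zones_by_engine))
```

Note that a network covered only by the combination of several zones, and not by any one of them, is also reported, so treat the result as a list of candidates to review.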
SQL Query to perform:
select * from dbo.discovery_jobs
Sample output:
ID | Definition | Engine | Priority | Inserted | Updated | JobState | WorkUnits | StartIP | EndIP | PortRange |
1 | {f85d5074-d573-4422-89a9-067c10d971dc} | {} | 1 | 41:20.9 | 41:20.9 | 0 | 255 | 3232235777 | 3232236031 | 443 |
The StartIP and EndIP fields define the network segment that is causing the hang. The database stores them as decimal integers, so you will need a decimal-to-dotted IP converter (such as http://www.ipaddresslocation.org/convertip.php) to find the actual addresses. In this example, the values convert to 192.168.1.1-192.168.1.255, which is the range that no engine is picking up. We can now remove that range from the discovery job definition or add that range to the appropriate discovery engine.
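The decimal-to-dotted conversion can also be done locally instead of through a web page. A minimal sketch using Python's standard ipaddress module follows; the helper name decimal_to_ip is made up for illustration, and the two values are the StartIP and EndIP from the sample output above.

```python
import ipaddress

def decimal_to_ip(value):
    """Convert a decimal (32-bit integer) IPv4 address to dotted notation."""
    return str(ipaddress.IPv4Address(int(value)))

start = decimal_to_ip(3232235777)  # StartIP from the sample output
end = decimal_to_ip(3232236031)    # EndIP from the sample output
print(f"{start}-{end}")  # prints 192.168.1.1-192.168.1.255
```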