| Author | Messages | |
adwulf
Posts:112
 | | 03/13/2012 10:53 AM |
| Dear all,
I've had three DCs experience the same issue (out of 35 in the forest), and I was hoping somebody might be able to shed some light on the 'version storage' memory.
Last week, I saw two DCs (DFL + FFL WS2003, btw) stop replicating and log several errors in the Directory Service logs like this:
Event ID 1479, Source: NTDS Replication, Category: Replication
Active Directory could not update the following object on the local domain controller with changes received from the following source domain controller. Active Directory does not have enough database version store to apply the changes.
Object: DC=167.20.168,DC=192.in-addr.arpa,CN=MicrosoftDNS,CN=System,DC=omnia,DC=corpad,DC=local
Object GUID: 00c193d5-e567-4a10-a34e-ee75710b6e13
Source domain controller: eca4c491-cd29-41aa-a075-4cde1ac094f0._msdcs.corpad.local
And:
Event ID 1519, Source: NTDS General, Category: Internal Processing
Internal Error: Active Directory could not perform an operation because the database has run out of version storage.
Additional Data
Internal ID: 2020f05
Also, when attempting garbage collection, event 705 from NTDS ISAM is logged:
NTDS (532) NTDSA: Online defragmentation of database 'g:\ntds\ntds.dit' terminated prematurely after encountering unexpected error -1069. The next time online defragmentation is started on this database, it will resume from the point of interruption.
The DNS server has also been logging errors (with ID 4015):
The DNS server has encountered a critical error from the Active Directory. Check that the Active Directory is functioning properly. The extended error debug information (which may be empty) is "00002024: SvcErr: DSID-02080490, problem 5008 (ADMIN_LIMIT_EXCEEDED), data -1069". The event data contains the error.
Data: 0000: 0000000b
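For readers decoding the numbers in these events: -1069 is the ESE (Jet) error JET_errVersionStoreOutOfMemory, and 0x2024 (8228) is the Win32 error ERROR_DS_ADMIN_LIMIT_EXCEEDED. The following is a minimal sketch, not part of the original post, that pulls those fields out of the extended error string logged by the DNS server. The function name and the lookup tables are hypothetical and cover only the codes seen in this thread.

```python
import re

# Lookup tables deliberately limited to the codes quoted in this thread.
JET_ERRORS = {-1069: "JET_errVersionStoreOutOfMemory"}
WIN32_ERRORS = {0x2024: "ERROR_DS_ADMIN_LIMIT_EXCEEDED"}

# Matches strings such as:
# 00002024: SvcErr: DSID-02080490, problem 5008 (ADMIN_LIMIT_EXCEEDED), data -1069
PATTERN = re.compile(
    r"(?P<win32>[0-9A-Fa-f]{8}): SvcErr: (?P<dsid>DSID-[0-9A-Fa-f]+), "
    r"problem (?P<problem>\d+) \((?P<name>\w+)\), data (?P<data>-?\d+)"
)


def decode_extended_error(text):
    """Return the parsed fields as a dict, or None if the text does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    win32 = int(match.group("win32"), 16)  # leading field is hexadecimal
    data = int(match.group("data"))        # trailing field is a signed decimal
    return {
        "win32_code": win32,
        "win32_name": WIN32_ERRORS.get(win32, "unknown"),
        "dsid": match.group("dsid"),
        "problem": int(match.group("problem")),
        "problem_name": match.group("name"),
        "jet_code": data,
        "jet_name": JET_ERRORS.get(data, "unknown"),
    }


if __name__ == "__main__":
    sample = ('"00002024: SvcErr: DSID-02080490, problem 5008 '
              '(ADMIN_LIMIT_EXCEEDED), data -1069"')
    print(decode_extended_error(sample))
```

Seen this way, events 1479, 1519, 705 and 4015 all report the same underlying condition: the ESE version store is exhausted.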
The two affected DCs were rebooted, and have been fine since. This morning, the issue started again on another DC, and the type of entries logged in the Directory Service and DNS Server logs is the same.
We have looked at the hotfix from http://support.microsoft.com/kb/974803 (The domain controller runs slower or stops responding when the garbage collection process runs) - but this doesn't seem to match what we're seeing. For example - I can't seem to find any objects with a DELTIME, but without isDeleted being set - and also, we don't see the CPU spikes described in this article. Furthermore - if it were such objects that were causing the issue when the GC ran, surely we would have seen this on all DCs in the forest within 12 hours?
So my question is - what could be causing the depletion of 'version storage', and how can I monitor it?
Many thanks in advance,
-- AdamT
List info: http://www.activedir.org/List.aspx
| | | |
| DaemonRoot
Posts:173
 | | 03/13/2012 12:31 PM |
| Hi there,
How's disk I/O doing? Have you checked http://support.microsoft.com/?kbid=974803? Does it ring a bell? You can emulate this by triggering the doGarbageCollection operation via LDP. Cheers,
~d
| | | |
| khoover
Posts:8
 | | 03/13/2012 12:40 PM |
| That KB article also recommends that you call Microsoft for help if this problem appears. That sort of language in a published KB article is (IMO) a signal that this isn't something you want to mess around with. You might want to get on the phone with them before pushing too many more buttons.
- Ken Hoover
-- Ken Hoover Manager, Windows Systems Group (WINSYS) Yale University ITS Infrastructure Services x2-1260 ken.hoover@xxxxxxxxxxxxxxxx * http://blogs.yale.edu/roller/page/kjh27
| | | |
| nayan.khatua
Posts:10
 | | 03/13/2012 12:42 PM |
| Hi, please refer to: http://www.microsoft.com/technet/support/ee/transform.aspx?ProdName=Windows+Operating+System&ProdVer=5.0&EvtID=1519&EvtSrc=Active+Directory&LCID=1033
Please try to defragment the database with the command "esentutl /d <Path *.dit>". Ex: C:\WINDOWS\system32>esentutl /d d:adamntds.dit
If possible, provide the event log files.
Regards,
Nayan Khatua | 95020 43633
"Take the first step in faith. You don't have to see the whole staircase, just take the first step."
| | | |
| adwulf
Posts:112
 | | 03/13/2012 12:57 PM |
| On 13 March 2012 12:30, daemonR00t <daemonroot@xxxxxxxxxxxxxxxx> wrote:
> Hi there,
> How's disk I/O doing?
> Have you checked http://support.microsoft.com/?kbid=974803? Does it ring a bell?
> You can emulate this by triggering the doGarbageCollection operation via LDP.
> Cheers,
I did see that KB article - but:
i) We don't see any sign of objects with a delTime but no isDeleted value.
ii) If such objects existed, surely all DCs would have suffered the same fate when they ran their garbage collection (which would have been within 12 hours).
iii) The KB article says that the issue lasts for a few minutes. In our case, it lasts for hours. Until the DC is rebooted, in fact.
I did run doGarbageCollection against the affected DC this morning, and got a warning (event 705, as posted in the original email).
Regards,
-- AdamT Гладна мечка хоро не играе. A hungry bear doesn't dance.
| | | |
| bijubabuk
Posts:153
 | | 03/13/2012 1:39 PM |
| I had a similar issue in the past with some of our core domain controllers in the data center. It happened mainly because of high load from a lot of applications, such as user provisioning. They were 2003 32-bit boxes, and when we upgraded to higher-horsepower boxes with 2008 R2, things went OK.
> what could be causing the depletion of 'version storage', and how can I monitor it?
There are some advanced counters you can enable to see the "version buckets allocated" counters. Try the steps here: http://blogs.technet.com/b/carlh/archive/2009/06/17/squeaky-lobster-monitoring-version-store-counters.aspx
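Once samples of the "version buckets allocated" counter are being collected by whatever means, a small check can flag when the version store is getting close to its limit. This is a minimal sketch under stated assumptions: the Sample type, the flag_high_usage function, the 80% threshold and the max_buckets value are all hypothetical, and the real maximum depends on the OS version and configuration, so it has to be supplied by the operator.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: str          # when the counter was read (free-form text here)
    buckets_allocated: int  # value of the "version buckets allocated" counter


def flag_high_usage(samples, max_buckets, threshold=0.8):
    """Return the samples at or above threshold * max_buckets.

    max_buckets is an assumed, operator-supplied ceiling for the version
    store; it is not read from the system by this sketch.
    """
    if max_buckets <= 0:
        raise ValueError("max_buckets must be positive")
    limit = threshold * max_buckets
    return [s for s in samples if s.buckets_allocated >= limit]


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    readings = [
        Sample("10:00", 100),
        Sample("10:05", 850),
        Sample("10:10", 990),
    ]
    for sample in flag_high_usage(readings, max_buckets=1000):
        print(sample.timestamp, sample.buckets_allocated)
```

Alerting on a fraction of the ceiling rather than on exhaustion leaves time to look at what is holding transactions open before replication stops.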
You can also increase the size of the version store with a registry key, but I think it's not recommended and has consequences: the memory you reserve in the registry value is reserved only for versioning, and LSASS may run out of memory, especially on x86 boxes. Refer to the workaround section here: http://support.microsoft.com/kb/974803
I think it would be wise to call MS and get a deeper look into the issue.
Rgds
My working hours are from 11:00 to 19:30 IST (00:30 to 09:00 CST)
| | | |
| adwulf
Posts:112
 | | 03/13/2012 2:30 PM |
| On 13 March 2012 13:37, <Biju_babu@xxxxxxxxxxxxxxxx> wrote:
> what could be causing the depletion of 'version storage', and how can I monitor it? - there are some advanced counters you can enable to see the "version buckets allocated" counters - try the steps in here - http://blogs.technet.com/b/carlh/archive/2009/06/17/squeaky-lobster-monitoring-version-store-counters.aspx
Thanks for that link. It looks like it's just the thing I need to keep track of what's happening to the version store.
> You can also increase the size of the version storage with a regkey, but I think it's not recommended and have consequences like the memory you reserved in the reg value will be reserved only for versioning purpose and LSASS may run out of mem, especially in x86 boxes. - refer the workaround part in here - http://support.microsoft.com/kb/974803
These are x64 WS2003 servers - although there are some x86 ones still in the domain. So far, only x64 DCs have been affected.
Currently LSASS memory usage tends to be 700-950 MB (virtual) size. We could throw more RAM at the x64 DCs, as they only have 4 GB at the moment.
> I think it would be wise to call MS and get some deeper look into the issue
Agreed. I've asked for it, but they cannot escalate to MS directly. It needs to be done via their 3rd party.
Many thanks for sharing your insight into this issue. Most helpful! Especially the squeaky lobster! :-)
Regards,
-- AdamT
| | | |
| bdesmond
Posts:1042
 | | 03/13/2012 5:35 PM |
| Don't think this will help
Thanks, Brian Desmond brian@xxxxxxxxxxxxxxxx
w - 312.625.1438 | c - 312.731.3132
-----Original Message----- From: activedir-owner@xxxxxxxxxxxxxxxx [mailto:activedir-owner@xxxxxxxxxxxxxxxx] On Behalf Of Adam Thompson Sent: Tuesday, March 13, 2012 6:00 AM To: activedir@xxxxxxxxxxxxxxxx Subject: Re: [ActiveDir] Run out of version storage
On 13 March 2012 12:38, Nayan K Khatua <nayan.khatua@xxxxxxxxxxxxxxxx> wrote:
> Hi,
> Please ref :
> http://www.microsoft.com/technet/support/ee/transform.aspx?ProdName=Windows+Operating+System&ProdVer=5.0&EvtID=1519&EvtSrc=Active+Directory&LCID=1033
Thanks, I hadn't seen that. I'll see if I can identify any long-running transactions.
> Please try with this command :
> Please try to defragment with command " esentutl /d <Path *.dit>"
> Ex : C:\WINDOWS\system32>esentutl /d d:adamntds.dit
> If possible, provide event log files.
Shouldn't that run when the GC finishes?
Thanks,
-- AdamT Гладна мечка хоро не играе. A hungry bear doesn't dance.
| | | |
| skradel
Posts:350
 | | 03/13/2012 5:47 PM |
| While it probably won't help with this specific problem, you might want to move the DNS data out of the domain partition and into ForestDNSZones / DomainDNSZones.
And who knows... it might help anyway.
--Steve
On Tue, Mar 13, 2012 at 6:50 AM, Adam Thompson <adwulf@xxxxxxxxxxxxxxxx> wrote:
[snip]
> Object: DC=167.20.168,DC=192.in-addr.arpa,CN=MicrosoftDNS,CN=System,DC=omnia,DC=corpad,DC=local
> Object GUID: 00c193d5-e567-4a10-a34e-ee75710b6e13
> Source domain controller: eca4c491-cd29-41aa-a075-4cde1ac094f0._msdcs.corpad.local
[snip]
| | | |
| adwulf
Posts:112
 | | 03/15/2012 12:23 PM |
| On 13 March 2012 14:28, Adam Thompson <adwulf@xxxxxxxxxxxxxxxx> wrote:
> Thanks for that link. It looks like it's just the thing I need to keep a track of what's happening to the version store.
Just a word of warning if anyone else is looking to use 'Squeaky Lobster' counters. On all of the DCs I applied this to, it caused perflib to unload the counters.
The Collect Procedure for the "ESENT" service in DLL "c:\Windows\system32\esentprf.dll" generated an exception or returned an invalid status. Performance data returned by counter DLL will be not be returned in Perf Data Block. The exception or status code returned is the first DWORD in the attached data.
The data section shows 0000: c0000005 00000000 (access violation).
This didn't cause a major issue, but did prevent some monitoring from functioning.
-- AdamT
| | | |
| adwulf
Posts:112
 | | 04/19/2012 12:49 PM |
| On 15 March 2012 12:22, Adam Thompson <adwulf@xxxxxxxxxxxxxxxx> wrote:
> Just a word of warning if anyone else is looking to use 'Squeaky Lobster' counters. On all of the DCs I applied this to, it caused perflib to unload the counters.
In case anyone else reads back on this thread and wonders what happened, here's a brief summary.
A call was opened with MSFT, who pointed out that there was a large volume of deleted (tombstoned, not yet garbage collected) objects on the day on which we saw a DC fail with this error. Checking with some of the customer's other teams, we found that there had been a big project to clean up various legacy Exchange objects. So, this was all throttled and the cleanup project was finished. The MSFT call was closed.
And then a couple of weeks later, it happened again. Twice.
Fortunately, on one occasion, I was in the office and notified of the issue quite quickly, and was able to use repadmin to take a look at what had been going on, leading up to the time of the incident. I noticed some groups having their memberships modified, and found that the group memberships in some cases were six to eight thousand. It appears that a user was being removed from several of these groups.
Not a problem for a DFL/FFL of Win 2003, thanks to LVR (linked-value replication) - or so I thought. But on checking the creation timestamps for these groups, I found they were created before the RTM date of Win 2003, so LVR doesn't apply to the whole group membership.
So - in summary, I *think* I have found the culprit of the version storage depletion issues, and it's having pre-Win2003 groups with > 5,000 members.
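The check described above (groups created before a cutoff date and holding more than 5,000 members) can be sketched as a simple filter over exported group records. This is an illustrative sketch only: the suspect_groups function, the record layout and the sample data are hypothetical, and the cutoff date is supplied by the caller rather than assumed here.

```python
from datetime import datetime


def suspect_groups(groups, cutoff, min_members=5000):
    """Return names of groups created before cutoff with more than min_members.

    Each group is a dict with 'name', 'created' (datetime) and
    'member_count' (int); this layout is an assumption for illustration.
    """
    return [
        g["name"]
        for g in groups
        if g["created"] < cutoff and g["member_count"] > min_members
    ]


if __name__ == "__main__":
    # Hypothetical export; names, dates and counts are made up.
    exported = [
        {"name": "All-Staff", "created": datetime(2001, 5, 1), "member_count": 6500},
        {"name": "Project-X", "created": datetime(2005, 1, 1), "member_count": 8000},
        {"name": "Helpdesk", "created": datetime(2002, 1, 1), "member_count": 120},
    ]
    # The cutoff below is an example value chosen by the operator.
    print(suspect_groups(exported, cutoff=datetime(2003, 4, 24)))
```

Keeping the cutoff and the member threshold as parameters makes it easy to rerun the filter with a different date, for example the date the forest functional level was actually raised.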
-- AdamT A hungry bear doesn't dance.
| | | |
| joe1
Posts:39
 | | 04/20/2012 9:30 PM |
| Thanks for the heads up on that - I have the same issue occasionally on a few of our hub site DCs.
Sent from my phone
| | | |
| adwulf
Posts:112
 | | 03/12/2013 1:07 PM |
| On 19 April 2012 12:47, Adam Thompson <adwulf@xxxxxxxxxxxxxxxx> wrote:
> So - in summary, I *think* I have found the culprit of the version storage depletion issues, and it's having pre-Win2003 groups with > 5,000 members.
I noticed this ancient thread while looking for something else, and realized that I never said what fixed the issue in the end.
The issue was not with legacy group memberships at all. We cleaned those up, and still got the problems.
What we found was that a process dump of the lsass.exe process pointed at a third-party change auditing software package, which was installed on the DCs. We removed this package temporarily on some DCs, and found that after doing so, the lsass.exe process was actually crashing, not hanging. We were no longer seeing any version store depletion, just lsass.exe crashing with an access violation.
This was fairly quickly identified as being the bug documented in: http://support.microsoft.com/kb/981259 "A domain controller that is running Windows Server 2003 SP2 stops responding intermittently".
The third party change auditing software had a DLL hooked into the lsass.exe process, which was apparently preventing the thread from terminating, which slightly altered the symptoms of the issue. The hotfix was applied to all relevant DCs, the change auditing software was updated to a newer release, and we all lived happily ever after.
-- AdamT
| | | |
|
|