Here is the troubleshooting I recommend starting with when your Queue seems
to be having problems. These hints should help you investigate why
the Queue is not processing jobs for a particular project:
1. Is your Queue still running properly? If it still processes jobs other
than the ones for the "blocked" projects, then there is no need to
kill the Queue service - it is behaving as designed.
2. Use the Manage Queue page to look at correlations (the CorrelationUID
column helps here) to see why a certain correlation is blocked. If you
cannot see any problems and your queue is still working, then your filters
on the Manage Queue page are probably not right - check them, especially the
History section (the problem may have actually occurred days ago). The
"By Project" filter works nicely for looking at the queue job history of
projects. For other correlations, use CorrelationUID.
3. Look for jobs in the Failed and Blocking state - those are the jobs that
are "blocking" others on the same correlation (again, use the
CorrelationUID here to see what jobs are affected). You can retry these jobs
if the error looks recoverable (like a loss of network or DB connection), or
you can cancel them. Canceling with the default settings will cancel the
entire correlation, so make sure you know what data you could be losing by
doing so.
4. Then check whether there are jobs stuck in the "Getting Enqueued"
state. If so, WinProj needs to be opened again on the machine of the user
who submitted the job, to see if WinProj will continue sending the project.
If that doesn't work, you will need to cancel the jobs in the "Getting
Enqueued" state. Note that this effectively means that the save from
WinProj never happened, and that data will need to be saved again. This
is the same thing that happens when you just blindly kill/restart the queue
service, but at least doing it this way means that you know what is being
lost, and which projects may need special attention later.
5. Look at the error (click the link in the Error column) to get an idea
about why the failure occurred. Sometimes you can correct the problem and
re-save/re-submit your job.
6. Start comparing Event Logs to what you've found on the Manage Queue page.
Look for errors around the same time as failed jobs in the queue.
7. ULS Logs. Same technique as #6 - look for errors around the same time as
failed jobs in the queue.
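
If you export the failure times and the log entries, the "look for errors
around the same time" technique from steps 6 and 7 can be sketched in a few
lines of Python. This is only an illustration under assumptions: the record
layout (job_id, failed_at, time, message), the sample data, and the 5-minute
window are all made up here and are not part of Project Server or its logs.

```python
from datetime import datetime, timedelta


def errors_near_failures(failed_jobs, log_entries, window_minutes=5):
    """Pair each failed job with the log entries written close to its failure.

    Both arguments are plain lists of dicts in a hypothetical layout:
    failed_jobs has "job_id" and "failed_at"; log_entries has "time" and
    "message". Returns {job_id: [matching log entries]}.
    """
    window = timedelta(minutes=window_minutes)
    matches = {}
    for job in failed_jobs:
        matches[job["job_id"]] = [
            entry
            for entry in log_entries
            if abs(entry["time"] - job["failed_at"]) <= window
        ]
    return matches


# Made-up sample data, only to show the shape of the inputs.
failed_jobs = [
    {"job_id": "job-1", "failed_at": datetime(2008, 1, 15, 10, 30)},
]
log_entries = [
    {"time": datetime(2008, 1, 15, 10, 28), "message": "SQL connection lost"},
    {"time": datetime(2008, 1, 15, 14, 0), "message": "Unrelated warning"},
]

nearby = errors_near_failures(failed_jobs, log_entries)
for job_id, entries in nearby.items():
    for entry in entries:
        print(job_id, entry["time"], entry["message"])
```

Widen the window if the failure was reported some time after the underlying
error (for example, after several retries).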
Once you clear the blocking job(s), the queue should immediately resume
processing on that correlation again, and pick up from where it last left
off (except, of course, if the jobs were all canceled in the process of
performing the steps above).
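
To make the correlation idea from steps 2 and 3 concrete, here is a rough
Python sketch that groups jobs by CorrelationUID and reports which
correlations contain a Failed and Blocking job, along with the other jobs
sitting on the same correlation. It assumes you have copied the rows from the
Manage Queue page into a list of dicts; the field names and every state
string other than "Failed and Blocking" are placeholders of mine, not actual
Project Server values.

```python
from collections import defaultdict

BLOCKING_STATE = "Failed and Blocking"


def blocked_correlations(jobs):
    """Return {correlation_uid: {"blocking": [...], "affected": [...]}}.

    jobs is a list of dicts with hypothetical keys "job_id",
    "correlation_uid" and "state". Only correlations that contain at least
    one blocking job appear in the result.
    """
    by_correlation = defaultdict(list)
    for job in jobs:
        by_correlation[job["correlation_uid"]].append(job)

    report = {}
    for uid, group in by_correlation.items():
        blocking = [j["job_id"] for j in group if j["state"] == BLOCKING_STATE]
        if blocking:
            affected = [j["job_id"] for j in group if j["state"] != BLOCKING_STATE]
            report[uid] = {"blocking": blocking, "affected": affected}
    return report


# Made-up rows; "Queued" is a placeholder state name.
jobs = [
    {"job_id": "a1", "correlation_uid": "corr-A", "state": BLOCKING_STATE},
    {"job_id": "a2", "correlation_uid": "corr-A", "state": "Queued"},
    {"job_id": "b1", "correlation_uid": "corr-B", "state": "Queued"},
]

report = blocked_correlations(jobs)
for uid, info in report.items():
    print(uid, "blocked by", info["blocking"], "affecting", info["affected"])
```

The "affected" list is what you stand to lose if you cancel the blocking job
with the default settings, since that cancels the entire correlation.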
--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm