Colin
Our setup -
6 boxes
2 x Web Server in NLB cluster (one host currently STOPPED)
2 x App Servers (SM and VP) in NLB cluster (one host STOPPED)
1 x 2 node SQL Cluster (Active Passive) with AS on the 1st node
All servers are Windows 2003 Server Enterprise Edition SP1 + security
updates etc.
Issue 1
XML files on the VP server are not being deleted, although it appears that
some or all of the changes are actually getting through to the database and
being shown in PWA instantly. Update: I've just been told the files have
started going through again, which is very odd.
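To tell whether files are genuinely stuck or just slow to clear, something like
the following could be run against the VP server's XML folder. This is only a
rough sketch: the function name, the folder path and the 10 minute threshold are
my own placeholders, not anything taken from the product or from our config.

```python
import os
import time

def stale_xml_files(folder, max_age_seconds, now=None):
    """Return (path, age_in_seconds) for .xml files older than the threshold,
    oldest first. 'folder' is whatever directory the XML files land in."""
    now = time.time() if now is None else now
    stale = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        # Only look at regular files with an .xml extension.
        if not name.lower().endswith(".xml") or not os.path.isfile(path):
            continue
        age = now - os.path.getmtime(path)
        if age > max_age_seconds:
            stale.append((path, age))
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale

if __name__ == "__main__":
    # Hypothetical path; substitute the real XML folder on the VP server.
    for path, age in stale_xml_files(r"D:\placeholder\xml-folder", 600):
        print(f"{age / 60:6.1f} min  {path}")
```

Running it every few minutes and comparing the output should show whether the
same files keep sitting there or whether the backlog is draining.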
Issue 2
Occasionally PWA hangs when users attempt to log on (it affects most users at
once). While this is happening there are no obvious CPU, memory or disk issues
on any of the servers.
Issue 3
We occasionally get CPU "race" conditions with SQLSERVER.EXE, which flatlines
at 50% CPU. If I stop and restart SQL Server it goes away. Memory and disk
both appear OK on the SQL box.
We've had PSS calls open for as long as I can remember (since
implementation) and we're not sure the system has even been implemented
correctly. We're lacking confidence in the 3rd party that did the work, shall
we say.
I've re-run the complus tool just in case that would help; it doesn't seem to
have.
We have the relevant SQL re-index jobs set up, so I'm told by the DBAs.
Our domain account used for EPM is not locked and is defined with the
correct password in all services.
Another thing that may be of use: we used NETWORK AUTHORITY when setting up
the connections to the database from the front ends. The 3rd party that
installed the software hadn't done it that way before. We did it that way in
our previous installation and that system was in fine working order.
The theory here is that the clustering has caused some config or bug issue in
the background, which is why we've stopped one each of the web and app servers
to reduce that possibility. (Well, the app server has to be stopped anyway, as
the SM doesn't like being clustered.)
PSS have told us to re-install the whole system, which we may have to do. I
just wanted to see if I could get some pointers as to where to look for each
of the problems above.