SQL Performance Spike
New Post
12/7/2006 2:09 PM
 

We're having a significant performance issue with our production DNN system and are reaching out to the community for some help. From time to time, our SQL Server spikes to 100% CPU, killing performance on all portals. It appears to be related to GetTabs or the Event Log Purge Buffer, but we're not sure that either is the culprit.

Let me start with the hardware particulars:

  1. Single DNN instance (code) sitting on a Dell server set up as file sharing only.
    1. Dell PowerEdge Server: Dual P4 2.86 GHz Processors, 1GB RAM, RAID 5 – 5 x 73GB SCSI-320 Drives
  2. Two Web Servers (pointing to the single code source) set up in a Web Farm
    1. (2) Dell PowerEdge Servers: P4 2.86, 1GB RAM
  3. Single SQL Server serving the web servers
    1. Dell PowerEdge Server: Dual P4 3.0 Xeon, 4GB RAM, RAID 1 - 2 x 70GB SCSI (OS, Apps), RAID 5 – 3 x 135 GB SCSI (Databases). Processors are logically set up as four separate processors
  4. Everything sits behind a load balancer
  5. All owned equipment in a dedicated rack (co-located)

DNN Information

  1. Version 4.0.2
  2. 111 Portals
  3. 8126 Tabs (across all portals)
  4. Core DNN and Purchased Modules
  5. All business queries performed in custom queries hit web services from yet another server

Traffic Information

  1. Plenty of bandwidth coming in (we push close to 10MB/sec at peak)
  2. Avg 50,000 PageViews per day (1.5 Million per month)
  3. Avg 7MB/sec throughput, Peak at 10MB/sec
  4. 100 MB Switched Network into NOC then 3 OC-48 connections from NOC.
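
As a rough sanity check on the figures above, the daily page-view count can be converted into an average request rate. This is only a back-of-the-envelope sketch: it assumes traffic is spread evenly over 24 hours and that a month has 30 days, neither of which is stated in the post, and the function names are made up for illustration.

```python
# Figure quoted in the post. The even spread over 24 hours and the
# 30-day month are simplifying assumptions, not measurements.
PAGE_VIEWS_PER_DAY = 50_000
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30


def average_page_views_per_second(page_views_per_day: int) -> float:
    """Average page views per second if traffic were perfectly even."""
    return page_views_per_day / SECONDS_PER_DAY


def page_views_per_month(page_views_per_day: int, days: int = DAYS_PER_MONTH) -> int:
    """Monthly page views for a constant daily figure."""
    return page_views_per_day * days


if __name__ == "__main__":
    rate = average_page_views_per_second(PAGE_VIEWS_PER_DAY)
    print(f"{rate:.2f} page views/sec on average")
    print(f"{page_views_per_month(PAGE_VIEWS_PER_DAY):,} page views/month")
```

Real traffic is bursty, so peak rates will be well above this average, but the number gives a baseline for judging whether the spikes line up with load.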


Behavior:

  1. In what appears to be random events, all four SQL processors will spike within a second or two to 100%.  
  2. Users will experience one of the following issues:
    1. Slow load time
    2. No load (typing in URL shows progress bar but no page loads)
    3. Page loads without skin (shows base blue skin with error saying it can’t find the portal skin)
    4. Page loads but shows only a Server Application Error text message
    5. Spike will continue for as little as one minute or for as long as two hours.
    6. Spike will resolve on its own
  3. We can force a resolution by performing one or more of the following
    1. Removing one web server from the farm (we have two in the farm)
    2. Turning off one of the DNN Web Sites within IIS
    3. Turning a site back on occasionally re-spikes the SQL Server

Tests completed so far (prior to or during spikes):

  1. SQL Profiler shows no abnormal queries, just many GetTabs queries
  2. SQL Processes show GetTabs waiting when spike occurs
  3. Killing specific spids does not improve performance
  4. Removing one of the web applications (or servers) will resolve the SQL spike.

So, we’re pulling our hair out at this point and looking at all levels of the applications.  We’ve turned off most logging, almost all of the DNN schedule, turned off Event Buffer Loads, etc.  While we keep focusing on hardware for resolution, it does not appear to be a hardware limit, since 99% of the time the hardware handles the load with the SQL Server at 20% and each of the web servers at < 10% processor and memory.

Does anyone have specific suggestions, tests, or scripts we can run to determine what exactly is causing the spike?  Any and all help would be appreciated.



Steven Webster
Manager, Community Platform
F5 Networks, DevCentral
 
New Post
12/7/2006 6:38 PM
 

There are several places in the code that have been fixed for performance in the next release (DNN 4.4).

GetTabs was a major one.

My advice would be to try and keep the admins from doing any updates (or even logging in) until you can get upgraded.
If this is not possible, try to do updates only when traffic is lightest.


DotNetNuke Modules from Snapsis.com
 
New Post
12/8/2006 5:42 AM
 

You said you can see that GetTabs is waiting. Do you mean it is blocked by another SPID? If so, find out the query being run by the SPID that is blocking:

  1. Run sp_who2
  2. Analyze the BlkBy (blocked by) column to find the SPID that is the root cause of the block
  3. Run DBCC inputbuffer(blocking SPID) to get the command it was running
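
When several sessions are blocked in a chain, step 2 means following the BlkBy values until you reach a session that is not itself blocked. A minimal sketch of that walk is below. It assumes you have already copied the SPID and BlkBy columns into a mapping by hand (parsing sp_who2 output is not shown), and `find_root_blockers` is a hypothetical helper, not part of SQL Server or DNN.

```python
from typing import Dict, Optional, Set


def find_root_blockers(blocked_by: Dict[int, Optional[int]]) -> Set[int]:
    """Return the SPIDs at the head of each blocking chain.

    blocked_by maps a SPID to the SPID blocking it, or None when the
    session is not blocked (an assumed, hand-built representation of
    the SPID and BlkBy columns).
    """
    roots: Set[int] = set()
    for spid, blocker in blocked_by.items():
        if blocker is None:
            continue  # this session is not waiting on anyone
        seen = {spid}
        current = blocker
        # Follow the chain until we reach a session that is not blocked.
        # The 'seen' check stops the walk if the chain loops back on itself.
        while blocked_by.get(current) is not None and current not in seen:
            seen.add(current)
            current = blocked_by[current]
        roots.add(current)
    return roots


if __name__ == "__main__":
    # Hypothetical snapshot: 55 waits on 53, 53 on 52, 52 on 51.
    sessions = {51: None, 52: 51, 53: 52, 54: None, 55: 53}
    print(find_root_blockers(sessions))
```

The SPIDs returned are the ones worth passing to DBCC inputbuffer in step 3, since everything else in the chain is waiting on them.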

You could try rebuilding or defragmenting the indexes in your database. Badly performing indexes might account for certain queries not being performed efficiently. See this article:

http://www.mssqlcity.com/Articles/Adm/index_fragmentation.htm

Note the warning about using DBCC DBREINDEX :

During rebuilding a clustered index, an exclusive table lock is put on the table, preventing any table access by your users, and during rebuilding a nonclustered index a shared table lock is put on the table, preventing all but SELECT operations to be performed on it, you should schedule DBCC DBREINDEX statement during CPU idle time and slow production periods.

 
New Post
12/12/2006 1:55 PM
 
Thanks for the posts. In the time since my last post we seem to have [temporarily] solved the problem.  I found a post on another forum about some inefficient joins on GetTabs and GetModules procedures.  After we looked at the joins (left outer on a string comparison to a concatenated file id string) and the way case statements were being used in the query we decided to remove the file icon functionality and file linking functionality by simply returning blanks in these procedures.
While this disabled the functionality from the system a quick query confirmed that none of our sites or administrators actually used this functionality.
After a testing period on development we moved the changes to production. The effect was almost immediate.  Our SQL server puttered along all weekend at about 10% processor and hit a high of only 40%.  Performance overall for every site and tab increased. Pages which we had previously benchmarked at 3.5 seconds to load are now loading in 1.6 seconds!
I'm well aware of the 4.4 release and look forward to getting it into our test environment as soon as possible.  With the upgrade we expect a number of breaking changes from third-party modules, so we aren't expecting to release it into production until about a month after its release to the community, but I’m excited about what it means to our sites and administrators. Hopefully, this issue will be resolved with this release.
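
For readers wondering why a join on a concatenated string is expensive, here is a small sketch of the general pattern in Python rather than SQL. It does not reproduce the DNN procedures or the workaround described above (returning blanks). The `FileID=` prefix, the data shapes, and both function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

FILE_PREFIX = "FileID="  # assumed format of the stored file reference

Tab = Tuple[int, str]                # (tab id, icon reference string)
Resolved = Tuple[int, Optional[str]]  # (tab id, file path or None)


def resolve_icons_by_string_compare(tabs: List[Tab], files: Dict[int, str]) -> List[Resolved]:
    """Mimic a join on a concatenated string.

    For every tab, a reference string is built from every file row and
    compared, so the work grows with len(tabs) * len(files).
    """
    result: List[Resolved] = []
    for tab_id, icon_ref in tabs:
        match = None
        for file_id, path in files.items():
            if icon_ref == FILE_PREFIX + str(file_id):
                match = path
                break
        result.append((tab_id, match))
    return result


def resolve_icons_by_lookup(tabs: List[Tab], files: Dict[int, str]) -> List[Resolved]:
    """Build the reference strings once, then do one keyed lookup per tab."""
    by_ref = {FILE_PREFIX + str(file_id): path for file_id, path in files.items()}
    return [(tab_id, by_ref.get(icon_ref)) for tab_id, icon_ref in tabs]


if __name__ == "__main__":
    tabs = [(1, "FileID=10"), (2, ""), (3, "FileID=99")]
    files = {10: "icons/home.gif", 11: "icons/news.gif"}
    print(resolve_icons_by_string_compare(tabs, files))
    print(resolve_icons_by_lookup(tabs, files))
```

Both functions return the same rows. The difference is that the first rebuilds and compares a string for every tab and file pair, which is roughly what a database has to do when the join condition is computed from a concatenation and no index can be used.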


Steven Webster
Manager, Community Platform
F5 Networks, DevCentral
 