Scheduler behind the scenes
New Post
2/26/2014 9:21 AM
 

Hey Nils,

Yes - from what we have seen working on a number of different scheduler issues over the years, I believe there is still a set of edge cases where multiple scheduler worker pools can be started on the same system, resulting in the sort of thing you are currently seeing.

It's been a while since I dug around in the code, but this thread got me thinking about it again, and I believe I may now know where the issue is coming from.

Specifically, this occurs when a system is set up to use the REQUEST_METHOD scheduler mode as opposed to TIMER_METHOD, and most often on test servers, which tend to see more load or more application restarts.

In REQUEST_METHOD mode a test is performed every time any sort of HTTP request is made to the server. The test checks whether the scheduler is ready to be polled, i.e. whether SchedulingProvider.ReadyForPoll is true. If it is, a new background worker thread is created and bound to the scheduler provider, the thread is started, and the scheduler begins executing.

BUT the problem is that there is a small window of time while that new worker thread is being started, after the first test of SchedulingProvider.ReadyForPoll, in which a second HTTP request can also find SchedulingProvider.ReadyForPoll still flagged as ready and also attempt to start a separate background scheduler thread.

The code that is triggered by the HTTP request handler looks like this:

////////////////////////////////////////////////////////

if (SchedulingProvider.SchedulerMode == SchedulerMode.REQUEST_METHOD && SchedulingProvider.ReadyForPoll)
{
    Logger.Trace("Running Schedule " + (SchedulingProvider.SchedulerMode));
    var scheduler = SchedulingProvider.Instance();
    var requestScheduleThread = new Thread(scheduler.ExecuteTasks) {IsBackground = true};
    requestScheduleThread.Start();
    SchedulingProvider.ScheduleLastPolled = DateTime.Now;
}

////////////////////////////////////////////////////

The issue is that SchedulingProvider.ReadyForPoll is not protected by any kind of lock or semaphore. So between the time when ReadyForPoll is tested and the time when SchedulingProvider.ScheduleLastPolled is set (which is how ReadyForPoll actually gets turned to false) there is potentially significant lag. Creating a worker thread takes some time in itself, potentially many thousands of clock cycles, and on top of that the values behind ReadyForPoll and ScheduleLastPolled are stored in and read back from the DNN DataCache, adding even more clock cycles.
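To make that window concrete, here is a minimal, self-contained sketch of the same check-then-act pattern. It is not DNN code: `RaceSketch`, `readyForPoll` and `startAttempts` are hypothetical stand-ins for the ReadyForPoll test and the thread start. A `Barrier` is used purely to hold both simulated requests inside the window so the effect shows up on every run. In the real system the overlap would depend on timing.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the unguarded scheduler start check; not DNN code.
public static class RaceSketch
{
    private static bool readyForPoll = true;  // plays the role of ReadyForPoll
    private static int startAttempts;         // counts simulated scheduler starts

    public static int Run()
    {
        readyForPoll = true;
        startAttempts = 0;

        // The barrier keeps both "requests" between the test and the flag
        // update, which is the window described in the post.
        using (var window = new Barrier(2))
        {
            ThreadStart request = () =>
            {
                if (Volatile.Read(ref readyForPoll))            // the test
                {
                    window.SignalAndWait();                     // both requests have passed the test
                    Interlocked.Increment(ref startAttempts);   // "start a scheduler thread"
                    Volatile.Write(ref readyForPoll, false);    // flag is cleared too late
                }
            };

            var first = new Thread(request);
            var second = new Thread(request);
            first.Start();
            second.Start();
            first.Join();
            second.Join();
        }

        return startAttempts;
    }

    public static void Main()
    {
        // Prints "Start attempts: 2": both requests started a scheduler.
        Console.WriteLine("Start attempts: " + Run());
    }
}
```

Because neither thread clears the flag until both have passed the test, both go on to "start" the scheduler, which matches the behaviour suspected above.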

It seems to me that what is needed here is for that block of code to be wrapped in a lock() statement on a static object, so that there is no chance of two separate HTTP requests getting into that code block at the same time.
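A rough sketch of that idea, using hypothetical names (`GuardedStartSketch`, `StartLock`, `readyForPoll`) rather than the actual DNN members: the test and the flag update sit inside the same lock on a static object, so only one caller can ever see the flag as ready. In the real code the lock would presumably need to cover both the ReadyForPoll test and the ScheduleLastPolled assignment. Note also that a lock() only serialises requests within a single application domain.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a guarded scheduler start; not DNN code.
public static class GuardedStartSketch
{
    private static readonly object StartLock = new object();
    private static bool readyForPoll = true;  // plays the role of ReadyForPoll
    private static int startAttempts;         // counts simulated scheduler starts

    // Called once per simulated HTTP request.
    public static void OnRequest()
    {
        lock (StartLock)
        {
            if (!readyForPoll)
            {
                return;               // another request already claimed the start
            }

            readyForPoll = false;     // clear the flag before leaving the lock
            startAttempts++;          // the worker thread would be started here
        }
    }

    public static int Run(int requests)
    {
        lock (StartLock)
        {
            readyForPoll = true;
            startAttempts = 0;
        }

        var threads = new Thread[requests];
        for (int i = 0; i < requests; i++)
        {
            threads[i] = new Thread(OnRequest);
            threads[i].Start();
        }

        foreach (var thread in threads)
        {
            thread.Join();
        }

        lock (StartLock)
        {
            return startAttempts;
        }
    }

    public static void Main()
    {
        // Prints "Start attempts: 1" however many requests race for the start.
        Console.WriteLine("Start attempts: " + Run(8));
    }
}
```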

Would be interested in hearing other people's thoughts on this one.

Westa

 
New Post
2/26/2014 9:44 AM
 
Nils - to answer your question more specifically: this issue is potentially happening above the layer that maxThreads relates to.

It is the scheduler provider that uses the value of maxThreads to create its thread pool. In the case above, what seems to be happening is that, under rare circumstances, more than one scheduler provider instance may be created, in which case each instance would then possibly set about creating its own thread pool.

Westa

PS...

It's only a theory at the moment, from doing a manual code trace. I don't have time to do any actual debug walkthroughs, which are pretty nasty to do on threaded systems at the best of times.

And what I'm not totally sure of is the potential impact of the fact that the scheduler relies on a number of static members, which it would seem could collide if this is actually happening.
 
New Post
2/26/2014 2:36 PM
 
Wes Tatters wrote:
[...]

It seems to me that what is needed here is for that block of code to be wrapped in a lock() statement on a static object, so that there is no chance of two separate HTTP requests getting into that code block at the same time.

Would be interested in hearing other people's thoughts on this one.

Westa

Hi Westa,

Thank you very much for your extensive answer. It is really appreciated.

The above statement is exactly what I was thinking is happening here. The issue of multiple threads starting can be recreated as follows:

  1. Create five or so schedulers.
  2. Set "Frequency" and "Retry Lapse Time" to the same value on each, e.g. 20min/20min.
  3. Make those schedulers write into the same file.

This should give you an exception after a short period of time, when the scheduler triggers more than one thread. The write operation has to take some time, though, as otherwise it might complete before the next thread tries to open the file for write access.
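The file contention that these steps rely on can be sketched outside DNN. The example below is only an illustration under stated assumptions: `SharedFileSketch` and the temp file name are hypothetical, and the "schedulers" are reduced to two plain writers. The first writer holds the file open with exclusive access, standing in for a slow write, and the second writer then tries to open it.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical illustration of two writers colliding on one file; not DNN code.
public static class SharedFileSketch
{
    // Returns true if the second writer was rejected while the first held the file.
    public static bool SecondWriterFails(string path)
    {
        // First writer: opens the file exclusively and keeps it open,
        // standing in for a write operation that takes some time.
        using (var first = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.None))
        {
            first.WriteByte((byte)'a');

            bool failed = false;
            var second = new Thread(() =>
            {
                try
                {
                    // Second writer: tries to open the same file for writing.
                    using (new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.None))
                    {
                    }
                }
                catch (IOException)
                {
                    failed = true;  // the file is still held by the first writer
                }
            });

            second.Start();
            second.Join();
            return failed;
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "scheduler-race-sketch.log");
        Console.WriteLine("Second writer rejected: " + SecondWriterFails(path));
        File.Delete(path);
    }
}
```

If the first writer had already closed the file by the time the second one opened it, no exception would occur, which is why the write has to take some time for the repro to work.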

It is a rather interesting topic, so I might do some investigating if I find the time.


Cheers,
Nils

 
New Post
2/26/2014 6:41 PM
 

Ok - SOOOOOO - set up a debug test on a clean 7.2.1 install with source this morning.

Made sure the system was set up to use REQUEST_METHOD and added some debug output so that messages would appear in the debug output log. Compiled this build and ran a test or two. The code for the tests is below - will post the results in a second.

public static void RunSchedule(HttpRequest request)
{
    System.Diagnostics.Debug.WriteLine("dnn RUN SCHEDULER HTTPREQUEST");

    if (!IsUpgradeOrInstallRequest(request))
    {
        try
        {
            if (SchedulingProvider.SchedulerMode == SchedulerMode.REQUEST_METHOD && SchedulingProvider.ReadyForPoll)
            {
                System.Diagnostics.Debug.WriteLine("dnn START SCHEDULER ATTEMPT ");

                Logger.Trace("Running Schedule " + (SchedulingProvider.SchedulerMode));
                var scheduler = SchedulingProvider.Instance();
                var requestScheduleThread = new Thread(scheduler.ExecuteTasks) {IsBackground = true};
                requestScheduleThread.Start();
                SchedulingProvider.ScheduleLastPolled = DateTime.Now;
            }
        }
        catch (Exception exc)
        {
            System.Diagnostics.Debug.WriteLine("dnn START SCHEDULER - EXCEPTION");

            Exceptions.LogException(exc);
        }
    }
}


 
New Post
2/26/2014 6:50 PM
 
In a normal startup, with no overloading of the server or multiple requests (more on that soon), what you see in the log is:

dnn RUN SCHEDULER HTTPREQUEST
dnn START SCHEDULER ATTEMPT <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
dnn RUN SCHEDULER HTTPREQUEST
dnn RUN SCHEDULER HTTPREQUEST

HOWEVER - and this is where the issue occurs - if a couple of requests are made to the server while the application pool is restarting, then, well, it seems like all bets are off.

To recreate this, the second test:

Recompiled the code to force the application pool to release the DNN context. This could also be done by simply editing the web.config.

The w3wp.exe worker process is still running, but the DNN application is not, so we are in a restart state.
I then attached Visual Studio to w3wp.exe. Doing this makes sure we see all activity that happens as the DNN application starts up.

I then opened up 3 blank web browser pages and entered the URL of the test site's DNN home page in each, ready to go.
With the debugger running, I then clicked the browse button on each page in quick succession.


'w3wp.exe' (Managed (v4.0.30319)): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.Windows.Forms\v4.0_4.0.0.0__b77a5c561934e089\System.Windows.Forms.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
dnn RUN SCHEDULER HTTPREQUEST
dnn START SCHEDULER ATTEMPT <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
dnn RUN SCHEDULER HTTPREQUEST
dnn RUN SCHEDULER HTTPREQUEST
dnn START SCHEDULER ATTEMPT <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
The thread '' (0x3fa4) has exited with code 0 (0x0).
dnn START SCHEDULER ATTEMPT <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
The thread '' (0x53d8) has exited with code 0 (0x0).
The thread '' (0xc64) has exited with code 0 (0x0).
....
'w3wp.exe' (Managed (v4.0.30319)): Loaded 'Microsoft.GeneratedCode'
dnn RUN SCHEDULER HTTPREQUEST
dnn RUN SCHEDULER HTTPREQUEST
A first chance exception of type 'System.Threading.ThreadAbortException' occurred in mscorlib.dll
...
An exception of type 'System.Threading.ThreadAbortException' occurred in mscorlib.dll but was not handled in user code
dnn RUN SCHEDULER HTTPREQUEST
A first chance exception of type 'System.Threading.ThreadAbortException' occurred in mscorlib.dll
An exception of type 'System.Threading.ThreadAbortException' occurred in mscorlib.dll but was not handled in user code
The thread '' (0x61bc) has exited with code 0 (0x0).

=============================================================


