1

I am running a background thread in my ASP.NET web service application. The thread's job is to query the database at a fixed interval and update a DataTable in the Cache. The DataTable has around 500K rows. In Task Manager, the web dev server process uses around 300,000K after the first run, goes to 500,000K on the next, sometimes climbs above 1,000,000K, and sometimes drops back to 500,000-600,000K. I am working on my local machine, so the data in the database is not changing. Can anyone please tell me what I am doing wrong in the code?

protected void Application_Start(object sender, EventArgs e)
    {
        Thread obj = new Thread(new ThreadStart(AddDataInCache));
        obj.IsBackground = true;
        obj.Start();
    }

private void AddDataInCache()
    {
        Int32 iCount = 0;
        while (true)
        {
            MyCollection _myCollection = new MyCollection();
            DataTable dtReferences = null;
            DataTable dtMainData = null;
            try
            {
                dtMainData = _myCollection.GetAllDataForCaching(ref dtReferences);

                HttpRuntime.Cache.Insert("DATA_ALL_CACHING", dtMainData, null,
                    Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration,
                    CacheItemPriority.Default, null);

                HttpRuntime.Cache.Insert("DATA_REFERENCES_CACHING", dtReferences, null,
                    Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration,
                    CacheItemPriority.NotRemovable, null
                    );
            }
            catch (Exception ex)
            {

            }
            finally
            {
                if (_myCollection != null)
                    _myCollection = null;

            }
            iCount++;
            Thread.Sleep(18000);
        }
    }

In GetAllDataForCaching I am getting a SqlDataReader from my Data Access layer as:

public DataTable GetAllDataForCaching(ref DataTable dReferenceTable)
{
      DataTable dtReturn = new DataTable();
      SqlDataReader dReader = null;
      try
      {
            dReader = SqlHelper.ExecuteReader(CommandType.StoredProcedure, "[GetDataForCaching]", null);
            if (dReader != null && dReader.HasRows)
            {
                  dtReturn.Load(dReader);
                  dReferenceTable = new DataTable();
                  if (dReader.HasRows)
                {
                    DataTable dtSchema = dReader.GetSchemaTable();
                    List<DataColumn> listCols = new List<DataColumn>();

                    if (dtSchema != null)
                    {
                        foreach (DataRow drow in dtSchema.Rows)
                        {
                            string columnName = System.Convert.ToString(drow["ColumnName"]);
                            DataColumn column = new DataColumn(columnName, (Type)(drow["DataType"]));
                            column.Unique = (bool)drow["IsUnique"];
                            column.AllowDBNull = (bool)drow["AllowDBNull"];
                            column.AutoIncrement = (bool)drow["IsAutoIncrement"];
                            listCols.Add(column);
                            dReferenceTable.Columns.Add(column);
                        }
                    }

                    while (dReader.Read())
                    {
                        DataRow dataRow = dReferenceTable.NewRow();
                        for (int i = 0; i < listCols.Count; i++)
                        {
                            dataRow[((DataColumn)listCols[i])] = dReader[i];
                        }
                        dReferenceTable.Rows.Add(dataRow);
                    }
                }
            }
      }
      finally
        {
            if (dReader != null)
            {
                if (dReader.IsClosed == false)
                    dReader.Close();
                dReader = null;
            }
        }
    return dtReturn;
}

I am using Visual Studio 2008.

Imran Balouch
  • One more thing I want to know: sometimes the Cache returns null. If anyone can shed light on that it would be very helpful. I am not posting it as a separate question because it is relevant to this code. – Imran Balouch Jul 04 '12 at 09:12
  • What's the idea with the dReferenceTable? Is it supposed to be just a copy of the dtReturn table? – user1429080 Jul 04 '12 at 10:06
  • Nope, dReferenceTable contains the references for statuses. For example, if statusId in dtReturn is 1, then dReferenceTable will have values for Id 1 in different languages, which are later joined with dtReturn to show the status in the user's language. – Imran Balouch Jul 04 '12 at 10:45
  • You are getting Cache as null because it has priority null. The Cache takes memory from the server, so the server manages the memory for cached data. You did not set any expiration, but you should set one of them. And before getting data from or inserting data into the Cache, we should always check whether it is null. Otherwise, each time the thread runs, a cache object is created in memory, and it will be deleted only when the Cache memory is full on the server. That's why you see different values in Task Manager. – Narendra Jul 06 '12 at 13:18

5 Answers

3

I'll start by addressing the follow up question:

... is that sometimes Cache returns null ...

This is presumably because it takes some time for the background thread to fill the cache. When Application_Start fires, you start the background thread and then Application_Start finishes. The application can then move on to other tasks, for instance processing a page.

If during the processing of the page, an attempt is made to access the cache before the initial run of AddDataInCache has finished, then the cache will return null.

Regarding memory consumption, I don't immediately see how you could improve the situation unless you are able to reduce the number of rows in the cached DataTables.

In the first call to AddDataInCache, the cache is empty to begin with. Then your GetAllDataForCaching creates two DataTables and fills them with data. This causes the process to acquire memory to store the data in the DataTables.

On the second and subsequent calls to AddDataInCache, the cache already holds all the data that was fetched on the previous run. Then you again create two new DataTables and fill them with data. This causes the memory consumption to go up again, because the process must hold both the preexisting data in the cache and the new data in the DataTables created in the second run. Once the second run has finished loading the data, you overwrite the preexisting data in the cache with the new data.

At this point the data that was in the cache from the first run becomes eligible for garbage collection. But that doesn't mean that the memory will be immediately reclaimed. The memory will be reclaimed when the garbage collector comes around and notices that the DataTables are no longer needed in memory.

Note that the cached items from the first run will only become eligible for garbage collection if no "live" objects are holding a reference to them. Make sure that you keep your cache usage short lived.

And while all of this is going on, your background thread will happily go about its business refreshing the cache. It's therefore possible that a third cache refresh occurs before the garbage collector releases the memory for the DataTables fetched in the first run, causing the memory consumption to increase further.

So in order to decrease the memory consumption, I think you will simply have to reduce the amount of data that you store in the cache (fewer rows, fewer columns). Increasing the time between cache refreshes might also be helpful.

And finally, make sure you are not keeping old versions of the cached objects alive by referencing them in long lived requests/application processes.
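
To illustrate what I mean by short-lived usage, here is a minimal sketch. The class name `CachedData`, the method `CountRowsForStatus` and the `StatusId` column are assumptions made up for the example; only the cache key comes from the question. The point is that the table is held in a local variable for the duration of one call, so a replaced table is not kept alive by a static field or similar.

```csharp
using System.Data;
using System.Web;

// Hypothetical helper, not part of the original code.
public static class CachedData
{
    // Returns the current table, or null if it is not in the cache
    // (not loaded yet, or evicted).
    public static DataTable GetMainData()
    {
        return HttpRuntime.Cache["DATA_ALL_CACHING"] as DataTable;
    }

    // Uses the table through a local variable only. When the method returns,
    // this code no longer references the table, so once the background
    // refresh replaces it in the cache it can be garbage collected.
    // "StatusId" is an assumed column name.
    public static int CountRowsForStatus(int statusId)
    {
        DataTable snapshot = GetMainData();
        if (snapshot == null)
            return 0;

        int count = 0;
        foreach (DataRow row in snapshot.Rows)
        {
            if (row["StatusId"] is int && (int)row["StatusId"] == statusId)
                count++;
        }
        return count;
    }
}
```

What you want to avoid is copying the result of `GetMainData()` into a static field or into session state, since that reference would keep an old copy of the 500K rows in memory after the cache has moved on.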

user1429080
  • The Cache is found null after the initial run of 'AddDataInCache'. I access the page once the Cache is loaded and it works fine, but after that I sometimes find the Cache null. I put a check before inserting into the cache: if the item is not null, then remove it. I am not sure when the garbage collector is going to take action. – Imran Balouch Jul 04 '12 at 12:10
  • You set `CacheItemPriority.Default` on key "DATA_ALL_CACHING". Try changing that to NotRemovable since you will anyway refresh it in a short while. Also: Explicitly removing before inserting the new copy to the cache will not help, so better not to do it. There is a small chance that some part of your code will try to access the cache between `Remove` and `Insert`. – user1429080 Jul 04 '12 at 12:22
  • What is your opinion if I put these 2 lines before `Thread.Sleep(18000);`: `GC.Collect();` and `GC.WaitForPendingFinalizers();`? What can the side effects be? I am fairly sure it is going to solve this memory issue. – Imran Balouch Jul 05 '12 at 14:32
  • @Imran Balouch That goes outside my field of expertise. I cannot in good conscience say anything about it. – user1429080 Jul 05 '12 at 14:45
  • GC.WaitForPendingFinalizers will do nothing for you in this case. Once you called GC.Collect() the memory is claimed and can go... – Marcelo De Zen Jul 10 '12 at 16:15
  • As I got the idea of garbage collection from your answer, I am awarding you the bounty. – Imran Balouch Jul 12 '12 at 07:27
2

A timer will be a lot more efficient than having the thread sleep like that. Timers are more memory- and CPU-efficient.

Peter Bromberg
2

I agree with Peter and recommend using System.Threading.Timer. You may find the following link useful:

http://blogs.msdn.com/b/tmarq/archive/2007/07/21/an-ounce-of-prevention-using-system-threading-timer-in-an-asp-net-application.aspx
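
For reference, a minimal sketch of what a timer-based refresh could look like. The class name `CacheRefresher` is made up for this example and is not taken from the linked article; the refresh delegate would be the body of the question's `while` loop, minus the `Thread.Sleep`. The guard with `Interlocked` is there because timer callbacks run on thread-pool threads and can overlap if a refresh takes longer than the period.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs a refresh delegate on a timer and skips a tick
// if the previous refresh is still running.
public sealed class CacheRefresher : IDisposable
{
    private readonly Action _refresh;
    private readonly Timer _timer;
    private int _running; // 0 = idle, 1 = refresh in progress

    public CacheRefresher(Action refresh, TimeSpan period)
    {
        _refresh = refresh;
        // A due time of zero fires the first refresh immediately.
        _timer = new Timer(OnTick, null, TimeSpan.Zero, period);
    }

    private void OnTick(object state)
    {
        // Skip this tick if another refresh has not finished yet.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
            return;
        try
        {
            _refresh();
        }
        catch (Exception ex)
        {
            // Log instead of swallowing, so failed refreshes are visible.
            System.Diagnostics.Trace.TraceError("Cache refresh failed: {0}", ex);
        }
        finally
        {
            Interlocked.Exchange(ref _running, 0);
        }
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```

In `Application_Start` you would create it once and keep it in a static field so that it is not garbage collected, for example `new CacheRefresher(RefreshCacheOnce, TimeSpan.FromSeconds(18))`, where `RefreshCacheOnce` is an assumed method containing one iteration of the original loop. Note that this changes how the refresh is scheduled, not how much data each refresh loads.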

Mohan Laal
  • Thanks for the answer, but a timer does almost the same thing the thread was doing. And thanks for sharing the link, it sounds like a good implementation. – Imran Balouch Jul 02 '12 at 14:31
1

I solved it by putting the following code before Thread.Sleep(18000);

GC.Collect();
GC.WaitForPendingFinalizers();

It is keeping the memory under control so far.

Imran Balouch
  • I am not marking it as an answer yet, as I will wait for some more feedback, if possible. – Imran Balouch Jul 06 '12 at 07:25
  • If you want any solution, you can go with this, but it is not a good one. GC.Collect() is almost never a good solution: you start telling the GC how it must work, which is not a good idea, and the GC behaves differently depending on OS and hardware configuration. What you should do is read from dReader and put the result in the cache instead of making a whole copy of the data in dReference. – Marcelo De Zen Jul 10 '12 at 16:17
0

Firstly, you should use using (see IDisposable) when working with a database connection, command, reader, etc.
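
A rough sketch of what that could look like for the method in the question. I am assuming here that the stored procedure returns two result sets (the original code reads the reader twice, which suggests so) and that a plain connection string is available; the original goes through SqlHelper, so the class name `CachingDataAccess`, the helper `LoadBoth` and the `connectionString` parameter are illustrative only.

```csharp
using System.Data;
using System.Data.SqlClient;

// Hypothetical replacement for the data access method in the question.
public static class CachingDataAccess
{
    // Loads two consecutive result sets from a reader and disposes it.
    public static DataTable LoadBoth(IDataReader reader, out DataTable references)
    {
        DataTable main = new DataTable();
        references = new DataTable();
        using (reader)
        {
            // DataTable.Load consumes the current result set and then moves
            // the reader to the next one, if there is one.
            main.Load(reader);
            if (!reader.IsClosed)
                references.Load(reader);
        }
        return main;
    }

    // connectionString is an assumption; adapt this to SqlHelper as needed.
    public static DataTable GetAllDataForCaching(string connectionString, out DataTable references)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand("[GetDataForCaching]", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            connection.Open();
            // The reader is disposed inside LoadBoth; the command and the
            // connection are disposed when the using blocks exit.
            return LoadBoth(command.ExecuteReader(), out references);
        }
    }
}
```

Letting `Load` fill the second table also avoids the manual schema and row copy loop, so the reader, command and connection are all released even if an exception is thrown halfway through.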

Secondly, the web cache can be cleared by application pool recycling or an IIS reset. That's why you cannot rely on having your items in the cache "forever". This is a safe way to get the data:

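private DataTable GetDataWithReferences(out DataTable dtReferences)
{
    // The cache indexer returns object, so cast; "as" yields null on a cache miss.
    dtReferences = HttpRuntime.Cache["DATA_REFERENCES_CACHING"] as DataTable;
    DataTable dtMainData = HttpRuntime.Cache["DATA_ALL_CACHING"] as DataTable;
    if ( null == dtMainData || null == dtReferences )
    {
        // Reload both tables if either one is missing from the cache.
        // Assumes GetAllDataForCaching is changed to take an out parameter (ref - why?).
        dtMainData = _myCollection.GetAllDataForCaching(out dtReferences);
        // cache insert
    }

    return dtMainData;
}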
Peter Ivan