A1e6f320dcdb6a0afd7fd19a1e5f932e

Hello,

I created a this "ThreadSafeCachedEnumerable<T>" to speed up a few parts of my application. The idea is to create cache items retrieved from an IEnumerable<T>.

Unfortunately, the code is getting a little ugly and the performance is hindered by the locking. Any suggestions for making it easier to read or perform better would be appreciated.

Thanks,
-Charles

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;

namespace ConsoleApplication1
{
    /// <summary>
    /// Wraps an IEnumerable&lt;T&gt; and provides a thread-safe means of caching the values."/>
    /// </summary>
    /// <typeparam name="T"></typeparam>
    class ThreadSafeCachedEnumerable<T> : IEnumerable<T>
    {
        // An enumerator from the original IEnumerable<T>
        private IEnumerator<T> enumerator;

        // The items we have already cached (from this.enumerator)
        private IList<T> cachedItems = new List<T>();

        public ThreadSafeCachedEnumerable(IEnumerable<T> enumerable)
        {
            this.enumerator = enumerable.GetEnumerator();
        }

        #region IEnumerable<T> Members

        public IEnumerator<T> GetEnumerator()
        {
            // The index into the sequence
            int currentIndex = 0;

            // We will break with yield break 
            while (true)
            {
                // The currentIndex will never be decremented,
                // so we can check without locking first
                if (currentIndex < this.cachedItems.Count)
                {
                    var current = this.cachedItems[currentIndex];
                    currentIndex += 1;
                    yield return current;
                }
                else
                {
                    // If !(currentIndex < this.cachedItems.Count),
                    // we need to synchronize access to this.enumerator
                    lock (enumerator)
                    {
                        // See if we have more cached items ...
                        if (currentIndex < this.cachedItems.Count)
                        {
                            var current = this.cachedItems[currentIndex];
                            currentIndex += 1;
                            yield return current;
                        }
                        else
                        {
                            // ... otherwise, we'll need to get the next item from this.enumerator.MoveNext()
                            if (this.enumerator.MoveNext())
                            {
                                // capture the current item and cache it, then increment the currentIndex
                                var current = this.enumerator.Current;
                                this.cachedItems.Add(current);
                                currentIndex += 1;
                                yield return current;
                            }
                            else
                            {
                                // We reached the end of the enumerator - we're done
                                yield break;
                            }
                        }
                    }
                }
            }
        }

        #endregion

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return this.GetEnumerator();
        }

        #endregion
    }
}

Refactorings

No refactoring yet !

F9a9ba6663645458aa8630157ed5e71e

Ants

July 8, 2009, July 08, 2009 02:52, permalink

No rating. Login to rate!

This is probably more readable.

using System;
using System.Collections.Generic;
using System.Linq;
using Xunit;
using System.Threading;

namespace ConsoleApplication1
{
    /// <summary>
    /// Wraps an IEnumerable<typeparamref name="T"/> and 
    /// provides a thread-safe means of caching the values.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    class ThreadSafeCachedEnumerable<T> : IEnumerable<T>
    {
        private IEnumerator<T> baseEnumerator;
        private IList<T> cachedItems = new List<T>();

        public ThreadSafeCachedEnumerable(IEnumerable<T> enumerable)
        {
            this.baseEnumerator = enumerable.GetEnumerator();
        }

        bool CacheNextItem(int currentIndex)
        {
            lock (baseEnumerator)
            {
                while (currentIndex >= this.cachedItems.Count)
                {
                    if (!this.baseEnumerator.MoveNext())
                        return false;
                    this.cachedItems.Add(this.baseEnumerator.Current);
                }
            }
            return true;
        }

        public IEnumerator<T> GetEnumerator()
        {
            int currentIndex = 0;

            while (     currentIndex < this.cachedItems.Count
                    ||  CacheNextItem(currentIndex))
            {
                yield return this.cachedItems[currentIndex++];
            }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return this.GetEnumerator();
        }
    }

    public class ThreadSafeCachedEnumerableFacts
    {
        [Fact]
        public void CanHandleEmpty()
        {
            var enumerable = new ThreadSafeCachedEnumerable<int>(new int [] { } );
            Assert.Empty(enumerable);
        }

        [Fact]
        public void CanHandleOne()
        {
            var enumerable = new ThreadSafeCachedEnumerable<int>(new int[] { 3 });
            Assert.Equal(1, enumerable.Count());
            Assert.True(enumerable.Contains(3));
        }

        [Fact]
        public void CanHandleTwo()
        {
            var enumerable = new ThreadSafeCachedEnumerable<int>(new int[] { 3, 5 });
            Assert.Equal(2, enumerable.Count());
            Assert.True(enumerable.Contains(3));
            Assert.True(enumerable.Contains(5));
        }

        [Fact]
        public void UsesCache()
        {
            var enumerable = new ThreadSafeCachedEnumerable<int>(new RandomNumbers(10));
            var list1 = new List<int>(enumerable);
            var list2 = new List<int>(enumerable);

            Assert.Equal(list1.Count, list2.Count);
            for (int i = 0; i < 10; ++i)
                Assert.Equal(list1[i], list2[i]);
        }

        [Fact]
        public void IEnumerableInterfaceWorks()
        {
            var enumerable = (System.Collections.IEnumerable) new ThreadSafeCachedEnumerable<int>(new int[] { 3 });
            var list = new List<int>();
            foreach (int i in enumerable)
                list.Add(i);
            Assert.Equal(1, list.Count);
            Assert.True(list.Contains(3));
        }

        class RandomNumbers : IEnumerable<int>
        {
            int m_count;
            Random m_random = new Random();

            public RandomNumbers(int count)
            {
                m_count = count;
            }

            public IEnumerator<int> GetEnumerator()
            {
                for (int i = 0; i < m_count; ++i)
                    yield return m_random.Next();
            }

            System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
            {
                return this.GetEnumerator();
            }
        }
    }
}
9db4be375fa804f8d1211435fdf6b465

Matt Kerr

November 1, 2010, November 01, 2010 13:16, permalink

No rating. Login to rate!

You could improve this by replacing use of lock and List with ConcurrentDictionary from .NET Framework 4.0, where the key is the iteration index

020d2213bb97b769cfdf9ba45516fb1f

Vebyskarverar

May 4, 2011, May 04, 2011 20:49, permalink

No rating. Login to rate!

hello! somebody maybe use google?

<a href=http://google.com>google</a>

F9a9ba6663645458aa8630157ed5e71e

Ants

June 2, 2011, June 02, 2011 12:57, permalink

No rating. Login to rate!

@Chris: If you do call ToArry() or ToList(), then all the data gets retrieved right there and regardless of all the data being used or not, the point of IEnumerable is that the data is retrieved as needed.

To make an extreme point (because this is the wrong way to go about it), lets say that the IEnumerable enumerates all the .COM domain names ever registered in alphabetical order by doing batched queries to DNS. If user1 of the IEnumerable is just looking to see if "ChrisMartin.com" has been registered, they have to wait for "ZZZ.COM' to be put into the list or array, before the first item can be looked at. In the meantime, if user2 is using the the same IEnumerable instance in another thread looking for "Ants.com", then the same data has to be pulled down again (e.g. no caching).

A12a0565d032994007450e9c7dea6976

Chris Martin

June 1, 2011, June 01, 2011 08:05, permalink

No rating. Login to rate!

You can eliminate this whole class by using IEnumerable.ToArray() or IEnumerable.ToList()

F9a9ba6663645458aa8630157ed5e71e

Ants

June 3, 2011, June 03, 2011 19:44, permalink

No rating. Login to rate!

@Matt: I can see the possibilities of using the ConcurrentDictionary to replace the List, but I can't see how I can do without the lock. If two different threads come in calling Next() on their respective enumerators, how do I keep the calls from interweaving the calls to baseEnumerator.Next()?

See my code above. In CacheItem() if there is a thread switch right after I call baseEnumerator.MoveNext(), but before I put the baseEnumerator.Current item into the list (to be replaced by ConcurrentDictionary), there a chance of the other thread to catch up and call baseEnumerator.MoveNext() as well. Now an item from the baseEnumerator has been skipped.

Your refactoring





Format Copy from initial code

or Cancel