public void ChangeCommuteItems(int startAtRecord)
{
using (DataContext context = new DataContext())
{
var commuteItems = context.FloridaCommuteItems.OrderBy(c=>c.commuteItemID);
var commuteDays = context.FloridaCommuteDays.Where(d => d.addedDate >= new DateTime(2008, 8, 17, 16, 30, 00));
var theJoin = from c in commuteItems
from d in commuteDays
where c.commuteDayID == d.commuteDayID
select c;
var theJoinArray = theJoin.ToArray();
var itemsToConvert = theJoin.Count();
int convertedSoFar = 0;
do
{
int batchCount = itemsToConvert-convertedSoFar < 1000 ? itemsToConvert-convertedSoFar : 1000;
try
{
for (int i = 0; i < batchCount; i++)
{
var distance = theJoinArray[i].commuteDistance.ConvertKilometersToMiles();
theJoinArray[i].commuteDistance = distance;
convertedSoFar++;
Console.WriteLine("Converted record {0} of {1}",convertedSoFar, itemsToConvert );
}
Console.WriteLine("I'm Going to submit Changes Now");
SubmitChanges(context);
Console.WriteLine("I'm Done Submitting Changes");
}
catch (Exception)
{
throw new ApplicationException(String.Format("Florida Commute Items: Started at record {0}, failed on item {1}", startAtRecord, convertedSoFar));
}
} while (convertedSoFar < itemsToConvert);
}
}
Refactorings
No refactoring yet !
Ants
August 27, 2009, August 27, 2009 21:11, permalink
As much as I despise SQL, I think this is a case where you do want to have a SQL UPDATE statement that does the conversion in one big transaction. I know that this is completely opposed to your thought of doing batch processing.
Why do I make this recommendation even though it pains me do do it? Unless your commuteDistance field is a complex type that records its unit of measure, chances are very likely that it is just a simple int. Consider what happens if you have 5000 records, and the 3456th record an exception is thrown. Sure the roll back will happen for records 3000 to 4000, but now you have the first 3000 records stored in miles, and the last 2000 records in kilometers. The next time you rerun, will theJoin, know which records are kilometers and which are miles?
Ah, but you'll say "That's what the startAtRecord parameter is for.", but it looks like startAtRecord isn't used except to build the ApplicationException message string. It doesn't seem to be used to skip the first few records.
Anyway, if you really want to go down this batching route, here's the way I would refactor it. I got rid of itemsToConvert since it seemed like it is only used for the UI. Additionally, I added logic to respect the startAtRecord parameter.
void SubmitChanges(DataContext context, int startAtRecord, int currentRecord)
{
try
{
Console.WriteLine("I'm Going to submit Changes Now");
SubmitChanges(context);
Console.WriteLine("I'm Done Submitting Changes");
}
catch (Exception)
{
throw new ApplicationException(String.Format("Florida Commute Items: Started at record {0}, failed on item {1}", startAtRecord, currentRecord));
}
}
public void ChangeCommuteItems(int startAtRecord)
{
using (DataContext context = new DataContext())
{
var commuteItems = context.FloridaCommuteItems.OrderBy(c=>c.commuteItemID);
var commuteDays = context.FloridaCommuteDays.Where(d => d.addedDate >= new DateTime(2008, 8, 17, 16, 30, 00));
var theJoin = from c in commuteItems
from d in commuteDays
where c.commuteDayID == d.commuteDayID
select c;
int currentRecord = 0;
int converted = 0;
foreach(var item in theJoin)
{
if (currentRecord++ < startAtRecord)
continue;
item.commuteDistance = item.commuteDistance.ConvertKilometersToMiles();
++converted;
Console.WriteLine("Converted record {0}", currentRecord);
if (converted % 1000 == 0)
SubmitChanges(context, startAtRecord, currentRecord);
}
SubmitChanges(context, startAtRecord, currentRecord);
}
}
I need to do a bunch of data updates to some db records and want to batch the submits to improve performance and to avoid a giant rollback in case of an exception. This is how I am doing it, which works but the conversion to an array seems inelegant.
Note, this is a one-off, so things like magic numbers for the batch size are present. Let that part go.