I have the following method:
    public IObservable<DataManagementWorkItem> GetWorkItemSource(int maxConcurrentCalls)
    {
        return m_namespaceManager
            .GetNamespaceConnectionInfoSource(true, drainAndDisable: false)
            .Select(nci => Observable.Defer(() => GetPolicySourceForNamespace(nci)))
            .Merge(maxConcurrentCalls)
            .Where(IsValid)
            .Select(ToWorkItem)
            .Where(o => o != null);
    }
It implements the following logic:
- Enter the monad by obtaining IObservable<NamespaceConnectionInfo> from the namespace manager (GetNamespaceConnectionInfoSource).
- As namespaces become available, obtain the IObservable<DataManagementPolicy> corresponding to each namespace (GetPolicySourceForNamespace). However, use the Merge operator to restrict the number of concurrent calls to GetPolicySourceForNamespace.
- Filter out bad DataManagementPolicy records (cannot be done in SQL).
- Translate the seemingly good DataManagementPolicy records to DataManagementWorkItem instances. Some could turn out as null, so they are filtered out at the end.
GetNamespaceConnectionInfoSource can fault after having produced a number of valid NamespaceConnectionInfo objects. It is entirely possible that a number of DataManagementWorkItem objects have already been produced in the final observable sequence by that time.
I have a unit test, where:
- GetNamespaceConnectionInfoSource throws after having produced 25 namespaces
- GetPolicySourceForNamespace produces 10 objects per namespace
- The concurrency limit is 10
I am also interested in examining the items produced in the final observable before it faults:
    var dm = DependencyResolver.Instance.GetInstance<IDataManagement>();
    var workItems = new List<DataManagementWorkItem>();
    try
    {
        var obs = dm.GetWorkItemSource(10);
        obs.Subscribe(wi => workItems.Add(wi));
        await obs;
        Assert.Fail("An expected exception was not thrown");
    }
    catch (Exception exc)
    {
        AssertTheRightException(exc);
    }
The workItems collection has a different number of items every time. In one run it has 69 items, in another 50, in yet another 18.
My interpretation is that when the fault occurs there are good NamespaceConnectionInfo and DataManagementPolicy objects in various phases of processing, all of which get aborted because of the fault. The number differs each time, because the items are produced asynchronously.
And here lies my problem - I do not want them to be aborted. I want them to run to completion, be produced in the final observable sequence and only then to communicate the fault. In essence I want to hold the exception and re-throw it at the end.
I tried to modify the implementation a little bit:
    public IObservable<DataManagementWorkItem> GetWorkItemSource(int maxConcurrentCalls)
    {
        Exception fault = null;
        return m_namespaceManager
            .GetNamespaceConnectionInfoSource(true, drainAndDisable: false)
            .Catch<NamespaceConnectionInfo, Exception>(exc =>
            {
                fault = exc;
                return Observable.Empty<NamespaceConnectionInfo>();
            })
            .Select(nci => Observable.Defer(() => GetPolicySourceForNamespace(nci)))
            .Merge(maxConcurrentCalls)
            .Where(IsValid)
            .Select(ToWorkItem)
            .Where(o => o != null)
            .Finally(() =>
            {
                if (fault != null)
                {
                    throw fault;
                }
            });
    }
Needless to say, it did not work. Finally does not seem to propagate any exceptions, which I actually agree with.
So, what is the right way to achieve what I want?
EDIT
Unrelated to the question, I have found that the test code I use to collect the produced DataManagementWorkItem instances is bad. Instead of
    var obs = dm.GetWorkItemSource(10);
    obs.Subscribe(wi => workItems.Add(wi));
    await obs;
it should be
    await dm.GetWorkItemSource(1).Do(wi => workItems.Add(wi));
The difference is that the latter subscribes to the source of items just once, whereas the original version subscribed twice:
- by Subscribe
- by await
It does not affect the question, but it screws up my mocking code.
Clarification
This is more of a clarification. Each namespace produces a sequence of 10 policy objects. But this process is asynchronous: the policy objects are produced sequentially, but asynchronously. During all that time namespaces continue to be produced, and hence, given 25 namespaces before the fault, there are three possible "states" in which a produced namespace can be:
- No policy objects have yet been produced for it, but the asynchronous policy production process has been started
- Some (but fewer than 10) policy objects have already been produced
- All 10 policy objects for the namespace have been produced
When an error occurs in the namespace production, the entire pipeline is aborted, regardless of the "state" the "good" namespaces are in at that moment.
Let us have a look at the following trivial example:
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Reactive.Linq;
    using System.Reactive.Subjects;
    using System.Threading;

    namespace observables
    {
        class Program
        {
            static void Main()
            {
                int count = 0;
                var obs = Observable
                    .Interval(TimeSpan.FromMilliseconds(1))
                    .Take(50)
                    .Select(i =>
                    {
                        if (25 == Interlocked.Increment(ref count))
                        {
                            throw new Exception("Boom!");
                        }
                        return i;
                    })
                    .Select(i => Observable.Defer(() => Observable.Interval(TimeSpan.FromMilliseconds(1)).Take(10).Select(j => i * 1000 + j)))
                    .Merge(10);
                var items = new HashSet<long>();
                try
                {
                    obs.Do(i => items.Add(i)).GetAwaiter().GetResult();
                }
                catch (Exception exc)
                {
                    Debug.WriteLine(exc.Message);
                }
                Debug.WriteLine(items.Count);
            }
        }
    }
When I run it I usually have the following output:
    Boom!
    192
But it could also display 191. However, if we apply the fault Concat solution (even though it does not work when there are no faults, because the fault subject then never completes):
    int count = 0;
    var fault = new Subject<long>();
    var obs = Observable
        .Interval(TimeSpan.FromMilliseconds(1))
        .Take(50)
        .Select(i =>
        {
            if (25 == Interlocked.Increment(ref count))
            {
                throw new Exception("Boom!");
            }
            return i;
        })
        .Catch<long, Exception>(exc =>
        {
            fault.OnError(exc);
            return Observable.Empty<long>();
        })
        .Select(i => Observable.Defer(() => Observable.Interval(TimeSpan.FromMilliseconds(1)).Take(10).Select(j => i * 1000 + j)))
        .Merge(10)
        .Concat(fault);
Then the output is consistently 240 (24 good namespaces times 10 items each), because we let all the asynchronous processes that have already been started run to completion.
An awkward solution based on the answer by pmccloghrylaing
    public IObservable<DataManagementWorkItem> GetWorkItemSource(int maxConcurrentCalls)
    {
        var fault = new Subject<DataManagementWorkItem>();
        bool faulted = false;
        return m_namespaceManager
            .GetNamespaceConnectionInfoSource(true, drainAndDisable: false)
            .Catch<NamespaceConnectionInfo, Exception>(exc =>
            {
                faulted = true;
                return Observable.Throw<NamespaceConnectionInfo>(exc);
            })
            .Finally(() =>
            {
                if (!faulted)
                {
                    fault.OnCompleted();
                }
            })
            .Catch<NamespaceConnectionInfo, Exception>(exc =>
            {
                fault.OnError(exc);
                return Observable.Empty<NamespaceConnectionInfo>();
            })
            .Select(nci => Observable.Defer(() => GetPolicySourceForNamespace(nci)))
            .Merge(maxConcurrentCalls)
            .Where(IsValid)
            .Select(ToWorkItem)
            .Where(o => o != null)
            .Concat(fault);
    }
It works both when the namespace production faults and when it succeeds, but it looks so awkward. Plus, multiple subscriptions still share the same fault subject. There must be a more elegant solution.
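For what it is worth, the sharing problem on its own could probably be avoided by wrapping the pipeline in Observable.Defer, so that every subscription captures its own fault. The following is only a minimal sketch under that assumption, not a tested implementation. MergeDelayingSourceFault and DelayedFaultExtensions are hypothetical names I made up for illustration; they are not part of Rx. Only faults of the outer source are held back; a fault in one of the inner sequences would still abort the pipeline immediately.

```csharp
using System;
using System.Reactive.Linq;

public static class DelayedFaultExtensions
{
    // Hypothetical helper, not an Rx operator: merges the inner sequences with a
    // concurrency limit and re-raises a fault of the OUTER source only after all
    // inner sequences that were already started have completed.
    public static IObservable<TResult> MergeDelayingSourceFault<TSource, TResult>(
        this IObservable<TSource> source,
        Func<TSource, IObservable<TResult>> selector,
        int maxConcurrent)
    {
        // Defer runs this factory once per subscription, so the captured
        // variable below is not shared between subscribers.
        return Observable.Defer(() =>
        {
            Exception fault = null;

            return source
                // Remember the fault and end the outer sequence normally.
                .Catch<TSource, Exception>(exc =>
                {
                    fault = exc;
                    return Observable.Empty<TSource>();
                })
                .Select(item => Observable.Defer(() => selector(item)))
                .Merge(maxConcurrent)
                // Concat subscribes to its second sequence only after the merged
                // sequence has completed; at that point decide between a normal
                // completion and re-raising the remembered fault.
                .Concat(Observable.Defer(() => fault == null
                    ? Observable.Empty<TResult>()
                    : Observable.Throw<TResult>(fault)));
        });
    }
}
```

With such a helper, the Select/Defer/Merge part of GetWorkItemSource would presumably collapse into a single call like .MergeDelayingSourceFault(nci => GetPolicySourceForNamespace(nci), maxConcurrentCalls), followed by the same Where/Select/Where chain as before.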
GetNamespaceConnectionInfoSource source code
    public IObservable<NamespaceConnectionInfo> GetNamespaceConnectionInfoSource(bool? isActive = null,
        bool? isWorkflowEnabled = null, bool? isScheduleEnabled = null, bool? drainAndDisable = null,
        IEnumerable<string> nsList = null, string @where = null, IList<SqlParameter> whereParameters = null)
    {
        IList<SqlParameter> parameters;
        var sql = GetNamespaceConnectionInfoSqls.GetSql(isActive,
            isWorkflowEnabled, isScheduleEnabled, drainAndDisable, nsList, @where, whereParameters, out parameters);
        var sqlUtil = m_sqlUtilProvider.Get(m_siteSettings.ControlDatabaseConnString);
        return sqlUtil.GetSource(typeof(NamespaceConnectionInfo), sqlUtil.GetReaderAsync(sql, parameters)).Cast<NamespaceConnectionInfo>();
    }

    public IObservable<DbDataReader> GetReaderAsync(string query, IList<SqlParameter> parameters = null, CommandBehavior commandBehavior = CommandBehavior.Default)
    {
        return Observable.FromAsync(async () =>
        {
            SqlCommand command = null;
            try
            {
                var conn = await GetConnectionAsync();
                command = GetCommand(conn, query, parameters);
                return (DbDataReader)await command.ExecuteReaderAsync(commandBehavior | CommandBehavior.CloseConnection);
            }
            finally
            {
                DisposeSilently(command);
            }
        });
    }

    public IObservable<object> GetSource(Type objectType, IObservable<DbDataReader> readerTask)
    {
        return Observable.Create<object>(async (obs, ct) => await PopulateSource(objectType, await readerTask, true, obs, ct));
    }

    private static async Task PopulateSource(Type objectType, DbDataReader reader, bool disposeReader, IObserver<object> obs, CancellationToken ct)
    {
        try
        {
            if (IsPrimitiveDataType(objectType))
            {
                while (await reader.ReadAsync(ct))
                {
                    obs.OnNext(reader[0]);
                }
            }
            else
            {
                // Get all the properties in our Object
                var typeReflector = objectType.GetTypeReflector(TypeReflectorCreationStrategy.PREPARE_DATA_RECORD_CONSTRUCTOR);
                // For each property get the data from the reader to the object
                while (await reader.ReadAsync(ct))
                {
                    obs.OnNext(typeReflector.DataRecordConstructor == null ?
                        ReadNextObject(typeReflector, reader) :
                        typeReflector.DataRecordConstructor(reader));
                }
            }
        }
        catch (OperationCanceledException)
        {
        }
        finally
        {
            if (disposeReader)
            {
                reader.Dispose();
            }
        }
    }