c# - Most Optimal TPL Dataflow Design? -

- July 15, 2010

I am asking for input on how best to design an optimal architecture using TPL Dataflow. I have not written any code yet, so there is no sample code I can post. I am not looking for code (unless volunteered) either, but assistance with the design would be much appreciated:

The requirements are as follows:

I have 3 core datablocks that depend on each other in specific ways. Datablock1 is a producer that produces objects of type Foo1. Datablock2 is supposed to subscribe to Foo1 objects (from Datablock1) and potentially (not for each and every Foo1, but subject to a specific function) produce Foo2 objects, which it stores in an output queue for other datablocks to consume. Datablock3 also consumes Foo1 objects (from Datablock1) and potentially produces Foo3 objects, which Datablock2 consumes and transforms into Foo2 objects.

In summary, here are the datablocks and what each one produces and consumes:

Datablock1: produces(Foo1), consumes(nothing)
Datablock2: produces(Foo2), consumes(Foo1, Foo3)
Datablock3: produces(Foo3), consumes(Foo1)

An additional requirement is that the same Foo1 is processed at the same time in Datablock2 and Datablock3. It would also be OK if Foo1 objects are first consumed by Datablock2 and, once Datablock2 has done its work, the same Foo1 objects are posted to Datablock3 to do its work. Foo2 objects coming out of Datablock2 can result from operations on either Foo1 objects or Foo3 objects.

I hope this makes sense; I am happy to explain more if anything is still unclear.

My first idea was to create TPL Dataflow blocks for each of the 3 datablocks and make them handle incoming streams of different object types. Another idea was to split the datablocks and have each datablock handle streams of one single object type only. What do you recommend, or is there a better solution that may work?

svick has already helped with Datablock1 and it is operational; I am just stuck on how to go about transforming my current environment (as described above) to TPL Dataflow.

Any ideas or pointers are appreciated.

Let's split this problem into three parts and solve each independently.

The first one is how to produce an item conditionally. I think the best option is to use a TransformManyBlock and let your function return a collection with one or zero items.
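
To illustrate, here is a minimal sketch of that idea. The `Foo1` and `Foo2` classes, their `Value` property, and the "only even values produce a Foo2" rule are all hypothetical stand-ins for your real types and your real condition; only the block types and their methods come from TPL Dataflow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical message types, invented for this sketch.
public sealed class Foo1 { public int Value { get; set; } }
public sealed class Foo2 { public int Value { get; set; } }

public static class ConditionalProduction
{
    public static async Task<List<int>> RunAsync(IEnumerable<int> inputs)
    {
        // Assumed rule: an even Foo1 yields exactly one Foo2, an odd one yields nothing.
        var block2 = new TransformManyBlock<Foo1, Foo2>(foo1 =>
            foo1.Value % 2 == 0
                ? new[] { new Foo2 { Value = foo1.Value * 10 } }
                : Enumerable.Empty<Foo2>());

        // Collect the output; ActionBlock runs one item at a time by default,
        // so the plain List is not accessed concurrently.
        var results = new List<int>();
        var sink = new ActionBlock<Foo2>(foo2 => results.Add(foo2.Value));
        block2.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var value in inputs)
            block2.Post(new Foo1 { Value = value });

        block2.Complete();
        await sink.Completion;
        return results;
    }
}
```

Returning an empty sequence means nothing ever enters the output buffer for that input, so downstream blocks never have to filter anything out.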

Another option is to link the two blocks conditionally, so that nulls are ignored, and return null when you don't want to produce anything. If you do that, you also have to link the source to a NullTarget, so that the nulls don't stay in its output buffer.
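
A rough sketch of the null-filtering variant, under the same assumptions (the `Foo1`/`Foo2` classes and the even-number rule are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical message types, invented for this sketch.
public sealed class Foo1 { public int Value { get; set; } }
public sealed class Foo2 { public int Value { get; set; } }

public static class NullFiltering
{
    public static async Task<List<int>> RunAsync(IEnumerable<int> inputs)
    {
        // Assumed rule: return null when no Foo2 should be produced.
        var block2 = new TransformBlock<Foo1, Foo2>(foo1 =>
            foo1.Value % 2 == 0 ? new Foo2 { Value = foo1.Value * 10 } : null);

        var results = new List<int>();
        var sink = new ActionBlock<Foo2>(foo2 => results.Add(foo2.Value));

        // First link: only non-null items are offered to the real consumer.
        block2.LinkTo(sink,
            new DataflowLinkOptions { PropagateCompletion = true },
            foo2 => foo2 != null);

        // Second link: everything the first link declined (the nulls) is discarded,
        // so it does not sit in the output buffer and block later items.
        block2.LinkTo(DataflowBlock.NullTarget<Foo2>());

        foreach (var value in inputs)
            block2.Post(new Foo1 { Value = value });

        block2.Complete();
        await sink.Completion;
        return results;
    }
}
```

The order of the two `LinkTo` calls matters here: links are tried in the order they were added, so the filtered link has to come before the catch-all one.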

The second problem is how to send Foo1s to both block #2 and block #3. I can see two ways here:

1. Use a BroadcastBlock linked to both target blocks (#2 and #3). Be careful with this: because BroadcastBlock doesn't have an output queue, if a target block postpones an item, it means that block won't process it. Because of this, you shouldn't set BoundedCapacity on blocks #2 and #3 in this case. If you don't set it, they will never postpone and every message will be processed by both blocks.
2. After processing a Foo1 in block #2, manually Post() it (or better, SendAsync() it) to block #3.
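
The BroadcastBlock option could look roughly like the sketch below. The `Foo1` class is hypothetical, and the two ActionBlocks merely stand in for your real blocks #2 and #3; both are left at the default (unbounded) capacity, as recommended above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical message type, invented for this sketch.
public sealed class Foo1 { public int Value { get; set; } }

public static class BroadcastSketch
{
    public static async Task<(List<int> SeenBy2, List<int> SeenBy3)> RunAsync(IEnumerable<int> inputs)
    {
        // The cloning function returns the same instance, so both targets share each Foo1.
        var broadcast = new BroadcastBlock<Foo1>(foo1 => foo1);

        var seenBy2 = new List<int>();
        var seenBy3 = new List<int>();

        // Stand-ins for blocks #2 and #3; no BoundedCapacity is set, so they never postpone.
        var block2 = new ActionBlock<Foo1>(foo1 => seenBy2.Add(foo1.Value));
        var block3 = new ActionBlock<Foo1>(foo1 => seenBy3.Add(foo1.Value));

        var propagate = new DataflowLinkOptions { PropagateCompletion = true };
        broadcast.LinkTo(block2, propagate);
        broadcast.LinkTo(block3, propagate);

        foreach (var value in inputs)
            broadcast.Post(new Foo1 { Value = value });

        broadcast.Complete();
        await Task.WhenAll(block2.Completion, block3.Completion);
        return (seenBy2, seenBy3);
    }
}
```

Since the same Foo1 instance reaches both blocks, treat it as read-only there, or pass a real cloning function to the BroadcastBlock constructor.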

I'm not sure what “at the same time” means here, but in general, TPL Dataflow doesn't make any guarantees about the order of processing between independent blocks. You can alter the priority of different blocks by using a custom TaskScheduler, but I'm not sure that would be useful here.

The last and most complicated problem is how to process items of different types in a single block. There are several ways to do this, though I'm not sure which would be best for you:

1. Don't process them in a single block. Have one TransformBlock<Foo1, Foo2> and one TransformBlock<Foo3, Foo2>. You can then link them both to a single BufferBlock<Foo2>.
2. As suggested, use a BatchedJoinBlock<Foo1, Foo3> with a batchSize of 1. This means that the resulting Tuple<IList<Foo1>, IList<Foo3>> will always contain either one Foo1 or one Foo3.
3. Enhance the previous solution by linking the BatchedJoinBlock to a TransformBlock that produces a more suitable type. That could be either Tuple<Foo1, Foo3> (where one of the items is always null), or something like the F# Choice<Foo1, Foo3>, which ensures that exactly one of the two is set.
4. Create a new block type from scratch that does exactly what you want. It should be an ISourceBlock<Foo2> and also have two properties: Target1 of type ITargetBlock<Foo1> and Target2 of type ITargetBlock<Foo3>, like the built-in join blocks.
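
As a sketch of the BatchedJoinBlock approach (options #2 and #3 combined), assuming hypothetical `Foo1`, `Foo2` and `Foo3` classes and a made-up `Origin` property that just records where each Foo2 came from:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical message types, invented for this sketch.
public sealed class Foo1 { public int Value { get; set; } }
public sealed class Foo3 { public int Value { get; set; } }
public sealed class Foo2 { public string Origin { get; set; } }

public static class BatchedJoinSketch
{
    public static async Task<List<string>> RunAsync()
    {
        // A batch size of 1 means each batch holds exactly one item: a Foo1 or a Foo3.
        var join = new BatchedJoinBlock<Foo1, Foo3>(1);

        // Turn the awkward tuple of lists into a Foo2, whichever side the item came from.
        var toFoo2 = new TransformBlock<Tuple<IList<Foo1>, IList<Foo3>>, Foo2>(batch =>
            batch.Item1.Count == 1
                ? new Foo2 { Origin = "Foo1:" + batch.Item1[0].Value }
                : new Foo2 { Origin = "Foo3:" + batch.Item2[0].Value });

        var origins = new List<string>();
        var sink = new ActionBlock<Foo2>(foo2 => origins.Add(foo2.Origin));

        var propagate = new DataflowLinkOptions { PropagateCompletion = true };
        join.LinkTo(toFoo2, propagate);
        toFoo2.LinkTo(sink, propagate);

        join.Target1.Post(new Foo1 { Value = 1 });
        join.Target2.Post(new Foo3 { Value = 2 });

        join.Complete();
        await sink.Completion;
        return origins;
    }
}
```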

With options #1 and #3, you can also encapsulate the blocks into a single custom block, which looks like the block from #4 from the outside, so it's more reusable.
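
Here is one way such an encapsulation of option #1 might look. This is only a sketch: the class name `Foo2Producer`, the `Foo1`/`Foo2`/`Foo3` classes and the constructor delegates are all invented for illustration, and the completion and fault handling is deliberately simple.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical message types, invented for this sketch.
public sealed class Foo1 { public int Value { get; set; } }
public sealed class Foo3 { public int Value { get; set; } }
public sealed class Foo2 { public string Origin { get; set; } }

// Hypothetical wrapper: two targets in, one Foo2 source out.
public sealed class Foo2Producer : ISourceBlock<Foo2>
{
    private readonly TransformBlock<Foo1, Foo2> _fromFoo1;
    private readonly TransformBlock<Foo3, Foo2> _fromFoo3;
    private readonly BufferBlock<Foo2> _output = new BufferBlock<Foo2>();

    public Foo2Producer(Func<Foo1, Foo2> fromFoo1, Func<Foo3, Foo2> fromFoo3)
    {
        _fromFoo1 = new TransformBlock<Foo1, Foo2>(fromFoo1);
        _fromFoo3 = new TransformBlock<Foo3, Foo2>(fromFoo3);
        _fromFoo1.LinkTo(_output);
        _fromFoo3.LinkTo(_output);

        // Finish the output only once BOTH inner blocks are done; PropagateCompletion
        // on the links would complete it as soon as the first one finished.
        Task.WhenAll(_fromFoo1.Completion, _fromFoo3.Completion).ContinueWith(t =>
        {
            if (t.IsFaulted) ((IDataflowBlock)_output).Fault(t.Exception);
            else _output.Complete();
        });
    }

    public ITargetBlock<Foo1> Target1 => _fromFoo1;
    public ITargetBlock<Foo3> Target2 => _fromFoo3;

    public Task Completion => _output.Completion;

    public void Complete()
    {
        _fromFoo1.Complete();
        _fromFoo3.Complete();
    }

    public void Fault(Exception exception)
    {
        ((IDataflowBlock)_fromFoo1).Fault(exception);
        ((IDataflowBlock)_fromFoo3).Fault(exception);
    }

    // Everything on the source side is delegated to the inner BufferBlock.
    public IDisposable LinkTo(ITargetBlock<Foo2> target, DataflowLinkOptions linkOptions) =>
        _output.LinkTo(target, linkOptions);

    Foo2 ISourceBlock<Foo2>.ConsumeMessage(
        DataflowMessageHeader header, ITargetBlock<Foo2> target, out bool messageConsumed) =>
        ((ISourceBlock<Foo2>)_output).ConsumeMessage(header, target, out messageConsumed);

    bool ISourceBlock<Foo2>.ReserveMessage(DataflowMessageHeader header, ITargetBlock<Foo2> target) =>
        ((ISourceBlock<Foo2>)_output).ReserveMessage(header, target);

    void ISourceBlock<Foo2>.ReleaseReservation(DataflowMessageHeader header, ITargetBlock<Foo2> target) =>
        ((ISourceBlock<Foo2>)_output).ReleaseReservation(header, target);
}
```

From the outside you would post Foo1s to `Target1`, Foo3s to `Target2`, and link the producer itself to whatever consumes Foo2s. The relative order of Foo2s coming from the two inner blocks is not guaranteed.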
