C# - Most optimal TPL Dataflow design?
I would like to ask for input on how best to design an optimal architecture using TPL Dataflow. I have not written any code yet, so there is no sample code I can post. I am not looking for code either (unless volunteered), but assistance with the design would be much appreciated.
The requirements are as follows:
I have 3 core datablocks that depend on each other in specific ways. Datablock1 is a producer that produces objects of type Foo1. Datablock2 is supposed to subscribe to Foo1 objects (from Datablock1) and potentially (not for each and every Foo1, but subject to a specific function) produce Foo2 objects, which it stores in an output queue for other datablocks to consume. Datablock3 also consumes Foo1 objects (from Datablock1) and potentially produces Foo3 objects, which Datablock2 consumes and transforms into Foo2 objects.
In summary, here are the datablocks and what each produces and consumes:
- Datablock1: produces (Foo1), consumes (nothing)
- Datablock2: produces (Foo2), consumes (Foo1, Foo3)
- Datablock3: produces (Foo3), consumes (Foo1)
An additional requirement is that the same Foo1 is processed at the same time in Datablock2 and Datablock3. It would be OK if Foo1 objects were first consumed by Datablock2 and, once Datablock2 has done its work, the same Foo1 objects were posted to Datablock3 to do its work. Foo2 objects coming out of Datablock2 can result from operations on either Foo1 objects or Foo3 objects.
I hope this makes sense; I am happy to explain more if it is still unclear.
My first idea was to create a TPL Dataflow block for each of the 3 datablocks and make them handle the incoming streams of different object types. Another idea was to split the datablocks and have each datablock handle streams of only one object type. Which would you recommend, or is there a better solution that may work?
svick has already helped with Datablock1 and it is operational, but I am stuck on how to go about transforming my current environment (as described above) to TPL Dataflow.
Any ideas or pointers are much appreciated.
Let's split this problem into three parts and solve each of them independently.
The first one is how to produce an item conditionally. I think the best option is to use TransformManyBlock and let your function return a collection with one or zero items.
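A minimal sketch of the TransformManyBlock approach follows. The Foo1 and Foo2 records and the "only even values produce output" rule are placeholders invented for illustration; the real condition would be whatever function decides when a Foo2 should be produced.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-ins for the question's types.
public sealed record Foo1(int Value);
public sealed record Foo2(string Text);

public static class ConditionalProducer
{
    // Returns one Foo2 for even values and an empty collection otherwise
    // (the even/odd rule is an assumption, not part of the question).
    public static IEnumerable<Foo2> MaybeProduce(Foo1 input) =>
        input.Value % 2 == 0
            ? new[] { new Foo2($"from {input.Value}") }
            : Array.Empty<Foo2>();

    // A block that emits zero or one Foo2 per incoming Foo1.
    public static TransformManyBlock<Foo1, Foo2> Create() =>
        new TransformManyBlock<Foo1, Foo2>(foo1 => MaybeProduce(foo1));
}
```

Because an empty collection simply produces no output messages, nothing needs to be filtered downstream.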
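Another option is to link the two blocks conditionally (using the LinkTo() overload that takes a predicate), so that nulls are ignored, and to return null when you don't want to produce anything. If you do that, you also have to link the source to NullTarget, so that the nulls don't stay in its output buffer.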
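The conditional-link variant might look roughly like this. Again, Foo1, Foo2 and the even/odd rule are illustrative assumptions; the point is the order of the two links, with the predicate link first and the NullTarget link as the catch-all.

```csharp
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-ins for the question's types.
public sealed record Foo1(int Value);
public sealed record Foo2(string Text);

public static class FilteredLink
{
    public static TransformBlock<Foo1, Foo2> Create(ITargetBlock<Foo2> consumer)
    {
        // Returns null when nothing should be produced (assumed rule: odd values).
        var producer = new TransformBlock<Foo1, Foo2>(
            foo1 => foo1.Value % 2 == 0 ? new Foo2($"from {foo1.Value}") : null);

        // Only non-null results are offered to the real consumer.
        producer.LinkTo(
            consumer,
            new DataflowLinkOptions { PropagateCompletion = true },
            foo2 => foo2 != null);

        // Anything the first link rejected (the nulls) is discarded here,
        // so it does not block the producer's output buffer.
        producer.LinkTo(DataflowBlock.NullTarget<Foo2>());

        return producer;
    }
}
```

Links are tried in the order they were added, which is why the NullTarget link has to come second.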
The second problem is how to send Foo1s to both block #2 and block #3. I can see two ways here:
- Use a BroadcastBlock linked to both target blocks (#2 and #3). Be careful with this, because BroadcastBlock doesn't have an output queue, so if a target block postpones an item, that block won't process it. Because of this, you shouldn't set BoundedCapacity on blocks #2 and #3 in this case. If you leave it unset, they will never postpone and all messages will be processed by both blocks.
- After processing a Foo1 in block #2, manually Post() it (or better, SendAsync() it) to block #3.
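As a rough sketch of the first way, a BroadcastBlock can be linked to both targets like this. The Foo1 record is a placeholder, and the targets passed in are assumed to be unbounded (no BoundedCapacity set), as the caveat above requires.

```csharp
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-in for the question's type.
public sealed record Foo1(int Value);

public static class FanOut
{
    // block2 and block3 are assumed to be unbounded, so they never postpone.
    public static BroadcastBlock<Foo1> Create(
        ITargetBlock<Foo1> block2,
        ITargetBlock<Foo1> block3)
    {
        // Foo1 is immutable in this sketch, so the cloning function
        // can hand the same instance to every target.
        var broadcast = new BroadcastBlock<Foo1>(foo1 => foo1);

        var options = new DataflowLinkOptions { PropagateCompletion = true };
        broadcast.LinkTo(block2, options);
        broadcast.LinkTo(block3, options);

        return broadcast;
    }
}
```

If Foo1 were mutable and both blocks modified it, the cloning function would need to return a real copy instead.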
I'm not sure what exactly “at the same time” means, but in general, TPL Dataflow doesn't make any guarantees about the order of processing of independent blocks. You can alter the priority of different blocks using a custom TaskScheduler, but I'm not sure that would be useful here.
The last and most complicated problem is how to process items of different types in a single block. There are several ways to do this, though I'm not sure which would be best for you:
- Don't process them in a single block. Have one TransformBlock<Foo1, Foo2> and one TransformBlock<Foo3, Foo2>. You can then link them both to a single BufferBlock<Foo2>.
- As you suggested, use BatchedJoinBlock<Foo1, Foo3> with a batchSize of 1. This means that the resulting Tuple<IList<Foo1>, IList<Foo3>> will always contain either one Foo1 or one Foo3.
- Enhance the previous solution by linking the BatchedJoinBlock to a TransformBlock that produces a more suitable type: either Tuple<Foo1, Foo3> (where one of the items will always be null), or something like F# Choice<Foo1, Foo3>, which ensures that only one of the two is set.
- Create a new block type from scratch that does exactly what you want. It should be an ISourceBlock<Foo2> and also have two properties: Target1 of type ITargetBlock<Foo1> and Target2 of type ITargetBlock<Foo3>, like the built-in join blocks.
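The BatchedJoinBlock options could be sketched as below. For brevity this version converts each batch straight to a Foo2 rather than to an intermediate Tuple or Choice type; the Foo1, Foo2 and Foo3 records and the ToFoo2 helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-ins for the question's types.
public sealed record Foo1(int Value);
public sealed record Foo2(string Text);
public sealed record Foo3(int Value);

public static class JoinedInputs
{
    // With a batch size of 1, exactly one of the two lists holds one item.
    public static Foo2 ToFoo2(Tuple<IList<Foo1>, IList<Foo3>> batch) =>
        batch.Item1.Count == 1
            ? new Foo2($"from Foo1 {batch.Item1[0].Value}")
            : new Foo2($"from Foo3 {batch.Item2[0].Value}");

    public static (BatchedJoinBlock<Foo1, Foo3> Join,
                   TransformBlock<Tuple<IList<Foo1>, IList<Foo3>>, Foo2> Output) Create()
    {
        var join = new BatchedJoinBlock<Foo1, Foo3>(1);
        var output = new TransformBlock<Tuple<IList<Foo1>, IList<Foo3>>, Foo2>(
            batch => ToFoo2(batch));

        join.LinkTo(output, new DataflowLinkOptions { PropagateCompletion = true });
        return (join, output);
    }
}
```

Foo1 items go to Join.Target1 and Foo3 items to Join.Target2; both kinds come out of Output as Foo2.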
With options #1 and #3, you could also encapsulate the blocks into a single custom block that looks like the block from #4 from the outside, so that it's more easily reusable.
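One way such an encapsulation of option #1 might look is sketched here. This is a simplification: instead of implementing ISourceBlock<Foo2> directly, the wrapper exposes the source through a hypothetical Output property, and fault propagation is left out. The class name Block2 and the record types are assumptions.

```csharp
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-ins for the question's types.
public sealed record Foo1(int Value);
public sealed record Foo2(string Text);
public sealed record Foo3(int Value);

public sealed class Block2
{
    private readonly TransformBlock<Foo1, Foo2> _fromFoo1;
    private readonly TransformBlock<Foo3, Foo2> _fromFoo3;
    private readonly BufferBlock<Foo2> _output = new BufferBlock<Foo2>();

    public Block2()
    {
        _fromFoo1 = new TransformBlock<Foo1, Foo2>(f => new Foo2($"from Foo1 {f.Value}"));
        _fromFoo3 = new TransformBlock<Foo3, Foo2>(f => new Foo2($"from Foo3 {f.Value}"));

        // No PropagateCompletion here: with two sources, the first one to
        // finish would otherwise complete the shared buffer too early.
        _fromFoo1.LinkTo(_output);
        _fromFoo3.LinkTo(_output);

        // Complete the output only after both inner blocks have finished.
        Task.WhenAll(_fromFoo1.Completion, _fromFoo3.Completion)
            .ContinueWith(_ => _output.Complete());
    }

    public ITargetBlock<Foo1> Target1 => _fromFoo1;
    public ITargetBlock<Foo3> Target2 => _fromFoo3;
    public ISourceBlock<Foo2> Output => _output;

    // Signals that no more Foo1 or Foo3 items will arrive.
    public void Complete()
    {
        _fromFoo1.Complete();
        _fromFoo3.Complete();
    }
}
```

Callers post Foo1 items to Target1 and Foo3 items to Target2, and link Output to whatever consumes Foo2.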