Pipeline objects (aka processors) are instantiated at each micro-batch by the Spark executors. This leads to excessive object creation and frequent garbage collection.
Instead, we could lazily create a pool of processors inside a CoreControllerService that manages the pooled objects needed for record processing.
Maybe with https://commons.apache.org/proper/commons-pool/
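A minimal sketch of the idea, using only the JDK so it runs without extra dependencies. All names here (`RecordProcessor`, `ProcessorPool`, `PoolDemo`) are hypothetical and only illustrate the lazy borrow/return cycle; they are not the actual processor or CoreControllerService API. A real implementation would probably delegate to commons-pool (for example its `GenericObjectPool` with `borrowObject()` / `returnObject()`) rather than this hand-rolled queue.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a pipeline processor; not a real API type. */
interface RecordProcessor {
    String process(String record);
}

/** Minimal lazy pool: an instance is created only when no idle one exists. */
final class ProcessorPool<T> {
    private final ConcurrentLinkedQueue<T> idle = new ConcurrentLinkedQueue<>();
    private final Supplier<T> factory;
    private final AtomicInteger created = new AtomicInteger();

    ProcessorPool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Reuse an idle instance if available, otherwise create a new one. */
    T borrow() {
        T instance = idle.poll();
        if (instance == null) {
            created.incrementAndGet();
            instance = factory.get();
        }
        return instance;
    }

    /** Hand the instance back so a later micro-batch can reuse it. */
    void giveBack(T instance) {
        idle.offer(instance);
    }

    /** Number of instances the factory has produced so far. */
    int createdCount() {
        return created.get();
    }
}

final class PoolDemo {
    /** Simulates sequential micro-batches and returns how many processors were created. */
    static int runBatches(int batches) {
        ProcessorPool<RecordProcessor> pool =
                new ProcessorPool<>(() -> record -> record.toUpperCase());
        for (int i = 0; i < batches; i++) {
            RecordProcessor processor = pool.borrow();
            try {
                processor.process("record-" + i);
            } finally {
                // Always return the processor, even if processing fails.
                pool.giveBack(processor);
            }
        }
        return pool.createdCount();
    }

    public static void main(String[] args) {
        System.out.println("processors created: " + runBatches(3));
    }
}
```

With sequential batches, the same processor instance is reused each time, so the factory is called only once. The sketch leaves out what commons-pool would add: a maximum pool size, validation of returned objects, and eviction of idle instances.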