Technical Reports Published in 2013

BU-CE-1301: PDF

Attributed Relational Graphs for Cell Nucleus Segmentation in Fluorescence Microscopy Images: Supplementary Material

Salim Arslan, Tulin Ersahin, Rengul Cetin-Atalay, and Cigdem Gunduz-Demir

More rapid and accurate high-throughput screening becomes possible with the development of automated microscopy imaging, for which cell nucleus segmentation commonly constitutes the core step. This technical report contains the supplementary material for the model-based algorithm that we developed for segmenting cell nuclei in fluorescence microscopy images.

Keywords: Nucleus segmentation, model-based segmentation, fluorescence microscopy imaging, graph, attributed relational graph.

BU-CE-1302: PDF

Scaling forecasting algorithms using clustered modeling

Izzeddin Gur, Mehmet Guvercin, and Hakan Ferhatosmanoğlu

Research on statistical forecasting has traditionally focused on building more accurate models for a given time-series. The models are mostly applied only to limited data due to their limitation on efficiency and scalability. However, many enterprise applications such as Customer Relationship and Experience Management (CRM and CEM) require scalable forecasting on large number of data series. For example, telecommunication companies need to forecast each of their customers’ traffic load individually to understand their needs and behavior, and to tailor targeted campaigns. Forecasting models are easily applied on aggregate traffic data to estimate the total traffic volume for revenue estimation and resource planning. However, they cannot be applied to each user individually as building accurate models for large number of users would be time consuming. The problem is exacerbated when the forecasting process is continuous and the models need to be updated periodically. We address the problem of building and updating forecasting models continuously for multiple data series and propose dynamic clustered modeling optimized for forecasting. We introduce representative models as an analogy to cluster centers, and apply the models to each individual series through iterative nonlinear optimization. The approach performs modeling and clustering simultaneously, makes forecasts by applying representative models to each data, and updates the model parameters for a continuous forecasting process. Our findings indicate that understanding an individual’s behavior within its segment’s model provides more scalability and accuracy than computing the individual model itself. Experimental results from a real telecom CRM application show the method is highly efficient and scalable, and also more accurate than having separate individual models.

Keywords: Scalable forecasting, time-series models, dynamic maintenance, clustered modeling, streaming data, performance, accuracy

BU-CE-1303: PDF

Cartesian product partitioning of multi-dimensional reachable state spaces

Tuğrul Dayar and M. Can Orhan

Markov chains (MCs) are widely used to model systems which evolve by visiting the states in their state spaces following the available transitions. When such systems are composed of interacting subsystems, they can be mapped to a multi-dimensional MC in which each subsystem normally corresponds to a different dimension. Usually the reachable state space of the multi-dimensional MC is a proper subset of its product state space, that is, Cartesian product of its subsystem state spaces.

Compact storage of the matrix underlying such a MC and efficient implementation of analysis methods using Kronecker operations require the set of reachable states to be represented as a union of Cartesian products of subsets of subsystem state spaces.

The problem of partitioning the reachable state space of a three or higher dimensional system with a minimum number of partitions into Cartesian products of subsets of subsystem state spaces is shown to be NP-complete. Two algorithms, one merge based the other refinement based, that yield possibly non-optimal partitionings are presented. Results of experiments on a set of problems from the literature and those that are randomly generated indicate that, although it may be more time and memory consuming, the refinement based algorithm almost always computes partitionings with a smaller number of partitions than the merge based algorithm. The refinement based algorithm is insensitive to the order in which the states in the reachable state space are processed, and in many cases it computes partitionings that are optimal.

Keywords: Discrete geometry, orthogonal polytope decomposition, multi-dimensional system, reachable state space, Cartesian product partitioning, Markov chain