
DOE-IT Seminar Series on Design of Experiments

October 11, 2019: Mixed-level OMARS designs for three-level quantitative factors and two-level categorical factors (by José Núñez Ares)

November 8, 2019: Bayesian filtering techniques for optimal dynamic experiments (by Arno Strouwen)

November 29, 2019: Enumeration of two-level even-odd designs of strength 3 (by Eric Schoen)

January 17, 2020: A Literature Review on the Intersection between Big Data and Design of Experiments (by Alan Vazquez)

February 28, 2020: A-Optimal versus D-Optimal Design of Screening Experiments (by Peter Goos)

March 20, 2020: Literature review on the construction of fractional factorial split-plot designs (by Alexandre Bohyn)

April 3, 2020: Optimal designs for generalized linear models: a review (by Karel Van Brantegem)

April 24, 2020: A study of design-based model selection methods (by Mohammed Saif Ismail Hameed)

May 15, 2020: The choice set size: to be fixed or not to be fixed (by Martina Vandebroek)

May 29, 2020: Single-case experimental designs and N-of-1 RCTs: design issues and applications (by Patrick Onghena)

June 12, 2020: I-optimal versus D-optimal designs for choice experiments with mixtures (by Mario Becerra)

This seminar series is intended to present the newest developments in design of experiments, to foster cross-fertilization and to stimulate critical discussion. Participation is free. The seminars all start at 11.45 am and end at 1 pm. The location is room 01.18, Kasteelpark Arenberg 30, 3001 Leuven (unless otherwise specified). If you would like to attend, please send an email to jose.nunezares@kuleuven.be.

Travel information

Leuven is easy to reach by train from the Brussels train stations (roughly 20-30 minutes), from Liège (30 minutes) and from Brussels national airport (15 minutes by train). Note that the Brussels South airport is considerably farther away.

Mixed-level OMARS designs for three-level quantitative factors and two-level categorical factors

October 11, 2019 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

The family of orthogonal minimally aliased response surface (OMARS) designs is a new family of three-level experimental designs for studying quantitative factors. Many experiments, however, also involve one or more categorical factors. Using mixed integer programming techniques, we constructed a new database of 1,394 OMARS designs with additional two-level categorical factors. Like the original OMARS designs, the new mixed-level designs are orthogonal main-effect plans and minimize the aliasing between the main effects and the second-order effects (two-factor interactions and quadratic effects). These properties distinguish the newly found designs from definitive screening designs involving two-level categorical factors and from other mixed-level designs recently introduced in the literature. The new OMARS designs with categorical factors also turn out to outperform the available benchmark designs in terms of various design quality measures.
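The two properties mentioned above are easy to check numerically for any candidate design. The sketch below is my own illustration (not code from the talk): it tests whether the centered main-effect columns are mutually orthogonal and reports the largest aliasing between main effects and second-order terms, using the full 3^2 factorial as a toy input that trivially has both properties.

```python
import numpy as np
from itertools import combinations, product

def aliasing_summary(D):
    """For a design D coded in -1/0/+1, check whether the centered
    main-effect columns are mutually orthogonal, and report the largest
    absolute cross-product between main effects and second-order terms
    (two-factor interactions and quadratic effects)."""
    k = D.shape[1]
    mains = D - D.mean(axis=0)
    second = np.column_stack(
        [D[:, i] * D[:, j] for i, j in combinations(range(k), 2)]
        + [D[:, i] ** 2 for i in range(k)]
    )
    second = second - second.mean(axis=0)
    G = mains.T @ mains
    me_orthogonal = np.allclose(G, np.diag(np.diag(G)))
    max_aliasing = np.abs(mains.T @ second).max()
    return me_orthogonal, max_aliasing

# Toy input: the full 3^2 factorial (9 runs, 2 three-level factors)
D = np.array(list(product([-1, 0, 1], repeat=2)), dtype=float)
ok, alias = aliasing_summary(D)
```

For a real OMARS design, `ok` would be True and `alias` small but not necessarily zero; minimizing it is exactly the trade-off these designs are built around.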

José Núñez Ares

José Núñez Ares is currently a VLAIO Innovation Mandate holder at the MeBioS division at KU Leuven. He worked as a postdoctoral researcher at UW-Madison (Wisconsin) with Prof. Jeff Linderoth after obtaining his doctoral degree at KU Leuven. In his research, he uses operations research techniques to discover novel cost-efficient experimental designs.

Bayesian filtering techniques for optimal dynamic experiments

November 8, 2019 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

Dynamical models are a common way to analyze, control and optimize physical, chemical and biological processes. Before such a model can be used in practice, its parameters have to be estimated, often from noisy data. The experiments to estimate these parameters can be both costly and laborious. In this seminar, we focus on designing informative experiments; for dynamic models, this specifically involves constructing time-varying input profiles. Two common tools to quantify the information content of an experiment are the Fisher information matrix and the Kullback–Leibler divergence. Both involve the likelihood function of the model parameters.

Calculating the likelihood of the parameters becomes challenging when dynamic models have both measurement and process noise. This complexity is because the likelihood of the parameters depends on the true state of the dynamical system, which varies randomly due to the process noise. But this state is not exactly known due to the measurement noise. Calculating the likelihood of the parameters thus also requires recursively calculating the likelihood of the state at every measurement time. The problem of recursive state estimation is known as Bayesian filtering. In the special case of a linear dynamical system, the Bayesian filter reduces to the Kalman filter. In this seminar we combine Bayesian optimal design methodology with Bayesian filtering techniques to construct informative dynamic experiments.
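To make the linear special case concrete, here is a minimal Kalman filter sketch (my own, with made-up system matrices, not code from the talk). It propagates the state estimate through a system with both process noise Q and measurement noise R, and recursively accumulates the log-likelihood of the observations, the quantity from which the design criteria above are built.

```python
import numpy as np

def kalman_filter(ys, A, C, Q, R, x0, P0):
    """Kalman filter for x_{t+1} = A x_t + w_t, w_t ~ N(0, Q) (process noise),
    y_t = C x_t + v_t, v_t ~ N(0, R) (measurement noise).
    Returns the final state estimate, its covariance, and the recursively
    accumulated log-likelihood of the observations."""
    x, P = x0, P0
    loglik = 0.0
    for y in ys:
        # Predict the state forward through the dynamics
        x = A @ x
        P = A @ P @ A.T + Q
        # Innovation (prediction error) and its covariance
        e = y - C @ x
        S = C @ P @ C.T + R
        _, logdet = np.linalg.slogdet(S)
        loglik -= 0.5 * (e @ np.linalg.solve(S, e)
                         + logdet + len(e) * np.log(2 * np.pi))
        # Measurement update
        K = P @ C.T @ np.linalg.inv(S)
        x = x + K @ e
        P = (np.eye(len(x)) - K @ C) @ P
    return x, P, loglik

# Made-up two-state system, observed through its first component only
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
Q, R = 0.01 * np.eye(2), np.array([[0.1]])
rng = np.random.default_rng(0)
x, ys = np.zeros(2), []
for _ in range(50):
    x = A @ x + rng.multivariate_normal(np.zeros(2), Q)
    ys.append(C @ x + rng.multivariate_normal(np.zeros(1), R))
x_hat, P_hat, ll = kalman_filter(ys, A, C, Q, R, np.zeros(2), np.eye(2))
```

In an optimal-design loop, this log-likelihood (or the Fisher information derived from it) would be evaluated for candidate input profiles.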

Arno Strouwen

Ir. Arno Strouwen is a Ph.D. student in the MeBioS division of the Department of Biosystems at KU Leuven. His research deals with optimal input profiles for dynamic experiments.

Enumeration of two-level even-odd designs of strength 3

November 29, 2019 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

Suppose that you have to design an experiment to investigate the effect of 17 potential enzyme stabilizers on the stability of some enzyme at room temperature. Biochemists assure you that many of the stabilizers should have an effect on the stability. They want the experiment to reveal the active main effects of the stabilizers, and they want to search for synergistic or antagonistic effects. They fix the experimental budget at 64 runs. What design do you recommend? There is no doubt about the answer: it should be a two-level even-odd design of strength 3, of course! A two-level design is nice because it does not waste effort on detecting quadratic effects. A strength-3 design has its main effects orthogonal to each other and also orthogonal to the two-factor interaction effects. This ensures optimal chances to detect active main effects. What, then, is so nice about even-odd designs? Well, ‘even’ designs are fold-over designs. This implies that there are at most N/2 – 1 degrees of freedom available to search for interactions. Even-odd designs don’t have this fold-over structure and so, in principle, have more degrees of freedom to search for interactions. For the enzyme stability example, even designs offer 31 degrees of freedom for interactions, while the best even-odd designs we obtained offer 46 degrees of freedom. In the talk, I present a systematic enumeration method for even-odd designs, show some results of the enumeration, and point out the gaps that must still be filled before the research can be turned into a nice paper.
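The N/2 – 1 bound for fold-over designs can be checked numerically. The sketch below (my own check, not code from the talk) builds a 64-run fold-over design by mirroring a full 2^5 factorial with an added branching factor, and counts the degrees of freedom left for two-factor interactions after the intercept and main effects are fitted. Every interaction column of a fold-over design is a mirror-even contrast, and the mean-zero mirror-even space has dimension N/2 – 1, which is why the rank cannot exceed 31 here.

```python
import numpy as np
from itertools import combinations, product

# 64-run fold-over design: full 2^5 factorial D, mirrored as [D, -1; -D, +1]
D = np.array(list(product([-1, 1], repeat=5)), dtype=float)
F = np.vstack([np.column_stack([D, -np.ones(len(D))]),
               np.column_stack([-D, np.ones(len(D))])])
N, k = F.shape  # 64 runs, 6 factors

# All two-factor interaction columns
I2 = np.column_stack([F[:, i] * F[:, j] for i, j in combinations(range(k), 2)])

# Project out intercept + main effects; the rank of what remains is the
# number of degrees of freedom available to estimate interactions.
X0 = np.column_stack([np.ones(N), F])
resid = I2 - X0 @ np.linalg.lstsq(X0, I2, rcond=None)[0]
df = np.linalg.matrix_rank(resid)  # here 15, comfortably below N/2 - 1 = 31
```

An even-odd design is not constrained to the mirror-even space, which is how the 46-degree-of-freedom designs from the talk become possible.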

Eric Schoen

Eric Schoen is a guest professor at the Faculty of Bioscience Engineering of KU Leuven and a senior statistical consultant at the contract research organization TNO in the Netherlands. He teaches the experimental design course for master's students in Leuven. His main research area is the construction of orthogonal experimental designs.

A Literature Review on the Intersection between Big Data and Design of Experiments

January 17, 2020 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

In recent years, big data has emerged as a prominent feature of many problems in science and industry. Standard statistical methods are no longer computationally feasible for analyzing big data sets. To overcome this issue, researchers from the Design of Experiments (DoE) community have advocated data reduction methods for big data inspired by ideas from the DoE field. In this presentation, I will review relevant DoE-inspired methods for data reduction. More specifically, I will review the Information-Based Subdata Selection (IBOSS) approach of Wang et al. (2018) and the Orthogonal Array-Based Subdata Selection (OABSS) approach of Wang et al. (2019). The IBOSS approach, which is based on the optimal design of experiments framework, selects a sample from big data by maximizing the so-called D-optimality criterion. The output of the approach is a sample that provides efficient estimates of the parameters in a linear regression model. The OABSS approach is an alternative to the IBOSS approach, inspired by the literature on orthogonal arrays. I will discuss the advantages and disadvantages of both approaches, and take a critical look at their practical relevance.
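To make the IBOSS idea concrete, here is a simplified sketch (my own paraphrase of the procedure, not the authors' code): for each covariate in turn, it keeps the r rows with the smallest and the r rows with the largest values among the rows not yet selected, with r = n/(2p). Selecting extreme covariate values is what drives up the determinant of the information matrix.

```python
import numpy as np

def iboss_select(X, n):
    """Simplified IBOSS subdata selection: from the N x p covariate
    matrix X, keep about n rows by taking, covariate by covariate,
    the r smallest and r largest remaining values, r = n / (2p)."""
    N, p = X.shape
    r = n // (2 * p)
    selected = np.zeros(N, dtype=bool)
    for j in range(p):
        avail = np.flatnonzero(~selected)
        order = avail[np.argsort(X[avail, j])]
        selected[order[:r]] = True    # r smallest values of covariate j
        selected[order[-r:]] = True   # r largest values of covariate j
    return np.flatnonzero(selected)

# Reduce 100,000 simulated rows to a subdata set of 1,000
rng = np.random.default_rng(1)
X = rng.normal(size=(100_000, 5))
idx = iboss_select(X, n=1_000)
```

One pass over a sort per covariate makes this far cheaper than fitting the regression to the full data set, which is the point of the method.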

Alan Vazquez

Alan Vazquez is a postdoctoral fellow of the Fund for Scientific Research Flanders (FWO) within the Faculty of Bioscience Engineering at KU Leuven. He has authored several publications in high-impact journals in statistics and operations research. He obtained his Ph.D. from the University of Antwerp and was a visiting researcher at UCLA.

A-Optimal versus D-Optimal Design of Screening Experiments

February 28, 2020 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

The purpose of this talk is to convince the audience that, for screening experiments, A-optimal designs are equally good or better than D-optimal designs. In most cases, the A-optimal design and D-optimal design are the same, and therefore equally good. When the two designs are different, the A-optimal design often has more desirable statistical features than the D-optimal design. A-optimal designs generally have more uncorrelated columns in their model matrix, and their nonzero correlations are generally smaller in magnitude than those of the D-optimal design. Also, even though A-optimal designs minimize the average variance of the parameter estimates, there are many cases where they outperform the D-optimal design in terms of the variances of all individual parameter estimates. A-optimal designs can also substantially reduce the worst prediction variance compared to D-optimal designs. Finally, the rationale for the A-optimality criterion is easier to understand than the rationale for the D-optimality criterion, since the A-optimality criterion is directly related to the variances of the estimates of the parameters in the model.
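For a linear model y = Xβ + ε, the two criteria are simple to state: A-optimality minimizes tr((XᵀX)⁻¹), the sum of the variances of the parameter estimates, while D-optimality minimizes det((XᵀX)⁻¹), the generalized variance. A small numerical comparison (my own toy example, not from the talk), contrasting an orthogonal 4-run design with a perturbed, non-orthogonal one:

```python
import numpy as np

def a_value(X):
    # A-criterion: trace of (X'X)^{-1}, i.e. the sum of the variances of
    # the parameter estimates (up to sigma^2); smaller is better.
    return np.trace(np.linalg.inv(X.T @ X))

def d_value(X):
    # D-criterion: determinant of (X'X)^{-1}, the generalized variance
    # of the parameter estimates; smaller is better.
    return 1.0 / np.linalg.det(X.T @ X)

# Orthogonal 4-run model matrix (intercept, two main effects, interaction)
X_orth = np.array([[1, -1, -1,  1],
                   [1, -1,  1, -1],
                   [1,  1, -1, -1],
                   [1,  1,  1,  1]], dtype=float)

# Move the last run off its factorial corner, breaking orthogonality
X_bad = X_orth.copy()
X_bad[3] = [1, 1, 0.5, 0.5]
# The orthogonal design wins on both criteria simultaneously.
```

This illustrates the first point of the talk: in clean screening situations the A- and D-optimal designs coincide, and both criteria penalize the same departure from orthogonality.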

Peter Goos

Peter Goos is a full professor at the Faculty of Bioscience Engineering of KU Leuven and at the Faculty of Applied Economics of the University of Antwerp, where he teaches various introductory and advanced courses on statistics and probability. His main research area is the statistical design and analysis of experiments.

Literature review on the construction of fractional factorial split-plot designs

March 20, 2020 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

Fractional factorial (FF) designs are extremely popular in industry, and are often used with complete randomization of the experimental runs. However, the nature of the process sometimes imposes restrictions on the randomization. In these cases, fractional factorial split-plot (FFSP) designs may be a good design option. These designs are composed of two separate FF designs: a whole-plot subdesign and a sub-plot subdesign. While many authors have proposed techniques to construct FFSP designs (Addelman, 1964; Huang, Chen, and Voelkel, 1998; Schoen, 1999), few give an actual methodology to find optimal ones. In contrast, Bingham and Sitter (1999) presented algorithms to create minimum aberration (MA) FFSP designs and even presented a catalog of these designs for 8, 16 and 32 runs. In this seminar, we will dive into their work and dissect their algorithms to understand the creation of MA FFSP designs. Afterwards, we will look into the work of Bingham, Schoen and Sitter (2004) and the special case of MA FFSP designs in which the whole-plot subdesign has to be replicated in order to provide a sufficient number of subplots. Indeed, for a number of applications, it is interesting to increase the number of whole plots without also increasing their size in terms of the number of subplots.

Alexandre Bohyn

Alexandre Bohyn holds a master's degree in Bioscience Engineering from UCLouvain and a Master of Statistics degree from KU Leuven. He is a Ph.D. student in the Biostatistics group of the MeBioS division at KU Leuven.

Optimal Designs for Generalized Linear Models: A Review

April 3, 2020 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

Generalized linear models (GLMs) form a collection of statistical regression models that handle data sampled from distributions of the exponential family. Examples of distributions belonging to the exponential family are the normal, Poisson and binomial distributions. As the name suggests, GLMs are an extension of the classical linear model. Unlike for the classical linear model, however, optimal experimental designs for GLMs depend on the model parameters. This results from the fact that the Fisher information matrix incorporates the so-called link function, which relates the mean of the response to the corresponding linear predictor. This seminar will be mainly based on the book ‘Design of Experiments for Generalized Linear Models’ by Kenneth Russell, and it will provide an introduction on how optimal experimental designs for GLMs can be constructed. Special attention will be paid to applications.
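The parameter dependence is easiest to see in the logistic case, where the Fisher information is M(β) = XᵀW(β)X with W = diag(μᵢ(1 − μᵢ)) and μ = 1/(1 + e^(−Xβ)). A small sketch (my own, with illustrative design points, not from the book):

```python
import numpy as np

def logistic_info(X, beta):
    """Fisher information M(beta) = X' W X for logistic regression,
    with W = diag(mu * (1 - mu)) and mu = 1 / (1 + exp(-X @ beta)).
    The weights enter through the logit link, so M depends on beta."""
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = mu * (1.0 - mu)
    return X.T @ (w[:, None] * X)

# Same 4-point design evaluated at two candidate parameter values
X = np.column_stack([np.ones(4), np.array([-1.0, -0.5, 0.5, 1.0])])
M_a = logistic_info(X, np.array([0.0, 0.0]))  # all weights equal to 0.25
M_b = logistic_info(X, np.array([0.0, 3.0]))  # weights vary across points
# det(M) differs between the two, so the D-optimal design differs too.
```

In the linear model, W would be constant and M would not involve β at all, which is exactly why classical optimal designs are parameter-free while GLM designs are not.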

Karel Van Brantegem

Karel Van Brantegem holds a master's degree in Bioscience Engineering from KU Leuven. He is a Ph.D. student in the Biostatistics group of the MeBioS division at KU Leuven.

A study of design-based model selection methods

April 24, 2020 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

The most commonly used variable selection methods for supersaturated designs, such as stepwise selection (AIC/BIC), forward selection, the LASSO and the Dantzig selector, are very general and can be applied to a wide variety of situations. However, for certain experimental designs, it is possible to exploit the design structure and conduct a specific data analysis that leverages the design properties to detect active effects with high probability. In this talk, I will review two such popular model selection procedures.

The first method was initially proposed and refined by Miller and Sitter in 2001 and 2005, respectively. These authors illustrated their method using Margolin designs which are fold-over designs in which the main effects are orthogonal to the two-factor interactions. The second method was proposed by Jones and Nachtsheim in 2017 for the analysis of data from definitive screening designs. This method involves a new way of estimating the error variance. Jones and Nachtsheim performed a simulation study to show that their new analysis method outperforms multiple benchmark methods. During this talk, I will discuss the advantages and limitations of these methods, and directions for future work in this area.

Mohammed Saif Ismail Hameed

Mohammed Saif Ismail Hameed holds a bachelor's degree and a master's degree in Engineering from the Ramaiah Institute of Technology and the University of Michigan, respectively. He is a Ph.D. student in the Biostatistics group of the MeBioS division at KU Leuven.

The choice set size: to be fixed or not to be fixed

May 15, 2020 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

Two-alternative choice tasks have long been preferred in choice experiments because they are simple to analyze. However, the increase in computer power has made it possible to design and analyze discrete choice experiments with more alternatives. Researchers have to select an appropriate number of alternatives that yields enough information about the respondents’ preferences without overloading the choice set. It is common to keep the choice set size fixed, in the hope that this keeps the error variance in each choice task more or less constant. However, the complexity of a choice set also influences the error variance, so the relationship between the information in the experiment, the error variance and the size of the choice sets is much more complicated: small choice sets can be quite complex, while large choice sets can be relatively simple.

In this presentation, we investigate whether a design can be improved by allowing for different choice set sizes. We look for efficient designs with varying choice set sizes for the conditional logit (CL) model as well as for the heteroscedastic conditional logit (HCL) model introduced by Ben-Akiva and Lerman (1985) which parameterizes the error variance as a function of the complexity of the choice set.
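For the CL model, design efficiency is computed from the Fisher information, which sums a contribution Xₛᵀ(diag(pₛ) − pₛpₛᵀ)Xₛ over the choice sets, and nothing in that formula requires the sets to have equal size. A minimal sketch (my own notation and toy attribute values, not the authors' code), with one two-alternative and one three-alternative choice set:

```python
import numpy as np

def choice_probs(Xs, beta):
    """Conditional logit choice probabilities for one choice set with
    alternative-by-attribute matrix Xs."""
    v = Xs @ beta
    e = np.exp(v - v.max())  # subtract the max for numerical stability
    return e / e.sum()

def cl_information(choice_sets, beta):
    """Fisher information of a CL design: the sum over choice sets of
    Xs' (diag(p) - p p') Xs. The sets may differ in size."""
    k = len(beta)
    M = np.zeros((k, k))
    for Xs in choice_sets:
        p = choice_probs(Xs, beta)
        M += Xs.T @ (np.diag(p) - np.outer(p, p)) @ Xs
    return M

# One 2-alternative and one 3-alternative choice set, two attributes
sets = [np.array([[1.0, 0.0],
                  [0.0, 1.0]]),
        np.array([[1.0, 1.0],
                  [0.0, 1.0],
                  [1.0, 0.0]])]
beta = np.array([0.5, -0.5])
M = cl_information(sets, beta)
p0 = choice_probs(sets[0], beta)
```

A design optimization can therefore search over ragged collections of choice sets directly, comparing them via a scalar function of M such as its determinant.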

While investigating efficient designs for the CL model, we showed once more that the overall efficiency increases with increasing fixed choice set size. To our surprise, however, we also found designs with varying choice set sizes that outperform designs with a fixed choice set size, given the same number of choice sets and the same total number of alternatives.

When the complexity is reflected in the error variance and modelled by the HCL model, we show that designs with a constant choice set size are far from optimal. We found many choice designs with varying choice set sizes that are much more efficient for estimating the preference parameters than designs with the same number of alternatives in each choice set. These designs have a smaller average complexity and a larger efficiency than efficient designs with a fixed choice set size.

Martina Vandebroek

Martina Vandebroek is a full professor at the Faculty of Economics and Business at KU Leuven, within the Research Centre for Operations Research and Statistics (ORSTAT). She has published in leading journals in statistics, operations research and marketing.

Single-case experimental designs and N-of-1 RCTs: Design issues and applications

May 29, 2020 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

A single-case experimental design is an experimental design in which one unit is observed repeatedly during a certain period of time under different levels of at least one manipulated variable. In the behavioral sciences, the “unit” is most commonly a person, but any well-defined observational entity can be defined as “the case”. N-of-1 randomized controlled trials (N-of-1 RCTs), as used in medical research, can be considered a specific subset: designs with repeated cross-overs for a single patient. In this presentation, I will give an overview of the designs that are most frequently used in this type of research, present some applications, and discuss the design issues and the potential of this type of design.

Patrick Onghena

I-optimal versus D-optimal designs for choice experiments with mixtures

June 12, 2020 @ Room 01.18, Kasteelpark 30, Heverlee, Belgium

Many products and services can be described as mixtures of ingredients. One example is concrete, made of cement, water, and sand. Choice modeling deals with the choices of a group of decision makers from a finite set of options, such as customers deciding whether to buy product A or B. Choice modeling can be combined with mixtures to determine what combinations of ingredient proportions people prefer. For example, a choice model can be used to determine what combination of sugar, flour, egg and chocolate proportions is preferable to a group of subjects in a study.

Choice models are non-linear; hence, an optimal design for a choice experiment depends on the unknown parameters. There are several ways to circumvent this problem, one of them being a Bayesian approach. This approach has the additional benefit that it gives researchers the opportunity to include prior knowledge in the form of a prior distribution. This is advantageous because researchers almost always possess information about the phenomena they are investigating, as scientific research is sequential.

The current state-of-the-art for Bayesian choice experiments with mixtures is restricted to D-optimality. However, since we typically want models that give good predictions, an I-optimal design is often more appropriate than a D-optimal design because it is focused on the prediction variance of the model. In this seminar, I will compare results concerning I-optimal designs for choice experiments with mixtures with results concerning D-optimal designs.
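The contrast between the two criteria is easiest to see in a generic linear setting (my own sketch, not the mixture-choice models of the talk, whose information matrices are more involved): D-optimality works with the determinant of (XᵀX)⁻¹, while I-optimality averages the prediction variance f(x)ᵀ(XᵀX)⁻¹f(x) over the region of interest.

```python
import numpy as np
from itertools import product

def model_matrix(pts):
    # Two-factor model with intercept, main effects and interaction
    x1, x2 = pts[:, 0], pts[:, 1]
    return np.column_stack([np.ones(len(pts)), x1, x2, x1 * x2])

def i_value(X, grid):
    """I-criterion: prediction variance f(x)' (X'X)^{-1} f(x), averaged
    over a grid approximating the region of interest; smaller is better."""
    M_inv = np.linalg.inv(X.T @ X)
    F = model_matrix(grid)
    return np.mean(np.sum((F @ M_inv) * F, axis=1))

def d_value(X):
    # D-criterion: determinant of (X'X)^{-1}; smaller is better.
    return 1.0 / np.linalg.det(X.T @ X)

# Evaluate a 4-run factorial design on a grid over [-1, 1]^2
grid = np.array(list(product(np.linspace(-1, 1, 21), repeat=2)))
design = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
X = model_matrix(design)
iv, dv = i_value(X, grid), d_value(X)
```

Because the I-criterion weights every point of the prediction region, two designs with similar D-values can differ noticeably in I-value, which is the comparison the seminar carries over to Bayesian choice designs with mixtures.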

Mario Becerra

Mario Becerra holds a master's degree in Computer Science and a bachelor's degree in Applied Mathematics from the Instituto Tecnológico Autónomo de México (ITAM). He is a Ph.D. student in the Biostatistics group of the MeBioS division at KU Leuven.