Robustness of Generalized Row-Column Designs Against Missing Observation(s)

Article Summary:

This article deals with the robustness of generalized row-column designs against missing observations. These designs are an important class of experimental designs with wide application in agricultural and allied research...

Authors: Anindita Datta, Seema Jaggi, Eldho Varghese^#, Cini Varghese, Arpan Bhowmik, and Mohd. Harun
ICAR-Indian Agricultural Statistics Research Institute, New Delhi-110012
^#ICAR-Central Marine Fisheries Research Institute, Kochi-682 018

1. Introduction

When the heterogeneity present in the experimental material arises from two sources, two-dimensional (double) blocking of the experimental units is recommended to control or reduce experimental error. The two blocking systems are generally referred to as row blocking and column blocking, and the resulting designs are termed Row-Column (RC) designs. These designs are used to control variability in field and animal experiments. Most row-column designs developed in the literature have one unit at the intersection of each row and column. However, when the number of treatments is large and experimental resources are limited, Generalized Row-Column (GRC) designs are used, in which there is more than one unit in each row-column intersection. A GRC design is an arrangement of v treatments in p rows and q columns such that the intersection of each row and column (a cell) consists of more than one unit. Consider an experiment conducted to compare the colour intensities of apple sauce (Edmondson, 1998). The treatments consisted of all combinations of 12 blends of apple sauce with 4 concentrations of cinnamon. Treatments could be stored for 4 different lengths of time. A GRC design, shown below, was used in which rows represent cinnamon concentrations, columns represent storage times and symbols represent blends.

Rows (Cinnamon Concentrations) / Columns (Storage Time)	I	II	III	IV
I	1 5 9	2 6 10	3 7 11	4 8 12
II	2 7 10	1 8 9	4 5 12	3 6 11
III	3 8 12	4 7 11	1 6 10	2 5 9
IV	4 6 11	3 5 12	2 8 9	1 7 10
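
The layout above can be checked mechanically. The following is a minimal Python sketch, not part of the original study; the names `design` and `summarize` are illustrative assumptions. It encodes the cells of the table and reports the design parameters, together with whether every blend occurs exactly once in each row and in each column.

```python
# Cells of the apple sauce layout: design[i][j] holds the blends assigned to
# row i (cinnamon concentration) and column j (storage time).
design = [
    [(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)],
    [(2, 7, 10), (1, 8, 9), (4, 5, 12), (3, 6, 11)],
    [(3, 8, 12), (4, 7, 11), (1, 6, 10), (2, 5, 9)],
    [(4, 6, 11), (3, 5, 12), (2, 8, 9), (1, 7, 10)],
]


def summarize(design):
    """Return basic parameters of a GRC layout given as rows of cells."""
    p, q = len(design), len(design[0])
    treatments = sorted({t for row in design for cell in row for t in cell})
    cell_sizes = sorted({len(cell) for row in design for cell in row})
    # A row (column) is "complete" if it holds every treatment exactly once.
    rows_complete = all(
        sorted(t for cell in row for t in cell) == treatments for row in design
    )
    cols_complete = all(
        sorted(t for row in design for t in row[j]) == treatments
        for j in range(q)
    )
    return {
        "v": len(treatments), "p": p, "q": q, "cell_sizes": cell_sizes,
        "n": sum(len(cell) for row in design for cell in row),
        "rows_complete": rows_complete, "cols_complete": cols_complete,
    }


summary = summarize(design)
print(summary)
```

For this layout the sketch reports v = 12 blends, p = q = 4, a single cell size k = 3 and n = 48 units, with every row and every column containing each blend once.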

These designs have been studied in the literature under different names, such as Semi-Latin squares, in which there are n rows and n columns and the intersection of each row and column is a cell of k units [Bailey and Monod (2001)], Trojan squares [Bailey (1988, 1992), Edmondson (1998, 2002)], Generalized incomplete Trojan-type designs [Jaggi et al. (2010, 2016)] and Row-column designs with multiple units per cell [Datta et al. (2014, 2015, 2016)].

In usual practice, these trials are conducted under controlled conditions and it is assumed that no disturbances occur while conducting the experiment or recording the observations. Missing observations and outliers in the data are among the disturbances that may occur during experimentation. Such disturbances may lead to wrong interpretation of results or less precise comparisons among the treatments in the experiment. To overcome such situations, designs that are insensitive, or robust, to missing observations and outliers are required.

2. Methodology

A GRC design is considered here with v treatments arranged in p rows and q columns, with k units (plots) in each row-column intersection (cell), giving n = pqk experimental units or observations in total. The following three-way classified model with treatments, rows and columns is considered:

Y_l(ij) = m + t_l(ij) + α_i + b_j + e_l(ij);   i = 1, 2, ..., p;  j = 1, 2, ..., q;  l = 1, 2, ..., k     ... (2.1)

where Y_l(ij) is the response from the l^th unit in the intersection of the i^th row and j^th column, m is the general mean, t_l(ij) is the effect of the treatment appearing in the l^th unit in the intersection of the i^th row and j^th column, α_i is the i^th row effect and b_j is the j^th column effect. The error terms e_l(ij) are independently and identically distributed, following a normal distribution with mean zero and constant variance.

A GRC design is robust against loss of observations if the loss of efficiency of the residual design relative to the original design is small. If C_d is the information matrix for estimating the treatment effects of a GRC design d and C_d* is that of the residual design d* obtained after the observations are lost, then the efficiency E of the residual design relative to the original design is given by
E = (Harmonic mean of non-zero eigenvalues of C_d*) / (Harmonic mean of non-zero eigenvalues of C_d)

A GRC design is said to be robust if the efficiency of the resulting design after loss of information is more than 95%.

SAS code has been written in PROC IML to calculate the information matrix (C-matrix) of the treatment effects, its eigenvalues and the harmonic mean of the non-zero eigenvalues of the C-matrix, for both the original and the residual GRC design.
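
The SAS code itself is not reproduced in this article. Purely as an illustration, the same quantities can be sketched in plain Python under model (2.1). Everything below is an assumption rather than the authors' implementation: the function names (`c_matrix`, `eigenvalues`, `harmonic_mean_nonzero`) are hypothetical, the C-matrix is obtained by removing the mean, row and column components from the treatment indicators with Gram-Schmidt orthogonalization, and the eigenvalues come from cyclic Jacobi rotations. The example uses the apple sauce layout from the Introduction with the last unit of the last cell deleted.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sweep_out(x, basis):
    """Remove from x its components along an orthonormal basis."""
    for b in basis:
        d = dot(x, b)
        x = [xi - d * bi for xi, bi in zip(x, b)]
    return x


def c_matrix(units, v, p, q):
    """Treatment information matrix, adjusted for mean, rows and columns.

    units: list of (row, column, treatment) triples, all 0-based.
    """
    n = len(units)
    nuisance = [[1.0] * n]
    nuisance += [[float(r == i) for r, _, _ in units] for i in range(p)]
    nuisance += [[float(c == j) for _, c, _ in units] for j in range(q)]
    basis = []  # orthonormal basis of the nuisance space (Gram-Schmidt)
    for z in nuisance:
        z = sweep_out(z, basis)
        norm = math.sqrt(dot(z, z))
        if norm > 1e-9:  # skip columns already in the span
            basis.append([zi / norm for zi in z])
    resid = [sweep_out([float(t == s) for _, _, t in units], basis)
             for s in range(v)]
    return [[dot(a, b) for b in resid] for a in resid]


def eigenvalues(a, sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(a)
    a = [row[:] for row in a]
    for _ in range(sweeps):
        if sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j) < 1e-20:
            break
        for i in range(n - 1):
            for j in range(i + 1, n):
                if abs(a[i][j]) < 1e-14:
                    continue
                num, d = 2.0 * a[i][j], a[j][j] - a[i][i]
                if d < 0:
                    num, d = -num, -d
                theta = 0.5 * math.atan2(num, d)  # angle that zeroes a[i][j]
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki - s * akj, s * aki + c * akj
                for k in range(n):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik - s * ajk, s * aik + c * ajk
    return [a[i][i] for i in range(n)]


def harmonic_mean_nonzero(eigs, tol=1e-8):
    nonzero = [e for e in eigs if e > tol]
    return len(nonzero) / sum(1.0 / e for e in nonzero)


cells = [
    [(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)],
    [(2, 7, 10), (1, 8, 9), (4, 5, 12), (3, 6, 11)],
    [(3, 8, 12), (4, 7, 11), (1, 6, 10), (2, 5, 9)],
    [(4, 6, 11), (3, 5, 12), (2, 8, 9), (1, 7, 10)],
]
full = [(i, j, t - 1) for i, row in enumerate(cells)
        for j, cell in enumerate(row) for t in cell]
residual = full[:-1]  # last unit of the last cell is missing

hm_full = harmonic_mean_nonzero(eigenvalues(c_matrix(full, 12, 4, 4)))
hm_residual = harmonic_mean_nonzero(eigenvalues(c_matrix(residual, 12, 4, 4)))
E = hm_residual / hm_full
print(round(hm_full, 4), round(hm_residual, 4), round(E, 4))
```

For the complete layout every treatment occurs once in each row and column, so the non-zero eigenvalues all equal the replication (4) and the harmonic mean is 4. With one unit missing, the efficiency works out to about 0.97.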

3. Results

Here, the robustness of different classes of GRC designs (Bailey, 1992) against the loss of one or more observations within a cell has been investigated using the efficiency criterion defined in the methodology. We consider a design to be highly robust against missing observation(s) if the loss in efficiency of the residual design is not more than 5%, and robust if the loss in efficiency is between 5% and 10%.
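
The classification rule can be written as a small Python sketch. The function names are illustrative assumptions, and the label "not robust" for losses above 10% is also an assumption, since the text does not name that case. The harmonic means used are those reported in Table 3.1 for serial numbers 1, 3 and 6.

```python
def efficiency(hm_residual, hm_original):
    """E = HM(C_d*) / HM(C_d)."""
    return hm_residual / hm_original


def classify(e):
    """Label a design by its loss in efficiency, 1 - E."""
    loss = 1.0 - e
    if loss <= 0.05:
        return "highly robust"
    if loss <= 0.10:
        return "robust"
    return "not robust"  # assumption: the text does not label this case


# (HM of original design, HM of residual design) from Table 3.1
cases = {1: (8.50, 8.29), 3: (8.50, 7.93), 6: (12.00, 11.86)}
labels = {}
for serial, (hm_d, hm_d_star) in cases.items():
    e = efficiency(hm_d_star, hm_d)
    labels[serial] = classify(e)
    print(serial, round(e, 2), labels[serial])
```

Here serial numbers 1 and 6 fall in the highly robust class and serial number 3 in the robust class.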

The robustness of this class of designs has been investigated against the loss of some or all observations in the last column. Without loss of generality, the missing observations are assumed to come from units of the last column, since the columns can always be interchanged. Table 3.1 gives the parameters of the designs considered (with v ≤ 10), i.e., number of treatments (v), number of rows (p), number of columns (q), replication (r) and cell size (k), together with the number of missing observation(s), the unit/cell of the last column from which they are missing, and the efficiency (E) of the residual design relative to the original design.

Datta et al. (2015) developed this series of GRC designs with unequal cell sizes. The design is constructed from a BIB design with parameters v*, b* (even), r*, k*, λ*. The resulting design has parameters v = v*, p = 2 rows of size (v*b*)/2, q = b* columns of size v*, r = b*, k₁ = k* and k₂ = v* - k*.
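
The parameter relations can be illustrated with a short Python sketch. The function name `grc_parameters` and the dictionary keys are hypothetical, and the two BIB identities are included only as a sanity check on the input. The call uses the BIB parameters of Example 3.1.

```python
def grc_parameters(v_star, b_star, r_star, k_star, lam_star):
    """Parameters of the GRC design built from a BIB design."""
    # Standard BIB identities, used here only to validate the input.
    if b_star * k_star != v_star * r_star:
        raise ValueError("b*k* must equal v*r*")
    if lam_star * (v_star - 1) != r_star * (k_star - 1):
        raise ValueError("lambda*(v* - 1) must equal r*(k* - 1)")
    if b_star % 2 != 0:
        raise ValueError("b* must be even")
    row_size = v_star * b_star // 2
    return {
        "v": v_star,
        "p": 2,
        "row_size": row_size,
        "q": b_star,
        "column_size": v_star,
        "r": b_star,
        "k1": k_star,
        "k2": v_star - k_star,
        "n": 2 * row_size,  # total number of units
    }


params = grc_parameters(5, 10, 4, 2, 1)
print(params)
```

For these inputs the sketch gives v = 5, p = 2 rows of size 25, q = 10 columns of size 5, r = 10, k₁ = 2 and k₂ = 3, with n = 50 units, which equals both q times the column size and v times r.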

Example 3.1: Consider a BIB design with parameters v* = 5, b* = 10, r* = 4, k* = 2, λ* = 1. The following is a GRC design with parameters v = 5, p = 2 rows of size 25 each, q = 10 columns of size 5, r = 10, k₁ = 2 and k₂ = 3.

Rows / Columns	I	II	III	IV	V	VI	VII	VIII	IX	X
I	1 2	1 3	1 4	1 5	2 3	3 4 5	2 4 5	2 3 5	2 3 4	1 4 5
II	1 3 5	1 3 4	1 2 5	1 2 4	1 2 3	2 4	2 5	3 4	3 5	4 5

Table 3.1 lists the parameters of the GRC designs obtained from the above series, along with the number of missing observations, the cell from which the observations are missing, and the harmonic mean (HM) of the non-zero eigenvalues of the information matrix for the original design and the residual design under the three-way model. The efficiency (E) of the residual design relative to the original design is also given.

Table 3.1: Parameters and efficiency of the residual design

S. No.	v	p	q	r	k₁, k₂	No. of observations missing	Unit/Cell No.	HM (C_d)	HM (C_d*)	E
1	5	2	10	10	2, 3	1	last unit in last cell	8.50	8.29	0.98
2	5	2	10	10	2, 3	2	last any two units from last cell	8.50	8.17	0.96
3	5	2	10	10	2, 3	3	last cell total	8.50	7.93	0.93
4	5	2	10	10	2, 3	2	last unit of each cell of last column	8.50	8.01	0.94
5	5	2	10	10	2, 3	5	last unit of each cell of last column and last cell total	8.50	7.69	0.91
6	9	2	12	12	3, 6	1	last unit	12.00	11.86	0.99
7	9	2	12	12	3, 6	2	last any two units from last cell	12.00	11.72	0.98
8	9	2	12	12	3, 6	3	last any three units from last cell	12.00	11.59	0.97
9	9	2	12	12	3, 6	4	last any four units from last cell	12.00	11.47	0.96
10	9	2	12	12	3, 6	5	last any five units from last cell	12.00	11.34	0.94
11	9	2	12	12	3, 6	6	total last cell	12.00	11.21	0.93
12	9	2	12	12	3, 6	2	last unit of each cell of last column	12.00	11.73	0.98
13	9	2	12	12	3, 6	9	last unit of each cell of last column and last cell total	12.00	11.09	0.92
14	9	2	18	18	4, 5	1	last unit	18.00	17.86	0.99
15	9	2	18	18	4, 5	2	last any two units from last cell	18.00	17.73	0.99
16	9	2	18	18	4, 5	3	any three units from last cell	18.00	17.60	0.98
17	9	2	18	18	4, 5	4	last four units from last cell	18.00	17.47	0.97
18	9	2	18	18	4, 5	5	total last cell	18.00	17.34	0.96
19	9	2	18	18	4, 5	2	last unit of each cell of last column	18.00	17.73	0.99
20	9	2	18	18	4, 5	9	last unit of each cell of last column and last cell total	18.00	17.35	0.96
21	10	2	30	30	3, 7	1	last unit in last cell	29.76	29.64	1.00
22	10	2	30	30	3, 7	2	last two units from last cell	29.76	29.52	0.99
23	10	2	30	30	3, 7	3	last any three units from last cell	29.76	29.40	0.99
24	10	2	30	30	3, 7	4	last any four units from last cell	29.76	29.29	0.98
25	10	2	30	30	3, 7	5	last any five units from last cell	29.76	29.19	0.98
26	10	2	30	30	3, 7	6	last any six units from last cell	29.76	29.09	0.98
27	10	2	30	30	3, 7	7	last cell total	29.76	28.95	0.97
28	10	2	30	30	3, 7	2	last unit of each cell of last column	29.76	29.53	0.99
29	10	2	30	30	3, 7	8	last unit of each cell of last column and last cell total	29.76	28.84	0.97

It can be seen from Table 3.1 that the efficiency of the residual design is quite high for most of the designs. Of the 29 cases, 23 have an efficiency of at least 95% and hence are highly robust, whereas the remaining 6 have efficiencies between 90% and 95% and are therefore robust.

4. Conclusion

The series of GRC designs investigated is found to be robust against loss of observations. Efficiency decreases as the number of missing observations increases. The magnitude of the loss depends on the size of the design: smaller designs are more affected by missing observations.

References:

Bailey, R. A. (1988). Semi-Latin squares. Journal of Statistical Planning and Inference, 18, 299-312.

Bailey, R. A. (1992). Efficient semi-Latin squares. Statistica Sinica, 2, 413-437.

Bailey, R. A. and Monod, H. (2001). Efficient semi-Latin rectangles: Designs for plant disease experiments. Scandinavian Journal of Statistics, 28, 257-270.

Datta, A., Jaggi, S., Varghese, C. and Varghese, E. (2014). Structurally incomplete row-column designs with multiple units per cell. Statistics and Applications, 12(1&2), 71-79.

Datta, A., Jaggi, S., Varghese, C. and Varghese, E. (2015). Some series of row-column designs with multiple units per cell. Calcutta Statistical Association Bulletin, 67(265-266), 89-99.

Edmondson, R. N. (1998). Trojan square and incomplete Trojan square designs for crop research. Journal of Agricultural Science, 131, 135-142.

Jaggi, S., Varghese, C., Varghese, E. and Sharma, V. K. (2010). Generalized incomplete Trojan-type designs. Statistics and Probability Letters, 80, 706-710.

Jaggi, S., Varghese, C. and Varghese, E. (2016). A series of generalized incomplete Trojan-type designs. Journal of Combinatorics, Information and System Sciences: American Journal, 40(1-4), 53-60.

About Author / Additional Info:
• Working as a scientist since 2012
• Published around 35 research papers in national and international journals of repute
• Served as resource person in different institutes
• Received IARI merit medal for outstanding academic performance in Ph.D.
• Received Dr. G.R. Seth Young Scientist Award-2015 from the Indian Society of Agricultural Statistics
• Received Krishi Vigyan Gaurav (honorary title) from ARCC and BKAS
