# Optimization of Microring-Based Interconnection by Leveraging the Asymmetric Behaviors of Switching Elements

Piu-Hung Yuen and Lian-Kuan Chen*, Senior Member, IEEE*

*Abstract—***With the objectives of reducing power consumption, insertion loss, nonuniformity and crosstalk, optimal configurations for optical interconnection are derived by leveraging the unequal characteristics at bar or cross state of microring switching elements. A novel and efficient heuristic is proposed for finding the optimum switching configuration. It is shown that the optimum average total insertion loss per path using a** $2 \times 2$ **Basic Switching Element (2B-SE) achieves a 3.65-dB improvement for** $128 \times 128$ **switch size. The optimum average total insertion loss using 2B-SE in the worst-case path is shown to be 7.2 dB less than the baseline values. Furthermore, for** $128 \times 128$ **switch size, the minimum improvement is 7.2 dB for the nonuniformity in the worst case, whereas the total crosstalk has a 2.43-dB improvement. The effect of non-uniform switch elements is also studied.**

*Index Terms—***Optical interconnections, optical switching devices, optimization, resonators.**

## I. INTRODUCTION

**W**ITH the growing number of cores and computation resources per chip, advances in conventional interconnections are essential to meet the demands of high bandwidth capacity, low power consumption, compact footprint and high scalability. Recently, an ITRS report indicated that optical interconnection based on complementary metal-oxide-semiconductor (CMOS)-compatible silicon photonics is a suitable candidate that meets the aforementioned demands [1]. The carrier-injection-based silicon microring resonator is promising for large-scale-integrated optical interconnections using $2 \times 2$ switching elements (SEs) due to its very compact footprint and potential sub-nanosecond switching time [2]. However, the scalability of microring-based interconnection is limited by issues including power consumption, insertion loss, nonuniformity and crosstalk. With current technology, a $2 \times 2$ microring switching element requires non-negligible power consumption at bar or cross state on average [3]. Intrinsic insertion loss also limits the number of successive switching elements

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JLT.2013.2253761

per switching path [4], [5]. Also, the nonuniformity of optical power is an important issue in meeting the requirements of a large-scale optical interconnection. Last, each $2 \times 2$ SE leaks some optical power to the non-intended output-port, so many crosstalk contributions accumulate at each output-port. This total crosstalk at the output is severe and needs to be reduced for microring-based interconnection to scale.

To improve the scalability of microring-based interconnection, a mirroring technique which exploits the spatial dimension was proposed [6]. For each input-output pair, a routing algorithm chooses the path along either the normal plane or the mirrored plane, whichever gives the lower total insertion loss. However, one additional mirrored plane and one plane selector are required, so this scheme doubles the complexity of the normal case in terms of the number of microrings. Moreover, the plane selector introduces an additional insertion loss to the signals directed to the two planes. One study optimized the nonuniformity of optical power at the output-port by designing a new architecture (2D-SE) for a $2 \times 2$ microring-based switching element [7]. However, interconnections using 2D-SE double the complexity in terms of the number of microrings used. Also, an incoming signal suffers a higher insertion loss at both bar and cross state in 2D-SE compared with that in 2B-SE.

Regarding power consumption, it is estimated that the total energy consumption of the network devices of a typical network service provider is about 3 TWh per year [8], which costs about US\$500 million given an electricity price of US\$0.17/kWh. Tucker *et al.* showed that the capacities and power consumptions of high-end routers grow by factors of 2.5 and 1.65, respectively, every 18 months [9], [10]. Solutions for reducing power consumption in large-scale switch fabrics fall into two categories. One category relies on innovative technologies in materials and physical design. The other category exploits intelligent operational strategies. For example, a study on power-efficient interconnection networks uses dynamic voltage scaling of links [11].

Recently, we proposed a generic model to optimize microring-based interconnection configurations for the reduction of power consumption and insertion loss [12]. However, the crosstalk and nonuniformity issues have not yet been considered. Hence, this paper aims to give a comprehensive study of the optimization of microring-based interconnection configuration for the reduction of overall power consumption, average total insertion loss per path, average nonuniformity of optical power and total crosstalk at each output-port.

Manuscript received October 14, 2012; revised February 06, 2013; accepted March 09, 2013. Date of publication March 21, 2013; date of current version April 10, 2013. This work was supported in part by research grants from Hong Kong Research Grants Council (Project No: 410908).

The authors are with the Department of Information Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong (e-mail: yph010@ie.cuhk.edu.hk; lkchen@ie.cuhk.edu.hk).



Fig. 1. Microring-based switching elements. (a) 1B-SE. (b) 2B-SE. (c) 2D-SE.

The paper is organized as follows. The asymmetric behaviors of  $2 \times 2$  SE of microring in terms of unequal power consumption, insertion loss and crosstalk will be discussed in Section II. Section III focuses on our proposed optimization scheme for the reduction of power consumption and insertion loss. A novel and effective heuristic finding the optimum switch configuration is proposed. Simulation results will also be shown and discussed in this section. In Section IV, the principle to determine the optimum switching configurations for the reduction of nonuniformity and crosstalk will be illustrated, followed by simulation results. Section V discusses the case when the power consumptions of switch elements are not uniform and Section VI gives the conclusion.

## II. ASYMMETRIC BEHAVIORS AT BAR OR CROSS STATE

### *A. Power Consumption*

A microring with a simple architecture as shown in Fig. 1(a) (labeled 1B-SE) can be designed to be either on-resonance or off-resonance when no electrical power is supplied [7]. Therefore, a $2 \times 2$ SE which uses two rings as shown in Fig. 1(b) may have unequal power consumption at bar and cross state. One recently reported measurement for the average power consumption of a 1B-SE is 100 $\mu$W [13]. Another measurement shows that the SE architecture in Fig. 1(b) (labeled 2B-SE) requires nearly no power at cross state and less than 500 $\mu$W at bar state [14]. 2B-SE was experimentally demonstrated in [15], in which the physical geometry of the device and the experimental setup were shown. In this study, we assume that the power consumption of a 2B-SE is 200 $\mu$W at bar (cross) state and 0 $\mu$W at cross (bar) state without loss of generality.

### *B. Insertion Loss*

Due to the propagation loss inside the ring waveguide and the coupling between the busline and the ring, 1B-SE exhibits asymmetric insertion loss at on/off resonance [4], [5]. In [5], it is shown that the input signal suffers a significant power loss of 1.4 dB when the ring is in the drop configuration and a nearly negligible power loss of 0.1 dB when it is in the through configuration. In that particular case, the insertion loss for a 2B-SE at bar state is 1.4 dB and that at cross state is 0.2 dB. To resolve the issue of asymmetries, a new SE architecture is proposed in [5], as shown in Fig. 1(c) (labeled 2D-SE). It achieves equal insertion loss of 1.5 dB at both bar state and cross state. Although the asymmetric insertion loss is resolved, the total insertion loss of a large-scale optical interconnection has not been optimized, and a 2D-SE doubles the complexity of a 2B-SE in terms of the number of microrings used.

TABLE I SUMMARY OF ASSUMPTIONS IN 2B-SE.

| Assumptions in 2B-SE | Bar state   | Cross state |
|----------------------|-------------|-------------|
| Power consumption    | 200 $\mu$W  | 0 $\mu$W    |
| Insertion loss       | 1.4 dB      | 0.2 dB      |
| Crosstalk            | $-44.6$ dB  | $-17.8$ dB  |

It should be noted that waveguide loss beyond the microring insertion loss is not considered in this study.

### *C. Nonuniformity*

For a single 2B-SE, there is no nonuniformity issue at bar or cross state. However, due to the asymmetric insertion loss at bar and cross state, nonuniformity of optical power occurs when the switching elements are cascaded to build an optical interconnection. The 2D-SE architecture shown in Fig. 1(c) solves the nonuniformity issue at the expense of doubled complexity in terms of the number of microrings used.

### *D. Crosstalk*

For each 2B-SE, there are crosstalks from the residual signals appearing at the non-intended outputs in both the bar-state and cross-state configurations. The crosstalks of 2B-SE and 2D-SE are calculated in [7], based on the experimental 1B-SE crosstalk values measured in [4]. For 2B-SE, the crosstalks at bar state and cross state are assumed to be $-44.6$ dB and $-17.8$ dB, respectively. To resolve the issue of asymmetries, 2D-SE achieves equal crosstalk ($-39.7$ dB) at both bar state and cross state, similar to the case of insertion loss. Although the asymmetric crosstalk issue is resolved, the average total crosstalk at each output port has not been optimized, and the complexity of 2D-SE is double that of 2B-SE in terms of the number of microrings used.

The values assumed in this study are from [7], which were based on experimental measurements of 1B-SE and extended to 2B-SE by calculation. Hence, the effects of the physical parameters and their variations are already included in the assumed values listed in Table I.

## III. OPTIMIZATION SCHEME FOR THE REDUCTION OF POWER CONSUMPTION AND INSERTION LOSS

### *A. Principle*

The following describes a generic principle for determining the switching configurations that optimize the total power consumption and the average total insertion loss per path of classical switch architectures. A large-scale microring-based optical interconnection constructed by cascading $N$ 2B-SEs ($2 \times 2$ SEs) will be examined. In an $n \times n$ switch, there are $2^N$ switching configurations to satisfy $n!$ possible traffic matrices. For each traffic matrix, $T$, there may be more than one possible configuration as $2^N > n!$. A $4 \times 4$ Benes switch is selected as an example for illustration since it exhibits minimum complexity among various nonblocking architectures. As shown in Fig. 2, a $4 \times 4$ Benes switch can be viewed as a black box which is composed of $2 \times 2$ SEs only. For given input/



Fig. 2. Abstract of a  $4 \times 4$  Benes switch.



Fig. 3. Traffic-demand transformation. (a) Input/output pairs request. (b) Traffic matrix.



Fig. 4.  $4 \times 4$  Benes Interconnection using  $2 \times 2$  switching elements.

output pairs, we can always find the corresponding traffic matrix. Fig. 3 shows an example of traffic-demand transformation for transforming input/output pairs request to a traffic matrix,  $T_0$ .
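The counting argument and the traffic-demand transformation can be illustrated with a minimal sketch. It assumes the standard Benes stage count of $2\log_2 n - 1$ stages with $n/2$ SEs each, and a matrix convention `T[out][in] = 1` chosen here for illustration only; the request in `pairs` is hypothetical and is not the one drawn in Fig. 3. All function names are illustrative.

```python
from math import factorial

def benes_se_count(n: int) -> int:
    """Number of 2x2 SEs in an n x n Benes network (n must be a power of two)."""
    log2_n = n.bit_length() - 1          # exact log2 for powers of two
    stages = 2 * log2_n - 1
    return (n // 2) * stages

def traffic_matrix(pairs: dict, n: int) -> list:
    """Build an n x n 0/1 matrix from {input: output} pairs.

    Assumed convention (not taken from Fig. 3): T[out][inp] = 1.
    """
    T = [[0] * n for _ in range(n)]
    for inp, out in pairs.items():
        T[out][inp] = 1
    return T

n = 4
N = benes_se_count(n)                    # number of SEs
num_configs = 2 ** N                     # every SE is either bar or cross
num_demands = factorial(n)               # one traffic matrix per permutation

pairs = {0: 2, 1: 0, 2: 3, 3: 1}         # hypothetical request
T0 = traffic_matrix(pairs, n)
for row in T0:
    print(row)
print(N, num_configs, num_demands)
```

Because `num_configs` exceeds `num_demands`, at least some traffic matrices must be reachable by more than one configuration, which is the freedom the optimization exploits.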

The above interconnection can be expressed as $T = S_3 \times E_2 \times S_2 \times E_1 \times S_1$, which is equal to the following equation. It should be noted that there is one 2B-SE in each color block.

$$
T =
\begin{bmatrix} s_{3,1} & s'_{3,1} & 0 & 0 \\ s'_{3,1} & s_{3,1} & 0 & 0 \\ 0 & 0 & s_{3,2} & s'_{3,2} \\ 0 & 0 & s'_{3,2} & s_{3,2} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} s_{2,1} & s'_{2,1} & 0 & 0 \\ s'_{2,1} & s_{2,1} & 0 & 0 \\ 0 & 0 & s_{2,2} & s'_{2,2} \\ 0 & 0 & s'_{2,2} & s_{2,2} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} s_{1,1} & s'_{1,1} & 0 & 0 \\ s'_{1,1} & s_{1,1} & 0 & 0 \\ 0 & 0 & s_{1,2} & s'_{1,2} \\ 0 & 0 & s'_{1,2} & s_{1,2} \end{bmatrix}
\tag{1}
$$

The interconnection representation in (1) shows a generic decomposition of an interconnection based on $2 \times 2$ SEs. There are many possible switching configurations for achieving the same traffic matrix $T$. All possible configurations can be found by solving $T = S_3 \times E_2 \times S_2 \times E_1 \times S_1$. The variables $S_1$, $S_2$ and $S_3$ are stage matrices, consisting of the switching configurations of the 2B-SEs at the first, second and third stage, respectively, and $E_1$ and $E_2$ are the edge matrices for the interconnection between adjacent stages. Hence, $T$ can be expressed as (1). $s_{i,j}$ and $s'_{i,j}$ represent the bar and cross state of the $j^{\text{th}}$ 2B-SE at stage $i$ (represented by different color blocks in Fig. 4 and (1)), where $s_{i,j} = 1\,(0)$ denotes that the switch is at bar (cross) state. It should be noted that $s_{i,j}$ and $s'_{i,j}$ are binary and complementary to each other.
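As a rough illustration of how (1) can be solved by exhaustive search for the $4 \times 4$ case, the sketch below represents each stage matrix and edge matrix as a port permutation instead of a 0/1 matrix. The edge pattern (swap of the two middle ports) follows the edge matrices in (1); port indices start at 0, the request used at the end (the identity) is chosen only because it is easy to check by hand, and all names are illustrative.

```python
from itertools import product

EDGE = (0, 2, 1, 3)  # a signal at position i moves to position EDGE[i]

def stage(s_top: int, s_bottom: int) -> tuple:
    """Permutation of one stage holding two SEs; state 1 = bar, 0 = cross."""
    perm = []
    for k, s in enumerate((s_top, s_bottom)):
        a, b = 2 * k, 2 * k + 1
        perm += [a, b] if s == 1 else [b, a]
    return tuple(perm)

def compose(*perms) -> tuple:
    """Apply permutations in order: perms[0] first, then perms[1], and so on."""
    pos = list(range(4))                 # pos[i] = current position of input i
    for p in perms:
        pos = [p[x] for x in pos]
    return tuple(pos)

def enumerate_configs() -> dict:
    """Map every reachable request to the list of SE settings realizing it."""
    table = {}
    for cfg in product((0, 1), repeat=6):
        s11, s12, s21, s22, s31, s32 = cfg
        request = compose(stage(s11, s12), EDGE, stage(s21, s22),
                          EDGE, stage(s31, s32))
        table.setdefault(request, []).append(cfg)
    return table

table = enumerate_configs()
identity = (0, 1, 2, 3)
candidates = table[identity]
best = min(candidates, key=sum)          # fewest bar-state SEs
print(len(table), len(candidates), best)
```

For the identity request this search finds four configurations with 2, 4, 4 and 6 bar-state SEs, so the choice of configuration changes the bar-state count by a factor of three in this instance.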

First, for the given input/output pairs in Fig. 3(a), we can obtain the traffic matrix  $T_0$  as in Fig. 3(b). By analyzing  $T_0$ using (1), all possible switching configurations can be found. There are four possible configurations after solving the equation for the  $T_0$  in Fig. 3(b). Different configurations, depending on the number of bar or cross state, result in different total power consumption as well as different total insertion loss per path. Fig. 5(a)–(d) depict all four possible switching configurations for the given  $T_0$  and the corresponding number of bar-state



Fig. 5. (a)-(d) Possible configurations for traffic matrix  $T_0$ .

and cross-state switches. We first take the minimization of total power consumption as the objective function. The same principle holds for the reduction of total insertion loss per path. The operating power for a 2B-SE at bar state, $P_b$, is assumed to be higher than that at cross state, $P_c$, without loss of generality. The two configurations in Fig. 5(a) and (b) are the optimum switching configurations, with total power consumption $P_b + 5P_c$, whereas the other two in (c) and (d) have the higher total power consumption $3P_b + 3P_c$.

When the number of ports increases, the above equations are still valid by multiplying more stage matrices and edge matrices. Hence, the total power consumption of an  $n \times n$  Benes switch can be optimized by configuring the 2B-SEs appropriately.

### *B. Calculation of Power Consumption and Insertion Loss*

We assume the following parameters for 2B-SE in our study: (i) the power consumption at bar state  $(P_b)$  and cross state  $(P_c)$ are 200  $\mu$ W and 0  $\mu$ W, respectively; (ii) the insertion loss at bar state  $(IL_b)$  and cross state  $(IL_c)$  are 1.4 dB and 0.2 dB, respectively.

The principle of determining the optimum switching configuration for the reduction of total power consumption is discussed in the previous section. With the objective of reducing total insertion loss per path in 2B-SE, we apply the same principle as in the power consumption case. If the bar-state switch has higher power consumption and higher insertion loss, the two objectives of the reduction of the overall power consumption and the average total insertion loss per path can be achieved simultaneously by minimizing the total number of bar-state switches  $\sum_{i,j} s_{i,j}$ . The objective functions for minimizing the total power consumption,  $P_T$ , and the average total insertion loss per path,  $IL_T$ , are to minimize the total number of bar-state switches. Therefore, the objective function for the minimization of the total power consumption is:

$$
\min_{s_{i,j},\, s'_{i,j} \in \{0, 1\}} \left\{ P_b \sum_{i,j} s_{i,j} + P_c \sum_{i,j} s'_{i,j} \right\} \tag{2}
$$



Fig. 6. Switching configuration optimization by the looping algorithm. (a) Switching configuration before optimization. (b) Optimum switching configuration for outer stages.

Similarly, the objective function for the minimization of the average total insertion loss per path is:

$$
\min_{s_{i,j},\, s'_{i,j} \in \{0, 1\}} \left\{ \frac{\left[ IL_b \sum_{i,j} s_{i,j} + IL_c \sum_{i,j} s'_{i,j} \right]}{n} \right\} \tag{3}
$$
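The two objectives can be evaluated for a candidate configuration with a few lines of code. The sketch below assumes the Table I values and simply evaluates the bracketed expressions of (2) and (3) as written; the bar-state counts 1 and 3 correspond to the $P_b + 5P_c$ and $3P_b + 3P_c$ cases of the $4 \times 4$ example, and the function names are illustrative.

```python
P_B, P_C = 200.0, 0.0      # power at bar / cross state, in microwatts (Table I)
IL_B, IL_C = 1.4, 0.2      # insertion loss at bar / cross state, in dB (Table I)

def objective_power(num_bar: int, num_se: int) -> float:
    """Bracketed expression of (2): total power for num_bar bar-state SEs."""
    num_cross = num_se - num_bar
    return P_B * num_bar + P_C * num_cross

def objective_insertion_loss(num_bar: int, num_se: int, n: int) -> float:
    """Bracketed expression of (3): summed SE loss divided by the port count n."""
    num_cross = num_se - num_bar
    return (IL_B * num_bar + IL_C * num_cross) / n

# 4 x 4 Benes example: 6 SEs, either 1 or 3 of them in the bar state
for num_bar in (1, 3):
    print(num_bar,
          objective_power(num_bar, 6),
          round(objective_insertion_loss(num_bar, 6, 4), 3))
```

Both expressions increase with the number of bar-state SEs whenever $P_b > P_c$ and $IL_b > IL_c$, which is why a single count can serve both objectives.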

### *C. Heuristic*

To find the optimum switching configuration with shorter computation time, we develop a novel heuristic to minimize the number of bar-state switches (i.e., changed to the switching configuration of lower insertion loss or lower power consumption). For Benes architecture, the heuristic presents an iterative procedure for changing bar state to cross state recursively for each loop. Fig. 6 shows the switching configuration optimization by a looping algorithm. The procedure is illustrated as follows:

*1) Step 1:* For a given initial configuration, the inputs of the first stage switches and the outputs of the last stage's switches are connected by a dotted line. Then a few loops, denoted by different colors, will be formed as shown in Fig. 6(a). The change of switch states of the first and the last stages in one loop are independent of those in the other loops, hence the connection matrix can be maintained if the switch states in the first and the last stages in one loop are all changed simultaneously. It should be noted that the dotted lines at the inputs and the outputs are only for the illustration of the looping algorithm, not physical connections.

*2) Step 2:* For each loop, all SEs at the first stage (e.g., SE of port $I_1$ and $I_2$; SE of port $I_3$ and $I_4$) and the last stage (e.g., SE of port $O_3$ and $O_4$; SE of port $O_5$ and $O_6$) shall be changed to their opposite states at the same time if there are more bar-state SEs than cross-state SEs in that loop. Otherwise, the SEs at the first and the last stage of that loop are left unchanged.

*3) Step 3:* As shown in Fig. 6(a), if the loop passes through two central modules ($M_1$ and $M_2$), the input-output pairs in the same loop at $M_1$ (ports in $M_1$: $1 \rightarrow 2'$; $2 \rightarrow 3'$) will be interchanged with the input-output pairs in the same loop at $M_2$ (ports in $M_2$: 1  $\rightarrow$



Fig. 7. Simulation results of the power saving for different number of ports.

 $3'$ ;  $2 \rightarrow 2'$ ). Fig. 6(b) shows the switching configuration after changing bar state to cross state.

*4) Step 4:* An $n \times n$ Benes switch can be decomposed into two $n/2 \times n/2$ modules in the middle, with $2 \times 2$ SEs at the first and the last stage. Each $n/2 \times n/2$ module can in turn be broken down into three stages, with $2 \times 2$ SEs at its first and last stage, and the procedure repeats from *step 1*. The change is performed recursively from the outer stages to the inner stages until the central modules become $2 \times 2$ SEs.
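Steps 1 and 2 can be sketched in code as follows. This is only one interpretation of the loop construction, assuming that ports $2k$ and $2k+1$ (numbered from 0) share SE $k$ at both the first and the last stage, and that a loop is the set of outer-stage SEs linked through the requested input-output pairs. The exchange in the central modules (Step 3) and the recursion (Step 4) are not implemented, so the returned outer-stage states realize the same request only after Step 3 is applied. All names are illustrative.

```python
def find_loops(perm):
    """Step 1: group first-stage and last-stage SEs into loops.

    perm[i] is the output port requested by input port i.
    Returns a list of (input_SE_set, output_SE_set) tuples.
    """
    n = len(perm)
    inv = [0] * n
    for inp, out in enumerate(perm):
        inv[out] = inp
    visited = set()
    loops = []
    for start in range(n // 2):
        if start in visited:
            continue
        in_ses, out_ses = set(), set()
        stack = [start]
        while stack:
            k = stack.pop()
            if k in in_ses:
                continue
            in_ses.add(k)
            for port in (2 * k, 2 * k + 1):
                out_se = perm[port] // 2
                out_ses.add(out_se)
                # the other port of that output SE leads back to an input SE
                for out_port in (2 * out_se, 2 * out_se + 1):
                    stack.append(inv[out_port] // 2)
        visited |= in_ses
        loops.append((in_ses, out_ses))
    return loops

def flip_outer_stages(perm, first, last):
    """Step 2: flip all outer SEs of a loop when bar states (1) outnumber cross (0)."""
    first, last = list(first), list(last)
    for in_ses, out_ses in find_loops(perm):
        states = [first[k] for k in in_ses] + [last[k] for k in out_ses]
        bars = sum(states)
        if bars > len(states) - bars:
            for k in in_ses:
                first[k] ^= 1
            for k in out_ses:
                last[k] ^= 1
    return first, last

print(len(find_loops([0, 1, 2, 3])))                     # two independent loops
print(flip_outer_stages([0, 1, 2, 3], [1, 1], [1, 1]))   # all outer SEs flipped
```

Since the loops are independent, each one can be decided on its own, which is what lets the heuristic avoid enumerating all $2^N$ configurations.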

After the heuristic, the number of bar-state switches is minimized and thus the optimum switching configuration for minimizing the total power consumption and the average total insertion loss per path is achieved. Our heuristic can reduce the computation time to one third of that of the exhaustive search. The improvement using the heuristic is substantial at larger switch sizes. Hence, the proposed heuristic for finding the optimum switching configuration is efficient.

### *D. Results and Analysis*

*1) Power Consumption:* Using the assumed values in Section II, the power saving using the optimum switching configuration for a given switch size is obtained and compared with the average power consumption without optimization. The results in Fig. 7 show that significant power savings can be achieved for Benes, Spanke-Benes and Crossbar switches [16] at $128 \times 128$ switch size. The curves show that the power saving of the optimum switching configuration increases with the number of ports. The Crossbar switch has the best performance due to its high flexibility. As the Crossbar switch exhibits the lowest blocking probability among the three architectures, it has more possible switching configurations to satisfy a given traffic matrix. Thus, its power saving is the highest among the three architectures.

Fig. 8 depicts the absolute power consumption for different switch architectures. It increases nonlinearly with the number of ports. Although the Crossbar switch achieves a high power saving, its absolute power consumption is almost double that of the Benes switch for $128 \times 128$ switch size due to the high complexity of the Crossbar switch.

We further investigate the effect of symmetry between the cross and bar states on the relative power saving. $R$ is defined as $R = P_c/P_b$. We denote by $C$ the sum of $P_c$ and $P_b$, $C = P_c + P_b$, so that $P_c$ and $P_b$ can be written as $P_c = CR/(1 + R)$ and $P_b = C/(1 + R)$, respectively. Without optimization, since each traffic demand is equally probable, the numbers of bar-state switches and cross-state switches both equal $N/2$



Fig. 8. Simulation results of the absolute power consumption for different switch architectures for different number of ports.



Fig. 9. Simulation results of the power saving for different degree of power symmetry with different number of ports in Benes architecture.

on average. Hence, the total power consumption before and after optimization are $P_{T\text{-org}} = C(N/2 + N/2)/2 = CN/2$ and $P_{T\text{-opt}} = P_b\overline{N}_b + P_c\left(N - \overline{N}_b\right)$, respectively, where $\overline{N}_b$ is the minimum total number of bar-state switches at optimum configuration. The relative power saving, $r$, is simplified as:

$$
r = \frac{P_{T\text{-org}} - P_{T\text{-opt}}}{P_{T\text{-org}}} = 1 - \left[\frac{R}{1+R}\right] \left[2 - \frac{2\overline{N}_b}{N}\right] - \left[\frac{1}{1+R}\right] \left[\frac{2\overline{N}_b}{N}\right] \tag{4}
$$
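As a quick numerical check of (4), the sketch below evaluates the closed form and compares it with the direct ratio of the total powers before and after optimization. The values used (6 SEs, one bar-state SE at the optimum, the Table I powers) are illustrative inputs only, not simulation results, and the function names are hypothetical.

```python
def relative_saving(R: float, bar_fraction: float) -> float:
    """Closed form (4); bar_fraction is the ratio of N_b-bar to N."""
    share_cross = R / (1 + R)
    share_bar = 1 / (1 + R)
    return 1 - share_cross * (2 - 2 * bar_fraction) - share_bar * (2 * bar_fraction)

def relative_saving_direct(P_b: float, P_c: float, N: int, N_b_min: int) -> float:
    """Same quantity computed from the totals before and after optimization."""
    p_org = (P_b + P_c) * N / 2
    p_opt = P_b * N_b_min + P_c * (N - N_b_min)
    return (p_org - p_opt) / p_org

P_b, P_c, N, N_b_min = 200.0, 0.0, 6, 1      # illustrative inputs
closed = relative_saving(P_c / P_b, N_b_min / N)
direct = relative_saving_direct(P_b, P_c, N, N_b_min)
print(round(closed, 4), round(direct, 4))

# Fully symmetric SEs (R = 1) leave nothing to optimize
print(relative_saving(1.0, 0.3))
```

The second print illustrates that the saving vanishes when $R = 1$, whatever the bar-state fraction is.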

It is observed that the relative power saving depends on the degree of power symmetry, $R$, and the fraction of switches in bar state, $\overline{N}_b/N$. Thus, optical interconnections whose SEs have the same $R$ achieve the same relative power saving regardless of the absolute power consumption of each SE.

Fig. 9 shows the results of power saving for different degree of power symmetry. Results reveal that the higher the degree of power symmetry, the lower the power saving. Power saving attains its maximum when  $P_c = 0$ . In other words, it is favorable to fabricate a microring SE with higher asymmetry in power consumption at cross or bar state for lower overall power consumption, assuming the total power consumption of bar and cross-state is constant.

*2) Insertion Loss:* The total insertion loss per path is a dominant factor for the scalability of optical interconnections. Simulations are performed to minimize (1) the average total insertion loss per path and (2) the insertion loss of the worst path. Fig. 10 shows that the average total insertion loss per path increases with the number of ports in Benes architecture. The average total insertion loss per path using 2B-SE achieves a



Fig. 10. Total insertion loss per path with different number of ports in Benes architecture.

3.65-dB improvement while the optimum average total insertion loss using 2B-SE in worst-case path (WCP) has a 7.2-dB improvement for  $128 \times 128$  switch size at optimum switching configurations compared with the baseline values without optimization. There is no optimization for microring-based interconnection using 2D-SE because it attains the same insertion loss at bar or cross state.

## IV. OPTIMIZATION SCHEME FOR THE REDUCTION OF NONUNIFORMITY AND CROSSTALK

### *A. Principle*

The following describes a generic principle in determining the optimum switching configurations of nonuniformity of optical power and total crosstalk at each output port of classical switch architectures. The interconnection representation of a microring-based optical interconnection is the same as that discussed in Section III. By applying the principle for identifying all the possible switching configurations for given input/output pairs, we can always find the corresponding traffic matrix as well as all possible switching configurations. Different configurations, depending on the number of bar or cross state that each path passes through, result in different nonuniformity of optical power among all output-ports as well as different total crosstalk at each output.

A $4 \times 4$ Benes switch composed of $2 \times 2$ switching elements is selected as an example for illustration. First, for the given input/output pairs, $T_0$ in Fig. 11(b) is analyzed. All possible switching configurations are depicted in Fig. 12(a)–(d).

### *B. Calculation of Nonuniformity*

We first take the minimization of nonuniformity of optical power among different output-ports as the objective function. The insertion loss for a 2B-SE at bar state,  $IL_b$ , is assumed to be higher than that at cross state,  $IL_c$ , without loss of generality. The two configurations in Fig. 12(a) and (d) are the optimum switching configurations with no nonuniformity (uniform power), whereas the other two in (b) and (c) are of high nonuniformity (=  $2IL_b - 2IL_c$ ). It should be noted that the nonuniformity of optical power among different output-ports is



Fig. 11. Traffic-demand transformation. (a) Input/output pairs request. (b) Traffic matrix.



Fig. 12. (a)-(d) Possible configurations for traffic matrix  $T_0$ .

calculated by the difference between the largest optical output power and smallest optical output power among all output-ports.
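The nonuniformity calculation can be sketched as follows, assuming equal optical power at every input so that the spread in output power equals the spread in per-path insertion loss. The per-path bar-state counts are hypothetical and are not read from Fig. 12; the insertion-loss values are those of Table I.

```python
IL_B, IL_C = 1.4, 0.2   # insertion loss at bar / cross state, in dB (Table I)

def path_loss(num_bar: int, num_stages: int) -> float:
    """Total insertion loss of one path crossing num_stages SEs."""
    return num_bar * IL_B + (num_stages - num_bar) * IL_C

def nonuniformity(bar_counts, num_stages: int) -> float:
    """Largest minus smallest output power (dB), assuming equal input powers."""
    losses = [path_loss(b, num_stages) for b in bar_counts]
    return max(losses) - min(losses)

# Hypothetical 3-stage examples: bar-state SEs seen by each of the four paths
uneven = [2, 0, 2, 0]    # spread of two bar states between paths
even = [1, 1, 1, 1]      # every path sees the same number of bar states
print(round(nonuniformity(uneven, 3), 3))   # 2*IL_b - 2*IL_c
print(round(nonuniformity(even, 3), 3))
```

Only the difference in bar-state counts between the best and worst path matters, so a configuration can be uniform even if it does not minimize the total number of bar-state SEs.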

When the number of ports increases, the above calculations are still valid by multiplying more stage matrices and edge matrices. Hence, the nonuniformity of optical power of an $n \times n$ Benes switch can be optimized by configuring the 2B-SEs appropriately. The switching configuration with the lowest nonuniformity of optical power among different output-ports will be selected as the optimum switching configuration.

### *C. Calculation of Crosstalk*

With the minimization of the total crosstalk at each output-port as the objective function, we first find all possible switching configurations for a given traffic matrix $T_0$. After identifying all possible switching configurations as in Fig. 12(a)–(d), we need to determine the optimum switching configuration that gives the smallest total crosstalk at each output-port on average. The crosstalk for a 2B-SE at cross state is assumed to be higher than that at bar state without loss of generality. The switching configuration in Fig. 12(a) is taken as an example to illustrate the calculation of the total crosstalk at each output-port. For better understanding, we re-draw the switching configuration in Fig. 12(a) as shown in Fig. 13. Crosstalk power and signal power are marked by dotted lines and thick lines, respectively.

In this three-stage Benes interconnection, there are first-order, second-order and third-order crosstalks. The highest order is the same as the number of stages for an $S$-stage Benes interconnection. The number of $n$th-order crosstalks destined to each output-port in an $S$-stage Benes interconnection is



Fig. 13. One possible switching configuration for traffic matrix $T_0$, showing signal and crosstalk power.



Fig. 14. Contributions of different-order crosstalks to output-port 1'. (a) Three 1st-order crosstalks. (b) Three 2nd-order crosstalks. (c) One 3rd-order crosstalk.

$C_n^S = \binom{S}{n}$. Hence, for each output-port, there are three 1st-order, three 2nd-order and one 3rd-order crosstalks in this 3-stage Benes interconnection.
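The counts quoted for the 3-stage case can be reproduced with a short sketch, assuming the count is the binomial coefficient "$S$ choose $n$" (which matches the three, three and one crosstalks stated above). The function name is illustrative.

```python
from math import comb

def crosstalk_counts(num_stages: int) -> dict:
    """Number of n-th-order crosstalk terms per output-port, n = 1..num_stages."""
    return {order: comb(num_stages, order) for order in range(1, num_stages + 1)}

counts = crosstalk_counts(3)
total_terms = sum(counts.values())
print(counts, total_terms)
```

Under this assumption the number of crosstalk terms per output-port grows as $2^S - 1$ with the number of stages, although higher-order terms are attenuated at every additional SE they leak through.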

For each output-port, the calculation of the total crosstalk is similar. The total crosstalk at output-port 1' is contributed by the first-order, second-order and third-order crosstalks as shown in Fig. 14(a)–(c), respectively. After determining the total crosstalk at every output-port, we take the average value as well as the maximum value among all the output-ports. As different switching configurations result in different total crosstalk at each output-port, the switching configuration that gives the smallest total crosstalk is chosen as the optimum switching configuration.

### *D. Results and Analysis*

*1) Nonuniformity:* Using the measurement values as mentioned in Section II, we first calculate the value of the total insertion loss per path at each output-port and then obtain the nonuniformity of optical power among different output-ports. The nonuniformity of optical power under the optimum switching configuration for a given switch size is obtained. Simulation results are shown in Fig. 16.

In our simulations, we have investigated several nonuniformity cases. The maximum nonuniformity corresponds to the worst case, in which the switching configuration gives the highest nonuniformity for a given traffic matrix, whereas the minimum nonuniformity corresponds to the optimum switching configuration. The average nonuniformity is the average over all switching configurations at one switch size without optimization. The average of the optimum nonuniformity and the average of the maximum nonuniformity over all traffic matrices increase with the switch size as shown in Fig. 16. The optimum nonuniformity for $128 \times 128$ switch size has a 3.21-dB improvement on average, compared with the average nonuniformity without optimization.



Fig. 15. Contributions of different order crosstalks with different switch size.



Fig. 16. Nonuniformity of optical power among different output ports with different switch size.

For each traffic matrix, the worst case corresponds to the switching configuration that gives the maximum nonuniformity. Besides studying the average of the worst-case nonuniformity among all traffic matrices at one particular switch size, we also investigate the maximum nonuniformity among all traffic matrices (i.e., the worst case of the worst case, labeled as WWC). However, among different traffic matrices at one particular switch size, the WWC may not be unique. This is because several traffic matrices may have the same worst-case nonuniformity before optimization but different improvements after optimization. Hence, we investigate the minimization of the nonuniformity among different WWCs using the optimum switching configuration and identify the best improvement (i.e., the minimum nonuniformity of different WWCs), the average improvement (i.e., the average nonuniformity of different WWCs) and the minimum improvement (i.e., the maximum nonuniformity of different WWCs). Results show that the optimization achieves significant improvement. With the optimum switching configuration, the best improvement is 9.6 dB, the average improvement is 8.7 dB and the minimum improvement is 7.2 dB for $128 \times 128$ switch size. It should be noted that the



Fig. 17. Total crosstalk at each output-port with different switch size.

nonuniformity of optical power among different output-ports is linearly proportional to the difference between the insertion loss at bar state and that at cross state.

*2) Crosstalk:* Besides the nonuniformity, the total crosstalk at each output-port with the optimum switching configuration is investigated for different switch sizes. The results in Fig. 17 show that the optimum switching configuration achieves a 1.87-dB improvement on average for  $128 \times 128$  switch size, compared with the average case without optimization. We also investigate the improvement of the highest total crosstalk for a particular traffic matrix (i.e., the worst case, in which the switching configuration gives the highest total crosstalk at one output-port). In this worst case, the improvement of the total crosstalk at the output-port using our optimum scheme is 2.43 dB for  $128 \times 128$  switch size.

It should be noted that the improvement of the total crosstalk at each output-port is less significant than that of the total power consumption, the insertion loss per path and the nonuniformity among different output-ports. This is because the total crosstalk at each port includes all the possible crosstalk powers from every input-port, whereas in the previous cases the parameters concerned (power consumption, insertion loss, and nonuniformity) are calculated for each path without any contribution from other paths.

## V. DISCUSSION

To illustrate the effect of nonuniformity of switching elements, variation of power consumption is taken as an example in the following discussion. The objective function for the minimization of the total power consumption  $(\tilde{P}_T)$  is shown in (5), where  $\tilde{N}_b$  denotes the possible number of bar-state switches, and  $\tilde{P}_b$  and  $\tilde{P}_c$  denote the possible power consumption at bar state and at cross state, respectively, when the power consumption is nonuniform among different switch elements. The possible number of bar-state switches is lower-bounded by  $\overline{N}_b$ , the aforementioned minimum number of bar-state switches at the optimum configuration due to the connectivity constraint, and is upper-bounded by  $N - \overline{N}_b$  due to the characteristics of the Benes architecture. It is assumed that the possible power consumption at cross state is always less than or equal to that at bar state, i.e.,  $\max\left\{\tilde{P}_{c}\right\} \leq \min\left\{\tilde{P}_{b}\right\}$ . For a given traffic matrix, the optimization can be expressed as

$$
\min \left\{ \tilde{P}_T \right\} = \min \left\{ \sum_{i=1}^{\tilde{N}_b} \tilde{P}_{bi} + \sum_{i=1}^{N - \tilde{N}_b} \tilde{P}_{ci} \right\} \tag{5}
$$

where  $\tilde{P}_{bi}$  and  $\tilde{P}_{ci}$  are the power consumption of the  $i^{\text{th}}$  bar-state switch and cross-state switch, respectively. For the objective function in (5), the optimum number of bar-state switches should be found first. Since  $\tilde{P}_b$  is always greater than  $\tilde{P}_c$ , it is straightforward to show that the optimum value of  $\tilde{N}_b$  is equal to  $\overline{N}_b$ , as any other switch configuration that has an  $\tilde{N}_b$  larger than  $\overline{N}_b$  will lead to an increase in  $\tilde{P}_T$ . If the power consumption of each switch element is known, the minimum  $\tilde{P}_T$  can be derived by calculating  $\tilde{P}_T$  in (5) for the various switch configurations with  $\tilde{N}_b = \overline{N}_b$  and then selecting the minimal one. Given the distributions of  $\tilde{P}_b$  and  $\tilde{P}_c$ , the distribution of  $\tilde{P}_T$  can also be found based on (5) with  $\tilde{N}_b$  equal to  $\overline{N}_b$ .
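The selection step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used for the reported results: each switch is assumed to have a known bar-state and cross-state power, the feasible configurations with $\tilde{N}_b = \overline{N}_b$ are assumed to be given (their enumeration depends on the Benes routing and is not shown), and the names `total_power` and `min_power_config` as well as all numeric values are hypothetical.

```python
def total_power(bar_switches, p_bar, p_cross):
    """Evaluate the sum in (5) for one configuration.

    bar_switches: set of indices of the switches in bar state.
    p_bar[i], p_cross[i]: power of switch i at bar and at cross state.
    """
    return sum(
        p_bar[i] if i in bar_switches else p_cross[i]
        for i in range(len(p_bar))
    )


def min_power_config(candidates, p_bar, p_cross):
    """Return the candidate configuration with the smallest total power."""
    return min(candidates, key=lambda bars: total_power(bars, p_bar, p_cross))


# Hypothetical per-switch powers (arbitrary units) for N = 4 switches,
# chosen so that max(p_cross) <= min(p_bar), as assumed in the text.
p_bar = [5, 6, 7, 5]
p_cross = [1, 2, 1, 3]

# Hypothetical feasible configurations, each with two bar-state switches
# (standing in for the configurations with the minimum bar-state count).
candidates = [frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 3})]

best = min_power_config(candidates, p_bar, p_cross)
best_power = total_power(best, p_bar, p_cross)
```

For these made-up values the three candidates cost 15, 17 and 13 units, so the configuration with switches 0 and 3 in bar state is selected.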

## VI. CONCLUSIONS

In this paper, we have demonstrated that, with the optimum switching configuration, the overall power consumption and the average total insertion loss per path can be reduced for microring-based interconnection. A new heuristic is proposed and is shown to reduce the computation time for finding the optimum switching configuration to one third of that of the exhaustive search. The results also show that the optimum average total insertion loss per path using 2B-SE achieves a 3.65-dB improvement for  $128 \times 128$  switch size. The optimum average total insertion loss using 2B-SE in the worst-case path is shown to be 7.2 dB less than the baseline value without optimization. In addition, with the optimum switching configuration, the nonuniformity of optical power among different output-ports and the total crosstalk at each output-port can be reduced for microring-based interconnection. Finally, we have investigated the effect of nonuniform switching elements. It should be noted that our proposed optimization scheme is generic, and the principle can be applied to any kind of switching fabric as long as there are asymmetric properties in power consumption and insertion loss.

## REFERENCES

- [1] *The International Technology Roadmap for Semiconductors*, Semiconductor Industry Association, 2011.
- [2] A. W. Poon, X. Luo, F. Xu, and H. Chen, "Cascaded microresonator-based matrix switch for silicon on-chip optical interconnection," *Proc. IEEE*, vol. 97, no. 7, pp. 1216–1238, 2009.
- [3] P. Dong, S. Preble, and M. Lipson, "All-optical compact silicon comb switch," *Opt. Exp.*, vol. 15, no. 15, pp. 9600–9605, Jul. 2007.
- [4] B. G. Lee, A. Biberman, P. Dong, M. Lipson, and K. Bergman, "All optical comb switch for multiwavelength message routing in silicon photonic networks," *IEEE Photon. Technol. Lett.*, vol. 20, no. 10, pp. 767–769, 2008.
- [5] A. Shacham, K. Bergman, and L. P. Carloni, "Photonic networks-on-chip for future generations of chip multiprocessors," *IEEE Trans. Comput.*, vol. 57, no. 9, p. 1246, 2008.
- [6] A. Bianco, D. Cuda, M. Garrich, R. Gaudino, G. Gavilanes, P. Giaccone, and F. Neri, "Optical interconnection architectures based on microring resonators," in *Proc. IEEE Int. Photonics Switching Conf.*, Sep. 2009, pp. 1–2.
- [7] A. Bianco, D. Cuda, R. Gaudino, G. Gavilanes, F. Neri, and M. Petracca, "Scalability of optical interconnects based on microring resonators," *IEEE Photon. Technol. Lett.*, vol. 22, pp. 1081–1083, 2010.
- [8] G. Shen and R. S. Tucker, "Energy-minimized design for IP Over WDM networks," *J. Opt. Netw.*, vol. 1, no. 1, pp. 176–186, Jun. 2009.
- [9] J. Baliga, K. Hinton, and R. S. Tucker, "Energy consumption of the internet," in *Proc. COIN-ACOFT*, Melbourne, Australia, Jun. 24–27, 2007, pp. 1–3.
- [10] J. Baliga, R. Ayre, K. Hinton, W. V. Sorin, and R. S. Tucker, "Energy consumption in optical IP networks," *J. Lightw. Technol.*, vol. 27, no. 13, pp. 2391–2403, Jul. 2009.
- [11] L. Shang, L. S. Peh, and N. K. Jha, "Dynamic voltage scaling with links for power optimization of interconnection networks," in *Proc. 7th Int. Symp., High-Performance Comput. Architecture*, Feb. 2003, pp. 91–102.
- [12] P.-H. Yuen and L.-K. Chen, "Optimization of microring-based optical interconnection configurations for the reduction of power consumption and insertion loss," in *Proc. OFC/NFOEC*, 2012, Paper OW4I.6.
- [13] N. Kirman and J. F. Martínez, "A power-efficient all-optical on-chip interconnect using wavelength-based oblivious routing," in *Proc. 15th ASPLOS*, 2010, pp. 15–28.
- [14] B. A. Small, B. G. Lee, K. Bergman, Q. Xu, and M. Lipson, "Multiple-wavelength integrated photonic networks based on microring resonator devices," *J. Opt. Netw.*, vol. 6, no. 2, pp. 112–120, 2007.
- [15] B. G. Lee, A. Biberman, N. Sherwood-Droz, C. B. Poitras, M. Lipson, and K. Bergman, "High-speed  $2 \times 2$  switch for multi-wavelength message routing in on-chip silicon photonic networks," in *Proc. ECOC*, Sep. 21–25, 2008, pp. 1–2.
- [16] G. I. Papadimitriou, C. Papazoglou, and A. S. Pomportsis, "Optical switching: Switch fabrics, techniques, and architectures," *J. Lightw. Technol.*, vol. 21, no. 2, pp. 384–405, Feb. 2003.
- [17] I. T. Monroy and E. Tangdiongga, "Performance evaluation of optical cross-connects by saddlepoint approximation," *J. Lightw. Technol.*, vol. 16, no. 3, pp. 317–323, 1998.
- [18] L. Gillner, C. P. Larsen, and M. Gustavsson, "Scalability of optical multiwavelength switching networks: Crosstalk analysis," *J. Lightw. Technol.*, vol. 17, no. 1, pp. 58–67, 1999.
- [19] T. Gyselings, G. Morthier, and R. Baets, "Crosstalk analysis of multiwavelength optical cross connects," *J. Lightw. Technol.*, vol. 17, no. 8, pp. 1273–1283, Aug. 1999.
- [20] J. Sun, "Influence of crosstalk on the scalability of WDM cross-connect networks," *Proc. Inst. Elect. Eng. Optoelectron.*, vol. 149, no. 2, pp. 59–64, Apr. 2002.

**Piu-Hung Yuen** received the B.Eng. and M.Phil. degrees, both in information engineering, from The Chinese University of Hong Kong in 2009 and in 2012, respectively.

His current research interests include optical interconnections, simulation, and network optimization.

**Lian-Kuan Chen** (S'85–M'92–SM'04) received the B.S. degree from National Taiwan University, Taipei, Taiwan, in 1983, and the M.S. and Ph.D. degrees from Columbia University, New York, NY, USA, in 1987 and 1992, respectively, all in electrical engineering.

He worked at Jerrold Communications, General Instruments (GI), USA, during 1990–1991 and engaged in research on linear lightwave video distribution systems, with contributions on distortion reduction schemes for various optical components. He joined the Department of Information Engineering, The Chinese University of Hong Kong, and established the Lightwave Communications Laboratory in 1992. He was the Chairman of the Department during 2004–2006. His current research interests include broadband local access networks, photonic and electronic signal processing, performance monitoring and optimization of optical networks, and biophotonics. He has published over 200 papers in international conferences and journals, primarily in the field of optical communications.

Prof. Chen is a Senior Member of the IEEE Photonics Society, the IEEE Communications Society, and a member of OSA. He was an Associate Editor of IEEE PHOTONICS TECHNOLOGY LETTERS during 2005–2011.