# Battery Cluster Fault-Tolerant Control for High Voltage Transformerless Grid-Tied Battery Energy Storage System

Xiqi WU, Shuai GAO, YiLin LIU, Rui LI, Xinyu JIANG, and Xu CAI

Abstract—Battery fault-tolerant operation is an important issue for large-capacity cascaded H-bridge converter-based battery energy storage systems (BESSs). The conventional redundancy method of bypassing whole submodules (SMs) at the ac side may lead to an insufficient modulation ratio margin and wastes the potential of the healthy H-bridge part. First, ac-side bypassing of SMs is compared with dc-side disconnection of the battery cluster, and the concept of a new battery cluster fault tolerance strategy is discussed. To give full play to the grid voltage support capability of the faulty module, a battery cluster fault tolerance operating control combining the proposed fault-tolerant strategy with optimal zero-sequence voltage injection is proposed in this paper. As a result, it effectively enhances the capability of handling battery warnings and failures and is able to tolerate any number of faulty battery cluster modules. In addition, state of charge (SOC) balancing of healthy SMs and capacitor voltage balancing of faulty SMs are incorporated into the control algorithm. Finally, the effectiveness of the proposed battery fault tolerance strategy is verified for a 10 kV/5 MW BESS with 14 SMs per phase by MATLAB/Simulink simulations and hardware-in-loop experiments.

*Index Terms*—Battery energy storage system, cascaded H-bridge, fault tolerance, redundant control.

#### I. INTRODUCTION

SCALE-up application of energy storage is necessary to accommodate a high proportion of fluctuating wind and solar generation. Battery energy storage is the fastest-growing energy storage method, and the scale of energy storage power plants is moving from the hundred-MWh level to the GWh level [1]–[3]. The technical bottleneck hindering large single-unit capacity is the variation among battery cells caused by inconsistent manufacturing processes and inhomogeneous operating environments. This inconsistency leads to the barrel effect among series-connected battery cells and to circulating currents among parallel-connected battery clusters, which reduces the available capacity of the battery and increases the loss of the battery system; moreover, local cell faults are prone to trigger safety problems [4], [5]. As a result, the single-unit capacity of a traditional battery energy storage system (BESS) based on a two-level converter generally does not exceed 1 MW, limited by technologies such as battery safety, cell grouping methods and the battery management system (BMS).

The transformerless grid-tied BESS (TGT-BESS) is based on the cascaded H-bridge (CHB) converter and has a highly modular configuration, which facilitates capacity expansion and redundancy design. It can be connected to MVAC or HVAC grids without a bulky line-frequency transformer (LFT), eliminating the losses caused by the LFT [6], [7]. These advantages favor the use of TGT-BESS in large-capacity energy storage applications on the grid side and the new energy plant side. As shown in Fig. 1, the large-capacity battery stack is separated into numerous individual cluster units, each connected to an H-bridge circuit, which avoids circulating current in the battery stack, reduces the system cycle loss and improves system safety at the same time [8]. Compared with the traditional BESS, the TGT-BESS features a large capacity per converter unit and requires fewer parallel units to form a large-scale energy storage power plant. This results in a simpler power plant structure and control strategy, which gives the energy storage system a faster response and avoids stability problems [9], [10].

However, a major challenge brought by a large single-unit capacity is the reliability of system operation. Long-term, stable and safe grid-tied operation of such a highly energy-concentrated BESS relies on the reliability of two elements: power switches and battery cells. Therefore, fault tolerance capability and post-fault operation are critical issues [11]. On the one hand, a higher grid-tied voltage level increases the number of cascaded modules, and more power switches are used. Thus, each H-bridge module may become a potential point of failure due to short-circuiting or open-circuiting of the power devices, which increases the probability of system failure [12]. Many fault-tolerant approaches have been proposed for multilevel converters in different applications, mainly focusing on faults induced by power switches [13]–[15]. They can be divided into two categories: hardware redundancy methods and software control techniques.

Manuscript received November 18, 2024; revised February 02, 2025; accepted March 06, 2025. Date of publication June 30, 2025; date of current version April 18, 2025. This work was supported in part by National Key R&D Program of China under the Grant 2023YFB4204400. *(Corresponding author: Rui Li.)* 

X. Wu, Y. Liu, R. Li, and X. Cai are with the Key Laboratory of Control of Power Transmission and Conversion, School of Electronics, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: wuxiqi@sjtu.edu.cn; yilinliu@sjtu.edu.cn; liruiqd@sjtu.edu.cn; xucai@sjtu.edu.cn).

S. Gao and X. Jiang are with the Guangzhou Zhiguang Energy Storage Technology Co., Ltd., Guangzhou 510760, China (e-mail: gaoshuai@gzzg.com.cn; jiangxinyu@gzzg.com.cn).

Digital Object Identifier 10.24295/CPSSTPEA.2025.00011

Fig. 1. Circuit diagram of TGT-BESS.

The hardware method is implemented by adding redundant submodules (SMs). The redundant modules work in hot standby or cold standby states depending on whether they are active in normal conditions. In [16], a redundant SM is employed in each of the three phases and operates in hot standby mode. In case of a fault, the faulty module and one non-faulty module in each of the other two phases are bypassed simultaneously. However, adding redundant modules in each phase increases the hardware cost and size of the system, especially in battery applications. In [17], [18], only one cold standby redundant module is employed for the three phase legs. In normal operation, the redundant module is in the bypass state; when a fault occurs, the redundant module is inserted into the faulty phase through a switch network to ensure that the operating state of the system is the same as before the fault. Although the system cost is effectively lowered since only one redundant module is needed, this method has strong limitations and is only applicable when a single power module (PM) of the system fails.

Software methods adjust the control strategy to allow the system to continue operating after the faulty module is bypassed [19], [20]. A symmetrical bypass method is used in [21], and the capacitor voltage of the non-faulty modules is then raised to maintain a constant sum of dc voltages, allowing the system to still deliver the original reactive capacity. However, this method increases the voltage rating required of the filter capacitors and power switches, and it is not suitable for BESS because the dc voltage is clamped by the battery voltage. In [22], [23], an unsymmetrical bypass method is used for BESS. After bypassing the faulty module, a zero-sequence voltage is injected to achieve the maximum magnitude of the three-phase balanced line voltage [24]. While avoiding over-modulation, the method also maintains battery SOC equalization in energy storage systems. A fault-tolerant control method is proposed in [25] for open-circuit and short-circuit faults of the switches. Instead of bypassing the whole module directly after the fault, the faulty module is switched from full-bridge mode to fault-tolerant half-bridge mode by changing its modulation strategy [26]. In this situation, the output ac voltage of the SM is only half of the rated voltage, and the faulty module retains power output and voltage support capability. Since this fault-tolerant mode introduces a dc voltage bias in the faulty phase, a dc component of equal amplitude must also be injected in the non-faulty phases so as not to affect the grid-connected current waveform [27].

Compared with power devices, the massive number of battery cells, as the energy carrier, are more prone to warnings or failures such as internal short-circuiting of cells, excessive voltage or temperature differences among cells within a cluster, and overcharging, over-discharging or overheating of cells [28]. Especially at the end of system charging and discharging, one battery cell often reaches the cut-off voltage first, resulting in shutdown and reduced BESS capacity utilization. In response to this kind of situation, state-of-the-art studies adopt the conventional fault-tolerant method of directly bypassing the faulty SMs, in the same way that power switch failures are handled in [16]–[24].

On this basis, this article presents a new battery cluster fault tolerance strategy for TGT-BESS that takes advantage of the healthy H-bridge part of the faulty SMs, which is still able to generate ac voltage to support the grid voltage. The main technical contributions of this article are summarized as follows:

1) Instead of directly bypassing the faulty module through the ac bypass switch after a cluster fault occurs, the faulty cluster is disconnected through the dc circuit breaker. The faulty module is switched from charge or discharge mode to reactive fault-tolerant mode and continues to operate, giving full play to the grid voltage support capability of the faulty module.

2) The proposed strategy is able to tolerate any number of faulty battery cluster modules, while the other non-faulty modules can still continue to absorb or release energy. This is important for improving the system capacity utilization at the end of charging and discharging.

3) A battery cluster fault-tolerant control strategy incorporating third harmonic voltage injection (THVI) is proposed to avoid the risk of system downtime due to the insufficient modulation ratio margin caused by the traditional direct ac bypass of SMs.

The rest of this paper is organized as follows. The conventional SM ac-side direct bypassing method and a new battery cluster fault tolerance strategy are discussed in Section II. A battery cluster fault tolerance operating control for TGT-BESS, combining the fault-tolerant strategy, zero-sequence voltage injection, and the balancing control of the capacitor voltage of faulty SMs and the SOC of healthy SMs, is theoretically analyzed in Section III. The effectiveness of the proposed control scheme is verified by offline simulation and hardware-in-loop (HIL) experiment in Section IV. The conclusions are given in Section V.



Fig. 2. Fault tolerance methods. (a) Directly bypassing whole SM at ac side. (b) Exiting battery cluster at dc side.



Fig. 3. Ac-side bypassing process.

# II. CONVENTIONAL SM BYPASS METHOD AND A NEW BATTERY FAULT-TOLERANT STRATEGY

#### A. System Configuration

Fig. 1 shows the circuit schematic of the TGT-BESS. It is composed of *n* SMs per phase, and each SM comprises a PM and a battery module (BM). The PM is an H-bridge converter with a filter capacitor *C*, which forms the main circuit of a static synchronous compensator. The BM contains a battery cluster, its BMS, a dc soft-start circuit, a dc circuit breaker $S_{dc}$ and a filter inductor $L_b$. The H-bridge cells are connected in series on their ac side, and the output terminal is connected directly to the medium-voltage grid through a filter inductor $L_f$ and an ac soft-start circuit. A set of two thyristor switches $S_{ac}$ is used to bypass the whole H-bridge SM.

In Fig. 1, $v_{kj}$ and $i_{dckj}$ represent the ac-side voltage and dc-side current of the *j*th H-bridge cell in phase *k*, where k = a, b, c denotes the phase and j = 1, 2, ..., n denotes the H-bridge cell number. $v_{Ckj}$, $v_{bkj}$ and $i_{bkj}$ represent the capacitor voltage, battery voltage and battery current of the *j*th H-bridge cell in phase *k*. $v_k$ and $i_k$ represent the CHB converter output voltage and phase current of phase *k*.

## B. SM AC-Side Directly-Bypassing Method

As shown in Fig. 2(a), the conventional fault-tolerant method directly bypasses the whole faulty SM using ac-side switches. Fig. 3 shows two states of this process. After receiving fault diagnosis information from the BMS, the healthy H-bridge first outputs a zero level, and then the bypass switches turn on to avoid a short circuit. Simultaneously, in the other two phases, the two healthy SMs at the same position are also bypassed to keep the three-phase arms physically symmetrical. Furthermore, the modulation voltage of the remaining SMs should be enlarged to match the grid voltage, and the carriers also need to be rearranged. Usually, one or more redundant SMs are employed and operate in hot standby to avoid overmodulation after the faulty SMs exit.

Fig. 4. Charging and discharging curves of battery output voltage to SOC.

For lithium battery applications, the output port has a wide voltage range over the whole SOC range, as shown in Fig. 4. Therefore, the battery configuration should take the minimum port voltage into account, and it is the terminal voltage during discharging that decides whether the battery configuration meets the grid-tied requirement. Fig. 4 shows that the 1C charging curve is always much higher than the discharging curve, which means that charging operation has a lower risk of overmodulation. To avoid overcharging and over-discharging, the cell cut-off voltages are set at 2.90 V and 3.50 V, which correspond to 3% and 97% SOC at the 1C condition, respectively.

Taking a 35 kV/20 MW/20 MWh TGT-BESS as an example, 270 cells rated at 3.2 V are connected in series to form a battery cluster rated at 864 V. Each phase leg has 38 SMs in normal operation and contains only one redundant SM when the minimum cell voltage of 2.9 V is considered. After some faulty SMs are bypassed, the low SOC range is cut off, and the capacity utilization ratio is reduced gradually as the number of faults increases. Therefore, the conventional ac-side direct bypassing method has limited fault tolerance capability, and the potential of the healthy H-bridge power modules that may exist is also wasted.
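The sizing argument above can be checked with a few lines of arithmetic. The following sketch is illustrative only: it assumes a modulation index limit of 1.0 without harmonic injection and neglects the filter inductor voltage drop, neither of which is stated explicitly in the text, and all variable names are hypothetical.

```python
import math

# Values quoted in the text for the 35 kV example.
V_LINE_RMS = 35e3         # grid line-to-line RMS voltage, V
CELLS_PER_CLUSTER = 270   # series-connected cells per battery cluster
V_CELL_RATED = 3.2        # rated cell voltage, V
V_CELL_MIN = 2.9          # minimum cell voltage considered, V
SMS_INSTALLED = 38        # SMs per phase leg

# Peak phase voltage that one phase leg has to synthesize.
v_phase_peak = V_LINE_RMS * math.sqrt(2.0 / 3.0)

v_cluster_rated = CELLS_PER_CLUSTER * V_CELL_RATED
v_cluster_min = CELLS_PER_CLUSTER * V_CELL_MIN

# Assumption: each SM contributes at most its dc voltage (modulation index <= 1).
sms_required = math.ceil(v_phase_peak / v_cluster_min)
sms_redundant = SMS_INSTALLED - sms_required

print(f"rated cluster voltage : {v_cluster_rated:.0f} V")
print(f"minimum cluster voltage: {v_cluster_min:.0f} V")
print(f"peak phase voltage    : {v_phase_peak:.0f} V")
print(f"SMs required / redundant: {sms_required} / {sms_redundant}")
```

Under these assumptions the result agrees with the single redundant SM mentioned in the text, which illustrates how little margin is left once the lowest cell voltage is taken into account.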

#### C. A New Battery Cluster Fault Tolerance Strategy

Based on the aforementioned discussion, lithium batteries have a significantly greater probability of warnings or faults than semiconductor switches, especially at the end of charge and discharge. As a result, the H-bridge part that remains after the battery cluster is removed can still be put to use and has potential to be exploited further.

Thereby, a new battery cluster fault tolerance strategy is proposed, as shown in Fig. 2(b). After receiving fault diagnosis or battery warning information from the BMS, the battery clusters are disconnected from the H-bridge, and the PM part continues to operate in the main circuit. Assuming that the voltage drop across the grid-side filter inductor is ignored, the fundamental-frequency component of the converter output voltage is equivalent to the grid voltage $V_s$, which also equals the converter modulation signal. Fig. 5 shows the phasor diagram of the modulation voltages of healthy SMs and fault-tolerant PMs. The converter modulation voltage is decomposed into two vectors: the modulation voltage of healthy SMs $V_{mp}$, which is in phase with the grid-side current $I_s$, and the modulation voltage of PMs $V_{mq}$, which is perpendicular to $I_s$. Their relationship can be expressed as follows:

$$V_{\rm mp} = \sqrt{V_{\rm S}^2 - V_{\rm mq}^2} \tag{1}$$

Fig. 5. Phase diagram of modulation voltage.

Fig. 6. Key waveforms during battery cluster exiting process.

In this situation, the power factor of the system deviates from unity: healthy SMs continue to be charged or discharged, while faulty SMs operate in reactive power mode because no battery active power is available.

A benefit is that the non-faulty SMs need not output a voltage whose amplitude is nearly identical to the grid voltage, as they must with conventional direct bypassing of SMs. Owing to the voltage support provided by the PMs of faulty SMs, the voltage amplitude output by the remaining healthy SMs is reduced according to (1), and their overmodulation after a fault is avoided.
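To give a feel for the relief described by (1), the short sketch below evaluates the in-phase amplitude required from the healthy SMs for a few values of the orthogonal support voltage. The function name and the per-unit values are illustrative assumptions, not quantities taken from the paper.

```python
import math

def healthy_sm_amplitude(v_s: float, v_mq: float) -> float:
    """Eq. (1): amplitude V_mp that healthy SMs must supply when the
    faulty SMs contribute an orthogonal component V_mq."""
    if not 0.0 <= v_mq <= v_s:
        raise ValueError("V_mq must lie between 0 and V_s")
    return math.sqrt(v_s ** 2 - v_mq ** 2)

V_S = 1.0  # grid phase voltage amplitude in per unit (illustrative)
for v_mq in (0.0, 0.3, 0.6):
    v_mp = healthy_sm_amplitude(V_S, v_mq)
    print(f"V_mq = {v_mq:.1f} pu -> V_mp = {v_mp:.3f} pu")
```

For example, an orthogonal contribution of 0.6 pu lowers the amplitude demanded from the healthy SMs to 0.8 pu, at the cost of a non-unity power factor.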

Fig. 6 shows the key waveforms of an SM during the battery cluster exiting process to illustrate the principle of the proposed fault tolerance strategy:

State 1 $[t_0-t_1]$: The SM operates in the discharging or charging state, and its modulation voltage is in phase or in antiphase with the phase current. The battery current and capacitor voltage both contain a dc bias value together with a fluctuating component caused by the pulsating power of the single-phase topology.

State 2 $[t_1-t_2]$: On receiving fault diagnosis or battery warning information at $t_1$, the modulation voltage vector is regulated to be orthogonal to the phase current vector. Thus, the battery cluster stops charging or discharging and its dc bias current is removed. In this state, the zero-crossing point of the battery current is continuously monitored.

State 3 $[t_2-t_3]$: At a zero-crossing point $t_2$, a turn-off signal is given to the dc circuit breaker and the faulty cluster is disconnected from its PM. The H-bridge part then continues to operate in reactive power mode to support the grid voltage.
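The three states can be pictured as a small state machine. The sketch below is only one possible reading of the sequence: the class and member names are hypothetical, the battery current is assumed to be available as discrete samples, and a simple sign-change test stands in for whatever zero-crossing detection the actual controller uses.

```python
from enum import Enum, auto

class ExitState(Enum):
    ACTIVE = auto()        # State 1: charging or discharging
    ZERO_POWER = auto()    # State 2: modulation voltage orthogonal to current
    DISCONNECTED = auto()  # State 3: dc breaker open, reactive-only operation

class ClusterExitSequencer:
    """Hypothetical sequencer for the dc-side exit of one battery cluster."""

    def __init__(self) -> None:
        self.state = ExitState.ACTIVE
        self.breaker_closed = True
        self._prev_current = None

    def step(self, battery_current: float, fault_flag: bool) -> ExitState:
        if self.state is ExitState.ACTIVE:
            if fault_flag:
                # t1: BMS reports a warning or fault; remove the dc bias current.
                self.state = ExitState.ZERO_POWER
        elif self.state is ExitState.ZERO_POWER:
            prev = self._prev_current
            # Sign change (or exact zero) between consecutive samples.
            if prev is not None and prev * battery_current <= 0.0:
                # t2: open the dc breaker at the zero crossing.
                self.breaker_closed = False
                self.state = ExitState.DISCONNECTED
        self._prev_current = battery_current
        return self.state

seq = ClusterExitSequencer()
samples = [(10.0, False), (5.0, True), (2.0, True), (-1.0, True)]
for current, fault in samples:
    print(current, seq.step(current, fault).name, seq.breaker_closed)
```

Opening the breaker only at a current zero crossing mirrors the description of State 3; in this sketch the breaker command is a boolean flag and no breaker dynamics are modeled.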

# III. PROPOSED BATTERY CLUSTER FAULT TOLERANCE CONTROL FOR CHB-BASED BESS

From the aforementioned analysis, when whole faulty SMs are directly bypassed at the ac side, the modulation margin of the remaining SMs is reduced gradually, which may result in overmodulation. The strategy of exiting the battery cluster at the dc side makes full use of the potential of the non-faulty H-bridge power part to support the grid voltage. However, more complex control is required to achieve SOC balancing of healthy SMs and capacitor voltage balancing of faulty SMs. Therefore, this section presents a battery cluster fault tolerance operating control for TGT-BESS that combines the proposed fault-tolerant strategy with zero-sequence voltage injection to minimize the reactive power of the system.

## A. Battery Cluster Fault Tolerance Control Based on THVI and Proposed Battery Cluster DC-Side Exiting Strategy

As shown in Fig. 5, when faulty SMs provide voltage support for the converter through their H-bridge part, the system must output a certain amount of reactive power. As the amplitude that must be output by the faulty SMs increases, the ratio of reactive to active power grows. This leads to an incomplete four-quadrant operating range. To raise the power factor of the system as much as possible after a battery warning or fault, the proposed battery cluster fault tolerance strategy is improved by optimal THVI, as analyzed in this part.

To simplify the implementation of fault tolerance, a symmetrical exiting strategy is used, in which the three battery clusters at the same position in the different phases are disconnected from their respective PMs when any one cluster has a fault. Hence, the charging or discharging power should be regulated as follows:

$$P_{\rm sys} = \, {\rm sgn}(P_{\rm set}) \cdot {\rm min}\left( \left| P_{\rm set} \right|, \, \frac{n - n_{\rm fal}}{n} P_{\rm nom} \right) \tag{2}$$

where $P_{\text{set}}$ and $P_{\text{nom}}$ denote the target power setpoint and the nominal active power when all batteries work well, and $n_{\text{fal}}$ represents the number of faulty SMs in each phase.
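As a minimal sketch of the power limit in (2), the function below clamps the magnitude of the setpoint while preserving its sign. The function name and the example numbers are illustrative assumptions.

```python
import math

def limited_active_power(p_set: float, p_nom: float, n: int, n_fal: int) -> float:
    """Eq. (2): limit |P_set| to the share of P_nom that the remaining
    (n - n_fal) healthy SMs per phase can carry, keeping the sign of P_set."""
    if not 0 <= n_fal <= n:
        raise ValueError("n_fal must lie between 0 and n")
    p_max = (n - n_fal) / n * p_nom
    return math.copysign(min(abs(p_set), p_max), p_set)

# Illustrative values: 14 SMs per phase, 5 MW nominal power, 3 faulty SMs.
print(limited_active_power(5e6, 5e6, 14, 3))    # discharging request is clamped
print(limited_active_power(-5e6, 5e6, 14, 3))   # charging request is clamped
print(limited_active_power(1e6, 5e6, 14, 3))    # small request passes unchanged
```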

To simplify the analysis, the following assumptions are established.

1) The voltage drop across the grid-side filter inductor is neglected, and the amplitude of the modulation voltage of each phase $V_{\rm m}$ is equal to that of the grid phase voltage $V_{\rm s}$.

2) The inner-phase SOC of non-faulty SMs is well balanced, and their total dc voltage $V_{\text{bat}}$ is equal to $(n-n_{\text{fal}})v_{\text{bat}}$, where $v_{\text{bat}}$ is the battery voltage of each healthy SM.

3) The inner-phase capacitor voltage of faulty SMs is well balanced, and their total dc voltage $V_C$ is equal to $n_{fal}v_C$, where $v_C$ is the capacitor voltage of each faulty PM.

Defining the amplitude of the injected third harmonic voltage (THV) as $V_{\text{THV}}$, when the non-faulty SMs have enough margin to generate the modulation voltage, the grid voltage is supported only by the non-faulty SMs:

$$\begin{cases} V_{\rm mp} = V_{\rm S} \\ V_{\rm mq} = 0 \\ V_{\rm THV} = 0 \end{cases}, \quad V_{\rm bat} = (n - n_{\rm fal}) v_{\rm bat} \ge V_{\rm S} \tag{3}$$

Fig. 7. Schematic diagram of modulation voltage of faulty and non-faulty SMs. (a) $\frac{\sqrt{3}}{2} V_s \leq V_{bat} < V_s$. (b) $V_{bat} < \frac{\sqrt{3}}{2} V_s$.

If $V_{\text{bat}} < V_{\text{s}}$, THV is injected to enhance the dc voltage utilization ratio of healthy SMs, as shown in Fig. 7(a). Assuming that the non-faulty SMs can output a voltage whose first harmonic voltage (FHV) amplitude equals $V_{\text{s}}$, the modulation voltage of healthy SMs can be expressed as

$$v_{\rm mp} = V_{\rm s} \sin(\omega t) + V_{\rm THV} \sin 3\omega t \tag{4}$$

Setting the derivative of (4) with respect to $\omega t$ to zero gives

$$\left. \frac{\mathrm{d}v_{\mathrm{mp}}}{\mathrm{d}\omega t} \right|_{\omega t=\theta_0} = V_{\mathrm{S}} \cos\theta_0 + 3V_{\mathrm{THV}} \cos 3\theta_0 = 0 \tag{5}$$

 $\theta_{\rm 0}$  represents the angle at the peak value of  $v_{\rm mp}$  , which results in

$$\sin\theta_0 = \sqrt{\frac{V_{\rm s} + 3V_{\rm THV}}{12V_{\rm THV}}} \tag{6}$$

Substituting (6) into (4), one extreme value of the function $v_{mp}$ is obtained as

$$v_{\rm mp\_ext1} = \frac{V_{\rm S} + 3V_{\rm THV}}{3} \sqrt{\frac{V_{\rm S} + 3V_{\rm THV}}{3V_{\rm THV}}} \tag{7}$$
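The intermediate steps from (5) to (7) can be made explicit with the triple-angle identities. Using $\cos 3\theta = 4\cos^3\theta - 3\cos\theta$, (5) factors as

$$\cos\theta_0 \left[ V_{\rm S} + 3V_{\rm THV}\left(4\cos^2\theta_0 - 3\right) \right] = 0$$

The root $\cos\theta_0 = 0$ corresponds to the extreme point at $\pi/2$. The other root gives

$$\cos^2\theta_0 = \frac{9V_{\rm THV} - V_{\rm S}}{12V_{\rm THV}}, \qquad \sin^2\theta_0 = 1 - \cos^2\theta_0 = \frac{V_{\rm S} + 3V_{\rm THV}}{12V_{\rm THV}}$$

which is (6) and exists only for $V_{\rm THV} \ge V_{\rm S}/9$. With $\sin 3\theta = 3\sin\theta - 4\sin^3\theta$, (4) evaluated at $\theta_0$ becomes

$$v_{\rm mp}(\theta_0) = \sin\theta_0 \left[ V_{\rm S} + 3V_{\rm THV} - 4V_{\rm THV}\sin^2\theta_0 \right] = \frac{2}{3}\left(V_{\rm S} + 3V_{\rm THV}\right)\sin\theta_0$$

and inserting $\sin\theta_0 = \frac{1}{2}\sqrt{\frac{V_{\rm S} + 3V_{\rm THV}}{3V_{\rm THV}}}$ reproduces (7).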

The other extreme value of the function $v_{mp}$, at the extreme point $\omega t = \pi/2$, is easily obtained as

$$v_{\rm mp\_ext2} = V_{\rm S} - V_{\rm THV} \tag{8}$$

On this basis, the amplitude of the modulation voltage of non-faulty SMs can be derived as

$$v_{\rm mp\_max} = \begin{cases} v_{\rm mp\_ext2}, & V_{\rm THV} \leq \frac{V_{\rm S}}{9} \\ v_{\rm mp\_ext1}, & V_{\rm THV} > \frac{V_{\rm S}}{9} \end{cases} \tag{9}$$

Taking the derivative of (7) with respect to $V_{\rm THV}$ and setting it to zero gives

$$\frac{\mathrm{d}v_{\mathrm{mp\_max}}}{\mathrm{d}V_{\mathrm{THV}}} = \frac{6V_{\mathrm{THV}} - V_{\mathrm{S}}}{6V_{\mathrm{THV}}} \sqrt{\frac{V_{\mathrm{S}} + 3V_{\mathrm{THV}}}{3V_{\mathrm{THV}}}} = 0 \tag{10}$$

From (10), when $V_{\rm THV} = V_{\rm s}/6$, $v_{\rm mp\_max}$ reaches its minimum value

$$V_{\rm mp\_min} = \frac{\sqrt{3}}{2} V_{\rm S} \tag{11}$$
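The closed-form peak in (7)-(9) and the optimum in (10)-(11) can be cross-checked numerically. The sketch below works in per unit with $V_{\rm s} = 1$; the function names, the scan grid and the sample count are arbitrary choices made for illustration.

```python
import math

def peak_closed_form(v_s: float, v_thv: float) -> float:
    """Eq. (9): peak of v_s*sin(x) + v_thv*sin(3x)."""
    if v_thv <= v_s / 9.0:
        return v_s - v_thv                           # eq. (8), peak at pi/2
    u = v_s + 3.0 * v_thv
    return u / 3.0 * math.sqrt(u / (3.0 * v_thv))    # eq. (7)

def peak_brute_force(v_s: float, v_thv: float, samples: int = 20000) -> float:
    """Sampled maximum over [0, pi]; the waveform is half-wave symmetric,
    so the positive peak lies in this interval."""
    return max(
        v_s * math.sin(math.pi * k / samples)
        + v_thv * math.sin(3.0 * math.pi * k / samples)
        for k in range(samples + 1)
    )

V_S = 1.0

# Spot-check the closed form against direct sampling.
for v_thv in (0.05, 1.0 / 6.0, 0.25):
    print(v_thv, peak_closed_form(V_S, v_thv), peak_brute_force(V_S, v_thv))

# Scan the THV amplitude and locate the smallest peak.
candidates = [k / 1000.0 for k in range(1, 301)]
best_v_thv = min(candidates, key=lambda v: peak_closed_form(V_S, v))
best_peak = peak_closed_form(V_S, best_v_thv)
print("optimum V_THV ~", best_v_thv, " minimum peak ~", best_peak)
```

The scan is expected to land near $V_{\rm s}/6$ with a peak close to $\sqrt{3}/2$, in line with (11).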

When $V_{\rm mp\_min} \leq V_{\rm bat}$, THVI enables the non-faulty SMs to generate the required modulation voltage. As shown in Fig. 7(a), the amplitude of the modulation voltage of non-faulty SMs after injecting THV is equal to $V_{\rm bat}$. The amplitude of the injected THV can be calculated as

$$V_{\rm THV} = \begin{cases} V_{\rm S} - V_{\rm bat}, & V_{\rm bat} \ge \frac{8}{9} V_{\rm S} \\ f(V_{\rm S}, V_{\rm bat}), & V_{\rm bat} < \frac{8}{9} V_{\rm S} \end{cases} \tag{12}$$

In (12), the function $f(V_s, V_{bat})$ is obtained by setting (7) equal to $V_{bat}$ and is derived as

$$f(V_{\rm S}, V_{\rm bat}) = \left(\sqrt{\frac{V_{\rm S}^2 V_{\rm bat}^4}{36} - \frac{V_{\rm bat}^6}{27}} - \frac{V_{\rm S} V_{\rm bat}^2}{6}\right)^{\frac{1}{3}} - \frac{V_{\rm S}}{3} + \frac{V_{\rm bat}^2}{3\left(\sqrt{\frac{V_{\rm S}^2 V_{\rm bat}^4}{36} - \frac{V_{\rm bat}^6}{27}} - \frac{V_{\rm S} V_{\rm bat}^2}{6}\right)^{\frac{1}{3}}} \tag{13}$$

In this situation, the faulty SMs are still not required to output voltage to support the grid voltage, and the following equation can be obtained:

$$\begin{cases} V_{\rm mp} = V_{\rm S} \\ V_{\rm mq} = 0 \end{cases}, \quad \frac{\sqrt{3}}{2} V_{\rm S} \leq V_{\rm bat} < V_{\rm S} \tag{14}$$

In (14), $V_{\text{THV}}$ is given by (12).
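As a hedged illustration of (12), the sketch below computes the THV amplitude numerically instead of evaluating (13) directly. The radicand in (13) is negative whenever $V_{\rm bat} > \frac{\sqrt{3}}{2} V_{\rm s}$, so a direct evaluation would need complex cube roots; here the condition "(7) equals $V_{\rm bat}$" is solved by bisection. Restricting the search to $[V_{\rm s}/9, V_{\rm s}/6]$ is an assumption of this sketch, chosen so that the result is continuous with the first branch of (12). Function names are hypothetical.

```python
import math

def peak_with_thv(v_s: float, v_thv: float) -> float:
    """Eq. (7): waveform peak, valid for v_thv >= v_s / 9."""
    u = v_s + 3.0 * v_thv
    return u / 3.0 * math.sqrt(u / (3.0 * v_thv))

def thv_amplitude(v_s: float, v_bat: float, tol: float = 1e-9) -> float:
    """THV amplitude that makes the peak of eq. (4) equal to v_bat."""
    if v_bat >= v_s:
        return 0.0                       # eq. (3): no injection needed
    if v_bat >= 8.0 / 9.0 * v_s:
        return v_s - v_bat               # first branch of eq. (12)
    if v_bat < math.sqrt(3.0) / 2.0 * v_s:
        raise ValueError("below sqrt(3)/2 * V_s the faulty SMs must assist")
    # On [v_s/9, v_s/6] the peak falls monotonically from 8/9 to sqrt(3)/2 of v_s.
    lo, hi = v_s / 9.0, v_s / 6.0
    while hi - lo > tol * v_s:
        mid = 0.5 * (lo + hi)
        if peak_with_thv(v_s, mid) > v_bat:
            lo = mid                     # peak still too high: inject more THV
        else:
            hi = mid
    return 0.5 * (lo + hi)

for v_bat in (1.05, 0.95, 0.88, 0.87):
    print(v_bat, thv_amplitude(1.0, v_bat))
```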

When the sum dc voltage of the non-faulty SMs is not high enough to match the grid voltage even though the optimal THV has been injected, that is, $V_{\rm mp\_min} > V_{\rm bat}$, the faulty SMs are required to output some voltage.

As seen from (11) and according to duality theory, it is not difficult to obtain the maximum amplitude of FHV that can be generated by the non-faulty SMs and the corresponding injected THV as

$$V_{\rm mp} = \frac{2}{\sqrt{3}} V_{\rm bat}, \quad V_{\rm THV} = \frac{\sqrt{3}}{9} V_{\rm bat} \tag{15}$$

TABLE I Simulation Model Parameters

| Parameters                   | Symbol                    | Values |
|------------------------------|---------------------------|--------|
| Cascaded number per phase    | $n$                       | 14     |
| System active power/capacity | $P_{\rm nom}/Q_{\rm nom}$ | 5 MW   |
| Rated voltage of BM          | $V_{\rm bat}$             | 720 V  |
| Battery internal resistor    | $r_{\rm bat}$             | 0.1 Ω  |
| Grid line-to-line voltage    | $V_{\rm grid}$            | 10 kV  |
| DC filtering capacitor       | $C_{\rm dc}$              | 10 mF  |
| DC filtering inductor        | $L_{\rm dc}$              | 2 mH   |
| Grid-side filtering inductor | $L_{\rm f}$               | 6 mH   |
| SM switching frequency       | $f_{\rm sw}$              | 1 kHz  |

Fig. 8. Curves of THV by healthy SMs and supporting voltage by faulty SMs at different number of faulty SMs during whole discharging SOC range.

Accordingly, the amplitude of the output voltage that should be generated by the faulty SMs to support the grid voltage can be deduced as

$$V_{\rm mq} = \sqrt{V_{\rm S}^2 - V_{\rm mp}^2} = \sqrt{V_{\rm S}^2 - \frac{4}{3}V_{\rm bat}^2} \tag{16}$$

In this case, as depicted in Fig. 7(b), the following equation can be obtained:

$$\begin{cases} V_{\rm mp} = \frac{2}{\sqrt{3}} V_{\rm bat} \\ V_{\rm mq} = \sqrt{V_{\rm s}^2 - \frac{4}{3} V_{\rm bat}^2} \\ V_{\rm THV} = \frac{\sqrt{3}}{9} V_{\rm bat} \end{cases}, \quad V_{\rm bat} < \frac{\sqrt{3}}{2} V_{\rm s} \tag{17}$$

Based on the aforementioned discussion of the three cases, the target reactive power of the system should be set as

$$Q_{\rm sys} = \begin{cases} 0, & V_{\rm bat} \ge \frac{\sqrt{3}}{2} V_{\rm S} \\ \frac{\sqrt{3V_{\rm S}^2 - 4V_{\rm bat}^2}}{2V_{\rm bat}} P_{\rm sys}, & V_{\rm bat} < \frac{\sqrt{3}}{2} V_{\rm S} \end{cases} \tag{18}$$

On this basis, using the system parameters given in Table I, Fig. 8 shows the THV injected by healthy SMs and the supporting voltage from faulty SMs for different numbers of faulty SMs over the whole discharging SOC range. A higher voltage output by the faulty SMs is needed as the number of faults increases and the battery SOC decreases. By contrast, THV offers only limited redundancy capability, and the faulty SMs bear most of the voltage support function as more and more SMs become faulty.
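The case distinction in (3), (14), (17) and (18) can be condensed into a small helper. The sketch below uses the 10 kV grid voltage and 14 SMs per phase from Table I together with the 670 V minimum cluster voltage quoted in Section IV; the fault counts of 3, 5 and 7 per phase are illustrative assumptions, as are all function and variable names.

```python
import math

def operating_point(v_s: float, v_bat_sum: float):
    """Return (V_mp, V_mq, Q/P) for a given summed battery voltage per phase."""
    if v_bat_sum >= math.sqrt(3.0) / 2.0 * v_s:
        # Eqs. (3) and (14): healthy SMs alone support the grid voltage.
        return v_s, 0.0, 0.0
    # Eq. (17): healthy SMs are saturated, faulty SMs supply the remainder.
    v_mp = 2.0 / math.sqrt(3.0) * v_bat_sum
    v_mq = math.sqrt(v_s ** 2 - v_mp ** 2)
    # Eq. (18): required ratio of reactive to active power.
    q_over_p = math.sqrt(3.0 * v_s ** 2 - 4.0 * v_bat_sum ** 2) / (2.0 * v_bat_sum)
    return v_mp, v_mq, q_over_p

V_GRID_LL = 10e3                           # line-to-line RMS voltage, Table I
N_SM = 14                                  # SMs per phase, Table I
V_CLUSTER = 670.0                          # minimum cluster voltage (Section IV)
V_S = V_GRID_LL * math.sqrt(2.0 / 3.0)     # phase voltage amplitude

results = {}
for n_fal in (3, 5, 7):                    # illustrative fault counts
    v_mp, v_mq, q_over_p = operating_point(V_S, (N_SM - n_fal) * V_CLUSTER)
    results[n_fal] = (v_mp, v_mq, q_over_p)
    print(f"n_fal={n_fal}: V_mq/V_s={v_mq / V_S:.3f}, Q/P={q_over_p:.3f}")
```

With these numbers no reactive power is needed for three faulty SMs per phase, while the support voltage from the faulty SMs rises to roughly half and three quarters of $V_{\rm s}$ for five and seven faulty SMs.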

#### B. Capacitor Voltage Average Value and Balancing Control of Faulty PMs

Once the battery cluster is removed from an SM, its capacitor voltage is no longer clamped by the battery. Therefore, the sum of the capacitor voltages in a phase should be actively controlled to track a reference value, and the capacitor voltages of all PMs in a phase should be balanced in real time.

As shown in Fig. 5, to ensure that the modulation voltage of the faulty SMs is always orthogonal to the grid current, its dq-axis components should be set as

$$\begin{cases} V_{Qd} = V_{mq} \cos\left(\theta - \frac{\pi}{2}\right) = V_{mq} \sin\theta \\ V_{Qq} = V_{mq} \sin\left(\theta - \frac{\pi}{2}\right) = -V_{mq} \cos\theta \end{cases} \tag{19}$$

Then, the three-phase modulation voltages $v_{Ck}$ of the faulty SMs are obtained by the dq/abc transformation, as depicted in Fig. 9(a). To stabilize the sum of the capacitor voltages in each phase, the average capacitor voltage per phase $v_{Ck\_{\rm avg}}$ should be equal to the reference value $V_{C\_{\rm ref}}$, and a PI controller is used to superpose a voltage $\Delta v_{Ck}$ that is in phase or in antiphase with the phase current $i_k$. The superposed voltage is given by

$$\Delta v_{Ck} = \left(k_{\rm p,1} + \frac{k_{\rm i,1}}{s}\right) (V_{C\_{\rm ref}} - v_{Ck\_{\rm avg}}) i_k \tag{20}$$

where $k_{p,1}$ and $k_{i,1}$ are the proportional and integral coefficients of the capacitor voltage PI controller, respectively.

Furthermore, the modulation voltage of each faulty SM can be obtained as

$$v_{Cki} = \frac{v_{Ck} + \Delta v_{Ck}}{n_{\text{fal}}} \tag{21}$$

To balance the capacitor voltages of the faulty SMs within a phase, a PI controller is used to superpose a voltage $\Delta v_{Cki}$ that is in phase or in antiphase with the phase current $i_k$. The superposed voltage is given by

$$\Delta v_{Cki} = \left(k_{p,2} + \frac{k_{i,2}}{s}\right) (v_{Ck\_{\rm avg}} - v_{Cki}) i_k \tag{22}$$

where $k_{p,2}$ and $k_{i,2}$ are the proportional and integral coefficients of the capacitor voltage balancing PI controller, respectively.

Fig. 9. Control schematic diagram. (a) Capacitor voltage control of faulty SMs. (b) SOC balancing control of non-faulty SMs.
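A discrete-time reading of the two capacitor voltage loops in (20) and (22) is sketched below. The forward-Euler integrator, the gains, the sample time and all names are assumptions made for illustration; the paper does not specify the discretization or tuning.

```python
from dataclasses import dataclass

@dataclass
class PI:
    """Discrete PI regulator (forward-Euler integration, no anti-windup)."""
    kp: float
    ki: float
    ts: float                 # sample time in seconds
    integral: float = 0.0

    def update(self, error: float) -> float:
        self.integral += self.ki * error * self.ts
        return self.kp * error + self.integral

def phase_average_correction(pi: PI, v_c_ref: float, v_c_avg: float, i_k: float) -> float:
    """Eq. (20): correction that follows the phase current, scaled by the
    PI output of the per-phase average capacitor voltage error."""
    return pi.update(v_c_ref - v_c_avg) * i_k

def sm_balancing_correction(pi: PI, v_c_avg: float, v_c_i: float, i_k: float) -> float:
    """Eq. (22): per-SM correction driven by the deviation of one
    capacitor voltage from the phase average."""
    return pi.update(v_c_avg - v_c_i) * i_k

# Hypothetical gains and a 100 us control period.
avg_loop = PI(kp=0.01, ki=1.0, ts=1e-4)
bal_loop = PI(kp=0.005, ki=0.5, ts=1e-4)

print(phase_average_correction(avg_loop, v_c_ref=900.0, v_c_avg=880.0, i_k=100.0))
print(sm_balancing_correction(bal_loop, v_c_avg=900.0, v_c_i=910.0, i_k=100.0))
```

Multiplying the regulator output by the instantaneous phase current makes the correction in phase or in antiphase with $i_k$ depending on the sign of the error. A practical implementation would also need output limits and anti-windup, which are omitted here.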

#### C. Inner-Phase SOC Balancing Control of Non-Faulty SMs

As shown in Fig. 9(b), decoupled power control on the dq axes is used to control the grid-side active and reactive power and to generate the total modulation voltage $v_k^*$ of the three-phase arms. After subtracting the modulation voltage of the faulty SMs, the modulation voltage $v_{Bki}$ of each non-faulty SM can be obtained as

$$v_{Bki} = \frac{v_{k}^{*} + v_{\rm THV} - v_{Ck}}{n - n_{\rm fal}} \tag{23}$$

As shown in Fig. 9(b), the phase of the THV should be identical to that of the FHV generated by the non-faulty SMs, and it can be expressed as

$$v_{\rm THV} = -V_{\rm THV} \cos\left[3\theta - \tan^{-1}\left(\frac{Q_{\rm sys}}{P_{\rm sys}}\right)\right] \tag{24}$$

Similarly, to balance the SOC of the non-faulty SMs within a phase, a PI controller is used to superpose a voltage $\Delta v_{Bki}$ that is in phase or in antiphase with the phase current $i_k$. The superposed voltage is given by

$$\Delta v_{Bki} = \left(k_{p,3} + \frac{k_{i,3}}{s}\right) (SOC_k - SOC_{ki}) i_k \tag{25}$$

where $k_{p,3}$ and $k_{i,3}$ are the proportional and integral coefficients of the SOC balancing PI controller, respectively.

## IV. VERIFICATION AND DISCUSSION

#### A. Software Offline Simulation

To verify the effectiveness of the proposed fault-tolerance algorithm based on the strategy of battery cluster exiting at dc side, simulation models of the three-phase 10 kV TGT-BESS are built using MATLAB/Simulink. Structural diagram of the simulation model is shown in Fig. 1 and Table I lists the main system parameters.

The steady-state and dynamic control performance of the TGT-BESS under the proposed fault-tolerance algorithm is validated through various studies. At the beginning of the simulation, the system outputs the nominal 5 MW active power at unity power factor. The voltage of each battery cluster is set at its minimum value of 670 V. To test the proposed approach, three battery clusters are exited at 0.1 s, and two more clusters are disconnected at 0.3 s and at 0.5 s, respectively. Fig. 10(a) and (b) show the three-phase active and reactive power and the grid-connected current waveforms of the system, respectively. It can be observed that during 0.1–0.3 s (Case 1) the reactive power is zero, which means that no voltage needs to be output by the three faulty SMs, consistent with the analysis in Fig. 8. During 0.3–0.5 s (Case 2) and 0.5–0.8 s (Case 3), a certain amount of reactive power is output to ensure that the remaining healthy SMs can continue to discharge.

Fig. 10(c) shows the average capacitor voltage of the faulty SMs in each phase, and Fig. 10(d) shows the modulation voltage waveforms of the faulty SMs. Although no voltage is needed to support the grid voltage during 0.1–0.3 s, the faulty SMs output a small voltage after the fault occurs to make the capacitor voltage follow its reference of 900 V. In Cases 2 and 3, the voltage output by the faulty SMs is increased to 0.5 $V_{\rm s}$ and 0.75 $V_{\rm s}$, respectively. Fig. 10(e) shows the injected THV in the different cases; its amplitude agrees with the analytical calculation.

Capacitor voltage waveforms of all faulty and healthy SMs in phase a are depicted in Fig. 10(f), (g) and (h). At 0.1 s, three faulty SMs disconnect their batteries and their capacitors are charged to 900 V. In this case, no fluctuating voltage is observed owing to the zero reactive power. At 0.3 s, the capacitor voltage of SMs #1–3 decreases and that of SMs #4–5 increases, which is attributed to the capacitor voltage balancing algorithm. Finally, their voltages stabilize at 900 V with a certain fluctuating component.

Fig. 10. Simulation results of the proposed battery cluster fault tolerance strategy. (a) System active and reactive power. (b) Grid side current. (c) Average value of capacitor voltage of faulty SMs. (d) Modulation voltage of faulty SMs. (e) Injected THV. (f) Capacitor voltage waveforms of #1–3 SMs in phase a. (g) Capacitor voltage waveforms of #4–5 SMs in phase a. (h) Capacitor voltage waveforms of #6–14 SMs in phase a.

THVI helps to decrease the modulation ratio, so that the healthy SMs can output as large a fundamental component as possible. This is reflected in the modulation voltage waveforms of the three-phase non-faulty SMs around 0.1 s, as shown in Fig. 11(a1)–(a3). Similar effects can be observed around 0.3 s and 0.5 s in Fig. 11(b1)–(b3) and (c1)–(c3). After the THV is superimposed on the modulation signals, the output voltage waveforms of the faulty and healthy SMs are shown in Fig. 11(a4)–(a5); they are staircase waves, and overmodulation is avoided. This verifies the effectiveness of zero-sequence voltage injection in minimizing the reactive power of the system.

To validate the proposed battery cluster fault-tolerant operating control for the TGT-BESS, which combines the proposed fault-tolerant strategy with optimal zero-sequence voltage injection, the modulation ratios of the healthy and faulty SMs under the proposed method are shown in Fig. 12 for different numbers of faulty SMs. For comparison, the modulation ratio of the conventional bypassing method is also obtained by simulation. It can be seen that the conventional method reaches its critical margin when two clusters fail. In contrast, with the proposed algorithm, overmodulation of both the healthy and the faulty SMs is avoided for any number of faulty SMs.
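The loss of margin under conventional bypassing can be illustrated with a rough calculation. The sketch below is not the paper's simulation model: it assumes that "10 kV" is the line-to-line rms grid voltage, that no zero-sequence voltage is injected, and that the filter inductor drop is negligible. The function name `bypass_modulation_ratio` is hypothetical.

```python
import math

# Values quoted in the text: 10 kV grid, 14 SMs per phase, 670 V minimum cluster voltage.
V_GRID_LL_RMS = 10_000.0  # V, assumed to be the line-to-line rms value
N_SM = 14                 # SMs per phase
V_CLUSTER_MIN = 670.0     # V, minimum battery cluster voltage


def bypass_modulation_ratio(n_bypassed: int) -> float:
    """Phase-voltage peak divided by the total dc voltage of the remaining SMs.

    Assumes no zero-sequence injection and neglects the filter inductor drop.
    """
    if not 0 <= n_bypassed < N_SM:
        raise ValueError("n_bypassed must be between 0 and N_SM - 1")
    v_phase_peak = V_GRID_LL_RMS * math.sqrt(2.0 / 3.0)
    return v_phase_peak / ((N_SM - n_bypassed) * V_CLUSTER_MIN)


if __name__ == "__main__":
    for n in range(4):
        m = bypass_modulation_ratio(n)
        note = "  (overmodulation)" if m > 1.0 else ""
        print(f"{n} bypassed SMs: m = {m:.3f}{note}")
```

Under these assumptions the ratio stays below 1 with one bypassed SM and exceeds 1 with two, which is in line with the critical margin reported for the conventional method.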

## **B.** HIL Experiment

To verify the feasibility of implementing the proposed control technique in an actual digital controller, an HIL experimental platform is constructed, as shown in Fig. 13. The detailed switching circuits of the TGT-BESS are built in the real-time digital simulator MT-8020, and all control algorithms in Fig. 9 are implemented in a controller composed of a TMS320C28346 DSP and an XC6SLX25 FPGA. The main circuit parameters are the same as those listed in Table I. The difference between the offline and real-time models is that only one phase arm with 14 SMs runs in the simulator, since the DSP+FPGA controller has a maximum of 60 PWM channels. Nevertheless, all key control methods can be tested and verified completely with such a single-phase system.
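The channel limit can be checked with a quick count. This is a minimal sketch that assumes one PWM channel per H-bridge switch (four per SM); the text does not state how channels are allocated, so `SWITCHES_PER_H_BRIDGE` and the helper names are assumptions for illustration only.

```python
SWITCHES_PER_H_BRIDGE = 4  # assumption: one PWM channel per switch
MAX_PWM_CHANNELS = 60      # controller limit quoted in the text
SMS_PER_PHASE = 14         # SMs per phase quoted in the text


def pwm_channels(n_phases: int, sms_per_phase: int = SMS_PER_PHASE) -> int:
    """Number of PWM channels needed under the one-channel-per-switch assumption."""
    return n_phases * sms_per_phase * SWITCHES_PER_H_BRIDGE


def fits_controller(n_phases: int) -> bool:
    """True if the required channels do not exceed the controller limit."""
    return pwm_channels(n_phases) <= MAX_PWM_CHANNELS


if __name__ == "__main__":
    for phases in (1, 3):
        print(phases, "phase(s):", pwm_channels(phases), "channels, fits:", fits_controller(phases))
```

With this allocation, one phase needs 56 channels and fits within the limit, whereas three phases would need 168.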

Fig. 11. Simulation results of the three-phase modulation voltage of non-faulty SMs and the output voltage of faulty and non-faulty SMs. (a1–a5) Around the occurrence of the 3-battery-cluster fault. (b1–b5) Around the occurrence of the 5-battery-cluster fault. (c1–c5) Around the occurrence of the 7-battery-cluster fault.

Fig. 12. Simulated modulation ratio of the conventional bypassing method and of the healthy and faulty SMs of the proposed fault-tolerance method for different numbers of faulty SMs.

Fig. 13. HIL experimental platform.

Fig. 14. Key voltage and current waveforms during the successive occurrence of seven faulty SMs.

Fault signals are applied to seven SMs in sequence, and Fig. 14 gives the key waveforms before and after each fault under the proposed fault-tolerance method. The grid voltage and current waveforms are captured to show that the system always has enough modulation ratio margin to control the grid current after battery clusters are disconnected from the H-bridges. With one faulty SM, the modulation voltage of the faulty SM and the THV are both zero, and the total dc-side voltage of the 13 healthy SMs is still higher than the amplitude of the grid voltage. This corresponds to (3). With 2 or 3 faulty SMs, injecting a certain THV helps to avoid system overmodulation, and the faulty SMs are still not required to output voltage, which corresponds to (14). As more faulty battery clusters are disconnected, injecting THV alone can no longer make the total battery voltage match the grid voltage. According to (18), the faulty SMs should then output a certain reactive voltage and power so that the healthy SMs can continue to operate in charging or discharging mode. Accordingly, the modulation voltage of the faulty SMs gradually increases as more faulty SMs occur.
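The three regimes described above can be checked against a rough voltage budget. The estimate below is a sketch, not a reproduction of conditions (3), (14) and (18). It assumes that 10 kV is the line-to-line rms grid voltage, that every healthy cluster sits at the minimum 670 V, that the filter inductor drop is negligible, and that ideal zero-sequence injection lowers the required phase peak by at most a factor of $\sqrt{3}/2$. The symbols $\hat{V}_{\rm ph}$, $N_{\rm h}$ and $V_{\rm bat}$ are illustrative and are not the paper's notation.

$$
\hat{V}_{\rm ph} = \sqrt{\tfrac{2}{3}} \times 10\ \text{kV} \approx 8.16\ \text{kV}, \qquad \tfrac{\sqrt{3}}{2}\,\hat{V}_{\rm ph} \approx 7.07\ \text{kV}
$$

$$
N_{\rm h} V_{\rm bat} =
\begin{cases}
13 \times 670\ \text{V} = 8.71\ \text{kV} > 8.16\ \text{kV} & \text{(1 faulty SM: no injection needed)} \\
12 \times 670\ \text{V} = 8.04\ \text{kV},\quad 11 \times 670\ \text{V} = 7.37\ \text{kV} > 7.07\ \text{kV} & \text{(2 or 3 faulty SMs: THV suffices)} \\
10 \times 670\ \text{V} = 6.70\ \text{kV} < 7.07\ \text{kV} & \text{(4 or more faulty SMs: faulty SMs must output voltage)}
\end{cases}
$$

Under these assumptions the thresholds agree with the observed behavior: no THV with one faulty SM, THV alone with two or three, and voltage support from the faulty SMs beyond that.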

To show the fault-tolerance process of each faulty SM, the dc-side voltage waveforms are observed with an 8-channel oscilloscope, as shown in Fig. 15. Because the healthy SMs have nearly identical dynamic and steady-state behavior, only the eighth SM is observed, and the ninth to fourteenth SMs are regarded as behaving the same. As Fig. 15 shows, before the faulty SMs output supporting voltage, the capacitor voltages of the #1–3 SMs are quickly charged to 900 V after the batteries are cut off. When the faulty SMs output reactive power, a fluctuating voltage is also superimposed on the capacitor voltage. As shown in Fig. 16(a), the capacitor voltage average value control in (20) makes the average voltage of the faulty SMs track its reference, although they do not need to output power in steady state. Meanwhile, when another two SMs enter fault-tolerance mode, as in Fig. 16(b) and (c), the capacitor voltage balancing control first equalizes the voltages of all faulty SMs and then charges them. This is consistent with the theoretical analysis and the offline simulation.

Fig. 15. Capacitor voltage waveforms of #1–8 SMs during the successive occurrence of seven faulty SMs.

Fig. 16. Enlarged capacitor voltage waveforms of #1–8 SMs.

## V. CONCLUSION

In this article, a fault-tolerance strategy for multiple battery clusters of the TGT-BESS is proposed to ensure continuous and reliable operation of the large-capacity system. By disconnecting a warning or faulty battery with its own dc breaker, rather than bypassing the whole SM with ac bypass circuitry, the H-bridge of the faulty SM can continue to operate as a reactive device. The faulty SMs are thus put to use and can output a reactive voltage, which helps to decrease the modulation voltage of the remaining healthy SMs and avoid overmodulation. Furthermore, a battery cluster fault-tolerant operating control for the TGT-BESS, combining the proposed fault-tolerant strategy with optimal zero-sequence voltage injection, is proposed to tolerate any number of faulty battery cluster modules. Finally, a 10 kV/5 MW TGT-BESS is built for principle validation on a MATLAB/Simulink simulation platform and an HIL experimental platform. The simulation and experimental results both verify the feasibility of the proposed fault-tolerant operating control under multi-cluster faults. However, this method changes the steady-state operating point and power factor of the system. Further research is needed on fault isolation and fault-tolerant operation strategies that do not affect the external characteristics of the system. The binary star modular multilevel topology is expected to solve this problem in future research.

#### References

- [1] S. Vazquez, S. M. Lukic, E. Galvan, L. G. Franquelo, and J. M. Carrasco, "Energy storage systems for transport and grid applications," in *IEEE Transactions on Industrial Electronics*, vol. 57, no. 12, pp. 3881–3895, Dec. 2010.
- [2] B. Dunn, H. Kamath, and J.-M. Tarascon, "Electrical energy storage for the grid: A battery of choices," in *Science*, vol. 334, no. 6058, pp. 928–935, Nov. 2011.
- [3] F. Calero, C. A. Cañizares, K. Bhattacharya, C. Anierobi, I. Calero, M. F. Z. de Souza, M. Farrokhabadi, N. S. Guzman, W. Mendieta, D. Peralta, B. V. Solanki, N. Padmanabhan, and W. Violante, "A review of modeling and applications of energy storage systems in power grids," in *Proceedings of the IEEE*, vol. 111, no. 7, pp. 806–831, Jul. 2023.
- [4] X. Li and S. Wang, "Energy management and operational control methods for grid battery energy storage systems," in *CSEE Journal of Power and Energy Systems*, vol. 7, no. 5, pp. 1026–1040, Sept. 2021.
- [5] M. Liu, X. Cao, C. Cao, P. Wang, C. Wang, J. Pei, H. Lei, X. Jiang, R. Li, and J. Li, "A review of power conversion systems and design schemes of high-capacity battery energy storage systems," in *IEEE Access*, vol. 10, pp. 52030–52042, 2022.
- [6] L. Maharjan, S. Inoue, and H. Akagi, "A transformerless energy storage system based on a cascade multilevel PWM converter with star configuration," in *IEEE Transactions on Industry Applications*, vol. 44, no. 5, pp. 1621–1630, Sept.-Oct. 2008.
- [7] C. Liu, N. Gao, X. Cai, and R. Li, "Differentiation power control of modules in second-life battery energy storage system based on cascaded H-bridge converter," in *IEEE Transactions on Power Electronics*, vol. 35, no. 6, pp. 6609–6624, Jun. 2020.
- [8] C. Liu, X. Cai, and Q. Chen, "Self-adaptation control of second-life battery energy storage system based on cascaded H-bridge converter," in *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 8, no. 2, pp. 1428–1441, Jun. 2020.
- [9] J. I. Y. Ota, T. Sato, and H. Akagi, "Enhancement of performance, availability, and flexibility of a battery energy storage system based on a modular multilevel cascaded converter (MMCC-SSBC)," in *IEEE Transactions on Power Electronics*, vol. 31, no. 4, pp. 2791–2799, Apr. 2016.
- [10] L. Maharjan, S. Inoue, H. Akagi, and J. Asakura, "State-of-charge (SOC) balancing control of a battery energy storage system based on a cascade PWM converter," in *IEEE Transactions on Power Electronics*, vol. 24, no. 6, pp. 1628–1636, Jun. 2009.
- [11] H. Xue, J. He, Y. Ren, and P. Guo, "Seamless fault-tolerant control for cascaded H-bridge converters based battery energy storage system," in *IEEE Transactions on Industrial Electronics*, vol. 70, no. 4, pp. 3803–3813, Apr. 2023.
- [12] L. Xiong, F. Zhuo, X. Liu, Z. Xu, and Y. Zhu, "Fault-tolerant control of CPS-PWM-based cascaded multilevel inverter with faulty units," in *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 7, no. 4, pp. 2486–2497, Dec. 2019.
- [13] Q. Xiao, Y. Jin, H. Jia, Y. Tang, A. F. Cupertino, Y. Mu, R. Teodorescu, F. Blaabjerg, and J. Pou, "Review of fault diagnosis and fault-tolerant control methods of the modular multilevel converter under submodule failure," in *IEEE Transactions on Power Electronics*, vol. 38, no. 10, pp. 12059–12077, Oct. 2023.
- [14] S. Yang, Y. Tang, and P. Wang, "Seamless fault-tolerant operation of a modular multilevel converter with switch open-circuit fault diagnosis in a distributed control architecture," in *IEEE Transactions on Power Electronics*, vol. 33, no. 8, pp. 7058–7070, Aug. 2018.
- [15] B. Li, S. Shi, B. Wang, G. Wang, W. Wang, and D. Xu, "Fault diagnosis and tolerant control of single IGBT open-circuit failure in modular multilevel converters," in *IEEE Transactions on Power Electronics*, vol. 31, no. 4, pp. 3165–3176, Apr. 2016.
- [16] W. Song and A. Q. Huang, "Fault-tolerant design and control strategy for cascaded H-bridge multilevel converter-based STATCOM," in *IEEE Transactions on Industrial Electronics*, vol. 57, no. 8, pp. 2700–2708, Aug. 2010.
- [17] N. Bisht and A. Das, "A circuit topology of cascaded H-bridge STATCOM to operate with multiple faulty bypassed cells," in *IEEE Transactions on Industry Applications*, vol. 57, no. 5, pp. 5345–5355, Sept.-Oct. 2021.
- [18] R. Ahmadi, M. Aleenejad, H. Mahmoudi, and S. Jafarishiadeh, "Reduced number of auxiliary H-bridge power cells for post-fault operation of three phase cascaded H-bridge inverter," in *IET Power Electronics*, vol. 12, no. 11, pp. 2923–2931, Aug. 2019.
- [19] B. Mirafzal, "Survey of fault-tolerance techniques for three-phase voltage source inverters," in *IEEE Transactions on Industrial Electronics*, vol. 61, no. 10, pp. 5192–5202, Oct. 2014.
- [20] S. Farzamkia, H. Iman-Eini, A. Khoshkbar-Sadigh, and M. Noushak, "A software-based fault-tolerant strategy for modular multilevel converter using DC bus voltage control," in *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 9, no. 3, pp. 3436–3445, Jun. 2021.
- [21] Y. Neyshabouri and H. Iman-Eini, "A new fault-tolerant strategy for a cascaded H-bridge based STATCOM," in *IEEE Transactions on Industrial Electronics*, vol. 65, no. 8, pp. 6436–6445, Aug. 2018.
- [22] L. Maharjan, T. Yamagishi, H. Akagi, and J. Asakura, "Fault-tolerant operation of a battery-energy-storage system based on a multilevel cascade PWM converter with star configuration," in *IEEE Transactions on Power Electronics*, vol. 25, no. 9, pp. 2386–2396, Sept. 2010.
- [23] P. Lezana and G. Ortiz, "Extended operation of cascade multicell converters under fault condition," in *IEEE Transactions on Industrial Electronics*, vol. 56, no. 7, pp. 2697–2703, Jul. 2009.
- [24] Y. Yu, G. Konstantinou, B. Hredzak, and V. G. Agelidis, "Operation of cascaded H-bridge multilevel converters for large-scale photovoltaic power plants under bridge failures," in *IEEE Transactions on Industrial Electronics*, vol. 62, no. 11, pp. 7228–7236, Nov. 2015.
- [25] Q. Xiao, L. Chen, Y. Jin, Y. Mu, A. F. Cupertino, H. Jia, Y. Neyshabouri, T. Dragičević, and R. Teodorescu, "An improved fault-tolerant control scheme for cascaded H-bridge STATCOM with higher attainable balanced line-to-line voltages," in *IEEE Transactions on Industrial Electronics*, vol. 68, no. 4, pp. 2784–2797, Apr. 2021.
- [26] Y. Neyshabouri, K. K. Monfared, H. Iman-Eini, and M. Farhadi-Kangarlu, "Symmetric cascaded H-bridge multilevel inverter with enhanced multi-phase fault tolerant capability," in *IEEE Transactions on Industrial Electronics*, vol. 69, no. 9, pp. 8739–8750, Sept. 2022.
- [27] H. Xue, J. He, and P. Guo, "Fault-tolerance control of battery energy storage system based on cascaded H-bridge converter," in *High Voltage Engineering*, vol. 46, no. 10, pp. 3418–3430, Sept. 2020.
- [28] M. T. Lawder, B. Suthar, P. W. C. Northrop, S. De, C. M. Hoff, O. Leitermann, M. L. Crow, S. Santhanagopalan, and V. R. Subramanian, "Battery energy storage system (BESS) and battery management system (BMS) for grid-scale applications," in *Proceedings of the IEEE*, vol. 102, no. 6, pp. 1014–1030, Jun. 2014.



Xiqi Wu received the bachelor's degree in electrical engineering from Nanjing University of Aeronautics and Astronautics, China, in 2019, and the Ph.D. degree from Shanghai Jiao Tong University in 2024.

Since 2024, he has been with the Key Laboratory of Control of Power Transmission and Conversion (Shanghai Jiao Tong University), Ministry of Education, Minhang District, Shanghai, China, where he is currently a postdoctoral researcher. His current research interests include key technologies of battery energy storage systems, renewable energy power conversion, and grid-forming control.



**Shuai Gao** received the master's degree in power electronics and power drives from Hefei University of Technology, China, in 2020.

His current research interests include high voltage energy storage system and new electricity system.



Yilin Liu received the bachelor's degree in electrical engineering from Southwest Jiaotong University, Chengdu, China, in 2022. She is currently working toward the Ph.D. degree in electrical engineering with the Key Laboratory of Control of Power Transmission and Conversion, Shanghai Jiao Tong University, Ministry of Education, Shanghai, China.

Her current research interests include key technologies of battery energy storage systems and renewable energy power conversion.



Rui Li received the Ph.D. degree in electrical engineering from Zhejiang University, Hangzhou, China, in 2010.

From 2008 to 2009, he was an Academic Guest with the Power Electronic Systems Laboratory, Swiss Federal Institute of Technology, Zürich, Switzerland. From 2014 to 2015, he was a Postdoctoral Research Scholar with the Center for Advanced Power Systems, Department of Electrical and Computer Engineering, College of Engineering, Florida State University, Tallahassee, FL, USA. Since 2010, he has been with the Department of Electrical Engineering, School of Electronics, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China, where he has been a Professor since 2019. His current research interests include the application of power electronics in renewable energy conversion. He was a recipient of the IEEE Power Electronics Society Transactions Second Prize Paper Award in 2015.



Xinyu Jiang received the master's degree in electrical insulation and high voltage technology from Xi'an Jiaotong University, China, in 1997.

His current research interests include high voltage energy storage system and new electricity system.



Xu Cai received the B.Eng. degree from Southeast University, Nanjing, China, in 1983, and the M.Sc. and Ph.D. degrees from China University of Mining and Technology, Jiangsu, China, in 1988 and 2000, respectively. He was with the Department of Electrical Engineering, China University of Mining and Technology, as an Associate Professor from 1989 to 2001. He joined Shanghai Jiao Tong University as a Professor in 2002, has been the director of the Wind Power Research Center of Shanghai Jiao Tong University since 2008, and was vice director of the State Energy Smart Grid R&D Center (Shanghai) from 2010 to 2013. His special fields of interest lie in power electronics and renewable energy exploitation and utilization, including wind power converters and wind turbines.