Adopting multiple antennas at both the transmitter and the receiver exploits additional spatial resources and, with the spatial division multiple access (SDMA) technique, provides a substantial gain in system throughput. Optimal multiuser MIMO linear precoding is a key issue in multiuser MIMO research. The challenge in such a multiuser system is to design the precoding vectors that maximize the system capacity. This paper proposes an optimal multiuser MIMO linear precoding scheme with LMMSE detection based on particle swarm optimization. The proposed scheme aims to maximize the system capacity of a multiuser MIMO system with linear precoding and linear detection. A simplified objective function is derived for the optimization problem, and the particle swarm optimization algorithm then searches for the optimal linear precoding vector according to this function. The proposed scheme provides a significant performance improvement compared with the multiuser MIMO linear precoding scheme based on channel block diagonalization.
1. Introduction
In recent years, with the increasing demand for high data rates, the multiple-input multiple-output (MIMO) technique, a potential method to achieve high capacity, has attracted enormous interest [1, 2]. When multiple antennas are equipped at both base stations (BSs) and mobile stations (MSs), the space dimension can be exploited for scheduling multiuser transmission in addition to the time and frequency dimensions. Therefore, the traditional MIMO technique focused on point-to-point single-user MIMO (SU-MIMO) has been extended to the point-to-multipoint multiuser MIMO (MU-MIMO) technique [3, 4]. It has been shown that time division multiple access (TDMA) systems cannot achieve the sum rate capacity of the MU-MIMO broadcast channel (BC) [5], whereas MU-MIMO with spatial division multiple access (SDMA) can, with one BS communicating with several MSs within the same time slot and the same frequency band [6, 7]. MU-MIMO based on SDMA improves system capacity by taking advantage of multiuser diversity and by precanceling multiuser interference at the transmitter.
The traditional MIMO technique focuses on point-to-point transmission, such as the STBC technique based on space-time coding and the V-BLAST technique based on spatial multiplexing. The former efficiently combats channel fading, but its spectral efficiency is low [8, 9]. The latter can transmit parallel data streams, but its performance degrades under spatially correlated channels [10, 11]. When the MU-MIMO technique is adopted, both the multiuser diversity gain, which improves the BER performance, and the spatial multiplexing gain, which increases the system capacity, are obtained. Since the receive antennas are distributed among several users, spatial correlation has less effect on a multiuser MIMO system. Besides, because the multiuser MIMO technique uses precoding at the transmit side to precancel the co-channel interference (CCI), the complexity of the receiver can be significantly reduced. However, multiuser CCI is one of the main obstacles to improving MU-MIMO performance. The challenge is that the receive antennas associated with different users are typically unable to coordinate with each other. To mitigate or, ideally, completely eliminate the CCI, the BS exploits the channel state information (CSI) available at the transmitter. CSI at the BS is essential because it allows joint processing of all users' signals, which results in a significant performance improvement and increased data rates.
The sum capacity of a multiuser MIMO broadcast channel is defined as the maximum aggregate of all the users' data rates. For Gaussian MIMO broadcast channels (BCs), it was proven in [12] that dirty paper coding (DPC) can achieve the capacity region. The optimal precoding for multiuser MIMO is a nonlinear precoding method based on DPC theory. DPC theory shows that when a transmitter has advance knowledge of the interference, it can design a code to compensate for it. The theory was developed by Costa; it eliminates the interference by iterative precoding at the transmitter and achieves the broadcast MIMO channel capacity [13, 14]. The well-known Tomlinson-Harashima precoding (THP) is a nonlinear precoding based on DPC theory. It was first developed independently by Tomlinson [15] and by Miyakawa and Harashima [16], and it has since been applied to combat the multiuser CCI [17–20]. Although THP performs well in a multiuser MIMO scenario, deploying it in real-time systems is difficult because of the high complexity of the precoding at the transmitter. Many suboptimal MU-MIMO linear precoding techniques have emerged recently, such as the channel inversion method [21] and the block diagonalization (BD) method [22–24]. The channel inversion method [25] applies traditional MIMO detection criteria, such as zero forcing (ZF) and minimum mean squared error (MMSE), as precoding at the transmitter to suppress the CCI. Channel inversion based on ZF can suppress the CCI completely; however, it may lead to noise amplification since the precoding vectors are not normalized. Channel inversion based on MMSE trades off the noise against the CCI and outperforms the ZF algorithm, but it still does not achieve good performance. The BD method decomposes a multiuser MIMO channel into multiple parallel single-user MIMO channels and completely cancels the CCI by making use of the null space.
With BD, each user's precoding matrix lies in the null space of all other users' channels, so the CCI is completely canceled. The generated null-space vectors are normalized, which efficiently avoids the noise amplification problem, so the BD method performs much better than the channel inversion method. However, since the BD method only aims to cancel the CCI and suppress the noise, its precoding gain is not optimized.
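The behavior attributed to ZF channel inversion can be illustrated numerically. The following is a minimal sketch, assuming single-antenna users whose channel rows are stacked into a square matrix `H` (a hypothetical setup chosen for brevity, not the configuration simulated in this paper); the names `zf_precoder`, `H`, and `W` are illustrative.

```python
import numpy as np

def zf_precoder(H):
    """Zero-forcing channel inversion: W = H^H (H H^H)^{-1}, so that H W = I."""
    Hh = H.conj().T
    return Hh @ np.linalg.inv(H @ Hh)

rng = np.random.default_rng(0)
# Illustrative i.i.d. Rayleigh channel: 4 single-antenna users, 4 BS antennas
H = (rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))) / np.sqrt(2)
W = zf_precoder(H)

cci_residual = np.linalg.norm(H @ W - np.eye(4))   # close to 0: CCI is removed
column_norms = np.linalg.norm(W, axis=0)           # generally not equal to 1
```

Because the columns of `W` are not unit-norm, meeting a transmit power constraint requires rescaling them, which changes the received signal power relative to the noise. This is the noise amplification effect mentioned above.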
It is clear that the CCI, the noise, and the precoding gain are the factors that affect the performance of MU-MIMO with preprocessing. The above linear precoding methods each take only part of these factors into account rather than considering them jointly. A rate maximization linear precoding method is proposed in [26]; it aims to maximize the sum rate of the MU-MIMO system with linear preprocessing. However, the objective function in [26] is too complex to compute. In this paper, we solve the optimal linear precoding problem with a linear MMSE receiver in a simpler way.
This paper proposes an optimal MU-MIMO linear precoding scheme with a linear MMSE receiver based on particle swarm optimization (PSO). The PSO algorithm has been used in many complex optimization tasks, especially for optimization over continuous spaces [27, 28]. In this paper, PSO is introduced into MIMO research for the first time to solve optimization problems, which provides a new method for MIMO processing. We first analyze the optimal linear precoding vector with a linear MMSE receiver and establish a simplified function to measure the optimal linear precoding problem. We then employ the PSO algorithm to search for the optimal linear precoding vector according to the simplified function. The proposed scheme achieves a significant MU-MIMO system capacity gain and outperforms the channel block diagonalization method.
This paper is organized into seven parts. The MU-MIMO system model is given in Section 2. The analysis of optimal linear precoding with a linear MMSE receiver is given in Section 3. The particle swarm optimization algorithm is described in Section 4. Section 5 introduces the proposed optimal linear precoding MU-MIMO scheme with LMMSE detection based on particle swarm optimization. Section 6 gives the simulation results and comparisons. Conclusions are drawn in the last section. The channel block diagonalization algorithm is given in the appendix.
2. System Model of MU-MIMO
The MU-MIMO system can transmit the data streams of multiple users in the same cell at the same time and on the same frequency resources, as Figure 1 shows.
Figure 1. The configuration of the MU-MIMO system
We consider an MU-MIMO system with one BS and MS, where the BS is equipped with antennas and each MS with antennas, as shown in Figure 2. The point-to-multipoint MU-MIMO system is employed in downlink transmission.
Figure 2. The block diagram of the MU-MIMO system
Because MU-MIMO aims to transmit the data streams of multiple users on the same time and frequency resources, we discuss the algorithm for the single-carrier case; each subcarrier of a multicarrier system is processed in the same way as the single-carrier case. Since the OFDM technique converts frequency-selective fading into flat fading on each subcarrier, we model the channel as a flat-fading MIMO channel:
where is the MIMO channel matrix of user . indicates the channel impulse response coupling the th transmit antenna to the th receive antenna. Its amplitude is independent and identically Rayleigh distributed.
Data streams of users are precoded by their precoding vectors before transmission. is the normalized precoding vector for user with . The received signal at the kth user is
where is the received signal of user . The elements of the additive noise obey distribution and are spatially and temporally white. is the transmit signal power of the th data stream, and is the total transmit power.
The received signal at the kth user can also be expressed as
where is the transmitted symbol vector with data streams, is the precoding matrix with precoding vectors, and denotes the matrix transposition:
The channel matrix can be regarded as the virtual channel matrix of user after precoding. At the receiver, a linear receiver is used to detect the transmitted signal for user . The detected signal of the th user is
The linear receiver can be designed by the ZF or MMSE criterion, and linear MMSE obtains better performance. To simplify the analysis, the power allocation is assumed to be equal, , and linear MMSE MIMO detection is used in this paper as
where indicates the inverse of the matrix, denotes the conjugate transpose of the matrix, and is the identity matrix:
where denotes the th column of the matrix. Then the detected SINR for the user with the linear detection is
where denotes the matrix two-norm.
Because a nonnormalized precoding vector amplifies the noise at the receiver, the precoding vectors are assumed to be normalized as follows:
for .
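The post-detection SINR of this system model can be sketched numerically. The example below is a minimal illustration, assuming one data stream per user, equal power `p` per stream, and noise variance `noise_var`; it uses the standard LMMSE SINR expression a_k^H (R_int)^{-1} a_k, where a_k is the effective channel of the desired stream and R_int is the interference-plus-noise covariance, which may differ in form from the expression used in this paper. The function name `lmmse_sinr` and its arguments are hypothetical.

```python
import numpy as np

def lmmse_sinr(H_k, W, k, p, noise_var):
    """SINR of stream k at user k after LMMSE detection.

    H_k : (n_rx, n_tx) channel matrix of user k
    W   : (n_tx, K) precoding matrix, one unit-norm column per user
    """
    A = np.sqrt(p) * H_k @ W                  # virtual channel after precoding
    a_k = A[:, k]                             # desired stream
    A_int = np.delete(A, k, axis=1)           # interfering streams
    R_int = A_int @ A_int.conj().T + noise_var * np.eye(H_k.shape[0])
    return float(np.real(a_k.conj() @ np.linalg.solve(R_int, a_k)))

# Toy check: identity channel and identity precoder, so the interfering
# stream falls on the other receive antenna and the SINR is p / noise_var.
sinr = lmmse_sinr(np.eye(2), np.eye(2), k=0, p=1.0, noise_var=0.5)  # 2.0
```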
3. Optimal Multiuser MIMO Linear Precoding
We assume that the MIMO channel matrices are available at the BS. This can be achieved either by channel reciprocity in time-division-duplex (TDD) mode or by feedback in frequency-division-duplex (FDD) mode. The channel matrix is known at the receiver through channel estimation. We discuss only the equal power allocation case in this paper; the optimal power allocation can be achieved through water-filling according to the SINR of each user.
The MIMO channel of user can be decomposed by the singular value decomposition (SVD) as
If we apply to precode for user , it obtains the maximal precoding gain as follows.
Lemma 1.
One has
where denotes the first column of and denotes the maximal singular value of .
Proof.
One has
So
where denotes the first column of unitary matrix .
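The property can be checked numerically. The sketch below assumes the lemma states that precoding with the dominant right singular vector yields a gain equal to the maximal singular value; the channel dimensions and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative 2x4 i.i.d. Rayleigh channel of one user
H = (rng.normal(size=(2, 4)) + 1j * rng.normal(size=(2, 4))) / np.sqrt(2)

U, s, Vh = np.linalg.svd(H)          # H = U diag(s) Vh, with s sorted descending
v1 = Vh[0].conj()                    # first column of V
gain_v1 = np.linalg.norm(H @ v1)     # equals the largest singular value s[0]

# Any other unit-norm precoding vector gives a gain no larger than s[0]
w = rng.normal(size=4) + 1j * rng.normal(size=4)
w /= np.linalg.norm(w)
gain_w = np.linalg.norm(H @ w)
```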
Thus, precoding with the singular vector corresponding to the maximal singular value is a natural first approach to obtaining good performance. However, if the singular vector is directly used at the transmitter as the precoding vector, the CCI will be large, and the performance will degrade severely. Only in the special case where the MIMO channels of all these users are orthogonal is the CCI zero when the singular vector of each user is used directly as its precoding vector. In realistic cases, the users' channels are generally nonorthogonal, so the singular vector cannot be used directly. We analyze the following cases.
Ideal channel case. In the ideal channel case, the MIMO channels of the transmitting users are orthogonal, that is,
If we apply to precode for user , the maximal precoding gain will be obtained as (13) shows, and the CCI becomes zero as follows.
Lemma 2.
One has
Proof.
One has
Because we assume that (), and is the unit vector with , so is an unitary matrix with
Since , so
where denotes the first row of .
Also
Since so Combining there is
so
After linear MMSE detection at the receiver, user k obtains the maximal SINR as follows.
Lemma 3.
One has
Proof.
One has
According to (13) and (22)
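The ideal case can be reproduced with a small numerical sketch. It assumes two users whose channels have mutually orthogonal row spaces, and it assumes the maximal SINR takes the form of the per-stream power times the squared largest singular value divided by the noise variance; the construction, the values of `p` and `noise_var`, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    """Unitary matrix from the QR factorization of a complex Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q

Q = random_unitary(4)                    # its columns are split between the users
s = np.array([2.0, 1.0])                 # singular values, largest first
H0 = random_unitary(2) @ np.diag(s) @ Q[:, 0:2].conj().T
H1 = random_unitary(2) @ np.diag(s) @ Q[:, 2:4].conj().T

w0, w1 = Q[:, 0], Q[:, 2]                # dominant right singular vectors
p, noise_var = 1.0, 0.5

cci_at_user0 = np.linalg.norm(H0 @ w1)   # interference from user 1's stream
a0 = np.sqrt(p) * H0 @ w0                # desired effective channel at user 0
a_int = np.sqrt(p) * H0 @ w1             # interfering effective channel
R_int = np.outer(a_int, a_int.conj()) + noise_var * np.eye(2)
sinr0 = np.real(a0.conj() @ np.linalg.solve(R_int, a0))
```

With orthogonal channels the interference term vanishes, and `sinr0` reduces to `p * s[0]**2 / noise_var`.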
Ill channel case. In the ill channel case, all the transmitting users' channels are highly correlated, that is,
If we still apply to precode for user , the multiuser CCI will be very large, and the system performance will be degraded severely. The SINR after MMSE detection with equal power allocation for user is as follows.
Lemma 4.
One has
Proof.
Since we have proven that when to precode for user , then so
According to (19)
Since we assume that ( ), so is
Let the diagonal matrix and so there is
Since , so
where indicates the first diagonal element of the diagonal matrix . So there is
Since so Combining there is
So the SINR for user k is
Practical case. In the practical case, the transmitting users' channels are neither orthogonal nor ill, that is,
The practical case is the one usually encountered in realistic environments. If we apply to precode for user , then can be the parameter to measure the precoding gain, and can be the parameter to measure the CCI. According to the above analysis, the SINR for user can be approximated as
The system capacity is related to the SINR of the transmit users . So in order to obtain the system capacity, we should obtain the . Thus, when the optimal precoding vector is obtained by the PSO algorithm, the system capacity can be calculated by (41).
The system capacity of the MU-MIMO system can be expressed as
We aim to maximize the system capacity of the MU-MIMO system in this paper. The optimal MU-MIMO linear precoding vector is the vector that maximizes the SINR at each receiver as
where denotes the unit vector with . From the above equation, it is clear that to maximize the system capacity of MU-MIMO, the SINR of each user should be maximized. The SINR of user is associated with three parameters, namely, the singular vector corresponding to the maximal singular value of all users and the noise.
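The relation between the per-user SINR and the system capacity can be sketched as follows, assuming one data stream per user and the usual rate expression log2(1 + SINR) per stream; the function name `sum_capacity` is hypothetical, and the exact capacity expression of this paper may differ.

```python
import numpy as np

def sum_capacity(sinrs):
    """Sum of the per-user rates log2(1 + SINR_k), in bit/s/Hz."""
    return float(np.sum(np.log2(1.0 + np.asarray(sinrs, dtype=float))))

capacity = sum_capacity([1.0, 3.0])   # log2(2) + log2(4) = 3.0
```

Since the logarithm is monotonically increasing, raising the SINR of any user without lowering the others raises the sum.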
4. The Particle Swarm Optimization Algorithm
The particle swarm optimization algorithm was originally proposed by Kennedy and Eberhart [27] in 1995. It searches for the optimal solution of a problem through cooperation and competition among the individuals of a population.
Imagine a swarm of bees in a field. Their goal is to find the location in the field with the highest density of flowers. Without any prior knowledge of the field, the bees begin in random locations with random velocities, looking for flowers. Each bee remembers the location where it found the most flowers and somehow knows the locations where the other bees found an abundance of flowers. Torn between returning to the location where it personally found the most flowers and exploring the location reported by others to have the most flowers, the ambivalent bee accelerates in both directions and flies somewhere between the two points. A fitness function evaluates the goodness of a position. Along the way, a bee might find a place with a higher concentration of flowers than it had found previously. The bees constantly check the concentration of flowers, hoping to find the absolute highest concentration.
Suppose that the size of the swarm and the dimension of the search space are and , respectively. Each individual in the swarm is referred to as a particle. The location and velocity of particle are represented as the vectors and . Each bee remembers the location where it personally encountered the most flowers, denoted as , which is the flight experience of the particle itself. The location with the highest concentration of flowers discovered by the entire swarm is denoted as , which is the flight experience of all particles. Each particle searches for the best location according to and . The particle updates its location and velocity according to the following two formulas [27]:
where is the current iteration number; and denote the velocity and location of the particle in the th dimensional direction. is the individual best location of particle in the th dimensional direction, and is the population best location in the th dimensional direction. and are random numbers between 0 and 1, and are the learning factors, and is the inertia factor. The learning factors determine the relative "pull" of and and usually satisfy . The inertia factor determines to what extent the particle remains along its original course unaffected by the pull of and ; it is usually between 0 and 1. After this process is carried out for each particle in the swarm, it is repeated until the maximal iteration number is reached or the termination criteria are met.
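The update rule described above can be sketched as follows. This is a minimal real-valued implementation in the common inertia-weight form, where the new velocity is the inertia factor times the old velocity plus the two randomly weighted pulls toward the individual best and the population best, and the new location is the old location plus the new velocity. The function name `pso_maximize`, the initialization ranges, and the parameter values (inertia 0.7, learning factors 1.5) are assumptions chosen for illustration, not values taken from this paper.

```python
import numpy as np

def pso_maximize(fitness, dim, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over R^dim with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n_particles, dim))   # locations
    v = rng.normal(size=(n_particles, dim))               # velocities
    pbest = x.copy()                                      # individual best locations
    pbest_f = np.array([fitness(p) for p in x])
    g = int(np.argmax(pbest_f))
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]          # population best
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))               # random numbers in [0, 1)
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        f = np.array([fitness(p) for p in x])
        improved = f > pbest_f
        pbest[improved] = x[improved]
        pbest_f[improved] = f[improved]
        g = int(np.argmax(pbest_f))
        if pbest_f[g] > gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    return gbest, float(gbest_f)

# Toy fitness with its maximum (value 0) at the all-ones point
best_x, best_f = pso_maximize(lambda z: -np.sum((z - 1.0) ** 2), dim=2, n_iter=200)
```

The population best is replaced only when a strictly better location is found, so the returned fitness never decreases over the iterations.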
5. The Optimal Linear Precoding Multiuser MIMO with LMMSE Detection Based on Particle Swarm Optimization
With the adoption of the PSO algorithm and the simplified function (40), the optimal linear precoding vector can be searched for easily.
The proposed optimal MU-MIMO linear precoding scheme based on the PSO algorithm searches for the optimal precoding vector of each user in the following 6 steps.
The BS obtains and of each user.
The BS employs the PSO algorithm to search the optimal linear precoding vector for each user. For user , the PSO algorithm sets the maximal iteration number and a group of dimensional particles with the initial velocity and the initial location for particle . In order to accelerate the searching process, the initial location could be initialized as , while the initial velocity could be produced randomly. The real and imaginary parts of the initial velocity obey a normal distribution with mean zero and standard deviation one.
The BS begins to search with the initial location and velocity . The goodness of the location is measured by the following equation:
where the fitness function indicates the obtained SINR for user precoded by . The PSO algorithm finds and that are individual best location and population best location measured by (44) for the next iteration. denotes the individual best location which means the best location of particle at the th iteration of the th user. denotes the population best location which means the best location of all particles at the th iteration of the th user.
For the th iteration, the algorithm finds a and a . The location and velocity for each particle will be updated according to (43) for the next iteration. In order to obtain the normalized optimal precoding vector to suppress the noise, the location should be normalized in each iteration.
When reaching the maximal iteration number , the algorithm stops, and is the obtained optimal precoding vector for user .
For an MU-MIMO system with users, the scheme searches for the precoding vector of each user according to the above steps.
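The steps above can be sketched for one user as follows. The fitness used here is a signal-to-leakage-plus-noise ratio, a stand-in chosen because it depends only on the user's own precoding vector; it is an assumption and not the fitness function (44) of this paper. The names `slnr` and `pso_precoder`, the PSO parameter values, and the channel dimensions are likewise illustrative.

```python
import numpy as np

def slnr(H_list, k, w, p=1.0, noise_var=1.0):
    """Stand-in fitness: signal power at user k over leakage to others plus noise."""
    signal = p * np.linalg.norm(H_list[k] @ w) ** 2
    leakage = sum(p * np.linalg.norm(H @ w) ** 2
                  for j, H in enumerate(H_list) if j != k)
    return signal / (noise_var + leakage)

def pso_precoder(H_list, k, n_particles=20, n_iter=30,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search a unit-norm precoding vector for user k with a basic PSO."""
    rng = np.random.default_rng(seed)
    n_tx = H_list[k].shape[1]
    # Every particle starts at the dominant right singular vector of H_k
    v1 = np.linalg.svd(H_list[k])[2][0].conj().astype(complex)
    x = np.tile(v1, (n_particles, 1))
    # Real and imaginary parts of the initial velocity are N(0, 1)
    vel = (rng.normal(size=(n_particles, n_tx))
           + 1j * rng.normal(size=(n_particles, n_tx)))
    pbest = x.copy()
    pbest_f = np.array([slnr(H_list, k, xi) for xi in x])
    gbest, gbest_f = v1.copy(), pbest_f[0]
    for _ in range(n_iter):
        r1 = rng.random((n_particles, n_tx))
        r2 = rng.random((n_particles, n_tx))
        vel = inertia * vel + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + vel
        x /= np.linalg.norm(x, axis=1, keepdims=True)   # renormalize each location
        f = np.array([slnr(H_list, k, xi) for xi in x])
        better = f > pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        if pbest_f.max() > gbest_f:
            g = int(np.argmax(pbest_f))
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    return gbest, float(gbest_f)

rng = np.random.default_rng(1)
# Illustrative setup: 4 users, 2 receive antennas each, 4 BS antennas
H_list = [(rng.normal(size=(2, 4)) + 1j * rng.normal(size=(2, 4))) / np.sqrt(2)
          for _ in range(4)]
w_opt, fit = pso_precoder(H_list, k=0)
baseline = slnr(H_list, 0, np.linalg.svd(H_list[0])[2][0].conj())
```

Because the search starts from the dominant singular vector and keeps the best location found so far, the returned fitness is never below that of the singular vector itself (`baseline`).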
6. Simulation Results
We simulated the proposed MU-MIMO scheme, the BD algorithm in [22] (coordinated Tx-Rx BD), and the channel inversion algorithm in [25] to compare their performance under the same simulation environment.
Figure 3 compares the cumulative distribution function (CDF) of the system capacity of the channel inversion algorithm with ZF and MMSE precoders and of the proposed MU-MIMO algorithm when , with equal power allocation and MMSE detection at the receiver. For the channel inversion method, the BS transmits 4 data streams to 2 users simultaneously, with 2 data streams for each user. For the proposed MU-MIMO scheme, the BS transmits 4 data streams to 4 users simultaneously, with 1 data stream for each user.
Figure 3. The system capacity CDF comparison of the two schemes.
Figure 4 compares the CDF of the system capacity of the coordinated Tx-Rx BD algorithm and of the proposed MU-MIMO algorithm when , with equal power allocation and MMSE detection at the receiver.
Figure 4. The system capacity CDF comparison of the two schemes.
Figure 5 compares the CDF of the system capacity of the coordinated Tx-Rx BD algorithm and of the proposed MU-MIMO algorithm when , with equal power allocation and MMSE detection at the receiver. All the simulation results of the proposed MU-MIMO scheme with the PSO algorithm in Figures 3 to 5 are based on PSO parameters with the particle number and the iteration number . It can be seen that the proposed MU-MIMO scheme effectively increases the system capacity compared with the BD algorithm and the channel inversion algorithm.
Figure 5. The system capacity CDF comparison of the two schemes.
Figure 6 shows the average BER performance of the proposed MU-MIMO scheme and the coordinated Tx-Rx BD algorithm with . Figure 7 shows the average BER performance of the proposed MU-MIMO scheme and the coordinated Tx-Rx BD algorithm with . Both schemes adopt equal power allocation, MMSE detection, QPSK, and no channel coding. The results of the proposed MU-MIMO scheme with the PSO algorithm in Figures 6 and 7 are based on PSO parameters with the particle number and the iteration number .
From the simulation results, it is clear that the proposed MU-MIMO linear precoding scheme with LMMSE detection based on particle swarm optimization outperforms the BD algorithm and the channel inversion algorithm. The reason is that the BD algorithm only aims to use normalized precoding vectors to cancel the CCI and suppress the noise. The channel inversion algorithm likewise aims to suppress the CCI and the noise. As a result, the users' transmit signal covariance matrices under these schemes are generally not optimal, and the precoding gain is inferior. The proposed MU-MIMO optimal linear precoding scheme aims to find the precoding vector that maximizes each user's SINR at its receiver in order to improve the total system capacity.
Figure 8 shows the BER performance of the proposed MU-MIMO scheme with the same particle number and different iteration numbers when . It adopts equal power allocation, MMSE detection, QPSK, and no channel coding. The particle number is 20, and the iteration number ranges from 5 to 30. When the iteration number is small, the proposed scheme does not reach its best performance. As the iteration number increases, both the performance and the computational complexity increase. However, once the iteration number exceeds 20 in this case, the algorithm obtains no further performance gain. In general, the best iteration number differs from case to case. The iteration number is related to the transmit antenna number at the BS and the transmit user number . As or increases, the iteration number should increase in order to obtain the best performance.
Figure 8. The BER comparison of the two schemes with different and
7. Conclusion
This paper solves the optimal linear precoding problem with LMMSE detection for the MU-MIMO system in downlink transmission. A simplified objective function is proposed and shown to maximize the system capacity. With the adoption of the particle swarm optimization algorithm, the optimal linear precoding vector with LMMSE detection can be searched for each user. The proposed scheme obtains a significant system capacity improvement compared with the multiuser MIMO scheme based on channel block diagonalization under the same simulation environment.
Acknowledgments
The project was supported by the National Natural Science Foundation of China (60702073) and the Key Laboratory of Universal Wireless Communications Lab. (Beijing University of Posts and Telecommunications), Ministry of Education, China.
References

[1] GJ Foschini, MJ Gans, On limits of wireless communications in a fading environment when using multiple antennas. Wireless Personal Communications 6(3), 311–335 (1998)
[2] A Goldsmith, SA Jafar, N Jindal, S Vishwanath, Capacity limits of MIMO channels. IEEE Journal on Selected Areas in Communications 21(5), 684–702 (2003)
[3] QH Spencer, CB Peel, AL Swindlehurst, M Haardt, An introduction to the multiuser MIMO downlink. IEEE Communications Magazine 42(10), 60–67 (2004)
[4] C Windpassinger, RFH Fischer, T Vencel, JB Huber, Precoding in multiantenna and multiuser communications. IEEE Transactions on Wireless Communications 3(4), 1305–1316 (2004)
[5] N Jindal, A Goldsmith, Dirty-paper coding versus TDMA for MIMO broadcast channels. IEEE Transactions on Information Theory 51(5), 1783–1794 (2005)
[6] P Viswanath, DNC Tse, Sum capacity of the vector Gaussian broadcast channel and uplink-downlink duality. IEEE Transactions on Information Theory 49(8), 1912–1921 (2003)
[7] S Vishwanath, N Jindal, A Goldsmith, Duality, achievable rates, and sum-rate capacity of Gaussian MIMO broadcast channels. IEEE Transactions on Information Theory 49(10), 2658–2668 (2003)
[8] EA Jorswieck, A Sezgin, Impact of spatial correlation on the performance of orthogonal space-time block codes. IEEE Communications Letters 8(1), 21–23 (2004)
[9] SN Diggavi, On achievable performance of spatial diversity fading channels. IEEE Transactions on Information Theory 47(1), 308–325 (2001)
[10] P Tarasak, H Minn, VK Bhargava, Linear prediction receiver for differential space-time block codes with spatial correlation. IEEE Communications Letters 7(11), 543–545 (2003)
[11] H Bölcskei, D Gesbert, AJ Paulraj, On the capacity of OFDM-based spatial multiplexing systems. IEEE Transactions on Communications 50(2), 225–234 (2002)
[12] MHM Costa, Writing on dirty paper. IEEE Transactions on Information Theory 29(3), 439–441 (1983)
[13] H Weingarten, Y Steinberg, S Shamai (Shitz), The capacity region of the Gaussian MIMO broadcast channel. Proceedings of IEEE International Symposium on Information Theory, June 2004, 174
[14] G Caire, S Shamai, On the achievable throughput of a multiantenna Gaussian broadcast channel. IEEE Transactions on Information Theory 49(7), 1691–1706 (2003)
[15] M Tomlinson, New automatic equaliser employing modulo arithmetic. Electronics Letters 7(5-6), 138–139 (1971)
[16] H Miyakawa, H Harashima, Matched-transmission technique for channels with intersymbol interference. IEEE Transactions on Communications 20, 774–779 (1972)
[17] V Stankovic, M Haardt, Successive optimization Tomlinson-Harashima precoding (SO THP) for multiuser MIMO systems. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '05), March 2005 3, 1117–1120
[18] W Miao, L Xiao, Y Li, S Zhou, J Wang, Joint stream-wise THP transceiver design for the multiuser MIMO downlink. Proceedings of IEEE Wireless Communications & Networking Conference (WCNC '08), 2008, 330–334
[19] K Takeda, H Tomeba, F Adachi, BER performance of joint THP/pre-DFE. Proceedings of IEEE Vehicular Technology Conference (VTC '08), 2008, 1016–1020
[20] PL Athanasios, Tomlinson-Harashima precoding with partial channel knowledge. IEEE Transactions on Communications 53(1), 5–9 (2005)
[21] T Haustein, C von Helmolt, E Jorswieck, V Jungnickel, V Pohl, Performance of MIMO systems with channel inversion. Proceedings of the 55th IEEE Vehicular Technology Conference (VTC '02), 2002 1, 35–39
[22] QH Spencer, AL Swindlehurst, M Haardt, Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels. IEEE Transactions on Signal Processing 52(2), 461–471 (2004)
[23] LU Choi, RD Murch, A transmit preprocessing technique for multiuser MIMO systems using a decomposition approach. IEEE Transactions on Wireless Communications 3(1), 20–24 (2004)
[24] KK Wong, RD Murch, KB Letaief, A joint-channel diagonalization for multiuser MIMO antenna systems. IEEE Transactions on Wireless Communications 2(4), 773–786 (2003)
[25] CB Peel, BM Hochwald, AL Swindlehurst, A vector-perturbation technique for near-capacity multiantenna multiuser communication, part I: channel inversion and regularization. IEEE Transactions on Communications 53(1), 195–202 (2005)
[26] M Stojnic, H Vikalo, B Hassibi, Rate maximization in multiantenna broadcast channels with linear preprocessing. Proceedings of IEEE Global Telecommunications Conference (GLOBECOM '04), November 2004 6, 3957–3961
[27] J Kennedy, RC Eberhart, Particle swarm optimization. Proceedings of IEEE International Conference on Neural Networks, 1995 5, 1942–1948
[28] J Robinson, Y Rahmat-Samii, Particle swarm optimization in electromagnetics. IEEE Transactions on Antennas and Propagation 52(2), 397–407 (2004)
Appendix A
Coordinated Tx-Rx BD Algorithm
The coordinated Tx-Rx BD algorithm is an improved BD algorithm. It removes the antenna constraint of the traditional BD algorithm and extends the BD algorithm to arbitrary antenna configurations. For a coordinated Tx-Rx BD algorithm with transmit antennas at the BS, receive antennas at the MS, and users to be served simultaneously, the algorithm follows 6 steps.
For , compute the SVD
Determine which is the number of subchannels for each user. In order to compare the two schemes fairly, for each user.
For , let be the first columns of . Calculate .
For , compute the right null space of as
where holds the first right singular vectors, holds the last right singular vectors and .
Compute the SVD
The precoding matrix for the transmit users with average power allocation is
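The steps above can be sketched as follows. This is a minimal illustration, assuming `m` subchannels per user (one by default), receive filters taken as the first `m` left singular vectors of each user's channel, and a final SVD of each user's channel projected onto the null space; power allocation is omitted. The function name `coordinated_bd` and the channel dimensions are hypothetical.

```python
import numpy as np

def coordinated_bd(H_list, m=1):
    """Return (precoders, receive_filters) for a coordinated Tx-Rx BD sketch."""
    # SVD of each channel; keep the first m left singular vectors as receive filter
    U_rx = [np.linalg.svd(H)[0][:, :m] for H in H_list]
    # Effective m x n_tx channels seen through the receive filters
    H_eff = [U.conj().T @ H for U, H in zip(U_rx, H_list)]
    precoders = []
    for j, H_j in enumerate(H_list):
        # Stack the effective channels of all other users
        H_tilde = np.vstack([H_eff[i] for i in range(len(H_list)) if i != j])
        rank = np.linalg.matrix_rank(H_tilde)
        # Right null space of H_tilde: the last n_tx - rank right singular vectors
        V0 = np.linalg.svd(H_tilde)[2][rank:].conj().T
        # Dominant right singular vectors of the channel projected on the null space
        V1 = np.linalg.svd(H_j @ V0)[2][:m].conj().T
        precoders.append(V0 @ V1)                       # n_tx x m, unit-norm columns
    return precoders, U_rx

rng = np.random.default_rng(1)
# Illustrative setup: 4 users, 2 receive antennas each, 4 BS antennas
H_list = [(rng.normal(size=(2, 4)) + 1j * rng.normal(size=(2, 4))) / np.sqrt(2)
          for _ in range(4)]
precoders, rx = coordinated_bd(H_list)

# Largest interference seen by any user after its receive filter
leak = max(np.linalg.norm(rx[i].conj().T @ H_list[i] @ precoders[j])
           for i in range(4) for j in range(4) if i != j)
```

In this sketch the CCI is zero only after the receive filter is applied; the signal at the individual receive antennas still contains the other users' streams.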