Negli ultimi anni abbiamo assistito alla continua comparsa di applicazioni distribuite, la cui implementabilita' risulta consentita dalla possibilita' di avere a disposizione sensori piccoli ed economici. I recenti progressi tecnologici nel settore micro-elettronico-meccanico hanno infatti consentito una miniaturizzazione dei nodi sensore. La comunita' scientifica si e' oramai abituata alla possibilita', con una spesa minima, di collocare piccoli dispositivi intelligenti lungo un'area specifica. Sensori economici e a basso consumo, una volta muniti dell'hardware necessario per le telecomunicazioni, risultano ideali per l'utilizzo in reti senza infrastruttura, uno scenario in cui spicca l'assenza di un nodo centrale e la robustezza diviene quindi una proprieta' fondamentale.

In lingua inglese

Articoli tecnico scientifici o articoli contenenti case history

Pubblicato

Estratto del testo

Sede amministrativa: UNIVERSIT' DEGLI STUDI DI PADOVA Dipartimento di Ingegneria dell''Informazione Scuola di Dottorato di Ricerca in Ingegneria dell''Informazione Indirizzo: Scienza e Tecnologia dell''Informazione Ciclo XXIV Tesi di Dottorato di Ricerca Clock Synchronization in Wireless Sensor Networks: Statistical and Algorithmic Analysis Direttore della Scuola: Ch.mo Prof. Matteo Bertocco Supervisore: Ch.mo Prof. Lorenzo Vangelista Dottorando: Davide Zennaro A Mamma e Papà - Davide Zennaro - Table of Contents Acknowledgements xi Abstract xiii Sommario xv List of Acronyms and Abbreviations xvii Introduction 1 1 General Concepts on Clock Synchronization 5 1.1 The Notion of Time in a Node . . . . . . . . . . . . . . . . . . 5 1.2 Clock Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.1 Clock Parameters . . . . . . . . . . . . . . . . . . . . . 7 1.2.2 The Adopted Clock Model . . . . . . . . . . . . . . . . 8 1.3 Approaches to Clock Synchronization . . . . . . . . . . . . . . 10

1.3.1 Message-Based vs Pulse-Coupled Algorithms . . . . . . 10 1.3.2 Coupled Clocks vs Uncoupled Clocks . . . . . . . . . . 11 1.3.3 Synchronization Via Parameter Estimation . . . . . . . 12 1.3.4 Synchronization Via Clock Coupling . . . . . . . . . . 12 I Clock Synchronization in WSNs via the Two-Way Message Exchange 13 2 The Two-Way Message Exchange 15 2.1 The Two-Way Data Exchange . . . . . . . . . . . . . . . . . . 15 2.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.2.1 Fixed and Random Portions of Link Delay . . . . . . . 17 2.3 State of The Art . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.3.1 Relevant Synchronization Algorithms . . . . . . . . . . 20 i ii Table of Contents 3 Clock Offset Estimation: A Unified Framework 25 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 3.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.3 A New General Framework . . . . . . . . . . . . . . . . . . . . 27 3.4 Maximum Likelihood Estimation . . . . . . . . . . . . . . . . 31

3.4.1 Unconstrained Likelihood . . . . . . . . . . . . . . . . 31 3.4.2 Constrained Likelihood . . . . . . . . . . . . . . . . . . 33 3.5 Statistical Bounds . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.5.1 Cramer-Rao Lower Bound . . . . . . . . . . . . . . . . 35 3.5.2 Chapman-Robbins Bound . . . . . . . . . . . . . . . . 36 3.5.3 MSE and Variance of the Estimation Error . . . . . . . 36 3.6 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.6.1 MSE Performance and Bounds . . . . . . . . . . . . . . 38 3.6.2 MSE Robustness with Log-Normally Distributed Like-

lihood . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 3.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 3.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.A.1 Proof of Theorem 3.2 . . . . . . . . . . . . . . . . . . . 42

3.A.2 Proof of Theorem 3.4 . . . . . . . . . . . . . . . . . . . 43

3.A.3 Proof of Theorem 3.5 . . . . . . . . . . . . . . . . . . . 43

3.A.4 Proof of Theorem 3.7 . . . . . . . . . . . . . . . . . . . 44

3.A.5 MSE Expressions for ML Estimators . . . . . . . . . . 45 4 A Factor Graph-Based Clock Offset Estimator 47 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 4.2 Factor Graphs and Message Passing . . . . . . . . . . . . . . . 48 4.3 A Factor Graph-based Estimator . . . . . . . . . . . . . . . . 49

4.3.1 Message Computation . . . . . . . . . . . . . . . . . . 50 4.3.2 Closed-Form Expressions For ' θ (o) ML(K ) . . . . . . . . . . 56 4.4 Bayesian Statistical Bounds . . . . . . . . . . . . . . . . . . . 62

4.4.1 Bayesian Cramér-Rao Lower Bound . . . . . . . . . . . 63 4.4.2 Bayesian Chapman-Robbins Bound . . . . . . . . . . . 64 4.4.3 MSE and Variance of the Estimation Error . . . . . . . 65 4.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.5.1 MSE Performance and Bounds . . . . . . . . . . . . . . 66 4.5.2 MSE Robustness with Log-Normally Distributed Like-

lihood . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.5.3 Classical vs Bayesian framework . . . . . . . . . . . . . 68 4.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 4.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.A.1 Proof of Theorem 4.4 . . . . . . . . . . . . . . . . . . . 71 Table of Contents iii II Fast Consensus in WSNs and its Application to Clock Synchronization 73 5 Consensus and Clock Synchronization 75 5.1 Distributed Consensus in WSNs . . . . . . . . . . . . . . . . . 76 5.2 Consensus System Model . . . . . . . . . . . . . . . . . . . . . 77 5.3 Requirements for Consensus Matrix . . . . . . . . . . . . . . . 79

5.3.1 Average Consensus . . . . . . . . . . . . . . . . . . . . 80 5.3.2 Popular Average Consensus Matrices . . . . . . . . . . 80 5.4 Application of Consensus to Clock Synchronization . . . . . . 81

5.4.1 System Model . . . . . . . . . . . . . . . . . . . . . . . 82 5.4.2 Relevant Works . . . . . . . . . . . . . . . . . . . . . . 82 5.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 5.A.1 Proof of Theorem 5.1 . . . . . . . . . . . . . . . . . . . 84 6 Fast Consensus via ADMM 87 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 6.2 ADMM-based Average Consensus . . . . . . . . . . . . . . . . 88 6.3 ADMM-based Consensus in Vector Form . . . . . . . . . . . . 90

6.3.1 Assumption on the augmentation constants . . . . . . 93 6.4 ADMM Performance Analysis . . . . . . . . . . . . . . . . . . 96

6.4.1 Noiseless communications . . . . . . . . . . . . . . . . 97 6.4.2 Encompassing for communication noise . . . . . . . . . 97 6.4.3 Noise resilience characterization . . . . . . . . . . . . . 100 6.4.4 Initial states choice . . . . . . . . . . . . . . . . . . . . 102 6.5 Optimization Issues . . . . . . . . . . . . . . . . . . . . . . . . 103

6.5.1 Convergence speed . . . . . . . . . . . . . . . . . . . . 103 6.5.2 Resilience to noise . . . . . . . . . . . . . . . . . . . . 106 6.5.3 Results for highly structured networks . . . . . . . . . 108 6.6 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 109 6.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 6.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 6.A.1 Proof of Methods A and B . . . . . . . . . . . . . . . . 114

6.A.2 Proof of Results of 6.3 . . . . . . . . . . . . . . . . . . 116

6.A.3 Proof of Theorems of 6.4 . . . . . . . . . . . . . . . . . 118

6.A.4 Proof of Results of 6.5 . . . . . . . . . . . . . . . . . . 120 7 An ADMM-Based Clock Synchronization Algorithm 123 7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 7.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . 124 7.3 ADMM-based Consensus on Clock Periods . . . . . . . . . . . 124 iv Table of Contents 7.3.1 Algorithm implementation . . . . . . . . . . . . . . . . 125 7.3.2 Noise resilience . . . . . . . . . . . . . . . . . . . . . . 126 7.4 Consensus on Clock Counters . . . . . . . . . . . . . . . . . . 127 7.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 128

7.5.1 Convergence Rate and Noise Resilience . . . . . . . . . 128 7.5.2 Performance Comparison with Standard Consensus . . 130 7.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 A Efficient Base Station Selection in Uplink Cellular Networks133 A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

A.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

A.3 Outage Probability Expressions . . . . . . . . . . . . . . . . . 138 A.3.1 Local Decoding with ARQ (LD-ARQ) . . . . . . . . . 141

A.3.2 Local Decoding with HARQ-CC (LD-HARQ-CC) . . . 141

A.3.3 Joint Decoding with HARQ-CC (JD-HARQ-CC) . . . 142

A.3.4 Local Decoding with HARQ-IR (LD-HARQ-IR) . . . . 143

A.3.5 Joint Decoding with HARQ-IR (JD-HARQ-IR) . . . . 144 A.4 Base Station Selection . . . . . . . . . . . . . . . . . . . . . . 144 A.4.1 The Recursive Search (RS) . . . . . . . . . . . . . . . . 144

A.4.2 A closed-form expression for LD-ARQ . . . . . . . . . 146 A.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 147 A.5.1 Outage Probability in the Sector . . . . . . . . . . . . 148

A.5.2 Coverage improvement . . . . . . . . . . . . . . . . . . 148

A.5.3 Randomly Dropped Users . . . . . . . . . . . . . . . . 151

A.5.4 LD-ARQ and 'm = ' . . . . . . . . . . . . . . . . . . . 157 A.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

A.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 A.A.1 Gaussian Approximation for JD-HARQ-IR . . . . . . . 159

A.A.2 Closed-form Expression of Probability (A.10) and ARQ 160

A.A.3 Proof of Theorem A.1 . . . . . . . . . . . . . . . . . . 160

A.A.4 Closed-form solution of (A.3) for 'm = '. . . . . . . . 161 B List of Publications 163 Bibliography 165 List of Figures 1.1 Examples of clocks with different offsets and no skews (a) and

with different skews but no offsets (b). . . . . . . . . . . . . . 9 2.1 Two-way message exchange between node S and R. . . . . . . 16 3.1 K rounds two-way message exchange between node S and R

with clock offset only between them. . . . . . . . . . . . . . . 26 3.2 MSE and bounds for estimating θ(o) by using the MLE with

Gaussian and exponentially distributed likelihood. . . . . . . . 39 3.3 MSE robustness (in terms of the likelihood functions) and

bounds for the estimation of θ(o) in case of log-normally dis-

tributed likelihood. . . . . . . . . . . . . . . . . . . . . . . . . 40 4.1 Factor graph representation of the density (4.5). . . . . . . . . 51 4.2 The factor 'K sends the message m' K ''ξ(K) to the variable ξ(K). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.3 The variable ξ(K) sends the message mξ(K)''ζK K''1 to the factor ζK K ''1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.4 The factor ζK K ''1 sends the message mζ K K''1 ''ξ(K ''1) to the vari- able ξ(K '' 1). . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.5 The variable ξ(K '' 1) sends the message mξ(K''1)''ζK''1 K''2 to the factor ζ K ''1 K ''2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.6 The factor ζK''1 K ''2 sends the message mζ K''1 K''2 ''ξ(K ''2) to the vari- able ξ(K '' 2). . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 4.7 MSE and bounds for estimating θ(o)(K) when using the pro-

posed FGOE with Gaussian and exponentially distributed like-

lihood. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.8 MSE and bounds for the estimation of θ(o)(K) in case of log-

normal likelihood. . . . . . . . . . . . . . . . . . . . . . . . . . 68 v vi Table of Contents 4.9 MSE in the estimation of θ(o)(K) vs ' using FGOEs. The

MSE in the Bayesian case approaches the curves of the MLE

as ' '' 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 5.1 Example of an undirected and strongly connected graph. . . . 78 5.2 Adjacency matrix relative to the network whose graph is rep-

resented in Fig. 5.1. . . . . . . . . . . . . . . . . . . . . . . . . 78 6.1 Eigenvalues λi of matrix M as functions of 'i for ǫ ' 1 (a) and ǫ ' 1 (b). Solid lines for odd i and dashed lines for even i. 94 6.2 Convergence speed maximization choices, with initial state

(6.46) and 'N = 1

2 . . . . . . . . . . . . . . . . . . . . . . . . . 104 6.3 Spectral radius of ADMM and of other state-of-the-art consen-

sus algorithm'' Boyd''s optimal matrix and Oreshkin''s method. 105 6.4 Convergence speed '' 1

2 log ξA dependent on ǫ and S when the underlying network graph is a mesh grid where each node has

4 neighbors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 6.5 Method B: Noise values KiUi as functions of d for different

values of 'i. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 6.6 Noise resilience in line graph (left) and mesh grid (right). . . . 109 6.7 MSE with a RGG of N = 100 nodes, Nneigh = 8, and '2q =

10''6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 6.8 ADMM Method B with fixed ǫ in a RGG with Nneigh = 8:

(a) convergence speed; (b) MSE at steady state; (c) optimal

choices of ǫ used. . . . . . . . . . . . . . . . . . . . . . . . . . 111 6.9 MSE with an underlying RGG: (a) as a function of t for N =

50 nodes, and (b) as a function of N for t = 300. Nneigh = 4,

'2 q = 10 ''6, and ǫ optimized for convergence speed. . . . . . . 112 7.1 Network communication graph. . . . . . . . . . . . . . . . . . 129 7.2 Synchronization phase error Ti(t) for five nodes. . . . . . . . . 130 7.3 MSE for clock phases and clock periods with ADMM+AC. . . 131 7.4 MSE of ADMM+AC compared to standard and optimized

consensus approaches. . . . . . . . . . . . . . . . . . . . . . . 131 A.1 The MT (red cross) is communicating with the M = 3 closest BSs (blu squares) in a cellular scenario. . . . . . . . . . . . . . 136 A.2 Final outage probability pN+1 as a function of the MT position in a sector with LD-HARQ-IR and in the absence of interfer-

ence (q = 0). The serving BS is located at coordinates (0,0)

in the plot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 Table of Contents vii A.3 Final outage probability pN+1 as a function of the MT position in a sector with LD-ARQ and in the presence of interference

(q = 1). The serving BS is located at coordinates (0,0) in the

plot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 A.4 Minimum value of the SNR at the unitary distance vs the spectral efficiency of the first transmission for various decod-

ing schemes. Solid line: HARQ-IR; dashed line: HARQ-CC;

dotted line: ARQ. . . . . . . . . . . . . . . . . . . . . . . . . . 150 A.5 Maximum allowed interference probability qmax vs the spec- tral efficiency of the first transmission for various decoding

schemes. Solid line: HARQ-IR; dashed line: HARQ-CC; dot-

ted line: ARQ. . . . . . . . . . . . . . . . . . . . . . . . . . . 152 A.6 Total average number of active BSs as a function of q for HARQ-IR. Solid line: JD; dashed line: LD. . . . . . . . . . . . 153 A.7 Total average number of active BSs as a function of q for HARQ-CC. Solid line: JD; dashed line: LD. . . . . . . . . . . 154 A.8 Total average number of active BSs as a function of q for ARQ.155

A.9 Total average of number backhaul transmissions as a function of q for JD-HARQ. Solid line: HARQ-IR; dashed line: HARQ-

CC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 A.10 Average payload throughput as a function of q for HARQ-IR. Solid line: JD; dashed line: LD. . . . . . . . . . . . . . . . . . 156 A.11 Average payload throughput as a function of q for HARQ-CC. Solid line: JD; dashed line: LD. . . . . . . . . . . . . . . . . . 157 A.12 Total number of active BSs when the MT is positioned at the edge of the cell with ARQ. . . . . . . . . . . . . . . . . . . . . 158 viii Table of Contents List of Tables 3.1 Specialization of the proposed new general framework. . . . . 30 4.1 Specialization of the proposed new general framework for time-

varying parameters. . . . . . . . . . . . . . . . . . . . . . . . . 51 6.1 Convergence speed results for line graph and mesh grid. . . . . 109 A.1 Combination of decoding policies and retransmission methods. 139

A.2 Recursive Search Algorithm. . . . . . . . . . . . . . . . . . . . 146 ix x Table of Contents Acknowledgements Padova, Gennaio 2012 Giunto al termine del percorso triennale di Dottorato, desidero ringraziare le persone che mi sono state vicine, certo di cadere nella disgrazia di dimen-

ticare qualcuno. Devo la maggior parte di quello che ho ottenuto al supporto costante ed incondizionato di Mamma Palmira e Papà Armando, che fin dalle prime de-

cisioni riguardo al mio futuro si sono sempre comportati da genitori modello,

lasciandomi piena libertà nelle mie scelte personali, scolastiche e professionali,

ma garantendomi allo stesso tempo tutto il supporto di cui avevo bisogno. Esempi, forse inconsapevoli, per la mia vita sono stati (e sono tutt''ora) i miei fratelli, Tiziana e Umberto e rispettive famiglie, che con la loro per-

severanza e coraggio mi permettono di godere di due riferimenti e pietre di

paragone del tutto invidiabili. Ringrazio Prof. Lorenzo Vangelista, per avermi offerto l''opportunità di intraprendere il percorso di Dottorato e per aver sempre creduto nelle mie

qualità e capacità, professionali e personali. Meritevoli di ringraziamenti sono sicuramente tutti i docenti e i colleghi del Dipartimento di Ingegneria dell''Informazione dell''Università di Padova

con cui ho avuto il piacere di lavorare assieme. Uno speciale ringraziamento

va a tutti i compagni di ufficio che si sono alternati durante questi tre anni. Un infinito Grazie ad un grande Amico, Marco Massari, per esserci sempre stato, in ogni momento. Un affettuoso ringraziamento anche a Tommaso e

Martina, per avermi sopportato in questi tre anni, e a Chiara per la costante

vicinanza. Un Grazie agli amici di Pellestrina, a tutti quanti, ma in particolare modo ad Anna per essere sempre stata fonte di saggi consigli. Un enorme Grazie

a Silvia, perché le discussioni via chat ad orari per lei improponibili non

rimangano solamente un vago ricordo. Un forte ringraziamento a Giò, Ermy, Elisa, Sbalky, Matteo e a tutti gli altri amici della combriccola padovana per avermi fatto sentire a tempo di

record parte di un gruppo di amici eccezionale. Un sincero ed umile Grazie xi xii Acknowledgements va in particolare ad Elisa, Ermy e Giò per i fondamentali confronti personali,

a volte accesi ma sempre rispettosi. I would like to thank Prof. Erchin Serpedin for having given me the chance to come to Texas A&M University in 2011 to do research work to-

gether with him and Aitzaz Ahmad. A special Thank You also to Sabit Ekin

and Ali Riza Ekti, for their fundamental help before and during my first days

in College Station. A huge Thank You to Patricia (''oh yeah!'), Alberto (''eja'), Darko (''yeah yeah'), Pilar and Cesar for all the gorgeous moments we spent together in

College Station and in Northgate in particular. A special Thank You to

Tasha and Sarah for having tried desperately to teach me how to dance

country music. Texas time will definitely hold a special place in my heart. Infine, vorrei ringraziare tutte quelle persone (colleghi e non) che, in un modo o nell''altro, hanno condiviso con me esperienze fondamentali della mia

vita, permettendomi di diventare quello che sono oggi. Davide Abstract In the past few years, the impressive growth of applications performing tasks

in a distributed fashion has been enabled by the availability of tiny, inex-

pensive devices which, in turn, has been made possible by the recent micro-

electromechanical technology advancements. Sparsely disposing small intel-

ligent appliances throughout a specific area is something that the community

has become used to. Low cost and low power sensing devices equipped with

telecommunication hardware are attractive for use in an infrastructure-less

network in which the absence of a central node stands out, making robustness

be one of the strengths of this kind of networks. Environmental monitoring

and military surveillance are just a few examples of the number of appli-

cations suitable for sensor networks; in fact, home automation and several

health services also can be implemented given that a distributed network of

sensors exists. Sensors need to keep track of a common time scale. This is fundamental for prolonging the network lifetime, making channel access schemes work

properly, for example, or for allowing precise duty cycling among the nodes.

Clock synchronization is also basic if the goal of a running application is

to track moving objects in the battlefield or, more generally, to perform

distributed processing of the sensed data. Since the local notion of time

in a sensor is based on a low quality local oscillator, it turns out that even

small changes in the environmental conditions, like temperature and pressure,

lead to modification in the oscillation frequency of the quartz crystals, thus

producing time discrepancies among different sensor nodes as time goes by. This thesis tackles the problem of clock synchronization in sensor net- works, both from a perspective of clock parameters estimation and from an

algorithmic point of view, to pursue the final goal of making nodes agree on

a common time scale, all across the network. In the first part of the thesis, the so-called two-way message exchange between two nodes is thoroughly analyzed. After recalling existing results

on clock parameters estimation exploiting data collected via this message

exchange process on the wireless channel, an innovative mathematical frame- xiii xiv Abstract work is introduced, which encompasses several common assumptions for the

random delays present in the collected data, in a more general treatment.

Based on this new framework, a factor graph-based clock offset estimator for

wireless sensor networks is proposed and evaluated. Comparison of the vari-

ance of the estimation error with classical bounds available in the literature

shows that the new estimator has extremely good performance, therefore it

can be considered outstanding among Bayesian clock offset estimators. The focus of the second part of the thesis is on the design of distributed consensus algorithms in wireless sensor networks, especially for observations

averaging purposes. In fact, an innovative fast consensus algorithm is pro-

posed and evaluated, based on the alternating direction multipliers method,

which is a distributed method used to solve minimization problems in an

iterative fashion. The new consensus algorithm is compared with the state-

of-the-art of fast consensus, showing an excellent convergence rate and an

outstanding noise resilience. The proposed algorithm is then applied to solve

a network-wide clock synchronization issue, assuming both clock skew and

offset for the nodes in the network, showing a relevant performance improve-

ment with respect to previously proposed consensus-based synchronization

schemes. Finally, the Appendix contains a work whose topic falls out of the main stream of this thesis: in uplink cellular networks, based on the knowledge

of channel statistics, surrounding base stations are carefully and iteratively

chosen in order to provide the mobile terminal a certain quality of service in

terms of the maximum allowed outage probability, with the aim of minimizing

the overall backhaul network usage. Sommario Negli ultimi anni abbiamo assistito alla continua comparsa di applicazioni dis-

tribuite, la cui implementabilità risulta consentita dalla possibilità di avere

a disposizione sensori piccoli ed economici. I recenti progressi tecnologici nel

settore micro-elettronico-meccanico hanno infatti consentito una miniaturiz-

zazione dei nodi sensore. La comunità scientifica si è oramai abituata alla

possibilità, con una spesa minima, di collocare piccoli dispositivi intelligenti

lungo un''area specifica. Sensori economici e a basso consumo, una volta

muniti dell''hardware necessario per le telecomunicazioni, risultano ideali per

l''utilizzo in reti senza infrastruttura, uno scenario in cui spicca l''assenza di

un nodo centrale e la robustezza diviene quindi una proprietà fondamentale.

Monitoraggio ambientale e sorveglianza militare sono solamente un paio di es-

empi di applicazioni adatte a reti di sensori, così come la domotica e l''ambito

sanitario risultano scenari in cui l''uso di una rete distribuita di sensori può

rivelarsi, in effetti, utile e vantaggiosa. I sensori necessitano di una base temporale comune. Questo bisogno risulta fondamentale al fine di prolungare il tempo di vita di una rete, ot-

timizzando schemi di accesso deterministico al mezzo, ad esempio, oppure

schedulando i periodi di attività dei nodi in maniera precisa. La sincroniz-

zazione risulta fondamentale anche in applicazioni legate alla localizzazione,

o più genericamente, per permettere l''elaborazione distribuita di dati raccolti

dai sensori stessi. Dal momento che la nozione di tempo locale in un sen-

sore è fornita da un oscillatore di bassa qualità, anche minime perturbazioni

delle condizioni ambientali (come temperatura e pressione) si riflettono in

modifiche nella frequenza di oscillazione del cristallo di quarzo, producendo

discrepanze nel comportamento tra oscillatori in diversi sensori, che diven-

tano non trascurabili man mano che il tempo scorre. Questa tesi affronta il problema della sincronizzazione di clock in reti di sensori, sia da una prospettiva di stima dei parametri di clock, sia da un punto

di vista algoritmico lungo tutta la rete, con l''obiettivo finale di permettere ai

nodi interessati di trovare una concordanza su una scala temporale comune. Nella prima parte di questa tesi viene analizzato il processo di scambio di xv xvi Sommario informazioni tra due nodi chiamato two-way message exchange. Dopo aver

richiamato la letteratura esistente sulla stima dei parametri del clock utiliz-

zando questo protocollo di scambio dati attravero il canale wireless, viene

introdotto un nuovo framework matematico per permettere un''assunzione

più generale riguardo i ritardi casuali presenti nei dati raccolti. Basandosi

su questo framework, viene proposto e studiato un nuovo stimatore del clock

offset basato sulla teoria dei factor graphs. Dal confronto della varianza

dell''errore di stima con classici limiti inferiori presenti in letteratura risulta

che il nuovo stimatore proposto permette degli ottimi risultati, per cui può a

pieno titolo essere considerato meritevole di menzione nella teoria della stima

Bayesiana applicata al clock offset. La seconda parte della tesi riguarda invece la progettazione di algoritmi di consensus distribuiti per reti di sensori wireless, in special modo per oper-

azioni di averaging svolte in maniera distribuita. Viene proposto e valutato

un nuovo algoritmo di consensus velocizzato basato su alternating direction multipliers method, un metodo distribuito per risolvere problemi di minimiz-

zazione in modo iterativo. Il nuovo algoritmo di consensus viene confrontato

con lo stato dell''arte del consensus velocizzato, mostrando un''eccellente ve-

locità di convergenza e una resistenza al rumore migliore rispetto agli al-

tri algoritmi presenti in letteratura. Lo schema proposto viene poi appli-

cato al problema della sincronizzazione di clock in reti di sensori wireless,

assumendo presenza di clock skew e clock offset tra i vari oscillatori della

rete. L''algoritmo di sincronizzazione risultante consente un rilevante miglio-

ramento delle prestazioni rispetto a schemi di sincronizzazione basati su con-

sensus proposti in precedenza. Infine, nell''Appendice viene descritto un lavoro il cui argomento si dis- costa da quello principale della tesi: in reti cellulari in uplink, in base alla

statistica del canale le stazioni base cooperanti vengono selezionate tramite

l''utilizzo di tecniche iterative con l''obiettivo di garantire al terminale mobile

una certa qualità del servizio in termini di probabiltà di disservizio massima

permessa e allo stesso tempo di minimizzare l''utilizzo della rete di backhaul. List of Acronyms and

Abbreviations 3GPP: Third Generation Partnership Project AC: Average Consensus

ACK: Acknowledgement

ADMM: Alternating Direction Multipliers Method

ARQ: Automatic Repeat Request

AS: Analytical Solution

AWGN: Additive White Gaussian Noise BCHRB: Bayesian Chapman-Robbins Bound

BCRB: Bayesian Cramér-Rao lower Bound

BLUE-OS: Best Linear Unbiased Estimator using Order Statistics

BO: Boyd optimum solution

BS: Base Station

BWC: Broadband Wireless Communication cdf: Cumulative Distribution Function

CHRB: Chapman-Robbins Bound

CRB: Cramér-Rao lower Bound

CSMA: Carrier Sense Multiple Access DLL: Delay-Locked Loop

DPLL: Digital Phase-Locked Loop ES: Exhaustive Search FC: Flexible Cooperation

FDD: Frequency Division Duplexing

FGOE: Factor Graph-based clock offset Estimator

FTSP: Flooding Time Synchronization Protocol xvii xviii List of Acronyms and Abbreviations Ga: Gaussian approximation

GMKPF: Gaussian Mixture Kalman Particle Filter HARQ: Hybrid Automatic Repeat Request

HARQ-CC: Hybrid Automatic Repeat Request with Chase Combining

HARQ-IR: Hybrid Automatic Repeat Request with Incremental Redun-

dancy IGMKPF: Iterative Gaussian Mixture Kalman Particle Filter JD: Joint Decoding LA: Laplacian

LD: Local Decoding

LTE: Long Term Evolution MAC: Medium Access Control

MCP: Multi-Cell Processing

MGF: Moment Generating Function

MH: Metropolis-Hastings

MIMO: Multiple-Input Multiple-Output

ML: Maximum Likelihood

MLE: Maximum Likelihood Estimate

MMSE: Minimum Mean Squared Error

MnLD: Minimum Link Delay Algorithm

MSE: Mean Squared Error

MT: Mobile Terminal

MVUE: Minimum Variance Unbiased Estimator NACK: Not Acknowledgement

NC: No Cooperation

NIC: Network Interface Card

NTP: Network Time Protocol OSI: Open Systems Interconnection PBS: Pairwise Broadcast Synchronization

pdf: Probability Distribution Function

PLL: Phase-Locked Loop

ppm: Parts Per Million QoS: Quality of Service List of Acronyms and Abbreviations xix RBS: Reference Broadcast Synchronization

RF: Radio Frequency

RGG: Random Geometric Graph

ROS: Receiver-Only Synchronization

RRS: Receiver-Receiver Synchronization

RS: Recursive Search SC: Static Cooperation

SINR: Signal to Interference plus Noise Ratio

SNR: Signal to Noise Ratio

SRS: Sender-Receiver Synchronization TDMA: Time Division Multiple Access

TOA: Time Of Arrival

TPSN: Timing-sync Protocol for Sensor Networks UMVUE: Uniformly Minimum Variance Unbiased Estimator

UWB: Ultra-Wide Band VCO: Voltage Controlled Oscillator WSN: Wireless Sensor Network xx List of Acronyms and Abbreviations Introduction In the last few years, the evolution in micro-electronics and micro-electro-

mechanical systems technology has lead to the design of tiny devices able to

perform tasks in an independent manner and capable of reliable communica-

tion with other agents [1]. Thanks to the miniaturization achieved in design-

ing electronic and mechanical components, these devices can sense the sur-

rounding environment through on board sensors and can actuate some tasks

since they can perform signal processing as they are equipped with micro-

controllers. The key feature that makes them interesting from a telecommu-

nications viewpoint is that a group of them placed relatively close to each

other can constitute a network, using the available on-board network inter-

face card which allows them to access the wireless channel. These intelligent

tiny devices therefore form a wireless sensor network (WSN) through which

environmental data is distributively processed and potentially transmitted

elsewhere for further uses and for taking decisions. Recently, many researchers have been focusing on distributed computing thanks to the new focus on distributed networking. Scenarios like WSNs as

well as ad-hoc networks [2] and vehicular networks [3] are strongly investi-

gated in these years. In these networks, distributed computing is becoming

more and more important, since the single agents are generally entities with

poor computational potential and low energy. All nodes are supposed to be

equally poor from the computation capabilities viewpoint, therefore there is

a strong need to avoid centralized approaches in which a node has to elabo-

rate data coming from other nodes, thus encouraging distributed algorithms.

In other words, in these scenarios the assumption of a central sink node is

often unrealistic. The advent of the WSNs makes possible to implement specific distributed applications which could not be performed in an automated fashion before.

In fact, having sensors connected through a wireless channel permits a non-

invasive but efficient form of environmental monitoring or surveillance, if

nodes are equipped with the required sensors or video cameras. Temperature,

pressure and soil make-up are just a few examples of monitorable quantities, 1 2 Introduction as well as vehicles and persons are actual objects whose movements may

deserve surveillance. Seismic movements are also often controlled by WSNs

[4]. In the last decades, cellular communications have monopolized the at- tention of the researchers, and synchronization in such networks is generally

achieved via a master-slave approach between the mobile terminal and the

base station. On the other hand, in the Internet the Network Time Pro-

tocol (NTP) [5], which makes use of a network of time servers, is used for

clock synchronization among users. Clock synchronization between agents

in a WSN is instead a tougher issue, mainly due to the fact that such a

network is generally infrastructure-less, therefore neither NTP is well suited

nor hierarchical synchronization can always be effectively implemented. The communication capabilities of sensors allow the whole network to distributively compute functions of the gathered data as well as cost func-

tions useful for future uses and decisions. Assuming that sensors are able to

communicate with the surrounding neighbors means that all sensors should

have a common notion of time, otherwise channel access schemes (such as

time division multiple access, TDMA, for example) cannot be used. The

lack of synchronization in this case leads also to a side effect with respect

to channel access failure, which is an unwanted power consumption due to

unsuccessful packet transmissions and collisions. Algorithms for distributed

processing also assume the network nodes to be timely-synchronized, other-

wise there would be no agreement on whether deciding if an event happened

before or after another. In other words, a common time scale is fundamental

also for distributed processing in order to match each event with the time

instant it has happened. Another application of relevant interest which bases

its functioning on clock synchronization is localization: several methods have

been taken into account in order to make sensor nodes in a network able to

localize themselves as well as objects on the basis of the knowledge on the

position of specific reference nodes. Among the various proposed techniques

for localization, the time-based approaches can be exploited to perform node

localization under a fundamental assumption: nodes must have the same

notion of time. To stress this point, the reader should notice that the local-

ization accuracy in an ultra-wideband (UWB) system using a time-of-arrival

(TOA) approach is extremely high [6], therefore fine clock synchronization in

such scenarios becomes an extremely important issue. Channel access, dis-

tributed processing and localization are not the only applications that assume

clock synchronization between sensor nodes: other examples are cooperative

communications, coordinated actuation, spectrum sensing and power control,

among many others. The problem of clock synchronization in WSNs has been thorough inves- Introduction 3 tigated in the past. Several surveys [7''10] have been written about this issue,

and for a detailed explanation of the problem the reader should also read El-

son''s Ph.D. dissertation [11]. Many synchronization algorithms (also called

synchronization protocols) have been proposed in the literature, both for sen-

sor networks that make use of radio frequency (RF) communications [12''18]

and for sensor networks based on underwater acoustic transmissions [19''22].

These algorithms either estimate clock parameters with respect to a refer-

ence (leading to ''uncoupled clocks') [12''15], or actively modify the clock

behaviour via suitable control circuits (leading to ''coupled clocks') [16''18].

There is a vast literature as well about possible techniques to better esti-

mate the clock parameters among two nodes with specific assumptions on

the nature of the network delays [23''31]. This thesis is divided into two parts, after the first chapter which is in- tended to provide the reader the necessary notions in order to fully appreciate

the analytical and simulative results presented after. Part I of this thesis de-

scribes clock estimation and synchronization algorithms for WSNs based on

the two-way message exchange, which allows to estimate the clock param-

eters and subsequently correct the local clock behaviour, thus leading to

uncoupled clocks among the network. Based on data gathered thanks to this

exchange process, a general framework is first introduced in order to allow

clock offset estimation when the network delays have probability distribu-

tions belonging to the exponential family of distributions, thus generalizing

existing results in the literature as well as adding new findings. Based on

this, estimation algorithms are proposed both for the classical (ML) and the

Bayesian framework, using the theory of factor graphs. The second part of this thesis relies on the application of consensus al- gorithms for reaching synchronization in WSNs. First, an innovative fast

consensus algorithm is presented which allows top performance in terms of

convergence speed and outstanding results in terms of noise resilience com-

pared to what is available in the literature. After recasting the clock synchro-

nization issue as a double consensus problem with both clock offset and skew

among nodes, the new fast consensus approach is finally applied for reaching

clock synchronization, showing that it brings substantial improvements with

respect to similar algorithms based on consensus. Finally, in the Appendix a work whose topic falls out of the main stream of the thesis is reported. In fact, an algorithm for reducing the backhaul

network usage in cellular networks is presented, when multi-cell processing

is allowed for uplink communications from a mobile terminal, making use

of error control policies and several cooperation techniques between the base

stations, under a constraint on the quality of service in terms of the maximum

allowed outage probability. 4 Introduction Chapter 1 General Concepts on Clock

Synchronization The purpose of this chapter is to give the reader the basics of clock synchro-

nization so that most the fundamental concepts presented and discussed in

this thesis can actually find a reference in the present thesis itself. The chapter deals with the basics of an oscillator in a sensor node, which is the component from which the node gets its own notion of time. After having

explained how an oscillator is basically composed, hardware techniques for

oscillation frequency correction and control are illustrated. The presentation

of a generally accepted clock model follows, which is the foundation for all the

results presented in this thesis. Finally, the main characteristics of message-

based algorithms are presented as well as the distinction between estimation-

based and clock coupling-based algorithms 1.1 The Notion of Time in a Node When people need to know the current time they consult devices whose

specific purpose is to provide an estimate of the time of the day. Looking

at the wall clock or at the watch are common ways to answer the question

''what time is it''. In sensors, and more generally in every electronic device,

this answer is provided through an electric circuit by means of an oscillating

signal generated inside. The name of such a circuit is precisely oscillator, or

clock. There exist natural materials which have a particular oscillating property, namely piezoelectricity: in a crystal of a piezoelectric material, a mechani-

cal stress produces an accumulation of charges on the opposite sides of the

crystal. In this sense, the material is actually behaving like a capacitor: by 5 6 1. General Concepts on Clock Synchronization connecting the two sides with an electric circuit, the flow of electric current

is induced. In more general terms, the crystal behaves like a RLC circuit

(resistor, capacitor and inductor) with a certain frequency of resonance, typ-

ical of the material. If the mechanical compression is done periodically at

such a frequency, then an output periodic electric signal at the frequency of

resonance is obtained. This is precisely the oscillating signal provided by a

clock in the electronic device. A piezoelectric material largely used in oscillators is quartz. A common oscillation frequency for quartz oscillators is 32768 Hz, as it happens, for

example, in the microprocessor Texas Instrument MSP430 F1611 [32] em-

bedded in Tmote Sky sensors. A clock oscillating at 4 MHz can be found

instead in MICA motes, which are employed in testbeds at the University of

California, Berkeley [33]. Although they are claimed to run at the same nominal frequency, even nominally identical crystal oscillators have slightly different running behaviours,

mainly due to fabrication issues and climate conditions. Indeed, it happens

that their instantaneous phases could be different and generally time-varying.

Given that an estimate of the phase difference is possible, what is needed

right after is a way to implement a phase correction to the oscillating circuit.

The application of a voltage correcting signal to an oscillator for modifying its

running behaviour is exactly what happens in a Voltage Controlled Oscillator

(VCO). The feedback-based circuits that generate such correcting signals are

generally referred as (digital) phase-locked loops ((D)PLLs) or delay-locked

loops (DLLs). For a detailed description of these circuits see [34]. In an actual application of clock synchronization, by means of a PLL the node i is able to synchronize its own clock to an external signal produced

through the interaction with other nodes'' oscillators. Designing the nature of

such an interaction between nodes means providing a clock synchronization

algorithm, which is precisely what is being described in the present thesis. 1.2 Clock Models The main building block in a synchronizing network of agents is the clock,

which can be defined as follows. Definition 1.1 A clock is an electronic device that counts oscillations in an

accurately-machined quartz crystal, at a particular frequency [7]. Therefore, a clock can be considered a timer which increases of a unit at

every clock tick, at a certain frequency. 1.2. Clock Models 7 The observable value at sensor node i is the clock counter at absolute time t, i.e., Ti(t) (1.1) which is often simply denoted by ''clock' of node i at time t. Each oscillator runs at frequency fi(t), generally different from the oscil- lator nominal frequency f0. The clock oscillation period (the inverse of the

oscillation frequency) is defined as Υi(t) = 1 fi(t) . (1.2) As far as the clock accuracy is concerned, it is generally known [8] that oscillators running at higher frequencies are more stable, while oscillators

at lower frequencies consume less power, so that it is common practice to

have sensors equipped with slow running oscillators for energy consumption

purposes. This comes at the price of lower clock stability and less achievable

accuracy, since the time resolution is coarser. Indeed, changes in environ-

mental factors like temperature, pressure and humidity make the oscillation

frequency of clock of node i, fi(t), vary with time. Also, factory imperfections

make different clocks oscillate inherently at different frequencies. Therefore,

clock discrepancies between nodes are due both to different values of the

clock counters and to slightly different oscillation frequencies. 1.2.1 Clock Parameters Terms like clock offset, skew and drift are commonly used to describe the

differences in running behavior between clocks of any two nodes i and j.

These concepts are defined in the following, coherently with the definitions

provided in [7, 35]. The clock offset of node i relative to node j is the difference between the clock counters of the two nodes, i.e., θ (o) ij (t) = Ti(t) '' Tj (t) . In many cases in the present thesis, the key parameter is the initial clock

offset between i and j, which is defined as the clock offset between the

two nodes at time 0, i.e., θ (o) ij = θ (o) ij (0) = Ti(0) '' Tj (0) . (1.3) 8 1. General Concepts on Clock Synchronization If node j is the reference time, it turns out that the clock offset of node

i with respect to the reference time is simply θ (o) i (t) = Ti(t) '' t , while the initial offset of node i is θ (o) i = θ (o) i (0) = Ti(0) , θ (o) i '' R . (1.4) The clock skew of node i relative to node j is the difference of the first derivative of the clock counters of nodes i and j, i.e., θ (s) ij = ''Ti(t) ''t '' ''Tj(t) ''t . (1.5) If node j is the reference time, it turns out that the clock skew of node

i with respect to the reference time is simply θ (s) i = ''Ti(t) ''t '' 1 , θ (s) i '' [''1, +'') . (1.6) The clock drift of node i relative to node j is the difference of the sec- ond derivative of the clocks of nodes i and j, i.e., θ (d) ij = ''2Ti(t) ''t2 '' ''2Tj(t) ''t2 . (1.7) If node j is the reference time, it turns out that the clock drift of node

i with respect to the reference time is simply θ (d) i = ''2Ti(t) ''t2 , θ (d) i '' R . (1.8) 1.2.2 The Adopted Clock Model In order to describe the clock dynamics, several models have been proposed

in the literature. The most complete modeling of a clock consists of a Taylor

expansion of the clock counter truncated at the second degree, i.e., [7, 35, 36] Ti(t) = θ (o) i + 1 + θ (s) i t + θ (d) i t2 2 . In this thesis, clock drift θ (d) i is not considered so that the considered reference model is the following [7''10,12''16,23''31,35,37''43] (either with or

without the skew presence) Ti(t) = θ (o) i + 1 + θ (s) i t , (1.9) 1.2. Clock Models 9 in which θ (o) i and θ (s) i are unknown parameters to be estimated. Often, an a-priori description for these parameters is also available. In each chapter

of this thesis it is specified if clock skew θ (s) i is also considered, after clock offset θ (o) i , which is always taken into account. Fig. 1.1 shows the running behaviours of different clocks adhering to the model (1.9). 0 20 40 60 80 100 ''40 ''20 0 20 40 60 80 100 120 140 t T i (t) a 0 20 40 60 80 100 0 50 100 150 t T i (t) b Figure 1.1: Examples of clocks with different offsets and no skews (a) and

with different skews but no offsets (b). Clock offsets and skews are real values, describing the level of discrepancy between two clocks. However, they have strongly different meanings and rele-

vance, since a relatively small clock skew can lead to a huge clock offset after a

long period of time in which no synchronizing correction is made. As a result,

if having an unbounded initial clock offset could not heavily affect synchro-

nization, even a small clock skew requires frequent re-synchronizations if it

is not taken into account during the design of the algorithm. In real sensor networks, values of clock skews are generally bounded and quite small. The community is used to express the skew in terms of parts

per million (ppm): if a clock is said to deviate c parts per million, it means

that after a million of clock ticks it diverges of the equivalent of c clock ticks.

To be concrete, the clock embedded in the microprocessor Texas Instrument

MSP430 F1611 [44] has a clock skew on the order of 40-50 ppm, which means

that for each second the clock deviates of 40-50 µs. Also other datasheets [45]

report skews of this order of magnitude. If deviating of a few µs every second

could appear as a negligible effect, just realize that in a day (24 ' 60 ' 60 = 86400 seconds) the oscillator deviates of few seconds, which is a huge time

from a sensor network application viewpoint. 10 1. General Concepts on Clock Synchronization Absolute or Relative Synchronization' In clock synchronization, identifying the known and unknown quantities is

a matter of fundamental importance. In the model (1.9), Ti(t) represents

the observation at node i at time t; on the other hand, the time scale at

node i is Ti(t) itself, so that the sensor node is totally unconscious of the

absolute time reference t. In fact, it can only access the counter Ti(t). As a

result, by itself the node is not able to estimate its own clock parameters θ (o) i and θ (s) i , unless by using time-related observations coming from other nodes. Moreover, since timestamps recorded by other nodes are related to these

nodes'' timescales, an estimate of the absolute values θ (o) i and θ (s) i is generally not possible, except for the case in which node i uses readings from a special

node whose timescale is assumed as the reference, i.e., θ (o) i = θ (s) i = 0. 1.3 Approaches to Clock Synchronization Clock synchronization between agents requires information exchange between

nodes in order to pursue a common time basis. The main question now is

what kind of data should nodes exchange in order to reach synchronization'

Existing clock synchronization algorithms either make use of messages (OSI

level 2 data exchange) or electromagnetic radio frequency pulses for exchang-

ing clock related data (OSI level 1 data exchange). The two approaches lead

respectively to message-based and pulse-coupled algorithms. The comparison

between them is done in the next section. 1.3.1 Message-Based vs Pulse-Coupled Algorithms Depending on the available hardware, the target level of accuracy and the

channel resources, nodes can either exchange message-based data or physical

signaling [9]: Message-Based Algorithms : in message-based algorithms, the local time of a node is written (e.g. timestamped ) in a message, which is broadcast to the neighboring nodes via the wireless channel. Since they involve

data exchange at level 2 of the OSI stack, such algorithms suffer of soft-

ware delays, such as interrupts or operating system calls. However, it

is proved that most of such delays can be effectively neglected through

kernel [12] or MAC-layer timestamping [13, 14], i.e. reading the times-

tamp and adding it to the packet just before the transmission on the

wireless medium. A fundamental advantage of such algorithms is that 1.3. Approaches to Clock Synchronization 11 they don''t require the network to allocate specific resources (such as

bandwidth) for synchronization purposes. Pulse-Coupled Algorithms : pulse-coupled algorithms are based on phys- ical layer signaling instead of message transmissions; in fact, time infor-

mation is encoded in the transmission instants of the sent waveforms.

To this purpose, generally either a specific frequency band is dedicated

or an overlay system (such as UWB) is employed. The received signal

at neighboring nodes is then processed in order to achieve synchro-

nization with the sending node. Such processing techniques possibly

include the use of PLLs [16, 37]. Although pulse-coupled approaches are interesting from the signal pro- cessing point of view, this thesis entirely focuses on message-based clock

synchronization algorithms, since it is assumed that neither specific chan-

nel resources are reserved for synchronization purposes nor sensor nodes

are equipped with UWB transceivers. Basic wireless resources and simple

communication hardware are well suited constraints to low-cost low-power

sensing devices. 1.3.2 Coupled Clocks vs Uncoupled Clocks In the realm of message-based algorithms, synchronization methods can be

divided in two categories 1 . Depending on the type of data processing and clock correction, message-based algorithms can lead either to coupled clocks

or to uncoupled clocks [9]. Indeed, the algorithms can either estimate the

clock parameters with respect to the other nodes and then perform a cor-

rection through a VCO (uncoupled clocks), or act directly to the oscillator

circuit to correct it step by step (coupled clocks). Regarding this aspect, the

difference between uncoupled and coupled clocks reflects also the distinction

between Part I and Part II of this thesis: in fact, in Part I clock synchroniza-

tion algorithms are designed with the goal of estimating clock parameters, so

that clocks in the network result to be uncoupled, since no correction signal

is applied to the oscillator circuit before the estimation algorithm stops run-

ning; on the other hand, the synchronization algorithm derived in Chapter 7

in Part II relies on the application of a correction signal to the oscillator on

the fly at each step, therefore leading to coupled clocks along the network. It

can also be observed that, while in Part I the focus is on the statistical de-

tailed description of the message exchange process between two nodes, Part 1The distinction is actually transversal with respect to the categorization message-based vs pulse-coupled. 12 1. General Concepts on Clock Synchronization II provides a broader picture of the problem, better analysing the overall

network performance while lacking of local detail. 1.3.3 Synchronization Via Parameter Estimation Clock synchronization through parameter estimation can be performed via

time-related data exchange. The following definition holds. Definition 1.2 The process by which nodes exchange time-related informa-

tion in message-based algorithms is called a synchronization paradigm. To the best of the author''s knowledge, three synchronization paradigms can

be identified [10, 15]: Sender-Receiver Synchronization - SRS : in this classic paradigm two nodes S (the sender) and R (the receiver) are involved; they exchange

timestamped messages in a two-way handshake fashion (started by

node S) in order to make node S able to estimate its clock relative

parameters with respect to node R. Receiver-Receiver Synchronization - RRS : at least three nodes are involved in this paradigm, the parent node P that sends broadcast

messages, and two other nodes, R1 and R2. After that, R1 and R2

exchange data about the local time of reception of the messages and

estimate their clock relative parameters. Receiver-Only Synchronization - ROS : a node L is able to synchronize to node S by listening to the messages exchanged by node S and node

R that are synchronizing using the SRS paradigm. Part I of this thesis is dedicated to the analysis of clock synchronization algorithms based on clock parameters estimation which make use of the SRS

paradigm. 1.3.4 Synchronization Via Clock Coupling Some network-wide clock synchronization algorithms that perform a step

by step clock correction assume a low level of detail in the communication

model but allow for a more comprehensive analysis and provide a broader

picture of the algorithm behaviour. Synchronization algorithms that fall

in this category and are taken into account in this thesis are based on the

consensus approach. Basics of consensus algorithms as well as the application

of consensus to the problem of clock synchronization are described in Part

II. Part I Clock Synchronization in WSNs via the Two-Way Message Exchange Clock synchronization can be reached through the estimation of the parameters that identify the disagreement between node''s clocks. This is the approach considered in Part I of the present thesis. 13 Chapter 2 The Two-Way Message Exchange In Chapter 1, message-based protocols have been divided in two categories,

based on the fact that they lead either to coupled or to uncoupled clocks.

The current chapter describes the process of data exchange exploited in

estimation-based clock synchronization algorithms, that lead to uncoupled

clocks. This communication protocol provides the rules for getting the time-

related information needed for clock parameters estimation. The obtained

observation model is carefully analysed and its performance studied. 2.1 The Two-Way Data Exchange One of the preferred ways to exchange time related data in a message-based

algorithm is the so called two-way message exchange, which identifies a two-

way communication handshake protocol between two nodes sharing their

timestamps in order to achieve synchronization. The two-way message exchange is a data communication process between two nodes in which there is a double message exchange between the involved

nodes, one for each link direction. The messages carry timestamped infor-

mation according to the local clock time scale, so that at the end of the

information exchange there is sufficient data to infer the clock disagreement

parameters of the two nodes, making clock synchronization possible between

them. The mechanism is not perfectly symmetric, since there is a node which initiates the process and the other which responds. On the other hand,

with little overhead (just one more short message, which can also exploit

piggybacking of following messages) at the end of the data exchange both

the nodes get all the raw information available, thus they can perform the

same data processing to clock parameters estimation. 15 16 2. The Two-Way Message Exchange R S T 1 S T 2 R T 4 S T 3 R Figure 2.1: Two-way message exchange between node S and R. 2.2 System Model The two-way message exchange can be better described if one looks at Fig.

2.1. The lowest horizontal line represents the time scale of node S, which

is the node that begins the data exchange process. The upper oblique line

instead represents the time scale of node R. The non-zero angle between the

two lines represents the clock skew between the time basis of node S and

node R. The data exchange [23, 25''31, 38''42] starts with a message sent by node S, containing the timestamp T 1 S of the time at which such a message has been sent 1 , measured according to the time scale of node S. Such a message is received by node R at time T 2 R, according to the time scale of node R. After a predefined period of time, node R replies with a message containing

the timestamps T 1 S , T 2 R and the timestamp T 3 R relative to the sending time of this message, measured according to the time scale of node R. Finally, such

a message arrives to node S at time T 4 S , according to the time scale of node S. By keeping into account of both clock offset θ(o) and skew θ(s) between the two nodes using the model (1.9), the following relations between the 1The dependence on the absolute time of the counter values T (t) is dropped in order to make the notation more readable. 2.2. System Model 17 timestamps hold [26, 27, 31, 38]: T 2 R = 1 + θ (s) T 1 S + d + X + θ(o) T 3R = 1 + θ (s) T 4 S '' d '' Y + θ(o) . (2.1) The fixed portion of the link delay is assumed symmetric and denoted by

d, while the random portions of the same delay in uplink and downlink

respectively are denoted by X and Y . A full discussion about these delays

as well as their possible modeling is performed in the following. 2.2.1 Fixed and Random Portions of Link Delay The major impairments in the two-way message exchange come from the

message transmission delays. There is no need to stop and try to find a

possible cause of delay that could affect a message transmission through a

shared medium. The wireless channel is indeed used by a potentially large

number of users, and there is an extensive bibliography on the available

channel access schemes. For sure, the channel access is neither deterministic

nor instantaneous. In other words, there is a stochastic delay to be modelled. The delay encountered by a message transmission through a wireless chan- nel has been decomposed into four categories [10, 12''14]: ' Send Time: the time required by the transmitter to construct the mes- sage and to forward it to the NIC 2. This time includes delays coming from interrupts and operating system calls. ' Access Time: the time required to get a successful message transmission in the wireless medium. It depends on the channel access scheme. ' Propagation Time: the time needed for the electromagnetic wave to travel on the wireless medium from the transmitter to the receiver. It

encompasses latencies and forwarding delays at intermediate hops in

large networks. ' Receive Time: the time needed to the receiver for the message reception and decoding as well as for the notification of the message arrival to

the operating system. These delays are typical of a large class of wireless networks and protocols, from the TDMA one-hop networks to the large multi-hop CSMA networks. In 2In the following, the NIC (Network Interface Card) is intended as the part of the device which is actually sending the packet over the air. 18 2. The Two-Way Message Exchange this thesis, for the delay analysis in the two-way message exchange, only one-

hop communications will be considered and CSMA is assumed as the wireless

channel access scheme [33]. With this assumption, the just mentioned four

delays can be divided in two categories. Assuming nodes are not mobile, the

propagation delay can be considered fixed (or deterministic), since it just

depends on the distance between the two sensors, no random components

are present. However, the send, access and receive times encompass random

impairments among their causes (operating system calls, channel access, . . . ),

therefore they can be considered random (or stochastic). In this thesis, the

following definitions hold (see [46]): Definition 2.1 The deterministic (fixed) part of the link delay in the two-

way message exchange process is the minimum of the link delays over the

message exchange rounds. Definition 2.2 The stochastic (random) part of the link delay in the two-

way message exchange process is the positive quantity to be added to the

deterministic part of the link delay in order to get the overall delay. As a result, both the deterministic and the stochastic delays are positive

quantities. In the past, a measurement campaign has been performed [46] to see if the end-to-end delay could be somehow modelled. What turned out from such

work is that the stochastic end-to-end delay can be better described by a

Gamma probability distribution, which comprises a number of distributions.

What plays a key role in the stochastic part of the delay is the queueing delay

at the transmitters; this is the reason why many works in the literature as-

sumed exponential distributed random portions of delays [23''27]. However,

the Gaussian, log-normal and Weibull distributions are other good candi-

dates adopted in the literature for modelling the stochastic delay. Given the

substantial diversity between these distributions, it is hard to find a common

framework which encompasses the entire range of possible distributions. On

the other hand, in Chapter 3 a new framework is presented in order to allow

for the study of different random delay distributions at the same time in a

more general fashion, through the use of tools borrowed from the theory of

the exponential family of distributions. Message delays can be reduced through smart tricks. The most used tech- niques involve reading the timestamps just before forwarding the message to

the NIC for the actual transmission (kernel [12] or MAC layer timestamp-

ing [13,14]) to eliminate the send and the access times. Moreover, if also the

receiver timestamps the message arrival time at a low enough level in the

host''s operating system kernel, the receive time is strongly reduced since it 2.3. State of The Art 19 does not include the overhead of system calls, context switches, or even the

transfer of the message from the network interface to the host. Finally, the

propagation time cannot be eliminated because of its physical reason, but it

can be estimated from the round trip time through a handshake paradigm. 2.3 State of The Art Several clock synchronization algorithms exploit the two-way message ex-

change in order to reach clock agreement. Indeed, as an example, both

Timing-sync Protocol for Sensor Networks (TPSN [14]) and Reference Broad-

cast Synchronization (RBS [12]) use data coming from this exchange process

in order to compensate for clock offset and skew. The two-way message exchange well applies to a statistical analysis of timestamped data, both in presence of clock offset and skew, as it can be

seen from (2.1). A major statistical analysis of the two-way message exchange

mechanism has been done first in [23], where a thorough discussion on the

best use of the timestamped data is also presented. After this first work, a

deep study on the statistical performance of this message exchange process

for clock parameter estimation has been done in the literature. If clock skew is not considered, the issue is to correctly estimate the clock offset between nodes S and R. Assuming the random portion of delays be-

ing identically distributed exponential random variables both in uplink and

downlink, Ghaffar in [23] states that an ML estimate (MLE) of the clock

offset does not uniquely exist in the case in which both the mean of the

random portion of the delays and the fixed portion of delay are known. On

the other hand, Jeske in [24] proved that such an MLE actually exists when

the fixed portion of delay is unknown, with the mean of the exponentially

distributed delays being either known or unknown; an expression for the

MLE is therein derived. However, several ad-hoc clock offset estimators were

proposed in [23]; one of them (the Minimum Link Delay Algorithm, MnLD)

coincides with the actual MLE derived in [24]. Still assuming that the ran-

dom delay distribution is exponential but, more generally, asymmetric be-

tween uplink and downlink, in [25] the best linear unbiased estimator using

order statistics (BLUE-OS) [47] is derived as well as the minimum variance

unbiased estimator (MVUE) [47] via the application of the Rao-Blackwell-

Lehmann-Sheffé theorem. Making the assumption that the stochastic part

of the delay is the result of a diverse serie of causes, [26] considers it as

following a Gaussian distribution. The applicability of such an hypothesis

could be questionable, because if so, there is a non-zero probability that the

overall delay (deterministic plus stochastic) is negative and this is physically 20 2. The Two-Way Message Exchange not possible. On the other hand, the Gaussian assumption may hold in the

case the standard deviation is chosen small enough, by recalling that about

the 99.7% of the samples are inside the 6' interval symmetric around the

distribution mean. The authors in [26] provide expressions for the MLE for

the clock offset and the relative Cramér-Rao lower bound (CRB) under this

Gaussian assumption. The case of a Weibull distributed random delay is

faced instead in [28], where it is shown that the problem of finding the MLE

of the clock offset is generally convex, therefore it is solvable by numeric op-

timization A uniform MVUE (UMVUE) is also derived for the exponential

distribution case in which the fixed portion of delay is known. When a more

general assumption is made about the noise distribution, it becomes hard to

find an exhaustive solution to the clock offset estimation problem. In [29,30]

such a difficulty has been faced by the usage of (Iterative) Gaussian Mixture

Kalman Particle Filter, (I)GMKPF, in order to find an estimate of the clock

offset in the presence of non-Gaussian and non-exponential random delay

distribution or in the presence of a mixture of several distributions. If the model assumes the presence of both clock skew and offset among node''s oscillators, frequent re-synchronization is reduced since a skew cor-

rection naturally leads to a longer clock synchronization. A Gaussian dis-

tribution for the random portion of delay is assumed in [31] where the joint

MLEs for clock offset, skew and fixed part of the delay are derived as long as

performance bounds. Moreover, a lower complexity suboptimal estimator is

also proposed. In [26] the joint MLE for both clock skew and offset is derived

in the case of Gaussian distributed random delays with the assumption of

knowing the fixed part of delay. Since in this case the estimator expression

involves non-trivial computation (and an MLE for the exponential case does

not even exist if d is known), an ML-like estimator is proposed for both the

Gaussian and the exponential cases, which exploits just the first and the last

time stamps. This estimator assumes d to be unknown, which is a more

realistic assumption in real WSNs. Performance is obviously worse than the

MLE but a simpler expression is provided for the estimators so that they

are easily implementable. Later in [27] a graphical algorithm to find the ML

clock offset and skew estimate is derived in the case of exponential symmet-

ric random delays. Due to its complexity, a suboptimal alternative empirical

algorithm is proposed. 2.3.1 Relevant Synchronization Algorithms In this section, the most famous clock synchronization algorithms in radio

WSNs based on the two-way message exchange are described. 2.3. State of The Art 21 Reference Broadcast Synchronization (RBS) To the best of the author''s knowledge, RBS [11, 12] was the first relevant

algorithm proposed for clock synchronization in WSNs; it was presented in

2002 [12] by Elson et al. for synchronizing a set of Berkeley motes [33]. The

problem of reducing the uncertainty in the message delivery time has been

solved by using the RRS paradigm in a smart way: in order to eliminate

the Send and the Access times, the authors let a set of nodes synchronize

themselves by recording the arrival time (according to their local clocks) of

a number of broadcast packets received from an elected reference node; the

key assumption is that the absolute time of arrival of the packet is the same

at all the nodes: in other words it is assumed that the packets arrive to all

the reference node''s neighbors at the same instant. After having received

these packets, such nodes exchange their local readings. Each node is there-

fore able to compute clock offset and skew with respect to any other node

(except the reference node) by solving a least square linear regression. The

receive time is limited by allowing system clock reading directly inside of the

network driver''s interrupt handler. A multi-hop extension of RBS consists on

exploiting nodes that are in the coverage range of more than one node: such

nodes work as clock bridges between two or more broadcast regions, there-

fore they are able to make the time conversion from a broadcast domain to

another. This algorithm has the following pros: ' The reference broadcast packet doesn''t need to be a dedicated packet, therefore it is possible to exploit any broadcast transmission. ' The algorithm is robust against node deaths and packet losses since the linear regression works even if some points are missing. ' Local time scales are built up, making the algorithm well suited to those applications that don''t need a network global time scale (like, for

example, a localization application). ' A large number of synchronizing broadcasts enable tighter synchroniza- tion because residuals errors tend to follow well-behaved distributions. There are also some cons: ' For a single-hop network of N nodes there is a huge (O(N 2)) number of exchanged packets for synchronization. ' Low convergence speed because of the high number of message ex- changes. 22 2. The Two-Way Message Exchange ' Need for the election of the beacon sender. ' The reference node is left unsynchronized. Timing-sync Protocol for Sensor Networks (TPSN) Ganeriwal et al. in [14] propose an algorithm that works upon a different

paradigm with respect to RBS: in fact, while the latter is based on synchro-

nizing a set of receiver nodes, TPSN bases its fundamentals on the more

conventional approach, SRS. In order to synchronize the network, a hier-

archical structure is constructed: this is the reason for the presence of a

''discovery phase' in which a root node is chosen by running a leader elec-

tion algorithm in the network so that a tree is built up. The ''synchronization

phase' follows, in which every node synchronizes itself to its parent by a two-

way message exchange mechanism. The implementation results on Berkeley

Motes are quite encouraging since TPSN provides twice better synchroniza-

tion performance with respect to RBS; moreover, the synchronization error

doesn''t blow up with the hop distance. However, a strong limitation in this

protocol is that it doesn''t provide any skew correction, therefore a periodic

resynchronization phase of the entire network is needed. MAC layer time-

stamping is also implemented in this algorithm in order to reduce the send

and the access time. TPSN has the following pros: ' It is robust to single node deaths and births, with effective special provisions. ' It is easily scalable, since the synchronization error doesn''t blow up with hop distance. ' Performance is good also with respect to RBS. There are also some cons: ' It is not suitable in the case nodes have some mobility since a hierar- chical structure has to be built up and kept running. ' Need for the election of the root node. ' No clock skew estimation. 2.3. State of The Art 23 Flooding Time Synchronization Protocol (FTSP) In FTSP [13], the first step is the election of a root which is intended to

represent the global reference time; it periodically broadcasts its clock to its

neighbors which collect this data and build up a table with a set of entries

global time-local time, using linear regression to estimate clock skew and

offset with respect to the root node. After they get synchronized to the root,

they themselves become roots for their neighborhood by broadcasting their

synchronized time. Differently from RBS, the broadcast packet must contain

the sender timestamp, but similar to TPSN the paradigm is SRS. Also here,

MAC layer time-stamping is used in order to reduce the send and the access

time. The pros of this algorithm are the following: ' It supports topology changes since no fixed hierarchical structure is required. ' Less number of message exchanges with respect to RBS and TPSN. On the other hand, the main cons are: ' A root election and renewing process is needed. ' The actual number of exchanged messages is redundant, due to cover- age overlapping. Pairwise Broadcast Synchronization (PBS) The authors of PBS [15], are the inventors of a new synchronization approach,

the ROS. As explained in Chapter 1, a node L exploits data embedded in mes-

sages exchanged by node S and R, synchronizing through the SRS paradigm,

to estimate its clock skew and offset with respect to the clock of node S.

Results of PBS in terms of synchronization error and in terms of the number

of exchanged packets are encouraging since the clock precision is comparable

to that of RRS approach while the number of exchanged messages is propor-

tional to the number of clusters in the network. Here some advantages of

this algorithm are listed: ' Low number of exchanged messages. ' Fine results in term of synchronization error. However, the main disadvantages are: ' PBS needs to work over another SRS synchronization protocol. 24 2. The Two-Way Message Exchange ' There is the need to divide the network in clusters and synchronize them by additional pairwise synchronization among super nodes in different

clusters. Several clock offset and skew estimators have been proposed [39''42] for the listening nodes, assuming this framework. On the other hand, a thorough

analysis of this problem is out of the scope of this thesis. Chapter 3 Clock Offset Estimation: A

Unified Framework The results presented in this chapter refer to a collaboration of Prof. Van-

gelista and me with Prof. Erchin Serpedin and Aitzaz Ahmad, both of them

with the Department of Electrical and Computer Engineering, Texas A&M

University, College Station, TX (USA). The research work has been per-

formed while I was on leave as a ''Visiting Scholar' at Texas A&M University,

from January to July 2011, partially supported by an ''A. Gini' fellowship.

Results have been submitted to the journal IEEE Transactions on Informa- tion Theory [48]. 3.1 Introduction The concepts introduced in this chapter are based on the two-way message

exchange introduced in Chapter 2. Here, a new framework is defined for

treating several assumptions on the random part of the link delays in a

unified manner. In the considered scenario, only clock offset is assumed,

so that the model depicted in Chapter 2 is substantially simplified. This

allows to make a key quantity substitution in the observation model which

leads to parameters decoupling. Tools borrowed from the exponential family of distributions prove useful to tackle the problem of finding the MLE of the clock offset under the gen-

eral framework, both recovering existing results and extending them to new

findings using the convex optimization theory. 25 26 3. Clock Offset Estimation: A Unified Framework R S T 1 S (1) T 2 R(1) · · · · · · T 4 S (1) T 3 R(1) T 1 S (K ) T 2 R(K ) T 4 S (K ) T 3 R(K ) Figure 3.1: K rounds two-way message exchange between node S and R with

clock offset only between them. 3.2 System Model The data exchange mechanism on which the new common framework bases

its fundamentals is the two-way message exchange, as described in Chapter

2. As an extension of what introduced therein, here it is assumed that the

data exchange is iterated for K rounds, with the aim of providing a more

accurate estimate of the clock parameters. In order to keep into account

of the exchange repetition, the notation is extended as follows. The four

timestamps {T 1 S , T 2 R, T 3 R, T 4 S } gathered at round k are here denoted as {T 1 S (k), T 2 R(k), T 3 R(k), T 4 S (k)} , k = 1, . . . , K . In other words, brackets are added to remark the dependence of the times-

tamps on the round index k. No clock skew is taken into account: Fig. 3.1 represents the considered scenario: a K rounds two-way message exchange mechanism with offset pres-

ence only. This assumption leads to a simplified relation between timestamps

with respect to (2.1). In fact, setting θ(s) = 0 leads to T 2R(k) = T 1 S (k) + d + X (k) + θ (o) T 3 R(k) = T 4 S (k) '' d '' Y (k) + θ (o) , (3.1) where, as before, d represents the fixed (among all K rounds) portion of the

delay while X(k) and Y (k) model the i.i.d. random portions of the link delay

in uplink and downlink at round k, respectively. 3.3. A New General Framework 27 Following the approach of [23, 25, 28''30], the readings are then processed to obtain U(k) = T 2 R(k) '' T 1 S (k) = d + θ (o) + X(k) (3.2a) V (k) = T 4 S (k) '' T 3 R(k) = d '' θ (o) + Y (k) . (3.2b) At this point, it results possible to define new quantities such that the two relations in (3.2) results to be uncoupled. In fact, by defining ξ = d + θ (o) (3.3a) ' = d '' θ (o) (3.3b) the model (3.2) can be rewritten as U(k) = ξ + X(k)

V (k) = ' + Y (k) for k = 1, . . . , K. 3.3 A New General Framework As pointed out in Chapter 2, the random portions of the delays X(k) and

Y (k) can obey to a number of possible probability distribution functions.

The aim of this chapter is to propose a new general framework for these

delays so that three important distributions can be considered at the same

time in a general fashion. In order to make the notation shorter, the readings U(k) and V (k) are organized in vector form, i.e., U = [U(1), . . . , U(K)] ' V = [V (1), . . . , V (K)] ' , so that it comes easier to express the likelihood functions of the observa-

tions. Here it is assumed that the likelihoods follow either the Gaussian or

log-normal distribution, in a general framework. This can be done by bor-

rowing notation and properties from the theory of the exponential family of

distributions [49, 50]. 28 3. Clock Offset Estimation: A Unified Framework Unconstrained Likelihood: fU (U |ξ) ' exp ξ K X k=1 ηξ(U(k)) '' K'ξ(ξ) ! (3.4a) fV (V |') ' exp ' K X k=1 η'(V (k)) '' K''(') ! (3.4b) where ηξ( ·) and η'(·) are functions of the observations U and V , respec- tively and they are assumed to be sufficient statistics for estimating ξ and '.

The 'ξ( ·) and ''(·) are called log-partition functions and they serve as nor- malization factors so that both fU (U |ξ) and fV (V |') are valid probability distribution functions. Given the normalization purposes of the log-partition

functions, they are naturally defined except for an additive constant. The

proportional signs in (3.4a) and (3.4b) refer to the fact that a multiplicative

factor only dependent on the observations U and V , respectively, is present

in front of the exponential term. In this case, the likelihood functions are

called ''unconstrained' since the domains are independent on the parameters

ξ and ', as opposite to the case when the likelihood functions are assumed

adhering to the exponential distribution, as shown below. Constrained Likelihood: fU (U |ξ) ' exp ξ K X k=1 ηξ(U(k)) '' K'ξ(ξ) ! K Y k=1 I(U(k) '' ξ) (3.5a) fV (V |') ' exp ' K X k=1 η'(V (k)) '' K''(') ! K Y k=1 I(V (k) '' ') (3.5b) where I( ·) is the indicator function, defined as I(x) = 1 x ' 0 0 x < 0 . In this case, the likelihood function is called ''constrained' since its domain

is dependent on the parameters ξ and '. The exponential family of distributions represents a precious tool to anal- yse the likelihood functions in a sufficiently general fashion to encompass to-

gether potentially very different distributions like Gaussian, log-normal and

exponential. The three of them are all valid candidates for best describing

the random behaviour of X(k) and Y (k), thus it follows that the choice of

adopting the exponential family framework is an effective way to study the 3.3. A New General Framework 29 problem of clock offset estimation in a more general way with respect to what

done in the past. Although original, the approach of adopting uses and costumes of the exponential family of distribution can be specialized at the specific need, re-

covering the results of the literature when the assumption is to have random

portions of delays obeying specifically either to Gaussian, log-normal or ex-

ponential distribution. This is of fundamental importance, since if on one

side the framework permits a general analysis of the problem, simultaneously

it leads to a continuity with the existing results. It is useful to dig further into the theory of the exponential family of distributions: the following properties [49] will prove useful in the rest of the

upcoming discussion. They are listed referring to the parameter ξ and the

observations U : similar properties hold for ' and V , respectively. 1. The mean value and the variance of the sufficient statistics ηξ(U(k)) can be written as E [ηξ(U(k))] = '''ξ(ξ) ''ξ (3.6) '2 ηξ = Var [ηξ(U (k))] = ''2'ξ(ξ) ''ξ2 . (3.7) 2. The moment generating function (MGF) of the random variable ηξ(U(k)) is given by Mη ξ (h) = exp ('ξ (ξ + h) '' 'ξ(ξ)) . (3.8) The following proposition holds from the positiveness of the variance in (3.7). Proposition 3.1 The log-partition function 'ξ( ·) is convex. The following theorem specializes the common framework to the three con-

sidered specific distributions. Theorem 3.2 The sufficient statistics and the log-partition function for the

Gaussian, log-normal and exponential likelihood functions are respectively 1. N ξ, ' 2 ξ : ηξ (U(k)) = U(k) '2 ξ , 'ξ(ξ) = ξ2 2'2 ξ , (3.9) 30 3. Clock Offset Estimation: A Unified Framework Distribution fU (U |ξ) ηξ(U(k)) 'ξ(ξ) '2η ξ N (ξ, ' 2 ξ ) ' exp h '' P K

k =1(U (k)''ξ) 2 2'2 ξ i U (k) 2'2 ξ ξ2 2'2 ξ 1 '2 ξ log N (ξ, ' 2 ξ ) ' exp h '' P K

k =1(log U (k)''ξ) 2 2'2 ξ i log U (k) 2'2 ξ ξ2 2'2 ξ 1 '2 ξ λξ exp h ''λξ P K

k=1 (U (k) '' ξ) i · E (λξ) λξ 0 0 · Q K

k=1 I (U (k) '' ξ) Table 3.1: Specialization of the proposed new general framework. 2. log N ξ, ' 2 ξ : ηξ (U(k)) = log U(k) '2 ξ , 'ξ(ξ) = ξ2 2'2 ξ , (3.10) 3. E (λξ): ηξ (U(k)) = λξ , 'ξ(ξ) = 0 . (3.11) Proof: See Appendix 3.A.1. The results stated by Theorem 3.2 are summarized in Tab. 3.1.

The following corollary holds. Corollary 3.3 In the Gaussian, log-normal as well as in the exponential

case the log-partition function can be expressed as a second-degree incomplete

polynomial, i.e., 'ξ(ξ) = aξξ 2 = '2 ηξ 2 ξ2 . (3.12) In a real scenario, if the variance of the sufficient statistics is not known, it can be substituted by its empirical version, since the weak law of large

numbers assures that P K

k=1 ηξ (U (k)) K p '' E [ηξ(U)] , K '' '' P K

k=1 η 2 ξ (U (k)) K p '' E η2 ξ (U ) , K '' '' . 3.4. Maximum Likelihood Estimation 31 3.4 Maximum Likelihood Estimation This section focuses on the ML estimation of the clock offset, thus recovering

the results existing in the literature. In order to perform the estimation,

the problem is recast as an instance of convex optimization, differently from

the graphical approach adopted in [24]. The case of unconstrained and con-

strained likelihoods are treated separately. 3.4.1 Unconstrained Likelihood Using (3.4a) and (3.12), the unconstrained likelihood functions are given by fU (U |ξ) ' exp ξ K X k=1 ηξ(U(k)) '' K '2 ηξ 2 ξ2 ! (3.13a) fV (V |') ' exp ' K X k=1 η'(V (k)) '' K '2η ' 2 '2 ! . (3.13b) The MLE of ξ and ' can be expressed as ' ξML = arg max ξ exp ξ K X k=1 ηξ(U(k)) '' K '2η ξ 2 ξ2 ! (3.14a) ' 'ML = arg max ' exp ' K X k=1 η'(V (k)) '' K '2 η' 2 '2 ! . (3.14b) Theorem 3.4 The likelihood maximization problems (3.14a) and (3.14b)

are strictly concave and the MLEs are given by ' ξML = P K

k=1 ηξ (U (k)) K'2η ξ (3.15a) ' 'ML = P K

k=1 η' (V (k)) K'2η ' . (3.15b) Hence, the MLE ' θ (o) ML for the clock offset can be written as ' θ (o) ML = ' ξML '' ' 'ML 2 . (3.16) Proof: See Appendix 3.A.2 32 3. Clock Offset Estimation: A Unified Framework Gaussian Distribution Theorem 3.4 finds an application in the case the likelihood functions fU

and fV follow a Gaussian distribution, i.e., fU (U(k) |ξ) ' N (ξ, ' 2 ξ ) and fV (V (k) |') ' N (', ' 2 ') [26] (see also Tab. 3.1). Using (3.15a) with (3.9) and (3.12), it follows that the MLE is given by ' ξML = P K

k=1 U (k) K . (3.17) By a similar reasoning, the MLE for ' (3.15b) is given by ' 'ML = P K

k=1 V (k) K . (3.18) Using (3.16), the MLE for the offset θ(o) can be expressed as ' θ (o) ML = P K

k=1(U (k) '' V (k)) 2N . (3.19) The above estimate coincides exactly with the one reported in [26]. Log-Normal Distribution When the likelihood functions of the readings U and V are log-normal (see

Tab. 3.1), Theorem 3.4 can still be applied, making use also of (3.10) and

(3.12), for finding the MLE of ξ, i.e., ' ξML = P K

k=1 log U (k) K . With a similar reasoning, the MLE for ' results to be ' 'ML = P K

k=1 log V (k) K . Finally, the ML estimator for θ(o) can be written as ' θ (o) ML = P K

k=1 (log U (k) '' log V (k)) K . (3.20) 3.4. Maximum Likelihood Estimation 33 3.4.2 Constrained Likelihood Using (3.5a), (3.5b) and (3.12), the constrained likelihood functions can be

written as fU (U |ξ) ' exp ξ K X k=1 ηξ(U(k)) '' K '2η ξ 2 ξ2 ! K Y k=1 I(U(k) '' ξ) (3.21a) fV (V |') ' exp ' K X k=1 η'(V (k)) '' K '2 η' 2 '2 ! K Y k=1 I(V (k) '' ') . (3.21b) The MLEs of ξ and ' can be expressed as ' ξML = arg max ξ exp ξ K X k=1 ηξ(U(k)) '' K '2 ηξ 2 ξ2 ! such that U(k) ' ξ (3.22a) ' 'ML = arg max ' exp ' K X k=1 η'(V (k)) '' K '2 η' 2 '2 ! such that V (k) ' ' . (3.22b) It proves useful to define the order statistics of the samples U(k) and V (k) as U(1) = min (U(1), . . . , U(K)) V(1) = min (V (1), . . . , V (K)) . Theorem 3.5 The likelihood maximization problems (3.22a) and (3.22b)

are strictly concave and the MLEs can be expressed as ' ξML = min P K

k=1 ηξ (U (k)) K'2 ηξ , U(1) ! (3.23a) ' 'ML = min P K

k=1 η' (V (k)) K'2η ' , V(1) ! . (3.23b) The MLE ' θ (o) ML for the clock offset is given by ' θ (o) ML = ' ξML '' ' 'ML 2 . (3.24) Proof: See Appendix 3.A.3. 34 3. Clock Offset Estimation: A Unified Framework Exponential Distribution An application of Theorem 3.5 can be found if the likelihood functions fU

and fV represent exponential distributions, with parameters λξ and λ', re-

spectively, i.e., fU ' E (λξ) and fV ' E (λ') (see also Tab. 3.1). Using (3.11) yields ' ξML = U(1) . (3.25) With a similar reasoning, ' 'ML = V(1) . Using (3.24), the MLE of θ(o) is given by ' θML = U(1) '' V(1) 2 , (3.26) thus recovering the result of [24], therein derived through graphical argu-

ments. 3.5 Statistical Bounds The purpose of this section is to provide statistical bounds to clock offset

estimation in the classical framework in order to evaluate the performance

of the MLEs derived in the previous sections in terms of the variance of the

estimation error. The upcoming analysis assumes that the likelihood functions fU and fV belongs to the exponential family of distributions without specifying any

shape for the log-partition function, therefore the provided analytical results

are indeed generally valid for a wide variety of distributions and almost all

the distribution of interest. Moreover, the upcoming results can be used in a

context different from the clock offset estimation, since they are self standing

in classical estimation theory. Both the case of unconstrained and constrained likelihood are taken into account for providing the necessary generality to the discussion of the

problem. The general expressions for the likelihoods of the observations

Z = [Z(1), . . . , Z(K)] T are given by Unconstrained Likelihood: fZ(Z; ρ) ' exp ρ K X k=1 η(Z(k)) '' K'(ρ) ! (3.27) 3.5. Statistical Bounds 35 Constrained Likelihood: fZ(Z; ρ) ' exp ρ K X k=1 η(Z(k)) '' K'(ρ) ! K Y k=1 I(Z(k) '' ρ) (3.28) where ρ is the scalar parameter to be estimated. In the following, the Cramer-

Rao and Chapman-Robbins bounds for the estimation error variance of the

deterministic parameter ρ are derived. 3.5.1 Cramer-Rao Lower Bound The Cramer-Rao lower bound (CRB) [47] provides a limit on the variance of

the estimation error for an unbiased estimator of a deterministic parameter.

Although it is a simple bound to compute, it relies on some ''regularity condi-

tions' which are not always satisfied. In particular, in the case of constrained

likelihoods (cf. (3.28)) the CRB cannot be computed, so that the variance

of the estimation error has to be described by a different bound. The CRB states that, given the condition E '' log fZ (Z; ρ) ''ρ = 0 for all ρ , (3.29) the variance of an unbiased estimator ' ρ of ρ is lower bounded by Var(' ρ) ' ''1 E h ''2 log f Z ( Z;ρ) ''ρ2 i , (3.30) Theorem 3.6 The CRB for the estimation error of ρ in the unconstrained

likelihood function in (3.27) is given by Var(' ρ) ' 1 K'2η (3.31) where '2 η = ''2' (ρ) ''ρ2 . Proof: The Fisher information is given by I(ρ) = E ''2 log fZ (Z; ρ) ''ρ2 = ''K ''2' (ρ) ''ρ2 = ''K' 2 η so that the proof readily follows. 36 3. Clock Offset Estimation: A Unified Framework 3.5.2 Chapman-Robbins Bound The Chapman-Robbins bound (CHRB) [51] provides a lower bound on the

variance of an estimator of a deterministic parameter. Differently from the

CRB, the CHRB does not make any assumptions on the regularity condition

(3.29) that often prevents from the use of the CRB. Moreover, the CHRB

is substantially tighter than the CRB in many cases. Hence, CHRB is here

used for determining a lower bound on the variance of an unbiased estimator

of ρ for constrained likelihood functions. Generally speaking, the CHRB for estimating a parameter ρ is given by Var(' ρ) ' " inf h 1 h2 ( E fZ(Z; ρ + h) fZ(Z; ρ) 2 '' 1 )# ''1 . (3.32) Theorem 3.7 The CHRB for the parameter ρ given the likelihood function

(3.28) can be expressed as Var(' ρ) ' inf h n (Mη(h))'' 2K · δK(h) '' 1 o h2 ''1 (3.33) where Mη(h) is the MGF of the statistic η(Z(k)) and δ(h) = E [exp (2hη(Z(k))) I (Z(k) '' ρ '' h)] (3.34) with the expectation taken with respect to any Z(k). Proof: The details of the proof are relegated to Appendix 3.A.4. 3.5.3 MSE and Variance of the Estimation Error The performance bounds computed above provide lower limits to the variance

of the estimation error of ξ and '. Such results can be used to lower bound

the mean squared error (MSE) of the clock offset θ(o), MSE 'θ(o) = E 'θ(o) '' θ(o) 2 . In fact, by using (3.3), the following result is immediate. Proposition 3.8 The MSE of any estimator of θ(o) can be expressed as MSE 'θ(o) = 1 4 Var ' ξ + Var ' ' + 1

4 (bξ '' b') 2 where bξ and b' [47] are the biases of the estimators ' ξ and ' ', respectively. 3.6. Numerical Results 37 Gaussian Distribution - CRB If the likelihood function for ξ is Gaussian distributed (see Tab. 3.1), then

using (3.31), it results that the CRB for an unbiased estimator ' ξ is given by Var ' ξ ' '2 ξ K , with a similar expression for the CRB in estimating ' as well. Using Propo-

sition 3.8, it follows that MSE 'θ(o) ' '2 ξ + ' 2 ' 4K . (3.35) As a remark, ' θ (o) ML (3.19) is an efficient estimator since its MSE achieves (3.35) with equality (cf. Appendix 3.A.5). Exponential Distribution - CHRB If the likelihood for ξ is exponentially distributed (see Tab. 3.1), using (3.8)

it can be easily shown that Mη ξ (U ) (h) = 1 and (3.34) becomes δ(h) = exp (λξh) , so that the CHRB (3.33) can be rewritten as Var ' ξ ' inf h exp (λξhK) '' 1 h2 ''1 = 0.6476 λ2 ξ K 2 and similarly for ' '. Using Proposition 3.8, it holds that MSE 'θ(o) = 1 4 Var ' ξ + Var ' ' + 1

4 (bξ '' b') 2 ' 0.162 K2 1 λ2 ξ + 1 λ2 ' ! + 1

4 (bξ '' b') 2 . (3.36) 3.6 Numerical Results In this section the theoretical findings of this chapter are corroborated by

numerical simulations. In particular, performance of the estimator in case of 38 3. Clock Offset Estimation: A Unified Framework log-normal likelihood is analysed, as well as the MSE in all cases is compared

with the statistical bounds computed in Section 3.5. The setting parameters

are given by 'ξ = '' = 0.1 for both Gaussian and log-normally distributed

likelihoods, while λξ = λ' = 10 for exponentially distributed likelihood func-

tions. 3.6.1 MSE Performance and Bounds Comparing the performance of the proposed estimator with the statistical

bounds is useful to provide a measure of correctness and efficiency of MLEs

derived in Section 3.4 in the case of Gaussian and exponentially distributed

likelihoods. Fig. 3.2 reports the MSE in estimating the clock offset using the

MLEs for Gaussian and exponential likelihoods, (3.19) and (3.26), respec-

tively (the log-normal case is analogous to the Gaussian case, therefore it

is not reported here). As benchmarks, the CRB and CHRB are also shown

in the plot. First, it is evident that the MLE ' θ (o) ML in the case of Gaussian likelihood is efficient since the MSE reaches the CRB and the CHRB, the

two being coincident. On the other hand, in the exponential case the CRB

cannot be computed because the regularity assumption (3.29) is not satis-

fied, therefore only the CHRB is shown in figure. The MSE of the MLEs

in this case is close, but not coincident, to the CHRB. A further fair com-

parison which can be made is among the Gaussian and the exponential case,

since the parameter choice is such that the variance of the observations is the

same: from the shown curves it is evident that the MSE in the exponential

case is lower than that in the Gaussian case as the number of observations K

increases. This behaviour can be verified by the MSE expressions (3.37) and

(3.38), reported in Appendix 3.A.5: in the Gaussian case, the MSE decreases

proportionally to 1/K, while in the exponential distribution case it decreases

proportional to 1/K2. 3.6.2 MSE Robustness with Log-Normally Distributed

Likelihood So far, no clock offset estimator has been proposed in the literature in case

the likelihood function of the readings is log-normal. The proposed estimator ' θ (o) ML (3.20) can then be used to determine an estimate of the clock offset in the maximum likelihood sense. Until now, if the likelihood of the readings

was log-normal, the available estimators were just those assuming the likeli-

hoods to be either Gaussian or exponentially distributed, therefore they are

expected to perform worse than the estimator (3.20) matched to the case of 3.6. Numerical Results 39 0 5 10 15 20 25 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 K MSE( θ(o) ) MLE '' Gaussian

CRB '' Gaussian

CHRB '' Gaussian

MLE '' Exp. Distr.

CHRB '' Exp. Distr. Figure 3.2: MSE and bounds for estimating θ(o) by using the MLE with

Gaussian and exponentially distributed likelihood. 40 3. Clock Offset Estimation: A Unified Framework 0 5 10 15 20 25 10 ''4 10 ''3 10 ''2 10 ''1 10 0 10 1 K MSE( θ(o) ) MLE '' Log''normal

MLE '' Gaussian Based

MLE '' Exp. Distr. Based

CRB

CHRB Figure 3.3: MSE robustness (in terms of the likelihood functions) and bounds

for the estimation of θ(o) in case of log-normally distributed likelihood. log-normal likelihood. Fig. 3.3 compares the MSE of the new proposed estimator (3.20) with that of the MLEs that (wrongly) assume the likelihood to be either Gaussian

or exponentially distributed, when the actual likelihood of the readings is log-

normally distributed. The figure shows how the two latter approaches are not

robust with respect to the likelihood distribution, since their performance is

extremely poor. Moreover, it can be observed how the proposed MLE (3.20)

is efficient since it achieves the CRB (as well as the CHRB). 3.7 Conclusions The new proposed framework allows to uniformly treat different likelihood

functions, by exploiting the common properties among them, and using no-

tation and fundamental properties of the exponential family of distributions.

This generalization process allows to tackle the ML estimation problem in a

more general fashion, which can be specialized at the specific need depend-

ing on the behaviour of the likelihood functions. This particular approach is 3.7. Conclusions 41 naturally based on convex optimization, therefore in the exponential distri-

bution case it represents an alternative view to the problem of clock offset

ML estimation with respect to what has been done up to now, under this

specific assumption [24]. Statistical bounds are provided for all the cases

of interests, while numerical results confirm the performance of the derived

MLEs. Moreover, it is worth to highlight that the provided framework allows to derive the expression of θ (o) ML in case the likelihood functions of the readings are log-normal. This result is new in the literature, although the log-normal

distribution is among the candidates for modelling the distribution of the

random portions of the delay. Simulation results show that the new MLE

performs better than the state-of-the-art in clock offset estimation. The ML estimation method outlined above differs from the previous work [24] in that it is based on convex optimization, while [24] maximized the

likelihood graphically. Hence, this approach presents an alternative view of

the ML estimation of the clock offset. It also allows us the determine the ML

estimator of θ(o) when the likelihood function is log-normally distributed.

In addition, Theorem 3.5 also provides a stepping stone for the findings

described in Chapter 4, where estimation of θ(o) in the Bayesian regime is

discussed. 42 3. Clock Offset Estimation: A Unified Framework 3.A Appendix 3.A.1 Proof of Theorem 3.2 1. N ξ, ' 2 ξ : fU (U |ξ) ' exp " '' P K

k=1 (U (k) '' ξ) 2 2'2 ξ # ' exp " '' P K

k=1 (U 2(k) '' 2ξU(k) + ξ2) 2'2 ξ # ' exp " ξ K X k=1 U(k) '2 ξ '' K ξ2 2'2 ξ # , so that (3.9) follows from comparison with (3.4a). 2. log N ξ, ' 2 ξ : fU (U |ξ) ' exp " '' P K

k=1 (log U (k) '' ξ) 2 2'2 ξ # ' exp " '' P K

k=1 log 2 U(k) '' 2ξ log U(k) + ξ2 2'2 ξ # ' exp " ξ K X k=1 log U(k) '2 ξ '' K ξ2 2'2 ξ # , therefore (3.10) follows, again from comparison with (3.4a). 3. E (λξ): fU (U |ξ) ' exp " ''λξ K X k=1 (U(k) '' ξ) # K Y k=1 I (U(k) '' ξ) ' exp " ''λξ K X k=1 U(k) + ξ K X k=1 λξ # K Y k=1 I (U(k) '' ξ) ' exp " ξ K X k=1 λξ # K Y k=1 I (U(k) '' ξ) , which leads to (3.11) through comparison with (3.5a). 3.A. Appendix 43 3.A.2 Proof of Theorem 3.4 The MLE of ξ (3.15a) can be determined by maximizing the exponent of

(3.14a), since it can be easily proved as strictly concave [52]. Setting to zero

the first derivative of the exponent of (3.14a) leads to (3.15a). A similar

reasoning for the MLE (3.14b) of ' brings to (3.15b). Finally, the MLE of

θ(o) (3.16) is derived using (3.3) and the invariance principle [ 47]. 3.A.3 Proof of Theorem 3.5 The strict concavity of the maximization problems (3.22a) and (3.22b) follows

from noting that the constraints U(k) ' ξ (resp. V (k) ' ') are linear functions of ξ (resp. ') and using similar arguments to those used in the

proof of Theorem 3.4. Moreover, the K constraints can be reduced to a single

constraint in each problem, by using the order statistics of the observations,

i.e., ' ξML = arg max ξ exp ξ K X k=1 ηξ(U(k)) '' K '2η ξ 2 ξ2 ! such that U(1) ' ξ ' 'ML = arg max ' exp ' K X k=1 η'(V (k)) '' K '2 η' 2 '2 ! such that V(1) ' ' . The solution of the unconstrained problem in (3.22a), ¯ ξ, is given by (3.15a). If ¯ ξ ' U(1), then the solution of (3.22a) is given by ' ξML = ¯ ξ. On the other hand, being the objective function concave, if U(1) ' ¯ ξ, then the ML estimate is given by ' ξML = U(1). The combination of the two cases leads to (3.23a). A similar reasoning for ' yields (3.23b). Finally, the MLE of θ(o) (3.24) is

derived using (3.3) and the invariance principle [47]. 44 3. Clock Offset Estimation: A Unified Framework 3.A.4 Proof of Theorem 3.7 The ratio of the likelihood functions can be expressed as fZ(Z; ρ + h) fZ(Z; ρ) = e( (ρ+h) P K

k =1 η(Z (k))''K '(ρ+h)) QK k=1 I(Z (k) '' ρ '' h) e( ρ P K

k =1 η(Z (k))''K '(ρ)) Q K

k =1 I(Z (k)''ρ) =e( h P K

k =1 η(Z (k))''K '(ρ+h)+K '(ρ) ) K Y k=1 I(Z(k) '' ρ '' h) =e( h P K

k =1 η(Z (k))) e(''K('(ρ+h)+'(ρ))) K Y k=1 I(Z(k) '' ρ '' h) . The expectation of the ratio of the likelihood functions can now be calculated

as E fZ(Z; ρ + h) fZ(Z; ρ) 2 = E " e( 2h P K

k =1 η(Z (k))) e(''2K('(ρ+h)+'(ρ))) K Y k=1 I(Z(k) '' ρ '' h) # = e(''2K('(ρ+h)+'(ρ)))E " e( 2h P K

k =1 η(Z (k))) K Y k=1 I(Z(k) '' ρ '' h) # = Mη(Z)(h) ''2K E " e( 2h P K

k =1 η(Z (k))) K Y k=1 I(Z(k) '' ρ '' h) # where it follows from (3.8) that Mη(Z)(h) ''2K = e''2K('(ρ+h)'''(ρ)) . Since the samples Z(k) are i.i.d, E " e( 2h P K

k =1 η(Z (k)) ) K Y k=1 I(Z(k) '' ρ '' h) # = E e(2hη(Z(k)))I(Z(k) '' ρ '' h) K . With δ( ·) defined in the theorem, the proof is complete. 3.A. Appendix 45 3.A.5 MSE Expressions for ML Estimators Gaussian Distribution If the likelihood for ξ is Gaussian distributed (see Tab. 4.1), the MLE is

given by (3.17). Since the variance of the readings Uj is '2ξ and the MLE

(3.17) is unbiased, it is straightforward to see that MSE ' ξML = '2 ξ K , and similarly for ' 'ML. Given (3.16) and Proposition 3.8, it can be concluded that MSE 'θ(o) ML = '2 ξ + ' 2 ' 4K . (3.37) Exponential Distribution If the likelihood for ξ is exponential distributed (see Tab. 4.1), the MLE is

given by (3.25). Through simple algebra it can be seen that U(1) is exponen-

tially distributed with parameter λ '

ξ = λξ K , so that Var ' ξML = 1 λ2 ξ K 2 . It can be noticed that ' ξML is a biased estimator for ξ, with bias bξ,ML = 1 λξK . Similarly, Var ' 'ML = 1 λ2 ' K 2 and b ',ML = 1 λ'K . Therefore, given (3.16) and Proposition 3.8, it can be concluded that MSE 'θ(o) ML = 0.25 K2 1 λ2 ξ + 1 λ2 ' ! + 0.25 K2 1 λξ '' 1 λ' 2 . (3.38) 46 3. Clock Offset Estimation: A Unified Framework Chapter 4 A Factor Graph-Based Clock

Offset Estimator The results presented in this chapter refer to a collaboration of Prof. Van-

gelista and me with Prof. Erchin Serpedin and Aitzaz Ahmad, both of them

with the Department of Electrical and Computer Engineering, Texas A&M

University, College Station, TX (USA). The research work has been per-

formed while I was on leave as a ''Visiting Scholar' at Texas A&M University,

from January to July 2011, partially supported by an ''A. Gini' fellowship.

Results have been summarized in two papers, one submitted to the journal

IEEE Transactions on Information Theory [48] and the other accepted for

publication at the conference ICASSP 2012 [53]. 4.1 Introduction This chapter presents a new clock offset estimator based on a Bayesian version

of the new framework introduced in Chapter 3. The estimation procedure

assumes a time-varying clock offset and it makes use of the factor graph

theory together with message passing for deriving a closed-form expression

of the estimator. Moreover, in order to measure the performance of the

estimator, Bayesian estimation bounds are derived for the MSE and existing

estimators are implemented, then showing that the new proposed estimator

performs better than the state-of-the-art in clock offset estimators and it

provides performance really close to the computed lower bounds. The assumptions made in Chapter 3 refer to the absence of clock skew between node pairs in a WSN and time-invariant clock offsets. Although this

model permits to derive a common framework and closed-form expressions

for clock offset estimators, it is generally not well suited to real world scenar- 47 48 4. A Factor Graph-Based Clock Offset Estimator ios, where climate conditions change over time, but they are not kept into

consideration. In fact, with fixed clock offsets, only fabrication imperfections

are modelled. Considering the fixed portion of the delay to be time-invariant

is also quite restrictive because it does encompass neither slow mobility of

the wireless nodes nor shadowing effects in urban environments. To deal with offset evolution over time, here it is assumed that clock offset evolves obeying to a Gauss-Markov model [29, 30, 42]. The Bayesian

estimation procedure is performed by the application of the factor graph

theory with the message passing algorithm. The basics of factor graphs are

explained in the next section. 4.2 Factor Graphs and Message Passing Factor graphs are bipartite graphical models used to better represent the

relations among variables in a global function. The global function is decom-

posed in a set of local sub-functions, called factors, each of them dependent

on a subset of variables. Generally, each factor is represented by a factor

node and each variable by an edge. The fundamental rule is that an edge is

connected to a particular factor node if and only if the variable associated to

that edge is an argument of the factor associated to that factor node [54]. Message passing algorithms are used to perform inference in factor graphs by exchanging messages (beliefs) among nodes in the graph. The sum-product

and the max-product algorithm are examples of message passing algorithms

in which summation and maximization are performed respectively at each

factor node. Here, the focus is on the max-product algorithm, and the ex-

changed messages are as follows variable to factor node : mx''f (x) = Y h ''n(x)\f mh''x (x) (4.1) factor node to variable : mf''x (x) = max \{x} f (Z) Y z ''n(f)\{x} mz''f (z) (4.2) where Z = n (f ) is the subset of variables from which the local function

f is dependent. The marginal distribution, or the belief, associated with a

variable is obtained by performing the product of all the incoming messages

to that variable node. 4.3. A Factor Graph-based Estimator 49 4.3 A Factor Graph-based Estimator The time-varying behaviour of the clock offset and the fixed portion of the

delay is kept into account by assuming a Gauss-Markov process for both the

parameters ξ and ', i.e., ξ(k) = ξ(k '' 1) + w(k) (4.3) '(k) = '(k '' 1) + v(k) for k = 1, . . . , K (4.4) with w(k) and v(k) i.i.d and w(k), v(k) ' N (0, ' 2). The Gauss-Markov model for the clock offset has already found application in previous works on

clock synchronization [29, 30, 42]. Under the Bayesian assumption, the a-posteriori pdf can be expressed as fa(ξ, ' |U, V ) ' fp(ξ, ')f''(U, V |ξ, ') = fp,ξ(ξ(0)) K Y k=1 fξ(ξ(k) |ξ(k '' 1))fp,'('(0)) K Y k=1 f'('(k) |'(k '' 1)) · K Y k=1 fU (U(k) |ξ(k))fV (V (k)|'(k)) (4.5) where the a-priori distributions fp,ξ(ξ(0)) and fp,'('(0)) are assumed uni-

form, i.e., they can be considered constant. To make the notation more

readable, the following definitions hold: ζk k ''1 = fξ(ξ(k)|ξ(k '' 1)) ' N (ξ(k '' 1), ' 2) , νk k ''1 = f'('(k)|'(k '' 1)) ' N ('(k '' 1), ' 2) , 'k = fU (U(k) |ξ(k)) , 'k = fV (V (k) |'(k)) . In the following treatment, only the case of constrained likelihood is anal- ysed, since the unconstrained likelihood is a subset of the constrained case,

as it will be clarified shortly. The common framework introduced in Chapter 3 is adapted to the Bayesian framework, therefore the likelihood functions fU and fV are assumed obey-

ing to the exponential family of distributions (3.21), so that they can be 50 4. A Factor Graph-Based Clock Offset Estimator expressed as fU (U(k) |ξ(k)) ' exp ξ(k)ηξ(U(k)) '' '2η ξ,k 2 ξ 2(k) ! I(U(k) '' ξ(k)) fV (V (k) |'(k)) ' exp '(k)η'(V (k)) '' '2η ',k 2 '2(k) ! I(V (k) '' '(k)) . (4.6) In Fig. 4.1, the a-posteriori pdf (4.5) is represented via factor graphs. As an example of factor graph representation of Gauss-Markov models, graphical

phase estimation has been performed in [55]. It is fundamental to notice

that the structure of the pdf naturally leads to a separation between terms

dependant on ξ(k) only and terms dependant on '(k) only. This means

that the resulting factor graph is actually composed by two sub-graphs, not

communicating to each other. This is the consequence of the decoupling

between the two parameters derived in Chapter 3 and still valid. Moreover,

through inspection of (4.5) it can be inferred that the two sub-graphs have a

similar structure, therefore in the following the message passing is described

for ξ only, being the case for ' completely equivalent. Finally, the sub-graphs

in Fig. 4.1 are cycle-free, therefore inference through message passing in these

graphs is optimal [54]. To make the notation easier, it is convenient to define the following quan- tities αξ,k = '' '2η ξ,k 2 , βξ,k = ηξ(U(k)) so that the constrained likelihood function 'k can be expressed as 'k ' exp αξ,kξ 2(k) + βξ,kξ(k) I(U(k) '' ξ(k)) . (4.7) The next step is to apply the message passing mechanism to derive an estimate for θ(o)(K). 4.3.1 Message Computation The application of the max-product algorithm leads to message passing be-

tween variable nodes and factor nodes. The procedure begins at the factor

node 'K, from which a message is sent to the variable ξ(K) (see Fig. 4.2),

i.e., m' K ''ξ(K) = 'K . 4.3. A Factor Graph-based Estimator 51 ζ0 ζ1 0 ξ(0) ξ(1) '1 ζK''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K ν0 ν1 0 '(0) '(1) '1 νK''1 K ''2 νK K ''1 '(K '' 1) 'K''1 '(K) 'K Figure 4.1: Factor graph representation of the density (4.5). Distribution fU (U(k) |ξ) ηξ(U(k)) 'ξ(ξ(k)) '2η ξ N (ξ(k), ' 2 ξ ) ' exp h '' (U (k) ''ξ(k)) 2 2'2 ξ i U (k) 2'2 ξ ξ2(k) 2'2 ξ 1 '2 ξ log N (ξ(k), ' 2 ξ ) ' exp h '' (log U (k) ''ξ(k)) 2 2'2 ξ i log U (k) 2'2 ξ ξ2(k) 2'2 ξ 1 '2 ξ λξ exp [ ''λξ (U(k) '' ξ(k))] · E (λξ) λξ 0 0 ·I (U(k) '' ξ(k)) Table 4.1: Specialization of the proposed new general framework for time-

varying parameters. 52 4. A Factor Graph-Based Clock Offset Estimator ζ0 ζ1 0 ξ(0) ξ(1) '1 ζK''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.2: The factor 'K sends the message m' K ''ξ(K) to the variable ξ(K). ζ0 ζ10 ξ(0) ξ(1) '1 ζ K ''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.3: The variable ξ(K) sends the message mξ(K)''ζK K''1 to the factor ζK K ''1. ζ0 ζ10 ξ(0) ξ(1) '1 ζK''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.4: The factor ζK K ''1 sends the message mζ K K''1''ξ(K ''1) to the variable ξ(K '' 1). 4.3. A Factor Graph-based Estimator 53 Such message is then relayed to the factor node ζK K ''1 (see Fig. 4.3), i.e., mξ(K)''ζK K''1 = 'K . which computes the product of such message with the factor ζK K ''1. The resulting message is then sent to the variable ξ(K ''1) after the maximization (cf. (4.2)) process, performed over the variable ξ(K) (see Fig. 4.4), i.e., mζK K''1 ''ξ(K ''1) ' max ξ(K) ζK K ''1 · mξ(K)''ζ K K''1 = max ξ(K) 1 '' 2''2 exp ''(ξ(K) '' ξ(K '' 1)) 2 2'2 · exp αξ,Kξ 2(K) + βξ,Kξ(K) I(U(K) '' ξ(K)) that can be rewritten in a more general form as mζK K''1 ''ξ(K ''1) ' max ξ(K) 'U(K) exp Aξ,Kξ 2(K) + Bξ,Kξ2(K '' 1)+ + Cξ,Kξ(K)ξ(K '' 1) + Dξ,Kξ(K) (4.8) with Aξ,K = '' 1 2'2 + αξ,K , Bξ,K = '' 1 2'2 Cξ,K = 1 '2 , Dξ,K = βξ,K . (4.9) To determine the message (4.8) the procedure is the same as that followed

in the proof of Theorem 3.5 with a single available reading. Indeed, by

indicating with ¯ ξ(K) the unconstrained maximizer of the exponent in the objective function, i.e., ¯ ξ(K) = arg max ξ(K) Aξ,Kξ 2(K)+Bξ,Kξ2(K''1)+Cξ,Kξ(K)ξ(K''1)+Dξ,Kξ(K) , it follows that ¯ ξ(K) = '' Cξ,Kξ(K '' 1) + Dξ,K 2Aξ,K , (4.10) therefore the solution of the maximization problem in (4.8) is given by ' ξ(K) = min ¯ ξ(K), U(K) . If ¯ ξ(K) ' U(K), the estimation problem is solved. If instead ¯ ξ(K) < U(K), then the chain has to be traversed backward. On the other hand, the expres-

sion of ¯ ξ(K) depends on ξ(K '' 1) which is not determined at this stage. To 54 4. A Factor Graph-Based Clock Offset Estimator ζ0 ζ10 ξ(0) ξ(1) '1 ζ K ''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.5: The variable ξ(K '' 1) sends the message mξ(K''1)''ζK''1 K''2 to the factor ζ K ''1 K ''2 . complete the computation, further messages need to be propagated backward

in the graph from the variable node ξ(K '' 1). It is now assumed that ¯ ξ(K) ' U(K), therefore ¯ ξ(K) from (4.10) is plugged back in (4.8) to obtain a modified version of (4.8), that is mζK K''1''ξ(K ''1) ' exp ( Bξ,K '' C2 ξ,K 4Aξ,K ξ2(K '' 1) '' Cξ,KDξ,K 2Aξ,K ξ(K '' 1) ) . (4.11) The next step is the message sent to the factor node ζK''1 K ''2 from the variable node ξ(K '' 1), which is the product between the message mζK K''1''ξ(K ''1) and the message received from the factor node 'K''1 (see Fig. 4.5), i.e., m ξ(K ''1)''ζ K''1 K''2 = mζK K''1''ξ(K ''1) · m' K''1''ξ(K ''1) . At this stage, the factor node ζK''1 K ''2 takes the maximization over ξ(K '' 1) of the product between the incoming message and the factor ζK''1 K ''2 , in order to deliver the following message to the variable node ξ(K '' 2): m ζ K''1 K''2 ''ξ(K ''2) ' max ξ(K ''1)'U(K''1) ζK''1 K ''2 · mξ(K''1)''ζ K''1 K''2 = max ξ(K ''1) 1 '' 2''2 exp '' (ξ(K '' 1) '' ξ(K '' 2)) 2 2'2 · exp Bξ,K '' C2 ξ,K 4Aξ,K ξ2(K '' 1) '' Cξ,KDξ,K 2Aξ,K ξ(K '' 1) · exp αξ,K''1ξ 2(K '' 1) + βξ,K''1ξ(K '' 1) I(U(K '' 1) '' ξ(K '' 1)) . As before, the message above can be rewritten in terms of A, B, C and D as m ζ K''1 K''2 ''ξ(K ''2) ' max ξ(K ''1)'U(K''1) exp(Aξ,K''1ξ 2(K '' 1)+ Bξ,K''1ξ(K '' 2) 2 + Cξ,K''1ξ(K '' 1)ξ(K '' 2) + Dξ,K''1ξ(K '' 1)) (4.12) 4.3. A Factor Graph-based Estimator 55 with Aξ,K''1 = '' 1 2'2 + αξ,K''1 + Bξ,K '' C2 ξ,K 4Aξ,K , Bξ,K''1 = '' 1 2'2 , Cξ,K''1 = 1 '2 Dξ,K''1 = βξ,K''1 '' Cξ,KDξ,K 2Aξ,K . As before, one can compute the unconstrained maximizer ¯ ξ(K '' 1) of the objective function above, which turns out to be given by ¯ ξ(K '' 1) = '' Cξ,K''1ξ(K '' 2) + Dξ,K''1 2Aξ,K''1 and the solution to the maximization problem (4.12) is again expressed as ' ξ(K '' 1) = min ¯ ξ(K '' 1), U(K '' 1) . In the same way as previously seen, ¯ ξ (K '' 1) depends on ξ(K '' 2) therefore at least another set of messages must be passed backward on the factor graph

(see Fig. 4.6). Once ¯ ξ(K '' 1) is plugged back in (4.12), it holds that m ζ K''1 K''2 ''ξ(K ''2) ' exp Bξ,K''1 '' C2 ξ,K ''1 4Aξ,K''1 ξ 2(K '' 2) '' Cξ,K''1Dξ,K''1 2Aξ,K''1 ξ(K '' 2) . (4.13) Note that the above has a form similar to (4.11). By keeping traversing the

graph, messages similar to (4.11) and (4.13) are obtained, with opportune

definitions for the quantities A, B, C and D. A generalization of what has just been seen leads to, for i = 1, . . . , K '' 1 Aξ,K''i = '' 1 2'2 + αξ,K''i + Bξ,K''i+1 '' C2 ξ,K ''i+1 4Aξ,K''i+1 Bξ,K''i = '' 1 2'2 , Cξ,K''i = 1 '2 Dξ,K''i = βξ,K''i '' Cξ,K''i+1Dξ,K''i+1 2Aξ,K''i+1 (4.14) and ¯ ξ(K '' i) = '' Cξ,K''iξ(K '' i '' 1) + Dξ,K''i 2Aξ,K''i (4.15) ' ξ(K '' i) = min ¯ ξ(K '' i), U(K '' i) . (4.16) 56 4. A Factor Graph-Based Clock Offset Estimator ζ0 ζ10 ξ(0) ξ(1) '1 ζ K ''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.6: The factor ζK''1 K ''2 sends the message mζ K''1 K''2 ''ξ(K ''2) to the variable ξ(K '' 2). The message passing keeps going until the end of the graph is reached, which

means that, for i = K '' 1 and using (4.15) and (4.16) one obtains ¯ ξ(1) = '' Cξ,1ξ(0) + Dξ,1 2Aξ,1 (4.17) ' ξ(1) = min ¯ ξ(1), U(1) . (4.18) Similarly, it follows that mζ1 0 ''ξ0 ' exp Bξ,1 '' C2 ξ,1 4Aξ,1 ξ 2(0) '' Cξ,1Dξ,1 2Aξ,1 ξ(0) . (4.19) The estimate ' ξ(0) can be obtained by maximizing the above message, which naturally leads to an unconstrained maximization issue, thus yielding ' ξ(0) = ¯ ξ(0) = max ξ(0) mζ1 0 ''ξ(0) '' ' ξ(0) = Cξ,1Dξ,1 4Aξ,1Bξ,1 '' C2ξ,1 . (4.20) Now that the estimate in (4.20) is fully determinable from the observations:

its value can be used to solve for ' ξ(1) in (4.18) and it can then be plugged back in the chain using the recursions (4.15) and (4.16). 4.3.2 Closed-Form Expressions For ' θ (o) ML(K ) The aim of this section is to provide a concise expression for the estimator ' θML(K), by defining new quantities: given the linear function of x gξ,k(x) = '' Cξ,kx + Dξ,k 2Aξ,k , (4.21) 4.3. A Factor Graph-based Estimator 57 which has a key property, summarized in the following lemma. Lemma 4.1 For real numbers x1 and x2, the function gξ,k( ·), defined in (4.21), satisfies gξ,k (min(x1, x2)) = min (gξ,k(x1), gξ,k(x2)) . Proof: The constants Aξ,k, Cξ,k and Dξ,k are defined in (4.9) and (4.14).

The proof follows by observing that ''Cξ,k 2Aξ,k > 0, which implies that gξ,k is a monotonically increasing function. Using the just defined function gξ,k, the following chain of equalities can be stated as ¯ ξ(1) = gξ,1 ' ξ(0) ' ξ(1) = min U(1), gξ,1 ' ξ(0) ¯ ξ(2) = gξ,2 ' ξ(1) ' ξ(2) = min U(2), gξ,2 ' ξ(1) with gξ,2 ' ξ(1) = gξ,2 min U(1), gξ,1 ' ξ(0) = min gξ,2 (U(1)) , gξ,2 gξ,1 ' ξ(0) (4.22) where (4.22) holds thanks to Lemma 4.1. In turn, the estimate ' ξ(2) can be rewritten as ' ξ(2) = min U(2), min gξ,2 (U(1)) , gξ,2 gξ,1 ' ξ(0) = min U(2), gξ,2 (U(1)) , gξ,2 gξ,1 ' ξ(0) . With the same procedure, it holds ' ξ3 = min U(3), gξ,3 (U(2)) , gξ,3 (gξ,2 (U(1))) , gξ,3 gξ,2 gξ,1 ' ξ(0) . For m ' j, it is convenient to define the function Gm ξ,j (.) = gξ,m (gξ,m''1 . . . gξ,j (.)) , (4.23) 58 4. A Factor Graph-Based Clock Offset Estimator so that the estimate ' ξ3 can be compactly rewritten as ' ξ3 = min U(3), G3 ξ,3 (U (2)) , G 3 ξ,2 (U (1)) , G 3 ξ,1 ' ξ(0) . It must be noted that the just presented estimators only depends on functions of data, therefore they are fully determined. By keeping on plug-

ging back the quantities in the chain, the estimate for ξ(K) can be readily

obtained. Analogous expressions can be obtained for ', by keeping in mind that A',K''i, B',K''i, C',K''i and D',K''i for i = 0, . . . , K ''1, can be obtained from

(4.9) and (4.14) by substituting αξ,K''i and βξ,K''i with the proper α',K''i and

β',K''i, respectively. By making use of these constants, ' '(0), g',k and Gm ',j can be defined analogously to (4.20), (4.21) and (4.23). The closed form expression for the clock offset estimate ' θ(o) (K) is given by the following theorem. Theorem 4.2 The state estimates ' ξ(K) and ' '(K) for the posterior pdf in (4.5) can be expressed as ' ξ(K) = min U(K), G K ξ,K (U (K '' 1)) , . . . , G K ξ,2 (U (1)) , G K ξ,1 ' ξ(0) ' '(K) = min V (K), GK ',K (V (K '' 1)) , . . . , G K ξ,2 (V1) , G K ξ,1 ' '(0) and the factor graph-based clock offset estimator (FGOE) ' θ(o)(K) is given by ' θ(o)(K) = ' ξ(K) '' ' '(K) 2 . (4.24) Proof: The proof follows from the previous discussion and by making use of

(3.3). As it has been previously remarked, the closed form expressions in The- orem 4.2 allow to estimate the time-varying clock offset θ(o)(K), when the

likelihood functions, f (U(k) |ξ(k)) and f(V (k)|'(k)), have either a Gaussian, exponential or log-normal distribution. The following sections present the an-

alytical results when the general framework here adopted is particularized to

all these three distributions. Remark If the likelihood functions fU (U(k) |ξ(k)) and fV (V (k)|'(k)) are unconstrained, it follows that the solution of the maximization problem max ξ(k) exp Aξ,kξ 2(k) + Bξ,kξ2(k '' 1) + Cξ,kξ(k)ξ(k '' 1) + Dξ,kξ(k) 4.3. A Factor Graph-based Estimator 59 coincides with the unconstrained maximizer, i.e., ' ξ(k) = ¯ ξ(k) ''k = 1, . . . , K. Hence, the unconstrained likelihood maximization problem is subsumed in the

message passing framework for constrained likelihood maximization. Gaussian Distribution In case of Gaussian likelihoods (see Tab. 4.1), the functions fU (U(k) |ξ(k)) and fV (V (k) |'(k)) are unconstrained. By keeping in mind this, it follows from Theorem 4.2 that ' ξ(K) for Gaussian distributed observations U(k) is given by ' ξ(K) = G K ξ,1 ' ξ (0) . To determine the constants in (4.9) and (4.14) it is sufficient to compare the

Gaussian pdf (see Tab.4.1) with (4.7), to get αξ,k = '' 1 2'2 ξ,k , βξ,k = U(k) '2 ξ,k , (4.25) and using these values in (4.9) and (4.14), thus obtaining Aξ,K = '' 1 2'2 '' 1 2'2 ξ,K , Bξ,K = '' 1 2'2 Cξ,K = 1 '2 , Dξ,K = U(K) '2 ξ,K (4.26) Aξ,K''i = '' 1 2'2 '' 1 2'2 ξ,K ''i + Bξ,K''i+1 '' C2 ξ,K ''i+1 4Aξ,K''i+1 Bξ,K''i = '' 1 2'2 , Cξ,K''i = 1 '2 Dξ,K''i = U(K '' i) '2 ξ,K ''i '' Cξ,K''i+1Dξ,K''i+1 2Aξ,K''i+1 for i = 1, . . . , K '' 1. Using similar arguments, the estimate ' '(K) is given by ' '(K) = GK ',1 ' '(0) . Finally, the FGOE ' θ(o)(K), is expressed (3.3) as ' θ(o)(K) = GK ξ,1 ' ξ(0) '' G K ',1 ' '(0) 2 . (4.27) 60 4. A Factor Graph-Based Clock Offset Estimator It is interesting to observe the behaviour of the FGOE ' θ(o)(K) when the Gauss-Markov model variance '2 tends to zero. Consider gξ,K(ξ) = '' Cξ,Kξ + Dξ,K 2Aξ,K where the constants Aξ,K, Bξ,K, Cξ,K and Dξ,K are determined by (4.26).

After some manipulation, it holds that gξ,K(ξ) = '2 ξ ξ + ' 2U(K) '2 ξ + ' 2 , and it is quite straightforward to see that lim '2 ''0 gξ,K(ξ) = ξ , Similarly, gξ,K''1(ξ) '' ξ as ' 2 '' 0. It can be shown that in the low system noise regime, as '2 '' 0, it holds that ' ξ(K) '' ' ξ(0) = Cξ,1Dξ,1 4Aξ,1Bξ,1 '' C2ξ,1 . The same relations are valid for ', for which it can be shown that ' '(K) '' ' '(0) = C',1D',1 4A',1B',1 '' C2',1 . Therefore ' θ(o)(K) '' ' ξ(0) '' ' '(0) 2 , which can be proven to be equal to the MLE (3.19). Log-Normal Distribution When the likelihood functions are log-normally distributed (see Tab. 4.1), it

holds that αξ,k = '' 1 2'2 ξ,k βξ,k = log U(k) '2 ξ,k . It is clear that the only difference with the Gaussian case is a different ex-

pression of βξ,k. Being an unconstrained case of likelihood functions, ' ξ(K) in this case is again ' ξ(K) = GK ξ,1 ' ξ(0) , 4.3. A Factor Graph-based Estimator 61 with the constants in (4.9) and (4.14) given by Aξ,K = '' 1 2'2 '' 1 2'2 ξ,K , Bξ,K = '' 1 2'2 Cξ,K = 1 '2 , Dξ,K = log U(K) '2 ξ,K Aξ,K''i = '' 1 2'2 '' 1 2'2 ξ,K ''i + Bξ,K''i+1 '' C2 ξ,K ''i+1 4Aξ,K''i+1 Bξ,K''i = '' 1 2'2 , Cξ,K''i = 1 '2 Dξ,K''i = log U(K '' i) '2 ξ,K ''i '' Cξ,K''i+1Dξ,K''i+1 2Aξ,K''i+1 for i = 1, . . . , K '' 1. The same procedure leads to similar results about '. Hence, the FGOE ' θ(o)(K) can be expressed as ' θ(o)(K) = GK ξ,1 ' ξ(0) '' G K ',1 ' '(0) 2 . (4.28) Again, as the Gauss-Markov system noise '2 '' 0, the FGOE approaches its ML counterpart derived in Chapter 3, cf. (3.20). Exponential Distribution If the likelihood functions are exponentially distributed (see Tab. 4.1), it

means that they represent a case of constrained likelihoods. The quantities

αξ,k and βξ,k are given by αξ,k = 0, βξ,k = λξ , so that the constants Aξ,k, Bξ,k, Cξ,k and Dξ,k can be written as Aξ,k = '' 1 2'2 , Bξ,k = '' 1 2'2 Cξ,k = 1 '2 , Dξ,k = λξ for all k = 1, . . . , K. Using Theorem 4.2, it follows that GK ξ,K (U (K '' 1)) = '' Cξ,KU(K '' 1) + Dξ,K 2Aξ,K = U(K '' 1) + λξ' 2 GK ξ,K ''1(U (K '' 2)) = U (K '' 2) + 2λξ' 2 62 4. A Factor Graph-Based Clock Offset Estimator and so on. The estimator ' ξ(0) can be evaluated as ' ξ(0) = Cξ,1Dξ,1 4Aξ,1Bξ,1 '' C2ξ,1 = + '' , so that GK ξ,1( ' ξ(0)) = + '' . (4.29) Finally, using (4.29) and Theorem 4.2, it holds that ' ξ (K) = min(U(K), U(K '' 1) + λξ' 2, U(K '' 2)+ 2λξ' 2, U(1) + (K '' 1)λξ'2) . (4.30) Similar results hold for ', so that the estimate ' ' (K) is given by ' '(K) = min(V (K), V (K '' 1) + λ'' 2, V (K '' 2)+ 2λ'' 2, V (1) + (K '' 1)λ''2) (4.31) and the combination of the two estimators above provides the estimate ' θ(o)(K) provides (4.24) ' θ (o)(K) = 1

2 min(U(K), U(K '' 1) + λξ' 2, U(K '' 2)+ 2λξ' 2, U(1) + (K '' 1)λξ'2)'' 1

2 min(V (K), V (K '' 1) + λ'' 2, V (K '' 2)+ 2λ'' 2V (1) + (K '' 1)λ''2) . (4.32) As the Gauss-Markov system noise '2 '' 0, the above estimator yields ' θ(o)(K) '' 'θML = min (U(K), . . . , U(1)) '' min (V (K), . . . , V (1)) 2 , which is the MLE (3.26). 4.4 Bayesian Statistical Bounds The purpose of this section is to provide statistical bounds to the clock offset

estimation error in the Bayesian framework, similarly to what performed in

Section 3.5 for the classical case. As in that case, the likelihood is assumed

belonging to the exponential family of distributions with the general expres-

sion for the log-partition function, so that the analytical results are valid for 4.4. Bayesian Statistical Bounds 63 a large class of distributions. Moreover, the results can be used by themselves

in other Bayesian estimation contexts. By assuming the general notation (3.27) and (3.28) for the likelihoods, the following sections provide Bayesian bounds for the variance of the error

performed in estimating ρ. To this purpose, the Bayesian Cramer-Rao bound

and a Bayesian version of the Chapman-Robbins bound are derived. 4.4.1 Bayesian Cramér-Rao Lower Bound In this section, the Bayesian Cramér-Rao lower bound (BCRB) is derived,

which requires regularity conditions analogous to (3.29) for being computed. The BCRB states that the variance of the estimator ' ρ(k) of ρ(k) is bounded below by the lower-right sub-matrix of the inverse of the Bayesian

information matrix, J''1 CR(k) [56], i.e., Var (' ρ(k)) ' J'' 1 CR(k) = [J ''1 CR(k)]kk (4.33) where the Bayesian information matrix is given by [JCR(k)]ij = E '' log f (Zk, ρk) ''ρ(i) '' log f (Zk, ρk) ''ρ(j) = ''E ''2 log f (Zk, ρk) ''ρ(i)''ρ(j) . The expectation is taken with respect to the joint pdf and the following

definitions hold Zk = [Z(1), . . . , Z(k)] ' ρk = [ρ(0), ρ(1), . . . , ρ(k)] ' fZ(Z(k) |ρ(k)) ' exp (η(Z(k))ρ(k) '' 'k(ρ(k))) . (4.34) By assuming that the parameter ρ(k) follows a Gauss-Markov model, the

evolution can be described by fρ(ρ(k) |ρ(k '' 1)) = 1 '' 2''2 exp '' (ρ(k) '' ρ(k '' 1)) 2 2'2 . (4.35) In [57], a recursive formula for evaluating the Bayesian sub-matrix was de-

rived, being JCR(k + 1) = '' E (2) CR(k) JCR(k) + E (1) CR(k) ''1 E (2) CR(k) + E (3A) CR (k) + E (3B) CR (k) (4.36) 64 4. A Factor Graph-Based Clock Offset Estimator with E (1) CR(k) = E '' ''2 ''ρ2(k) log fρ(ρ(k + 1) |ρ(k)) E (2) CR(k) = E '' ''2 ''ρ(k)''ρ(k + 1) log fρ(ρ(k + 1) |ρ(k)) E (3A) CR (k) = E '' ''2 ''ρ2(k + 1) log fρ(ρ(k + 1) |ρ(k)) E (3B) CR (k) = E '' ''2 ''ρ2(k + 1) log fρ(Z(k + 1) |ρ(k + 1)) and the expectation is again taken with respect to the joint pdf. The fol-

lowing theorem specializes the recursive formula (4.36) to the Gauss-Markov

model. Theorem 4.3 Assuming the Bayesian framework in (4.34) and (4.35), the

recursive Bayesian information matrix in (4.36) is given by JCR(k + 1) = ' 2 + J''1 CR(k) ''1 + '2 ηk (4.37) with JCR(0) = 0. Proof: Using the density functions, fρ(ρ(k) |ρ(k''1)) in (4.35) and fZ(Zk|ρ(k)) in (4.34), it can be proved that E (1) CR(k) = 1 '2 , E (2) CR(k) = '' 1 '2 , E (3A) CR (k) = 1 '2 and E (3B) CR (k) = Z Z ''2'k(ρ(k + 1)) ''ρ2(k + 1) fρ(ρ(k + 1), Z(k + 1))dρ(k + 1)dZ(k + 1) = ''2'k(ρ(k + 1)) ''ρ2(k + 1) = '2 ηk . The thesis easily follows by plugging these quantities in (4.36). 4.4.2 Bayesian Chapman-Robbins Bound In the case the regularity assumptions required by the BCRB are not satisfied

(such as for the constrained likelihood functions), a Bayesian version of the

Chapman-Robbins bound (BCHRB) can be used to provide a lower bound

on the variance of an estimator of ρ(k). 4.4. Bayesian Statistical Bounds 65 The BCHRB states that the variance of an estimator ' ρk of ρk is lower bounded as Var( ' ρk) '' [Tk(hk) '' 1] ''1 h kh T k 0 with in the positive semi-definite sense, where Tk(hk) = E " f (Zk, ρk + hk) f (Zk, ρk) 2 # , and hk = [0, h1, . . . , hk] ' . Theorem 4.4 The BCHRB for the parameter ρ(k) can be expressed as Var(' ρ(k)) ' 1 JCH,k where JCH,k = inf hk Tk(hk) '' 1 h2 k and Tk(hk) = k Y j=1 M''2 η (hj)Mη(2hj) ! ' exp " 1 '2 k X j=1 (hj '' hj''1) 2 # . (4.38) Proof: See Appendix 4.A.1 for details. 4.4.3 MSE and Variance of the Estimation Error As for the classical case, also under the Bayesian framework the bounds

on the variance of the estimation error of ξ and ' (substituting ρ in the

likelihoods) can be used to lower bound the MSE of the clock offset θ(o)(K),

by making use of Proposition 3.8. Gaussian Distribution - BCRB When the likelihood function for ξ(k) is Gaussian distributed (see Tab. 4.1),

by using (4.25) and (4.37), it follows that JCR,ξ (k + 1) = ' 2 + J''1 CR ,ξ (k) ''1 + 1 '2 ξ,k , 66 4. A Factor Graph-Based Clock Offset Estimator with JCR,ξ (0) = 0. A similar recursion for JCR,' (k) can be derived. The

MSE of θk can be then lower bounded as Var(' θ(o)(k)) ' 1

4 1 JCR,ξ (k) + 1 JCR,' (k) . (4.39) Exponential Distribution - BCHRB If the likelihood for ξ(k) is exponentially distributed (see Tab. 4.1), (4.38)

becomes Tk(hk) = exp λξ k X j=1 hj ! exp " 1 '2 k X j=1 (hj '' hj''1) 2 # . In fact, it just needs to be noticed that 'ξ (ξ(k)) is constant over ξ(k) and

ηξ(U(j)) = λξ, so that (4.41) turns out to be E [exp (2hjηξ(U(j)))] = exp (λξhj) therefore S(hk) = exp λξ P k

j=1 hj . 4.5 Numerical Results Numerical results are provided in this section to corroborate the theoretical

findings of this chapter in the various scenarios. As already stated, the

performance metric used to evaluate the proposed estimators is the MSE

in estimating θ(o)(K). Parameters are chosen as 'ξ = '' = 0.1 for both

Gaussian and log-normally distributed likelihoods, while λξ = λ' = 10 for

exponentially distributed likelihood functions, so that, as in Chapter 3, the

variance of the observations is the same in all the three cases. Finally, the

variance of the Gauss-Markov model (4.4) is set to ' = 10''4. 4.5.1 MSE Performance and Bounds Assuming either Gaussian or exponentially distributed likelihoods, perfor-

mance of the FGOEs proposed in Section 4.3 in these two specific cases are

here compared with the Bayesian bounds introduced in Section 4.4. Fig. 4.7

reports the MSE performance of the FGOEs ' θ(o) (K) (4.27) and (4.32) and compares it with the BCRB and the BCHRB. As in the classical estimation

scenario, for Gaussian likelihoods the MSE using (4.27) overlaps the reported 4.5. Numerical Results 67 0 5 10 15 20 25 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 10 ''1 10 0 10 1 K MSE( θ(o) (K)) FGOE '' Gaussian

BCRB '' Gaussian

BCHRB '' Gaussian

FGOE '' Exp. Distr.

BCHRB '' Exp. Distr. Figure 4.7: MSE and bounds for estimating θ(o)(K) when using the proposed

FGOE with Gaussian and exponentially distributed likelihood. bounds. The MSE of the FGOE in the case of exponentially distributed like-

lihoods (4.32) is also plotted against the BCHRB as well in Fig. 4.7. In this

case, the MSE is quite close to BCHRB, although it does not coincide with

it, as exactly was the case in the classical estimation framework. 4.5.2 MSE Robustness with Log-Normally Distributed

Likelihood Fig. 4.8 shows the MSE in case the likelihood is log-normally distributed and

the various FGOEs are used: it can be seen that the FGOEs that (wrongly)

assume that the likelihood is either Gaussian or exponentially distributed

(dashed and dotted curve, respectively) have a quite bad performance as the

MSE is unbounded and unpredictable. On the other hand, the proposed esti-

mator ' θ (o) ML(K ) (4.28) derived under the assumption of log-normal likelihoods has extremely good performance, establishing a strong improvement with re-

spect the other estimators in this case. Moreover, the MSE curve overlaps

the reported BCRB (which, in turn, coincides with the BCHRB), denoting 68 4. A Factor Graph-Based Clock Offset Estimator 0 5 10 15 20 25 10 ''4 10 ''2 10 0 10 2 10 4 10 6 10 8 K MSE( θ(o) (K)) FGOE '' Log''normal

FGOE '' Gaussian Based

FGOE '' Exp. Distr. Based

BCRB

BCHRB Figure 4.8: MSE and bounds for the estimation of θ(o)(K) in case of log-

normal likelihood. estimator efficiency. 4.5.3 Classical vs Bayesian framework The estimators proposed in the classical and the Bayesian framework in

Chapter 3 and this chapter, respectively, can also be compared with each

other on the basis of their MSE performance as the system noise varies. The

goal of this analysis here is to show that the performance of the Bayesian

estimators approaches the performance of the MLEs of Chapter 3 as ' '' 0. Fig. 4.9 shows the MSE for the cases of Gaussian, exponential and log- normal likelihoods for K = 25. In the plot, the MSEs in the classical frame-

work are represented by the horizontal lines obtained with the MLEs using

(3.37) and (3.38) (see Appendix 3.A.5). It can be observed how the MSE ob-

tained with the use of the FGOEs for estimating θ(o)(K) almost approaches

the MSE of the MLEs as ' < 10''3, for all the three considered distributions. 4.5. Numerical Results 69 10 ''5 10 ''4 10 ''3 10 ''2 10 ''1 10 0 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 ' MSE( θ(o) (K)) Gaussian

Exp. Distr.

Log''normal Figure 4.9: MSE in the estimation of θ(o)(K) vs ' using FGOEs. The MSE

in the Bayesian case approaches the curves of the MLE as ' '' 0. 70 4. A Factor Graph-Based Clock Offset Estimator 4.6 Conclusions In this chapter, the model of Chapter 3 is improved by introducing a Gauss-

Markov model regulating the time behaviour of the clock offset related quan-

tities ξ and '. Extending the clock model by incorporating time-varying

offset provides a more realistic scenario for actual oscillators in sensor nodes.

Based on this, the general framework for the likelihood functions of the ob-

servations and the theory of factor graphs and message passing are exploited

in order to derive closed-form expressions for clock offset estimators under

the Bayesian framework, via the use of max-product algorithm. Statistical

bounds derived under the Bayesian framework in order to assess the quality

of the proposed FGOEs. The provided bounds are valid for a generic distri-

bution of the exponential family, so that they can also be used in scenarios

other than clock synchronization. The reported numerical studies confirm

the good behaviour of the proposed estimators. It can be concluded that

the FGOEs introduced in this chapter are valid alternatives to the state-of-

the-art in clock offset estimator when time-varying clock offset is realistically

assumed. 4.A. Appendix 71 4.A Appendix 4.A.1 Proof of Theorem 4.4 It holds that Tk(hk) =E " f (Zk, ρk + hk) f (Zk, ρk) 2 # = Z + '' '''' Z + '' '''' f (Zk, ρk + hk) f (Zk, ρk) 2 f (Zk, ρk)dZkdρk = S(hk) Z + '' '''' f (ρk + hk)2 f (ρk) dρk with S(hk) = Z + '' '''' f (Zk|ρk + hk) f (Zk |ρk) 2 f (Zk |ρk)dZk . (4.40) Continuing with the calculations Th(hk) =S(hk) Z + '' '''' f (ρk + hk)2 f (ρk) dρk =S(hk) Z + '' '''' f 2(ρ0 + h0) f (ρ0) ' k Y j=1 f 2(ρ(j) + hj |ρj''1 + hj''1) f (ρ(j) |ρj''1) dρk =S(hk) Z + '' '''' k Y j=1 f 2(ρ(j) + hj |ρj''1 + hj''1) f (ρ(j) |ρj''1) dρk . Since the quantity f 2(ρ(j)+hj |ρj''1+hj''1) f (ρ(j) |ρj''1) results to be equal to (j = 1, . . . , k) 1 ' '' 2' exp '' ρ2 j ''1 2'2 + ρ(j) + 2(hj '' hj''1) '2 ρj''1 ' exp '' (hj '' hj''1) 2 '2 exp '' ρ(j)2 2'2 '' 2ρ(j)(hj '' hj''1) '2 it follows that Z + '' '''' f 2(ρ(j) + hj |ρj''1 + hj''1) f (ρ(j) |ρj''1) dρj''1 = exp (hj '' hj''1) 2 '2 72 4. A Factor Graph-Based Clock Offset Estimator to get Tk(hk) = S(hk) exp " 1 '2 k X j=1 (hj '' hj''1) 2 # . Moreover, (4.40) can be written as S(hk) = k Y j=1 Z + '' '''' f (Zj|ρ(j) + hj) f (Zj |ρ(j)) 2 f (Zj |ρ(j))dZj , and it can be noted that f (Zj|ρ(j) + hj) f (Zj |ρ(j)) 2 = exp [ ''2 ('ρ(ρ(j) + hj) '' 'ρ(ρ(j)))] ' exp (2hjηρ(Zj)) . Therefore Z + '' '''' f (Zj|ρ(j) + hj) f (Zj |ρ(j)) 2 f (Zj |ρ(j))dZj = M'' 2 ηρ (hj )' E [exp (2hjηρ(Zj))] . Then, since E [exp (2hjηρ(Zj))] = exp ('ρ(ρ(j) + 2hj) '' 'ρ(ρ(j))) (4.41) it can be easily proved that Z + '' '''' f (Zj|ρ(j) + hj) f (Zj |ρ(j)) 2 f (Zj |ρ(j))dZj =M'' 2 ηρ (hj )Mηρ (2hj ) , so yielding S(hk) = k Y j=1 M''2 ηρ (hj )Mηρ (2hj ) . Part II Fast Consensus in WSNs and its Application to Clock Synchronization Clock coupling is well represented by the relation between oscillators when consensus is run for achieving clock synchronization. In Part II of this thesis an innovative consensus algorithm is presented, as well as its application to the clock synchronization problem. 73 Chapter 5 Consensus and Clock

Synchronization Differently from the results presented in Part I, the second part of this the-

sis relies on synchronization of coupled oscillators. Coupling between clocks

of different wireless sensor nodes involves the implementation of a correct-

ing signal to the oscillators (acting on the clock phase and/or on the clock

frequency) which modifies their running behaviors. In fact, while in Part I

the goal is the estimation of clock offsets (or skews) between different nodes,

the algorithms proposed in this part instead actively modify the oscillation

frequency of the oscillators by physically applying a voltage signal to the os-

cillators through well-known suited structures, such as PLLs or DLLs. Since

the correcting signals are functions of neighbors clock-related readings, af-

ter the first iteration the clocks of the network will result coupled together,

requiring an analysis fundamentally different from that performed in Part

I. Consensus algorithms are well suited distributed procedures whose imple-

mentation for clock synchronization purposes cause coupling of the network

oscillators, as it will be clear next. This chapter describes the fundamentals of consensus algorithms, so that the relevant performance improvement permitted by the innovative consen-

sus algorithm presented in Chapter 6 can be fully appreciated. Of course,

the primary objective of the present thesis is reaching clock synchronization,

therefore the reader will not be surprised of seeing in Chapter 7 the appli-

cation to clock synchronization of the just mentioned innovative results on

consensus. 75 76 5. Consensus and Clock Synchronization 5.1 Distributed Consensus in WSNs The recent technology advancements in sensor equipment allowed the re-

search community to propose (and eventually implement on real hardware)

efficient techniques for distributed processing in WSNs [1]. In fact, sensors

are now equipped with on-board processors that allow for local pre-processing

of raw sensed data, thus reducing the need of data transfer to the hub nodes.

This concurs to make the network a robust and decentralized entity. Given local sensor readings, a common form of distributed processing consists of reaching an agreement on a common value. More specifically,

starting from sensed data at each node, the process of exchanging data-

related information in order to agree on a common representative value is

called distributed consensus. Applications of consensus can be found in a variety of scenarios in WSNs. Sensors gather data from the environment, so that each sensor node can get

information regarding environmental conditions such as temperature, pres-

sure and humidity, or about ongoing phenomena like target movements. Dis-

tributed averaging on sensed data is useful for reducing the uncertainty on

the measures. It also represents one of the most studied cases in which

distributed consensus can be successfully applied. In this case, since the

common agreement value is the average of the initial readings, the research

community refers to it as average consensus [58''61]. Consensus also finds

application in other areas, such as sensor fusion [62], just to cite one. Also,

clock synchronization can be recast as a double consensus problem (over both

clock offsets and skews) [9, 16''18], as it will be shown in Section 5.4 and in

Chapter 7. The classical implementation of consensus is through a linear update of the readings, as it will be explained later in Section 5.2. On the other hand,

alternative consensus approaches exist, like, for example, asynchronous gossip

algorithms [63, 64]. A method which uses similarities with fluid mechanics is

described in [61], but it is currently applicable only to highly structured net-

works. A further interesting alternative approach is to apply the alternating

direction multipliers method (ADMM) for reaching an agreement throughout

the network (see [65, p. 253] and [66''69]). The whole Chapter 6 is devoted

to the analysis of this method. As in most distributed algorithms, the main design issues are conver- gence speed and resilience to noise. For maximizing the first, in the context

of (linear) average consensus [70] provides an iterative method for computing

the best update weights at each sensor node. Throughout this thesis, such

method is denoted as the optimal Boyd''s solution. Although the compu- 5.2. Consensus System Model 77 tational burden for computing it prevents from an implementation on real

sensor nodes, this optimized solution is anyway taken into account in this

thesis, at least as a benchmark. Applying a polynomial filter to the net-

work matrix is the method chosen in [71] for increasing the convergence rate.

In the realm of gossip algorithms, techniques for increasing the convergence

speed can be found in [72]. However, the state-of-the-art of fast consensus for

convergence rate is [73], where the past values of the states are incorporated

in the algorithm in order to speed up the convergence. Unfortunately, this

comes at the cost of a bad noise resilience, which can potentially prevent

from recovery of the correct consensus value. It appears evident that the problem of distributed consensus has been well studied in the literature, both in continuous and in discrete time. In

this chapter, basics of discrete time linear consensus are introduced as well

as the framework for its application to the problem of clock synchronization. 5.2 Consensus System Model In a WSN, just couples of nodes that are in the transmission range of one

another are actually able to communicate reliably. In order to deal with these

network communication constraints, some tools borrowed from the graph

theory are used. Here it is assumed that the communication properties of

the network are described by an undirected 1 graph of N nodes, one for each sensor. An edge connects nodes i and j if they are in the transmission

range one of each other, or, in other words, if i can receive packets from j

reliably and vice-versa. The link existence information is summarized in the

adjacency matrix ', which has 1 in position (i, j) if node i can transmit

information to node j, 0 otherwise. In other words, [']i,j = ( 1 if node j can communicate to node j

0 otherwise . (5.1) The adjacency matrix of an undirected graph is symmetric by definition, i.e.

' = ' ' . In this thesis it is assumed that the underlying communication graph is strongly connected besides undirected. This means that there exists a path

connecting each node to every other node in the network, so that communi-

cation is possible among any pair of nodes, via multi-hop transmissions. An example of a graph and its adjacency matrix is provided in Figg. 5.1 and 5.2, respectively. 1In this thesis, communication links are assumed symmetric. 78 5. Consensus and Clock Synchronization 1 4 2 5 6 3 Figure 5.1: Example of an undirected and strongly connected graph. ' =

1 1 0 1 0 0

1 1 0 1 1 0

0 0 1 1 0 0

1 1 1 1 1 1

0 1 0 1 1 0

0 0 0 1 0 1

. Figure 5.2: Adjacency matrix relative to the network whose graph is repre-

sented in Fig. 5.1. In consensus theory, convergence to a common value is reached through the iterative exchange and processing of data coming from neighbors at each

sensor node. Mathematically speaking, let xi(0) = θi '' R be the initial reading 2 of sensor i '' [1, . . . , N]. Let also xi(t) be the estimated consensus value at time t. The estimated consensus value at time t + 1, xi(t + 1), is

linearly dependent on the values xj(t), j '' Ni, where Ni denotes the set of neighbors of node i. Let x(t) = [x1(t), . . . , xN (t)] ' be the column vector form of the estimated values at time t. The classical formulation of the

linear consensus problem relates the states at time t + 1 to those at time t

as follows [58, 63, 70, 74, 75] x (t + 1) = Sx(t), t ' 0, x (0) = θ , (5.2) where S '' R N 'N is a square matrix coherent with the communication con- straints of the graph, i.e. [S]i,j = 0 if i cannot communicate with j. 2The work can be easily extended to vectorial case. 5.3. Requirements for Consensus Matrix 79 5.3 Requirements for Consensus Matrix A lot of approaches assimilable to (5.2) are actually identified under the

name of consensus algorithms, but their common property is the convergence

toward a common value to all the agents in the network. What makes the

difference in terms of convergence speed and noise resilience is the choice of

update matrix S. Consensus is reached when vector x(t) has equal components, correspond- ing to a linear combination of the elements in the initial readings vector x(0),

i.e., lim t '''' x (t) = 1 '' ' x(0) , '' '' R N , (5.3) where 1 is the N ' 1 vector of all ones. By an iterative application of (5.2), it is true also that lim t '''' x (t) = lim t '''' Stx (0) , (5.4) therefore, from (5.3) and (5.4) it must be that lim t '''' St = 1'' ' , (5.5) which means that the limit (5.5) must exist and be a rank 1 matrix. The

following theorem states equivalent conditions for reaching consensus, based

on properties of matrix S. Theorem 5.1 Let λ1, . . . , λN be the eigenvalues of matrix S sorted in a

decreasing order. Necessary and sufficient conditions for reaching consensus,

i.e., for which (5.5) holds, are C1) S1 = 1 C2) '' ' S = '' ' , with k''k : '' ' 1 = 1 C3) ρ(S) = 1 C4) µ(λ1 = 1) = 1, where ρ(S) denotes the spectral radius of matrix S, and µ(λ) is the algebraic

multiplicity of the eigenvalue λ. Proof: See Appendix 5.A.1. 80 5. Consensus and Clock Synchronization 5.3.1 Average Consensus In this thesis, the focus is on linear average consensus, that is consensus on

the average of the initial readings θ = 1 N 1 ' θ = 1 N N X i=1 θi , which means that it must be '' ' x (0) = '' ' θ = θ . (5.6) The following corollary simply follows from (5.6) [76]. Corollary 5.2 Under the assumptions of Theorem 5.1, average consensus

is achieved if and only if C5) '' = 1 N 1. 5.3.2 Popular Average Consensus Matrices From the set of all matrices satisfying the properties C1)-C5) for achieving

average consensus, only four relevant matrices are chosen and described.

Three of them are relatively easy to implement since they require simple

local knowledge of the neighborhood at each node. On the other hand, the

last matrix is reported as a benchmark since it is the result of an optimization

process, not easy at all to implement in a distributed fashion. Consensus Laplacian The Laplacian matrix SLA of the graph [16,58,60] is the most popular matrix

for reaching average consensus, being easy to implement distributively. It is

constructed with the following procedure. First compute the diagonal matrix ' whose diagonal element gi in position i is the number of neighbors of node i, i.e. gi = P j 6=i[']i,j. The Laplacian consensus matrix is computed as SLA = I '' (' '' '), where I is the N ' N identity matrix. Consensus Metropolis The Metropolis-Hastings matrix SMH [70,77] associated to the graph requires

a two-hop neighbors knowledge of the network by each node. It constructed

with the following procedure. 5.4. Application of Consensus to Clock Synchronization 81 First consider the symmetric stochastic matrix Ss defined as follows [Ss]i,j = 1 max {gi,gj} if (i, j) is an edge of the graph and i 6= j 1 '' P j 6=i[Ss]i,j if i = j 0 otherwise . (5.7) The Metropolis-Hastings matrix is defined as SMH = I '' Ss. Averaged Consensus Matrix The averaged consensus matrix SAC [78] is as simple to implement as the

Laplacian matrix, since the required network knowledge is the same. It is

directly derived from the adjacency matrix ' of the graph. The procedure for constructing it is to set each element of a row of SAC equal to 1 over the number of non-zero elements of that specific row. Boyd''s Optimum Solution The three previous matrices are constructed via an empirical procedure. On

the other hand, Boyd et. al. in [70] provide an iterative procedure to find the

specific matrix Sboyd which optimizes the convergence rate of the algorithm

(5.2), through the minimization of the second eigenvalue in modulus. Its

implementation requires a high computational cost, therefore in this thesis

such matrix is used as a benchmark rather than being suggested for practical

implementation. 5.4 Application of Consensus to Clock Synchro-

nization Clock synchronization is a well suited application of distributed consensus

in which agreement is required both for clock offsets and for clock skews.

There is an extensive literature on consensus-based clock synchronization

algorithms. Simeone et. al. in [16] applied the average consensus for reaching

an agreement on clock frequencies only. On the other hand, in [17, 18] both

clock phases and clock frequencies are corrected, even if in [18] it is not

clear how frequency observations are obtained, since they are generally not

available at the nodes. An interesting algorithm was proposed in [59], where

link delays are carefully taken into account into the analysis. A formal statement of the problem of clock synchronization in consensus terms is now provided. 82 5. Consensus and Clock Synchronization 5.4.1 System Model By stacking the clock counter values (1.1) and the clock periods (1.2) in

the vectors T (t) , [T1(t), . . . , TN(t)] ' and Υ(t) , [Υ1(t), . . . , ΥN(t)]', respec- tively, the problem of clock synchronization is now recast as a double consen-

sus problem, both for agreement on counters and on periods. By assuming

initial conditions T (0) and Υ(0), for clock phases and periods, respectively,

the adopted reference model is [9, 16''18] T (t + 1) = T (t) + Υ(t) + u(t) , (5.8a) Υ(t + 1) = Υ(t) + v(t) (5.8b) where u(t) and v(t) are N ' 1 real-valued vectors, which represent the consensus-based control signals to be designed for reaching synchronization. 5.4.2 Relevant Works The update equations (5.8) are the reference model for a number of consensus-

based clock synchronization algorithms, which obviously differ in the expres-

sions of the correcting signals u(t) and v(t). The most relevant works are: a) Simeone et. al. in [16] proposed a clock synchronization algorithm in which only correction on the clock counters is performed, i.e., v(t) ' 0. A consensus algorithm is instead applied to clock phases, so that

u (t) = (S '' I) T (t), with S being a row-stochastic matrix compatible with the underlying graph. Weights in matrix S are chosen in order

to give more importance to more reliable (in terms of received power)

channels. b) On the other hand, Schenato et. al in [17] designed a consensus-based clock synchronization algorithm in which both consensus on clock phases

and frequencies are considered, so that the control signals are expressed

as u(t) = (S '' I) T (t) and v(t) = (S '' I) Υ(t). The implementation of such an algorithm is assured by the fact that sensors are assumed

able to measure the period ratios Υi(t)/Υj.(t). c) Another relevant clock synchronization protocol is the one proposed by Carli et. al. in [18]. Consensus both in clock phases and in clock

periods is applied, so that the control signal can still be expressed as

u (t) = (S '' I) T (t) and v(t) = α (S '' I) T (t), with α '' R. On the other hand, while the control u(t) can be straightforwardly applied, for

an application of v(t) the values of Υ(t) are needed, but they are not

directly measurable at the sensors. 5.4. Application of Consensus to Clock Synchronization 83 In Chapter 6 an innovative consensus algorithm is presented, which allows a convergence speed on the order of the state-of-the-art of fast consensus

and an outstanding noise resilience. This algorithm will be applied to clock

synchronization in Chapter 7. 84 5. Consensus and Clock Synchronization 5.A Appendix 5.A.1 Proof of Theorem 5.1 For the sufficiency, first note that C1) and C2) imply that a right and a left

eigenvector relative to eigenvalue λ1 = 1 are 1 and '', respectively. Let Q be

the non singular matrix such that it holds S = Q'P Q'' 1 'P = 1 S2 , where it is assured from C3) and C4) that S2 is a (N '' 1) ' (N '' 1) matrix with ρ(S2) < 1. It then holds that S '' 1'' ' = Q 0 S2 Q'' , so that it is ρ S '' 1'' ' < 1 . (5.9) Therefore, by following the steps in [76, Thm. 1] S t '' 1'' ' = St I '' 1'' ' = S t I '' 1'' ' t = h S I '' 1'' ' i t = S '' 1'' ' t , (5.10) where the first equality descends from C1) and the second from C2). By

using (5.9), result (5.10) implies (5.5). For proving necessity, it must be noted first [76] that the limit limt'''' S t exists if and only if there is a non singular matrix Q such that S = Q Ir S3 Q'' 1 , with Ir is the r ' r identity matrix (0 ' r ' N) and ρ(S3) < 1. This implies that ρ(S) ' 1. By denoting with 'i the columns of matrix Q and with 'i the rows of matrix Q'' 1, it is lim t '''' S t = lim t '''' Q Ir S t

3 Q'' 1 = Q Ir 0 Q'' 1 = r X i=1 'i' ' i . 5.A. Appendix 85 Since (5.5) holds, r must be equal to 1 and 'i' '

i = 1'' ' . This implies that C1) and C2) hold and that λ1 = 1 is a simple eigenvalue of matrix S, therefore

proving also C3) and C4). 86 5. Consensus and Clock Synchronization Chapter 6 Fast Consensus via ADMM The results presented in this chapter refer to a collaboration of Prof. Van-

gelista and me with Prof. Tomaso Erseghe 1 and Dr. Emiliano Dall''Anese2. They have been published in [78]. 6.1 Introduction As differently from the linear consensus presented in Chapter 5, in the present

chapter the problem of distributed agreement is faced from a rather different

perspective. The application of the ADMM leads in fact to an innovative

reformulation of the average consensus problem, as it will be clear next. The adopted approach here is a recasting of the average consensus prob- lem presented in Section 5.3.1 as an instance of convex optimization. In

fact, here the ADMM [65] is exploited to solve the optimization problem in

a distributed fashion. The ADMM is an iterative algorithm by which convex

optimization problems can be solved (for a comprehensive description of the

ADMM, the reader is redirected to [65, p. 253]). The average consensus problem over parameters θi can be recast as the following convex optimization problem, θ = argmin x N X i=1 (x '' θi) 2 . (6.1) The ADMM has found several applications to (6.1), such those proposed in

[66,67] (that here is called Method A) in the context of parameter estimation. 1T. Erseghe is with the Department of Information Engineering, University of Padova, via G. Gradenigo 6/b, 35131 Padova, Italy. 2E. Dall''Anese is with the Department of Electrical Engineering, University of Min- nesota, 200 Union St., MN 55455, Minneapolis, USA. 87 88 6. Fast Consensus via ADMM In network decoding and cognitive radio sensing, the ADMM has also been

proven useful [68, 69] (referred as Method B). Since in both methods the application of this interesting distributed al- gorithm produced satisfying results and it resulted to be robust to noise, the

present work is intended to perform a deep study of an ADMM-based form of

consensus in order to place it in the literature of fast consensus and to derive

simple analytical bounds for performance evaluation. A comparison with op-

timal and suboptimal consensus algorithms [59,63,70,73''75] is made in such

sense. Moreover, a generalization of the two most interesting and promising

reformulations of ADMM, proposed in [66] and [68] respectively, is done. It

is also shown how Method A and B differ in convergence speed and noise re-

silience. The forthcoming analysis can be considered complementary to that

performed in [66''69]. By relaxing the ADMM augmentation constants it is

possible to represent the resulting algorithm in matrix form, thus allowing

for an optimization study of the convergence and noise resilience properties

of the algorithm. In this chapter, it is shown that the proposed ADMM-based algorithm can be optimized for assuring: i) a convergence speed comparable to [73] which is the state-of-the-art of fast consensus ii) a significantly improved robustness to quantization/communication noise with respect to [73]; iii) a faster convergence speed and a stronger resilience to noise than that of the optimal Boyd''s solution; iv) a convergence performance comparable to [61], applicable to any net- work. If suboptimal parameters are selected, the performance is just slightly worse with respect to the optimum, once a unique parameter has been care-

fully chosen. On the other hand, the improved noise resilience makes the

ADMM-based consensus algorithm be the state-of-the-art in fast consensus. 6.2 ADMM-based Average Consensus This section, as well as the following, assumes the absence of noise due either

to quantization or to communication errors. Noisy measurements will be

dealt later in Section 6.4. 6.2. ADMM-based Average Consensus 89 Method A The reformulation of ADMM in [66] implies the introduction of the dummy

variables {zj} to facilitate the derivation of the algorithm. Such variables will be finally taken away when the optimization problem will be solved. By

considering all nodes being ''bridge' nodes, the formulation of Method A can

therefore be stated as follows θ = argmin {xi},{zj} N X i=1 (xi '' θi) 2 subject to xi = zj, ''i and j '' Ni (6.2) which is conceptually equivalent to (6.1). Since the network is assumed

strongly connected, the set of equality constraints xi = zj assures that the

estimate values {xi} N i=1 will finally coincide across the whole network. In Appendix 6.A.1 it is shown that the following notation can be used to express the ADMM Method A of (6.2): ai,j = ci,j P '' ''Nj c'',j , bi,j = ci,j 1 + P '' ''Ni ci,'' , di = X j ''Ni bi,j (6.3) where ci,j > 0 are augmentation constants, and where the condition P i ai,j = 1 holds. For t ' 1, the update equations can be expressed as xi(t + 1) = (1 '' di)θi + di xi(t) + λi(t) + 2ui(t) λi(t + 1) = λi(t) + ui(t) (6.4) with initial conditions xi(1) = (1 '' di) θi , λi(1) = 0 (6.5) and with ui(t) = X j ''Ni bi,j X k ''Nj ak,jxk(t) '' dixi(t) (6.6) which needs to be evaluated in a distributed fashion by means of two message

exchanges: the first for the summation in ak,j, the second for the summation

in bi,j. Method B The second reformulation of the problem (6.1) descends from [68], which

introduces the auxiliary variables {zi,j}, allowing to recast the optimization 90 6. Fast Consensus via ADMM problem as follows θ = argmin {xi},{zi,j} N X i=1 (xi '' θi) 2 subject to xi = zi,j and to zi,j = zj,i, ''i and j '' Ni . (6.7) Similarly to the constraints present in (6.2), since the network is assumed

strongly connected, again the set of equality constraints in (6.7) assures that

the estimate values {xi} N i=1 will finally coincide across the whole network. In Appendix 6.A.1 it is shown how the ADMM implementation of (6.7) can be expressed by (6.4) and (6.5), where (6.6) is replaced by ui(t) = X j ''Ni ei,jxj(t) '' X j ''Ni ei,j xi(t) , ei,j = (1 '' di) ci,jcj,i ci,j + cj,i . (6.8) As opposite to Method A, the application of (6.7), requires just one message

exchange. Remark Compared to the literature, in the present work the augmentation constants

are relaxed, in the sense that in [68] a unique coefficient is used, i.e., ci,j = c,

while in [66] the augmentation constant is dependent on the specific node,

i.e. ci,j = cj. 6.3 ADMM-based Consensus in Vector Form In order to facilitate the convergence analysis and the study of noise re-

silience of ADMM, in this Section a matrix formulation of both Method A

and Method B is proposed. Defining I as the N ' N identity matrix, the 6.3. ADMM-based Consensus in Vector Form 91 following notation is introduced: C = [ci,j]ij , cij > 0 ' C = ( [ ci,jcj,i ci,j+cj,i ]ij for j '' Ni 0 for j 6'' Ni ''1 = diag(C'1)

''2 = diag(C1) A = [aj,i]ij = '''' 1 1 C ' B = [bi,j]ij = (I + ''2)'' 1C E = [ei,j]ij = (I + ''2)'' 1 ' C D = diag(di) = (I + ''2)'' 1''2 , (6.9) where in the definition of A the interchange of i and j is intended; moreover,

it must be noted that matrix ' C is symmetric by construction. With these definitions, U = ( BA '' D , Method A E '' diag(E1) , Method B (6.10) summarizes the multiplications in (6.6) (Method A) and in (6.8) (Method

B). Since A is by construction a row stochastic matrix, i.e., A1 = 1, it is true that D = diag(BA1). This property allows to state that matrix U has

the same structure in both methods, and it also satisfies U 1 = 0. In Appendix 6.A.2 it is shown how the linear systems described by (6.4), (6.6), and (6.8) can be represented by the following linear update equation x (t + 1) = (I + D + 2U )x(t) '' (D + U)x(t '' 1) (6.11) or, in a more compact and convenient (for performance analysis purposes)

form s (t) = M s(t '' 1) , M = 2F '' G G '' F I 0 , s (t) = x(t + 1) x (t) , (6.12) with F = U + I and G = I '' D. The update rule (6.12) is valid for t > 0 with the initial state s (0) = Gθ 0 . (6.13) 92 6. Fast Consensus via ADMM The matrix framework of (6.12) corresponds to the standard form of consen-

sus problems (5.2), even if matrix M is not in standard row-stochastic form,

as requested from properties C1)-C4) in Theorem 5.1 for reaching consensus. On the other hand, matrix M has other important properties that allows to reach consensus anyway, in fact (see Appendix 6.A.2): P1) Defined the eigenvalues of matrix M as λi, it is true that λ1 = 1 with single multiplicity, while all the other eigenvalues satisfy |λi| < 1. P2) The left and right eigenvector of matrix M corresponding to eigenvalue 1 are, respectively, '''1 = 1 N h (1 + C1)', ''(C1)' i , r1 = 1 , where ''' 1r1 = 1 holds. Such properties lead to some consequences on the limit power behavior of matrix M , which is described by (see [79]) M '' = lim t '''' M t = r1''' 1 . (6.14) Since it is true that ''' 1s(0) = 1 N 1 'θ = θ , (6.15) convergence of the series {s(t)} to the average consensus is assured, that is s ( '') = lim t '''' s (t) = M ''s(0) = 1θ . Moreover, since the eigenvalue 1 has single multiplicity (see P1)), the t-th power of M can be splitted (for t > 0) in the sum of its limit power (6.14)

and a t-dependent matrix, i.e., M t ' M t + M '' , ' M = M '' M '' , so that the state evolution is described, for t > 0, by s (t) = ' M t s (0) + 1θ . (6.16) Matrix ' M results to have the same spectral structure of M , with the substantial difference that the eigenvalue λ1 = 1 is mapped to eigenvalue ' λ1 = 0, so that the whole set of eigenvalues 'λi of ' M satisfies |'λi| < 1. 6.3. ADMM-based Consensus in Vector Form 93 6.3.1 Assumption on the augmentation constants Properties P1) and P2) are enough to state that matrix M is appropriate for

consensus; on the other hand, they do not say anything about the convergence

speed and the noise resilience of the update equation (6.12); such properties,

in fact, are evinced by the analysis of the Jordan structure of M . In order to

evaluate these fundamental properties of the algorithm, matrix C is chosen

such that D = dI, or equivalently G = γI, with γ = 1 '' d. This implies the following form for the matrix C: C = ǫS , S 1 = 1 , ǫ > 0 , (6.17) which means that C has to be chosen equal to a row-stochastic matrix S

(that can be considered its shape) times an amplitude factor ǫ. Given that (6.17) holds, it is true that d = ǫ 1 + ǫ , γ = 1 1 + ǫ , therefore matrix F assumes the following expression F = 2d(Φ '' I) + I , Φ = ( 1 2 Sdiag(S '1)''1S' + 1 2 I , Method A 1

2 ' S '' 1

2 diag( ' S 1) + I , Method B . (6.18) It naturally follows that, for ǫ = 1, F = Φ and ' C = ' S . The choice (6.17) brings some consequences in the Jordan structure of matrix M . The proofs of the theorems can be found in Appendix 6.A.2. Theorem 6.1 Under condition (6.17) matrix F is positive definite, i.e., it

is unitarily diagonalizable with positive real valued eigenvalues fi. The eigen-

values of F satisfy f1 = 1 , 0 < γ ' fi < 1, i 6= 1 . (6.19) As a consequence, matrix M is unitarily similar to a block diagonal matrix,

M = Σ diag(Bi) Σ'' where Σ'' = Σ'' 1, with 2 ' 2 blocks Bi = 2fi '' γ ''(fi '' γ) 1 0 . (6.20) 2 94 6. Fast Consensus via ADMM 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 'i λ 2 i'' 1 ,2 i ǫ=0 ǫ=1 ǫ=0 (a) 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 'i |λ 2 i'' 1 ,2 i| ǫ= '' ǫ=1 (b) Figure 6.1: Eigenvalues λi of matrix M as functions of 'i for ǫ ' 1 (a) and ǫ ' 1 (b). Solid lines for odd i and dashed lines for even i. 6.3. ADMM-based Consensus in Vector Form 95 The eigenvalues of F can be related to those of matrix Φ and to the amplitude factor ǫ. Sorted eigenvalues '1 = 1 ' '2 ' . . . ' 'N are assumed for Φ; moreover, because of (6.19), it follows that 1 ' 'i ' 1

2 , and therefore that fi = 2d'i + 1 '' 2d . From Theorem 6.1, the eigenvalues of matrix M are the solutions of the characteristic polynomials of matrices Bi, that is of the equations det(λI '' Bi) = 0, therefore they are of the form λ2i''1,2i = (2fi '' γ) ± p(2fi '' γ)2 '' 4(fi '' γ) 2 . (6.21) In Fig. 6.1 the location of the eigenvalues (6.21) is represented, as a function

of 'i. It can be shown that, for i = 1, (6.21) provides λ1 = 1 and λ2 = d.

For i > 1 different cases need to be considered. More specifically, if ǫ < 1,

it follows that λ2i''1 6= λ2i both being real valued. On the other hand, when

ǫ ' 1, if fi = flow or fi = fhigh, where flow = γ + 1 '' '' 1 '' 2γ 2 , fhigh = γ + 1 + '' 1 '' 2γ 2 , then (6.21) provides two real valued coinciding eigenvalues λ2i''1 = λ2i =

fi '' 1

2 γ. If flow < fi < fhigh, λ2i''1 and λ2i constitute a complex conjugate couple, i.e., λ2i''1 = λ''2i. In the last case, for fi < flow and fi > fhigh, λ2i''1

and λ2i are distinct and real valued. In all cases, the following relations hold λ2i''1λ2i = fi '' γ , λ2i''1 + λ2i = 2fi '' γ , (1 '' λ2i''1)(1 '' λ2i) = 1 '' fi . (6.22) The just described considerations imply the following properties on the Jor-

dan structure of M : P3) If ǫ < 1, then M is diagonalizable. P4) If ǫ ' 1, then M is diagonalizable if and only if fi 6= flow and fi 6= fhigh, for all i. If not, the maximum size of the Jordan blocks is Q = 2 (and

in this case Bi results non diagonalizable). This aforementioned properties allow a complete understanding of the powers

of M and ' M , as it is summarized by the following theorem. Theorem 6.2 Under condition (6.17), the powers of M and ' M are uni- tarily similar to a block diagonal matrix, namely M t = Σ diag(Bt i) Σ '' and ' M t = Σ diag( ' B t

i) Σ '' with Σ'' = Σ''1, where: 96 6. Fast Consensus via ADMM ' The first block has the form B t

1 = 1 1 '' d 1 '' dt+1 ''d(1 '' d t) 1 '' d t ''d(1 '' d t ''1) , ' B t

1 = dt 1 '' d ''d d ''1 1 . (6.23) ' For ǫ < 1 all remaining blocks have the form B t

1 = ' B t

i = λt+1 2i ''1 ''λ t+1

2i ''λ2i''1λ2i(λ t 2i ''1 ''λ t 2i) λt 2i ''1 ''λ t 2i ''λ2i''1λ2i(λ t ''1 2i ''1 ''λ t ''1 2i ) λ2i''1''λ2i (6.24) with roots taken from (6.21). Alternatively, (6.24) can be rewritten in

the form (sometimes more meaningful) B t i = ' B t

i = " P t

k=0 λ t ''k 2i ''1λ k 2i '' P t ''1 k=0 λ t ''k 2i ''1λ k+1

2i P t ''1 k=0 λ t ''1''k 2i ''1 λ k 2i '' P t ''2 k=0 λ t ''1''k 2i ''1 λ k+1

2i # (6.25) which is valid for t ' 2, but also for t = 1 when the bottom right element is set to 0. ' For ǫ ' 1 all the blocks satisfying fi 6= flow and fi 6= fhigh have the form (6.24) (or (6.25)), while the blocks that satisfy either fi = flow or

fi = fhigh have the form B t

i = ' B t

i = (1 + t)λt 2i ''tλ t+1

2i tλt''1 2i ''(t '' 1)λ t 2i , (6.26) where λ2i = fi '' 1

2 γ. All blocks B t i and ' B t

i are real valued. 2 In this section, it has been shown that with the assumption (6.17) a full spectral analysis of matrix M can be performed. For the rest of the paper,

(6.17) it is assumed to hold. 6.4 ADMM Performance Analysis A performance evaluation of the model (6.12) is now performed, in terms

of convergence speed and, introducing the proper noise term, of the noise

resilience. 6.4. ADMM Performance Analysis 97 6.4.1 Noiseless communications In case of noiseless communications, (6.16) holds with no noise terms added.

It is well known that the convergence speed of the algorithm is regulated by ρ = sr( ' M ) = max i |'λi| = max i 6=1 |λi| , (6.27) which is the spectral radius, or the largest eigenvalue modulus of a matrix.

It is possible to derive a bound based on ρ, that is generally fairly loose (if

identified by exploiting [80, Fact 3], or similar results) but indicative of the

convergence speed of the consensus algorithm. In Appendix 6.A.3, the reader can find the proof of the following theorem: Theorem 6.3 When (6.17) applies and in the absence of noise, the con-

vergence of ADMM to its limit value is regulated, for t > ǫ, by the norm-2

bound kx(t) '' x('')k2 ' γ kθk2 t ρ t ''1 , (6.28) where x( '') = 1θ. 2 6.4.2 Encompassing for communication noise Noise coming from wireless communications among nodes can be kept into

account in the model (6.12) by adding a noise term n(t), as follows s (t) = M s(t '' 1) + n(t) (6.29) where the vector n(t) is as follows n (t) = w(t) 0 , (6.30) with w(t) collecting the N noise samples at each sensor at time t. Noise n(t) accounts for noise that affects the exchanged readings. Possible sources of noise in the readings {xi(t)} encompass the quantization errors, which are unavoidable for transmitting values in finite precision, and the

thermal noise produced by the receivers. The noise resilience of the algorithm can be studied through the subse- quent application of (6.29), i.e., s (t) = ' M t s (0) + t ''1 X k=1 ' M k n (t '' k) + n(t) + 1 θ + 1 t ''1 X k=1 '''1n(t '' k) , (6.31) 98 6. Fast Consensus via ADMM which gives a representation of s(t) in terms of s(0) and the noise terms.

Because of such corrupting terms, it can happen that consensus on the av-

erage value θ cannot be achieved, therefore a study of the noise resilience

properties of the algorithm is mandatory. It can be observed from (6.31) that the last term on the right hand side of the equation is indeed a random walk, whose variance increases with time

if the condition '''1 n(k) = 1 + ǫ N 1'w(k) = 0 (6.32) is not verified. To overcome this issue (which in general prevents from reach-

ing average consensus even if the random walk is has zero mean), in [68] the

authors only consider noise terms that are orthogonal to ''1. Such impair-

ment can be overcome also by proper changes in the consensus algorithm,

like in [81]. Before proceeding to the performance analysis, it is necessary to design a communication protocol for the algorithm (6.29), so that the noise model

can be subsequently characterized. Given that node i has knowledge of its

first-hop neighborhood Ni only, it is assumed, without losing generality, that the only source of noise is the quantization error in the exchanged data. The

upcoming analysis can be easily extended if also thermal noise is considered. Method A By recalling what done in [66], the description of the communication protocol

follows. 1. Node i sends to its first-hop neighbors a message containing the differ- ence mi,1(t) = 2xi(t) '' xi(t '' 1) . As it has already been said, quantization error is present, which is mod-

eled by an additive noise qi,1(t) with zero mean and common variance

'2 q . Therefore, the noise-corrupted quantity received by all neighbors of node i is mi,1q(t) = 2xi(t) '' xi(t '' 1) + qi,1(t) . By stacking all the mi,1''s together in a vector, one obtains m1(t) = 2x(t) '' x(t '' 1) , and the corresponding noisy version is equal to m1q(t) = m1(t)+q1(t). 6.4. ADMM Performance Analysis 99 2. The second step requires nodes to transmit an elaboration of m1(t), which is given by m2(t) = Am1q(t) , A = diag(S'1)'' 1S' , and its noisy version is equal to m2q(t) = m2(t) + q2(t), with q2(t)

having the same statistics of q1(t) while being statistically independent.

The message is then broadcast to the neighbors. 3. The final step involves the update of the estimate values x(t) as follows x (t + 1) = (1 '' d)x(t) + Bm2q(t) '' d h q1(t) + diag(S'1)q2(t) i , (6.33) with B = dS, and where the last term has been designed to satisfy

(6.32). Moreover, the update step (6.33) can be actually implemented

since qi,1(t) and qi,2(t) are known quantities at node i. The algorithm (6.33) is then equivalent to (6.29) and (6.30), with w (t) = (F '' I)q1(t) + d(S '' diag(S'1))q2(t) , (6.34) which satisfies (6.32) since 1'w(t) = 0, as a consequence of the double sto-

casticity of matrix F (see Appendix 6.A.2). It is easy to prove that noise

w (t) is zero mean, while it has correlation matrix equal to Rw = E [w(t)w''(t)] = h (F ''I) 2 + d2(S''diag(S'1))(S'''diag(S'1)) i '2 q . Method B In Method B just one message exchange is required, therefore the whole

communication protocol is simpler than in the previous case. 1. As for Method A. 2. Once the {mi,1q} have been received, the update step is implemented as follows x (t + 1) = (1 + d)x(t) '' dx(t '' 1) + (F '' I)m1q(t) . (6.35) From (6.35) it can be inferred that noise w(t) satisfies (6.32), since w (t) = (F '' I)q1(t) . (6.36) Also in this case, noise has zero mean but correlation matrix Rw = (F '' I) 2'2 q . 100 6. Fast Consensus via ADMM 6.4.3 Noise resilience characterization Noise resilience of (6.29) is analysed through the evaluation of the distance

of the state vector x(t) from its average 1x(t) at time t, i.e., 'x(t) = x(t) '' 1x(t) = Kx(t) , K = I '' 1 N 11 ' where x(t) = 1 N 1 'x(t) . It is straightforward to note that the two signals, 'x(t) and 1x(t) are orthog-

onal by construction. The aforementioned decomposition has already been

applied to similar contexts (see [64, 75]). Average component By defining the average quantities s (t) = x(t+1) x(t) , s(0) = γ θ 0 . the following relation holds s (t) = B1s(t '' 1) , B1 = 1 + d ''d 1 0 . (6.37) It needs to be noticed that the update matrix that regulates the dynamics of (6.37) is actually the block (6.20) with i = 1, with eigenvalues λ1 = 1 and

λ2 = d. A recursive application of (6.37) brings to s (t) = B t 1s(0) from which an expression for the average signal can be obtained as x(t) = (1 '' d t)θ . (6.38) The above guarantees the convergence of the average signal x(t) to θ with a

rate that depends on the parameter d. Distance from the average The signal representing the distance from the average has the following dy-

namics 's(t) = ' M 's(t '' 1) + 'n(t) , ' M = K 0 0 K M (6.39) 6.4. ADMM Performance Analysis 101 where 's(t) = 'x(t+1) 'x(t) , 's(0) = γ 'θ 0 , 'n(t) = w(t) 0 with 'θ = Kθ and with noise w(t) satisfying the property w(t) = Kw(t).

Matrix ' M is known, and it has a similar eigenstructure to matrix M with the two eigenvalues λ1 = 1 and λ2 = d mapped to zero, therefore the Jordan

structure is the same ( ' Bi = Bi for i 6= 1) except for ' B1 which is the 2 ' 2 zero matrix. As a consequence, the powers of ' M are fully described recalling (6.24) and (6.26). A recursive application of (6.39) gives 's(t) = ' M t 's(0) + t ''1 X k=0 ' M k 'n(t '' k) . whose convergence speed is regulated by ξ = sr( ' M ) , (6.40) which satisfies the condition ξ ' ρ = max(ξ, d) < 1. The evolution of the signal expressing the distance from the average can also be written as 'x(t) = γQHtQ'' 'θ + 'η(t) (6.41) where Q is the unitary matrix diagonalizing F (F = Qdiag(fi)Q'') and

where Ht = diag(hi,t) contains the bottom left elements of ' B t

i, which are hi,t = 0 , i = 1 λt 2i''1 ''λ t 2i λ2i''1''λ2i , i 6= 1, λ2i''1 6= λ2i tλt''1 2i , i 6= 1, λ2i''1 = λ2i . (6.42) Moreover, 'η(t) = t ''1 X k=0 QHkQ''w(t '' k) (6.43) is the overall noise term. Convergence to zero of the elements in (6.42) implies the convergence in mean to zero of 'x(t), in fact lim t '''' E [ 'x(t)] = lim t '''' γQHtQ'' 'θ = 0 . On the other hand, the convergence speed properties of (6.41) have to be

analysed through the mean squared error, ' MSEt = E [ k 'x(t)k 2 2]. In Ap- pendix 6.A.3 the following theorem on a bound on ' MSEt is proved: 102 6. Fast Consensus via ADMM Theorem 6.4 When (6.17) applies, the MSE ' MSEt = E k 'x(t)k 2 2 = m2 t + ' 2 t , mt = kE [ 'x(t)] k2 = kγQHtQ'' 'θk2 , '2t = E k 'η(t)k 2 2 for t > ǫ satisfies mt ' γ t ξ t ''1k 'θk2 '2t ' K ' 2 q , K = N X i=2 Ki Ui (6.44) where Ki = (1 + fi '' γ) (1 '' fi)(1 + 3fi '' 2γ)(1 '' fi + γ) Ui = (fi ''1) 2 +d2sr((S''diag(S'1))(S'''diag(S'1))) , Method A (fi '' 1) 2 + 4d2 , Method A, S = S' (fi '' 1) 2 + d( '' fi '' γ '' '' d)2 , Method A, S = S' 0 (fi '' 1) 2 , Method B . (6.45) 2 A few remarks on (6.44) are needed. The reader can notice how the convergence speed is regulated by ξ, while the noise level is dependent on K.

Moreover, the resulting bound on the noise variance at time t is not dependent

on matrix eigenvectors (such as in [66,68]) and, as it will be shown in Section

6.6, it is also very strict, especially when in Method A matrix S has a specific

structure. 6.4.4 Initial states choice The condition (6.15), identifies a vectorial subspace of dimension 2N ''1 from which the initial condition s(0) can be chosen, therefore there is no special

constraint to use (6.13), while the noise bound (6.44) is not dependent on the

choice of s(0) and the convergence results are not fundamentally affected. As

an example, the choice s (0) = θ θ , (6.46) 6.5. Optimization Issues 103 is another valid initial state. What makes this specific choice interesting is

that the average signal is equal to the average of the initial readings, at each

time instant t, i.e., x(t) = θ . The convergence bounds (6.28) and (6.44) are just slightly modified, the

product γt being substituted by 2t '' 1. 6.5 Optimization Issues Given the results stated in the previous sections, now the focus is on their

analysis in order to optimize the algorithm (6.29) in terms of convergence

speed or noise resilience. The parameters to be tuned for optimization pur-

poses are the shape matrix S and the amplitude factor ǫ. 6.5.1 Convergence speed The aim of this section is to choose S and ǫ in order to maximize the con-

vergence speed. To begin with, S is assumed given and the aim is to choose

ǫ such that the convergence speed is maximized. If s(0) is given by (6.46), the convergence speed is described by ξ (6.40), and its minimization can be

performed by inspection of Fig. 6.1 (see Appendix 6.A.4); it turns out that ǫopt = 1 '' 1 ''(4'2''3)2 , '2 > 3

4 1 , '2 = 3

4 '2 '' 1

2 ('2 '''N )2+('2+'N ''1) '' ('2 '''N )2+4('2'' 1

2 )(1'''N ) , '2 < 3

4 (6.47) for having ξopt = 1+ '' 2dopt ''1 2 , '2 > 3

4 1

2 , '2 = 3

4 pdopt(2'2 '' 1) , '2 < 3 4 , dopt = ǫopt 1 + ǫopt . (6.48) In Fig. 6.2 such results can be observed, with 'N = 1

2 , as it is in common scenarios. If, instead, the initial state is chosen to be equal to (6.13), the convergence speed depends on both d and ξ (see, respectively, (6.38) and (6.44)), therefore

ρ = max(ξ, d) in (6.27) is the key parameter to minimize. In Appendix 6.A.4 104 6. Fast Consensus via ADMM 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 '2 dopt ξopt Figure 6.2: Convergence speed maximization choices, with initial state (6.46)

and 'N = 1

2 . it is shown that, if '2 ' 3

4 results (6.47) and (6.48) keep holding. On the other hand, if '2 < 3

4 , ǫopt = 1 is the best choice, with ξopt = 1

2 . As a general remark, the case '2 < 3

4 is a rare case in common networks, therefore generally the convergence properties are regulated by the same equations,

(6.47) and (6.48). If the aim is jointly optimizing S and ǫ and S is a generic N ' N matrix, the optimization problem is still an open issue. On the other hand, the best

choice for matrix S already exists if only symmetric matrices are considered,

and it is S = Sboyd with Sboyd the optimum Boyd solution (see Section 5.3.2).

In this case Φ = ( 1 2 S 2 boyd + 1

2 I , Method A 1

4 Sboyd + 3

4 I , Method B therefore '2 depends on the second largest eigenvalue modulus of Sboyd de-

noted as ρboyd, i.e., '2 = ( 1 2 (1 + ρ 2 boyd) , Method A 1

4 (3 + ρboyd) , Method B . (6.49) Fig. 6.3 shows the value of ξ obtained from substituting (6.49) in (6.47) and (6.48), therefore such parameter is there plotted as a function of ρboyd.

For Method A, '' ξ is instead plotted since, differently from the other al- 6.5. Optimization Issues 105 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 ρboyd pξA,opt ξB,opt ρboyd ξoreshkin,opt Figure 6.3: Spectral radius of ADMM and of other state-of-the-art consensus

algorithm'' Boyd''s optimal matrix and Oreshkin''s method. gorithms, the data dissemination involves two message exchanges. It is in-

teresting to notice that, for large networks (ρboyd > 1 '' 2 ), which is a com- mon scenario, ADMM provides a better convergence speed with respect to

the optimum Boyd solution. In Fig. 6.3, the spectral radius for the con-

sensus algorithm proposed by Oreshkin et al. [73] is also plotted; it is the

current state-of-the-art for fast consensus, and its performance outperforms

ADMM''s. To further study the convergence speed, for large networks it results that ρboyd is very close to 1 so that it can be written as ρboyd = 1 '' Ψ(N), with small Ψ(N), and, by asymptotic Taylor series expansion it turns out that '' 1

2 log ξopt,A = '' log ξopt,B = p2Ψ(N) 2 + O(Ψ(N)) (6.50) so that the speed gap with respect to [73] is only of a factor of 2 (see [73,

eq. (15)]). For ρboyd, it holds that '' log ρboyd = Ψ(N) + O(Ψ(N) 2), there- fore the speed gain of ADMM over the Boyd''s solution is substantial and

comparable to other state-of-the-art fast consensus algorithms [61]. In order to realize which of the two parameters, S or ǫ, is more impor- tant for maximization of convergence speed, in Fig. 6.4 the speed measure

'' 1

2 log ξA is shown when the underlying network is a mesh grid consisting of N = L2 nodes, in which each node is connected to the four closest neighbors

in the grid (right, left, below, above). Discontinuities in the results coming 106 6. Fast Consensus via ADMM 0 50 100 150 0 0.025 0.05 0.075 0.1 S =Sboyd, ǫ=ǫopt S =SAC, ǫ=ǫopt S =Sboyd, ǫ=5 S =SAC, ǫ=5 L '' 1 2 lo g ξ A Figure 6.4: Convergence speed '' 1

2 log ξA dependent on ǫ and S when the underlying network graph is a mesh grid where each node has 4 neighbors. from boundary conditions are overcome by assuming that the grid lies on a

torus. It appears evident how ǫ is the most important parameter for opti-

mizing the convergence speed, since just sub-optimal performance is reached

if S is chosen to be the averaged consensus matrix SAC (see Section 5.3.2),

with non zero entries equal to 1 5 . Similar results apply to Method A and, in general, to less structured networks. 6.5.2 Resilience to noise For ADMM-based consensus, the key parameter for describing the noise re-

silience properties is K, as it is stated in (6.44). From (6.45) it is evident

that Method B has a better resilience to noise with respect to Method A,

therefore only Method B will be considered in the forthcoming analysis for

comparison with the state-of-the-art of standard and fast consensus algo-

rithms. Fig. 6.5 shows the behaviour of KiUi as a function of d for different

values of 'i, where KiUi = d(1 '' 'i) [1 + d(2'i '' 1)] [1 '' d(2'i '' 1)] [1 + d(3'i '' 2)] (Method B) . (6.51) A first observation is that all curves stay below 1, which means that the

overall noise is smaller than the quantization noise in all cases, i.e., '2t '

(N '' 1)' 2 q . Moreover, since KiUi tends to 0 as d decreases (i.e., as ǫ goes to 6.5. Optimization Issues 107 0 0.2 0.4 0.6 0.8 1 10 ''3 10 ''2 10 ''1 10 0 d KiUi 'i = 0.5 'i '' 1 Figure 6.5: Method B: Noise values KiUi as functions of d for different values

of 'i. 0), then the noise can be controlled at a desired level, by carefully choosing

the design parameter ǫ. Making the comparison with standard consensus, it can be seen that KiUi = 1 '' ρi 1 + ρi (Standard consensus) (6.52) with ''1 < ρi < 1 being the real valued eigenvalues of the (symmetric) consensus matrix. Getting KiUi < 1 is assured by having ρi > 0, which

is not guaranteed for all i, even using Boyd''s optimal matrix. This means

that the expected level of noise in standard consensus is higher than that in

ADMM Method B. A similar conclusion can be drawn for the solution proposed by Oreshkin; in fact, with (θ1, θ2, θ3) = ( ''δ, 0, 1 + δ) (see [73, eq. (13)]), in Appendix 6.A.4 by similar arguments it is shown that KiUi = (1 + αδ)(1 '' ρi) (1 '' αδ)(1 + ρi) (Oreshkin et al., [73]) (6.53) where ''1 < ρi < 1 are again the real valued eigenvalues of the (symmetric) consensus matrix used, and where αδ ' 0 is carefully designed for the fastest convergence. It is straightforward to see that (6.52) is always less than (6.53),

therefore the algorithm proposed by Oreshkin is expected to provide worse 108 6. Fast Consensus via ADMM noise performance with respect to standard consensus. This is also supported

by the fact that an optimal choice of αδ asymptotically provides αδ = 1 '' 2 p2Ψ(N) + O(Ψ(N)) . 6.5.3 Results for highly structured networks In highly structured networks, performance can be further analysed. In fact,

in the following sections line graphs and mesh grids will be characterized in

terms of convergence speed and noise resilience, when ADMM-based consen-

sus is used. Some closed form solutions will be provided. Line graph In the line graph, every node is connected to two nodes only, the preceding

and the following one. Boundary conditions are avoided by connecting the

first and the last node in the chain. In this case, matrix Sboyd is circulant,

characterized by the first row of the form [1 '' 2a, a, 0, . . . , 0, a], with a = ( 1 3 ''cos(2'/N) , even N 1 2+cos('/N ) ''cos(2'/N) , odd N and with eigenvalues having the form λn,boyd = 1 + 2a(cos(2'n/N) '' 1), n = 0, . . . , N '' 1. It is also ρboyd = 1 + 2a(cos(2'/N) '' 1). Mesh grid The mesh grid is a graph in which each node is connected to the four closest

nodes (right, left, below, above). Boundary conditions are avoided through

the assumption that the grid lies on a torus. Matrix Sboyd is block circu-

lant and the first block-row is given by [P , aI, 0, . . . , 0, aI], with P being a

circulant matrix whose first row equal to [1 '' 4a, a, 0, . . . , 0, a], and a = ( 1 5 ''cos(2'/L) , even L 1 3+2 cos('/L) ''cos(2'/L) , odd L . Eigenvalues are given by the expressions λij,boyd = 1 '' 4a + 2a cos(2'i/L) + 2a cos(2'j/L), i, j = 0, . . . , L '' 1, with ρboyd = 1 + 2a(cos(2'/L) '' 1). Table 6.1 summarizes the aforementioned results, according to (6.50) for high values of N, while in Fig. 6.6 the noise resilience results are shown,

with an evident performance improvement through the use of ADMM with

respect to standard and fast consensus approaches. 6.6. Numerical Results 109 0 200 400 10 ''1 10 0 10 1 10 2 10 3 10 4 Line graph 0 200 400 10 ''1 10 0 10 1 10 2 10 3 10 4 Mesh grid Oreshkin Boyd ADMM, Method B K/N N L Oreshkin Boyd ADMM, Method B Figure 6.6: Noise resilience in line graph (left) and mesh grid (right). Line graph Mesh grid '' log ρboyd ' 2'2 N2 ' '2 N '' 1

2 log ξA,opt ' '' log ξB,opt ' ' N ' ' '' 2N '' log ξoreshkin,opt ' ' 2' N ' '' 2' '' N Table 6.1: Convergence speed results for line graph and mesh grid. 6.6 Numerical Results Performance of ADMM are here tested by assuming a network of N nodes

whose communication properties are described by a random geometric graph

(RGG). More specifically, nodes are disposed in a disc of area 1 m2 (radius

R = 1/ '' '), and the coverage radius 3 ' is chosen such that the average number of neighbors is Nneigh, to have 4 ' = pNneigh/('N). 3The coverage area of a node is assumed circular, with radius '.

4Under the assumption of uniform spatial distribution for the nodes, and by neglecting boundary conditions, the probability that any two nodes are in the communication range 110 6. Fast Consensus via ADMM 0 100 200 300 400 500 10 ''7 10 ''6 10 ''5 10 ''4 10 ''3 Method A ADMM''MH

ADMM''Boyd

Bound

True MSE 0 100 200 300 400 500 10 ''7 10 ''6 10 ''5 10 ''4 10 ''3 Method B t t MSEt N MSEt N Figure 6.7: MSE with a RGG of N = 100 nodes, Nneigh = 8, and '2q = 10'' 6. Comparison between Method A and Method B In Fig. 6.7 the MSE when using ADMM Method A and B is shown as a

function of t, with N = 100, Nneigh = 8 and '2q = 10'' 6. For statistical confidence, the MSE is averaged over 1000 realizations of the noise. The used shape matrices are the Boyd''s optimal solution Sboyd, and the MH matrix SMH (see Section 5.3.2). The amplitude parameter ǫ is optimally

chosen for the maximization of the convergence speed, according to (6.47). Method B clearly outperforms Method A in terms of noise resilience, confirming the theoretical results of Section 6.4.3. On the other hand, the two

methods are pretty much equivalent in terms of convergence speed. Moreover,

the choice of a suboptimal shape matrix leads to just limited loss, as it was

already been noticed in Fig. 6.4 for mesh grids. In Fig. 6.7 also the MSE

bound (6.44) is reported, and it can be observed that such a bound is indeed

very strict far large values of t (where it saturates to the noise floor level

K '2 q ) while for smaller t it provides an excellent indication of the convergence speed. of each other is Pconn = '' 2, which is the ratio between the area covered by a node, and

the entire area. The result follows by also noting that the average number of connected

nodes is approximately Nneigh = N Pconn. 6.6. Numerical Results 111 0 20 40 60 80 100 0 0.05 0.1 ADMM Method B, AC with optimum ε ADMM Method B, AC with fixed ε N '' log ξ (a) 0 20 40 60 80 100 10 ''1 10 0 0 20 40 60 80 100 0 5 10 15 N N K

N ǫ (b) (c) Figure 6.8: ADMM Method B with fixed ǫ in a RGG with Nneigh = 8: (a)

convergence speed; (b) MSE at steady state; (c) optimal choices of ǫ used. Suboptimal choice for ǫ Computing the optimized values of ǫ according to (6.47) can be hard to

implement. For such reason, it is interesting to see what is the performance

loss if ǫ is chosen differently from ǫopt in (6.47). In Fig. 6.8, the performance

of ADMM in terms of convergence speed and noise level is shown as well

as the fixed values of ǫ used, this giving a flavour of the loss induced by

this suboptimal, but simpler to implement, choice for the amplitude factor

ǫ. The used values for ǫ are those that maximize the average convergence

speed, through the minimization of the average spectral radius definition

ξ = (E [ξtref ])1/tref , with tref = 100. It is clear from Figg. 6.8a and 6.8b that

this suboptimal choice for ǫ leads to a small loss both in convergence speed

and noise resilience. 112 6. Fast Consensus via ADMM 0 100 200 300 400 500 10 ''7 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 10 ''1 Standard consensus, MH

Std. consensus, Boyd

Std. consensus, 2 steps Boyd

Oreshkin, Boyd

Oreshkin, MH

ADMM Method A, Boyd

ADMM Method A, AC

ADMM Method B, Boyd

ADMM Method B, AC t MSEt N (a) N = 50 0 20 40 60 80 100 10 ''7 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 10 ''1 N MSEt N (b) t = 300 Figure 6.9: MSE with an underlying RGG: (a) as a function of t for N = 50

nodes, and (b) as a function of N for t = 300. Nneigh = 4, '2q = 10'' 6, and ǫ optimized for convergence speed. Comparison with the state-of-the-art In order to compare ADMM with the standard and fast consensus algorithms,

in Fig. 6.9a the MSE (averaged over 1000 noise realizations) is shown for a

RGG with N = 50 and '2q = 10'' 6. The used shape matrices are the Boyd''s optimum solution Sboyd, and the AC matrix SAC, the latter providing better 6.7. Conclusions 113 performance with respect to the MH matrix. The amplitude factor ǫ is again

chosen for optimizing the convergence speed according to (6.47). Fig. 6.9a shows how ADMM allows a dramatic speed improvement with respect to the optimal Boyd''s solution. On the other hand, the slight speed

loss with respect to [73] is balanced by a consistent improvement in noise

resilience when ADMM Method B is used. In Fig. 6.9b, the MSE is shown

for t = 300 and for different values of N. It can be seen that the relations

between the various algorithms hold, independently on the network size. 6.7 Conclusions This chapter went through the application of ADMM to the problem of av-

erage consensus. The approach used for analysing the ADMM properties

was the relaxation of the ADMM augmentation constants, thanks to which

a closed form matrix description of ADMM has been provided. Convergence

speed, noise resilience and simple and tight analytical bounds have been de-

rived. These last turned out to be particularly useful for optimizing ADMM

parameters, and for analytically assessing ADMM properties. It has been

shown how the ADMM-based consensus has excellent properties in terms of

convergence speed, which place it among the best algorithms for fast consen-

sus, getting performance similar to the state-of-the-art [73]. Moreover, the

ADMM-based consensus showed the best noise resilience among the stan-

dard and the fast consensus algorithms in the literature. All these properties

make the ADMM-based consensus an attractive tool for reaching clock syn-

chronization in a WSN quickly and with a very low noise level, as it is better

described in Chapter 7. 114 6. Fast Consensus via ADMM 6.A Appendix 6.A.1 Proof of Methods A and B Proof of Method A The augmented Lagrangian function corresponding to (6.2) can be derived

through the application of the results in [65, p. 255], therefore getting Lc(x, z, λ) = N X i=1 (xi '' θi) 2 + N X i=1 X j ''Ni 2λi,j(xi '' zj) + N X i=1 X j ''Ni ci,j(xi '' zj) 2 (6.54) where vectors x, z, λ and c respectively collect the states xi,zj, the Lagrange

multipliers λi,j, and the augmentation constants ci,j > 0. By using the

equivalence P N

i=1 P j ''Ni = P N

j=1 P i ''Nj , the ADMM update equations for t ' 0 are derived from (6.54), i.e., xi(t + 1) = θi + X j ''Ni ci,jzj(t) '' X j ''Ni λi,j(t) 1 + X j ''Ni ci,j zj(t + 1) = X i ''Nj ci,jxi(t + 1) + X i ''Nj λi,j(t) X i ''Nj ci,j λi,j(t + 1) = λi,j(t) + ci,j(xi(t + 1) '' zj(t + 1)) (6.55) where convergence to xi( '') = θ is assured by the fact that ci,j > 0, in- dependently on the choice for the initial conditions λi,j(0) and zj(0). A

more convenient form for (6.55) comes from setting null initial conditions,

λi,j(0) = 0 and zj(0) = 0, and by defining the new parameters λi(t) = '' P j ''Ni λi,j(t '' 1) 1 + P j ''Ni ci,j , (6.56) so that storing all the Lagrange multipliers is not needed any more. In

fact, by first noting that, from the second of (6.55) it is P i ''Nj λi,j(t) + 6.A. Appendix 115 P i ''Nj ci,j(xi(t + 1) '' zj(t + 1)) = 0 which, in turn, from the third of (6.55), gives X i ''Nj λi,j(t + 1) = 0 , t ' 0 , (6.57) by setting null initial conditions for λi,j(0), (6.57) results to hold for t ' ''1. Substituting such result into the second of (6.55), it turns out that zj(t) = X i ''Nj ai,jxi(t) , t ' 1 . (6.58) The update equation (6.4) is therefore obtained by substituting (6.58) in the

first of (6.55) and using definitions (6.3) and (6.6). It is valid for t ' 1. The initial conditions (6.5) come from the first of (6.55) and from (6.56). Proof of Method B In Method B, the augmented Lagrangian function is Lc(x, z, λ) = N X i=1 (xi '' θi) 2 + N X i=1 X j ''Ni 2λi,j(xi '' zi,j)+ N X i=1 X j ''Ni ci,j(xi '' zi,j) 2 (6.59) where an additional constraint, zi,j = zj,i, must be kept into account. The

ADMM update equations for t ' 0 in this case are xi(t + 1) = θi + X j ''Ni ci,jzi,j(t) '' X j ''Ni λi,j(t) 1 + X j ''Ni ci,j zi,j(t + 1) = ci,jxi(t+1)+cj,ixj(t+1) ci,j + cj,i + λi,j(t)+λj,i(t) ci,j + cj,i λi,j(t + 1) = λi,j(t) + ci,j(xi(t + 1) '' zi,j(t + 1)) (6.60) with ci,j > 0, which again assures convergence to xi( '') = θ for any initial condition λi,j(0) and zj(0). The second of (6.60) comes from the additional

constraint zi,j = zj,i in (6.7). By proceeding analogously to what done in

Method A, null initial conditions, xi(0) = 0 and zi,j(0) = 0, are considered,

thus obtaining ci,jzi,j(t + 1) + cj,izj,i(t + 1) = λi,j(t) + λj,i(t) + ci,jxi(t + 1) +

cj,ixj(t + 1) from the second of (6.60). By also using the third of (6.60), it 116 6. Fast Consensus via ADMM turns out that λi,j(t + 1) + λj,i(t + 1) = 0 for t ' 0. Moreover, by recalling λi,j(0) = 0, it also holds that for t ' 0 it is zi,j(t) = ci,jxi(t) + cj,ixj(t) ci,j + cj,i . (6.61) The substitution of (6.61) into the first and third of (6.60) finally gets (6.4),

with ui(t) as defined in (6.8) and λi still defined by (6.56). 6.A.2 Proof of Results of 6.3 Proof of (6.11), (6.12) and (6.13) The update equations (6.4) can be written in matrix form for t ' 1 as x (t + 1) = (I '' D)θ + (D + 2U)x(t) + λ(t) λ (t + 1) = λ(t) + U x(t) (6.62) with initial conditions x(1) = (I '' D)θ and λ(1) = 0. To reformulate (6.62) as an update of x(t) as a function of x(t) itself and of the previous values

x (t '' 1), x(t) is added and subtracted to the first of (6.62), thus getting x (t + 1) = h (I '' D)θ + (D + 2U)x(t) + λ(t) i + x(t) '' h (I '' D)θ + (D + 2U)x(t '' 1) + λ(t '' 1) i which gives (6.11) through the use of the second of (6.62), thus proving also

(6.12). Being t ' 1, the initial state would be s (1) = (I + D + 2U)(I '' D)θ (I '' D)θ . On the other hand, the same recursion is found by using the initial state

given by (6.13), as s(1) = M s(0). Proof of P1) and P2) The eigenvalues of M as described in P1) are assured by the convergence

properties of ADMM [65]. P2) is instead proved by the following properties F 1 = 1 , (1 + C1)'U = 0 , (1 + C1)'D = (C1)' . (6.63) 6.A. Appendix 117 Proof of Theorem 6.1 The properties of matrix F can be derived from standard matrix properties

[79, 82]. It must be first noted that, since (6.17) holds, it is ''2 = d 1 ''d I , so that F = (1 '' d) h C '''' 1 1 C ' + I i , Method A (1 '' d) h ' C + I + ''2 '' diag( ' C 1) i , Method B . (6.64) As a consequence, the following properties hold: a) Elements in F are all non negative, this fact being assured from ci,j > 0 in Method A, while in Method B it must be also noted that diag( ' C 1) = diag h X j ci,jcj,i ci,j+cj,i i 'diag hX j ci,j i = diag(''2) (6.65) with ' applied entrywise. b) F 1 = 1, since U 1 = 0 (see (6.63)).

c) F is row-stochastic as a consequence of a) and b).

d) Since the underlying graph is strongly connected, F is irreducible.

e) c) and d) assure that F has an eigenvalue '1 = 1 with single multi- plicity, while the others satisfy |'i| < 1, i 6= 1 [79, Theorem 10, p. 84]. f) Inspection of (6.64) leads to the conclusion that F is a real symmetric matrix. g) Inspection of (6.64) and by recalling (6.65) leads to the conclusion that F is positive definite. h) As a consequence of g), F has real valued and positive eigenvalues.

i) As a consequence of f) and g), F is diagonalizable by a unitary matrix.

j) As a consequence of F '' (1 '' d)I being a positive semidefinite matrix (seen by inspection), the eigenvalues of F satisfy fi ' 1 '' d. The block diagonalization (6.20) of matrix M is a consequence of the unitary diagonalization of matrix F . Proof of Theorem 6.2 Since M and ' M have the same eigenstructure except for the first eigenvalue (λ1 = 1 and 'λ1 = 0), Theorem 6.1 assures that ' M = Σ diag( ' Bi) Σ'' where ' B1 6= B1 and ' Bi = Bi, i 6= 1. Blocks Bi can be further analysed: a) If λ2i''1 6= λ2i, block Bi can be written as Bi = V iSiV '' 1 i where V i = λ2i''1 λ2i 1 1 , Si = λ2i''1 0 0 λ2i , V '' 1 i = 1 λ2i''1 '' λ2i 1 ''λ2i ''1 λ2i''1 . (6.66) 118 6. Fast Consensus via ADMM As a consequence, B t

1 = V 1 1 0 0 dt V '' 1 1 , B t

i = V i λt 2i ''1 0 0 λt 2i V '' 1 i which proves the first of (6.23) and (6.24). Standard properties of the geo-

metric series lead to the expression (6.25). For ' B1 it holds that ' B t

1 = V 1 0 0 0 dt V '' 1 1 which proves the second of (6.23) by substitution of V 1. b) If λ2i''1 = λ2i, block Bi can be expressed as Bi = V iJiV '' 1 i , with V i = λ2i 1 '' λ2i 1 ''1 , J i = λ2i 1 0 λ2i , V '' 1 i = 1 1 '' λ2i 1 ''λ2i , (6.67) where J i is a Jordan block of size Q = 2. It follows that B t

i = V iJ t

iV ''1 i = V i λt 2i tλt''1 2i 0 λt 2i V '' 1 i which, by use of (6.67), proves (6.26). 6.A.3 Proof of Theorems of 6.4 Proof of Theorem 6.3 By letting ' B t

i = βt 1,i βt 2,i βt3,i β t 4,i with Q the unitary matrix which diagonalizes F , it holds that ' M t = Q Q diag(βt 1,i) diag(βt 2,i) diag(βt 3,i) diag(βt 4,i) Q'' Q'' . It follows from (6.16) that x(t) '' x('') = γ Q diag(β t 3,i)Q ''θ, therefore the application of standard matrix properties (see [82]) lead to kx(t)''x('')k2 ' γ kθk2 maxi |β t 3,i|. Now, from (6.23) it is |β t 3,i| = d t/(1 '' d), which is valid for t > 1. On the other hand, from (6.26) it holds that |β t 3,i| ' tρ t ''1, and the same bound is valid for (6.25). Therefore, for t > d/(1 '' d) = ǫ it is dt/(1 '' d) < tρ t ''1 as d ' ρ, and this proves (6.28). 6.A. Appendix 119 Proof of Theorem 6.4 Taking the expected value of 'x(t) and using (6.41) leads to kγQHtQ'' 'θk2 ' γkHtk2k 'θk2 . By inspection of (6.42) (the reader is also referred to (6.25)), it can be seen

that kHtk2 ' tξ t ''1, which proves the first of (6.44). For the noise signal 'η(t), (6.43) and the statistical independence of w(t) lead to E k 'η(t)k 2 2 = t ''1 X k=0 E kQHkQ''w(t '' k)k 2

2 = t ''1 X k=0 trace (HkQ''RwQH''k) ' + '' X k=0 trace (HkQ''RwQH''k) = E k 'η('')k 2 2 . By recalling that the Hk are real-valued and diagonal matrices, it holds that E k 'η('')k 2 2 = '2 q N X i=2 '' X k=0 h 2 i,k ! [Q''RwQ]i,i '2q . Simple algebra manipulation allows to express '' X k=0 h2 i,k = 0 , i = 1 1+λ2i''1λ2i (1 ''λ22i ''1 )(1 ''λ22i)(1''λ2i''1λ2i) , i 6= 1, λ2i''1 6= λ2i 1+λ2i''1λ2i (1 ''λ2i''1λ2i)3 , i 6= 1, λ2i''1 = λ2i . The bound Ki in (6.45) is obtained from (1 ''λ2i''1λ2i) 2 > (1''λ22i''1)(1''λ22i), and from 1 + λ2i''1λ2i (1 '' λ22i''1)(1 '' λ 2 2i)(1 '' λ2i''1λ2i) = 1 + fi '' γ (1 '' fi)(1 + 3fi '' 2γ)(1 '' fi + γ) (6.68) where (6.22) is also used. The bound Ui for Method B descends on the fact

that matrix Q diagonalizes F , therefore [Q''RwQ]i,i '2q = [Q''(F '' I) 2Q]i,i = (fi '' 1)2 . 120 6. Fast Consensus via ADMM For Method A, the issue is more involved, since Q does not necessarily diag-

onalize (S ''diag(S'1))(S'''diag(S'1)). Generally speaking, an upper bound Ui can be derived by exploiting the positive semidefinite inequality sr (S ''diag(S'1))(S'''diag(S'1)) I (S''diag(S'1))(S'''diag(S'1)) . If S is symmetric, then (6.18) leads to F = d(S 2 ''I)+I, therefore Q results to diagonalize S, obtaining [Q''RwQ]i,i '2 q = (fi '' 1) 2 + d2[Q''(S '' I)2Q]i,i = (fi '' 1) 2 + d2(si '' 1)2 ' (fi '' 1) 2 + 4d2 where si are the real valued eigenvalues of S which satisfy ''1 < si ' 1. Finally, if S is also positive semidefinite, then the relation F = d(S 2 ''I)+I leads to fi = d(s 2 i '' 1) + 1 = '' si = r f i '' γ d which provides the last bound Ui by substitution. 6.A.4 Proof of Results of 6.5 Sketch of the proof of (6.47) and (6.48) The cases ǫ ' 1 and ǫ > 1 are treated separately, when the initial state is given by (6.46). ' ǫ ' 1: By inspection of Fig. 6.1a, it must be noted that λ2i''1 ' λ2i for all i, so that just λ2i''1 can be considered for the optimization purposes.

It results that λ2i''1 is a monotone increasing function of 'i, so that ξ is

determined by '2, hence ξ = λ3. The minimum is given by the lowest

curve with ǫ = 1, to have ξ = λ3 = ( 2'2 '' 1 , '2 ' 3

4 1

2 , '2 < 3

4 . (6.69) ' ǫ > 1: By inspection of Fig. 6.1b, it is |λ2i''1| ' |λ2i|, so (again) just λ2i''1 can be considered. The two cases, '2 ' 3

4 and '2 < 3

4 , need to be treated separately. 6.A. Appendix 121 '' '2 ' 3

4 : It holds that |λ3| > |λ2i''1| for all i > 2, hence ξ is (again) determined by '2 as ξ = |λ3|. The minimum here is reached by choosing ǫ in such a way that '2 corresponds to the point where

the eigenvalues locuses separate, that is, when f2 = fhigh. This

corresponds to the first line of (6.47). '' '2 < 3

4 : In this case, the minimum is guaranteed when '2 and 'N are equal, that is when |λ2N''1| = |λ3|. This provides the second line of (6.47). It must be observed that the values of ξ in (6.48) (derived by choice (6.47))

are the optimum choices, since they are smaller than those of (6.69). On the other hand, when the initial state is given by (6.13), the eigenvalue d also plays a role. For '2 ' 3

4 the solution provided by (6.47) is still optimal because ξopt ' d (see plots in Fig. 6.1b where d = λ2 set by '1 = 1). For '2 < 3

4 , the two cases ǫ > 1 (for which d > 1

2 , and so ρ > 1

2 ) and ǫ ' 1 (for which ξ ' d, and so ρ = ξ) need again to be treated separately. Thus, (6.69) is still the optimum choice, which gives ρ = 1 2 for ǫ = 1. Proof of (6.53) According to [73] the parameters (θ1, θ2, θ3) = ( ''δ, 0, 1 + δ) are used, so that the block Bi has the form Bi = (1 + αδ)ρi ''αδ 1 0 = λ2i''1 + λ2i ''λ2i''1λ2i 1 0 . Using the above in (6.68) leads to Ki = 1 + λ2i''1λ2i (1 '' λ22i''1)(1 '' λ 2 2i)(1 '' λ2i''1λ2i) = 1 (1 '' α2δ2)(1 '' ρ2i) . By then assuming the noise model w(t) = (1 + αδ)(W '' I)q1, equivalent to the one used in (6.36), it turns out that Ui = (1 + αδ)2(ρi '' 1) 2, from which (6.53) follows. 122 6. Fast Consensus via ADMM Chapter 7 An ADMM-Based Clock

Synchronization Algorithm The results presented in this chapter refer to a collaboration of Prof. Van-

gelista and me with Dr. Emiliano Dall''Anese 1 and Prof. Tomaso Erseghe2. They have been published in [83]. 7.1 Introduction Differently from the algorithms presented in Part I of this thesis, in this

chapter a clock synchronization algorithm is proposed in which a correction

signal is applied at each step of the algorithm. As it has already been said,

this makes the clocks be coupled together. The applied control signals result

to be linear functions of the observations of each node''s neighborhood, so

that this algorithm falls down in the class of linear consensus-based clock

synchronization. t will be clear next that the full synchronization issue (both

in frequencies and counters) is recast as a double consensus problem. In Chapter 6 an innovative algorithm for reaching average consensus with convergence speed among the best in the literature has been proposed: in the

present chapter the ADMM-based consensus is indeed applied to the consen-

sus on clock frequencies through the implementation of the required message

passing procedure based on the measurements of frequency displacements.

Plain consensus is applied for reaching agreement on the clock counters, but

this only affects marginally the results. In fact, it is shown that the innovative 1E. Dall''Anese is with the Department of Electrical Engineering, University of Min- nesota, 200 Union St., MN 55455, Minneapolis, USA. 2T. Erseghe is with the Department of Information Engineering, University of Padova, via G. Gradenigo 6/b, 35131 Padova, Italy. 123 124 7. An ADMM-Based Clock Synchronization Algorithm application of the ADMM-based of Chapter 6 to the problem of consensus

on clock frequencies allows to substantially reduce the time needed to get the

whole WSN synchronized, with respect to what happens with the application

of the standard consensus in both clock frequencies and clock counters. The

increased convergence speed has not a bad reflection on the noise resilience,

since it is clearly comparable to other consensus-based approaches. 7.2 Problem Statement The adopted system model is the one described by equations (5.8). As it

has already been said, in this chapter the synchronization on clock phases is

achieved via the application of a plain consensus algorithm, while agreement

on clock periods is obtained via the ADMM-based consensus, so that the

control signal u(t) comes from standard consensus, while v(t) is designed

according to the guidelines described in Chapter 6. By summarizing what is

to be described next, the proposed solution comprises a plain consensus algo-

rithm running in parallel to an ADMM-based consensus algorithm, ensuring

fast convergence to commons values for T (t) and Υ(t). 7.3 ADMM-based Consensus on Clock Periods Consider the goal of computing the average value ¯ Υ , (1/N) P N

i=1 Υi(0). The application of ADMM to this problem leads to the update equation

(6.11), in which x is replaced by Υ, thus providing Υ(t + 1) = (I + D + 2U )Υ(t) '' (D + U)Υ(t '' 1) , (7.1) with initial conditions Υ(1) = Υ(0) = [Υ1(0), . . . , ΥN(0)] ' , and where the involved matrices are those defined in (6.9) and (6.11) (Method A). Under

the assumption (6.17), it holds that C = ǫS , D = dI , d = ǫ ǫ + 1 , (7.2) with S the row-stochastic shape matrix and ǫ > 0 the amplitude factor.

It is possible to optimize the aforementioned parameters (shape matrix and

amplitude factor) in order to maximize the convergence speed, by choosing

a well suited matrix S (such as Sboyd or SAC) and, more importantly, an

optimized value for the amplitude factor ǫ. In fact, while a careful choice for

the amplitude factor is fundamental for reaching higher convergence speed,

the choice of the shape matrix is less relevant, as it has already been showed 7.3. ADMM-based Consensus on Clock Periods 125 in Chapter 6. For speeding up the convergence toward ¯ Υ, throughout the remaining of this chapter ǫ will be set equal to ǫopt according to (6.47). Using (7.2), a new matrix can be defined, being L = SQ , Q = [diag(S ' 1)]''1 S' , a symmetric doubly stochastic (L = L ' , L1 = 1) and positive semi-definite matrix, which allows to rewrite (7.1) as Υ(t + 1) = Υ(t) + d h (L '' I)Υ(t) + L(Υ(t) '' Υ(t '' 1)) i . (7.3) In general, (7.3) provides fast convergence (much faster than the optimal

Boyd''s solution) and resilience to noise impairments. From (7.3) it follows that the enforcing control v(t) at time t in (5.8b) can be expressed as v (t) = d h (L '' I)Υ(t) + L(Υ(t) '' Υ(t '' 1)) i , (7.4) and it can be seen that it is only dependent upon the clock periods vectors

Υ(t) and Υ(t '' 1). 7.3.1 Algorithm implementation The implementation of the control signal (7.4) is not an issue, since standard

PLL or DLL structures [34] are well suited techniques for correcting the

oscillation frequency. On the other hand, the values Υ(t) are not measurable

quantities. Fortunately, other related quantities can actually be measured

via the application of largely used techniques. In fact, methods that rely on

time differences [15,17,84], or efficient frequency estimation techniques (such

as the Schmidl-Cox [85]) can be effectively used to measure the (relative)

frequency displacements between node pairs, i.e., ''fi,j(t) , Υj(t) Υi(t) '' 1 , '' j '' Ni , (7.5) sequentially gathered at each node i. By exploiting the definition of the frequency correlation factor for sensor i at time t, i.e., ''Fi(t) , Υi(t '' 1) Υi(t) '' 1 , (7.6) the control signal (7.4) can be rewritten as a function of (7.5) as ''Fi(t + 1) '' d X j ''N (2) i Li,j h ''Fj(t) '' ''fi,j(t) i , (7.7) 126 7. An ADMM-Based Clock Synchronization Algorithm where N (2) i denotes the set of two-hop neighboring nodes of sensor i. Since the frequency offsets of practical systems is quite small (usually it ranges

from 10 to 100 ppm [43]), in the derivation of the above the approximations

[1 + ''Fi(t + 1)]''1 '' 1 '' ''Fi(t + 1) and ''Fi(t + 1)''fi,j(t) '' 0 have been used. The two-hop information flow represented by (7.7) requires a double in- formation exchange at each iteration t. More in the specific, if n is the in-

termediate node between nodes i and j, it is n '' Ni and j '' Nn. Moreover, again under the assumption of small frequency offsets, the approximation

''fi,j(t) '' ''fi,n(t) + ''fn,j(t) can be made. With this in mind, (7.7) can be expressed as ''Fi(t + 1) '' d X n ''Ni Si,n h ''Gn(t) '' ''fi,n(t) i , (7.8) with ''Gn(t) = X j ''Nn Qn,j h ''Fj(t) '' ''fn,j(t) i . (7.9) It must be recalled that the quantities ''fi,n(t) and ''fn,j(t) are measured

at node i and n respectively, while ''Fj(t) is transmitted during the first

message exchange and ''Gn(t) during the second exchange. 7.3.2 Noise resilience The frequency estimation process as well as the message exchanges involve

the presence of corrupting noise. Referring to eq. (7.8) and (7.9), an AWGN

model is assured both for the estimation noise affecting {''fi,n(t), ''fn,j(t)} and for the quantization/communication noise in {''Fj(t), ''Gn(t)}. All sources of noise are assumed mutually independent; this comes after the

assumption that {''fi,n(t), ''fn,j(t)} are estimated independently. Noise components are incorporated in the N '1 vector wv(t), to be added to the noiseless control signal (7.4), to obtain v (t) = d h (L '' I)Υ(t) + L(Υ(t) '' Υ(t '' 1)) i + wv(t) . (7.10) Because of the aforementioned assumptions, wv(t) is AWGN, zero mean and

with covariance matrix given by Rv , E [wv(t)w'' v (t)] '' d 2T 2 nom h S ΣGS ' + ((S ' Σf)S') ' I + LΣF L ' + S[((Q ' Σf)Q') ' I]S' i , 7.4. Consensus on Clock Counters 127 where ' stays for the Hadamard (entry-wise) matrix product, Υnom denotes the nominal clock period, and ΣF , diag('2F 1 , . . . , ' 2 FN ) , ΣG , diag('2 G1 , . . . , ' 2 GN ) , Σf , ['2 fi,j ]i,j=1,...,N , collect estimation and transmission noise variances. Because of uncorrelated noise, the following result holds from Theorem 6.4: Theorem 7.1 The noisy control signal (7.10) assures consensus on clock

periods Υ(t). Letting the deviation from the average value of Υ(t) be written

as ' Υ(t) , KΥ(t), with K = I '' 1 N 11 ' , upon defining the mean squared error on estimating ¯ Υ, MSEΥ(t) , E h k 'Υ(t)k 2 2 i , (7.12) the asymptotic convergence can be expressed by the norm-2 result lim t '''' MSEΥ(t) ' N X i=2 (1 + d''i)'2max d(1 '' ''i)(1 '' d''i)(2 + 3d''i '' d) (7.13) where ''i, i = 2, . . . , N are the eigenvalues of L excluding ''1 = 1, and where

'2 max is the largest eigenvalue of matrix KRv K . 7.4 Consensus on Clock Counters Although a straightforward application of the ADMM-based consensus to

clock phases T (t) is possible since clock counters are directly observable

values, it eventually leads to a bias on the control signal u(t) at steady state.

Therefore, a plain consensus algorithm is applied for reaching agreement also

on such readings, so that the faster convergence property coming from the

application of ADMM to clock periods can be exploited anyway in order to

get faster full synchronization. The noise corrupted control signal u(t) can be expressed as u (t) = (S '' I)T (t) + wu(t) , (7.14) where S is any standard consensus matrix, such those described in Section

5.3.2. Noise wu(t) encompasses AWGN noise sources analogously to wv(t) 128 7. An ADMM-Based Clock Synchronization Algorithm in (7.10). More specifically, wu(t) comes from the noisy readings T (t) and

it is assumed to be zero mean and with covariance matrix Ru = (S '' I)ΣT (S '' I) ' , ΣT , diag('2 T1 , . . . , ' 2 TN ) , with '2 Ti being the variances of communication and/or quantization noise affecting the readings T (t). As for clock periods, the used metric for assessing the algorithm per- formance is the mean squared error in agreeing on common clock counters,

i.e., MSET (t) , E k'c(t)k 2 2 , (7.15) where ' T (t) , KT (t) . (7.16) 7.5 Numerical Results Performance of the proposed ADMM-based clock synchronization algorithm

are now evaluated and compared with existing approaches, in terms of syn-

chronization error ' T (t) and MSE. The considered setup consists of a WSN of N = 50 nodes whose connec- tivity is represented by a RGG with an average number of neighbors equal

to 6. In this scenario (a representation can be observed in Fig. 7.1), nodes

are uniformly disposed in a disc area of 1 m2. ADMM-based consensus is applied for reaching consensus on clock pe- riods, with averaged consensus shape matrix SAC (see Section 5.3.2) and

amplitude factor ǫ optimized for convergence speed (see (6.47)). Matrix S

is also set equal to the averaged consensus matrix (ADMM+AC),. Noise is

characterized by the (diagonal) covariance matrices Ru = '2uI and Rv = ' 2 v I , with '2u = 10'' 2 and '2 v = 10 ''6 respectively for the control signals u(t) and v (t). The nominal oscillation frequency is set to fnom = 1/Υnom = 32768 Hz. The initial conditions Υ(0) are chosen as independent Gaussian random

variables, with mean Υnom and standard deviation 50 ppm. The initial con-

ditions for clock counters T (0) are chosen as independent Gaussian random

variables with zero mean and unitary variance. Finally, the time interval

between two successive iterations of the algorithm is set to 10 seconds. 7.5.1 Convergence Rate and Noise Resilience Through an analysis of the synchronization error (7.16) and of the mean

squared errors (7.15) and (7.12), in this section the convergence speed and the 7.5. Numerical Results 129 Figure 7.1: Network communication graph. noise resilience of the proposed ADMM-based algorithm ((5.8a) and (5.8b))

will be evaluated. The phase errors at five nodes is showed in Fig. 7.2. It can be seen how the synchronization error decreases as time goes by. Moreover, a snapshot around

t = 560 shows that, from this time on, the network remains synchronized,

that is, the absolute value of the synchronization error stays below 1. Note

that in the network setup of Fig. 7.1, nodes are pretty sparse, so that the

algorithm takes a while to get the network synchronized. On the other hand,

it will be shown next that other consensus-based approaches have worse

performance. The characteristic sawtooth shape of Fig. 7.2 is a consequence of the fact that, at the discontinuity points, the correction signals ((7.14) and (7.10)) are

applied, and until the next iteration, the frequency skew remains constant;

thus, in such intervals the synchronization error increases linearly. However,

the slope of the error decreases as time goes by, denoting a progressive im-

provement in the clock periods agreement across the network. In Fig. 7.3 the considered metrics are the MSEs (7.15) and (7.12), namely MSET (t) and MSEΥ(t) respectively. Consensus on clock phases comes ob-

viously after agreement on periods has been achieved. Moreover, it can be 130 7. An ADMM-Based Clock Synchronization Algorithm 0 100 200 300 400 500 600 700 800 900 1000 ''400 ''200 0 200 400 t 190 195 200 205 210 215 220 ''2 ''1 0 1 2 t Figure 7.2: Synchronization phase error Ti(t) for five nodes. noted that the initial disagreement on the clock periods makes the clock

phase error to increase. Such trend is rightafter inverted as consensus on

clock periods is pursued. Noise bound (7.13) for the estimation of ¯ Υ is also showed in Fig. 7.3, confirming its tightness: in fact,the simulative error curve overlaps the bound

at steady state. 7.5.2 Performance Comparison with Standard Consen-

sus The ADMM-based clock synchronization algorithm provides better perfor-

mance with respect to synchronization algorithms based on plain consensus,

even with optimized settings. In fact, Fig. 7.4 shows MSET (t) and MSEΥ(t)

for ADMM+AC as well as the performance coming from the double applica-

tion of plain consensus, both to clock phases and to clock periods. The (com-

mon) matrix used for the double consensus synchronization algorithm is the

averaged consensus matrix SAC (AC). As a benchmark, (since its high com-

putational cost prevents its practical implementation) the Boyd''s optimum

solution Sboyd (see Section 5.3.2) is also used (BO). The great performance

improvement brought by the ADMM-based consensus is well visible: for

clock periods, the convergence speed of the algorithm introduced in Chap-

ter 6 proves to be approximately more than three times the one achieved

by plain consensus. This excellent performance also brings relevant bene- 7.5. Numerical Results 131 0 50 100 150 200 250 300 350 400 450 500 10 ''6 10 ''4 10 ''2 10 0 10 2 t ADMM + AC

noise bound 0 50 100 150 200 250 300 350 400 450 500 10 ''2 10 0 10 2 10 4 t Figure 7.3: MSE for clock phases and clock periods with ADMM+AC. 0 50 100 150 200 250 300 350 400 450 500 10 ''6 10 ''4 10 ''2 10 0 10 2 t ADMM + AC

AC + AC

BO + BO 0 50 100 150 200 250 300 350 400 450 500 10 ''2 10 0 10 2 10 4 t Figure 7.4: MSE of ADMM+AC compared to standard and optimized con-

sensus approaches. 132 7. An ADMM-Based Clock Synchronization Algorithm fits for reaching full synchronization: in fact, while for standard consensus

more than 340 iterations are needed in order to get the network synchronized

(i.e., MSET < 1), the same performance is obtained by waiting almost 270

iterations if using the ADMM based consensus. Finally, as it can be observed in figure, the noise resilience of the proposed algorithm is comparable to that of the standard consensus approaches. 7.6 Conclusions While in Chapter 6 a new distributed consensus algorithm based on the

ADMM has been introduced, here it was shown how its application to the

problem of clock synchronization in WSNs leads to better results, compared

to the state-of-the-art of consensus-based clock synchronization. ADMM-based consensus was here applied for achieving agreement on clock periods, while plain consensus was used for clock phases, since a direct

application of the method proposed in Chapter 6 would have lead to an

undesired and not negligible bias on the correction signal u(t). Even if ADMM-based consensus could not be straightforwardly applied to clock phases synchronization, its use for agreement on clock periods had

extremely positive consequences on the convergence speed of the overall al-

gorithm. In fact, it allowed to reach full synchronization well in advance with

respect to existing consensus-based approaches. Appendix A Efficient Base Station Selection in

Uplink Cellular Networks The results presented in this chapter refer to a collaboration with Prof. Van-

gelista and Prof. Stefano Tomasin 1 and have been published in [86, 87]. While falling out of the main topic of this thesis, because of its relevance in

my Ph.D. program this work has been included as an appendix chapter. Be-

ing self-standing, the present work can also be appreciated by the interested

reader independently from the rest of the thesis. A.1 Introduction Wireless cellular systems are characterized by the communication between a

mobile terminal (MT) and one or more base stations (BSs). Recent advance-

ments in the research community thinking brought the concept of coopera-

tion among BSs to the attention of the other researchers as well as to the

industry. In fact, by adopting cooperation among BSs the network through-

put can be increased via the implementation of a distributed multiple-input

multiple-output (MIMO) system, also known as macro-diversity or multi-cell

processing (MCP). MCP has been first studied from an information theoretic viewpoint and later also in more realistic scenarios with correlated Rayleigh fading between

antennas [88] and with possible application to 3GPP long term evolution

(LTE) advanced [89]. Moreover, scheduling and power allocation have been

studied for MCP in downlink [90]. 1S. Tomasin is with the Department of Information Engineering, University of Padova, via G. Gradenigo 6/b, 35131 Padova, Italy. 133 134 A. Efficient Base Station Selection in Uplink Cellular Networks In uplink, MCP encompasses the cooperation among different BSs in or- der to decode the packet transmitted by a MT in a joint fashion at the

serving BS [91, Chapter 2]. In other words, in uplink, cooperating BSs share

information on the received packet therefore they need to exploit the back-

haul network in a non-negligible manner. The issue of designing the proper

characteristics of the backhaul network has been faced in [92, 93], where the

capacity of an uplink MCP system with limited backhaul throughput has

been investigated. Also, in [94], the goal is to maximize the user throughput

while having a limited backhaul capacity. Cooperation among BSs has been studied in [95] for application to a fre- quency division duplexing (FDD) LTE network in which both demodulation

and decoding are performed at the serving BS, thus requiring a heavy use of

the backhaul network. Differently, in [96] just decoding is performed at the

serving BS, therefore only soft information bits are forwarded to the serving

BS. A study on the asymptotic properties of the spectral efficiency of both

the optimum and the linear minimum mean squared error (MMSE) joint

multi-cell receivers have been performed in [97] assuming a large number of

users per cell, showing that cooperation among receivers allows a significant

performance improvement. As it can be evinced from this discussion, the main issue in the design of an uplink MCP system is the high capacity backhaul network necessary

to exchange soft information on the demodulated signals among BSs. The

selection of the appropriate cooperating BSs for performing decoding jointly

is therefore fundamental from this point of view: on this matter, the authors

of [98] select the BSs in order to maximize the network throughput for a given

user deployment under backhaul capacity constraints. Handoff for uplink

MCP and BS selection are investigated in [99, 100]. In [101], a dynamic

greedy algorithm is proposed for selecting the cooperating BSs considering

multiple MTs transmitting in a cell and using linear receive beamforming

with multi antenna. Considering error control policies leads to a higher overhead on the back- haul network, since both automatic repeat request (ARQ) and hybrid ARQ

(HARQ) encompass retransmissions by the MTs, thus increasing the infor-

mation sharing and the average decoding delay, while naturally allowing a

more reliable communication. This work considers an uplink MCP system in which BSs can cooperate in order to decode a packet transmitted by a single MT in the considered sector.

(H)ARQ error control policies are used for maintaining a threshold level of

quality of service (QoS) in terms of maximum allowed outage probability

after a given number of retransmissions. The objective is to accordingly

select the cooperating BSs such that the average number of cooperating BSs A.2. System Model 135 is minimized (therefore reducing the backhaul network usage). The approach

of this work is different from what most of the literature does since, instead

of imposing a constraint on the backhaul capacity, a QoS target must be

guaranteed to the user with the aim of minimizing the backhaul occupancy.

Such an approach especially fits with HARQ error control since the duration

of the packet transmission is not fixed: the backhaul network can therefore

be statistically exploited by each MT. Another novel aspect of this work consists on the number of considered policies: in fact, while error control is implemented either with ARQ, HARQ

with chase combining or HARQ with incremental redundancy, two levels of

cooperation among BSs are also assumed: joint decoding: cooperating BSs forward soft information to the serving BS which tries to decode the packet local decoding: cooperating BSs try to decode the packet locally and only exchange one information bit on whether they decoded the packet or

not. In this work, the number of cooperating BSs at each retransmission frame is optimized assuming to have perfect knowledge on the average channel at-

tenuation (path loss) between the MT to each cooperating BSs. Since finding

the optimum solution is computationally expensive, an iterative algorithm is

introduced which proves to be both efficient and effective. Although making

strong assumptions on the channel knowledge, this work provides a way to

increase the system efficiency and may provide further insights for possible

future implementations of MCP solutions. A.2 System Model The considered scenario (see Fig. A.1) involves a MT transmitting a packet to

a serving BS, which is typically the closest BS to the terminal. It is assumed

that the surrounding BSs may help the serving BS through cooperation in

a macro diversity approach. Such BSs are called cooperating BSs. Just

one transmitting MT is considered, since tit is assumed that a pre-selection

phase [102] has already taken place. However, a cooperating BS may serve

other MTs as well as cooperate with other serving BSs at the same time. As

an alternative to the use of other BSs as cooperators, specific devices (re-

lays) may be used, therefore implementing a distributed broadband wireless

communication (BWC) system. On the other hand, this is not relevant for

the upcoming analysis. In fact, the purpose of this work is to verify if using 136 A. Efficient Base Station Selection in Uplink Cellular Networks cooperating BSs proves useful for reducing the backhaul usage in decoding a

packet transmitted by a MT while meeting certain requirements on the QoS. Figure A.1: The MT (red cross) is communicating with the M = 3 closest

BSs (blu squares) in a cellular scenario. Unsuccessful decoding is due to channel fading conditions, so that meet- ing QoS targets requires the adoption of error management policies. Error

control is implemented via the use of retransmission policies like plain ARQ

or hybrid ARQ (HARQ), the latter either with chase combining (HARQ-CC)

or incremental redundancy (HARQ-IR). Considering a time division multi-

plexing system, retransmissions take place in different frames, each organized

in two slots: the first slot is used by the MT to transmit data packets at fixed

spectral efficiency R (in bit/s/Hz), while in the second slot the serving BS re-

sponds either with an acknowledgement (ACK) or with a not ACK (NACK)

to the MT, in order to signal successful or failed packet decoding, respec-

tively. The second slot is generally shorter than the first, being ACKs and

NACKs shorter packets with respect to data packets transmitted by the MT.

A further assumption about ACK and NACK packets is that they are always

decoded successfully by the MT: this can be reasonably assumed if, for ex-

ample, the transmission rate of ACKs and NACKs are low enough. (H)ARQ A.2. System Model 137 frames are retransmitted until either the serving BS decodes the packet or a

maximum of N frames is reached, in which case an outage event is declared. Channel fading is assumed rapidly time-varying and not known at the BSs. The average channel conditions (i.e., path loss in the channels from the

MT to the BSs) are instead assumed known at the BSs. Given the channel

statistical description, the serving BS selects xn '' [0, . . . , M] BSs (including itself) that will cooperate at frame n '' [1, . . . , N] for decoding the packet transmitted by the MT. The xn BSs are denoted as active at frame n. Two cooperation methods among BSs are envisioned in this work. On one hand, the active BSs try to decode the packet locally after each frame

without exchanging any further soft information: this approach is called local

decoding (LD), and the only data exchange between BSs is 1 bit indicating

the outcome (either success or failure) of the local decoding process at the end

of each frame. Differently from LD, in joint decoding (JD) the cooperating

BSs instead forward all the received bits to the serving BS which tries to

decode the packet making use of all the available data. In this latter case,

the active BSs exchange soft information at the end of each frame. Because of

the greater amount of available data, JD is expected to perform better than

LD in terms of QoS. This comes at the cost of a larger use of the backhaul

network. As a matter of comparison, also the case in which just one (the

serving) BS is active is considered and denoted by the acronym NC, which

stands for no cooperation. Let 'm be the average signal to noise ratio (SNR) of the channel between the MT and the BSs m. It is assumed that the BSs are sorted according to

decreasing values of the SNR, i.e., '1 ' '2 ' . . . ' 'M . (A.1) At each frame n, the serving BS selects the xn BSs with the higher SNRs to

get the best results. In order to evaluate the performance of the various approaches, an appro- priate metric must be introduced. To this end, let pn(x, ', R) be the proba-

bility of decoding failure at the serving BS before frame n, with n = 1, . . . , N;

let also be p1(x, ', R) = 1. In the definition of pn, the dependence on the

number of active BSs at each frame x , [x1, . . . , xN], on the SNRs of the BSs-MT links ' , ['1, . . . , 'M] and on the required spectral efficiency R is made explicit. The fundamental metric adopted in this work is then the

following. Definition A.1 The average number of active BSs is defined as f (x) , N X n=1 xnpn(x, ', R) . (A.2) 138 A. Efficient Base Station Selection in Uplink Cellular Networks Since f is a weighted average of xn, it provides a measure of the backhaul

network usage, even if a specific metric will be introduced in the next section

for such specific purpose. The objective here is to elaborate a strategy in order to accordingly choose the vector x so that the average number of active BSs (A.2) is minimized,

under a constraint on the maximum outage probability. In other words, the

goal is to solve the problem min x f (x), (A.3a) subject to pN+1(x, ', R) ' θ, (A.3b) xn '' {0, 1, . . . , M}, n = 1, 2, . . . , N. (A.3c) This is essentially different from just requiring to minimize the outage prob- ability. In fact, in that case the optimum solution would be simply setting

xn = M for all n, thus overloading the backhaul network. However, in prob-

lem (A.3) the goal is to preserve the backhaul resources while at the same

time guaranteeing a target QoS. In order to attack problem (A.3), it is necessary to derive explicit ex- pressions for the outage probability in each combination of error correction

method and decoding policy. This is to be done in the following sections. A.3 Outage Probability Expressions The goal of this section is to provide explicit expressions for the outage prob-

ability pn before n frames, in order to enforce the constraint (A.3b). All the

cooperation policies among BSs and retransmission methods are considered,

while assuming that the channels suffer of flat Rayleigh fading with path

loss. To this end, new quantities and notations are introduced. Moreover, in

order to evaluate the cell coverage, special focus is given to the case 'm = '

for all m, this case being the condition in which the MT is at the edge of

all surrounding cells. The importance of this case resides on the fact that it

provides the highest outage probability, thus proving useful in system design

for cell dimensioning and sites position planning. The considered combinations of decoding policies and error control meth- ods encompass JD, LD and NC, used together with both HARQ-CC and

HARQ-IR. Plain ARQ is used only in combination with LD and NC, since

JD has no difference with LD, as it will be clear next. Table A.1 summarizes

the aforementioned combinations. In this section, co-user interference is not considered in order to obtain analytical results. The scenario in which only one MT is transmitting inside A.3. Outage Probability Expressions 139 ARQ HARQ-IR HARQ-CC NC X X X LD X X X JD X X Table A.1: Combination of decoding policies and retransmission methods. the sector at that specific time can be achieved by several techniques, ranging

from multiple antennas or multiple receivers in each BS and time or frequency

multiplexing among cells, at the price of either a higher cost or a lower

spectral efficiency. Section A.5 reports simulative results in which both inter-

cell and intra-cell interference are taken into account. In combinations involving the use of JD and HARQ, there is a non negligi- ble backhaul occupancy, since data on the received signals must be forwarded

from the cooperating BSs to the serving BS. On the other hand, with LD

no soft information forwarding is needed, and the backhaul occupancy (1

bit) can be considered negligible. Given this, in order to better evaluate the

backhaul usage of JD, it will be assumed, for simplicity, that the BSs are

connected on a ring, each BS being able to communicate directly just to the

previous and the following BS in the chain. Definition A.2 The average number of backhaul packets exchanged among

active BSs BP is defined as the average sum of the lengths of the paths from

each active BS to the serving BS, BP = N X n=1 xn '' 1 2 xn '' 1 2 + 1 + '(xn) xn + 1 2 pn(x, ', R) , (A.4) with '(xn) = ( 1 xn even

0 xn odd. The backhaul occupancy is not the only performance metric adopted in this work. In fact, it is well known that retransmissions due to decoding

failures imply a delay on the packet delivery. Therefore, it is necessary to

keep into account of such delay by introducing a proper metric. Definition A.3 The normalized (with respect to the packet duration) aver-

age decoding delay D is the average time required for decoding a packet when

transmitted in uplink from a MT, i.e. D = X n pn(x, ', R) . 140 A. Efficient Base Station Selection in Uplink Cellular Networks The delay D depends on the MT position. Its meaning does not describe a

time but the number of transmissions needed to achieve correct decoding. In

fact, the purpose of introducing such a quantity is to get a measure of the

retransmission delay, while being transparent to delays in the retransmission

requests from the BS. Moreover, the design of the backhaul network must

consider the traffic of the soft information as well as the signalling packets

about channel conditions that allow BS selection, in order not to impact

on D. The actual performance comparison will be done on a delay-related

quantity: Definition A.4 The normalized average payload throughput ' is the ratio

between the required spectral efficiency and the (normalized) average decoding

delay, i.e. ' = R D . (A.5) In order to simplify the upcoming analysis, the indicator function relative to BS m at frame n ym(n) , ( 0 no cooperation

1 cooperation , (A.6) is introduced. Since the BSs are ordered with increasing path loss with

respect to the MT, ym(n) are univocally determined from xn. Moreover, it

is useful to introduce the number of frames in which BS m has cooperated

until frame n as Ym(n) , n X j=1 ym(j), for m = 1, 2, . . . , M, and for n = 1, 2, . . . , N. By inspection of (A.6), it follows that the ym(n) can be expressed in matrix form, with the row sums being equal to Ym(N) and the column sums

equal to xn, i.e.,

y1(1) y1(2) · · · y1(N) y2(1) y2(2) · · · y2(N) ... ... . .. ... yM(1) yM(2) · · · yM(N)

+ '' Y1(N) + '' Y2(N) · · · + '' YM(N) '' + '' + '' + x1 x2 · · · xN . In the following, the explicit expressions of the outage probability pn are derived for all the cases of interest. The NC policy is omitted since it can be

obtained from either the JD or the LD case by setting M = 1. A.3. Outage Probability Expressions 141 A.3.1 Local Decoding with ARQ (LD-ARQ) Assuming plain ARQ, if the first transmission by the MT fails, the terminal

retransmits the same packet to the BSs. At the receiver side, the BSs discard

the packet each time a decoding failure happens. This means that there is

no information reuse neither from different frames nor from different BSs, so

that the only meaningful policy for ARQ is LD. In this case, the expression of

the outage probability after n frames, pn+1, is given by the product '' over the

active BSs and frames '' of the probabilities of decoding failure for each BS

m during each frame n. The spectral efficiency at BS m for a given channel

realization is given by the random variable Φ LD ''A m (x, 'm) , log2 1 + ym(j)'m|hm(j)| 2 , where hm(j) are i.i.d. normalized complex Gaussian random variables mod-

eling the Rayleigh fading in the channel from the MT to BS m at frame j.

Path loss is taken into account by 'm. The outage probability is therefore

given by pLD''A n+1 (x, ', R) = n Y j=1 xj Y m=1 P[Φ LD ''A m (x, 'm) < R] = = n Y j=1 xj Y m=1 am, (A.7) with am , P['m|hm(j)|2 < 2R '' 1] =

= 1 '' e'' 2R ''1 'm . Given the ordering (A.1) assumed for the BSs, it holds that 0 ' a1 ' . . . ' aM ' 1 . (A.8) A.3.2 Local Decoding with HARQ-CC (LD-HARQ-CC) In LD-HARQ-CC the outage probability is the product of the partial outage

probabilities at each BS. The spectral efficiency at BS m is the random

variable ΦLD''CC m (x, 'm) , log2 1 + n X j=1 ym(j)'m |hm(j)| 2 ! , 142 A. Efficient Base Station Selection in Uplink Cellular Networks while the probability of being in outage after n frames is pLD''CC n+1 (x, ', R) = M Y m=1 P ΦLD''CC m (x, 'm) < R . (A.9) Probability (A.9) can be rearranged as M Y m=1 P " n X j=1 ym(j)'m |hm(j)| 2 < 2R '' 1 # = M Y m=1 FC Ym(n), 2R '' 1 'm where FC(k, z) is the cumulative distribution function (cdf) of the Gamma

random variable P k

''=1 |X''| 2 (with X'' i.i.d. normalized complex Gaussian random variables) with scale 1 and shape k, being defined as FC(k, z) = 1 (k '' 1)! γ(k, z), z ' 0 , where γ(k, z) = '' X l=0 ( ''1) l l! zk+l k + l , z ' 0 . For the scenario of multi-relay networks, similar results have been ob- tained in [103]. A.3.3 Joint Decoding with HARQ-CC (JD-HARQ-CC) In JD-HARQ-CC the spectral efficiency is given by ΦJD''CC(x, ') , log2 1 + n X j=1 xj X m=1 'm |hm(j)| 2 ! with outage probability after n frames pJD''CC n+1 (x, ', R) = P ΦJD''CC(x, ') < R . (A.10) Even if in Appendix A.A.2 a closed-form expression of probability (A.10) is

derived, it results computationally intractable. To simplify the treatment,

the worst case scenario is assumed in the absence of interference, that is

the MT is assumed at the edge of all surrounding sectors, and 'm = ' for

m = 1, 2, . . . , M. In this case, probability (A.10) is given by the cdf of a

Gamma random variable with scale 1 and shape kn, where kn , P n

j=1 xj , i.e., p JD ''CC n+1 (x, ', R) = P[' khnk 2 < 2R '' 1] = FC kn, 2R ''1 ' . A.3. Outage Probability Expressions 143 In Section A.5, the system performance is studied with MTs dropped ran-

domly in the sector, thus extending the special case 'm = ' through numer-

ical simulation. A.3.4 Local Decoding with HARQ-IR (LD-HARQ-IR) For LD-HARQ-IR, information is accumulated at each BS, frame after frame.

The spectral efficiency therefore results to be ΦLD''IR(x, 'm) , n X j=1 ym(j) log2 1 + 'm|hm(j)| 2 with the outage probability after n frames given by p LD ''IR n+1 (x, ', R) = M Y m=1 P ΦLD''IR(x, 'm) < R . An alternative view is that BS m fails to decode the packet if the sum of

Ym(N) random variables of the kind log2 (1 + 'mZ) is less than R. Fol-

lowing the approach of [104], ΦLD''IR(x, 'm) is approximated with a Gaus-

sian random variable with mean µLD''tot(m) , Ym(N)µ('m) and variance

'2 LD ''tot(m) , Ym(N )' 2('m), where µ('m) , E[log2(1 + 'mZ)] = log2(e)e 1/'m E1(1/'m) and '2('m) , E[(log2(1 + 'mZ) '' µ('m)) 2] = 2 'm log 2

2(e)e 1/'m G4,0 3,4 1/'m

0,0,0

0, ''1,''1,''1 '' µ 2('m), where E1(x) , R '' 1 t''1e''xtdt and Gm,n p,q z

a1,...,ap

b1,...,bq is the Meijer G function. Therefore the approximation pLD''IR n+1 (x, ', R) '' M Y m=1 Q µLD''tot(m) '' R 'LD''tot(m) holds, where Q[ ·] is the tail probability of the standard Gaussian distribution. 144 A. Efficient Base Station Selection in Uplink Cellular Networks A.3.5 Joint Decoding with HARQ-IR (JD-HARQ-IR) For JD-HARQ-IR, the serving BS keeps accumulating information coming

from the cooperating BSs at each frame, so that the spectral efficiency is

given by ΦJD''IR(x, ') , n X j=1 log2 1 + M X m=1 ym(j)'m |hm(j)| 2 ! . (A.11) The expression of the outage probability after n frames is therefore pJD''IR n+1 (x, ', R) = P ΦJD''IR(x, ') < R . (A.12) A closed-form expression of (A.12) cannot be derived, even for the simple case of large n and 'm = ' for all m. Indeed, in Appendix A.A.1 it is

discussed how even a Gaussian approximation of (A.11) cannot be exploited

since a closed-form expression of the variance '2 Φ does not exist. A.4 Base Station Selection Solving problem (A.3) is not at easy task at all. Being an integer pro-

gramming problem, its solution requires an exhaustive search (ES) over the

(M + 1) N configurations of active BSs at each frame. Indeed, as the maxi- mum number of active BSs M and the maximum number of (H)ARQ frames

N increase, the ES becomes computationally unfeasible. By adopting a con-

tinuous relaxation of the problem, when the outage probability has a closed-

form tractable expression a solution can be found, as shown in Section A.4.2

for LD-ARQ and MT positioned at the same distance from all the M sur-

rounding BSs. In the general case, there is the need of a heuristic algorithm

for recursively computing the number of active BSs at each frame. The

approach being introduced in the next section provides a low complexity

suboptimal solution to problem (A.3), whose performance results extremely

close to ES, thus justifying its adoption in real scenarios. A.4.1 The Recursive Search (RS) The recursive search (RS) algorithm iteratively computes the number of ac-

tive BSs at each frame n, by starting from a static feasible solution and then

modifying the xn of a single unit, one at a time in a greedy fashion. In fact,

the recursive procedure generally decreases the number of active BSs at early

frames and decreases it at later frames, trying to reduce the average number A.4. Base Station Selection 145 of active BSs (A.2). The rationale behind this procedure is that if correct

decoding occurs at an early stage, the BSs scheduled for cooperation in the

next frames are not active. On the other hand, to get an early decoding, a

high number of active BSs is necessary at the earliest frames. Therefore, a

trade-off must be designed between probability of early decoding and number

of active BSs at early frames. With such assumptions, RS looks for the best

trade-off by simultaneously ensuring that the QoS constraint on the outage

probability is met. Although RS is generally suboptimal, in Section A.4.2 it

is shown that in a particular case (LD-ARQ and 'm = ' for all m) it leads

to the optimum solution. The starting configuration of RS is a static cooperation (SC), which is a condition in which the number of active BSs is the same at each frame, i.e., x(0) n = xmin , min{1 ' M' ' M : pN+1([M', . . . , M'], ', R) ' θ}. (A.13) The value of x (0) n is obtained via a search procedure. At iteration i + 1, with i = 0, 1, . . ., the BS configuration is modified by selecting two frame indices s and d, with s < d. Then the number of active

BSs at the early frame s is decreased by one and the number of active BSs

at the later frame d is instead increased by the same quantity, i.e., x(i+1) s = x(i) s '' 1 , x (i+1)

d = x (i)

d + 1 . Then if f (x(i+1)) ' f(x (i)) and condition (A.3b) is still satisfied, then the new BS configuration x(i+1)

is saved as the tentative optimum configuration of active BSs ( ¯ x ). This means that at each iteration the total average number of cooperating BSs is

decreased while meeting constraints (A.3b) and (A.3c). The choice of s and

d satisfies s, d '' {1, 2, . . . , N} and s < d , (A.14) starting from s = 1. Moreover, for a given value of s, d is first set to N

and then reduced, thus first activating BSs at the later stages, to have the

sharpest decrease of f (xi+1). The pseudocode of the RS algorithm is reported

in Table A.2. Note that with RS the total number of active BSs is kept fixed, i.e., N X n=1 x(i) n = xminN . (A.15) If the MT is at the same distance from all the surrounding BSs and ARQ is used, RS assures that the outage constraint is always satisfied without the 146 A. Efficient Base Station Selection in Uplink Cellular Networks Table A.2: Recursive Search Algorithm. Set x (0) n = xmin from (A.13), for all n = 1, . . . , N; Set i = 0

for s = 1 to '' N 2 '' for d = N down to s + 1 while x (i) s > 0 x (i+1) s = x (i) s '' 1; if x (i)

d < M then x (i+1)

d ' x (i)

d + 1 otherwise break; end if

if pN+1(x'', ', R) ' θ and if f(x (i+1)) < f(x(i)) then ¯ x ' x (i+1); end if i ' i + 1 end end end

Return ¯ x as the set of active BSs. need of checking it explicitly. In fact, (A.7) assures that in this case the

outage probability only depends on the total number of active BSs across

all frames. However, in general this is not the case, therefore the outage

constraint (A.3b) must be checked at each iteration. Simulation results re-

ported in Section A.5 show that the performance of RS is really close to the

optimum solution provided by ES, in all considered policies. Moreover, RS

represents a low complex solution since it can be easily shown that it con-

verges in O (MN2) iterations, i.e. in polynomial time. On the other hand,

the ES requires the evaluation of (A.2) for each possible configuration of ac-

tive BSs, therefore requiring O (M + 1)N operations, i.e., an exponential number on the parameter N. Genetic approaches as well as sphere decoding

algorithms have been also proposed in the literature [98]: they lead to good

results but at the cost of a higher computational burden. A.4.2 A closed-form expression for LD-ARQ Considering LD-ARQ leads to a closed-form solution of the continuous relax-

ation of problem (A.3) in a specific case. However, before proceeding with

the specific solution, it is worth to notice why the rationale behind the RS

algorithm is justified. A.5. Numerical Results 147 Theorem A.1 Vector x solving the minimization problem (A.3) for LD-

ARQ satisfies xn ' xn+1, n = 1, 2, . . . , N '' 1. Proof: See Appendix A.A.3 If the MT is positioned at the same distance from the M surrounding BSs, i.e. 'm = ' for m = 1, . . . , M, a closed-form solution can be derived,

as it is shown in Appendix (A.A.4). The solution provides xn = ( '' loga(1 '' xn+1 log a) 1 ' n ' N '' 1

'' loga δ

θ n = N , (A.16) where δ is a solution of the implicit equation E N (δ) = 1 , (A.17) with E 1(δ) , δ E 2(δ) , δ 1 + log δ θ E n(δ) = En''1(δ) h 1 + log En''1(δ)

En''2(δ) i n = 3, . . . , N . (A.18) Eq. (A.17) is trascendent, therefore it has to be solved by the use of numerical methods. However, it has a unique solution, as it is proved in

Appendix A.A.4. The last step is to find a solution in ZN , which is done by approximat- ing the solution (A.16) with the vertex in ZN of the hypercube around x

providing the least average number of active BSs and satisfying the outage

constraint. Such discretization process is a source of sub-optimality for the

analytical solution, but in Section A.5 it is shown that it actually provides

performance close to the optimum. A.5 Numerical Results The considered simulative scenario consists of a cellular system in which a

single MT is present in the studied sector. It is surrounded by M = 3 BSs

arranged on three vertices of an hexagon of unitary side, dividing a cell in

three sectors. Antenna attenuation for out of sector signals is assumed to be

20 dB [105], while the maximum number of retransmission frames is set to

N = 3. Rayleigh fading affects the channel while path loss is assumed with 148 A. Efficient Base Station Selection in Uplink Cellular Networks coefficient 3.4, therefore the SNR of the link between the MT and BS m at

distance d is given by 'm = '0 (d) ''3.4 |ξ|2 , where '0 is the average SNR at unitary distance and ξ is a complex zero-

mean Gaussian random variable with unitary variance modelling the Rayleigh

fading. Interference is kept into account by considering the presence of MTs in the neighboring sectors with interference probability q. The condition q = 0

means total absence of interferers, while q = 1 corresponds to the case in

which all the interfering MTs are transmitting. The positions of interfering

MTs is random within their sectors. Both intra-cell and inter-cell interference

is considered by including a ring of surrounding cells around the studied

sector, assuming a MT (active with probability q) for each sector of the ring. In the following subsections, a number of communication policies are com- pared. Within each policy, two macro diversity approaches are considered:

1) flexible cooperation (FC), in which the active BSs are selected according

to the methods proposed in this work, and 2) static cooperation (SC), where

BSs are activated according to the rule provided by (A.13). For FC, results

for both RS and ES (as a benchmark) are reported. The maximum allowed

outage probability is set at θ = 10''2. A.5.1 Outage Probability in the Sector To provide examples of the distribution of the outage probability in the stud-

ied sector, this section reports spatial plots for two specific cases, assuming

R = 2 bit/s/Hz and activating al the M = 3 BSs. Supposing that the serving

BS is located at the point with coordinates (0, 0), Fig. A.2 focuses on the

policy LD-HARQ-IR with q = 0. It can be observed that the outage prob-

ability increases with the distance from the M = 3 BSs, revealing that the

position at the edge of the cell sectors is the worst case. Similar conclusions

can be drawn looking at Fig. A.3 which reports the spatial distribution of

the average (with respect to the interfering nodes'' positions) outage proba-

bility for LD-ARQ and q = 1. The point at the maximum distance from all

the BSs results to be the worst case for the outage probability also using the

other policies listed in Section A.3, the corresponding plots being omitted. A.5.2 Coverage improvement In order to evaluate the cell coverage improvement allowed by the innovative

FC policies introduced so far in a macro diversity approach, in this section A.5. Numerical Results 149 Figure A.2: Final outage probability pN+1 as a function of the MT position

in a sector with LD-HARQ-IR and in the absence of interference (q = 0).

The serving BS is located at coordinates (0,0) in the plot. Figure A.3: Final outage probability pN+1 as a function of the MT position

in a sector with LD-ARQ and in the presence of interference (q = 1). The

serving BS is located at coordinates (0,0) in the plot. 150 A. Efficient Base Station Selection in Uplink Cellular Networks 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 ''20 ''15 ''10 ''5 0 5 10 15 R [bit/s/Hz] ' min [dB] JD

LD

NC

LD''HARQ''IR''Ga Figure A.4: Minimum value of the SNR at the unitary distance vs the spectral

efficiency of the first transmission for various decoding schemes. Solid line:

HARQ-IR; dashed line: HARQ-CC; dotted line: ARQ. the MT is placed at the edge of three sectors, equally distant from the M = 3

surrounding BSs, that is in the worst case in the sector in terms of outage

probability. It is assumed that all the M = 3 BSs are activated during each

frame, i.e., x = [M, M, M]. Interference-free Scenario When interferers are switched off (q = 0), the minimum acceptable value of

the SNR '0 can be evaluated by letting the MT transmit while imposing

that the outage constraint (A.3b) is fulfilled, i.e., that means finding 'min = min '0 {'0 : pN+1(x, ', R) ' θ} . Fig. A.4 shows the value of 'min as a function of the normalized spectral efficiency R using a) macro diversity with JD, b) macro diversity with LD

and c) no cooperation (NC), i.e. only the serving BS is active. Since all BSs

are active in all frames, a unique line is reported for all cooperation and error

protection scheme, since distinction between RS and ES has not meaning at

this point. JD significantly outperforms both macro diversity with LD and A.5. Numerical Results 151 NC by reducing 'min by about 3.5 dB and 9 dB, respectively, for R = 0.4

bit/s/Hz. Moreover, also the Gaussian approximation discussed in Section

A.3.4 (LD-HARQ-IR-Ga) is shown in Fig. A.4, so that it can be observed

that it provides a satisfying approximation of the simulated performance. Fig. A.4 can be read also from other different viewpoints: in fact, results of 'min can be translated in terms of coverage radius. As an example of the

increased coverage properties yielded by FC schemes, it can be observed that

NC achieves a spectral efficiency of 0.4 bit/s/Hz at the edge of a hexagonal

cell with unitary side, while the same spectral efficiency is achieved by JD

and LD on cells with a side of 1.3 and 1.8, respectively, thus substantially

extending the coverage radius of the BSs. Another reading of Fig. A.4 consists of looking at the achieved spectral efficiency R as a function of the SNR, therefore obtaining information about

diversity and multiplexing gains [106]. In particular, note that HARQ-CC

(dashed lines) has a higher multiplexing gain with respect to ARQ (dotted

lines), while HARQ-IR (solid lines) exhibits also a diversity gain, as can be

seen from the slope at high SNRs. Interference Scenario If the interferers are switched on, their transmissions produce interference

to the studied MT. The outage probability pN+1(x, '(q), R) is therefore also

a function of the interference probability q which determines the signal to

interference plus noise ratio (SINR) vector '(q), so that a threshold for the

maximum value of q can be found for having sustainable communications in

terms of outage probability, for each cooperation policy. Fig. A.5 shows the

maximum value of the interference probability, i.e., qmax = max q {q : pN+1(x, '(q), R) ' θ} , as a function of the spectral efficiency R of the first transmission that can

be reached by the MT of interest, despite interference. Also in this case JD

provides the highest immunity to interference for a fixed spectral efficiency or,

equivalently, letting BSs cooperate increases the achievable spectral efficiency

for a given level of interference. A.5.3 Randomly Dropped Users In this section, MTs are randomly dropped in the sector. In each case, the

cooperation and error protection combinations of Table A.1 are considered

in order to determine the number of active BSs for each frame. The cases in 152 A. Efficient Base Station Selection in Uplink Cellular Networks 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 R [bit/s/Hz] q max JD

LD

NC Figure A.5: Maximum allowed interference probability qmax vs the spectral

efficiency of the first transmission for various decoding schemes. Solid line:

HARQ-IR; dashed line: HARQ-CC; dotted line: ARQ. A.5. Numerical Results 153 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.8 2 2.2 2.4 2.6 2.8 3 3.2 3.4 3.6 3.8 E[f] q RS''HARQ''IR

SC''HARQ''IR

ES''HARQ''IR Figure A.6: Total average number of active BSs as a function of q for HARQ-

IR. Solid line: JD; dashed line: LD. which the outage probability constraint is not statistically met are dropped

and the MT transmission re-scheduled. From data on the number of active

BSs, the performance metrics (A.2), (A.4) and (A.5) are computed. In the

following, the spectral efficiency R is set to 0.4 bit/s/Hz. Total Average Number of Active BSs In analogy with the well known soft handover overhead, the function f (x) ''1 can be seen as a macro diversity overhead, which is proportional to the addi-

tional hardware resources required in the cellular system to implement uplink

BS cooperation. Indeed, active BSs must be able to manage additional up-

link signal processing in order to cooperate with the serving BS. If no extra

hardware is provided in the active BSs, cooperation translates into a spectral

efficiency reduction. Moreover, the backhaul network will be partially used

for the required exchange of information between cooperating BSs, deter-

mining an additional overhead, especially for JD, which will be considered in

more details in the following. For HARQ-IR, Fig. A.6 shows E[f (x)] as a function of the interference probability q, where the average is taken over the random MT position and

'0 = 0 dB. JD outperforms LD in the entire range of q (especially for values 154 A. Efficient Base Station Selection in Uplink Cellular Networks 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.5 2 2.5 3 3.5 4 E[f] q RS''HARQ''CC

SC''HARQ''CC

ES''HARQ''CC Figure A.7: Total average number of active BSs as a function of q for HARQ-

CC. Solid line: JD; dashed line: LD. of q close to 1), thus it can be concluded that information sharing among

the BSs significantly reduces the average number of active slots. For both

JD and LD, RS always outperforms SC while at the same time providing

performance really close to the optimum solution: in fact, for q = 0.5, more

than 25% of active BSs are saved in the JD case. Fig. A.7 shows E[f (x)] for HARQ-CC as a function of q. Also in this case RS has only slightly suboptimal performance with respect to the best

configuration (ES) of active BSs at each frame; in particular, for q = 0.5,

ES outperforms RS only by 4% in the JD case, with a consistent reduction

in computational complexity: for example, in the performed simulations, for

RS the search is completed in less than 5 iterations in all cases, instead of

(M + 1)N = 64 iterations for ES. Moreover, RS significantly reduces the

backhaul usage with respect to SC, since for q = 0.5, it requires 24% less

total active BSs on average for the JD case. Fig. A.8 shows E[f (x)] for LD-ARQ as a function of q. Even in this case, RS significantly outperforms SC while only slightly under-performing

ES. For example, the resource saving with respect to SC at q = 0.5 is of

about 20%. A.5. Numerical Results 155 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 2 2.5 3 3.5 4 4.5 5 5.5 E[f] q RS''LD''ARQ

SC''LD''ARQ

ES''LD''ARQ Figure A.8: Total average number of active BSs as a function of q for ARQ. Backhaul Occupation Fig. A.9 shows E[BP ] as a function of q by remarking that RS allows a strong

saving in the resource usage with respect to SC. As an example, for q = 0.5

and JD-HARQ-IR, RS requires 52% less backhaul links usage on average,

while the saving increases up to 64% for JD-HARQ-CC for the same value

of q. Payload Throughput Fig. A.10 shows E['] as a function of q for a system using HARQ-IR. It

can be observed how SC slightly outperforms RS. This result was expected

since RS tends to reduce the number of active BSs during earlier frames while

increasing it in the latter frames. As a consequence, while the average number

of active BSs is reduced, the average decoding delay is increased, since there

is a higher probability of late decoding. On the other hand, to confirm the

close similarity between ES and RS, they have a very close performance also

in terms of payload throughput. Fig. A.11 shows E['] as a function of q for HARQ-CC. Also in this case it can be observed that SC outperforms RS. By comparing Fig.s A.6

and A.7 with Fig.s A.10 and A.11 it can be observed that while requiring a 156 A. Efficient Base Station Selection in Uplink Cellular Networks 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 E[B P ] q RS''JD''HARQ

SC''JD''HARQ

ES''JD''HARQ Figure A.9: Total average of number backhaul transmissions as a function of

q for JD-HARQ. Solid line: HARQ-IR; dashed line: HARQ-CC. 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.36 0.365 0.37 0.375 0.38 0.385 0.39 E[ '] [bit/s/Hz] q RS''HARQ''IR

SC''HARQ''IR

ES''HARQ''IR Figure A.10: Average payload throughput as a function of q for HARQ-IR.

Solid line: JD; dashed line: LD. A.6. Conclusions 157 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.36 0.365 0.37 0.375 0.38 0.385 0.39 E[ '] [bit/s/Hz] q RS''HARQ''CC

SC''HARQ''CC

ES''HARQ''CC Figure A.11: Average payload throughput as a function of q for HARQ-CC.

Solid line: JD; dashed line: LD. very close payload throughput, the various techniques lead to a substantially

different number of active BSs, i.e. different backhaul occupancies. Hence,

1/f (x) from Fig.s A.6 and A.7 can be read as the energy efficiency of the

transmission. A.5.4 LD-ARQ and 'm = ' To compare the performance of the analytical solution (AS) derived in Section

A.4.2 with the optimal solution (ES) of (A.3) for the LD-ARQ case, in Fig.

A.12 the total average number of active BSs is shown in the case in which

the MT is positioned at the edge of the cell and the transmission rate is set

equal to R = 2 bit/s/Hz. It can be seen that the analytical solution is clearly

optimal. Finally, it can be observed that in the case in which the SNR toward

all the BSs is the same, RS is fully optimal in the LD-ARQ case. A.6 Conclusions In uplink cellular communications, a MT in a sector transmits to a set of

surrounding BSs in a multi-cell processing scenario. Error control schemes 158 A. Efficient Base Station Selection in Uplink Cellular Networks 6 7 8 9 10 11 12 13 14 15 1 1.5 2 2.5 3 3.5 4 ' 0 [dB] f AS''LD''ARQ

ES''LD''ARQ

RS''LD''ARQ Figure A.12: Total number of active BSs when the MT is positioned at the

edge of the cell with ARQ. as well as cooperation policies among BSs are allowed in order to select which

BSs should be activated in the retransmission frames with the aim of minimiz-

ing the average number of active BSs during all frames, under a constraint on

the maximum allowed outage probability. The iterative proposed algorithm,

RS, provides an efficient and just slightly suboptimal solution to the problem,

for each selected combination of error control scheme and cooperation policy.

The proposed algorithm permits a strong performance improvement (up to

64%) in the backhaul usage with respect to existing cooperative solutions

(SC), as it is also confirmed by the provided simulation results. It can be

therefore concluded that RS is a computationally cheap but effective solution

to the problem of minimizing the backhaul usage under a QoS constraint. A.A. Appendix 159 A.A Appendix A.A.1 Gaussian Approximation for JD-HARQ-IR It can be noticed that Wj , P M

m=1 ym(j)|hm(j)| 2 is Gamma distributed with scale 1 and shape xj and E[log2(1 + 'Wj)] = 1 (xj '' 1)! Z + '' 0 log2(1 + 'w)w xj ''1e''wdw . It follows [107] that E[log2(1 + 'Wj)] = 1 (xj '' 1)! log 2 ' 1 ' xj ' xj sin(xj') 1F1 xj; xj + 1; 1 ' '' 1 (xj '' 1)! p''xj ' log 1 ' '' Ψ(xj) '' 1 '(1 '' xj) 2F2 1, 1; 2, 2 '' xj; 1 ' , with pFq (r1, . . . , rp; s1, . . . , sq; z) , '' X k=0 Q p

i=1(ri)k Q q

i=1(si)k zk k! Ψ(z) , 1 (z '' 1)! ''(z '' 1)! ''z and (r)k = r(r + 1) . . . (r + k '' 1) is the rising factorial of r. Finally it is µ' = n X j=1 E[log2(1 + 'Wj)]. However, a closed-form expression for E[log 2

2(1 + 'Wj)] does not exist, there- fore '2 ' cannot be determined analytically. 160 A. Efficient Base Station Selection in Uplink Cellular Networks A.A.2 Closed-form Expression of Probability (A.10) and

ARQ Probability (A.10) can be rewritten thanks to [108] as P " n X j=1 xj X m=1 'm |hm(j)| 2 < 2R '' 1 # = 1 '' M Y m=1 'Ym(n) m ! M X m=1 Ym(n) X l=1 'm,l( '''m) ' Ym(n) ''l+1 m ' Ψ 'm(2 R '' 1), Ym(n) '' l , where Ψ(a, b) is the cdf of a Poisson random variable of parameter a, i.e., 'm,l(b) = ( ''1) l ''1 X '(M,m,l) Y j ij + Yj(n) '' 1 ij 'j(b) . The set '(M, m, l) contains partitions of l '' 1 through the positive integers indices ij, such that P M

j=1,j 6=m ij = l '' 1 and 'j (b) = ('j + b) ''(Yj(n)+ij). A.A.3 Proof of Theorem A.1 The proof of theorem A.1 in the case N = 2 is reported here. It can be easily

extended to the general case. Let x+ ' M and x'' ' x+ be two integers. First observe that both configurations of vector x, [x'', x+] and [x+, x''], satisfy

the outage probability constraint since for LD-ARQ the outage probability

is only determined by the set of active BSs within the whole transmission

irrespective of their order, see (A.7). The aim is to show that f (x'', x+) ' f(x+, x'') . From (A.2) and (A.7) it is f (x+, x'') '' f(x'', x+) =x+ + x'' x+ Y m=1 am+ '' " x'' + x+ x'' Y m=1 am # . Using (A.8) it holds f (x+, x'') '' f(x'', x+) '(x+ '' x'')+ + x'' Y m=1 am(x''a x+ ''x'' x'' '' x+) . A.A. Appendix 161 Then it must be observed that for fixed Q x''

m=1 am, ax'' is minimized when a1 = . . . = ax '' = a , x'' pQ x''

m=1 am, therefore defining g(a) , x+ + x''a x+ '' (x'' + x+ax'') it follows that f (x+, x'') '' f(x'', x+) ' g(a) . Now, note that g(0) = x+ '' x'' ' 0 and g(1) = 0 and also ''g

''a = x''x+ a (ax+ '' ax'') ' 0 for 0 ' a ' 1. Since g(a) is a continuous function of a, the minimum of g(a) is assumed for a = 1 and it is equal to g(1) = 0, hence g(a) ' 0. A.A.4 Closed-form solution of (A.3) for 'm = '. The average outage probability after n frames (A.7) can be written as pLD''A n+1 (x, ') = a P n

j =1 xj , with a , 1 '' e'' 2R ''1 ' . With the application of the Lagrange multipliers method, the problem to

solve turns out to be min x ,δ '(x, δ) = min x ,δ " f (x) '' δ N X n=1 xn '' loga θ !# . (A.19) Note that in defining the Lagrangian (A.19) the bounds (A.3c) on xn are

not taken into account. Indeed, from Theorem A.1 if the solution of the

unbounded problem (A.3a) with the constraint (A.3b) yields xn > M for

some n, for sure it will be xi > M, n ' i ' N. In such a case, the values xi = M, n ' i ' N will be set, therefore attacking the problem (A.3a) with the outage constraint (A.3b) on the n '' 1 unknowns x1, . . . , xn''1 (all the others being equal to M). Setting to zero the derivatives of (A.19) ''' ''xn = a P n''1 j =1 xj " 1 + N ''1 X l=n xl+1a P l

j =n xj ! log a # '' δ = 0, (A.20a) with n = 1, 2, . . . , N, δ provided by (A.17) and (A.18). Therefore ''' ''δ = '' N X n=1 xn + loga θ = 0. (A.20b) 162 A. Efficient Base Station Selection in Uplink Cellular Networks By manipulating the N + 1 equations (A.20), (A.16) is obtained. Lastly, notice that (A.17) has a unique solution, since a product of two monotonically increasing functions is a monotone function if either of the two

functions is positive valued everywhere. This can be seen by noticing that

both the feasible values of δ in En(δ) and the positiveness of δ, from (A.16),

imply En(δ) > 0 for 1 ' n ' N '' 1. Hence, EN(δ) is a monotone function, therefore (A.17) has a unique solution. Appendix B List of Publications Part of the work presented in this thesis appeared also in the following arti-

cles: ' A. Ahmad, D. Zennaro, E. Serpedin, and L. Vangelista, ''A Factor Graph Approach to Clock Offset Estimation in Wireless Sensor Net-

works,' submitted to IEEE Trans. Inf. Theory. ' A. Ahmad, D. Zennaro, E. Serpedin, and L. Vangelista, ''Time-Varying Clock Offset Estimation in Two-Way Timing Message Exchange in

Wireless Sensor Networks using Factor Graphs,' accepted for publica-

tion in IEEE International Conference on Acoustics, Speech and Signal

Processing (ICASSP), 2012. ' T. Erseghe, D. Zennaro, E. Dall''Anese, and L. Vangelista, ''Fast Con- sensus by the Alternating Direction Multipliers Method,' IEEE Trans-

actions on Signal Processing, vol.59, no.11, pp.5523-5537, Nov. 2011. ' D. Zennaro, S. Tomasin, and L. Vangelista, ''Base Station Selection in Uplink Macro Diversity Cellular Systems with Hybrid ARQ,' IEEE J.

Sel. Areas Commun., vol.29, no.6, pp.1249-1259, June 2011. ' D. Zennaro, E. Dall''Anese, T. Erseghe, and L. Vangelista, ''Fast Clock Synchronization in Wireless Sensor Networks via ADMM-based Con-

sensus,' International Symposium on Modeling and Optimization in

Mobile, Ad Hoc and Wireless Networks (WiOpt), 2011, Princeton (NJ),

May 9-13, 2011. ' D. Zennaro, S. Tomasin, and L. Vangelista, ''Uplink Cell Selection for Cooperative Multi-Cell Networks with Hybrid ARQ,' Proc. of IEEE 163 164 B. List of Publications Global Telecommunications Conf. (GLOBECOM), Miami (FL), Dec.

6-10, 2010. Bibliography [1] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, ''Wireless sensor networks: a survey,' Computer Networks, vol. 38, no. 4, pp.

393''422, 2002. [2] I. Chlamtac, M. Conti, and J. J.-N. Liu, ''Mobile ad hoc networking: imperatives and challenges,' Ad Hoc Networks, vol. 1, no. 1, pp. 13 '' 64, 2003. [Online]. Available: http: //www.sciencedirect.com/science/article/pii/S1570870503000131 [3] H. Hartenstein and K. Laberteaux, ''A tutorial survey on vehicular ad hoc networks,' Communications Magazine, IEEE, vol. 46, no. 6, pp.

164 ''171, Jun. 2008. [4] M. Lukac, P. Davis, R. Clayton, and D. Estrin, ''Recovering temporal integrity with Data Driven Time Synchronization,' in

Proceedings of the 2009 International Conference on Information

Processing in Sensor Networks, ser. IPSN ''09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 61''72. [Online]. Available:

http://dl.acm.org/citation.cfm'id=1602165.1602173 [5] D. Mills, ''Internet time synchronization: The network time protocol,' Communications, IEEE Transactions on, vol. 39, no. 10, pp. 1482 ''

1493, Oct. 1991. [6] S. Gezici, Z. Tian, G. Giannakis, H. Kobayashi, A. Molisch, H. Poor, and Z. Sahinoglu, ''Localization via ultra-wideband radios: a look at

positioning aspects for future sensor networks,' Signal Processing Mag-

azine, IEEE, vol. 22, no. 4, pp. 70 '' 84, Jul. 2005. [7] B. Sundararaman, U. Buy, and A. D. Kshemkalyani, ''Clock synchronization for wireless sensor networks: a survey,' Ad Hoc Networks, vol. 3, no. 3, pp. 281 '' 323, 2005. [Online]. Available:

http://www.sciencedirect.com/science/article/pii/S1570870505000144 165 166 BIBLIOGRAPHY [8] B. Sadler and A. Swami, ''Synchronization in Sensor Networks: an Overview,' in Military Communications Conference, 2006. MILCOM

2006. IEEE, Oct. 2006, pp. 1 ''6. [9] O. Simeone, U. Spagnolini, Y. Bar-Ness, and S. Strogatz, ''Distributed synchronization in wireless networks,' Signal Processing Magazine,

IEEE, vol. 25, no. 5, pp. 81''97, Sept. 2008. [10] I. K. Rhee, J. Lee, J. Kim, E. Serpedin, and Y. C. Wu, ''Clock syn- chronization in wireless sensor networks: an overview,' Sensors, vol. 9,

no. 1, pp. 56''85, 2009. [11] J. E. Elson, ''Time synchronization in wireless sensor networks,' Ph.D. dissertation, UCLA, Los Angeles, California, 2003. [12] J. Elson, L. Girod, and D. Estrin, ''Fine-grained network time synchro- nization using reference broadcasts,' SIGOPS Oper. Syst. Rev., vol. 36,

pp. 147''163, Dec. 2002. [13] M. Maróti, B. Kusy, G. Simon, and A. Lédeczi, ''The flooding time synchronization protocol,' in Proceedings of the 2nd international con-

ference on Embedded networked sensor systems, 2004, pp. 39''49. [14] R. Kumar, S. Ganeriwal, and M. B. Srivastava, ''Timing-sync proto- col for sensor networks,' in ACM Conf. Embedded Networked Sensor

Systems, 2003. [15] K.-L. Noh, E. Serpedin, and K. Qaraqe, ''A new approach for time syn- chronization in wireless sensor networks: Pairwise broadcast synchro-

nization,' Wireless Communications, IEEE Transactions on, vol. 7,

no. 9, pp. 3318''3322, Sept. 2008. [16] O. Simeone and U. Spagnolini, ''Distributed time synchronization in wireless sensor networks with coupled discrete-time oscillators,' Wire-

less Communications and Networking, EURASIP Journal on, 2007. [17] L. Schenato and G. Gamba, ''A distributed consensus protocol for clock synchronization in wireless sensor network,' in Decision and Control,

46th IEEE Conference on, Dec. 2007, pp. 2289''2294. [18] R. Carli, A. Chiuso, S. Zampieri, and L. Schenato, ''A PI consensus controller for networked clocks synchronization,' in Automatic Control,

IFAC World Congress on, 2008. BIBLIOGRAPHY 167 [19] A. A. Syed and J. Heidemann, ''Time Synchronization for High La- tency Acoustic Networks,' in INFOCOM 2006. 25th IEEE Interna-

tional Conference on Computer Communications. Proceedings, Apr.

2006, pp. 1 ''12. [20] N. Chirdchoo, W. S. Soh, and K. C. Chua, ''MU-Sync: A Time Syn- chronization Protocol for Undewater Mobile Networks,' in Proc. of

ACM WUWNet, San Francisco, CA, 2008. [21] J. Liu, Z. Zhou, Z. Peng, and J. H. Cui, ''Mobi-Sync: Efficient Time Synchronization for Mobile Underwater Sensor Networks,' in Proc. of

ACM WUWNet, Berkeley, CA, Nov. 2009. [22] F. Lu, D. Mirza, and C. Schurgers, ''D-sync: Doppler-based time synchronization for mobile underwater sensor networks,' in Proceedings

of the Fifth ACM International Workshop on UnderWater Networks,

ser. WUWNet ''10. New York, NY, USA: ACM, 2010, pp. 3:1''3:8. [Online]. Available: http://doi.acm.org/10.1145/1868812.1868815 [23] H. Abdel-Ghaffar, ''Analysis of synchronization algorithms with time- out control over networks with exponentially symmetric delays,' Com-

munications, IEEE Transactions on, vol. 50, no. 10, pp. 1652 '' 1661,

Oct. 2002. [24] D. Jeske, ''On maximum-likelihood estimation of clock offset,' Com- munications, IEEE Transactions on, vol. 53, no. 1, pp. 53 '' 54, Jan.

2005. [25] Q. Chaudhari, E. Serpedin, and K. Qaraqe, ''On Minimum Variance Unbiased Estimation of Clock Offset in a Two-Way Message Exchange

Mechanism,' Information Theory, IEEE Transactions on, vol. 56,

no. 6, pp. 2893 ''2904, Jun. 2010. [26] K.-L. Noh, Q. Chaudhari, E. Serpedin, and B. Suter, ''Novel Clock Phase Offset and Skew Estimation Using Two-Way Timing Message

Exchanges for Wireless Sensor Networks,' Communications, IEEE

Transactions on, vol. 55, no. 4, pp. 766 ''777, Apr. 2007. [27] Q. Chaudhari, E. Serpedin, and K. Qaraqe, ''On Maximum Likelihood Estimation of Clock Offset and Skew in Networks With Exponential

Delays,' Signal Processing, IEEE Transactions on, vol. 56, no. 4, pp.

1685 ''1697, Apr. 2008. 168 BIBLIOGRAPHY [28] A. Ahmad, A. Noor, E. Serpedin, H. Nounou, and M. Nounou, ''On Clock Offset Estimation in Wireless Sensor Networks with Weibull Dis-

tributed Network Delays,' in Pattern Recognition (ICPR), 2010 20th

International Conference on, Aug. 2010, pp. 2322 ''2325. [29] J.-S. Kim, J. Lee, E. Serpedin, and K. Qaraqe, ''A robust estimation scheme for clock phase offsets in wireless sensor networks

in the presence of non-gaussian random delays,' Signal Processing, vol. 89, no. 6, pp. 1155 '' 1161, 2009. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0165168409000024 [30] '''', ''Robust Clock Synchronization in Wireless Sensor Networks Through Noise Density Estimation,' Signal Processing, IEEE Trans- actions on, vol. 59, no. 7, pp. 3035 ''3047, Jul. 2011. [31] M. Leng and Y.-C. Wu, ''On clock synchronization algorithms for wireless sensor networks under unknown delay,' Vehicular Technology, IEEE Transactions on, vol. 59, no. 1, pp. 182 ''190, Jan. 2010. [32] ''MSP430 F1611 data sheet,' Texas Instruments. [33] J. Hill and D. Culler, ''A Wireless Embedded Sensor Architecture for System-Level Optimization,' Tech. Rep., 2001. [34] H. Meyr and G. Ascheid, Synchronization in Digital Communications, vol.1. Wiley, 1990. [35] S. Moon, P. Skelly, and D. Towsley, ''Estimation and removal of clock skew from network delay measurements,' in INFOCOM ''99. Eighteenth

Annual Joint Conference of the IEEE Computer and Communications

Societies. Proceedings. IEEE, vol. 1, Mar. 1999, pp. 227 ''234. [36] ''Definitions and terminology for synchronization networks,' ITU-T, Ed., City, State of Publication, 1996, ch. G.810, pp. 400''402. [37] O. Simeone, U. Spagnolini, G. Scutari, and Y. Bar-Ness, ''Physical-layer distributed synchronization in wireless networks and applications,'

Physical Communication, vol. 1, no. 1, pp. 67 '' 83, 2008.

[Online]. Available: http://www.sciencedirect.com/science/article/pii/

S1874490708000049 [38] Q. Chaudhari and E. Serpedin, ''Maximum Likelihood Estimation of clock parameters for synchronization of wireless sensor networks,' in

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE

International Conference on, Apr. 2008, pp. 2517 ''2520. BIBLIOGRAPHY 169 [39] Q. Chaudhari, E. Serpedin, and K. Qaraqe, ''Minimal cost clock syn- chronization using a sender-receiver protocol in Wireless Sensornets,'

in Performance Evaluation of Computer and Telecommunication Sys-

tems, 2008. SPECTS 2008. International Symposium on, Jun. 2008,

pp. 159 ''164. [40] '''', ''Some improved and generalized estimation schemes for clock synchronization of listening nodes in wireless sensor networks,' Com-

munications, IEEE Transactions on, vol. 58, no. 1, pp. 63 ''67, Jan.

2010. [41] Q. Chaudhari and E. Serpedin, ''A new scheme for synchronization of inactive nodes in a sender-receiver protocol,' in Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference

on, Apr. 2009, pp. 2905 ''2908. [42] Q. Chaudhari, E. Serpedin, and J.-S. Kim, ''Energy-Efficient Estima- tion of Clock Offset for Inactive Nodes in Wireless Sensor Networks,'

Information Theory, IEEE Transactions on, vol. 56, no. 1, pp. 582

''596, Jan. 2010. [43] B. Sadler, ''Fundamentals of energy-constrained sensor network sys- tems,' Aerospace and Electronic Systems Magazine, IEEE, vol. 20,

no. 8, pp. 17''35, Aug. 2005. [44] [Online]. Available: http://mail.millennium.berkeley.edu/pipermail/ tinyos-help/2008-October/036812.html [45] [Online]. Available: http://www.tla.co.nz/xtal1.html [46] C. J. Bovy, H. T. Mertodimedjo, G. Hooghiemstra, H. Uijterwaal, and P. V. Mieghem, ''Analysis of end-to-end delay measurements in inter-

net,' in Proc. Passive Active Meas. Workshop, Fort Collins, CO, Mar.

2002, pp. 26''33. [47] S. M. Kay, Fundamentals of statistical signal processing. Estimation theory. Prentice-Hall, 1993. [48] A. Ahmad, D. Zennaro, E. Serpedin, and L. Vangelista, ''A Factor Graph Approach to Clock Offset Estimation in Wireless Sensor Net-

works,' submitted to Information Theory, IEEE Transactions on. [49] M. Wainwright, T. Jaakkola, and A. Willsky, ''A new class of up- per bounds on the log partition function,' Information Theory, IEEE

Transactions on, vol. 51, no. 7, pp. 2313 '' 2335, Jul. 2005. 170 BIBLIOGRAPHY [50] M. J. Wainwright and M. I. Jordan, ''Graphical Models, Exponential Families, and Variational Inference,' Found. Trends Mach. Learn.,

vol. 1, pp. 1''305, Jan. 2008. [Online]. Available: http://dl.acm.org/

citation.cfm'id=1498840.1498841 [51] D. Chapman and H. Robbins, ''Minimum variance estimation without regularity assumptions,' Annals of Math. Statistics, vol. 22, no. 4, pp.

581''586, Dec. 1951. [52] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Uni- versity Press, 2003. [53] A. Ahmad, D. Zennaro, E. Serpedin, and L. Vangelista, ''Time-varying clock offset estimation in two-way timing message exchange in wireless

sensor networks using factor graphs,' in accepted at Acoustics, Speech

and Signal Processing, 2012. ICASSP 2012. IEEE International Con-

ference on, 2012. [54] F. Kschischang, B. Frey, and H.-A. Loeliger, ''Factor graphs and the sum-product algorithm,' Information Theory, IEEE Transactions on,

vol. 47, no. 2, pp. 498 ''519, Feb. 2001. [55] J. Dauwels and H.-A. Loeliger, ''Phase estimation by message passing,' in Communications, 2004 IEEE International Conference on, vol. 1,

Jun. 2004, pp. 523 '' 527 Vol.1. [56] H. V. Trees and K. Bell, Bayesian Bounds for Parameter Estimation and Nonlinear Filtering/Tracking. Wiley, 2007. [57] P. Tichavsky, C. Muravchik, and A. Nehorai, ''Posterior Cramer-Rao bounds for discrete-time nonlinear filtering,' Signal Processing, IEEE

Transactions on, vol. 46, no. 5, pp. 1386 ''1396, May 1998. [58] R. Olfati-Saber, J. Fax, and R. Murray, ''Consensus and cooperation in networked multi-agent systems,' Proceedings of the IEEE, vol. 95,

no. 1, pp. 215''233, Jan. 2007. [59] G. Scutari, S. Barbarossa, and L. Pescosolido, ''Distributed decision through self-synchronizing sensor networks in the presence of propaga-

tion delays and asymmetric channels,' IEEE Transactions on Signal

Processing, vol. 56, no. 4, pp. 1667''1684, Apr. 2008. [60] R. Olfati-Saber and R. Murray, ''Consensus problems in networks of agents with switching topology and time-delays,' Automatic Control,

IEEE Transactions on, vol. 49, no. 9, pp. 1520''1533, Sept. 2004. BIBLIOGRAPHY 171 [61] S. Sardellitti, M. Giona, and S. Barbarossa, ''Fast Distributed Average Consensus Algorithms Based on Advection-Diffusion Processes,' IEEE

Transactions on Signal Processing, vol. 58, no. 2, pp. 826 ''842, Feb.

2010. [62] R. Olfati-Saber and J. Shamma, ''Consensus filters for sensor networks and distributed sensor fusion,' in Decision and Control, 2005 and 2005

European Control Conference. CDC-ECC ''05. 44th IEEE Conference

on, Dec. 2005, pp. 6698''6703. [63] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah, ''Randomized gossip algorithms,' Information Theory, IEEE Transactions on, vol. 52, no. 6,

pp. 2508 '' 2530, Jun. 2006. [64] T. C. Aysal, M. E. Yildiz, A. D. Sarwate, and A. Scaglione, ''Broad- cast Gossip Algorithms for Consensus,' IEEE Transactions on Signal Processing, vol. 57, no. 7, pp. 2748 ''2761, Jul. 2009. [65] D. P. Bersekas and J. N. Tsitsiklis, Parallel and distributed computa- tion: numerical methods. Athena Scientific, 1997. [66] I. Schizas, A. Ribeiro, and G. Giannakis, ''Consensus in ad hoc WSNs with noisy links - part I: Distributed estimation of deterministic sig-

nals,' Signal Processing, IEEE Transactions on, vol. 56, no. 1, pp.

350''364, Jan. 2008. [67] I. D. Schizas, G. B. Giannakis, S. I. Roumeliotis, and A. Ribeiro, ''Con- sensus in Ad Hoc WSNs With Noisy Links - Part II: Distributed Es-

timation and Smoothing of Random Signals,' IEEE Transactions on

Signal Processing, vol. 56, no. 4, pp. 1650 ''1666, Apr. 2008. [68] H. Zhu, G. B. Giannakis, and A. Cano, ''Distributed In-Network Channel Decoding,' IEEE Transactions on Signal Processing., vol. 57,

no. 10, pp. 3970''3983, Oct. 2009. [69] S.-J. Kim, E. Dall''Anese, and G. B. Giannakis, ''Cooperative Spectrum Sensing for Cognitive Radios Using Kriged Kalman Filtering,' IEEE

Journal on Selected Topics in Signal Processing, vol. 5, no. 1, pp. 24

''36, Feb. 2011. [70] S. Boyd, P. Diaconis, and L. Xiao, ''Fastest mixing Markov chain on a graph,' SIAM Rev., vol. 46, no. 4, pp. 667 '' 689, 2004. 172 BIBLIOGRAPHY [71] E. Kokiopoulou and P. Frossard, ''Polynomial Filtering for Fast Con- vergence in Distributed Consensus,' IEEE Transactions on Signal Pro-

cessing, vol. 57, no. 1, pp. 342 ''354, Jan. 2009. [72] M. Cao, D. A. Spielman, and E. M. Yeh, ''Accelerated gossip algorithms for distributed computation,' in 44th Annual Allerton Conference on

Communication, Control, and Computation, Sept. 2006. [73] B. Oreshkin, M. Coates, and M. Rabbat, ''Optimization and analysis of distributed averaging with short node memory,' IEEE Transactions

on Signal Processing, vol. 58, no. 5, pp. 2850 ''2865, May 2010. [74] L. Xiao and S. Boyd, ''Fast linear iterations for distributed averaging,' Systems & Control Letters, vol. 53, no. 1, pp. 65 '' 78, Sept. 2004. [75] D. Jakovetic and, J. Xavier, and J. Moura, ''Weight Optimization for Consensus Algorithms With Correlated Switching Topology,' IEEE

Transactions on Signal Processing, vol. 58, no. 7, pp. 3788 ''3801, Jul.

2010. [76] L. Xiao and S. Boyd, ''Fast linear iterations for distributed averaging,' in Decision and Control, 2003. Proceedings. 42nd IEEE Conference on,

vol. 5, Dec. 2003, pp. 4997 '' 5002. [77] L. Xiao, S. Boyd, and S.-J. Kim, ''Distributed average consensus with least-mean-square deviation,' Journal of Parallel and Distributed Com-

puting, vol. 67, no. 1, pp. 33 '' 46, 2007. [78] T. Erseghe, D. Zennaro, E. Dall''Anese, and L. Vangelista, ''Fast Con- sensus by the Alternating Direction Multipliers Method,' Signal Pro-

cessing, IEEE Transactions on, vol. 59, no. 11, pp. 5523 ''5537, Nov.

2011. [79] F. R. Gantmacher, The Theory of Matrices, Vol. 2. Chelsea Publishing Company, 2000. [80] J. S. Rosenthal, ''Convergence rates of markov chains,' SIAM Review, vol. 37, pp. 387''405, 1995. [81] S. Kar and J. Moura, ''Distributed Consensus Algorithms in Sensor Networks With Imperfect Communication: Link Failures and Channel

Noise,' IEEE Transactions on Signal Processing, vol. 57, no. 1, pp. 355

''369, Jan. 2009. BIBLIOGRAPHY 173 [82] R. Horn and C. Johnson, Matrix analysis. Cambridge University Press, 1985. [83] D. Zennaro, E. Dall''Anese, T. Erseghe, and L. Vangelista, ''Fast clock synchronization in wireless sensor networks via ADMM-based consen-

sus,' in Modeling and Optimization in Mobile, Ad Hoc and Wireless

Networks (WiOpt), 2011 International Symposium on, May 2011, pp.

148 ''153. [84] W. Lindsey, F. Ghazvinian, W. Hagmann, and K. Dessouky, ''Network synchronization,' Proceedings of the IEEE, vol. 73, no. 10, pp. 1445''

1467, Oct. 1985. [85] T. Schmidl and D. Cox, ''Low-overhead, low-complexity [burst] syn- chronization for ofdm,' in Communications, IEEE International Con-

ference on, vol. 3, Jun. 1996, pp. 1301 ''1306. [86] D. Zennaro, S. Tomasin, and L. Vangelista, ''Uplink Cell Selection for Cooperative Multi-Cell Networks with Hybrid ARQ,' in GLOBECOM

2010, 2010 IEEE Global Telecommunications Conference, Dec. 2010,

pp. 1 ''5. [87] '''', ''Base Station Selection in Uplink Macro Diversity Cellular Sys- tems with Hybrid ARQ,' Selected Areas in Communications, IEEE

Journal on, vol. 29, no. 6, pp. 1249 ''1259, Jun. 2011. [88] S. Chatzinotas, M. Imran, and R. Hoshyar, ''On the multicell processing capacity of the cellular MIMO uplink channel in correlated rayleigh

fading environment,' Wireless Communications, IEEE Transactions

on, vol. 8, no. 7, pp. 3704 ''3715, Jul. 2009. [89] S. W. Peters, A. Y. Panah, K. T. Truong, and R. W. Heath, ''Relay architectures for 3GPP LTE-advanced,' EURASIP J. Wirel.

Commun. Netw., vol. 2009, pp. 1:1''1:14, Mar. 2009. [Online].

Available: http://dx.doi.org/10.1155/2009/618787 [90] L. Venturino, N. Prasad, and X. Wang, ''Coordinated Scheduling and Power Allocation in Downlink Multicell OFDMA Networks,' Vehicular

Technology, IEEE Transactions on, vol. 58, no. 6, pp. 2835 ''2848, Jul.

2009. [91] H. Hu, Y. Zhang, and J. Luo, Distributed Antenna Systems: Open Architecture for Future Wireless Communications, 2007. 174 BIBLIOGRAPHY [92] A. Sanderovich, O. Somekh, and S. Shamai, ''Uplink Macro Diversity with Limited Backhaul Capacity,' in Information Theory, 2007. ISIT

2007. IEEE International Symposium on, Jun. 2007, pp. 11 ''15. [93] A. Sanderovich, O. Somekh, H. Poor, and S. Shamai, ''Uplink Macro Diversity of Limited Backhaul Cellular Network,' Information Theory,

IEEE Transactions on, vol. 55, no. 8, pp. 3457 ''3478, Aug. 2009. [94] P. Marsch and G. Fettweis, ''A Framework for Optimizing the Up- link Performance of Distributed Antenna Systems under a Constrained

Backhaul,' in Communications, 2007. ICC ''07. IEEE International Conference on, Jun. 2007, pp. 975 ''979. [95] C. Hoymann, L. Falconetti, and R. Gupta, ''Distributed Uplink Sig- nal Processing of Cooperating Base Stations Based on IQ Sample Ex-

change,' in Communications, 2009. ICC ''09. IEEE International Con-

ference on, Jun. 2009, pp. 1 ''5. [96] L. Falconetti, C. Hoymann, and R. Gupta, ''Distributed Uplink Macro Diversity for Cooperating Base Stations,' in Communications Work-

shops, 2009. ICC Workshops 2009. IEEE International Conference on,

Jun. 2009, pp. 1 ''5. [97] O. Somekh, B. Zaidel, and S. Shamai, ''Spectral Efficiency of Joint Mul- tiple Cell-Site Processors for Randomly Spread DS-CDMA Systems,'

Information Theory, IEEE Transactions on, vol. 53, no. 7, pp. 2625

''2637, Jul. 2007. [98] M. Kamoun and L. Mazet, ''Base-station selection in cooperative single frequency cellular network,' in Signal Processing Advances in Wireless

Communications, 2007. SPAWC 2007. IEEE 8th Workshop on, Jun.

2007, pp. 1 ''5. [99] Y. Wang, X. Shen, and P. Zhang, ''Uplink performance analysis in group cell systems,' in Vehicular Technology Conference, 2005. VTC

2005-Spring. 2005 IEEE 61st, vol. 4, Jun. 2005, pp. 2405 '' 2409. [100] A. Del Coso and S. Simoens, ''Distributed compression for MIMO co- ordinated networks with a backhaul constraint,' Wireless Communi-

cations, IEEE Transactions on, vol. 8, no. 9, pp. 4698 ''4709, Sept.

2009. BIBLIOGRAPHY 175 [101] A. Papadogiannis, D. Gesbert, and E. Hardouin, ''A Dynamic Cluster- ing Approach in Wireless Networks with Multi-Cell Cooperative Pro-

cessing,' in Communications, 2008. ICC ''08. IEEE International Con-

ference on, May 2008, pp. 4033 ''4037. [102] S. Kiani and D. Gesbert, ''Optimal and Distributed Scheduling for Multicell Capacity Maximization,' Wireless Communications, IEEE

Transactions on, vol. 7, no. 1, pp. 288 ''297, Jan. 2008. [103] I. Stanojev, O. Simeone, Y. Bar-Ness, and C. You, ''Performance of multi-relay collaborative hybrid-ARQ protocols over fading channels,'

Communications Letters, IEEE, vol. 10, no. 7, pp. 522 ''524, Jul. 2006. [104] P. Wu and N. Jindal, ''Coding Versus ARQ in Fading Channels: How Reliable Should the PHY Be'' in Global Telecommunications Confer-

ence, 2009. GLOBECOM 2009. IEEE, Dec. 2009, pp. 1 ''6. [105] F. Calabrese, M. Anas, C. Rosa, P. Mogensen, and K. Pedersen, ''Per- formance of a Radio Resource Allocation Algorithm for UTRAN LTE

Uplink,' in Vehicular Technology Conference, 2007. VTC2007-Spring.

IEEE 65th, Apr. 2007, pp. 2895 ''2899. [106] D. Tse and P. Visvanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005. [107] A. Prudnikov, Y. A. Brychkov, and O. Marichev, Integrals and Series, vol. 1. Gordon and Breach Science Publishers, 1986. [108] S. Amari and R. Misra, ''Closed-form expressions for distribution of sum of exponential random variables,' Reliability, IEEE Transactions

on, vol. 46, no. 4, pp. 519 ''522, Dec. 1997.

1.3.1 Message-Based vs Pulse-Coupled Algorithms . . . . . . 10 1.3.2 Coupled Clocks vs Uncoupled Clocks . . . . . . . . . . 11 1.3.3 Synchronization Via Parameter Estimation . . . . . . . 12 1.3.4 Synchronization Via Clock Coupling . . . . . . . . . . 12 I Clock Synchronization in WSNs via the Two-Way Message Exchange 13 2 The Two-Way Message Exchange 15 2.1 The Two-Way Data Exchange . . . . . . . . . . . . . . . . . . 15 2.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.2.1 Fixed and Random Portions of Link Delay . . . . . . . 17 2.3 State of The Art . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.3.1 Relevant Synchronization Algorithms . . . . . . . . . . 20 i ii Table of Contents 3 Clock Offset Estimation: A Unified Framework 25 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 3.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.3 A New General Framework . . . . . . . . . . . . . . . . . . . . 27 3.4 Maximum Likelihood Estimation . . . . . . . . . . . . . . . . 31

3.4.1 Unconstrained Likelihood . . . . . . . . . . . . . . . . 31 3.4.2 Constrained Likelihood . . . . . . . . . . . . . . . . . . 33 3.5 Statistical Bounds . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.5.1 Cramer-Rao Lower Bound . . . . . . . . . . . . . . . . 35 3.5.2 Chapman-Robbins Bound . . . . . . . . . . . . . . . . 36 3.5.3 MSE and Variance of the Estimation Error . . . . . . . 36 3.6 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.6.1 MSE Performance and Bounds . . . . . . . . . . . . . . 38 3.6.2 MSE Robustness with Log-Normally Distributed Like-

lihood . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 3.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 3.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.A.1 Proof of Theorem 3.2 . . . . . . . . . . . . . . . . . . . 42

3.A.2 Proof of Theorem 3.4 . . . . . . . . . . . . . . . . . . . 43

3.A.3 Proof of Theorem 3.5 . . . . . . . . . . . . . . . . . . . 43

3.A.4 Proof of Theorem 3.7 . . . . . . . . . . . . . . . . . . . 44

3.A.5 MSE Expressions for ML Estimators . . . . . . . . . . 45 4 A Factor Graph-Based Clock Offset Estimator 47 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 4.2 Factor Graphs and Message Passing . . . . . . . . . . . . . . . 48 4.3 A Factor Graph-based Estimator . . . . . . . . . . . . . . . . 49

4.3.1 Message Computation . . . . . . . . . . . . . . . . . . 50 4.3.2 Closed-Form Expressions For ' θ (o) ML(K ) . . . . . . . . . . 56 4.4 Bayesian Statistical Bounds . . . . . . . . . . . . . . . . . . . 62

4.4.1 Bayesian Cramér-Rao Lower Bound . . . . . . . . . . . 63 4.4.2 Bayesian Chapman-Robbins Bound . . . . . . . . . . . 64 4.4.3 MSE and Variance of the Estimation Error . . . . . . . 65 4.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.5.1 MSE Performance and Bounds . . . . . . . . . . . . . . 66 4.5.2 MSE Robustness with Log-Normally Distributed Like-

lihood . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.5.3 Classical vs Bayesian framework . . . . . . . . . . . . . 68 4.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 4.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.A.1 Proof of Theorem 4.4 . . . . . . . . . . . . . . . . . . . 71 Table of Contents iii II Fast Consensus in WSNs and its Application to Clock Synchronization 73 5 Consensus and Clock Synchronization 75 5.1 Distributed Consensus in WSNs . . . . . . . . . . . . . . . . . 76 5.2 Consensus System Model . . . . . . . . . . . . . . . . . . . . . 77 5.3 Requirements for Consensus Matrix . . . . . . . . . . . . . . . 79

5.3.1 Average Consensus . . . . . . . . . . . . . . . . . . . . 80 5.3.2 Popular Average Consensus Matrices . . . . . . . . . . 80 5.4 Application of Consensus to Clock Synchronization . . . . . . 81

5.4.1 System Model . . . . . . . . . . . . . . . . . . . . . . . 82 5.4.2 Relevant Works . . . . . . . . . . . . . . . . . . . . . . 82 5.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 5.A.1 Proof of Theorem 5.1 . . . . . . . . . . . . . . . . . . . 84 6 Fast Consensus via ADMM 87 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 6.2 ADMM-based Average Consensus . . . . . . . . . . . . . . . . 88 6.3 ADMM-based Consensus in Vector Form . . . . . . . . . . . . 90

6.3.1 Assumption on the augmentation constants . . . . . . 93 6.4 ADMM Performance Analysis . . . . . . . . . . . . . . . . . . 96

6.4.1 Noiseless communications . . . . . . . . . . . . . . . . 97 6.4.2 Encompassing for communication noise . . . . . . . . . 97 6.4.3 Noise resilience characterization . . . . . . . . . . . . . 100 6.4.4 Initial states choice . . . . . . . . . . . . . . . . . . . . 102 6.5 Optimization Issues . . . . . . . . . . . . . . . . . . . . . . . . 103

6.5.1 Convergence speed . . . . . . . . . . . . . . . . . . . . 103 6.5.2 Resilience to noise . . . . . . . . . . . . . . . . . . . . 106 6.5.3 Results for highly structured networks . . . . . . . . . 108 6.6 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 109 6.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 6.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 6.A.1 Proof of Methods A and B . . . . . . . . . . . . . . . . 114

6.A.2 Proof of Results of 6.3 . . . . . . . . . . . . . . . . . . 116

6.A.3 Proof of Theorems of 6.4 . . . . . . . . . . . . . . . . . 118

6.A.4 Proof of Results of 6.5 . . . . . . . . . . . . . . . . . . 120 7 An ADMM-Based Clock Synchronization Algorithm 123 7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 7.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . 124 7.3 ADMM-based Consensus on Clock Periods . . . . . . . . . . . 124 iv Table of Contents 7.3.1 Algorithm implementation . . . . . . . . . . . . . . . . 125 7.3.2 Noise resilience . . . . . . . . . . . . . . . . . . . . . . 126 7.4 Consensus on Clock Counters . . . . . . . . . . . . . . . . . . 127 7.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 128

7.5.1 Convergence Rate and Noise Resilience . . . . . . . . . 128 7.5.2 Performance Comparison with Standard Consensus . . 130 7.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 A Efficient Base Station Selection in Uplink Cellular Networks133 A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

A.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

A.3 Outage Probability Expressions . . . . . . . . . . . . . . . . . 138 A.3.1 Local Decoding with ARQ (LD-ARQ) . . . . . . . . . 141

A.3.2 Local Decoding with HARQ-CC (LD-HARQ-CC) . . . 141

A.3.3 Joint Decoding with HARQ-CC (JD-HARQ-CC) . . . 142

A.3.4 Local Decoding with HARQ-IR (LD-HARQ-IR) . . . . 143

A.3.5 Joint Decoding with HARQ-IR (JD-HARQ-IR) . . . . 144 A.4 Base Station Selection . . . . . . . . . . . . . . . . . . . . . . 144 A.4.1 The Recursive Search (RS) . . . . . . . . . . . . . . . . 144

A.4.2 A closed-form expression for LD-ARQ . . . . . . . . . 146 A.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . 147 A.5.1 Outage Probability in the Sector . . . . . . . . . . . . 148

A.5.2 Coverage improvement . . . . . . . . . . . . . . . . . . 148

A.5.3 Randomly Dropped Users . . . . . . . . . . . . . . . . 151

A.5.4 LD-ARQ and 'm = ' . . . . . . . . . . . . . . . . . . . 157 A.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

A.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 A.A.1 Gaussian Approximation for JD-HARQ-IR . . . . . . . 159

A.A.2 Closed-form Expression of Probability (A.10) and ARQ 160

A.A.3 Proof of Theorem A.1 . . . . . . . . . . . . . . . . . . 160

A.A.4 Closed-form solution of (A.3) for 'm = '. . . . . . . . 161 B List of Publications 163 Bibliography 165 List of Figures 1.1 Examples of clocks with different offsets and no skews (a) and

with different skews but no offsets (b). . . . . . . . . . . . . . 9 2.1 Two-way message exchange between node S and R. . . . . . . 16 3.1 K rounds two-way message exchange between node S and R

with clock offset only between them. . . . . . . . . . . . . . . 26 3.2 MSE and bounds for estimating θ(o) by using the MLE with

Gaussian and exponentially distributed likelihood. . . . . . . . 39 3.3 MSE robustness (in terms of the likelihood functions) and

bounds for the estimation of θ(o) in case of log-normally dis-

tributed likelihood. . . . . . . . . . . . . . . . . . . . . . . . . 40 4.1 Factor graph representation of the density (4.5). . . . . . . . . 51 4.2 The factor 'K sends the message m' K ''ξ(K) to the variable ξ(K). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.3 The variable ξ(K) sends the message mξ(K)''ζK K''1 to the factor ζK K ''1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.4 The factor ζK K ''1 sends the message mζ K K''1 ''ξ(K ''1) to the vari- able ξ(K '' 1). . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.5 The variable ξ(K '' 1) sends the message mξ(K''1)''ζK''1 K''2 to the factor ζ K ''1 K ''2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.6 The factor ζK''1 K ''2 sends the message mζ K''1 K''2 ''ξ(K ''2) to the vari- able ξ(K '' 2). . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 4.7 MSE and bounds for estimating θ(o)(K) when using the pro-

posed FGOE with Gaussian and exponentially distributed like-

lihood. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.8 MSE and bounds for the estimation of θ(o)(K) in case of log-

normal likelihood. . . . . . . . . . . . . . . . . . . . . . . . . . 68 v vi Table of Contents 4.9 MSE in the estimation of θ(o)(K) vs ' using FGOEs. The

MSE in the Bayesian case approaches the curves of the MLE

as ' '' 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 5.1 Example of an undirected and strongly connected graph. . . . 78 5.2 Adjacency matrix relative to the network whose graph is rep-

resented in Fig. 5.1. . . . . . . . . . . . . . . . . . . . . . . . . 78 6.1 Eigenvalues λi of matrix M as functions of 'i for ǫ ' 1 (a) and ǫ ' 1 (b). Solid lines for odd i and dashed lines for even i. 94 6.2 Convergence speed maximization choices, with initial state

(6.46) and 'N = 1

2 . . . . . . . . . . . . . . . . . . . . . . . . . 104 6.3 Spectral radius of ADMM and of other state-of-the-art consen-

sus algorithm'' Boyd''s optimal matrix and Oreshkin''s method. 105 6.4 Convergence speed '' 1

2 log ξA dependent on ǫ and S when the underlying network graph is a mesh grid where each node has

4 neighbors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 6.5 Method B: Noise values KiUi as functions of d for different

values of 'i. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 6.6 Noise resilience in line graph (left) and mesh grid (right). . . . 109 6.7 MSE with a RGG of N = 100 nodes, Nneigh = 8, and '2q =

10''6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 6.8 ADMM Method B with fixed ǫ in a RGG with Nneigh = 8:

(a) convergence speed; (b) MSE at steady state; (c) optimal

choices of ǫ used. . . . . . . . . . . . . . . . . . . . . . . . . . 111 6.9 MSE with an underlying RGG: (a) as a function of t for N =

50 nodes, and (b) as a function of N for t = 300. Nneigh = 4,

'2 q = 10 ''6, and ǫ optimized for convergence speed. . . . . . . 112 7.1 Network communication graph. . . . . . . . . . . . . . . . . . 129 7.2 Synchronization phase error Ti(t) for five nodes. . . . . . . . . 130 7.3 MSE for clock phases and clock periods with ADMM+AC. . . 131 7.4 MSE of ADMM+AC compared to standard and optimized

consensus approaches. . . . . . . . . . . . . . . . . . . . . . . 131 A.1 The MT (red cross) is communicating with the M = 3 closest BSs (blu squares) in a cellular scenario. . . . . . . . . . . . . . 136 A.2 Final outage probability pN+1 as a function of the MT position in a sector with LD-HARQ-IR and in the absence of interfer-

ence (q = 0). The serving BS is located at coordinates (0,0)

in the plot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 Table of Contents vii A.3 Final outage probability pN+1 as a function of the MT position in a sector with LD-ARQ and in the presence of interference

(q = 1). The serving BS is located at coordinates (0,0) in the

plot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 A.4 Minimum value of the SNR at the unitary distance vs the spectral efficiency of the first transmission for various decod-

ing schemes. Solid line: HARQ-IR; dashed line: HARQ-CC;

dotted line: ARQ. . . . . . . . . . . . . . . . . . . . . . . . . . 150 A.5 Maximum allowed interference probability qmax vs the spec- tral efficiency of the first transmission for various decoding

schemes. Solid line: HARQ-IR; dashed line: HARQ-CC; dot-

ted line: ARQ. . . . . . . . . . . . . . . . . . . . . . . . . . . 152 A.6 Total average number of active BSs as a function of q for HARQ-IR. Solid line: JD; dashed line: LD. . . . . . . . . . . . 153 A.7 Total average number of active BSs as a function of q for HARQ-CC. Solid line: JD; dashed line: LD. . . . . . . . . . . 154 A.8 Total average number of active BSs as a function of q for ARQ.155

A.9 Total average of number backhaul transmissions as a function of q for JD-HARQ. Solid line: HARQ-IR; dashed line: HARQ-

CC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 A.10 Average payload throughput as a function of q for HARQ-IR. Solid line: JD; dashed line: LD. . . . . . . . . . . . . . . . . . 156 A.11 Average payload throughput as a function of q for HARQ-CC. Solid line: JD; dashed line: LD. . . . . . . . . . . . . . . . . . 157 A.12 Total number of active BSs when the MT is positioned at the edge of the cell with ARQ. . . . . . . . . . . . . . . . . . . . . 158 viii Table of Contents List of Tables 3.1 Specialization of the proposed new general framework. . . . . 30 4.1 Specialization of the proposed new general framework for time-

varying parameters. . . . . . . . . . . . . . . . . . . . . . . . . 51 6.1 Convergence speed results for line graph and mesh grid. . . . . 109 A.1 Combination of decoding policies and retransmission methods. 139

A.2 Recursive Search Algorithm. . . . . . . . . . . . . . . . . . . . 146 ix x Table of Contents Acknowledgements Padova, Gennaio 2012 Giunto al termine del percorso triennale di Dottorato, desidero ringraziare le persone che mi sono state vicine, certo di cadere nella disgrazia di dimen-

ticare qualcuno. Devo la maggior parte di quello che ho ottenuto al supporto costante ed incondizionato di Mamma Palmira e Papà Armando, che fin dalle prime de-

cisioni riguardo al mio futuro si sono sempre comportati da genitori modello,

lasciandomi piena libertà nelle mie scelte personali, scolastiche e professionali,

ma garantendomi allo stesso tempo tutto il supporto di cui avevo bisogno. Esempi, forse inconsapevoli, per la mia vita sono stati (e sono tutt''ora) i miei fratelli, Tiziana e Umberto e rispettive famiglie, che con la loro per-

severanza e coraggio mi permettono di godere di due riferimenti e pietre di

paragone del tutto invidiabili. Ringrazio Prof. Lorenzo Vangelista, per avermi offerto l''opportunità di intraprendere il percorso di Dottorato e per aver sempre creduto nelle mie

qualità e capacità, professionali e personali. Meritevoli di ringraziamenti sono sicuramente tutti i docenti e i colleghi del Dipartimento di Ingegneria dell''Informazione dell''Università di Padova

con cui ho avuto il piacere di lavorare assieme. Uno speciale ringraziamento

va a tutti i compagni di ufficio che si sono alternati durante questi tre anni. Un infinito Grazie ad un grande Amico, Marco Massari, per esserci sempre stato, in ogni momento. Un affettuoso ringraziamento anche a Tommaso e

Martina, per avermi sopportato in questi tre anni, e a Chiara per la costante

vicinanza. Un Grazie agli amici di Pellestrina, a tutti quanti, ma in particolare modo ad Anna per essere sempre stata fonte di saggi consigli. Un enorme Grazie

a Silvia, perché le discussioni via chat ad orari per lei improponibili non

rimangano solamente un vago ricordo. Un forte ringraziamento a Giò, Ermy, Elisa, Sbalky, Matteo e a tutti gli altri amici della combriccola padovana per avermi fatto sentire a tempo di

record parte di un gruppo di amici eccezionale. Un sincero ed umile Grazie xi xii Acknowledgements va in particolare ad Elisa, Ermy e Giò per i fondamentali confronti personali,

a volte accesi ma sempre rispettosi. I would like to thank Prof. Erchin Serpedin for having given me the chance to come to Texas A&M University in 2011 to do research work to-

gether with him and Aitzaz Ahmad. A special Thank You also to Sabit Ekin

and Ali Riza Ekti, for their fundamental help before and during my first days

in College Station. A huge Thank You to Patricia (''oh yeah!'), Alberto (''eja'), Darko (''yeah yeah'), Pilar and Cesar for all the gorgeous moments we spent together in

College Station and in Northgate in particular. A special Thank You to

Tasha and Sarah for having tried desperately to teach me how to dance

country music. Texas time will definitely hold a special place in my heart. Infine, vorrei ringraziare tutte quelle persone (colleghi e non) che, in un modo o nell''altro, hanno condiviso con me esperienze fondamentali della mia

vita, permettendomi di diventare quello che sono oggi. Davide Abstract In the past few years, the impressive growth of applications performing tasks

in a distributed fashion has been enabled by the availability of tiny, inex-

pensive devices which, in turn, has been made possible by the recent micro-

electromechanical technology advancements. Sparsely disposing small intel-

ligent appliances throughout a specific area is something that the community

has become used to. Low cost and low power sensing devices equipped with

telecommunication hardware are attractive for use in an infrastructure-less

network in which the absence of a central node stands out, making robustness

be one of the strengths of this kind of networks. Environmental monitoring

and military surveillance are just a few examples of the number of appli-

cations suitable for sensor networks; in fact, home automation and several

health services also can be implemented given that a distributed network of

sensors exists. Sensors need to keep track of a common time scale. This is fundamental for prolonging the network lifetime, making channel access schemes work

properly, for example, or for allowing precise duty cycling among the nodes.

Clock synchronization is also basic if the goal of a running application is

to track moving objects in the battlefield or, more generally, to perform

distributed processing of the sensed data. Since the local notion of time

in a sensor is based on a low quality local oscillator, it turns out that even

small changes in the environmental conditions, like temperature and pressure,

lead to modification in the oscillation frequency of the quartz crystals, thus

producing time discrepancies among different sensor nodes as time goes by. This thesis tackles the problem of clock synchronization in sensor net- works, both from a perspective of clock parameters estimation and from an

algorithmic point of view, to pursue the final goal of making nodes agree on

a common time scale, all across the network. In the first part of the thesis, the so-called two-way message exchange between two nodes is thoroughly analyzed. After recalling existing results

on clock parameters estimation exploiting data collected via this message

exchange process on the wireless channel, an innovative mathematical frame- xiii xiv Abstract work is introduced, which encompasses several common assumptions for the

random delays present in the collected data, in a more general treatment.

Based on this new framework, a factor graph-based clock offset estimator for

wireless sensor networks is proposed and evaluated. Comparison of the vari-

ance of the estimation error with classical bounds available in the literature

shows that the new estimator has extremely good performance, therefore it

can be considered outstanding among Bayesian clock offset estimators. The focus of the second part of the thesis is on the design of distributed consensus algorithms in wireless sensor networks, especially for observations

averaging purposes. In fact, an innovative fast consensus algorithm is pro-

posed and evaluated, based on the alternating direction multipliers method,

which is a distributed method used to solve minimization problems in an

iterative fashion. The new consensus algorithm is compared with the state-

of-the-art of fast consensus, showing an excellent convergence rate and an

outstanding noise resilience. The proposed algorithm is then applied to solve

a network-wide clock synchronization issue, assuming both clock skew and

offset for the nodes in the network, showing a relevant performance improve-

ment with respect to previously proposed consensus-based synchronization

schemes. Finally, the Appendix contains a work whose topic falls out of the main stream of this thesis: in uplink cellular networks, based on the knowledge

of channel statistics, surrounding base stations are carefully and iteratively

chosen in order to provide the mobile terminal a certain quality of service in

terms of the maximum allowed outage probability, with the aim of minimizing

the overall backhaul network usage. Sommario Negli ultimi anni abbiamo assistito alla continua comparsa di applicazioni dis-

tribuite, la cui implementabilità risulta consentita dalla possibilità di avere

a disposizione sensori piccoli ed economici. I recenti progressi tecnologici nel

settore micro-elettronico-meccanico hanno infatti consentito una miniaturiz-

zazione dei nodi sensore. La comunità scientifica si è oramai abituata alla

possibilità, con una spesa minima, di collocare piccoli dispositivi intelligenti

lungo un''area specifica. Sensori economici e a basso consumo, una volta

muniti dell''hardware necessario per le telecomunicazioni, risultano ideali per

l''utilizzo in reti senza infrastruttura, uno scenario in cui spicca l''assenza di

un nodo centrale e la robustezza diviene quindi una proprietà fondamentale.

Monitoraggio ambientale e sorveglianza militare sono solamente un paio di es-

empi di applicazioni adatte a reti di sensori, così come la domotica e l''ambito

sanitario risultano scenari in cui l''uso di una rete distribuita di sensori può

rivelarsi, in effetti, utile e vantaggiosa. I sensori necessitano di una base temporale comune. Questo bisogno risulta fondamentale al fine di prolungare il tempo di vita di una rete, ot-

timizzando schemi di accesso deterministico al mezzo, ad esempio, oppure

schedulando i periodi di attività dei nodi in maniera precisa. La sincroniz-

zazione risulta fondamentale anche in applicazioni legate alla localizzazione,

o più genericamente, per permettere l''elaborazione distribuita di dati raccolti

dai sensori stessi. Dal momento che la nozione di tempo locale in un sen-

sore è fornita da un oscillatore di bassa qualità, anche minime perturbazioni

delle condizioni ambientali (come temperatura e pressione) si riflettono in

modifiche nella frequenza di oscillazione del cristallo di quarzo, producendo

discrepanze nel comportamento tra oscillatori in diversi sensori, che diven-

tano non trascurabili man mano che il tempo scorre. Questa tesi affronta il problema della sincronizzazione di clock in reti di sensori, sia da una prospettiva di stima dei parametri di clock, sia da un punto

di vista algoritmico lungo tutta la rete, con l''obiettivo finale di permettere ai

nodi interessati di trovare una concordanza su una scala temporale comune. Nella prima parte di questa tesi viene analizzato il processo di scambio di xv xvi Sommario informazioni tra due nodi chiamato two-way message exchange. Dopo aver

richiamato la letteratura esistente sulla stima dei parametri del clock utiliz-

zando questo protocollo di scambio dati attravero il canale wireless, viene

introdotto un nuovo framework matematico per permettere un''assunzione

più generale riguardo i ritardi casuali presenti nei dati raccolti. Basandosi

su questo framework, viene proposto e studiato un nuovo stimatore del clock

offset basato sulla teoria dei factor graphs. Dal confronto della varianza

dell''errore di stima con classici limiti inferiori presenti in letteratura risulta

che il nuovo stimatore proposto permette degli ottimi risultati, per cui può a

pieno titolo essere considerato meritevole di menzione nella teoria della stima

Bayesiana applicata al clock offset. La seconda parte della tesi riguarda invece la progettazione di algoritmi di consensus distribuiti per reti di sensori wireless, in special modo per oper-

azioni di averaging svolte in maniera distribuita. Viene proposto e valutato

un nuovo algoritmo di consensus velocizzato basato su alternating direction multipliers method, un metodo distribuito per risolvere problemi di minimiz-

zazione in modo iterativo. Il nuovo algoritmo di consensus viene confrontato

con lo stato dell''arte del consensus velocizzato, mostrando un''eccellente ve-

locità di convergenza e una resistenza al rumore migliore rispetto agli al-

tri algoritmi presenti in letteratura. Lo schema proposto viene poi appli-

cato al problema della sincronizzazione di clock in reti di sensori wireless,

assumendo presenza di clock skew e clock offset tra i vari oscillatori della

rete. L''algoritmo di sincronizzazione risultante consente un rilevante miglio-

ramento delle prestazioni rispetto a schemi di sincronizzazione basati su con-

sensus proposti in precedenza. Infine, nell''Appendice viene descritto un lavoro il cui argomento si dis- costa da quello principale della tesi: in reti cellulari in uplink, in base alla

statistica del canale le stazioni base cooperanti vengono selezionate tramite

l''utilizzo di tecniche iterative con l''obiettivo di garantire al terminale mobile

una certa qualità del servizio in termini di probabiltà di disservizio massima

permessa e allo stesso tempo di minimizzare l''utilizzo della rete di backhaul. List of Acronyms and

Abbreviations 3GPP: Third Generation Partnership Project AC: Average Consensus

ACK: Acknowledgement

ADMM: Alternating Direction Multipliers Method

ARQ: Automatic Repeat Request

AS: Analytical Solution

AWGN: Additive White Gaussian Noise BCHRB: Bayesian Chapman-Robbins Bound

BCRB: Bayesian Cramér-Rao lower Bound

BLUE-OS: Best Linear Unbiased Estimator using Order Statistics

BO: Boyd optimum solution

BS: Base Station

BWC: Broadband Wireless Communication cdf: Cumulative Distribution Function

CHRB: Chapman-Robbins Bound

CRB: Cramér-Rao lower Bound

CSMA: Carrier Sense Multiple Access DLL: Delay-Locked Loop

DPLL: Digital Phase-Locked Loop ES: Exhaustive Search FC: Flexible Cooperation

FDD: Frequency Division Duplexing

FGOE: Factor Graph-based clock offset Estimator

FTSP: Flooding Time Synchronization Protocol xvii xviii List of Acronyms and Abbreviations Ga: Gaussian approximation

GMKPF: Gaussian Mixture Kalman Particle Filter HARQ: Hybrid Automatic Repeat Request

HARQ-CC: Hybrid Automatic Repeat Request with Chase Combining

HARQ-IR: Hybrid Automatic Repeat Request with Incremental Redun-

dancy IGMKPF: Iterative Gaussian Mixture Kalman Particle Filter JD: Joint Decoding LA: Laplacian

LD: Local Decoding

LTE: Long Term Evolution MAC: Medium Access Control

MCP: Multi-Cell Processing

MGF: Moment Generating Function

MH: Metropolis-Hastings

MIMO: Multiple-Input Multiple-Output

ML: Maximum Likelihood

MLE: Maximum Likelihood Estimate

MMSE: Minimum Mean Squared Error

MnLD: Minimum Link Delay Algorithm

MSE: Mean Squared Error

MT: Mobile Terminal

MVUE: Minimum Variance Unbiased Estimator NACK: Not Acknowledgement

NC: No Cooperation

NIC: Network Interface Card

NTP: Network Time Protocol OSI: Open Systems Interconnection PBS: Pairwise Broadcast Synchronization

pdf: Probability Distribution Function

PLL: Phase-Locked Loop

ppm: Parts Per Million QoS: Quality of Service List of Acronyms and Abbreviations xix RBS: Reference Broadcast Synchronization

RF: Radio Frequency

RGG: Random Geometric Graph

ROS: Receiver-Only Synchronization

RRS: Receiver-Receiver Synchronization

RS: Recursive Search SC: Static Cooperation

SINR: Signal to Interference plus Noise Ratio

SNR: Signal to Noise Ratio

SRS: Sender-Receiver Synchronization TDMA: Time Division Multiple Access

TOA: Time Of Arrival

TPSN: Timing-sync Protocol for Sensor Networks UMVUE: Uniformly Minimum Variance Unbiased Estimator

UWB: Ultra-Wide Band VCO: Voltage Controlled Oscillator WSN: Wireless Sensor Network xx List of Acronyms and Abbreviations Introduction In the last few years, the evolution in micro-electronics and micro-electro-

mechanical systems technology has lead to the design of tiny devices able to

perform tasks in an independent manner and capable of reliable communica-

tion with other agents [1]. Thanks to the miniaturization achieved in design-

ing electronic and mechanical components, these devices can sense the sur-

rounding environment through on board sensors and can actuate some tasks

since they can perform signal processing as they are equipped with micro-

controllers. The key feature that makes them interesting from a telecommu-

nications viewpoint is that a group of them placed relatively close to each

other can constitute a network, using the available on-board network inter-

face card which allows them to access the wireless channel. These intelligent

tiny devices therefore form a wireless sensor network (WSN) through which

environmental data is distributively processed and potentially transmitted

elsewhere for further uses and for taking decisions. Recently, many researchers have been focusing on distributed computing thanks to the new focus on distributed networking. Scenarios like WSNs as

well as ad-hoc networks [2] and vehicular networks [3] are strongly investi-

gated in these years. In these networks, distributed computing is becoming

more and more important, since the single agents are generally entities with

poor computational potential and low energy. All nodes are supposed to be

equally poor from the computation capabilities viewpoint, therefore there is

a strong need to avoid centralized approaches in which a node has to elabo-

rate data coming from other nodes, thus encouraging distributed algorithms.

In other words, in these scenarios the assumption of a central sink node is

often unrealistic. The advent of the WSNs makes possible to implement specific distributed applications which could not be performed in an automated fashion before.

In fact, having sensors connected through a wireless channel permits a non-

invasive but efficient form of environmental monitoring or surveillance, if

nodes are equipped with the required sensors or video cameras. Temperature,

pressure and soil make-up are just a few examples of monitorable quantities, 1 2 Introduction as well as vehicles and persons are actual objects whose movements may

deserve surveillance. Seismic movements are also often controlled by WSNs

[4]. In the last decades, cellular communications have monopolized the at- tention of the researchers, and synchronization in such networks is generally

achieved via a master-slave approach between the mobile terminal and the

base station. On the other hand, in the Internet the Network Time Pro-

tocol (NTP) [5], which makes use of a network of time servers, is used for

clock synchronization among users. Clock synchronization between agents

in a WSN is instead a tougher issue, mainly due to the fact that such a

network is generally infrastructure-less, therefore neither NTP is well suited

nor hierarchical synchronization can always be effectively implemented. The communication capabilities of sensors allow the whole network to distributively compute functions of the gathered data as well as cost func-

tions useful for future uses and decisions. Assuming that sensors are able to

communicate with the surrounding neighbors means that all sensors should

have a common notion of time, otherwise channel access schemes (such as

time division multiple access, TDMA, for example) cannot be used. The

lack of synchronization in this case leads also to a side effect with respect

to channel access failure, which is an unwanted power consumption due to

unsuccessful packet transmissions and collisions. Algorithms for distributed

processing also assume the network nodes to be timely-synchronized, other-

wise there would be no agreement on whether deciding if an event happened

before or after another. In other words, a common time scale is fundamental

also for distributed processing in order to match each event with the time

instant it has happened. Another application of relevant interest which bases

its functioning on clock synchronization is localization: several methods have

been taken into account in order to make sensor nodes in a network able to

localize themselves as well as objects on the basis of the knowledge on the

position of specific reference nodes. Among the various proposed techniques

for localization, the time-based approaches can be exploited to perform node

localization under a fundamental assumption: nodes must have the same

notion of time. To stress this point, the reader should notice that the local-

ization accuracy in an ultra-wideband (UWB) system using a time-of-arrival

(TOA) approach is extremely high [6], therefore fine clock synchronization in

such scenarios becomes an extremely important issue. Channel access, dis-

tributed processing and localization are not the only applications that assume

clock synchronization between sensor nodes: other examples are cooperative

communications, coordinated actuation, spectrum sensing and power control,

among many others. The problem of clock synchronization in WSNs has been thorough inves- Introduction 3 tigated in the past. Several surveys [7''10] have been written about this issue,

and for a detailed explanation of the problem the reader should also read El-

son''s Ph.D. dissertation [11]. Many synchronization algorithms (also called

synchronization protocols) have been proposed in the literature, both for sen-

sor networks that make use of radio frequency (RF) communications [12''18]

and for sensor networks based on underwater acoustic transmissions [19''22].

These algorithms either estimate clock parameters with respect to a refer-

ence (leading to ''uncoupled clocks') [12''15], or actively modify the clock

behaviour via suitable control circuits (leading to ''coupled clocks') [16''18].

There is a vast literature as well about possible techniques to better esti-

mate the clock parameters among two nodes with specific assumptions on

the nature of the network delays [23''31]. This thesis is divided into two parts, after the first chapter which is in- tended to provide the reader the necessary notions in order to fully appreciate

the analytical and simulative results presented after. Part I of this thesis de-

scribes clock estimation and synchronization algorithms for WSNs based on

the two-way message exchange, which allows to estimate the clock param-

eters and subsequently correct the local clock behaviour, thus leading to

uncoupled clocks among the network. Based on data gathered thanks to this

exchange process, a general framework is first introduced in order to allow

clock offset estimation when the network delays have probability distribu-

tions belonging to the exponential family of distributions, thus generalizing

existing results in the literature as well as adding new findings. Based on

this, estimation algorithms are proposed both for the classical (ML) and the

Bayesian framework, using the theory of factor graphs. The second part of this thesis relies on the application of consensus al- gorithms for reaching synchronization in WSNs. First, an innovative fast

consensus algorithm is presented which allows top performance in terms of

convergence speed and outstanding results in terms of noise resilience com-

pared to what is available in the literature. After recasting the clock synchro-

nization issue as a double consensus problem with both clock offset and skew

among nodes, the new fast consensus approach is finally applied for reaching

clock synchronization, showing that it brings substantial improvements with

respect to similar algorithms based on consensus. Finally, in the Appendix a work whose topic falls out of the main stream of the thesis is reported. In fact, an algorithm for reducing the backhaul

network usage in cellular networks is presented, when multi-cell processing

is allowed for uplink communications from a mobile terminal, making use

of error control policies and several cooperation techniques between the base

stations, under a constraint on the quality of service in terms of the maximum

allowed outage probability. 4 Introduction Chapter 1 General Concepts on Clock

Synchronization The purpose of this chapter is to give the reader the basics of clock synchro-

nization so that most the fundamental concepts presented and discussed in

this thesis can actually find a reference in the present thesis itself. The chapter deals with the basics of an oscillator in a sensor node, which is the component from which the node gets its own notion of time. After having

explained how an oscillator is basically composed, hardware techniques for

oscillation frequency correction and control are illustrated. The presentation

of a generally accepted clock model follows, which is the foundation for all the

results presented in this thesis. Finally, the main characteristics of message-

based algorithms are presented as well as the distinction between estimation-

based and clock coupling-based algorithms 1.1 The Notion of Time in a Node When people need to know the current time they consult devices whose

specific purpose is to provide an estimate of the time of the day. Looking

at the wall clock or at the watch are common ways to answer the question

''what time is it''. In sensors, and more generally in every electronic device,

this answer is provided through an electric circuit by means of an oscillating

signal generated inside. The name of such a circuit is precisely oscillator, or

clock. There exist natural materials which have a particular oscillating property, namely piezoelectricity: in a crystal of a piezoelectric material, a mechani-

cal stress produces an accumulation of charges on the opposite sides of the

crystal. In this sense, the material is actually behaving like a capacitor: by 5 6 1. General Concepts on Clock Synchronization connecting the two sides with an electric circuit, the flow of electric current

is induced. In more general terms, the crystal behaves like a RLC circuit

(resistor, capacitor and inductor) with a certain frequency of resonance, typ-

ical of the material. If the mechanical compression is done periodically at

such a frequency, then an output periodic electric signal at the frequency of

resonance is obtained. This is precisely the oscillating signal provided by a

clock in the electronic device. A piezoelectric material largely used in oscillators is quartz. A common oscillation frequency for quartz oscillators is 32768 Hz, as it happens, for

example, in the microprocessor Texas Instrument MSP430 F1611 [32] em-

bedded in Tmote Sky sensors. A clock oscillating at 4 MHz can be found

instead in MICA motes, which are employed in testbeds at the University of

California, Berkeley [33]. Although they are claimed to run at the same nominal frequency, even nominally identical crystal oscillators have slightly different running behaviours,

mainly due to fabrication issues and climate conditions. Indeed, it happens

that their instantaneous phases could be different and generally time-varying.

Given that an estimate of the phase difference is possible, what is needed

right after is a way to implement a phase correction to the oscillating circuit.

The application of a voltage correcting signal to an oscillator for modifying its

running behaviour is exactly what happens in a Voltage Controlled Oscillator

(VCO). The feedback-based circuits that generate such correcting signals are

generally referred as (digital) phase-locked loops ((D)PLLs) or delay-locked

loops (DLLs). For a detailed description of these circuits see [34]. In an actual application of clock synchronization, by means of a PLL the node i is able to synchronize its own clock to an external signal produced

through the interaction with other nodes'' oscillators. Designing the nature of

such an interaction between nodes means providing a clock synchronization

algorithm, which is precisely what is being described in the present thesis. 1.2 Clock Models The main building block in a synchronizing network of agents is the clock,

which can be defined as follows. Definition 1.1 A clock is an electronic device that counts oscillations in an

accurately-machined quartz crystal, at a particular frequency [7]. Therefore, a clock can be considered a timer which increases of a unit at

every clock tick, at a certain frequency. 1.2. Clock Models 7 The observable value at sensor node i is the clock counter at absolute time t, i.e., Ti(t) (1.1) which is often simply denoted by ''clock' of node i at time t. Each oscillator runs at frequency fi(t), generally different from the oscil- lator nominal frequency f0. The clock oscillation period (the inverse of the

oscillation frequency) is defined as Υi(t) = 1 fi(t) . (1.2) As far as the clock accuracy is concerned, it is generally known [8] that oscillators running at higher frequencies are more stable, while oscillators

at lower frequencies consume less power, so that it is common practice to

have sensors equipped with slow running oscillators for energy consumption

purposes. This comes at the price of lower clock stability and less achievable

accuracy, since the time resolution is coarser. Indeed, changes in environ-

mental factors like temperature, pressure and humidity make the oscillation

frequency of clock of node i, fi(t), vary with time. Also, factory imperfections

make different clocks oscillate inherently at different frequencies. Therefore,

clock discrepancies between nodes are due both to different values of the

clock counters and to slightly different oscillation frequencies. 1.2.1 Clock Parameters Terms like clock offset, skew and drift are commonly used to describe the

differences in running behavior between clocks of any two nodes i and j.

These concepts are defined in the following, coherently with the definitions

provided in [7, 35]. The clock offset of node i relative to node j is the difference between the clock counters of the two nodes, i.e., θ (o) ij (t) = Ti(t) '' Tj (t) . In many cases in the present thesis, the key parameter is the initial clock

offset between i and j, which is defined as the clock offset between the

two nodes at time 0, i.e., θ (o) ij = θ (o) ij (0) = Ti(0) '' Tj (0) . (1.3) 8 1. General Concepts on Clock Synchronization If node j is the reference time, it turns out that the clock offset of node

i with respect to the reference time is simply θ (o) i (t) = Ti(t) '' t , while the initial offset of node i is θ (o) i = θ (o) i (0) = Ti(0) , θ (o) i '' R . (1.4) The clock skew of node i relative to node j is the difference of the first derivative of the clock counters of nodes i and j, i.e., θ (s) ij = ''Ti(t) ''t '' ''Tj(t) ''t . (1.5) If node j is the reference time, it turns out that the clock skew of node

i with respect to the reference time is simply θ (s) i = ''Ti(t) ''t '' 1 , θ (s) i '' [''1, +'') . (1.6) The clock drift of node i relative to node j is the difference of the sec- ond derivative of the clocks of nodes i and j, i.e., θ (d) ij = ''2Ti(t) ''t2 '' ''2Tj(t) ''t2 . (1.7) If node j is the reference time, it turns out that the clock drift of node

i with respect to the reference time is simply θ (d) i = ''2Ti(t) ''t2 , θ (d) i '' R . (1.8) 1.2.2 The Adopted Clock Model In order to describe the clock dynamics, several models have been proposed

in the literature. The most complete modeling of a clock consists of a Taylor

expansion of the clock counter truncated at the second degree, i.e., [7, 35, 36] Ti(t) = θ (o) i + 1 + θ (s) i t + θ (d) i t2 2 . In this thesis, clock drift θ (d) i is not considered so that the considered reference model is the following [7''10,12''16,23''31,35,37''43] (either with or

without the skew presence) Ti(t) = θ (o) i + 1 + θ (s) i t , (1.9) 1.2. Clock Models 9 in which θ (o) i and θ (s) i are unknown parameters to be estimated. Often, an a-priori description for these parameters is also available. In each chapter

of this thesis it is specified if clock skew θ (s) i is also considered, after clock offset θ (o) i , which is always taken into account. Fig. 1.1 shows the running behaviours of different clocks adhering to the model (1.9). 0 20 40 60 80 100 ''40 ''20 0 20 40 60 80 100 120 140 t T i (t) a 0 20 40 60 80 100 0 50 100 150 t T i (t) b Figure 1.1: Examples of clocks with different offsets and no skews (a) and

with different skews but no offsets (b). Clock offsets and skews are real values, describing the level of discrepancy between two clocks. However, they have strongly different meanings and rele-

vance, since a relatively small clock skew can lead to a huge clock offset after a

long period of time in which no synchronizing correction is made. As a result,

if having an unbounded initial clock offset could not heavily affect synchro-

nization, even a small clock skew requires frequent re-synchronizations if it

is not taken into account during the design of the algorithm. In real sensor networks, values of clock skews are generally bounded and quite small. The community is used to express the skew in terms of parts

per million (ppm): if a clock is said to deviate c parts per million, it means

that after a million of clock ticks it diverges of the equivalent of c clock ticks.

To be concrete, the clock embedded in the microprocessor Texas Instrument

MSP430 F1611 [44] has a clock skew on the order of 40-50 ppm, which means

that for each second the clock deviates of 40-50 µs. Also other datasheets [45]

report skews of this order of magnitude. If deviating of a few µs every second

could appear as a negligible effect, just realize that in a day (24 ' 60 ' 60 = 86400 seconds) the oscillator deviates of few seconds, which is a huge time

from a sensor network application viewpoint. 10 1. General Concepts on Clock Synchronization Absolute or Relative Synchronization' In clock synchronization, identifying the known and unknown quantities is

a matter of fundamental importance. In the model (1.9), Ti(t) represents

the observation at node i at time t; on the other hand, the time scale at

node i is Ti(t) itself, so that the sensor node is totally unconscious of the

absolute time reference t. In fact, it can only access the counter Ti(t). As a

result, by itself the node is not able to estimate its own clock parameters θ (o) i and θ (s) i , unless by using time-related observations coming from other nodes. Moreover, since timestamps recorded by other nodes are related to these

nodes'' timescales, an estimate of the absolute values θ (o) i and θ (s) i is generally not possible, except for the case in which node i uses readings from a special

node whose timescale is assumed as the reference, i.e., θ (o) i = θ (s) i = 0. 1.3 Approaches to Clock Synchronization Clock synchronization between agents requires information exchange between

nodes in order to pursue a common time basis. The main question now is

what kind of data should nodes exchange in order to reach synchronization'

Existing clock synchronization algorithms either make use of messages (OSI

level 2 data exchange) or electromagnetic radio frequency pulses for exchang-

ing clock related data (OSI level 1 data exchange). The two approaches lead

respectively to message-based and pulse-coupled algorithms. The comparison

between them is done in the next section. 1.3.1 Message-Based vs Pulse-Coupled Algorithms Depending on the available hardware, the target level of accuracy and the

channel resources, nodes can either exchange message-based data or physical

signaling [9]: Message-Based Algorithms : in message-based algorithms, the local time of a node is written (e.g. timestamped ) in a message, which is broadcast to the neighboring nodes via the wireless channel. Since they involve

data exchange at level 2 of the OSI stack, such algorithms suffer of soft-

ware delays, such as interrupts or operating system calls. However, it

is proved that most of such delays can be effectively neglected through

kernel [12] or MAC-layer timestamping [13, 14], i.e. reading the times-

tamp and adding it to the packet just before the transmission on the

wireless medium. A fundamental advantage of such algorithms is that 1.3. Approaches to Clock Synchronization 11 they don''t require the network to allocate specific resources (such as

bandwidth) for synchronization purposes. Pulse-Coupled Algorithms : pulse-coupled algorithms are based on phys- ical layer signaling instead of message transmissions; in fact, time infor-

mation is encoded in the transmission instants of the sent waveforms.

To this purpose, generally either a specific frequency band is dedicated

or an overlay system (such as UWB) is employed. The received signal

at neighboring nodes is then processed in order to achieve synchro-

nization with the sending node. Such processing techniques possibly

include the use of PLLs [16, 37]. Although pulse-coupled approaches are interesting from the signal pro- cessing point of view, this thesis entirely focuses on message-based clock

synchronization algorithms, since it is assumed that neither specific chan-

nel resources are reserved for synchronization purposes nor sensor nodes

are equipped with UWB transceivers. Basic wireless resources and simple

communication hardware are well suited constraints to low-cost low-power

sensing devices. 1.3.2 Coupled Clocks vs Uncoupled Clocks In the realm of message-based algorithms, synchronization methods can be

divided in two categories 1 . Depending on the type of data processing and clock correction, message-based algorithms can lead either to coupled clocks

or to uncoupled clocks [9]. Indeed, the algorithms can either estimate the

clock parameters with respect to the other nodes and then perform a cor-

rection through a VCO (uncoupled clocks), or act directly to the oscillator

circuit to correct it step by step (coupled clocks). Regarding this aspect, the

difference between uncoupled and coupled clocks reflects also the distinction

between Part I and Part II of this thesis: in fact, in Part I clock synchroniza-

tion algorithms are designed with the goal of estimating clock parameters, so

that clocks in the network result to be uncoupled, since no correction signal

is applied to the oscillator circuit before the estimation algorithm stops run-

ning; on the other hand, the synchronization algorithm derived in Chapter 7

in Part II relies on the application of a correction signal to the oscillator on

the fly at each step, therefore leading to coupled clocks along the network. It

can also be observed that, while in Part I the focus is on the statistical de-

tailed description of the message exchange process between two nodes, Part 1The distinction is actually transversal with respect to the categorization message-based vs pulse-coupled. 12 1. General Concepts on Clock Synchronization II provides a broader picture of the problem, better analysing the overall

network performance while lacking of local detail. 1.3.3 Synchronization Via Parameter Estimation Clock synchronization through parameter estimation can be performed via

time-related data exchange. The following definition holds. Definition 1.2 The process by which nodes exchange time-related informa-

tion in message-based algorithms is called a synchronization paradigm. To the best of the author''s knowledge, three synchronization paradigms can

be identified [10, 15]: Sender-Receiver Synchronization - SRS : in this classic paradigm two nodes S (the sender) and R (the receiver) are involved; they exchange

timestamped messages in a two-way handshake fashion (started by

node S) in order to make node S able to estimate its clock relative

parameters with respect to node R. Receiver-Receiver Synchronization - RRS : at least three nodes are involved in this paradigm, the parent node P that sends broadcast

messages, and two other nodes, R1 and R2. After that, R1 and R2

exchange data about the local time of reception of the messages and

estimate their clock relative parameters. Receiver-Only Synchronization - ROS : a node L is able to synchronize to node S by listening to the messages exchanged by node S and node

R that are synchronizing using the SRS paradigm. Part I of this thesis is dedicated to the analysis of clock synchronization algorithms based on clock parameters estimation which make use of the SRS

paradigm. 1.3.4 Synchronization Via Clock Coupling Some network-wide clock synchronization algorithms that perform a step

by step clock correction assume a low level of detail in the communication

model but allow for a more comprehensive analysis and provide a broader

picture of the algorithm behaviour. Synchronization algorithms that fall

in this category and are taken into account in this thesis are based on the

consensus approach. Basics of consensus algorithms as well as the application

of consensus to the problem of clock synchronization are described in Part

II. Part I Clock Synchronization in WSNs via the Two-Way Message Exchange Clock synchronization can be reached through the estimation of the parameters that identify the disagreement between node''s clocks. This is the approach considered in Part I of the present thesis. 13 Chapter 2 The Two-Way Message Exchange In Chapter 1, message-based protocols have been divided in two categories,

based on the fact that they lead either to coupled or to uncoupled clocks.

The current chapter describes the process of data exchange exploited in

estimation-based clock synchronization algorithms, that lead to uncoupled

clocks. This communication protocol provides the rules for getting the time-

related information needed for clock parameters estimation. The obtained

observation model is carefully analysed and its performance studied. 2.1 The Two-Way Data Exchange One of the preferred ways to exchange time related data in a message-based

algorithm is the so called two-way message exchange, which identifies a two-

way communication handshake protocol between two nodes sharing their

timestamps in order to achieve synchronization. The two-way message exchange is a data communication process between two nodes in which there is a double message exchange between the involved

nodes, one for each link direction. The messages carry timestamped infor-

mation according to the local clock time scale, so that at the end of the

information exchange there is sufficient data to infer the clock disagreement

parameters of the two nodes, making clock synchronization possible between

them. The mechanism is not perfectly symmetric, since there is a node which initiates the process and the other which responds. On the other hand,

with little overhead (just one more short message, which can also exploit

piggybacking of following messages) at the end of the data exchange both

the nodes get all the raw information available, thus they can perform the

same data processing to clock parameters estimation. 15 16 2. The Two-Way Message Exchange R S T 1 S T 2 R T 4 S T 3 R Figure 2.1: Two-way message exchange between node S and R. 2.2 System Model The two-way message exchange can be better described if one looks at Fig.

2.1. The lowest horizontal line represents the time scale of node S, which

is the node that begins the data exchange process. The upper oblique line

instead represents the time scale of node R. The non-zero angle between the

two lines represents the clock skew between the time basis of node S and

node R. The data exchange [23, 25''31, 38''42] starts with a message sent by node S, containing the timestamp T 1 S of the time at which such a message has been sent 1 , measured according to the time scale of node S. Such a message is received by node R at time T 2 R, according to the time scale of node R. After a predefined period of time, node R replies with a message containing

the timestamps T 1 S , T 2 R and the timestamp T 3 R relative to the sending time of this message, measured according to the time scale of node R. Finally, such

a message arrives to node S at time T 4 S , according to the time scale of node S. By keeping into account of both clock offset θ(o) and skew θ(s) between the two nodes using the model (1.9), the following relations between the 1The dependence on the absolute time of the counter values T (t) is dropped in order to make the notation more readable. 2.2. System Model 17 timestamps hold [26, 27, 31, 38]: T 2 R = 1 + θ (s) T 1 S + d + X + θ(o) T 3R = 1 + θ (s) T 4 S '' d '' Y + θ(o) . (2.1) The fixed portion of the link delay is assumed symmetric and denoted by

d, while the random portions of the same delay in uplink and downlink

respectively are denoted by X and Y . A full discussion about these delays

as well as their possible modeling is performed in the following. 2.2.1 Fixed and Random Portions of Link Delay The major impairments in the two-way message exchange come from the

message transmission delays. There is no need to stop and try to find a

possible cause of delay that could affect a message transmission through a

shared medium. The wireless channel is indeed used by a potentially large

number of users, and there is an extensive bibliography on the available

channel access schemes. For sure, the channel access is neither deterministic

nor instantaneous. In other words, there is a stochastic delay to be modelled. The delay encountered by a message transmission through a wireless chan- nel has been decomposed into four categories [10, 12''14]: ' Send Time: the time required by the transmitter to construct the mes- sage and to forward it to the NIC 2. This time includes delays coming from interrupts and operating system calls. ' Access Time: the time required to get a successful message transmission in the wireless medium. It depends on the channel access scheme. ' Propagation Time: the time needed for the electromagnetic wave to travel on the wireless medium from the transmitter to the receiver. It

encompasses latencies and forwarding delays at intermediate hops in

large networks. ' Receive Time: the time needed to the receiver for the message reception and decoding as well as for the notification of the message arrival to

the operating system. These delays are typical of a large class of wireless networks and protocols, from the TDMA one-hop networks to the large multi-hop CSMA networks. In 2In the following, the NIC (Network Interface Card) is intended as the part of the device which is actually sending the packet over the air. 18 2. The Two-Way Message Exchange this thesis, for the delay analysis in the two-way message exchange, only one-

hop communications will be considered and CSMA is assumed as the wireless

channel access scheme [33]. With this assumption, the just mentioned four

delays can be divided in two categories. Assuming nodes are not mobile, the

propagation delay can be considered fixed (or deterministic), since it just

depends on the distance between the two sensors, no random components

are present. However, the send, access and receive times encompass random

impairments among their causes (operating system calls, channel access, . . . ),

therefore they can be considered random (or stochastic). In this thesis, the

following definitions hold (see [46]): Definition 2.1 The deterministic (fixed) part of the link delay in the two-

way message exchange process is the minimum of the link delays over the

message exchange rounds. Definition 2.2 The stochastic (random) part of the link delay in the two-

way message exchange process is the positive quantity to be added to the

deterministic part of the link delay in order to get the overall delay. As a result, both the deterministic and the stochastic delays are positive

quantities. In the past, a measurement campaign has been performed [46] to see if the end-to-end delay could be somehow modelled. What turned out from such

work is that the stochastic end-to-end delay can be better described by a

Gamma probability distribution, which comprises a number of distributions.

What plays a key role in the stochastic part of the delay is the queueing delay

at the transmitters; this is the reason why many works in the literature as-

sumed exponential distributed random portions of delays [23''27]. However,

the Gaussian, log-normal and Weibull distributions are other good candi-

dates adopted in the literature for modelling the stochastic delay. Given the

substantial diversity between these distributions, it is hard to find a common

framework which encompasses the entire range of possible distributions. On

the other hand, in Chapter 3 a new framework is presented in order to allow

for the study of different random delay distributions at the same time in a

more general fashion, through the use of tools borrowed from the theory of

the exponential family of distributions. Message delays can be reduced through smart tricks. The most used tech- niques involve reading the timestamps just before forwarding the message to

the NIC for the actual transmission (kernel [12] or MAC layer timestamp-

ing [13,14]) to eliminate the send and the access times. Moreover, if also the

receiver timestamps the message arrival time at a low enough level in the

host''s operating system kernel, the receive time is strongly reduced since it 2.3. State of The Art 19 does not include the overhead of system calls, context switches, or even the

transfer of the message from the network interface to the host. Finally, the

propagation time cannot be eliminated because of its physical reason, but it

can be estimated from the round trip time through a handshake paradigm. 2.3 State of The Art Several clock synchronization algorithms exploit the two-way message ex-

change in order to reach clock agreement. Indeed, as an example, both

Timing-sync Protocol for Sensor Networks (TPSN [14]) and Reference Broad-

cast Synchronization (RBS [12]) use data coming from this exchange process

in order to compensate for clock offset and skew. The two-way message exchange well applies to a statistical analysis of timestamped data, both in presence of clock offset and skew, as it can be

seen from (2.1). A major statistical analysis of the two-way message exchange

mechanism has been done first in [23], where a thorough discussion on the

best use of the timestamped data is also presented. After this first work, a

deep study on the statistical performance of this message exchange process

for clock parameter estimation has been done in the literature. If clock skew is not considered, the issue is to correctly estimate the clock offset between nodes S and R. Assuming the random portion of delays be-

ing identically distributed exponential random variables both in uplink and

downlink, Ghaffar in [23] states that an ML estimate (MLE) of the clock

offset does not uniquely exist in the case in which both the mean of the

random portion of the delays and the fixed portion of delay are known. On

the other hand, Jeske in [24] proved that such an MLE actually exists when

the fixed portion of delay is unknown, with the mean of the exponentially

distributed delays being either known or unknown; an expression for the

MLE is therein derived. However, several ad-hoc clock offset estimators were

proposed in [23]; one of them (the Minimum Link Delay Algorithm, MnLD)

coincides with the actual MLE derived in [24]. Still assuming that the ran-

dom delay distribution is exponential but, more generally, asymmetric be-

tween uplink and downlink, in [25] the best linear unbiased estimator using

order statistics (BLUE-OS) [47] is derived as well as the minimum variance

unbiased estimator (MVUE) [47] via the application of the Rao-Blackwell-

Lehmann-Sheffé theorem. Making the assumption that the stochastic part

of the delay is the result of a diverse serie of causes, [26] considers it as

following a Gaussian distribution. The applicability of such an hypothesis

could be questionable, because if so, there is a non-zero probability that the

overall delay (deterministic plus stochastic) is negative and this is physically 20 2. The Two-Way Message Exchange not possible. On the other hand, the Gaussian assumption may hold in the

case the standard deviation is chosen small enough, by recalling that about

the 99.7% of the samples are inside the 6' interval symmetric around the

distribution mean. The authors in [26] provide expressions for the MLE for

the clock offset and the relative Cramér-Rao lower bound (CRB) under this

Gaussian assumption. The case of a Weibull distributed random delay is

faced instead in [28], where it is shown that the problem of finding the MLE

of the clock offset is generally convex, therefore it is solvable by numeric op-

timization A uniform MVUE (UMVUE) is also derived for the exponential

distribution case in which the fixed portion of delay is known. When a more

general assumption is made about the noise distribution, it becomes hard to

find an exhaustive solution to the clock offset estimation problem. In [29,30]

such a difficulty has been faced by the usage of (Iterative) Gaussian Mixture

Kalman Particle Filter, (I)GMKPF, in order to find an estimate of the clock

offset in the presence of non-Gaussian and non-exponential random delay

distribution or in the presence of a mixture of several distributions. If the model assumes the presence of both clock skew and offset among node''s oscillators, frequent re-synchronization is reduced since a skew cor-

rection naturally leads to a longer clock synchronization. A Gaussian dis-

tribution for the random portion of delay is assumed in [31] where the joint

MLEs for clock offset, skew and fixed part of the delay are derived as long as

performance bounds. Moreover, a lower complexity suboptimal estimator is

also proposed. In [26] the joint MLE for both clock skew and offset is derived

in the case of Gaussian distributed random delays with the assumption of

knowing the fixed part of delay. Since in this case the estimator expression

involves non-trivial computation (and an MLE for the exponential case does

not even exist if d is known), an ML-like estimator is proposed for both the

Gaussian and the exponential cases, which exploits just the first and the last

time stamps. This estimator assumes d to be unknown, which is a more

realistic assumption in real WSNs. Performance is obviously worse than the

MLE but a simpler expression is provided for the estimators so that they

are easily implementable. Later in [27] a graphical algorithm to find the ML

clock offset and skew estimate is derived in the case of exponential symmet-

ric random delays. Due to its complexity, a suboptimal alternative empirical

algorithm is proposed. 2.3.1 Relevant Synchronization Algorithms In this section, the most famous clock synchronization algorithms in radio

WSNs based on the two-way message exchange are described. 2.3. State of The Art 21 Reference Broadcast Synchronization (RBS) To the best of the author''s knowledge, RBS [11, 12] was the first relevant

algorithm proposed for clock synchronization in WSNs; it was presented in

2002 [12] by Elson et al. for synchronizing a set of Berkeley motes [33]. The

problem of reducing the uncertainty in the message delivery time has been

solved by using the RRS paradigm in a smart way: in order to eliminate

the Send and the Access times, the authors let a set of nodes synchronize

themselves by recording the arrival time (according to their local clocks) of

a number of broadcast packets received from an elected reference node; the

key assumption is that the absolute time of arrival of the packet is the same

at all the nodes: in other words it is assumed that the packets arrive to all

the reference node''s neighbors at the same instant. After having received

these packets, such nodes exchange their local readings. Each node is there-

fore able to compute clock offset and skew with respect to any other node

(except the reference node) by solving a least square linear regression. The

receive time is limited by allowing system clock reading directly inside of the

network driver''s interrupt handler. A multi-hop extension of RBS consists on

exploiting nodes that are in the coverage range of more than one node: such

nodes work as clock bridges between two or more broadcast regions, there-

fore they are able to make the time conversion from a broadcast domain to

another. This algorithm has the following pros: ' The reference broadcast packet doesn''t need to be a dedicated packet, therefore it is possible to exploit any broadcast transmission. ' The algorithm is robust against node deaths and packet losses since the linear regression works even if some points are missing. ' Local time scales are built up, making the algorithm well suited to those applications that don''t need a network global time scale (like, for

example, a localization application). ' A large number of synchronizing broadcasts enable tighter synchroniza- tion because residuals errors tend to follow well-behaved distributions. There are also some cons: ' For a single-hop network of N nodes there is a huge (O(N 2)) number of exchanged packets for synchronization. ' Low convergence speed because of the high number of message ex- changes. 22 2. The Two-Way Message Exchange ' Need for the election of the beacon sender. ' The reference node is left unsynchronized. Timing-sync Protocol for Sensor Networks (TPSN) Ganeriwal et al. in [14] propose an algorithm that works upon a different

paradigm with respect to RBS: in fact, while the latter is based on synchro-

nizing a set of receiver nodes, TPSN bases its fundamentals on the more

conventional approach, SRS. In order to synchronize the network, a hier-

archical structure is constructed: this is the reason for the presence of a

''discovery phase' in which a root node is chosen by running a leader elec-

tion algorithm in the network so that a tree is built up. The ''synchronization

phase' follows, in which every node synchronizes itself to its parent by a two-

way message exchange mechanism. The implementation results on Berkeley

Motes are quite encouraging since TPSN provides twice better synchroniza-

tion performance with respect to RBS; moreover, the synchronization error

doesn''t blow up with the hop distance. However, a strong limitation in this

protocol is that it doesn''t provide any skew correction, therefore a periodic

resynchronization phase of the entire network is needed. MAC layer time-

stamping is also implemented in this algorithm in order to reduce the send

and the access time. TPSN has the following pros: ' It is robust to single node deaths and births, with effective special provisions. ' It is easily scalable, since the synchronization error doesn''t blow up with hop distance. ' Performance is good also with respect to RBS. There are also some cons: ' It is not suitable in the case nodes have some mobility since a hierar- chical structure has to be built up and kept running. ' Need for the election of the root node. ' No clock skew estimation. 2.3. State of The Art 23 Flooding Time Synchronization Protocol (FTSP) In FTSP [13], the first step is the election of a root which is intended to

represent the global reference time; it periodically broadcasts its clock to its

neighbors which collect this data and build up a table with a set of entries

global time-local time, using linear regression to estimate clock skew and

offset with respect to the root node. After they get synchronized to the root,

they themselves become roots for their neighborhood by broadcasting their

synchronized time. Differently from RBS, the broadcast packet must contain

the sender timestamp, but similar to TPSN the paradigm is SRS. Also here,

MAC layer time-stamping is used in order to reduce the send and the access

time. The pros of this algorithm are the following: ' It supports topology changes since no fixed hierarchical structure is required. ' Less number of message exchanges with respect to RBS and TPSN. On the other hand, the main cons are: ' A root election and renewing process is needed. ' The actual number of exchanged messages is redundant, due to cover- age overlapping. Pairwise Broadcast Synchronization (PBS) The authors of PBS [15], are the inventors of a new synchronization approach,

the ROS. As explained in Chapter 1, a node L exploits data embedded in mes-

sages exchanged by node S and R, synchronizing through the SRS paradigm,

to estimate its clock skew and offset with respect to the clock of node S.

Results of PBS in terms of synchronization error and in terms of the number

of exchanged packets are encouraging since the clock precision is comparable

to that of RRS approach while the number of exchanged messages is propor-

tional to the number of clusters in the network. Here some advantages of

this algorithm are listed: ' Low number of exchanged messages. ' Fine results in term of synchronization error. However, the main disadvantages are: ' PBS needs to work over another SRS synchronization protocol. 24 2. The Two-Way Message Exchange ' There is the need to divide the network in clusters and synchronize them by additional pairwise synchronization among super nodes in different

clusters. Several clock offset and skew estimators have been proposed [39''42] for the listening nodes, assuming this framework. On the other hand, a thorough

analysis of this problem is out of the scope of this thesis. Chapter 3 Clock Offset Estimation: A

Unified Framework The results presented in this chapter refer to a collaboration of Prof. Van-

gelista and me with Prof. Erchin Serpedin and Aitzaz Ahmad, both of them

with the Department of Electrical and Computer Engineering, Texas A&M

University, College Station, TX (USA). The research work has been per-

formed while I was on leave as a ''Visiting Scholar' at Texas A&M University,

from January to July 2011, partially supported by an ''A. Gini' fellowship.

Results have been submitted to the journal IEEE Transactions on Informa- tion Theory [48]. 3.1 Introduction The concepts introduced in this chapter are based on the two-way message

exchange introduced in Chapter 2. Here, a new framework is defined for

treating several assumptions on the random part of the link delays in a

unified manner. In the considered scenario, only clock offset is assumed,

so that the model depicted in Chapter 2 is substantially simplified. This

allows to make a key quantity substitution in the observation model which

leads to parameters decoupling. Tools borrowed from the exponential family of distributions prove useful to tackle the problem of finding the MLE of the clock offset under the gen-

eral framework, both recovering existing results and extending them to new

findings using the convex optimization theory. 25 26 3. Clock Offset Estimation: A Unified Framework R S T 1 S (1) T 2 R(1) · · · · · · T 4 S (1) T 3 R(1) T 1 S (K ) T 2 R(K ) T 4 S (K ) T 3 R(K ) Figure 3.1: K rounds two-way message exchange between node S and R with

clock offset only between them. 3.2 System Model The data exchange mechanism on which the new common framework bases

its fundamentals is the two-way message exchange, as described in Chapter

2. As an extension of what introduced therein, here it is assumed that the

data exchange is iterated for K rounds, with the aim of providing a more

accurate estimate of the clock parameters. In order to keep into account

of the exchange repetition, the notation is extended as follows. The four

timestamps {T 1 S , T 2 R, T 3 R, T 4 S } gathered at round k are here denoted as {T 1 S (k), T 2 R(k), T 3 R(k), T 4 S (k)} , k = 1, . . . , K . In other words, brackets are added to remark the dependence of the times-

tamps on the round index k. No clock skew is taken into account: Fig. 3.1 represents the considered scenario: a K rounds two-way message exchange mechanism with offset pres-

ence only. This assumption leads to a simplified relation between timestamps

with respect to (2.1). In fact, setting θ(s) = 0 leads to T 2R(k) = T 1 S (k) + d + X (k) + θ (o) T 3 R(k) = T 4 S (k) '' d '' Y (k) + θ (o) , (3.1) where, as before, d represents the fixed (among all K rounds) portion of the

delay while X(k) and Y (k) model the i.i.d. random portions of the link delay

in uplink and downlink at round k, respectively. 3.3. A New General Framework 27 Following the approach of [23, 25, 28''30], the readings are then processed to obtain U(k) = T 2 R(k) '' T 1 S (k) = d + θ (o) + X(k) (3.2a) V (k) = T 4 S (k) '' T 3 R(k) = d '' θ (o) + Y (k) . (3.2b) At this point, it results possible to define new quantities such that the two relations in (3.2) results to be uncoupled. In fact, by defining ξ = d + θ (o) (3.3a) ' = d '' θ (o) (3.3b) the model (3.2) can be rewritten as U(k) = ξ + X(k)

V (k) = ' + Y (k) for k = 1, . . . , K. 3.3 A New General Framework As pointed out in Chapter 2, the random portions of the delays X(k) and

Y (k) can obey to a number of possible probability distribution functions.

The aim of this chapter is to propose a new general framework for these

delays so that three important distributions can be considered at the same

time in a general fashion. In order to make the notation shorter, the readings U(k) and V (k) are organized in vector form, i.e., U = [U(1), . . . , U(K)] ' V = [V (1), . . . , V (K)] ' , so that it comes easier to express the likelihood functions of the observa-

tions. Here it is assumed that the likelihoods follow either the Gaussian or

log-normal distribution, in a general framework. This can be done by bor-

rowing notation and properties from the theory of the exponential family of

distributions [49, 50]. 28 3. Clock Offset Estimation: A Unified Framework Unconstrained Likelihood: fU (U |ξ) ' exp ξ K X k=1 ηξ(U(k)) '' K'ξ(ξ) ! (3.4a) fV (V |') ' exp ' K X k=1 η'(V (k)) '' K''(') ! (3.4b) where ηξ( ·) and η'(·) are functions of the observations U and V , respec- tively and they are assumed to be sufficient statistics for estimating ξ and '.

The 'ξ( ·) and ''(·) are called log-partition functions and they serve as nor- malization factors so that both fU (U |ξ) and fV (V |') are valid probability distribution functions. Given the normalization purposes of the log-partition

functions, they are naturally defined except for an additive constant. The

proportional signs in (3.4a) and (3.4b) refer to the fact that a multiplicative

factor only dependent on the observations U and V , respectively, is present

in front of the exponential term. In this case, the likelihood functions are

called ''unconstrained' since the domains are independent on the parameters

ξ and ', as opposite to the case when the likelihood functions are assumed

adhering to the exponential distribution, as shown below. Constrained Likelihood: fU (U |ξ) ' exp ξ K X k=1 ηξ(U(k)) '' K'ξ(ξ) ! K Y k=1 I(U(k) '' ξ) (3.5a) fV (V |') ' exp ' K X k=1 η'(V (k)) '' K''(') ! K Y k=1 I(V (k) '' ') (3.5b) where I( ·) is the indicator function, defined as I(x) = 1 x ' 0 0 x < 0 . In this case, the likelihood function is called ''constrained' since its domain

is dependent on the parameters ξ and '. The exponential family of distributions represents a precious tool to anal- yse the likelihood functions in a sufficiently general fashion to encompass to-

gether potentially very different distributions like Gaussian, log-normal and

exponential. The three of them are all valid candidates for best describing

the random behaviour of X(k) and Y (k), thus it follows that the choice of

adopting the exponential family framework is an effective way to study the 3.3. A New General Framework 29 problem of clock offset estimation in a more general way with respect to what

done in the past. Although original, the approach of adopting uses and costumes of the exponential family of distribution can be specialized at the specific need, re-

covering the results of the literature when the assumption is to have random

portions of delays obeying specifically either to Gaussian, log-normal or ex-

ponential distribution. This is of fundamental importance, since if on one

side the framework permits a general analysis of the problem, simultaneously

it leads to a continuity with the existing results. It is useful to dig further into the theory of the exponential family of distributions: the following properties [49] will prove useful in the rest of the

upcoming discussion. They are listed referring to the parameter ξ and the

observations U : similar properties hold for ' and V , respectively. 1. The mean value and the variance of the sufficient statistics ηξ(U(k)) can be written as E [ηξ(U(k))] = '''ξ(ξ) ''ξ (3.6) '2 ηξ = Var [ηξ(U (k))] = ''2'ξ(ξ) ''ξ2 . (3.7) 2. The moment generating function (MGF) of the random variable ηξ(U(k)) is given by Mη ξ (h) = exp ('ξ (ξ + h) '' 'ξ(ξ)) . (3.8) The following proposition holds from the positiveness of the variance in (3.7). Proposition 3.1 The log-partition function 'ξ( ·) is convex. The following theorem specializes the common framework to the three con-

sidered specific distributions. Theorem 3.2 The sufficient statistics and the log-partition function for the

Gaussian, log-normal and exponential likelihood functions are respectively 1. N ξ, ' 2 ξ : ηξ (U(k)) = U(k) '2 ξ , 'ξ(ξ) = ξ2 2'2 ξ , (3.9) 30 3. Clock Offset Estimation: A Unified Framework Distribution fU (U |ξ) ηξ(U(k)) 'ξ(ξ) '2η ξ N (ξ, ' 2 ξ ) ' exp h '' P K

k =1(U (k)''ξ) 2 2'2 ξ i U (k) 2'2 ξ ξ2 2'2 ξ 1 '2 ξ log N (ξ, ' 2 ξ ) ' exp h '' P K

k =1(log U (k)''ξ) 2 2'2 ξ i log U (k) 2'2 ξ ξ2 2'2 ξ 1 '2 ξ λξ exp h ''λξ P K

k=1 (U (k) '' ξ) i · E (λξ) λξ 0 0 · Q K

k=1 I (U (k) '' ξ) Table 3.1: Specialization of the proposed new general framework. 2. log N ξ, ' 2 ξ : ηξ (U(k)) = log U(k) '2 ξ , 'ξ(ξ) = ξ2 2'2 ξ , (3.10) 3. E (λξ): ηξ (U(k)) = λξ , 'ξ(ξ) = 0 . (3.11) Proof: See Appendix 3.A.1. The results stated by Theorem 3.2 are summarized in Tab. 3.1.

The following corollary holds. Corollary 3.3 In the Gaussian, log-normal as well as in the exponential

case the log-partition function can be expressed as a second-degree incomplete

polynomial, i.e., 'ξ(ξ) = aξξ 2 = '2 ηξ 2 ξ2 . (3.12) In a real scenario, if the variance of the sufficient statistics is not known, it can be substituted by its empirical version, since the weak law of large

numbers assures that P K

k=1 ηξ (U (k)) K p '' E [ηξ(U)] , K '' '' P K

k=1 η 2 ξ (U (k)) K p '' E η2 ξ (U ) , K '' '' . 3.4. Maximum Likelihood Estimation 31 3.4 Maximum Likelihood Estimation This section focuses on the ML estimation of the clock offset, thus recovering

the results existing in the literature. In order to perform the estimation,

the problem is recast as an instance of convex optimization, differently from

the graphical approach adopted in [24]. The case of unconstrained and con-

strained likelihoods are treated separately. 3.4.1 Unconstrained Likelihood Using (3.4a) and (3.12), the unconstrained likelihood functions are given by fU (U |ξ) ' exp ξ K X k=1 ηξ(U(k)) '' K '2 ηξ 2 ξ2 ! (3.13a) fV (V |') ' exp ' K X k=1 η'(V (k)) '' K '2η ' 2 '2 ! . (3.13b) The MLE of ξ and ' can be expressed as ' ξML = arg max ξ exp ξ K X k=1 ηξ(U(k)) '' K '2η ξ 2 ξ2 ! (3.14a) ' 'ML = arg max ' exp ' K X k=1 η'(V (k)) '' K '2 η' 2 '2 ! . (3.14b) Theorem 3.4 The likelihood maximization problems (3.14a) and (3.14b)

are strictly concave and the MLEs are given by ' ξML = P K

k=1 ηξ (U (k)) K'2η ξ (3.15a) ' 'ML = P K

k=1 η' (V (k)) K'2η ' . (3.15b) Hence, the MLE ' θ (o) ML for the clock offset can be written as ' θ (o) ML = ' ξML '' ' 'ML 2 . (3.16) Proof: See Appendix 3.A.2 32 3. Clock Offset Estimation: A Unified Framework Gaussian Distribution Theorem 3.4 finds an application in the case the likelihood functions fU

and fV follow a Gaussian distribution, i.e., fU (U(k) |ξ) ' N (ξ, ' 2 ξ ) and fV (V (k) |') ' N (', ' 2 ') [26] (see also Tab. 3.1). Using (3.15a) with (3.9) and (3.12), it follows that the MLE is given by ' ξML = P K

k=1 U (k) K . (3.17) By a similar reasoning, the MLE for ' (3.15b) is given by ' 'ML = P K

k=1 V (k) K . (3.18) Using (3.16), the MLE for the offset θ(o) can be expressed as ' θ (o) ML = P K

k=1(U (k) '' V (k)) 2N . (3.19) The above estimate coincides exactly with the one reported in [26]. Log-Normal Distribution When the likelihood functions of the readings U and V are log-normal (see

Tab. 3.1), Theorem 3.4 can still be applied, making use also of (3.10) and

(3.12), for finding the MLE of ξ, i.e., ' ξML = P K

k=1 log U (k) K . With a similar reasoning, the MLE for ' results to be ' 'ML = P K

k=1 log V (k) K . Finally, the ML estimator for θ(o) can be written as ' θ (o) ML = P K

k=1 (log U (k) '' log V (k)) K . (3.20) 3.4. Maximum Likelihood Estimation 33 3.4.2 Constrained Likelihood Using (3.5a), (3.5b) and (3.12), the constrained likelihood functions can be

written as fU (U |ξ) ' exp ξ K X k=1 ηξ(U(k)) '' K '2η ξ 2 ξ2 ! K Y k=1 I(U(k) '' ξ) (3.21a) fV (V |') ' exp ' K X k=1 η'(V (k)) '' K '2 η' 2 '2 ! K Y k=1 I(V (k) '' ') . (3.21b) The MLEs of ξ and ' can be expressed as ' ξML = arg max ξ exp ξ K X k=1 ηξ(U(k)) '' K '2 ηξ 2 ξ2 ! such that U(k) ' ξ (3.22a) ' 'ML = arg max ' exp ' K X k=1 η'(V (k)) '' K '2 η' 2 '2 ! such that V (k) ' ' . (3.22b) It proves useful to define the order statistics of the samples U(k) and V (k) as U(1) = min (U(1), . . . , U(K)) V(1) = min (V (1), . . . , V (K)) . Theorem 3.5 The likelihood maximization problems (3.22a) and (3.22b)

are strictly concave and the MLEs can be expressed as ' ξML = min P K

k=1 ηξ (U (k)) K'2 ηξ , U(1) ! (3.23a) ' 'ML = min P K

k=1 η' (V (k)) K'2η ' , V(1) ! . (3.23b) The MLE ' θ (o) ML for the clock offset is given by ' θ (o) ML = ' ξML '' ' 'ML 2 . (3.24) Proof: See Appendix 3.A.3. 34 3. Clock Offset Estimation: A Unified Framework Exponential Distribution An application of Theorem 3.5 can be found if the likelihood functions fU

and fV represent exponential distributions, with parameters λξ and λ', re-

spectively, i.e., fU ' E (λξ) and fV ' E (λ') (see also Tab. 3.1). Using (3.11) yields ' ξML = U(1) . (3.25) With a similar reasoning, ' 'ML = V(1) . Using (3.24), the MLE of θ(o) is given by ' θML = U(1) '' V(1) 2 , (3.26) thus recovering the result of [24], therein derived through graphical argu-

ments. 3.5 Statistical Bounds The purpose of this section is to provide statistical bounds to clock offset

estimation in the classical framework in order to evaluate the performance

of the MLEs derived in the previous sections in terms of the variance of the

estimation error. The upcoming analysis assumes that the likelihood functions fU and fV belongs to the exponential family of distributions without specifying any

shape for the log-partition function, therefore the provided analytical results

are indeed generally valid for a wide variety of distributions and almost all

the distribution of interest. Moreover, the upcoming results can be used in a

context different from the clock offset estimation, since they are self standing

in classical estimation theory. Both the case of unconstrained and constrained likelihood are taken into account for providing the necessary generality to the discussion of the

problem. The general expressions for the likelihoods of the observations

Z = [Z(1), . . . , Z(K)] T are given by Unconstrained Likelihood: fZ(Z; ρ) ' exp ρ K X k=1 η(Z(k)) '' K'(ρ) ! (3.27) 3.5. Statistical Bounds 35 Constrained Likelihood: fZ(Z; ρ) ' exp ρ K X k=1 η(Z(k)) '' K'(ρ) ! K Y k=1 I(Z(k) '' ρ) (3.28) where ρ is the scalar parameter to be estimated. In the following, the Cramer-

Rao and Chapman-Robbins bounds for the estimation error variance of the

deterministic parameter ρ are derived. 3.5.1 Cramer-Rao Lower Bound The Cramer-Rao lower bound (CRB) [47] provides a limit on the variance of

the estimation error for an unbiased estimator of a deterministic parameter.

Although it is a simple bound to compute, it relies on some ''regularity condi-

tions' which are not always satisfied. In particular, in the case of constrained

likelihoods (cf. (3.28)) the CRB cannot be computed, so that the variance

of the estimation error has to be described by a different bound. The CRB states that, given the condition E '' log fZ (Z; ρ) ''ρ = 0 for all ρ , (3.29) the variance of an unbiased estimator ' ρ of ρ is lower bounded by Var(' ρ) ' ''1 E h ''2 log f Z ( Z;ρ) ''ρ2 i , (3.30) Theorem 3.6 The CRB for the estimation error of ρ in the unconstrained

likelihood function in (3.27) is given by Var(' ρ) ' 1 K'2η (3.31) where '2 η = ''2' (ρ) ''ρ2 . Proof: The Fisher information is given by I(ρ) = E ''2 log fZ (Z; ρ) ''ρ2 = ''K ''2' (ρ) ''ρ2 = ''K' 2 η so that the proof readily follows. 36 3. Clock Offset Estimation: A Unified Framework 3.5.2 Chapman-Robbins Bound The Chapman-Robbins bound (CHRB) [51] provides a lower bound on the

variance of an estimator of a deterministic parameter. Differently from the

CRB, the CHRB does not make any assumptions on the regularity condition

(3.29) that often prevents from the use of the CRB. Moreover, the CHRB

is substantially tighter than the CRB in many cases. Hence, CHRB is here

used for determining a lower bound on the variance of an unbiased estimator

of ρ for constrained likelihood functions. Generally speaking, the CHRB for estimating a parameter ρ is given by Var(' ρ) ' " inf h 1 h2 ( E fZ(Z; ρ + h) fZ(Z; ρ) 2 '' 1 )# ''1 . (3.32) Theorem 3.7 The CHRB for the parameter ρ given the likelihood function

(3.28) can be expressed as Var(' ρ) ' inf h n (Mη(h))'' 2K · δK(h) '' 1 o h2 ''1 (3.33) where Mη(h) is the MGF of the statistic η(Z(k)) and δ(h) = E [exp (2hη(Z(k))) I (Z(k) '' ρ '' h)] (3.34) with the expectation taken with respect to any Z(k). Proof: The details of the proof are relegated to Appendix 3.A.4. 3.5.3 MSE and Variance of the Estimation Error The performance bounds computed above provide lower limits to the variance

of the estimation error of ξ and '. Such results can be used to lower bound

the mean squared error (MSE) of the clock offset θ(o), MSE 'θ(o) = E 'θ(o) '' θ(o) 2 . In fact, by using (3.3), the following result is immediate. Proposition 3.8 The MSE of any estimator of θ(o) can be expressed as MSE 'θ(o) = 1 4 Var ' ξ + Var ' ' + 1

4 (bξ '' b') 2 where bξ and b' [47] are the biases of the estimators ' ξ and ' ', respectively. 3.6. Numerical Results 37 Gaussian Distribution - CRB If the likelihood function for ξ is Gaussian distributed (see Tab. 3.1), then

using (3.31), it results that the CRB for an unbiased estimator ' ξ is given by Var ' ξ ' '2 ξ K , with a similar expression for the CRB in estimating ' as well. Using Propo-

sition 3.8, it follows that MSE 'θ(o) ' '2 ξ + ' 2 ' 4K . (3.35) As a remark, ' θ (o) ML (3.19) is an efficient estimator since its MSE achieves (3.35) with equality (cf. Appendix 3.A.5). Exponential Distribution - CHRB If the likelihood for ξ is exponentially distributed (see Tab. 3.1), using (3.8)

it can be easily shown that Mη ξ (U ) (h) = 1 and (3.34) becomes δ(h) = exp (λξh) , so that the CHRB (3.33) can be rewritten as Var ' ξ ' inf h exp (λξhK) '' 1 h2 ''1 = 0.6476 λ2 ξ K 2 and similarly for ' '. Using Proposition 3.8, it holds that MSE 'θ(o) = 1 4 Var ' ξ + Var ' ' + 1

4 (bξ '' b') 2 ' 0.162 K2 1 λ2 ξ + 1 λ2 ' ! + 1

4 (bξ '' b') 2 . (3.36) 3.6 Numerical Results In this section the theoretical findings of this chapter are corroborated by

numerical simulations. In particular, performance of the estimator in case of 38 3. Clock Offset Estimation: A Unified Framework log-normal likelihood is analysed, as well as the MSE in all cases is compared

with the statistical bounds computed in Section 3.5. The setting parameters

are given by 'ξ = '' = 0.1 for both Gaussian and log-normally distributed

likelihoods, while λξ = λ' = 10 for exponentially distributed likelihood func-

tions. 3.6.1 MSE Performance and Bounds Comparing the performance of the proposed estimator with the statistical

bounds is useful to provide a measure of correctness and efficiency of MLEs

derived in Section 3.4 in the case of Gaussian and exponentially distributed

likelihoods. Fig. 3.2 reports the MSE in estimating the clock offset using the

MLEs for Gaussian and exponential likelihoods, (3.19) and (3.26), respec-

tively (the log-normal case is analogous to the Gaussian case, therefore it

is not reported here). As benchmarks, the CRB and CHRB are also shown

in the plot. First, it is evident that the MLE ' θ (o) ML in the case of Gaussian likelihood is efficient since the MSE reaches the CRB and the CHRB, the

two being coincident. On the other hand, in the exponential case the CRB

cannot be computed because the regularity assumption (3.29) is not satis-

fied, therefore only the CHRB is shown in figure. The MSE of the MLEs

in this case is close, but not coincident, to the CHRB. A further fair com-

parison which can be made is among the Gaussian and the exponential case,

since the parameter choice is such that the variance of the observations is the

same: from the shown curves it is evident that the MSE in the exponential

case is lower than that in the Gaussian case as the number of observations K

increases. This behaviour can be verified by the MSE expressions (3.37) and

(3.38), reported in Appendix 3.A.5: in the Gaussian case, the MSE decreases

proportionally to 1/K, while in the exponential distribution case it decreases

proportional to 1/K2. 3.6.2 MSE Robustness with Log-Normally Distributed

Likelihood So far, no clock offset estimator has been proposed in the literature in case

the likelihood function of the readings is log-normal. The proposed estimator ' θ (o) ML (3.20) can then be used to determine an estimate of the clock offset in the maximum likelihood sense. Until now, if the likelihood of the readings

was log-normal, the available estimators were just those assuming the likeli-

hoods to be either Gaussian or exponentially distributed, therefore they are

expected to perform worse than the estimator (3.20) matched to the case of 3.6. Numerical Results 39 0 5 10 15 20 25 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 K MSE( θ(o) ) MLE '' Gaussian

CRB '' Gaussian

CHRB '' Gaussian

MLE '' Exp. Distr.

CHRB '' Exp. Distr. Figure 3.2: MSE and bounds for estimating θ(o) by using the MLE with

Gaussian and exponentially distributed likelihood. 40 3. Clock Offset Estimation: A Unified Framework 0 5 10 15 20 25 10 ''4 10 ''3 10 ''2 10 ''1 10 0 10 1 K MSE( θ(o) ) MLE '' Log''normal

MLE '' Gaussian Based

MLE '' Exp. Distr. Based

CRB

CHRB Figure 3.3: MSE robustness (in terms of the likelihood functions) and bounds

for the estimation of θ(o) in case of log-normally distributed likelihood. log-normal likelihood. Fig. 3.3 compares the MSE of the new proposed estimator (3.20) with that of the MLEs that (wrongly) assume the likelihood to be either Gaussian

or exponentially distributed, when the actual likelihood of the readings is log-

normally distributed. The figure shows how the two latter approaches are not

robust with respect to the likelihood distribution, since their performance is

extremely poor. Moreover, it can be observed how the proposed MLE (3.20)

is efficient since it achieves the CRB (as well as the CHRB). 3.7 Conclusions The new proposed framework allows to uniformly treat different likelihood

functions, by exploiting the common properties among them, and using no-

tation and fundamental properties of the exponential family of distributions.

This generalization process allows to tackle the ML estimation problem in a

more general fashion, which can be specialized at the specific need depend-

ing on the behaviour of the likelihood functions. This particular approach is 3.7. Conclusions 41 naturally based on convex optimization, therefore in the exponential distri-

bution case it represents an alternative view to the problem of clock offset

ML estimation with respect to what has been done up to now, under this

specific assumption [24]. Statistical bounds are provided for all the cases

of interests, while numerical results confirm the performance of the derived

MLEs. Moreover, it is worth to highlight that the provided framework allows to derive the expression of θ (o) ML in case the likelihood functions of the readings are log-normal. This result is new in the literature, although the log-normal

distribution is among the candidates for modelling the distribution of the

random portions of the delay. Simulation results show that the new MLE

performs better than the state-of-the-art in clock offset estimation. The ML estimation method outlined above differs from the previous work [24] in that it is based on convex optimization, while [24] maximized the

likelihood graphically. Hence, this approach presents an alternative view of

the ML estimation of the clock offset. It also allows us the determine the ML

estimator of θ(o) when the likelihood function is log-normally distributed.

In addition, Theorem 3.5 also provides a stepping stone for the findings

described in Chapter 4, where estimation of θ(o) in the Bayesian regime is

discussed. 42 3. Clock Offset Estimation: A Unified Framework 3.A Appendix 3.A.1 Proof of Theorem 3.2 1. N ξ, ' 2 ξ : fU (U |ξ) ' exp " '' P K

k=1 (U (k) '' ξ) 2 2'2 ξ # ' exp " '' P K

k=1 (U 2(k) '' 2ξU(k) + ξ2) 2'2 ξ # ' exp " ξ K X k=1 U(k) '2 ξ '' K ξ2 2'2 ξ # , so that (3.9) follows from comparison with (3.4a). 2. log N ξ, ' 2 ξ : fU (U |ξ) ' exp " '' P K

k=1 (log U (k) '' ξ) 2 2'2 ξ # ' exp " '' P K

k=1 log 2 U(k) '' 2ξ log U(k) + ξ2 2'2 ξ # ' exp " ξ K X k=1 log U(k) '2 ξ '' K ξ2 2'2 ξ # , therefore (3.10) follows, again from comparison with (3.4a). 3. E (λξ): fU (U |ξ) ' exp " ''λξ K X k=1 (U(k) '' ξ) # K Y k=1 I (U(k) '' ξ) ' exp " ''λξ K X k=1 U(k) + ξ K X k=1 λξ # K Y k=1 I (U(k) '' ξ) ' exp " ξ K X k=1 λξ # K Y k=1 I (U(k) '' ξ) , which leads to (3.11) through comparison with (3.5a). 3.A. Appendix 43 3.A.2 Proof of Theorem 3.4 The MLE of ξ (3.15a) can be determined by maximizing the exponent of

(3.14a), since it can be easily proved as strictly concave [52]. Setting to zero

the first derivative of the exponent of (3.14a) leads to (3.15a). A similar

reasoning for the MLE (3.14b) of ' brings to (3.15b). Finally, the MLE of

θ(o) (3.16) is derived using (3.3) and the invariance principle [ 47]. 3.A.3 Proof of Theorem 3.5 The strict concavity of the maximization problems (3.22a) and (3.22b) follows

from noting that the constraints U(k) ' ξ (resp. V (k) ' ') are linear functions of ξ (resp. ') and using similar arguments to those used in the

proof of Theorem 3.4. Moreover, the K constraints can be reduced to a single

constraint in each problem, by using the order statistics of the observations,

i.e., ' ξML = arg max ξ exp ξ K X k=1 ηξ(U(k)) '' K '2η ξ 2 ξ2 ! such that U(1) ' ξ ' 'ML = arg max ' exp ' K X k=1 η'(V (k)) '' K '2 η' 2 '2 ! such that V(1) ' ' . The solution of the unconstrained problem in (3.22a), ¯ ξ, is given by (3.15a). If ¯ ξ ' U(1), then the solution of (3.22a) is given by ' ξML = ¯ ξ. On the other hand, being the objective function concave, if U(1) ' ¯ ξ, then the ML estimate is given by ' ξML = U(1). The combination of the two cases leads to (3.23a). A similar reasoning for ' yields (3.23b). Finally, the MLE of θ(o) (3.24) is

derived using (3.3) and the invariance principle [47]. 44 3. Clock Offset Estimation: A Unified Framework 3.A.4 Proof of Theorem 3.7 The ratio of the likelihood functions can be expressed as fZ(Z; ρ + h) fZ(Z; ρ) = e( (ρ+h) P K

k =1 η(Z (k))''K '(ρ+h)) QK k=1 I(Z (k) '' ρ '' h) e( ρ P K

k =1 η(Z (k))''K '(ρ)) Q K

k =1 I(Z (k)''ρ) =e( h P K

k =1 η(Z (k))''K '(ρ+h)+K '(ρ) ) K Y k=1 I(Z(k) '' ρ '' h) =e( h P K

k =1 η(Z (k))) e(''K('(ρ+h)+'(ρ))) K Y k=1 I(Z(k) '' ρ '' h) . The expectation of the ratio of the likelihood functions can now be calculated

as E fZ(Z; ρ + h) fZ(Z; ρ) 2 = E " e( 2h P K

k =1 η(Z (k))) e(''2K('(ρ+h)+'(ρ))) K Y k=1 I(Z(k) '' ρ '' h) # = e(''2K('(ρ+h)+'(ρ)))E " e( 2h P K

k =1 η(Z (k))) K Y k=1 I(Z(k) '' ρ '' h) # = Mη(Z)(h) ''2K E " e( 2h P K

k =1 η(Z (k))) K Y k=1 I(Z(k) '' ρ '' h) # where it follows from (3.8) that Mη(Z)(h) ''2K = e''2K('(ρ+h)'''(ρ)) . Since the samples Z(k) are i.i.d, E " e( 2h P K

k =1 η(Z (k)) ) K Y k=1 I(Z(k) '' ρ '' h) # = E e(2hη(Z(k)))I(Z(k) '' ρ '' h) K . With δ( ·) defined in the theorem, the proof is complete. 3.A. Appendix 45 3.A.5 MSE Expressions for ML Estimators Gaussian Distribution If the likelihood for ξ is Gaussian distributed (see Tab. 4.1), the MLE is

given by (3.17). Since the variance of the readings Uj is '2ξ and the MLE

(3.17) is unbiased, it is straightforward to see that MSE ' ξML = '2 ξ K , and similarly for ' 'ML. Given (3.16) and Proposition 3.8, it can be concluded that MSE 'θ(o) ML = '2 ξ + ' 2 ' 4K . (3.37) Exponential Distribution If the likelihood for ξ is exponential distributed (see Tab. 4.1), the MLE is

given by (3.25). Through simple algebra it can be seen that U(1) is exponen-

tially distributed with parameter λ '

ξ = λξ K , so that Var ' ξML = 1 λ2 ξ K 2 . It can be noticed that ' ξML is a biased estimator for ξ, with bias bξ,ML = 1 λξK . Similarly, Var ' 'ML = 1 λ2 ' K 2 and b ',ML = 1 λ'K . Therefore, given (3.16) and Proposition 3.8, it can be concluded that MSE 'θ(o) ML = 0.25 K2 1 λ2 ξ + 1 λ2 ' ! + 0.25 K2 1 λξ '' 1 λ' 2 . (3.38) 46 3. Clock Offset Estimation: A Unified Framework Chapter 4 A Factor Graph-Based Clock

Offset Estimator The results presented in this chapter refer to a collaboration of Prof. Van-

gelista and me with Prof. Erchin Serpedin and Aitzaz Ahmad, both of them

with the Department of Electrical and Computer Engineering, Texas A&M

University, College Station, TX (USA). The research work has been per-

formed while I was on leave as a ''Visiting Scholar' at Texas A&M University,

from January to July 2011, partially supported by an ''A. Gini' fellowship.

Results have been summarized in two papers, one submitted to the journal

IEEE Transactions on Information Theory [48] and the other accepted for

publication at the conference ICASSP 2012 [53]. 4.1 Introduction This chapter presents a new clock offset estimator based on a Bayesian version

of the new framework introduced in Chapter 3. The estimation procedure

assumes a time-varying clock offset and it makes use of the factor graph

theory together with message passing for deriving a closed-form expression

of the estimator. Moreover, in order to measure the performance of the

estimator, Bayesian estimation bounds are derived for the MSE and existing

estimators are implemented, then showing that the new proposed estimator

performs better than the state-of-the-art in clock offset estimators and it

provides performance really close to the computed lower bounds. The assumptions made in Chapter 3 refer to the absence of clock skew between node pairs in a WSN and time-invariant clock offsets. Although this

model permits to derive a common framework and closed-form expressions

for clock offset estimators, it is generally not well suited to real world scenar- 47 48 4. A Factor Graph-Based Clock Offset Estimator ios, where climate conditions change over time, but they are not kept into

consideration. In fact, with fixed clock offsets, only fabrication imperfections

are modelled. Considering the fixed portion of the delay to be time-invariant

is also quite restrictive because it does encompass neither slow mobility of

the wireless nodes nor shadowing effects in urban environments. To deal with offset evolution over time, here it is assumed that clock offset evolves obeying to a Gauss-Markov model [29, 30, 42]. The Bayesian

estimation procedure is performed by the application of the factor graph

theory with the message passing algorithm. The basics of factor graphs are

explained in the next section. 4.2 Factor Graphs and Message Passing Factor graphs are bipartite graphical models used to better represent the

relations among variables in a global function. The global function is decom-

posed in a set of local sub-functions, called factors, each of them dependent

on a subset of variables. Generally, each factor is represented by a factor

node and each variable by an edge. The fundamental rule is that an edge is

connected to a particular factor node if and only if the variable associated to

that edge is an argument of the factor associated to that factor node [54]. Message passing algorithms are used to perform inference in factor graphs by exchanging messages (beliefs) among nodes in the graph. The sum-product

and the max-product algorithm are examples of message passing algorithms

in which summation and maximization are performed respectively at each

factor node. Here, the focus is on the max-product algorithm, and the ex-

changed messages are as follows variable to factor node : mx''f (x) = Y h ''n(x)\f mh''x (x) (4.1) factor node to variable : mf''x (x) = max \{x} f (Z) Y z ''n(f)\{x} mz''f (z) (4.2) where Z = n (f ) is the subset of variables from which the local function

f is dependent. The marginal distribution, or the belief, associated with a

variable is obtained by performing the product of all the incoming messages

to that variable node. 4.3. A Factor Graph-based Estimator 49 4.3 A Factor Graph-based Estimator The time-varying behaviour of the clock offset and the fixed portion of the

delay is kept into account by assuming a Gauss-Markov process for both the

parameters ξ and ', i.e., ξ(k) = ξ(k '' 1) + w(k) (4.3) '(k) = '(k '' 1) + v(k) for k = 1, . . . , K (4.4) with w(k) and v(k) i.i.d and w(k), v(k) ' N (0, ' 2). The Gauss-Markov model for the clock offset has already found application in previous works on

clock synchronization [29, 30, 42]. Under the Bayesian assumption, the a-posteriori pdf can be expressed as fa(ξ, ' |U, V ) ' fp(ξ, ')f''(U, V |ξ, ') = fp,ξ(ξ(0)) K Y k=1 fξ(ξ(k) |ξ(k '' 1))fp,'('(0)) K Y k=1 f'('(k) |'(k '' 1)) · K Y k=1 fU (U(k) |ξ(k))fV (V (k)|'(k)) (4.5) where the a-priori distributions fp,ξ(ξ(0)) and fp,'('(0)) are assumed uni-

form, i.e., they can be considered constant. To make the notation more

readable, the following definitions hold: ζk k ''1 = fξ(ξ(k)|ξ(k '' 1)) ' N (ξ(k '' 1), ' 2) , νk k ''1 = f'('(k)|'(k '' 1)) ' N ('(k '' 1), ' 2) , 'k = fU (U(k) |ξ(k)) , 'k = fV (V (k) |'(k)) . In the following treatment, only the case of constrained likelihood is anal- ysed, since the unconstrained likelihood is a subset of the constrained case,

as it will be clarified shortly. The common framework introduced in Chapter 3 is adapted to the Bayesian framework, therefore the likelihood functions fU and fV are assumed obey-

ing to the exponential family of distributions (3.21), so that they can be 50 4. A Factor Graph-Based Clock Offset Estimator expressed as fU (U(k) |ξ(k)) ' exp ξ(k)ηξ(U(k)) '' '2η ξ,k 2 ξ 2(k) ! I(U(k) '' ξ(k)) fV (V (k) |'(k)) ' exp '(k)η'(V (k)) '' '2η ',k 2 '2(k) ! I(V (k) '' '(k)) . (4.6) In Fig. 4.1, the a-posteriori pdf (4.5) is represented via factor graphs. As an example of factor graph representation of Gauss-Markov models, graphical

phase estimation has been performed in [55]. It is fundamental to notice

that the structure of the pdf naturally leads to a separation between terms

dependant on ξ(k) only and terms dependant on '(k) only. This means

that the resulting factor graph is actually composed by two sub-graphs, not

communicating to each other. This is the consequence of the decoupling

between the two parameters derived in Chapter 3 and still valid. Moreover,

through inspection of (4.5) it can be inferred that the two sub-graphs have a

similar structure, therefore in the following the message passing is described

for ξ only, being the case for ' completely equivalent. Finally, the sub-graphs

in Fig. 4.1 are cycle-free, therefore inference through message passing in these

graphs is optimal [54]. To make the notation easier, it is convenient to define the following quan- tities αξ,k = '' '2η ξ,k 2 , βξ,k = ηξ(U(k)) so that the constrained likelihood function 'k can be expressed as 'k ' exp αξ,kξ 2(k) + βξ,kξ(k) I(U(k) '' ξ(k)) . (4.7) The next step is to apply the message passing mechanism to derive an estimate for θ(o)(K). 4.3.1 Message Computation The application of the max-product algorithm leads to message passing be-

tween variable nodes and factor nodes. The procedure begins at the factor

node 'K, from which a message is sent to the variable ξ(K) (see Fig. 4.2),

i.e., m' K ''ξ(K) = 'K . 4.3. A Factor Graph-based Estimator 51 ζ0 ζ1 0 ξ(0) ξ(1) '1 ζK''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K ν0 ν1 0 '(0) '(1) '1 νK''1 K ''2 νK K ''1 '(K '' 1) 'K''1 '(K) 'K Figure 4.1: Factor graph representation of the density (4.5). Distribution fU (U(k) |ξ) ηξ(U(k)) 'ξ(ξ(k)) '2η ξ N (ξ(k), ' 2 ξ ) ' exp h '' (U (k) ''ξ(k)) 2 2'2 ξ i U (k) 2'2 ξ ξ2(k) 2'2 ξ 1 '2 ξ log N (ξ(k), ' 2 ξ ) ' exp h '' (log U (k) ''ξ(k)) 2 2'2 ξ i log U (k) 2'2 ξ ξ2(k) 2'2 ξ 1 '2 ξ λξ exp [ ''λξ (U(k) '' ξ(k))] · E (λξ) λξ 0 0 ·I (U(k) '' ξ(k)) Table 4.1: Specialization of the proposed new general framework for time-

varying parameters. 52 4. A Factor Graph-Based Clock Offset Estimator ζ0 ζ1 0 ξ(0) ξ(1) '1 ζK''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.2: The factor 'K sends the message m' K ''ξ(K) to the variable ξ(K). ζ0 ζ10 ξ(0) ξ(1) '1 ζ K ''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.3: The variable ξ(K) sends the message mξ(K)''ζK K''1 to the factor ζK K ''1. ζ0 ζ10 ξ(0) ξ(1) '1 ζK''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.4: The factor ζK K ''1 sends the message mζ K K''1''ξ(K ''1) to the variable ξ(K '' 1). 4.3. A Factor Graph-based Estimator 53 Such message is then relayed to the factor node ζK K ''1 (see Fig. 4.3), i.e., mξ(K)''ζK K''1 = 'K . which computes the product of such message with the factor ζK K ''1. The resulting message is then sent to the variable ξ(K ''1) after the maximization (cf. (4.2)) process, performed over the variable ξ(K) (see Fig. 4.4), i.e., mζK K''1 ''ξ(K ''1) ' max ξ(K) ζK K ''1 · mξ(K)''ζ K K''1 = max ξ(K) 1 '' 2''2 exp ''(ξ(K) '' ξ(K '' 1)) 2 2'2 · exp αξ,Kξ 2(K) + βξ,Kξ(K) I(U(K) '' ξ(K)) that can be rewritten in a more general form as mζK K''1 ''ξ(K ''1) ' max ξ(K) 'U(K) exp Aξ,Kξ 2(K) + Bξ,Kξ2(K '' 1)+ + Cξ,Kξ(K)ξ(K '' 1) + Dξ,Kξ(K) (4.8) with Aξ,K = '' 1 2'2 + αξ,K , Bξ,K = '' 1 2'2 Cξ,K = 1 '2 , Dξ,K = βξ,K . (4.9) To determine the message (4.8) the procedure is the same as that followed

in the proof of Theorem 3.5 with a single available reading. Indeed, by

indicating with ¯ ξ(K) the unconstrained maximizer of the exponent in the objective function, i.e., ¯ ξ(K) = arg max ξ(K) Aξ,Kξ 2(K)+Bξ,Kξ2(K''1)+Cξ,Kξ(K)ξ(K''1)+Dξ,Kξ(K) , it follows that ¯ ξ(K) = '' Cξ,Kξ(K '' 1) + Dξ,K 2Aξ,K , (4.10) therefore the solution of the maximization problem in (4.8) is given by ' ξ(K) = min ¯ ξ(K), U(K) . If ¯ ξ(K) ' U(K), the estimation problem is solved. If instead ¯ ξ(K) < U(K), then the chain has to be traversed backward. On the other hand, the expres-

sion of ¯ ξ(K) depends on ξ(K '' 1) which is not determined at this stage. To 54 4. A Factor Graph-Based Clock Offset Estimator ζ0 ζ10 ξ(0) ξ(1) '1 ζ K ''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.5: The variable ξ(K '' 1) sends the message mξ(K''1)''ζK''1 K''2 to the factor ζ K ''1 K ''2 . complete the computation, further messages need to be propagated backward

in the graph from the variable node ξ(K '' 1). It is now assumed that ¯ ξ(K) ' U(K), therefore ¯ ξ(K) from (4.10) is plugged back in (4.8) to obtain a modified version of (4.8), that is mζK K''1''ξ(K ''1) ' exp ( Bξ,K '' C2 ξ,K 4Aξ,K ξ2(K '' 1) '' Cξ,KDξ,K 2Aξ,K ξ(K '' 1) ) . (4.11) The next step is the message sent to the factor node ζK''1 K ''2 from the variable node ξ(K '' 1), which is the product between the message mζK K''1''ξ(K ''1) and the message received from the factor node 'K''1 (see Fig. 4.5), i.e., m ξ(K ''1)''ζ K''1 K''2 = mζK K''1''ξ(K ''1) · m' K''1''ξ(K ''1) . At this stage, the factor node ζK''1 K ''2 takes the maximization over ξ(K '' 1) of the product between the incoming message and the factor ζK''1 K ''2 , in order to deliver the following message to the variable node ξ(K '' 2): m ζ K''1 K''2 ''ξ(K ''2) ' max ξ(K ''1)'U(K''1) ζK''1 K ''2 · mξ(K''1)''ζ K''1 K''2 = max ξ(K ''1) 1 '' 2''2 exp '' (ξ(K '' 1) '' ξ(K '' 2)) 2 2'2 · exp Bξ,K '' C2 ξ,K 4Aξ,K ξ2(K '' 1) '' Cξ,KDξ,K 2Aξ,K ξ(K '' 1) · exp αξ,K''1ξ 2(K '' 1) + βξ,K''1ξ(K '' 1) I(U(K '' 1) '' ξ(K '' 1)) . As before, the message above can be rewritten in terms of A, B, C and D as m ζ K''1 K''2 ''ξ(K ''2) ' max ξ(K ''1)'U(K''1) exp(Aξ,K''1ξ 2(K '' 1)+ Bξ,K''1ξ(K '' 2) 2 + Cξ,K''1ξ(K '' 1)ξ(K '' 2) + Dξ,K''1ξ(K '' 1)) (4.12) 4.3. A Factor Graph-based Estimator 55 with Aξ,K''1 = '' 1 2'2 + αξ,K''1 + Bξ,K '' C2 ξ,K 4Aξ,K , Bξ,K''1 = '' 1 2'2 , Cξ,K''1 = 1 '2 Dξ,K''1 = βξ,K''1 '' Cξ,KDξ,K 2Aξ,K . As before, one can compute the unconstrained maximizer ¯ ξ(K '' 1) of the objective function above, which turns out to be given by ¯ ξ(K '' 1) = '' Cξ,K''1ξ(K '' 2) + Dξ,K''1 2Aξ,K''1 and the solution to the maximization problem (4.12) is again expressed as ' ξ(K '' 1) = min ¯ ξ(K '' 1), U(K '' 1) . In the same way as previously seen, ¯ ξ (K '' 1) depends on ξ(K '' 2) therefore at least another set of messages must be passed backward on the factor graph

(see Fig. 4.6). Once ¯ ξ(K '' 1) is plugged back in (4.12), it holds that m ζ K''1 K''2 ''ξ(K ''2) ' exp Bξ,K''1 '' C2 ξ,K ''1 4Aξ,K''1 ξ 2(K '' 2) '' Cξ,K''1Dξ,K''1 2Aξ,K''1 ξ(K '' 2) . (4.13) Note that the above has a form similar to (4.11). By keeping traversing the

graph, messages similar to (4.11) and (4.13) are obtained, with opportune

definitions for the quantities A, B, C and D. A generalization of what has just been seen leads to, for i = 1, . . . , K '' 1 Aξ,K''i = '' 1 2'2 + αξ,K''i + Bξ,K''i+1 '' C2 ξ,K ''i+1 4Aξ,K''i+1 Bξ,K''i = '' 1 2'2 , Cξ,K''i = 1 '2 Dξ,K''i = βξ,K''i '' Cξ,K''i+1Dξ,K''i+1 2Aξ,K''i+1 (4.14) and ¯ ξ(K '' i) = '' Cξ,K''iξ(K '' i '' 1) + Dξ,K''i 2Aξ,K''i (4.15) ' ξ(K '' i) = min ¯ ξ(K '' i), U(K '' i) . (4.16) 56 4. A Factor Graph-Based Clock Offset Estimator ζ0 ζ10 ξ(0) ξ(1) '1 ζ K ''1 K ''2 ζK K ''1 ξ(K '' 1) 'K''1 ξ(K) 'K Figure 4.6: The factor ζK''1 K ''2 sends the message mζ K''1 K''2 ''ξ(K ''2) to the variable ξ(K '' 2). The message passing keeps going until the end of the graph is reached, which

means that, for i = K '' 1 and using (4.15) and (4.16) one obtains ¯ ξ(1) = '' Cξ,1ξ(0) + Dξ,1 2Aξ,1 (4.17) ' ξ(1) = min ¯ ξ(1), U(1) . (4.18) Similarly, it follows that mζ1 0 ''ξ0 ' exp Bξ,1 '' C2 ξ,1 4Aξ,1 ξ 2(0) '' Cξ,1Dξ,1 2Aξ,1 ξ(0) . (4.19) The estimate ' ξ(0) can be obtained by maximizing the above message, which naturally leads to an unconstrained maximization issue, thus yielding ' ξ(0) = ¯ ξ(0) = max ξ(0) mζ1 0 ''ξ(0) '' ' ξ(0) = Cξ,1Dξ,1 4Aξ,1Bξ,1 '' C2ξ,1 . (4.20) Now that the estimate in (4.20) is fully determinable from the observations:

its value can be used to solve for ' ξ(1) in (4.18) and it can then be plugged back in the chain using the recursions (4.15) and (4.16). 4.3.2 Closed-Form Expressions For ' θ (o) ML(K ) The aim of this section is to provide a concise expression for the estimator ' θML(K), by defining new quantities: given the linear function of x gξ,k(x) = '' Cξ,kx + Dξ,k 2Aξ,k , (4.21) 4.3. A Factor Graph-based Estimator 57 which has a key property, summarized in the following lemma. Lemma 4.1 For real numbers x1 and x2, the function gξ,k( ·), defined in (4.21), satisfies gξ,k (min(x1, x2)) = min (gξ,k(x1), gξ,k(x2)) . Proof: The constants Aξ,k, Cξ,k and Dξ,k are defined in (4.9) and (4.14).

The proof follows by observing that ''Cξ,k 2Aξ,k > 0, which implies that gξ,k is a monotonically increasing function. Using the just defined function gξ,k, the following chain of equalities can be stated as ¯ ξ(1) = gξ,1 ' ξ(0) ' ξ(1) = min U(1), gξ,1 ' ξ(0) ¯ ξ(2) = gξ,2 ' ξ(1) ' ξ(2) = min U(2), gξ,2 ' ξ(1) with gξ,2 ' ξ(1) = gξ,2 min U(1), gξ,1 ' ξ(0) = min gξ,2 (U(1)) , gξ,2 gξ,1 ' ξ(0) (4.22) where (4.22) holds thanks to Lemma 4.1. In turn, the estimate ' ξ(2) can be rewritten as ' ξ(2) = min U(2), min gξ,2 (U(1)) , gξ,2 gξ,1 ' ξ(0) = min U(2), gξ,2 (U(1)) , gξ,2 gξ,1 ' ξ(0) . With the same procedure, it holds ' ξ3 = min U(3), gξ,3 (U(2)) , gξ,3 (gξ,2 (U(1))) , gξ,3 gξ,2 gξ,1 ' ξ(0) . For m ' j, it is convenient to define the function Gm ξ,j (.) = gξ,m (gξ,m''1 . . . gξ,j (.)) , (4.23) 58 4. A Factor Graph-Based Clock Offset Estimator so that the estimate ' ξ3 can be compactly rewritten as ' ξ3 = min U(3), G3 ξ,3 (U (2)) , G 3 ξ,2 (U (1)) , G 3 ξ,1 ' ξ(0) . It must be noted that the just presented estimators only depends on functions of data, therefore they are fully determined. By keeping on plug-

ging back the quantities in the chain, the estimate for ξ(K) can be readily

obtained. Analogous expressions can be obtained for ', by keeping in mind that A',K''i, B',K''i, C',K''i and D',K''i for i = 0, . . . , K ''1, can be obtained from

(4.9) and (4.14) by substituting αξ,K''i and βξ,K''i with the proper α',K''i and

β',K''i, respectively. By making use of these constants, ' '(0), g',k and Gm ',j can be defined analogously to (4.20), (4.21) and (4.23). The closed form expression for the clock offset estimate ' θ(o) (K) is given by the following theorem. Theorem 4.2 The state estimates ' ξ(K) and ' '(K) for the posterior pdf in (4.5) can be expressed as ' ξ(K) = min U(K), G K ξ,K (U (K '' 1)) , . . . , G K ξ,2 (U (1)) , G K ξ,1 ' ξ(0) ' '(K) = min V (K), GK ',K (V (K '' 1)) , . . . , G K ξ,2 (V1) , G K ξ,1 ' '(0) and the factor graph-based clock offset estimator (FGOE) ' θ(o)(K) is given by ' θ(o)(K) = ' ξ(K) '' ' '(K) 2 . (4.24) Proof: The proof follows from the previous discussion and by making use of

(3.3). As it has been previously remarked, the closed form expressions in The- orem 4.2 allow to estimate the time-varying clock offset θ(o)(K), when the

likelihood functions, f (U(k) |ξ(k)) and f(V (k)|'(k)), have either a Gaussian, exponential or log-normal distribution. The following sections present the an-

alytical results when the general framework here adopted is particularized to

all these three distributions. Remark If the likelihood functions fU (U(k) |ξ(k)) and fV (V (k)|'(k)) are unconstrained, it follows that the solution of the maximization problem max ξ(k) exp Aξ,kξ 2(k) + Bξ,kξ2(k '' 1) + Cξ,kξ(k)ξ(k '' 1) + Dξ,kξ(k) 4.3. A Factor Graph-based Estimator 59 coincides with the unconstrained maximizer, i.e., ' ξ(k) = ¯ ξ(k) ''k = 1, . . . , K. Hence, the unconstrained likelihood maximization problem is subsumed in the

message passing framework for constrained likelihood maximization. Gaussian Distribution In case of Gaussian likelihoods (see Tab. 4.1), the functions fU (U(k) |ξ(k)) and fV (V (k) |'(k)) are unconstrained. By keeping in mind this, it follows from Theorem 4.2 that ' ξ(K) for Gaussian distributed observations U(k) is given by ' ξ(K) = G K ξ,1 ' ξ (0) . To determine the constants in (4.9) and (4.14) it is sufficient to compare the

Gaussian pdf (see Tab.4.1) with (4.7), to get αξ,k = '' 1 2'2 ξ,k , βξ,k = U(k) '2 ξ,k , (4.25) and using these values in (4.9) and (4.14), thus obtaining Aξ,K = '' 1 2'2 '' 1 2'2 ξ,K , Bξ,K = '' 1 2'2 Cξ,K = 1 '2 , Dξ,K = U(K) '2 ξ,K (4.26) Aξ,K''i = '' 1 2'2 '' 1 2'2 ξ,K ''i + Bξ,K''i+1 '' C2 ξ,K ''i+1 4Aξ,K''i+1 Bξ,K''i = '' 1 2'2 , Cξ,K''i = 1 '2 Dξ,K''i = U(K '' i) '2 ξ,K ''i '' Cξ,K''i+1Dξ,K''i+1 2Aξ,K''i+1 for i = 1, . . . , K '' 1. Using similar arguments, the estimate ' '(K) is given by ' '(K) = GK ',1 ' '(0) . Finally, the FGOE ' θ(o)(K), is expressed (3.3) as ' θ(o)(K) = GK ξ,1 ' ξ(0) '' G K ',1 ' '(0) 2 . (4.27) 60 4. A Factor Graph-Based Clock Offset Estimator It is interesting to observe the behaviour of the FGOE ' θ(o)(K) when the Gauss-Markov model variance '2 tends to zero. Consider gξ,K(ξ) = '' Cξ,Kξ + Dξ,K 2Aξ,K where the constants Aξ,K, Bξ,K, Cξ,K and Dξ,K are determined by (4.26).

After some manipulation, it holds that gξ,K(ξ) = '2 ξ ξ + ' 2U(K) '2 ξ + ' 2 , and it is quite straightforward to see that lim '2 ''0 gξ,K(ξ) = ξ , Similarly, gξ,K''1(ξ) '' ξ as ' 2 '' 0. It can be shown that in the low system noise regime, as '2 '' 0, it holds that ' ξ(K) '' ' ξ(0) = Cξ,1Dξ,1 4Aξ,1Bξ,1 '' C2ξ,1 . The same relations are valid for ', for which it can be shown that ' '(K) '' ' '(0) = C',1D',1 4A',1B',1 '' C2',1 . Therefore ' θ(o)(K) '' ' ξ(0) '' ' '(0) 2 , which can be proven to be equal to the MLE (3.19). Log-Normal Distribution When the likelihood functions are log-normally distributed (see Tab. 4.1), it

holds that αξ,k = '' 1 2'2 ξ,k βξ,k = log U(k) '2 ξ,k . It is clear that the only difference with the Gaussian case is a different ex-

pression of βξ,k. Being an unconstrained case of likelihood functions, ' ξ(K) in this case is again ' ξ(K) = GK ξ,1 ' ξ(0) , 4.3. A Factor Graph-based Estimator 61 with the constants in (4.9) and (4.14) given by Aξ,K = '' 1 2'2 '' 1 2'2 ξ,K , Bξ,K = '' 1 2'2 Cξ,K = 1 '2 , Dξ,K = log U(K) '2 ξ,K Aξ,K''i = '' 1 2'2 '' 1 2'2 ξ,K ''i + Bξ,K''i+1 '' C2 ξ,K ''i+1 4Aξ,K''i+1 Bξ,K''i = '' 1 2'2 , Cξ,K''i = 1 '2 Dξ,K''i = log U(K '' i) '2 ξ,K ''i '' Cξ,K''i+1Dξ,K''i+1 2Aξ,K''i+1 for i = 1, . . . , K '' 1. The same procedure leads to similar results about '. Hence, the FGOE ' θ(o)(K) can be expressed as ' θ(o)(K) = GK ξ,1 ' ξ(0) '' G K ',1 ' '(0) 2 . (4.28) Again, as the Gauss-Markov system noise '2 '' 0, the FGOE approaches its ML counterpart derived in Chapter 3, cf. (3.20). Exponential Distribution If the likelihood functions are exponentially distributed (see Tab. 4.1), it

means that they represent a case of constrained likelihoods. The quantities

αξ,k and βξ,k are given by αξ,k = 0, βξ,k = λξ , so that the constants Aξ,k, Bξ,k, Cξ,k and Dξ,k can be written as Aξ,k = '' 1 2'2 , Bξ,k = '' 1 2'2 Cξ,k = 1 '2 , Dξ,k = λξ for all k = 1, . . . , K. Using Theorem 4.2, it follows that GK ξ,K (U (K '' 1)) = '' Cξ,KU(K '' 1) + Dξ,K 2Aξ,K = U(K '' 1) + λξ' 2 GK ξ,K ''1(U (K '' 2)) = U (K '' 2) + 2λξ' 2 62 4. A Factor Graph-Based Clock Offset Estimator and so on. The estimator ' ξ(0) can be evaluated as ' ξ(0) = Cξ,1Dξ,1 4Aξ,1Bξ,1 '' C2ξ,1 = + '' , so that GK ξ,1( ' ξ(0)) = + '' . (4.29) Finally, using (4.29) and Theorem 4.2, it holds that ' ξ (K) = min(U(K), U(K '' 1) + λξ' 2, U(K '' 2)+ 2λξ' 2, U(1) + (K '' 1)λξ'2) . (4.30) Similar results hold for ', so that the estimate ' ' (K) is given by ' '(K) = min(V (K), V (K '' 1) + λ'' 2, V (K '' 2)+ 2λ'' 2, V (1) + (K '' 1)λ''2) (4.31) and the combination of the two estimators above provides the estimate ' θ(o)(K) provides (4.24) ' θ (o)(K) = 1

2 min(U(K), U(K '' 1) + λξ' 2, U(K '' 2)+ 2λξ' 2, U(1) + (K '' 1)λξ'2)'' 1

2 min(V (K), V (K '' 1) + λ'' 2, V (K '' 2)+ 2λ'' 2V (1) + (K '' 1)λ''2) . (4.32) As the Gauss-Markov system noise '2 '' 0, the above estimator yields ' θ(o)(K) '' 'θML = min (U(K), . . . , U(1)) '' min (V (K), . . . , V (1)) 2 , which is the MLE (3.26). 4.4 Bayesian Statistical Bounds The purpose of this section is to provide statistical bounds to the clock offset

estimation error in the Bayesian framework, similarly to what performed in

Section 3.5 for the classical case. As in that case, the likelihood is assumed

belonging to the exponential family of distributions with the general expres-

sion for the log-partition function, so that the analytical results are valid for 4.4. Bayesian Statistical Bounds 63 a large class of distributions. Moreover, the results can be used by themselves

in other Bayesian estimation contexts. By assuming the general notation (3.27) and (3.28) for the likelihoods, the following sections provide Bayesian bounds for the variance of the error

performed in estimating ρ. To this purpose, the Bayesian Cramer-Rao bound

and a Bayesian version of the Chapman-Robbins bound are derived. 4.4.1 Bayesian Cramér-Rao Lower Bound In this section, the Bayesian Cramér-Rao lower bound (BCRB) is derived,

which requires regularity conditions analogous to (3.29) for being computed. The BCRB states that the variance of the estimator ' ρ(k) of ρ(k) is bounded below by the lower-right sub-matrix of the inverse of the Bayesian

information matrix, J''1 CR(k) [56], i.e., Var (' ρ(k)) ' J'' 1 CR(k) = [J ''1 CR(k)]kk (4.33) where the Bayesian information matrix is given by [JCR(k)]ij = E '' log f (Zk, ρk) ''ρ(i) '' log f (Zk, ρk) ''ρ(j) = ''E ''2 log f (Zk, ρk) ''ρ(i)''ρ(j) . The expectation is taken with respect to the joint pdf and the following

definitions hold Zk = [Z(1), . . . , Z(k)] ' ρk = [ρ(0), ρ(1), . . . , ρ(k)] ' fZ(Z(k) |ρ(k)) ' exp (η(Z(k))ρ(k) '' 'k(ρ(k))) . (4.34) By assuming that the parameter ρ(k) follows a Gauss-Markov model, the

evolution can be described by fρ(ρ(k) |ρ(k '' 1)) = 1 '' 2''2 exp '' (ρ(k) '' ρ(k '' 1)) 2 2'2 . (4.35) In [57], a recursive formula for evaluating the Bayesian sub-matrix was de-

rived, being JCR(k + 1) = '' E (2) CR(k) JCR(k) + E (1) CR(k) ''1 E (2) CR(k) + E (3A) CR (k) + E (3B) CR (k) (4.36) 64 4. A Factor Graph-Based Clock Offset Estimator with E (1) CR(k) = E '' ''2 ''ρ2(k) log fρ(ρ(k + 1) |ρ(k)) E (2) CR(k) = E '' ''2 ''ρ(k)''ρ(k + 1) log fρ(ρ(k + 1) |ρ(k)) E (3A) CR (k) = E '' ''2 ''ρ2(k + 1) log fρ(ρ(k + 1) |ρ(k)) E (3B) CR (k) = E '' ''2 ''ρ2(k + 1) log fρ(Z(k + 1) |ρ(k + 1)) and the expectation is again taken with respect to the joint pdf. The fol-

lowing theorem specializes the recursive formula (4.36) to the Gauss-Markov

model. Theorem 4.3 Assuming the Bayesian framework in (4.34) and (4.35), the

recursive Bayesian information matrix in (4.36) is given by JCR(k + 1) = ' 2 + J''1 CR(k) ''1 + '2 ηk (4.37) with JCR(0) = 0. Proof: Using the density functions, fρ(ρ(k) |ρ(k''1)) in (4.35) and fZ(Zk|ρ(k)) in (4.34), it can be proved that E (1) CR(k) = 1 '2 , E (2) CR(k) = '' 1 '2 , E (3A) CR (k) = 1 '2 and E (3B) CR (k) = Z Z ''2'k(ρ(k + 1)) ''ρ2(k + 1) fρ(ρ(k + 1), Z(k + 1))dρ(k + 1)dZ(k + 1) = ''2'k(ρ(k + 1)) ''ρ2(k + 1) = '2 ηk . The thesis easily follows by plugging these quantities in (4.36). 4.4.2 Bayesian Chapman-Robbins Bound In the case the regularity assumptions required by the BCRB are not satisfied

(such as for the constrained likelihood functions), a Bayesian version of the

Chapman-Robbins bound (BCHRB) can be used to provide a lower bound

on the variance of an estimator of ρ(k). 4.4. Bayesian Statistical Bounds 65 The BCHRB states that the variance of an estimator ' ρk of ρk is lower bounded as Var( ' ρk) '' [Tk(hk) '' 1] ''1 h kh T k 0 with in the positive semi-definite sense, where Tk(hk) = E " f (Zk, ρk + hk) f (Zk, ρk) 2 # , and hk = [0, h1, . . . , hk] ' . Theorem 4.4 The BCHRB for the parameter ρ(k) can be expressed as Var(' ρ(k)) ' 1 JCH,k where JCH,k = inf hk Tk(hk) '' 1 h2 k and Tk(hk) = k Y j=1 M''2 η (hj)Mη(2hj) ! ' exp " 1 '2 k X j=1 (hj '' hj''1) 2 # . (4.38) Proof: See Appendix 4.A.1 for details. 4.4.3 MSE and Variance of the Estimation Error As for the classical case, also under the Bayesian framework the bounds

on the variance of the estimation error of ξ and ' (substituting ρ in the

likelihoods) can be used to lower bound the MSE of the clock offset θ(o)(K),

by making use of Proposition 3.8. Gaussian Distribution - BCRB When the likelihood function for ξ(k) is Gaussian distributed (see Tab. 4.1),

by using (4.25) and (4.37), it follows that JCR,ξ (k + 1) = ' 2 + J''1 CR ,ξ (k) ''1 + 1 '2 ξ,k , 66 4. A Factor Graph-Based Clock Offset Estimator with JCR,ξ (0) = 0. A similar recursion for JCR,' (k) can be derived. The

MSE of θk can be then lower bounded as Var(' θ(o)(k)) ' 1

4 1 JCR,ξ (k) + 1 JCR,' (k) . (4.39) Exponential Distribution - BCHRB If the likelihood for ξ(k) is exponentially distributed (see Tab. 4.1), (4.38)

becomes Tk(hk) = exp λξ k X j=1 hj ! exp " 1 '2 k X j=1 (hj '' hj''1) 2 # . In fact, it just needs to be noticed that 'ξ (ξ(k)) is constant over ξ(k) and

ηξ(U(j)) = λξ, so that (4.41) turns out to be E [exp (2hjηξ(U(j)))] = exp (λξhj) therefore S(hk) = exp λξ P k

j=1 hj . 4.5 Numerical Results Numerical results are provided in this section to corroborate the theoretical

findings of this chapter in the various scenarios. As already stated, the

performance metric used to evaluate the proposed estimators is the MSE

in estimating θ(o)(K). Parameters are chosen as 'ξ = '' = 0.1 for both

Gaussian and log-normally distributed likelihoods, while λξ = λ' = 10 for

exponentially distributed likelihood functions, so that, as in Chapter 3, the

variance of the observations is the same in all the three cases. Finally, the

variance of the Gauss-Markov model (4.4) is set to ' = 10''4. 4.5.1 MSE Performance and Bounds Assuming either Gaussian or exponentially distributed likelihoods, perfor-

mance of the FGOEs proposed in Section 4.3 in these two specific cases are

here compared with the Bayesian bounds introduced in Section 4.4. Fig. 4.7

reports the MSE performance of the FGOEs ' θ(o) (K) (4.27) and (4.32) and compares it with the BCRB and the BCHRB. As in the classical estimation

scenario, for Gaussian likelihoods the MSE using (4.27) overlaps the reported 4.5. Numerical Results 67 0 5 10 15 20 25 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 10 ''1 10 0 10 1 K MSE( θ(o) (K)) FGOE '' Gaussian

BCRB '' Gaussian

BCHRB '' Gaussian

FGOE '' Exp. Distr.

BCHRB '' Exp. Distr. Figure 4.7: MSE and bounds for estimating θ(o)(K) when using the proposed

FGOE with Gaussian and exponentially distributed likelihood. bounds. The MSE of the FGOE in the case of exponentially distributed like-

lihoods (4.32) is also plotted against the BCHRB as well in Fig. 4.7. In this

case, the MSE is quite close to BCHRB, although it does not coincide with

it, as exactly was the case in the classical estimation framework. 4.5.2 MSE Robustness with Log-Normally Distributed

Likelihood Fig. 4.8 shows the MSE in case the likelihood is log-normally distributed and

the various FGOEs are used: it can be seen that the FGOEs that (wrongly)

assume that the likelihood is either Gaussian or exponentially distributed

(dashed and dotted curve, respectively) have a quite bad performance as the

MSE is unbounded and unpredictable. On the other hand, the proposed esti-

mator ' θ (o) ML(K ) (4.28) derived under the assumption of log-normal likelihoods has extremely good performance, establishing a strong improvement with re-

spect the other estimators in this case. Moreover, the MSE curve overlaps

the reported BCRB (which, in turn, coincides with the BCHRB), denoting 68 4. A Factor Graph-Based Clock Offset Estimator 0 5 10 15 20 25 10 ''4 10 ''2 10 0 10 2 10 4 10 6 10 8 K MSE( θ(o) (K)) FGOE '' Log''normal

FGOE '' Gaussian Based

FGOE '' Exp. Distr. Based

BCRB

BCHRB Figure 4.8: MSE and bounds for the estimation of θ(o)(K) in case of log-

normal likelihood. estimator efficiency. 4.5.3 Classical vs Bayesian framework The estimators proposed in the classical and the Bayesian framework in

Chapter 3 and this chapter, respectively, can also be compared with each

other on the basis of their MSE performance as the system noise varies. The

goal of this analysis here is to show that the performance of the Bayesian

estimators approaches the performance of the MLEs of Chapter 3 as ' '' 0. Fig. 4.9 shows the MSE for the cases of Gaussian, exponential and log- normal likelihoods for K = 25. In the plot, the MSEs in the classical frame-

work are represented by the horizontal lines obtained with the MLEs using

(3.37) and (3.38) (see Appendix 3.A.5). It can be observed how the MSE ob-

tained with the use of the FGOEs for estimating θ(o)(K) almost approaches

the MSE of the MLEs as ' < 10''3, for all the three considered distributions. 4.5. Numerical Results 69 10 ''5 10 ''4 10 ''3 10 ''2 10 ''1 10 0 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 ' MSE( θ(o) (K)) Gaussian

Exp. Distr.

Log''normal Figure 4.9: MSE in the estimation of θ(o)(K) vs ' using FGOEs. The MSE

in the Bayesian case approaches the curves of the MLE as ' '' 0. 70 4. A Factor Graph-Based Clock Offset Estimator 4.6 Conclusions In this chapter, the model of Chapter 3 is improved by introducing a Gauss-

Markov model regulating the time behaviour of the clock offset related quan-

tities ξ and '. Extending the clock model by incorporating time-varying

offset provides a more realistic scenario for actual oscillators in sensor nodes.

Based on this, the general framework for the likelihood functions of the ob-

servations and the theory of factor graphs and message passing are exploited

in order to derive closed-form expressions for clock offset estimators under

the Bayesian framework, via the use of max-product algorithm. Statistical

bounds derived under the Bayesian framework in order to assess the quality

of the proposed FGOEs. The provided bounds are valid for a generic distri-

bution of the exponential family, so that they can also be used in scenarios

other than clock synchronization. The reported numerical studies confirm

the good behaviour of the proposed estimators. It can be concluded that

the FGOEs introduced in this chapter are valid alternatives to the state-of-

the-art in clock offset estimator when time-varying clock offset is realistically

assumed. 4.A. Appendix 71 4.A Appendix 4.A.1 Proof of Theorem 4.4 It holds that Tk(hk) =E " f (Zk, ρk + hk) f (Zk, ρk) 2 # = Z + '' '''' Z + '' '''' f (Zk, ρk + hk) f (Zk, ρk) 2 f (Zk, ρk)dZkdρk = S(hk) Z + '' '''' f (ρk + hk)2 f (ρk) dρk with S(hk) = Z + '' '''' f (Zk|ρk + hk) f (Zk |ρk) 2 f (Zk |ρk)dZk . (4.40) Continuing with the calculations Th(hk) =S(hk) Z + '' '''' f (ρk + hk)2 f (ρk) dρk =S(hk) Z + '' '''' f 2(ρ0 + h0) f (ρ0) ' k Y j=1 f 2(ρ(j) + hj |ρj''1 + hj''1) f (ρ(j) |ρj''1) dρk =S(hk) Z + '' '''' k Y j=1 f 2(ρ(j) + hj |ρj''1 + hj''1) f (ρ(j) |ρj''1) dρk . Since the quantity f 2(ρ(j)+hj |ρj''1+hj''1) f (ρ(j) |ρj''1) results to be equal to (j = 1, . . . , k) 1 ' '' 2' exp '' ρ2 j ''1 2'2 + ρ(j) + 2(hj '' hj''1) '2 ρj''1 ' exp '' (hj '' hj''1) 2 '2 exp '' ρ(j)2 2'2 '' 2ρ(j)(hj '' hj''1) '2 it follows that Z + '' '''' f 2(ρ(j) + hj |ρj''1 + hj''1) f (ρ(j) |ρj''1) dρj''1 = exp (hj '' hj''1) 2 '2 72 4. A Factor Graph-Based Clock Offset Estimator to get Tk(hk) = S(hk) exp " 1 '2 k X j=1 (hj '' hj''1) 2 # . Moreover, (4.40) can be written as S(hk) = k Y j=1 Z + '' '''' f (Zj|ρ(j) + hj) f (Zj |ρ(j)) 2 f (Zj |ρ(j))dZj , and it can be noted that f (Zj|ρ(j) + hj) f (Zj |ρ(j)) 2 = exp [ ''2 ('ρ(ρ(j) + hj) '' 'ρ(ρ(j)))] ' exp (2hjηρ(Zj)) . Therefore Z + '' '''' f (Zj|ρ(j) + hj) f (Zj |ρ(j)) 2 f (Zj |ρ(j))dZj = M'' 2 ηρ (hj )' E [exp (2hjηρ(Zj))] . Then, since E [exp (2hjηρ(Zj))] = exp ('ρ(ρ(j) + 2hj) '' 'ρ(ρ(j))) (4.41) it can be easily proved that Z + '' '''' f (Zj|ρ(j) + hj) f (Zj |ρ(j)) 2 f (Zj |ρ(j))dZj =M'' 2 ηρ (hj )Mηρ (2hj ) , so yielding S(hk) = k Y j=1 M''2 ηρ (hj )Mηρ (2hj ) . Part II Fast Consensus in WSNs and its Application to Clock Synchronization Clock coupling is well represented by the relation between oscillators when consensus is run for achieving clock synchronization. In Part II of this thesis an innovative consensus algorithm is presented, as well as its application to the clock synchronization problem. 73 Chapter 5 Consensus and Clock

Synchronization Differently from the results presented in Part I, the second part of this the-

sis relies on synchronization of coupled oscillators. Coupling between clocks

of different wireless sensor nodes involves the implementation of a correct-

ing signal to the oscillators (acting on the clock phase and/or on the clock

frequency) which modifies their running behaviors. In fact, while in Part I

the goal is the estimation of clock offsets (or skews) between different nodes,

the algorithms proposed in this part instead actively modify the oscillation

frequency of the oscillators by physically applying a voltage signal to the os-

cillators through well-known suited structures, such as PLLs or DLLs. Since

the correcting signals are functions of neighbors clock-related readings, af-

ter the first iteration the clocks of the network will result coupled together,

requiring an analysis fundamentally different from that performed in Part

I. Consensus algorithms are well suited distributed procedures whose imple-

mentation for clock synchronization purposes cause coupling of the network

oscillators, as it will be clear next. This chapter describes the fundamentals of consensus algorithms, so that the relevant performance improvement permitted by the innovative consen-

sus algorithm presented in Chapter 6 can be fully appreciated. Of course,

the primary objective of the present thesis is reaching clock synchronization,

therefore the reader will not be surprised of seeing in Chapter 7 the appli-

cation to clock synchronization of the just mentioned innovative results on

consensus. 75 76 5. Consensus and Clock Synchronization 5.1 Distributed Consensus in WSNs The recent technology advancements in sensor equipment allowed the re-

search community to propose (and eventually implement on real hardware)

efficient techniques for distributed processing in WSNs [1]. In fact, sensors

are now equipped with on-board processors that allow for local pre-processing

of raw sensed data, thus reducing the need of data transfer to the hub nodes.

This concurs to make the network a robust and decentralized entity. Given local sensor readings, a common form of distributed processing consists of reaching an agreement on a common value. More specifically,

starting from sensed data at each node, the process of exchanging data-

related information in order to agree on a common representative value is

called distributed consensus. Applications of consensus can be found in a variety of scenarios in WSNs. Sensors gather data from the environment, so that each sensor node can get

information regarding environmental conditions such as temperature, pres-

sure and humidity, or about ongoing phenomena like target movements. Dis-

tributed averaging on sensed data is useful for reducing the uncertainty on

the measures. It also represents one of the most studied cases in which

distributed consensus can be successfully applied. In this case, since the

common agreement value is the average of the initial readings, the research

community refers to it as average consensus [58''61]. Consensus also finds

application in other areas, such as sensor fusion [62], just to cite one. Also,

clock synchronization can be recast as a double consensus problem (over both

clock offsets and skews) [9, 16''18], as it will be shown in Section 5.4 and in

Chapter 7. The classical implementation of consensus is through a linear update of the readings, as it will be explained later in Section 5.2. On the other hand,

alternative consensus approaches exist, like, for example, asynchronous gossip

algorithms [63, 64]. A method which uses similarities with fluid mechanics is

described in [61], but it is currently applicable only to highly structured net-

works. A further interesting alternative approach is to apply the alternating

direction multipliers method (ADMM) for reaching an agreement throughout

the network (see [65, p. 253] and [66''69]). The whole Chapter 6 is devoted

to the analysis of this method. As in most distributed algorithms, the main design issues are conver- gence speed and resilience to noise. For maximizing the first, in the context

of (linear) average consensus [70] provides an iterative method for computing

the best update weights at each sensor node. Throughout this thesis, such

method is denoted as the optimal Boyd''s solution. Although the compu- 5.2. Consensus System Model 77 tational burden for computing it prevents from an implementation on real

sensor nodes, this optimized solution is anyway taken into account in this

thesis, at least as a benchmark. Applying a polynomial filter to the net-

work matrix is the method chosen in [71] for increasing the convergence rate.

In the realm of gossip algorithms, techniques for increasing the convergence

speed can be found in [72]. However, the state-of-the-art of fast consensus for

convergence rate is [73], where the past values of the states are incorporated

in the algorithm in order to speed up the convergence. Unfortunately, this

comes at the cost of a bad noise resilience, which can potentially prevent

from recovery of the correct consensus value. It appears evident that the problem of distributed consensus has been well studied in the literature, both in continuous and in discrete time. In

this chapter, basics of discrete time linear consensus are introduced as well

as the framework for its application to the problem of clock synchronization. 5.2 Consensus System Model In a WSN, just couples of nodes that are in the transmission range of one

another are actually able to communicate reliably. In order to deal with these

network communication constraints, some tools borrowed from the graph

theory are used. Here it is assumed that the communication properties of

the network are described by an undirected 1 graph of N nodes, one for each sensor. An edge connects nodes i and j if they are in the transmission

range one of each other, or, in other words, if i can receive packets from j

reliably and vice-versa. The link existence information is summarized in the

adjacency matrix ', which has 1 in position (i, j) if node i can transmit

information to node j, 0 otherwise. In other words, [']i,j = ( 1 if node j can communicate to node j

0 otherwise . (5.1) The adjacency matrix of an undirected graph is symmetric by definition, i.e.

' = ' ' . In this thesis it is assumed that the underlying communication graph is strongly connected besides undirected. This means that there exists a path

connecting each node to every other node in the network, so that communi-

cation is possible among any pair of nodes, via multi-hop transmissions. An example of a graph and its adjacency matrix is provided in Figg. 5.1 and 5.2, respectively. 1In this thesis, communication links are assumed symmetric. 78 5. Consensus and Clock Synchronization 1 4 2 5 6 3 Figure 5.1: Example of an undirected and strongly connected graph. ' =

1 1 0 1 0 0

1 1 0 1 1 0

0 0 1 1 0 0

1 1 1 1 1 1

0 1 0 1 1 0

0 0 0 1 0 1

. Figure 5.2: Adjacency matrix relative to the network whose graph is repre-

sented in Fig. 5.1. In consensus theory, convergence to a common value is reached through the iterative exchange and processing of data coming from neighbors at each

sensor node. Mathematically speaking, let xi(0) = θi '' R be the initial reading 2 of sensor i '' [1, . . . , N]. Let also xi(t) be the estimated consensus value at time t. The estimated consensus value at time t + 1, xi(t + 1), is

linearly dependent on the values xj(t), j '' Ni, where Ni denotes the set of neighbors of node i. Let x(t) = [x1(t), . . . , xN (t)] ' be the column vector form of the estimated values at time t. The classical formulation of the

linear consensus problem relates the states at time t + 1 to those at time t

as follows [58, 63, 70, 74, 75] x (t + 1) = Sx(t), t ' 0, x (0) = θ , (5.2) where S '' R N 'N is a square matrix coherent with the communication con- straints of the graph, i.e. [S]i,j = 0 if i cannot communicate with j. 2The work can be easily extended to vectorial case. 5.3. Requirements for Consensus Matrix 79 5.3 Requirements for Consensus Matrix A lot of approaches assimilable to (5.2) are actually identified under the

name of consensus algorithms, but their common property is the convergence

toward a common value to all the agents in the network. What makes the

difference in terms of convergence speed and noise resilience is the choice of

update matrix S. Consensus is reached when vector x(t) has equal components, correspond- ing to a linear combination of the elements in the initial readings vector x(0),

i.e., lim t '''' x (t) = 1 '' ' x(0) , '' '' R N , (5.3) where 1 is the N ' 1 vector of all ones. By an iterative application of (5.2), it is true also that lim t '''' x (t) = lim t '''' Stx (0) , (5.4) therefore, from (5.3) and (5.4) it must be that lim t '''' St = 1'' ' , (5.5) which means that the limit (5.5) must exist and be a rank 1 matrix. The

following theorem states equivalent conditions for reaching consensus, based

on properties of matrix S. Theorem 5.1 Let λ1, . . . , λN be the eigenvalues of matrix S sorted in a

decreasing order. Necessary and sufficient conditions for reaching consensus,

i.e., for which (5.5) holds, are C1) S1 = 1 C2) '' ' S = '' ' , with k''k : '' ' 1 = 1 C3) ρ(S) = 1 C4) µ(λ1 = 1) = 1, where ρ(S) denotes the spectral radius of matrix S, and µ(λ) is the algebraic

multiplicity of the eigenvalue λ. Proof: See Appendix 5.A.1. 80 5. Consensus and Clock Synchronization 5.3.1 Average Consensus In this thesis, the focus is on linear average consensus, that is consensus on

the average of the initial readings θ = 1 N 1 ' θ = 1 N N X i=1 θi , which means that it must be '' ' x (0) = '' ' θ = θ . (5.6) The following corollary simply follows from (5.6) [76]. Corollary 5.2 Under the assumptions of Theorem 5.1, average consensus

is achieved if and only if C5) '' = 1 N 1. 5.3.2 Popular Average Consensus Matrices From the set of all matrices satisfying the properties C1)-C5) for achieving

average consensus, only four relevant matrices are chosen and described.

Three of them are relatively easy to implement since they require simple

local knowledge of the neighborhood at each node. On the other hand, the

last matrix is reported as a benchmark since it is the result of an optimization

process, not easy at all to implement in a distributed fashion. Consensus Laplacian The Laplacian matrix SLA of the graph [16,58,60] is the most popular matrix

for reaching average consensus, being easy to implement distributively. It is

constructed with the following procedure. First compute the diagonal matrix ' whose diagonal element gi in position i is the number of neighbors of node i, i.e. gi = P j 6=i[']i,j. The Laplacian consensus matrix is computed as SLA = I '' (' '' '), where I is the N ' N identity matrix. Consensus Metropolis The Metropolis-Hastings matrix SMH [70,77] associated to the graph requires

a two-hop neighbors knowledge of the network by each node. It constructed

with the following procedure. 5.4. Application of Consensus to Clock Synchronization 81 First consider the symmetric stochastic matrix Ss defined as follows [Ss]i,j = 1 max {gi,gj} if (i, j) is an edge of the graph and i 6= j 1 '' P j 6=i[Ss]i,j if i = j 0 otherwise . (5.7) The Metropolis-Hastings matrix is defined as SMH = I '' Ss. Averaged Consensus Matrix The averaged consensus matrix SAC [78] is as simple to implement as the

Laplacian matrix, since the required network knowledge is the same. It is

directly derived from the adjacency matrix ' of the graph. The procedure for constructing it is to set each element of a row of SAC equal to 1 over the number of non-zero elements of that specific row. Boyd''s Optimum Solution The three previous matrices are constructed via an empirical procedure. On

the other hand, Boyd et. al. in [70] provide an iterative procedure to find the

specific matrix Sboyd which optimizes the convergence rate of the algorithm

(5.2), through the minimization of the second eigenvalue in modulus. Its

implementation requires a high computational cost, therefore in this thesis

such matrix is used as a benchmark rather than being suggested for practical

implementation. 5.4 Application of Consensus to Clock Synchro-

nization Clock synchronization is a well suited application of distributed consensus

in which agreement is required both for clock offsets and for clock skews.

There is an extensive literature on consensus-based clock synchronization

algorithms. Simeone et. al. in [16] applied the average consensus for reaching

an agreement on clock frequencies only. On the other hand, in [17, 18] both

clock phases and clock frequencies are corrected, even if in [18] it is not

clear how frequency observations are obtained, since they are generally not

available at the nodes. An interesting algorithm was proposed in [59], where

link delays are carefully taken into account into the analysis. A formal statement of the problem of clock synchronization in consensus terms is now provided. 82 5. Consensus and Clock Synchronization 5.4.1 System Model By stacking the clock counter values (1.1) and the clock periods (1.2) in

the vectors T (t) , [T1(t), . . . , TN(t)] ' and Υ(t) , [Υ1(t), . . . , ΥN(t)]', respec- tively, the problem of clock synchronization is now recast as a double consen-

sus problem, both for agreement on counters and on periods. By assuming

initial conditions T (0) and Υ(0), for clock phases and periods, respectively,

the adopted reference model is [9, 16''18] T (t + 1) = T (t) + Υ(t) + u(t) , (5.8a) Υ(t + 1) = Υ(t) + v(t) (5.8b) where u(t) and v(t) are N ' 1 real-valued vectors, which represent the consensus-based control signals to be designed for reaching synchronization. 5.4.2 Relevant Works The update equations (5.8) are the reference model for a number of consensus-

based clock synchronization algorithms, which obviously differ in the expres-

sions of the correcting signals u(t) and v(t). The most relevant works are: a) Simeone et. al. in [16] proposed a clock synchronization algorithm in which only correction on the clock counters is performed, i.e., v(t) ' 0. A consensus algorithm is instead applied to clock phases, so that

u (t) = (S '' I) T (t), with S being a row-stochastic matrix compatible with the underlying graph. Weights in matrix S are chosen in order

to give more importance to more reliable (in terms of received power)

channels. b) On the other hand, Schenato et. al in [17] designed a consensus-based clock synchronization algorithm in which both consensus on clock phases

and frequencies are considered, so that the control signals are expressed

as u(t) = (S '' I) T (t) and v(t) = (S '' I) Υ(t). The implementation of such an algorithm is assured by the fact that sensors are assumed

able to measure the period ratios Υi(t)/Υj.(t). c) Another relevant clock synchronization protocol is the one proposed by Carli et. al. in [18]. Consensus both in clock phases and in clock

periods is applied, so that the control signal can still be expressed as

u (t) = (S '' I) T (t) and v(t) = α (S '' I) T (t), with α '' R. On the other hand, while the control u(t) can be straightforwardly applied, for

an application of v(t) the values of Υ(t) are needed, but they are not

directly measurable at the sensors. 5.4. Application of Consensus to Clock Synchronization 83 In Chapter 6 an innovative consensus algorithm is presented, which allows a convergence speed on the order of the state-of-the-art of fast consensus

and an outstanding noise resilience. This algorithm will be applied to clock

synchronization in Chapter 7. 84 5. Consensus and Clock Synchronization 5.A Appendix 5.A.1 Proof of Theorem 5.1 For the sufficiency, first note that C1) and C2) imply that a right and a left

eigenvector relative to eigenvalue λ1 = 1 are 1 and '', respectively. Let Q be

the non singular matrix such that it holds S = Q'P Q'' 1 'P = 1 S2 , where it is assured from C3) and C4) that S2 is a (N '' 1) ' (N '' 1) matrix with ρ(S2) < 1. It then holds that S '' 1'' ' = Q 0 S2 Q'' , so that it is ρ S '' 1'' ' < 1 . (5.9) Therefore, by following the steps in [76, Thm. 1] S t '' 1'' ' = St I '' 1'' ' = S t I '' 1'' ' t = h S I '' 1'' ' i t = S '' 1'' ' t , (5.10) where the first equality descends from C1) and the second from C2). By

using (5.9), result (5.10) implies (5.5). For proving necessity, it must be noted first [76] that the limit limt'''' S t exists if and only if there is a non singular matrix Q such that S = Q Ir S3 Q'' 1 , with Ir is the r ' r identity matrix (0 ' r ' N) and ρ(S3) < 1. This implies that ρ(S) ' 1. By denoting with 'i the columns of matrix Q and with 'i the rows of matrix Q'' 1, it is lim t '''' S t = lim t '''' Q Ir S t

3 Q'' 1 = Q Ir 0 Q'' 1 = r X i=1 'i' ' i . 5.A. Appendix 85 Since (5.5) holds, r must be equal to 1 and 'i' '

i = 1'' ' . This implies that C1) and C2) hold and that λ1 = 1 is a simple eigenvalue of matrix S, therefore

proving also C3) and C4). 86 5. Consensus and Clock Synchronization Chapter 6 Fast Consensus via ADMM The results presented in this chapter refer to a collaboration of Prof. Van-

gelista and me with Prof. Tomaso Erseghe 1 and Dr. Emiliano Dall''Anese2. They have been published in [78]. 6.1 Introduction As differently from the linear consensus presented in Chapter 5, in the present

chapter the problem of distributed agreement is faced from a rather different

perspective. The application of the ADMM leads in fact to an innovative

reformulation of the average consensus problem, as it will be clear next. The adopted approach here is a recasting of the average consensus prob- lem presented in Section 5.3.1 as an instance of convex optimization. In

fact, here the ADMM [65] is exploited to solve the optimization problem in

a distributed fashion. The ADMM is an iterative algorithm by which convex

optimization problems can be solved (for a comprehensive description of the

ADMM, the reader is redirected to [65, p. 253]). The average consensus problem over parameters θi can be recast as the following convex optimization problem, θ = argmin x N X i=1 (x '' θi) 2 . (6.1) The ADMM has found several applications to (6.1), such those proposed in

[66,67] (that here is called Method A) in the context of parameter estimation. 1T. Erseghe is with the Department of Information Engineering, University of Padova, via G. Gradenigo 6/b, 35131 Padova, Italy. 2E. Dall''Anese is with the Department of Electrical Engineering, University of Min- nesota, 200 Union St., MN 55455, Minneapolis, USA. 87 88 6. Fast Consensus via ADMM In network decoding and cognitive radio sensing, the ADMM has also been

proven useful [68, 69] (referred as Method B). Since in both methods the application of this interesting distributed al- gorithm produced satisfying results and it resulted to be robust to noise, the

present work is intended to perform a deep study of an ADMM-based form of

consensus in order to place it in the literature of fast consensus and to derive

simple analytical bounds for performance evaluation. A comparison with op-

timal and suboptimal consensus algorithms [59,63,70,73''75] is made in such

sense. Moreover, a generalization of the two most interesting and promising

reformulations of ADMM, proposed in [66] and [68] respectively, is done. It

is also shown how Method A and B differ in convergence speed and noise re-

silience. The forthcoming analysis can be considered complementary to that

performed in [66''69]. By relaxing the ADMM augmentation constants it is

possible to represent the resulting algorithm in matrix form, thus allowing

for an optimization study of the convergence and noise resilience properties

of the algorithm. In this chapter, it is shown that the proposed ADMM-based algorithm can be optimized for assuring: i) a convergence speed comparable to [73] which is the state-of-the-art of fast consensus ii) a significantly improved robustness to quantization/communication noise with respect to [73]; iii) a faster convergence speed and a stronger resilience to noise than that of the optimal Boyd''s solution; iv) a convergence performance comparable to [61], applicable to any net- work. If suboptimal parameters are selected, the performance is just slightly worse with respect to the optimum, once a unique parameter has been care-

fully chosen. On the other hand, the improved noise resilience makes the

ADMM-based consensus algorithm be the state-of-the-art in fast consensus. 6.2 ADMM-based Average Consensus This section, as well as the following, assumes the absence of noise due either

to quantization or to communication errors. Noisy measurements will be

dealt later in Section 6.4. 6.2. ADMM-based Average Consensus 89 Method A The reformulation of ADMM in [66] implies the introduction of the dummy

variables {zj} to facilitate the derivation of the algorithm. Such variables will be finally taken away when the optimization problem will be solved. By

considering all nodes being ''bridge' nodes, the formulation of Method A can

therefore be stated as follows θ = argmin {xi},{zj} N X i=1 (xi '' θi) 2 subject to xi = zj, ''i and j '' Ni (6.2) which is conceptually equivalent to (6.1). Since the network is assumed

strongly connected, the set of equality constraints xi = zj assures that the

estimate values {xi} N i=1 will finally coincide across the whole network. In Appendix 6.A.1 it is shown that the following notation can be used to express the ADMM Method A of (6.2): ai,j = ci,j P '' ''Nj c'',j , bi,j = ci,j 1 + P '' ''Ni ci,'' , di = X j ''Ni bi,j (6.3) where ci,j > 0 are augmentation constants, and where the condition P i ai,j = 1 holds. For t ' 1, the update equations can be expressed as xi(t + 1) = (1 '' di)θi + di xi(t) + λi(t) + 2ui(t) λi(t + 1) = λi(t) + ui(t) (6.4) with initial conditions xi(1) = (1 '' di) θi , λi(1) = 0 (6.5) and with ui(t) = X j ''Ni bi,j X k ''Nj ak,jxk(t) '' dixi(t) (6.6) which needs to be evaluated in a distributed fashion by means of two message

exchanges: the first for the summation in ak,j, the second for the summation

in bi,j. Method B The second reformulation of the problem (6.1) descends from [68], which

introduces the auxiliary variables {zi,j}, allowing to recast the optimization 90 6. Fast Consensus via ADMM problem as follows θ = argmin {xi},{zi,j} N X i=1 (xi '' θi) 2 subject to xi = zi,j and to zi,j = zj,i, ''i and j '' Ni . (6.7) Similarly to the constraints present in (6.2), since the network is assumed

strongly connected, again the set of equality constraints in (6.7) assures that

the estimate values {xi} N i=1 will finally coincide across the whole network. In Appendix 6.A.1 it is shown how the ADMM implementation of (6.7) can be expressed by (6.4) and (6.5), where (6.6) is replaced by ui(t) = X j ''Ni ei,jxj(t) '' X j ''Ni ei,j xi(t) , ei,j = (1 '' di) ci,jcj,i ci,j + cj,i . (6.8) As opposite to Method A, the application of (6.7), requires just one message

exchange. Remark Compared to the literature, in the present work the augmentation constants

are relaxed, in the sense that in [68] a unique coefficient is used, i.e., ci,j = c,

while in [66] the augmentation constant is dependent on the specific node,

i.e. ci,j = cj. 6.3 ADMM-based Consensus in Vector Form In order to facilitate the convergence analysis and the study of noise re-

silience of ADMM, in this Section a matrix formulation of both Method A

and Method B is proposed. Defining I as the N ' N identity matrix, the 6.3. ADMM-based Consensus in Vector Form 91 following notation is introduced: C = [ci,j]ij , cij > 0 ' C = ( [ ci,jcj,i ci,j+cj,i ]ij for j '' Ni 0 for j 6'' Ni ''1 = diag(C'1)

''2 = diag(C1) A = [aj,i]ij = '''' 1 1 C ' B = [bi,j]ij = (I + ''2)'' 1C E = [ei,j]ij = (I + ''2)'' 1 ' C D = diag(di) = (I + ''2)'' 1''2 , (6.9) where in the definition of A the interchange of i and j is intended; moreover,

it must be noted that matrix ' C is symmetric by construction. With these definitions, U = ( BA '' D , Method A E '' diag(E1) , Method B (6.10) summarizes the multiplications in (6.6) (Method A) and in (6.8) (Method

B). Since A is by construction a row stochastic matrix, i.e., A1 = 1, it is true that D = diag(BA1). This property allows to state that matrix U has

the same structure in both methods, and it also satisfies U 1 = 0. In Appendix 6.A.2 it is shown how the linear systems described by (6.4), (6.6), and (6.8) can be represented by the following linear update equation x (t + 1) = (I + D + 2U )x(t) '' (D + U)x(t '' 1) (6.11) or, in a more compact and convenient (for performance analysis purposes)

form s (t) = M s(t '' 1) , M = 2F '' G G '' F I 0 , s (t) = x(t + 1) x (t) , (6.12) with F = U + I and G = I '' D. The update rule (6.12) is valid for t > 0 with the initial state s (0) = Gθ 0 . (6.13) 92 6. Fast Consensus via ADMM The matrix framework of (6.12) corresponds to the standard form of consen-

sus problems (5.2), even if matrix M is not in standard row-stochastic form,

as requested from properties C1)-C4) in Theorem 5.1 for reaching consensus. On the other hand, matrix M has other important properties that allows to reach consensus anyway, in fact (see Appendix 6.A.2): P1) Defined the eigenvalues of matrix M as λi, it is true that λ1 = 1 with single multiplicity, while all the other eigenvalues satisfy |λi| < 1. P2) The left and right eigenvector of matrix M corresponding to eigenvalue 1 are, respectively, '''1 = 1 N h (1 + C1)', ''(C1)' i , r1 = 1 , where ''' 1r1 = 1 holds. Such properties lead to some consequences on the limit power behavior of matrix M , which is described by (see [79]) M '' = lim t '''' M t = r1''' 1 . (6.14) Since it is true that ''' 1s(0) = 1 N 1 'θ = θ , (6.15) convergence of the series {s(t)} to the average consensus is assured, that is s ( '') = lim t '''' s (t) = M ''s(0) = 1θ . Moreover, since the eigenvalue 1 has single multiplicity (see P1)), the t-th power of M can be splitted (for t > 0) in the sum of its limit power (6.14)

and a t-dependent matrix, i.e., M t ' M t + M '' , ' M = M '' M '' , so that the state evolution is described, for t > 0, by s (t) = ' M t s (0) + 1θ . (6.16) Matrix ' M results to have the same spectral structure of M , with the substantial difference that the eigenvalue λ1 = 1 is mapped to eigenvalue ' λ1 = 0, so that the whole set of eigenvalues 'λi of ' M satisfies |'λi| < 1. 6.3. ADMM-based Consensus in Vector Form 93 6.3.1 Assumption on the augmentation constants Properties P1) and P2) are enough to state that matrix M is appropriate for

consensus; on the other hand, they do not say anything about the convergence

speed and the noise resilience of the update equation (6.12); such properties,

in fact, are evinced by the analysis of the Jordan structure of M . In order to

evaluate these fundamental properties of the algorithm, matrix C is chosen

such that D = dI, or equivalently G = γI, with γ = 1 '' d. This implies the following form for the matrix C: C = ǫS , S 1 = 1 , ǫ > 0 , (6.17) which means that C has to be chosen equal to a row-stochastic matrix S

(that can be considered its shape) times an amplitude factor ǫ. Given that (6.17) holds, it is true that d = ǫ 1 + ǫ , γ = 1 1 + ǫ , therefore matrix F assumes the following expression F = 2d(Φ '' I) + I , Φ = ( 1 2 Sdiag(S '1)''1S' + 1 2 I , Method A 1

2 ' S '' 1

2 diag( ' S 1) + I , Method B . (6.18) It naturally follows that, for ǫ = 1, F = Φ and ' C = ' S . The choice (6.17) brings some consequences in the Jordan structure of matrix M . The proofs of the theorems can be found in Appendix 6.A.2. Theorem 6.1 Under condition (6.17) matrix F is positive definite, i.e., it

is unitarily diagonalizable with positive real valued eigenvalues fi. The eigen-

values of F satisfy f1 = 1 , 0 < γ ' fi < 1, i 6= 1 . (6.19) As a consequence, matrix M is unitarily similar to a block diagonal matrix,

M = Σ diag(Bi) Σ'' where Σ'' = Σ'' 1, with 2 ' 2 blocks Bi = 2fi '' γ ''(fi '' γ) 1 0 . (6.20) 2 94 6. Fast Consensus via ADMM 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 'i λ 2 i'' 1 ,2 i ǫ=0 ǫ=1 ǫ=0 (a) 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 'i |λ 2 i'' 1 ,2 i| ǫ= '' ǫ=1 (b) Figure 6.1: Eigenvalues λi of matrix M as functions of 'i for ǫ ' 1 (a) and ǫ ' 1 (b). Solid lines for odd i and dashed lines for even i. 6.3. ADMM-based Consensus in Vector Form 95 The eigenvalues of F can be related to those of matrix Φ and to the amplitude factor ǫ. Sorted eigenvalues '1 = 1 ' '2 ' . . . ' 'N are assumed for Φ; moreover, because of (6.19), it follows that 1 ' 'i ' 1

2 , and therefore that fi = 2d'i + 1 '' 2d . From Theorem 6.1, the eigenvalues of matrix M are the solutions of the characteristic polynomials of matrices Bi, that is of the equations det(λI '' Bi) = 0, therefore they are of the form λ2i''1,2i = (2fi '' γ) ± p(2fi '' γ)2 '' 4(fi '' γ) 2 . (6.21) In Fig. 6.1 the location of the eigenvalues (6.21) is represented, as a function

of 'i. It can be shown that, for i = 1, (6.21) provides λ1 = 1 and λ2 = d.

For i > 1 different cases need to be considered. More specifically, if ǫ < 1,

it follows that λ2i''1 6= λ2i both being real valued. On the other hand, when

ǫ ' 1, if fi = flow or fi = fhigh, where flow = γ + 1 '' '' 1 '' 2γ 2 , fhigh = γ + 1 + '' 1 '' 2γ 2 , then (6.21) provides two real valued coinciding eigenvalues λ2i''1 = λ2i =

fi '' 1

2 γ. If flow < fi < fhigh, λ2i''1 and λ2i constitute a complex conjugate couple, i.e., λ2i''1 = λ''2i. In the last case, for fi < flow and fi > fhigh, λ2i''1

and λ2i are distinct and real valued. In all cases, the following relations hold λ2i''1λ2i = fi '' γ , λ2i''1 + λ2i = 2fi '' γ , (1 '' λ2i''1)(1 '' λ2i) = 1 '' fi . (6.22) The just described considerations imply the following properties on the Jor-

dan structure of M : P3) If ǫ < 1, then M is diagonalizable. P4) If ǫ ' 1, then M is diagonalizable if and only if fi 6= flow and fi 6= fhigh, for all i. If not, the maximum size of the Jordan blocks is Q = 2 (and

in this case Bi results non diagonalizable). This aforementioned properties allow a complete understanding of the powers

of M and ' M , as it is summarized by the following theorem. Theorem 6.2 Under condition (6.17), the powers of M and ' M are uni- tarily similar to a block diagonal matrix, namely M t = Σ diag(Bt i) Σ '' and ' M t = Σ diag( ' B t

i) Σ '' with Σ'' = Σ''1, where: 96 6. Fast Consensus via ADMM ' The first block has the form B t

1 = 1 1 '' d 1 '' dt+1 ''d(1 '' d t) 1 '' d t ''d(1 '' d t ''1) , ' B t

1 = dt 1 '' d ''d d ''1 1 . (6.23) ' For ǫ < 1 all remaining blocks have the form B t

1 = ' B t

i = λt+1 2i ''1 ''λ t+1

2i ''λ2i''1λ2i(λ t 2i ''1 ''λ t 2i) λt 2i ''1 ''λ t 2i ''λ2i''1λ2i(λ t ''1 2i ''1 ''λ t ''1 2i ) λ2i''1''λ2i (6.24) with roots taken from (6.21). Alternatively, (6.24) can be rewritten in

the form (sometimes more meaningful) B t i = ' B t

i = " P t

k=0 λ t ''k 2i ''1λ k 2i '' P t ''1 k=0 λ t ''k 2i ''1λ k+1

2i P t ''1 k=0 λ t ''1''k 2i ''1 λ k 2i '' P t ''2 k=0 λ t ''1''k 2i ''1 λ k+1

2i # (6.25) which is valid for t ' 2, but also for t = 1 when the bottom right element is set to 0. ' For ǫ ' 1 all the blocks satisfying fi 6= flow and fi 6= fhigh have the form (6.24) (or (6.25)), while the blocks that satisfy either fi = flow or

fi = fhigh have the form B t

i = ' B t

i = (1 + t)λt 2i ''tλ t+1

2i tλt''1 2i ''(t '' 1)λ t 2i , (6.26) where λ2i = fi '' 1

2 γ. All blocks B t i and ' B t

i are real valued. 2 In this section, it has been shown that with the assumption (6.17) a full spectral analysis of matrix M can be performed. For the rest of the paper,

(6.17) it is assumed to hold. 6.4 ADMM Performance Analysis A performance evaluation of the model (6.12) is now performed, in terms

of convergence speed and, introducing the proper noise term, of the noise

resilience. 6.4. ADMM Performance Analysis 97 6.4.1 Noiseless communications In case of noiseless communications, (6.16) holds with no noise terms added.

It is well known that the convergence speed of the algorithm is regulated by ρ = sr( ' M ) = max i |'λi| = max i 6=1 |λi| , (6.27) which is the spectral radius, or the largest eigenvalue modulus of a matrix.

It is possible to derive a bound based on ρ, that is generally fairly loose (if

identified by exploiting [80, Fact 3], or similar results) but indicative of the

convergence speed of the consensus algorithm. In Appendix 6.A.3, the reader can find the proof of the following theorem: Theorem 6.3 When (6.17) applies and in the absence of noise, the con-

vergence of ADMM to its limit value is regulated, for t > ǫ, by the norm-2

bound kx(t) '' x('')k2 ' γ kθk2 t ρ t ''1 , (6.28) where x( '') = 1θ. 2 6.4.2 Encompassing for communication noise Noise coming from wireless communications among nodes can be kept into

account in the model (6.12) by adding a noise term n(t), as follows s (t) = M s(t '' 1) + n(t) (6.29) where the vector n(t) is as follows n (t) = w(t) 0 , (6.30) with w(t) collecting the N noise samples at each sensor at time t. Noise n(t) accounts for noise that affects the exchanged readings. Possible sources of noise in the readings {xi(t)} encompass the quantization errors, which are unavoidable for transmitting values in finite precision, and the

thermal noise produced by the receivers. The noise resilience of the algorithm can be studied through the subse- quent application of (6.29), i.e., s (t) = ' M t s (0) + t ''1 X k=1 ' M k n (t '' k) + n(t) + 1 θ + 1 t ''1 X k=1 '''1n(t '' k) , (6.31) 98 6. Fast Consensus via ADMM which gives a representation of s(t) in terms of s(0) and the noise terms.

Because of such corrupting terms, it can happen that consensus on the av-

erage value θ cannot be achieved, therefore a study of the noise resilience

properties of the algorithm is mandatory. It can be observed from (6.31) that the last term on the right hand side of the equation is indeed a random walk, whose variance increases with time

if the condition '''1 n(k) = 1 + ǫ N 1'w(k) = 0 (6.32) is not verified. To overcome this issue (which in general prevents from reach-

ing average consensus even if the random walk is has zero mean), in [68] the

authors only consider noise terms that are orthogonal to ''1. Such impair-

ment can be overcome also by proper changes in the consensus algorithm,

like in [81]. Before proceeding to the performance analysis, it is necessary to design a communication protocol for the algorithm (6.29), so that the noise model

can be subsequently characterized. Given that node i has knowledge of its

first-hop neighborhood Ni only, it is assumed, without losing generality, that the only source of noise is the quantization error in the exchanged data. The

upcoming analysis can be easily extended if also thermal noise is considered. Method A By recalling what done in [66], the description of the communication protocol

follows. 1. Node i sends to its first-hop neighbors a message containing the differ- ence mi,1(t) = 2xi(t) '' xi(t '' 1) . As it has already been said, quantization error is present, which is mod-

eled by an additive noise qi,1(t) with zero mean and common variance

'2 q . Therefore, the noise-corrupted quantity received by all neighbors of node i is mi,1q(t) = 2xi(t) '' xi(t '' 1) + qi,1(t) . By stacking all the mi,1''s together in a vector, one obtains m1(t) = 2x(t) '' x(t '' 1) , and the corresponding noisy version is equal to m1q(t) = m1(t)+q1(t). 6.4. ADMM Performance Analysis 99 2. The second step requires nodes to transmit an elaboration of m1(t), which is given by m2(t) = Am1q(t) , A = diag(S'1)'' 1S' , and its noisy version is equal to m2q(t) = m2(t) + q2(t), with q2(t)

having the same statistics of q1(t) while being statistically independent.

The message is then broadcast to the neighbors. 3. The final step involves the update of the estimate values x(t) as follows x (t + 1) = (1 '' d)x(t) + Bm2q(t) '' d h q1(t) + diag(S'1)q2(t) i , (6.33) with B = dS, and where the last term has been designed to satisfy

(6.32). Moreover, the update step (6.33) can be actually implemented

since qi,1(t) and qi,2(t) are known quantities at node i. The algorithm (6.33) is then equivalent to (6.29) and (6.30), with w (t) = (F '' I)q1(t) + d(S '' diag(S'1))q2(t) , (6.34) which satisfies (6.32) since 1'w(t) = 0, as a consequence of the double sto-

casticity of matrix F (see Appendix 6.A.2). It is easy to prove that noise

w (t) is zero mean, while it has correlation matrix equal to Rw = E [w(t)w''(t)] = h (F ''I) 2 + d2(S''diag(S'1))(S'''diag(S'1)) i '2 q . Method B In Method B just one message exchange is required, therefore the whole

communication protocol is simpler than in the previous case. 1. As for Method A. 2. Once the {mi,1q} have been received, the update step is implemented as follows x (t + 1) = (1 + d)x(t) '' dx(t '' 1) + (F '' I)m1q(t) . (6.35) From (6.35) it can be inferred that noise w(t) satisfies (6.32), since w (t) = (F '' I)q1(t) . (6.36) Also in this case, noise has zero mean but correlation matrix Rw = (F '' I) 2'2 q . 100 6. Fast Consensus via ADMM 6.4.3 Noise resilience characterization Noise resilience of (6.29) is analysed through the evaluation of the distance

of the state vector x(t) from its average 1x(t) at time t, i.e., 'x(t) = x(t) '' 1x(t) = Kx(t) , K = I '' 1 N 11 ' where x(t) = 1 N 1 'x(t) . It is straightforward to note that the two signals, 'x(t) and 1x(t) are orthog-

onal by construction. The aforementioned decomposition has already been

applied to similar contexts (see [64, 75]). Average component By defining the average quantities s (t) = x(t+1) x(t) , s(0) = γ θ 0 . the following relation holds s (t) = B1s(t '' 1) , B1 = 1 + d ''d 1 0 . (6.37) It needs to be noticed that the update matrix that regulates the dynamics of (6.37) is actually the block (6.20) with i = 1, with eigenvalues λ1 = 1 and

λ2 = d. A recursive application of (6.37) brings to s (t) = B t 1s(0) from which an expression for the average signal can be obtained as x(t) = (1 '' d t)θ . (6.38) The above guarantees the convergence of the average signal x(t) to θ with a

rate that depends on the parameter d. Distance from the average The signal representing the distance from the average has the following dy-

namics 's(t) = ' M 's(t '' 1) + 'n(t) , ' M = K 0 0 K M (6.39) 6.4. ADMM Performance Analysis 101 where 's(t) = 'x(t+1) 'x(t) , 's(0) = γ 'θ 0 , 'n(t) = w(t) 0 with 'θ = Kθ and with noise w(t) satisfying the property w(t) = Kw(t).

Matrix ' M is known, and it has a similar eigenstructure to matrix M with the two eigenvalues λ1 = 1 and λ2 = d mapped to zero, therefore the Jordan

structure is the same ( ' Bi = Bi for i 6= 1) except for ' B1 which is the 2 ' 2 zero matrix. As a consequence, the powers of ' M are fully described recalling (6.24) and (6.26). A recursive application of (6.39) gives 's(t) = ' M t 's(0) + t ''1 X k=0 ' M k 'n(t '' k) . whose convergence speed is regulated by ξ = sr( ' M ) , (6.40) which satisfies the condition ξ ' ρ = max(ξ, d) < 1. The evolution of the signal expressing the distance from the average can also be written as 'x(t) = γQHtQ'' 'θ + 'η(t) (6.41) where Q is the unitary matrix diagonalizing F (F = Qdiag(fi)Q'') and

where Ht = diag(hi,t) contains the bottom left elements of ' B t

i, which are hi,t = 0 , i = 1 λt 2i''1 ''λ t 2i λ2i''1''λ2i , i 6= 1, λ2i''1 6= λ2i tλt''1 2i , i 6= 1, λ2i''1 = λ2i . (6.42) Moreover, 'η(t) = t ''1 X k=0 QHkQ''w(t '' k) (6.43) is the overall noise term. Convergence to zero of the elements in (6.42) implies the convergence in mean to zero of 'x(t), in fact lim t '''' E [ 'x(t)] = lim t '''' γQHtQ'' 'θ = 0 . On the other hand, the convergence speed properties of (6.41) have to be

analysed through the mean squared error, ' MSEt = E [ k 'x(t)k 2 2]. In Ap- pendix 6.A.3 the following theorem on a bound on ' MSEt is proved: 102 6. Fast Consensus via ADMM Theorem 6.4 When (6.17) applies, the MSE ' MSEt = E k 'x(t)k 2 2 = m2 t + ' 2 t , mt = kE [ 'x(t)] k2 = kγQHtQ'' 'θk2 , '2t = E k 'η(t)k 2 2 for t > ǫ satisfies mt ' γ t ξ t ''1k 'θk2 '2t ' K ' 2 q , K = N X i=2 Ki Ui (6.44) where Ki = (1 + fi '' γ) (1 '' fi)(1 + 3fi '' 2γ)(1 '' fi + γ) Ui = (fi ''1) 2 +d2sr((S''diag(S'1))(S'''diag(S'1))) , Method A (fi '' 1) 2 + 4d2 , Method A, S = S' (fi '' 1) 2 + d( '' fi '' γ '' '' d)2 , Method A, S = S' 0 (fi '' 1) 2 , Method B . (6.45) 2 A few remarks on (6.44) are needed. The reader can notice how the convergence speed is regulated by ξ, while the noise level is dependent on K.

Moreover, the resulting bound on the noise variance at time t is not dependent

on matrix eigenvectors (such as in [66,68]) and, as it will be shown in Section

6.6, it is also very strict, especially when in Method A matrix S has a specific

structure. 6.4.4 Initial states choice The condition (6.15), identifies a vectorial subspace of dimension 2N ''1 from which the initial condition s(0) can be chosen, therefore there is no special

constraint to use (6.13), while the noise bound (6.44) is not dependent on the

choice of s(0) and the convergence results are not fundamentally affected. As

an example, the choice s (0) = θ θ , (6.46) 6.5. Optimization Issues 103 is another valid initial state. What makes this specific choice interesting is

that the average signal is equal to the average of the initial readings, at each

time instant t, i.e., x(t) = θ . The convergence bounds (6.28) and (6.44) are just slightly modified, the

product γt being substituted by 2t '' 1. 6.5 Optimization Issues Given the results stated in the previous sections, now the focus is on their

analysis in order to optimize the algorithm (6.29) in terms of convergence

speed or noise resilience. The parameters to be tuned for optimization pur-

poses are the shape matrix S and the amplitude factor ǫ. 6.5.1 Convergence speed The aim of this section is to choose S and ǫ in order to maximize the con-

vergence speed. To begin with, S is assumed given and the aim is to choose

ǫ such that the convergence speed is maximized. If s(0) is given by (6.46), the convergence speed is described by ξ (6.40), and its minimization can be

performed by inspection of Fig. 6.1 (see Appendix 6.A.4); it turns out that ǫopt = 1 '' 1 ''(4'2''3)2 , '2 > 3

4 1 , '2 = 3

4 '2 '' 1

2 ('2 '''N )2+('2+'N ''1) '' ('2 '''N )2+4('2'' 1

2 )(1'''N ) , '2 < 3

4 (6.47) for having ξopt = 1+ '' 2dopt ''1 2 , '2 > 3

4 1

2 , '2 = 3

4 pdopt(2'2 '' 1) , '2 < 3 4 , dopt = ǫopt 1 + ǫopt . (6.48) In Fig. 6.2 such results can be observed, with 'N = 1

2 , as it is in common scenarios. If, instead, the initial state is chosen to be equal to (6.13), the convergence speed depends on both d and ξ (see, respectively, (6.38) and (6.44)), therefore

ρ = max(ξ, d) in (6.27) is the key parameter to minimize. In Appendix 6.A.4 104 6. Fast Consensus via ADMM 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 '2 dopt ξopt Figure 6.2: Convergence speed maximization choices, with initial state (6.46)

and 'N = 1

2 . it is shown that, if '2 ' 3

4 results (6.47) and (6.48) keep holding. On the other hand, if '2 < 3

4 , ǫopt = 1 is the best choice, with ξopt = 1

2 . As a general remark, the case '2 < 3

4 is a rare case in common networks, therefore generally the convergence properties are regulated by the same equations,

(6.47) and (6.48). If the aim is jointly optimizing S and ǫ and S is a generic N ' N matrix, the optimization problem is still an open issue. On the other hand, the best

choice for matrix S already exists if only symmetric matrices are considered,

and it is S = Sboyd with Sboyd the optimum Boyd solution (see Section 5.3.2).

In this case Φ = ( 1 2 S 2 boyd + 1

2 I , Method A 1

4 Sboyd + 3

4 I , Method B therefore '2 depends on the second largest eigenvalue modulus of Sboyd de-

noted as ρboyd, i.e., '2 = ( 1 2 (1 + ρ 2 boyd) , Method A 1

4 (3 + ρboyd) , Method B . (6.49) Fig. 6.3 shows the value of ξ obtained from substituting (6.49) in (6.47) and (6.48), therefore such parameter is there plotted as a function of ρboyd.

For Method A, '' ξ is instead plotted since, differently from the other al- 6.5. Optimization Issues 105 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 ρboyd pξA,opt ξB,opt ρboyd ξoreshkin,opt Figure 6.3: Spectral radius of ADMM and of other state-of-the-art consensus

algorithm'' Boyd''s optimal matrix and Oreshkin''s method. gorithms, the data dissemination involves two message exchanges. It is in-

teresting to notice that, for large networks (ρboyd > 1 '' 2 ), which is a com- mon scenario, ADMM provides a better convergence speed with respect to

the optimum Boyd solution. In Fig. 6.3, the spectral radius for the con-

sensus algorithm proposed by Oreshkin et al. [73] is also plotted; it is the

current state-of-the-art for fast consensus, and its performance outperforms

ADMM''s. To further study the convergence speed, for large networks it results that ρboyd is very close to 1 so that it can be written as ρboyd = 1 '' Ψ(N), with small Ψ(N), and, by asymptotic Taylor series expansion it turns out that '' 1

2 log ξopt,A = '' log ξopt,B = p2Ψ(N) 2 + O(Ψ(N)) (6.50) so that the speed gap with respect to [73] is only of a factor of 2 (see [73,

eq. (15)]). For ρboyd, it holds that '' log ρboyd = Ψ(N) + O(Ψ(N) 2), there- fore the speed gain of ADMM over the Boyd''s solution is substantial and

comparable to other state-of-the-art fast consensus algorithms [61]. In order to realize which of the two parameters, S or ǫ, is more impor- tant for maximization of convergence speed, in Fig. 6.4 the speed measure

'' 1

2 log ξA is shown when the underlying network is a mesh grid consisting of N = L2 nodes, in which each node is connected to the four closest neighbors

in the grid (right, left, below, above). Discontinuities in the results coming 106 6. Fast Consensus via ADMM 0 50 100 150 0 0.025 0.05 0.075 0.1 S =Sboyd, ǫ=ǫopt S =SAC, ǫ=ǫopt S =Sboyd, ǫ=5 S =SAC, ǫ=5 L '' 1 2 lo g ξ A Figure 6.4: Convergence speed '' 1

2 log ξA dependent on ǫ and S when the underlying network graph is a mesh grid where each node has 4 neighbors. from boundary conditions are overcome by assuming that the grid lies on a

torus. It appears evident how ǫ is the most important parameter for opti-

mizing the convergence speed, since just sub-optimal performance is reached

if S is chosen to be the averaged consensus matrix SAC (see Section 5.3.2),

with non zero entries equal to 1 5 . Similar results apply to Method A and, in general, to less structured networks. 6.5.2 Resilience to noise For ADMM-based consensus, the key parameter for describing the noise re-

silience properties is K, as it is stated in (6.44). From (6.45) it is evident

that Method B has a better resilience to noise with respect to Method A,

therefore only Method B will be considered in the forthcoming analysis for

comparison with the state-of-the-art of standard and fast consensus algo-

rithms. Fig. 6.5 shows the behaviour of KiUi as a function of d for different

values of 'i, where KiUi = d(1 '' 'i) [1 + d(2'i '' 1)] [1 '' d(2'i '' 1)] [1 + d(3'i '' 2)] (Method B) . (6.51) A first observation is that all curves stay below 1, which means that the

overall noise is smaller than the quantization noise in all cases, i.e., '2t '

(N '' 1)' 2 q . Moreover, since KiUi tends to 0 as d decreases (i.e., as ǫ goes to 6.5. Optimization Issues 107 0 0.2 0.4 0.6 0.8 1 10 ''3 10 ''2 10 ''1 10 0 d KiUi 'i = 0.5 'i '' 1 Figure 6.5: Method B: Noise values KiUi as functions of d for different values

of 'i. 0), then the noise can be controlled at a desired level, by carefully choosing

the design parameter ǫ. Making the comparison with standard consensus, it can be seen that KiUi = 1 '' ρi 1 + ρi (Standard consensus) (6.52) with ''1 < ρi < 1 being the real valued eigenvalues of the (symmetric) consensus matrix. Getting KiUi < 1 is assured by having ρi > 0, which

is not guaranteed for all i, even using Boyd''s optimal matrix. This means

that the expected level of noise in standard consensus is higher than that in

ADMM Method B. A similar conclusion can be drawn for the solution proposed by Oreshkin; in fact, with (θ1, θ2, θ3) = ( ''δ, 0, 1 + δ) (see [73, eq. (13)]), in Appendix 6.A.4 by similar arguments it is shown that KiUi = (1 + αδ)(1 '' ρi) (1 '' αδ)(1 + ρi) (Oreshkin et al., [73]) (6.53) where ''1 < ρi < 1 are again the real valued eigenvalues of the (symmetric) consensus matrix used, and where αδ ' 0 is carefully designed for the fastest convergence. It is straightforward to see that (6.52) is always less than (6.53),

therefore the algorithm proposed by Oreshkin is expected to provide worse 108 6. Fast Consensus via ADMM noise performance with respect to standard consensus. This is also supported

by the fact that an optimal choice of αδ asymptotically provides αδ = 1 '' 2 p2Ψ(N) + O(Ψ(N)) . 6.5.3 Results for highly structured networks In highly structured networks, performance can be further analysed. In fact,

in the following sections line graphs and mesh grids will be characterized in

terms of convergence speed and noise resilience, when ADMM-based consen-

sus is used. Some closed form solutions will be provided. Line graph In the line graph, every node is connected to two nodes only, the preceding

and the following one. Boundary conditions are avoided by connecting the

first and the last node in the chain. In this case, matrix Sboyd is circulant,

characterized by the first row of the form [1 '' 2a, a, 0, . . . , 0, a], with a = ( 1 3 ''cos(2'/N) , even N 1 2+cos('/N ) ''cos(2'/N) , odd N and with eigenvalues having the form λn,boyd = 1 + 2a(cos(2'n/N) '' 1), n = 0, . . . , N '' 1. It is also ρboyd = 1 + 2a(cos(2'/N) '' 1). Mesh grid The mesh grid is a graph in which each node is connected to the four closest

nodes (right, left, below, above). Boundary conditions are avoided through

the assumption that the grid lies on a torus. Matrix Sboyd is block circu-

lant and the first block-row is given by [P , aI, 0, . . . , 0, aI], with P being a

circulant matrix whose first row equal to [1 '' 4a, a, 0, . . . , 0, a], and a = ( 1 5 ''cos(2'/L) , even L 1 3+2 cos('/L) ''cos(2'/L) , odd L . Eigenvalues are given by the expressions λij,boyd = 1 '' 4a + 2a cos(2'i/L) + 2a cos(2'j/L), i, j = 0, . . . , L '' 1, with ρboyd = 1 + 2a(cos(2'/L) '' 1). Table 6.1 summarizes the aforementioned results, according to (6.50) for high values of N, while in Fig. 6.6 the noise resilience results are shown,

with an evident performance improvement through the use of ADMM with

respect to standard and fast consensus approaches. 6.6. Numerical Results 109 0 200 400 10 ''1 10 0 10 1 10 2 10 3 10 4 Line graph 0 200 400 10 ''1 10 0 10 1 10 2 10 3 10 4 Mesh grid Oreshkin Boyd ADMM, Method B K/N N L Oreshkin Boyd ADMM, Method B Figure 6.6: Noise resilience in line graph (left) and mesh grid (right). Line graph Mesh grid '' log ρboyd ' 2'2 N2 ' '2 N '' 1

2 log ξA,opt ' '' log ξB,opt ' ' N ' ' '' 2N '' log ξoreshkin,opt ' ' 2' N ' '' 2' '' N Table 6.1: Convergence speed results for line graph and mesh grid. 6.6 Numerical Results Performance of ADMM are here tested by assuming a network of N nodes

whose communication properties are described by a random geometric graph

(RGG). More specifically, nodes are disposed in a disc of area 1 m2 (radius

R = 1/ '' '), and the coverage radius 3 ' is chosen such that the average number of neighbors is Nneigh, to have 4 ' = pNneigh/('N). 3The coverage area of a node is assumed circular, with radius '.

4Under the assumption of uniform spatial distribution for the nodes, and by neglecting boundary conditions, the probability that any two nodes are in the communication range 110 6. Fast Consensus via ADMM 0 100 200 300 400 500 10 ''7 10 ''6 10 ''5 10 ''4 10 ''3 Method A ADMM''MH

ADMM''Boyd

Bound

True MSE 0 100 200 300 400 500 10 ''7 10 ''6 10 ''5 10 ''4 10 ''3 Method B t t MSEt N MSEt N Figure 6.7: MSE with a RGG of N = 100 nodes, Nneigh = 8, and '2q = 10'' 6. Comparison between Method A and Method B In Fig. 6.7 the MSE when using ADMM Method A and B is shown as a

function of t, with N = 100, Nneigh = 8 and '2q = 10'' 6. For statistical confidence, the MSE is averaged over 1000 realizations of the noise. The used shape matrices are the Boyd''s optimal solution Sboyd, and the MH matrix SMH (see Section 5.3.2). The amplitude parameter ǫ is optimally

chosen for the maximization of the convergence speed, according to (6.47). Method B clearly outperforms Method A in terms of noise resilience, confirming the theoretical results of Section 6.4.3. On the other hand, the two

methods are pretty much equivalent in terms of convergence speed. Moreover,

the choice of a suboptimal shape matrix leads to just limited loss, as it was

already been noticed in Fig. 6.4 for mesh grids. In Fig. 6.7 also the MSE

bound (6.44) is reported, and it can be observed that such a bound is indeed

very strict far large values of t (where it saturates to the noise floor level

K '2 q ) while for smaller t it provides an excellent indication of the convergence speed. of each other is Pconn = '' 2, which is the ratio between the area covered by a node, and

the entire area. The result follows by also noting that the average number of connected

nodes is approximately Nneigh = N Pconn. 6.6. Numerical Results 111 0 20 40 60 80 100 0 0.05 0.1 ADMM Method B, AC with optimum ε ADMM Method B, AC with fixed ε N '' log ξ (a) 0 20 40 60 80 100 10 ''1 10 0 0 20 40 60 80 100 0 5 10 15 N N K

N ǫ (b) (c) Figure 6.8: ADMM Method B with fixed ǫ in a RGG with Nneigh = 8: (a)

convergence speed; (b) MSE at steady state; (c) optimal choices of ǫ used. Suboptimal choice for ǫ Computing the optimized values of ǫ according to (6.47) can be hard to

implement. For such reason, it is interesting to see what is the performance

loss if ǫ is chosen differently from ǫopt in (6.47). In Fig. 6.8, the performance

of ADMM in terms of convergence speed and noise level is shown as well

as the fixed values of ǫ used, this giving a flavour of the loss induced by

this suboptimal, but simpler to implement, choice for the amplitude factor

ǫ. The used values for ǫ are those that maximize the average convergence

speed, through the minimization of the average spectral radius definition

ξ = (E [ξtref ])1/tref , with tref = 100. It is clear from Figg. 6.8a and 6.8b that

this suboptimal choice for ǫ leads to a small loss both in convergence speed

and noise resilience. 112 6. Fast Consensus via ADMM 0 100 200 300 400 500 10 ''7 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 10 ''1 Standard consensus, MH

Std. consensus, Boyd

Std. consensus, 2 steps Boyd

Oreshkin, Boyd

Oreshkin, MH

ADMM Method A, Boyd

ADMM Method A, AC

ADMM Method B, Boyd

ADMM Method B, AC t MSEt N (a) N = 50 0 20 40 60 80 100 10 ''7 10 ''6 10 ''5 10 ''4 10 ''3 10 ''2 10 ''1 N MSEt N (b) t = 300 Figure 6.9: MSE with an underlying RGG: (a) as a function of t for N = 50

nodes, and (b) as a function of N for t = 300. Nneigh = 4, '2q = 10'' 6, and ǫ optimized for convergence speed. Comparison with the state-of-the-art In order to compare ADMM with the standard and fast consensus algorithms,

in Fig. 6.9a the MSE (averaged over 1000 noise realizations) is shown for a

RGG with N = 50 and '2q = 10'' 6. The used shape matrices are the Boyd''s optimum solution Sboyd, and the AC matrix SAC, the latter providing better 6.7. Conclusions 113 performance with respect to the MH matrix. The amplitude factor ǫ is again

chosen for optimizing the convergence speed according to (6.47). Fig. 6.9a shows how ADMM allows a dramatic speed improvement with respect to the optimal Boyd''s solution. On the other hand, the slight speed

loss with respect to [73] is balanced by a consistent improvement in noise

resilience when ADMM Method B is used. In Fig. 6.9b, the MSE is shown

for t = 300 and for different values of N. It can be seen that the relations

between the various algorithms hold, independently on the network size. 6.7 Conclusions This chapter went through the application of ADMM to the problem of av-

erage consensus. The approach used for analysing the ADMM properties

was the relaxation of the ADMM augmentation constants, thanks to which

a closed form matrix description of ADMM has been provided. Convergence

speed, noise resilience and simple and tight analytical bounds have been de-

rived. These last turned out to be particularly useful for optimizing ADMM

parameters, and for analytically assessing ADMM properties. It has been

shown how the ADMM-based consensus has excellent properties in terms of

convergence speed, which place it among the best algorithms for fast consen-

sus, getting performance similar to the state-of-the-art [73]. Moreover, the

ADMM-based consensus showed the best noise resilience among the stan-

dard and the fast consensus algorithms in the literature. All these properties

make the ADMM-based consensus an attractive tool for reaching clock syn-

chronization in a WSN quickly and with a very low noise level, as it is better

described in Chapter 7. 114 6. Fast Consensus via ADMM 6.A Appendix 6.A.1 Proof of Methods A and B Proof of Method A The augmented Lagrangian function corresponding to (6.2) can be derived

through the application of the results in [65, p. 255], therefore getting Lc(x, z, λ) = N X i=1 (xi '' θi) 2 + N X i=1 X j ''Ni 2λi,j(xi '' zj) + N X i=1 X j ''Ni ci,j(xi '' zj) 2 (6.54) where vectors x, z, λ and c respectively collect the states xi,zj, the Lagrange

multipliers λi,j, and the augmentation constants ci,j > 0. By using the

equivalence P N

i=1 P j ''Ni = P N

j=1 P i ''Nj , the ADMM update equations for t ' 0 are derived from (6.54), i.e., xi(t + 1) = θi + X j ''Ni ci,jzj(t) '' X j ''Ni λi,j(t) 1 + X j ''Ni ci,j zj(t + 1) = X i ''Nj ci,jxi(t + 1) + X i ''Nj λi,j(t) X i ''Nj ci,j λi,j(t + 1) = λi,j(t) + ci,j(xi(t + 1) '' zj(t + 1)) (6.55) where convergence to xi( '') = θ is assured by the fact that ci,j > 0, in- dependently on the choice for the initial conditions λi,j(0) and zj(0). A

more convenient form for (6.55) comes from setting null initial conditions,

λi,j(0) = 0 and zj(0) = 0, and by defining the new parameters λi(t) = '' P j ''Ni λi,j(t '' 1) 1 + P j ''Ni ci,j , (6.56) so that storing all the Lagrange multipliers is not needed any more. In

fact, by first noting that, from the second of (6.55) it is P i ''Nj λi,j(t) + 6.A. Appendix 115 P i ''Nj ci,j(xi(t + 1) '' zj(t + 1)) = 0 which, in turn, from the third of (6.55), gives X i ''Nj λi,j(t + 1) = 0 , t ' 0 , (6.57) by setting null initial conditions for λi,j(0), (6.57) results to hold for t ' ''1. Substituting such result into the second of (6.55), it turns out that zj(t) = X i ''Nj ai,jxi(t) , t ' 1 . (6.58) The update equation (6.4) is therefore obtained by substituting (6.58) in the

first of (6.55) and using definitions (6.3) and (6.6). It is valid for t ' 1. The initial conditions (6.5) come from the first of (6.55) and from (6.56). Proof of Method B In Method B, the augmented Lagrangian function is Lc(x, z, λ) = N X i=1 (xi '' θi) 2 + N X i=1 X j ''Ni 2λi,j(xi '' zi,j)+ N X i=1 X j ''Ni ci,j(xi '' zi,j) 2 (6.59) where an additional constraint, zi,j = zj,i, must be kept into account. The

ADMM update equations for t ' 0 in this case are xi(t + 1) = θi + X j ''Ni ci,jzi,j(t) '' X j ''Ni λi,j(t) 1 + X j ''Ni ci,j zi,j(t + 1) = ci,jxi(t+1)+cj,ixj(t+1) ci,j + cj,i + λi,j(t)+λj,i(t) ci,j + cj,i λi,j(t + 1) = λi,j(t) + ci,j(xi(t + 1) '' zi,j(t + 1)) (6.60) with ci,j > 0, which again assures convergence to xi( '') = θ for any initial condition λi,j(0) and zj(0). The second of (6.60) comes from the additional

constraint zi,j = zj,i in (6.7). By proceeding analogously to what done in

Method A, null initial conditions, xi(0) = 0 and zi,j(0) = 0, are considered,

thus obtaining ci,jzi,j(t + 1) + cj,izj,i(t + 1) = λi,j(t) + λj,i(t) + ci,jxi(t + 1) +

cj,ixj(t + 1) from the second of (6.60). By also using the third of (6.60), it 116 6. Fast Consensus via ADMM turns out that λi,j(t + 1) + λj,i(t + 1) = 0 for t ' 0. Moreover, by recalling λi,j(0) = 0, it also holds that for t ' 0 it is zi,j(t) = ci,jxi(t) + cj,ixj(t) ci,j + cj,i . (6.61) The substitution of (6.61) into the first and third of (6.60) finally gets (6.4),

with ui(t) as defined in (6.8) and λi still defined by (6.56). 6.A.2 Proof of Results of 6.3 Proof of (6.11), (6.12) and (6.13) The update equations (6.4) can be written in matrix form for t ' 1 as x (t + 1) = (I '' D)θ + (D + 2U)x(t) + λ(t) λ (t + 1) = λ(t) + U x(t) (6.62) with initial conditions x(1) = (I '' D)θ and λ(1) = 0. To reformulate (6.62) as an update of x(t) as a function of x(t) itself and of the previous values

x (t '' 1), x(t) is added and subtracted to the first of (6.62), thus getting x (t + 1) = h (I '' D)θ + (D + 2U)x(t) + λ(t) i + x(t) '' h (I '' D)θ + (D + 2U)x(t '' 1) + λ(t '' 1) i which gives (6.11) through the use of the second of (6.62), thus proving also

(6.12). Being t ' 1, the initial state would be s (1) = (I + D + 2U)(I '' D)θ (I '' D)θ . On the other hand, the same recursion is found by using the initial state

given by (6.13), as s(1) = M s(0). Proof of P1) and P2) The eigenvalues of M as described in P1) are assured by the convergence

properties of ADMM [65]. P2) is instead proved by the following properties F 1 = 1 , (1 + C1)'U = 0 , (1 + C1)'D = (C1)' . (6.63) 6.A. Appendix 117 Proof of Theorem 6.1 The properties of matrix F can be derived from standard matrix properties

[79, 82]. It must be first noted that, since (6.17) holds, it is ''2 = d 1 ''d I , so that F = (1 '' d) h C '''' 1 1 C ' + I i , Method A (1 '' d) h ' C + I + ''2 '' diag( ' C 1) i , Method B . (6.64) As a consequence, the following properties hold: a) Elements in F are all non negative, this fact being assured from ci,j > 0 in Method A, while in Method B it must be also noted that diag( ' C 1) = diag h X j ci,jcj,i ci,j+cj,i i 'diag hX j ci,j i = diag(''2) (6.65) with ' applied entrywise. b) F 1 = 1, since U 1 = 0 (see (6.63)).

c) F is row-stochastic as a consequence of a) and b).

d) Since the underlying graph is strongly connected, F is irreducible.

e) c) and d) assure that F has an eigenvalue '1 = 1 with single multi- plicity, while the others satisfy |'i| < 1, i 6= 1 [79, Theorem 10, p. 84]. f) Inspection of (6.64) leads to the conclusion that F is a real symmetric matrix. g) Inspection of (6.64) and by recalling (6.65) leads to the conclusion that F is positive definite. h) As a consequence of g), F has real valued and positive eigenvalues.

i) As a consequence of f) and g), F is diagonalizable by a unitary matrix.

j) As a consequence of F '' (1 '' d)I being a positive semidefinite matrix (seen by inspection), the eigenvalues of F satisfy fi ' 1 '' d. The block diagonalization (6.20) of matrix M is a consequence of the unitary diagonalization of matrix F . Proof of Theorem 6.2 Since M and ' M have the same eigenstructure except for the first eigenvalue (λ1 = 1 and 'λ1 = 0), Theorem 6.1 assures that ' M = Σ diag( ' Bi) Σ'' where ' B1 6= B1 and ' Bi = Bi, i 6= 1. Blocks Bi can be further analysed: a) If λ2i''1 6= λ2i, block Bi can be written as Bi = V iSiV '' 1 i where V i = λ2i''1 λ2i 1 1 , Si = λ2i''1 0 0 λ2i , V '' 1 i = 1 λ2i''1 '' λ2i 1 ''λ2i ''1 λ2i''1 . (6.66) 118 6. Fast Consensus via ADMM As a consequence, B t

1 = V 1 1 0 0 dt V '' 1 1 , B t

i = V i λt 2i ''1 0 0 λt 2i V '' 1 i which proves the first of (6.23) and (6.24). Standard properties of the geo-

metric series lead to the expression (6.25). For ' B1 it holds that ' B t

1 = V 1 0 0 0 dt V '' 1 1 which proves the second of (6.23) by substitution of V 1. b) If λ2i''1 = λ2i, block Bi can be expressed as Bi = V iJiV '' 1 i , with V i = λ2i 1 '' λ2i 1 ''1 , J i = λ2i 1 0 λ2i , V '' 1 i = 1 1 '' λ2i 1 ''λ2i , (6.67) where J i is a Jordan block of size Q = 2. It follows that B t

i = V iJ t

iV ''1 i = V i λt 2i tλt''1 2i 0 λt 2i V '' 1 i which, by use of (6.67), proves (6.26). 6.A.3 Proof of Theorems of 6.4 Proof of Theorem 6.3 By letting ' B t

i = βt 1,i βt 2,i βt3,i β t 4,i with Q the unitary matrix which diagonalizes F , it holds that ' M t = Q Q diag(βt 1,i) diag(βt 2,i) diag(βt 3,i) diag(βt 4,i) Q'' Q'' . It follows from (6.16) that x(t) '' x('') = γ Q diag(β t 3,i)Q ''θ, therefore the application of standard matrix properties (see [82]) lead to kx(t)''x('')k2 ' γ kθk2 maxi |β t 3,i|. Now, from (6.23) it is |β t 3,i| = d t/(1 '' d), which is valid for t > 1. On the other hand, from (6.26) it holds that |β t 3,i| ' tρ t ''1, and the same bound is valid for (6.25). Therefore, for t > d/(1 '' d) = ǫ it is dt/(1 '' d) < tρ t ''1 as d ' ρ, and this proves (6.28). 6.A. Appendix 119 Proof of Theorem 6.4 Taking the expected value of 'x(t) and using (6.41) leads to kγQHtQ'' 'θk2 ' γkHtk2k 'θk2 . By inspection of (6.42) (the reader is also referred to (6.25)), it can be seen

that kHtk2 ' tξ t ''1, which proves the first of (6.44). For the noise signal 'η(t), (6.43) and the statistical independence of w(t) lead to E k 'η(t)k 2 2 = t ''1 X k=0 E kQHkQ''w(t '' k)k 2

2 = t ''1 X k=0 trace (HkQ''RwQH''k) ' + '' X k=0 trace (HkQ''RwQH''k) = E k 'η('')k 2 2 . By recalling that the Hk are real-valued and diagonal matrices, it holds that E k 'η('')k 2 2 = '2 q N X i=2 '' X k=0 h 2 i,k ! [Q''RwQ]i,i '2q . Simple algebra manipulation allows to express '' X k=0 h2 i,k = 0 , i = 1 1+λ2i''1λ2i (1 ''λ22i ''1 )(1 ''λ22i)(1''λ2i''1λ2i) , i 6= 1, λ2i''1 6= λ2i 1+λ2i''1λ2i (1 ''λ2i''1λ2i)3 , i 6= 1, λ2i''1 = λ2i . The bound Ki in (6.45) is obtained from (1 ''λ2i''1λ2i) 2 > (1''λ22i''1)(1''λ22i), and from 1 + λ2i''1λ2i (1 '' λ22i''1)(1 '' λ 2 2i)(1 '' λ2i''1λ2i) = 1 + fi '' γ (1 '' fi)(1 + 3fi '' 2γ)(1 '' fi + γ) (6.68) where (6.22) is also used. The bound Ui for Method B descends on the fact

that matrix Q diagonalizes F , therefore [Q''RwQ]i,i '2q = [Q''(F '' I) 2Q]i,i = (fi '' 1)2 . 120 6. Fast Consensus via ADMM For Method A, the issue is more involved, since Q does not necessarily diag-

onalize (S ''diag(S'1))(S'''diag(S'1)). Generally speaking, an upper bound Ui can be derived by exploiting the positive semidefinite inequality sr (S ''diag(S'1))(S'''diag(S'1)) I (S''diag(S'1))(S'''diag(S'1)) . If S is symmetric, then (6.18) leads to F = d(S 2 ''I)+I, therefore Q results to diagonalize S, obtaining [Q''RwQ]i,i '2 q = (fi '' 1) 2 + d2[Q''(S '' I)2Q]i,i = (fi '' 1) 2 + d2(si '' 1)2 ' (fi '' 1) 2 + 4d2 where si are the real valued eigenvalues of S which satisfy ''1 < si ' 1. Finally, if S is also positive semidefinite, then the relation F = d(S 2 ''I)+I leads to fi = d(s 2 i '' 1) + 1 = '' si = r f i '' γ d which provides the last bound Ui by substitution. 6.A.4 Proof of Results of 6.5 Sketch of the proof of (6.47) and (6.48) The cases ǫ ' 1 and ǫ > 1 are treated separately, when the initial state is given by (6.46). ' ǫ ' 1: By inspection of Fig. 6.1a, it must be noted that λ2i''1 ' λ2i for all i, so that just λ2i''1 can be considered for the optimization purposes.

It results that λ2i''1 is a monotone increasing function of 'i, so that ξ is

determined by '2, hence ξ = λ3. The minimum is given by the lowest

curve with ǫ = 1, to have ξ = λ3 = ( 2'2 '' 1 , '2 ' 3

4 1

2 , '2 < 3

4 . (6.69) ' ǫ > 1: By inspection of Fig. 6.1b, it is |λ2i''1| ' |λ2i|, so (again) just λ2i''1 can be considered. The two cases, '2 ' 3

4 and '2 < 3

4 , need to be treated separately. 6.A. Appendix 121 '' '2 ' 3

4 : It holds that |λ3| > |λ2i''1| for all i > 2, hence ξ is (again) determined by '2 as ξ = |λ3|. The minimum here is reached by choosing ǫ in such a way that '2 corresponds to the point where

the eigenvalues locuses separate, that is, when f2 = fhigh. This

corresponds to the first line of (6.47). '' '2 < 3

4 : In this case, the minimum is guaranteed when '2 and 'N are equal, that is when |λ2N''1| = |λ3|. This provides the second line of (6.47). It must be observed that the values of ξ in (6.48) (derived by choice (6.47))

are the optimum choices, since they are smaller than those of (6.69). On the other hand, when the initial state is given by (6.13), the eigenvalue d also plays a role. For '2 ' 3

4 the solution provided by (6.47) is still optimal because ξopt ' d (see plots in Fig. 6.1b where d = λ2 set by '1 = 1). For '2 < 3

4 , the two cases ǫ > 1 (for which d > 1

2 , and so ρ > 1

2 ) and ǫ ' 1 (for which ξ ' d, and so ρ = ξ) need again to be treated separately. Thus, (6.69) is still the optimum choice, which gives ρ = 1 2 for ǫ = 1. Proof of (6.53) According to [73] the parameters (θ1, θ2, θ3) = ( ''δ, 0, 1 + δ) are used, so that the block Bi has the form Bi = (1 + αδ)ρi ''αδ 1 0 = λ2i''1 + λ2i ''λ2i''1λ2i 1 0 . Using the above in (6.68) leads to Ki = 1 + λ2i''1λ2i (1 '' λ22i''1)(1 '' λ 2 2i)(1 '' λ2i''1λ2i) = 1 (1 '' α2δ2)(1 '' ρ2i) . By then assuming the noise model w(t) = (1 + αδ)(W '' I)q1, equivalent to the one used in (6.36), it turns out that Ui = (1 + αδ)2(ρi '' 1) 2, from which (6.53) follows. 122 6. Fast Consensus via ADMM Chapter 7 An ADMM-Based Clock

Synchronization Algorithm The results presented in this chapter refer to a collaboration of Prof. Van-

gelista and me with Dr. Emiliano Dall''Anese 1 and Prof. Tomaso Erseghe2. They have been published in [83]. 7.1 Introduction Differently from the algorithms presented in Part I of this thesis, in this

chapter a clock synchronization algorithm is proposed in which a correction

signal is applied at each step of the algorithm. As it has already been said,

this makes the clocks be coupled together. The applied control signals result

to be linear functions of the observations of each node''s neighborhood, so

that this algorithm falls down in the class of linear consensus-based clock

synchronization. t will be clear next that the full synchronization issue (both

in frequencies and counters) is recast as a double consensus problem. In Chapter 6 an innovative algorithm for reaching average consensus with convergence speed among the best in the literature has been proposed: in the

present chapter the ADMM-based consensus is indeed applied to the consen-

sus on clock frequencies through the implementation of the required message

passing procedure based on the measurements of frequency displacements.

Plain consensus is applied for reaching agreement on the clock counters, but

this only affects marginally the results. In fact, it is shown that the innovative 1E. Dall''Anese is with the Department of Electrical Engineering, University of Min- nesota, 200 Union St., MN 55455, Minneapolis, USA. 2T. Erseghe is with the Department of Information Engineering, University of Padova, via G. Gradenigo 6/b, 35131 Padova, Italy. 123 124 7. An ADMM-Based Clock Synchronization Algorithm application of the ADMM-based of Chapter 6 to the problem of consensus

on clock frequencies allows to substantially reduce the time needed to get the

whole WSN synchronized, with respect to what happens with the application

of the standard consensus in both clock frequencies and clock counters. The

increased convergence speed has not a bad reflection on the noise resilience,

since it is clearly comparable to other consensus-based approaches. 7.2 Problem Statement The adopted system model is the one described by equations (5.8). As it

has already been said, in this chapter the synchronization on clock phases is

achieved via the application of a plain consensus algorithm, while agreement

on clock periods is obtained via the ADMM-based consensus, so that the

control signal u(t) comes from standard consensus, while v(t) is designed

according to the guidelines described in Chapter 6. By summarizing what is

to be described next, the proposed solution comprises a plain consensus algo-

rithm running in parallel to an ADMM-based consensus algorithm, ensuring

fast convergence to commons values for T (t) and Υ(t). 7.3 ADMM-based Consensus on Clock Periods Consider the goal of computing the average value ¯ Υ , (1/N) P N

i=1 Υi(0). The application of ADMM to this problem leads to the update equation

(6.11), in which x is replaced by Υ, thus providing Υ(t + 1) = (I + D + 2U )Υ(t) '' (D + U)Υ(t '' 1) , (7.1) with initial conditions Υ(1) = Υ(0) = [Υ1(0), . . . , ΥN(0)] ' , and where the involved matrices are those defined in (6.9) and (6.11) (Method A). Under

the assumption (6.17), it holds that C = ǫS , D = dI , d = ǫ ǫ + 1 , (7.2) with S the row-stochastic shape matrix and ǫ > 0 the amplitude factor.

It is possible to optimize the aforementioned parameters (shape matrix and

amplitude factor) in order to maximize the convergence speed, by choosing

a well suited matrix S (such as Sboyd or SAC) and, more importantly, an

optimized value for the amplitude factor ǫ. In fact, while a careful choice for

the amplitude factor is fundamental for reaching higher convergence speed,

the choice of the shape matrix is less relevant, as it has already been showed 7.3. ADMM-based Consensus on Clock Periods 125 in Chapter 6. For speeding up the convergence toward ¯ Υ, throughout the remaining of this chapter ǫ will be set equal to ǫopt according to (6.47). Using (7.2), a new matrix can be defined, being L = SQ , Q = [diag(S ' 1)]''1 S' , a symmetric doubly stochastic (L = L ' , L1 = 1) and positive semi-definite matrix, which allows to rewrite (7.1) as Υ(t + 1) = Υ(t) + d h (L '' I)Υ(t) + L(Υ(t) '' Υ(t '' 1)) i . (7.3) In general, (7.3) provides fast convergence (much faster than the optimal

Boyd''s solution) and resilience to noise impairments. From (7.3) it follows that the enforcing control v(t) at time t in (5.8b) can be expressed as v (t) = d h (L '' I)Υ(t) + L(Υ(t) '' Υ(t '' 1)) i , (7.4) and it can be seen that it is only dependent upon the clock periods vectors

Υ(t) and Υ(t '' 1). 7.3.1 Algorithm implementation The implementation of the control signal (7.4) is not an issue, since standard

PLL or DLL structures [34] are well suited techniques for correcting the

oscillation frequency. On the other hand, the values Υ(t) are not measurable

quantities. Fortunately, other related quantities can actually be measured

via the application of largely used techniques. In fact, methods that rely on

time differences [15,17,84], or efficient frequency estimation techniques (such

as the Schmidl-Cox [85]) can be effectively used to measure the (relative)

frequency displacements between node pairs, i.e., ''fi,j(t) , Υj(t) Υi(t) '' 1 , '' j '' Ni , (7.5) sequentially gathered at each node i. By exploiting the definition of the frequency correlation factor for sensor i at time t, i.e., ''Fi(t) , Υi(t '' 1) Υi(t) '' 1 , (7.6) the control signal (7.4) can be rewritten as a function of (7.5) as ''Fi(t + 1) '' d X j ''N (2) i Li,j h ''Fj(t) '' ''fi,j(t) i , (7.7) 126 7. An ADMM-Based Clock Synchronization Algorithm where N (2) i denotes the set of two-hop neighboring nodes of sensor i. Since the frequency offsets of practical systems is quite small (usually it ranges

from 10 to 100 ppm [43]), in the derivation of the above the approximations

[1 + ''Fi(t + 1)]''1 '' 1 '' ''Fi(t + 1) and ''Fi(t + 1)''fi,j(t) '' 0 have been used. The two-hop information flow represented by (7.7) requires a double in- formation exchange at each iteration t. More in the specific, if n is the in-

termediate node between nodes i and j, it is n '' Ni and j '' Nn. Moreover, again under the assumption of small frequency offsets, the approximation

''fi,j(t) '' ''fi,n(t) + ''fn,j(t) can be made. With this in mind, (7.7) can be expressed as ''Fi(t + 1) '' d X n ''Ni Si,n h ''Gn(t) '' ''fi,n(t) i , (7.8) with ''Gn(t) = X j ''Nn Qn,j h ''Fj(t) '' ''fn,j(t) i . (7.9) It must be recalled that the quantities ''fi,n(t) and ''fn,j(t) are measured

at node i and n respectively, while ''Fj(t) is transmitted during the first

message exchange and ''Gn(t) during the second exchange. 7.3.2 Noise resilience The frequency estimation process as well as the message exchanges involve

the presence of corrupting noise. Referring to eq. (7.8) and (7.9), an AWGN

model is assured both for the estimation noise affecting {''fi,n(t), ''fn,j(t)} and for the quantization/communication noise in {''Fj(t), ''Gn(t)}. All sources of noise are assumed mutually independent; this comes after the

assumption that {''fi,n(t), ''fn,j(t)} are estimated independently. Noise components are incorporated in the N '1 vector wv(t), to be added to the noiseless control signal (7.4), to obtain v (t) = d h (L '' I)Υ(t) + L(Υ(t) '' Υ(t '' 1)) i + wv(t) . (7.10) Because of the aforementioned assumptions, wv(t) is AWGN, zero mean and

with covariance matrix given by Rv , E [wv(t)w'' v (t)] '' d 2T 2 nom h S ΣGS ' + ((S ' Σf)S') ' I + LΣF L ' + S[((Q ' Σf)Q') ' I]S' i , 7.4. Consensus on Clock Counters 127 where ' stays for the Hadamard (entry-wise) matrix product, Υnom denotes the nominal clock period, and ΣF , diag('2F 1 , . . . , ' 2 FN ) , ΣG , diag('2 G1 , . . . , ' 2 GN ) , Σf , ['2 fi,j ]i,j=1,...,N , collect estimation and transmission noise variances. Because of uncorrelated noise, the following result holds from Theorem 6.4: Theorem 7.1 The noisy control signal (7.10) assures consensus on clock

periods Υ(t). Letting the deviation from the average value of Υ(t) be written

as ' Υ(t) , KΥ(t), with K = I '' 1 N 11 ' , upon defining the mean squared error on estimating ¯ Υ, MSEΥ(t) , E h k 'Υ(t)k 2 2 i , (7.12) the asymptotic convergence can be expressed by the norm-2 result lim t '''' MSEΥ(t) ' N X i=2 (1 + d''i)'2max d(1 '' ''i)(1 '' d''i)(2 + 3d''i '' d) (7.13) where ''i, i = 2, . . . , N are the eigenvalues of L excluding ''1 = 1, and where

'2 max is the largest eigenvalue of matrix KRv K . 7.4 Consensus on Clock Counters Although a straightforward application of the ADMM-based consensus to

clock phases T (t) is possible since clock counters are directly observable

values, it eventually leads to a bias on the control signal u(t) at steady state.

Therefore, a plain consensus algorithm is applied for reaching agreement also

on such readings, so that the faster convergence property coming from the

application of ADMM to clock periods can be exploited anyway in order to

get faster full synchronization. The noise corrupted control signal u(t) can be expressed as u (t) = (S '' I)T (t) + wu(t) , (7.14) where S is any standard consensus matrix, such those described in Section

5.3.2. Noise wu(t) encompasses AWGN noise sources analogously to wv(t) 128 7. An ADMM-Based Clock Synchronization Algorithm in (7.10). More specifically, wu(t) comes from the noisy readings T (t) and

it is assumed to be zero mean and with covariance matrix Ru = (S '' I)ΣT (S '' I) ' , ΣT , diag('2 T1 , . . . , ' 2 TN ) , with '2 Ti being the variances of communication and/or quantization noise affecting the readings T (t). As for clock periods, the used metric for assessing the algorithm per- formance is the mean squared error in agreeing on common clock counters,

i.e., MSET (t) , E k'c(t)k 2 2 , (7.15) where ' T (t) , KT (t) . (7.16) 7.5 Numerical Results Performance of the proposed ADMM-based clock synchronization algorithm

are now evaluated and compared with existing approaches, in terms of syn-

chronization error ' T (t) and MSE. The considered setup consists of a WSN of N = 50 nodes whose connec- tivity is represented by a RGG with an average number of neighbors equal

to 6. In this scenario (a representation can be observed in Fig. 7.1), nodes

are uniformly disposed in a disc area of 1 m2. ADMM-based consensus is applied for reaching consensus on clock pe- riods, with averaged consensus shape matrix SAC (see Section 5.3.2) and

amplitude factor ǫ optimized for convergence speed (see (6.47)). Matrix S

is also set equal to the averaged consensus matrix (ADMM+AC),. Noise is

characterized by the (diagonal) covariance matrices Ru = '2uI and Rv = ' 2 v I , with '2u = 10'' 2 and '2 v = 10 ''6 respectively for the control signals u(t) and v (t). The nominal oscillation frequency is set to fnom = 1/Υnom = 32768 Hz. The initial conditions Υ(0) are chosen as independent Gaussian random

variables, with mean Υnom and standard deviation 50 ppm. The initial con-

ditions for clock counters T (0) are chosen as independent Gaussian random

variables with zero mean and unitary variance. Finally, the time interval

between two successive iterations of the algorithm is set to 10 seconds. 7.5.1 Convergence Rate and Noise Resilience Through an analysis of the synchronization error (7.16) and of the mean

squared errors (7.15) and (7.12), in this section the convergence speed and the 7.5. Numerical Results 129 Figure 7.1: Network communication graph. noise resilience of the proposed ADMM-based algorithm ((5.8a) and (5.8b))

will be evaluated. The phase errors at five nodes is showed in Fig. 7.2. It can be seen how the synchronization error decreases as time goes by. Moreover, a snapshot around

t = 560 shows that, from this time on, the network remains synchronized,

that is, the absolute value of the synchronization error stays below 1. Note

that in the network setup of Fig. 7.1, nodes are pretty sparse, so that the

algorithm takes a while to get the network synchronized. On the other hand,

it will be shown next that other consensus-based approaches have worse

performance. The characteristic sawtooth shape of Fig. 7.2 is a consequence of the fact that, at the discontinuity points, the correction signals ((7.14) and (7.10)) are

applied, and until the next iteration, the frequency skew remains constant;

thus, in such intervals the synchronization error increases linearly. However,

the slope of the error decreases as time goes by, denoting a progressive im-

provement in the clock periods agreement across the network. In Fig. 7.3 the considered metrics are the MSEs (7.15) and (7.12), namely MSET (t) and MSEΥ(t) respectively. Consensus on clock phases comes ob-

viously after agreement on periods has been achieved. Moreover, it can be 130 7. An ADMM-Based Clock Synchronization Algorithm 0 100 200 300 400 500 600 700 800 900 1000 ''400 ''200 0 200 400 t 190 195 200 205 210 215 220 ''2 ''1 0 1 2 t Figure 7.2: Synchronization phase error Ti(t) for five nodes. noted that the initial disagreement on the clock periods makes the clock

phase error to increase. Such trend is rightafter inverted as consensus on

clock periods is pursued. Noise bound (7.13) for the estimation of ¯ Υ is also showed in Fig. 7.3, confirming its tightness: in fact,the simulative error curve overlaps the bound

at steady state. 7.5.2 Performance Comparison with Standard Consen-

sus The ADMM-based clock synchronization algorithm provides better perfor-

mance with respect to synchronization algorithms based on plain consensus,

even with optimized settings. In fact, Fig. 7.4 shows MSET (t) and MSEΥ(t)

for ADMM+AC as well as the performance coming from the double applica-

tion of plain consensus, both to clock phases and to clock periods. The (com-

mon) matrix used for the double consensus synchronization algorithm is the

averaged consensus matrix SAC (AC). As a benchmark, (since its high com-

putational cost prevents its practical implementation) the Boyd''s optimum

solution Sboyd (see Section 5.3.2) is also used (BO). The great performance

improvement brought by the ADMM-based consensus is well visible: for

clock periods, the convergence speed of the algorithm introduced in Chap-

ter 6 proves to be approximately more than three times the one achieved

by plain consensus. This excellent performance also brings relevant bene- 7.5. Numerical Results 131 0 50 100 150 200 250 300 350 400 450 500 10 ''6 10 ''4 10 ''2 10 0 10 2 t ADMM + AC

noise bound 0 50 100 150 200 250 300 350 400 450 500 10 ''2 10 0 10 2 10 4 t Figure 7.3: MSE for clock phases and clock periods with ADMM+AC. 0 50 100 150 200 250 300 350 400 450 500 10 ''6 10 ''4 10 ''2 10 0 10 2 t ADMM + AC

AC + AC

BO + BO 0 50 100 150 200 250 300 350 400 450 500 10 ''2 10 0 10 2 10 4 t Figure 7.4: MSE of ADMM+AC compared to standard and optimized con-

sensus approaches. 132 7. An ADMM-Based Clock Synchronization Algorithm fits for reaching full synchronization: in fact, while for standard consensus

more than 340 iterations are needed in order to get the network synchronized

(i.e., MSET < 1), the same performance is obtained by waiting almost 270

iterations if using the ADMM based consensus. Finally, as it can be observed in figure, the noise resilience of the proposed algorithm is comparable to that of the standard consensus approaches. 7.6 Conclusions While in Chapter 6 a new distributed consensus algorithm based on the

ADMM has been introduced, here it was shown how its application to the

problem of clock synchronization in WSNs leads to better results, compared

to the state-of-the-art of consensus-based clock synchronization. ADMM-based consensus was here applied for achieving agreement on clock periods, while plain consensus was used for clock phases, since a direct

application of the method proposed in Chapter 6 would have lead to an

undesired and not negligible bias on the correction signal u(t). Even if ADMM-based consensus could not be straightforwardly applied to clock phases synchronization, its use for agreement on clock periods had

extremely positive consequences on the convergence speed of the overall al-

gorithm. In fact, it allowed to reach full synchronization well in advance with

respect to existing consensus-based approaches. Appendix A Efficient Base Station Selection in

Uplink Cellular Networks The results presented in this chapter refer to a collaboration with Prof. Van-

gelista and Prof. Stefano Tomasin 1 and have been published in [86, 87]. While falling out of the main topic of this thesis, because of its relevance in

my Ph.D. program this work has been included as an appendix chapter. Be-

ing self-standing, the present work can also be appreciated by the interested

reader independently from the rest of the thesis. A.1 Introduction Wireless cellular systems are characterized by the communication between a

mobile terminal (MT) and one or more base stations (BSs). Recent advance-

ments in the research community thinking brought the concept of coopera-

tion among BSs to the attention of the other researchers as well as to the

industry. In fact, by adopting cooperation among BSs the network through-

put can be increased via the implementation of a distributed multiple-input

multiple-output (MIMO) system, also known as macro-diversity or multi-cell

processing (MCP). MCP has been first studied from an information theoretic viewpoint and later also in more realistic scenarios with correlated Rayleigh fading between

antennas [88] and with possible application to 3GPP long term evolution

(LTE) advanced [89]. Moreover, scheduling and power allocation have been

studied for MCP in downlink [90]. 1S. Tomasin is with the Department of Information Engineering, University of Padova, via G. Gradenigo 6/b, 35131 Padova, Italy. 133 134 A. Efficient Base Station Selection in Uplink Cellular Networks In uplink, MCP encompasses the cooperation among different BSs in or- der to decode the packet transmitted by a MT in a joint fashion at the

serving BS [91, Chapter 2]. In other words, in uplink, cooperating BSs share

information on the received packet therefore they need to exploit the back-

haul network in a non-negligible manner. The issue of designing the proper

characteristics of the backhaul network has been faced in [92, 93], where the

capacity of an uplink MCP system with limited backhaul throughput has

been investigated. Also, in [94], the goal is to maximize the user throughput

while having a limited backhaul capacity. Cooperation among BSs has been studied in [95] for application to a fre- quency division duplexing (FDD) LTE network in which both demodulation

and decoding are performed at the serving BS, thus requiring a heavy use of

the backhaul network. Differently, in [96] just decoding is performed at the

serving BS, therefore only soft information bits are forwarded to the serving

BS. A study on the asymptotic properties of the spectral efficiency of both

the optimum and the linear minimum mean squared error (MMSE) joint

multi-cell receivers have been performed in [97] assuming a large number of

users per cell, showing that cooperation among receivers allows a significant

performance improvement. As it can be evinced from this discussion, the main issue in the design of an uplink MCP system is the high capacity backhaul network necessary

to exchange soft information on the demodulated signals among BSs. The

selection of the appropriate cooperating BSs for performing decoding jointly

is therefore fundamental from this point of view: on this matter, the authors

of [98] select the BSs in order to maximize the network throughput for a given

user deployment under backhaul capacity constraints. Handoff for uplink

MCP and BS selection are investigated in [99, 100]. In [101], a dynamic

greedy algorithm is proposed for selecting the cooperating BSs considering

multiple MTs transmitting in a cell and using linear receive beamforming

with multi antenna. Considering error control policies leads to a higher overhead on the back- haul network, since both automatic repeat request (ARQ) and hybrid ARQ

(HARQ) encompass retransmissions by the MTs, thus increasing the infor-

mation sharing and the average decoding delay, while naturally allowing a

more reliable communication. This work considers an uplink MCP system in which BSs can cooperate in order to decode a packet transmitted by a single MT in the considered sector.

(H)ARQ error control policies are used for maintaining a threshold level of

quality of service (QoS) in terms of maximum allowed outage probability

after a given number of retransmissions. The objective is to accordingly

select the cooperating BSs such that the average number of cooperating BSs A.2. System Model 135 is minimized (therefore reducing the backhaul network usage). The approach

of this work is different from what most of the literature does since, instead

of imposing a constraint on the backhaul capacity, a QoS target must be

guaranteed to the user with the aim of minimizing the backhaul occupancy.

Such an approach especially fits with HARQ error control since the duration

of the packet transmission is not fixed: the backhaul network can therefore

be statistically exploited by each MT. Another novel aspect of this work consists on the number of considered policies: in fact, while error control is implemented either with ARQ, HARQ

with chase combining or HARQ with incremental redundancy, two levels of

cooperation among BSs are also assumed: joint decoding: cooperating BSs forward soft information to the serving BS which tries to decode the packet local decoding: cooperating BSs try to decode the packet locally and only exchange one information bit on whether they decoded the packet or

not. In this work, the number of cooperating BSs at each retransmission frame is optimized assuming to have perfect knowledge on the average channel at-

tenuation (path loss) between the MT to each cooperating BSs. Since finding

the optimum solution is computationally expensive, an iterative algorithm is

introduced which proves to be both efficient and effective. Although making

strong assumptions on the channel knowledge, this work provides a way to

increase the system efficiency and may provide further insights for possible

future implementations of MCP solutions. A.2 System Model The considered scenario (see Fig. A.1) involves a MT transmitting a packet to

a serving BS, which is typically the closest BS to the terminal. It is assumed

that the surrounding BSs may help the serving BS through cooperation in

a macro diversity approach. Such BSs are called cooperating BSs. Just

one transmitting MT is considered, since tit is assumed that a pre-selection

phase [102] has already taken place. However, a cooperating BS may serve

other MTs as well as cooperate with other serving BSs at the same time. As

an alternative to the use of other BSs as cooperators, specific devices (re-

lays) may be used, therefore implementing a distributed broadband wireless

communication (BWC) system. On the other hand, this is not relevant for

the upcoming analysis. In fact, the purpose of this work is to verify if using 136 A. Efficient Base Station Selection in Uplink Cellular Networks cooperating BSs proves useful for reducing the backhaul usage in decoding a

packet transmitted by a MT while meeting certain requirements on the QoS. Figure A.1: The MT (red cross) is communicating with the M = 3 closest

BSs (blu squares) in a cellular scenario. Unsuccessful decoding is due to channel fading conditions, so that meet- ing QoS targets requires the adoption of error management policies. Error

control is implemented via the use of retransmission policies like plain ARQ

or hybrid ARQ (HARQ), the latter either with chase combining (HARQ-CC)

or incremental redundancy (HARQ-IR). Considering a time division multi-

plexing system, retransmissions take place in different frames, each organized

in two slots: the first slot is used by the MT to transmit data packets at fixed

spectral efficiency R (in bit/s/Hz), while in the second slot the serving BS re-

sponds either with an acknowledgement (ACK) or with a not ACK (NACK)

to the MT, in order to signal successful or failed packet decoding, respec-

tively. The second slot is generally shorter than the first, being ACKs and

NACKs shorter packets with respect to data packets transmitted by the MT.

A further assumption about ACK and NACK packets is that they are always

decoded successfully by the MT: this can be reasonably assumed if, for ex-

ample, the transmission rate of ACKs and NACKs are low enough. (H)ARQ A.2. System Model 137 frames are retransmitted until either the serving BS decodes the packet or a

maximum of N frames is reached, in which case an outage event is declared. Channel fading is assumed rapidly time-varying and not known at the BSs. The average channel conditions (i.e., path loss in the channels from the

MT to the BSs) are instead assumed known at the BSs. Given the channel

statistical description, the serving BS selects xn '' [0, . . . , M] BSs (including itself) that will cooperate at frame n '' [1, . . . , N] for decoding the packet transmitted by the MT. The xn BSs are denoted as active at frame n. Two cooperation methods among BSs are envisioned in this work. On one hand, the active BSs try to decode the packet locally after each frame

without exchanging any further soft information: this approach is called local

decoding (LD), and the only data exchange between BSs is 1 bit indicating

the outcome (either success or failure) of the local decoding process at the end

of each frame. Differently from LD, in joint decoding (JD) the cooperating

BSs instead forward all the received bits to the serving BS which tries to

decode the packet making use of all the available data. In this latter case,

the active BSs exchange soft information at the end of each frame. Because of

the greater amount of available data, JD is expected to perform better than

LD in terms of QoS. This comes at the cost of a larger use of the backhaul

network. As a matter of comparison, also the case in which just one (the

serving) BS is active is considered and denoted by the acronym NC, which

stands for no cooperation. Let 'm be the average signal to noise ratio (SNR) of the channel between the MT and the BSs m. It is assumed that the BSs are sorted according to

decreasing values of the SNR, i.e., '1 ' '2 ' . . . ' 'M . (A.1) At each frame n, the serving BS selects the xn BSs with the higher SNRs to

get the best results. In order to evaluate the performance of the various approaches, an appro- priate metric must be introduced. To this end, let pn(x, ', R) be the proba-

bility of decoding failure at the serving BS before frame n, with n = 1, . . . , N;

let also be p1(x, ', R) = 1. In the definition of pn, the dependence on the

number of active BSs at each frame x , [x1, . . . , xN], on the SNRs of the BSs-MT links ' , ['1, . . . , 'M] and on the required spectral efficiency R is made explicit. The fundamental metric adopted in this work is then the

following. Definition A.1 The average number of active BSs is defined as f (x) , N X n=1 xnpn(x, ', R) . (A.2) 138 A. Efficient Base Station Selection in Uplink Cellular Networks Since f is a weighted average of xn, it provides a measure of the backhaul

network usage, even if a specific metric will be introduced in the next section

for such specific purpose. The objective here is to elaborate a strategy in order to accordingly choose the vector x so that the average number of active BSs (A.2) is minimized,

under a constraint on the maximum outage probability. In other words, the

goal is to solve the problem min x f (x), (A.3a) subject to pN+1(x, ', R) ' θ, (A.3b) xn '' {0, 1, . . . , M}, n = 1, 2, . . . , N. (A.3c) This is essentially different from just requiring to minimize the outage prob- ability. In fact, in that case the optimum solution would be simply setting

xn = M for all n, thus overloading the backhaul network. However, in prob-

lem (A.3) the goal is to preserve the backhaul resources while at the same

time guaranteeing a target QoS. In order to attack problem (A.3), it is necessary to derive explicit ex- pressions for the outage probability in each combination of error correction

method and decoding policy. This is to be done in the following sections. A.3 Outage Probability Expressions The goal of this section is to provide explicit expressions for the outage prob-

ability pn before n frames, in order to enforce the constraint (A.3b). All the

cooperation policies among BSs and retransmission methods are considered,

while assuming that the channels suffer of flat Rayleigh fading with path

loss. To this end, new quantities and notations are introduced. Moreover, in

order to evaluate the cell coverage, special focus is given to the case 'm = '

for all m, this case being the condition in which the MT is at the edge of

all surrounding cells. The importance of this case resides on the fact that it

provides the highest outage probability, thus proving useful in system design

for cell dimensioning and sites position planning. The considered combinations of decoding policies and error control meth- ods encompass JD, LD and NC, used together with both HARQ-CC and

HARQ-IR. Plain ARQ is used only in combination with LD and NC, since

JD has no difference with LD, as it will be clear next. Table A.1 summarizes

the aforementioned combinations. In this section, co-user interference is not considered in order to obtain analytical results. The scenario in which only one MT is transmitting inside A.3. Outage Probability Expressions 139 ARQ HARQ-IR HARQ-CC NC X X X LD X X X JD X X Table A.1: Combination of decoding policies and retransmission methods. the sector at that specific time can be achieved by several techniques, ranging

from multiple antennas or multiple receivers in each BS and time or frequency

multiplexing among cells, at the price of either a higher cost or a lower

spectral efficiency. Section A.5 reports simulative results in which both inter-

cell and intra-cell interference are taken into account. In combinations involving the use of JD and HARQ, there is a non negligi- ble backhaul occupancy, since data on the received signals must be forwarded

from the cooperating BSs to the serving BS. On the other hand, with LD

no soft information forwarding is needed, and the backhaul occupancy (1

bit) can be considered negligible. Given this, in order to better evaluate the

backhaul usage of JD, it will be assumed, for simplicity, that the BSs are

connected on a ring, each BS being able to communicate directly just to the

previous and the following BS in the chain. Definition A.2 The average number of backhaul packets exchanged among

active BSs BP is defined as the average sum of the lengths of the paths from

each active BS to the serving BS, BP = N X n=1 xn '' 1 2 xn '' 1 2 + 1 + '(xn) xn + 1 2 pn(x, ', R) , (A.4) with '(xn) = ( 1 xn even

0 xn odd. The backhaul occupancy is not the only performance metric adopted in this work. In fact, it is well known that retransmissions due to decoding

failures imply a delay on the packet delivery. Therefore, it is necessary to

keep into account of such delay by introducing a proper metric. Definition A.3 The normalized (with respect to the packet duration) aver-

age decoding delay D is the average time required for decoding a packet when

transmitted in uplink from a MT, i.e. D = X n pn(x, ', R) . 140 A. Efficient Base Station Selection in Uplink Cellular Networks The delay D depends on the MT position. Its meaning does not describe a

time but the number of transmissions needed to achieve correct decoding. In

fact, the purpose of introducing such a quantity is to get a measure of the

retransmission delay, while being transparent to delays in the retransmission

requests from the BS. Moreover, the design of the backhaul network must

consider the traffic of the soft information as well as the signalling packets

about channel conditions that allow BS selection, in order not to impact

on D. The actual performance comparison will be done on a delay-related

quantity: Definition A.4 The normalized average payload throughput ' is the ratio

between the required spectral efficiency and the (normalized) average decoding

delay, i.e. ' = R D . (A.5) In order to simplify the upcoming analysis, the indicator function relative to BS m at frame n ym(n) , ( 0 no cooperation

1 cooperation , (A.6) is introduced. Since the BSs are ordered with increasing path loss with

respect to the MT, ym(n) are univocally determined from xn. Moreover, it

is useful to introduce the number of frames in which BS m has cooperated

until frame n as Ym(n) , n X j=1 ym(j), for m = 1, 2, . . . , M, and for n = 1, 2, . . . , N. By inspection of (A.6), it follows that the ym(n) can be expressed in matrix form, with the row sums being equal to Ym(N) and the column sums

equal to xn, i.e.,

y1(1) y1(2) · · · y1(N) y2(1) y2(2) · · · y2(N) ... ... . .. ... yM(1) yM(2) · · · yM(N)

+ '' Y1(N) + '' Y2(N) · · · + '' YM(N) '' + '' + '' + x1 x2 · · · xN . In the following, the explicit expressions of the outage probability pn are derived for all the cases of interest. The NC policy is omitted since it can be

obtained from either the JD or the LD case by setting M = 1. A.3. Outage Probability Expressions 141 A.3.1 Local Decoding with ARQ (LD-ARQ) Assuming plain ARQ, if the first transmission by the MT fails, the terminal

retransmits the same packet to the BSs. At the receiver side, the BSs discard

the packet each time a decoding failure happens. This means that there is

no information reuse neither from different frames nor from different BSs, so

that the only meaningful policy for ARQ is LD. In this case, the expression of

the outage probability after n frames, pn+1, is given by the product '' over the

active BSs and frames '' of the probabilities of decoding failure for each BS

m during each frame n. The spectral efficiency at BS m for a given channel

realization is given by the random variable Φ LD ''A m (x, 'm) , log2 1 + ym(j)'m|hm(j)| 2 , where hm(j) are i.i.d. normalized complex Gaussian random variables mod-

eling the Rayleigh fading in the channel from the MT to BS m at frame j.

Path loss is taken into account by 'm. The outage probability is therefore

given by pLD''A n+1 (x, ', R) = n Y j=1 xj Y m=1 P[Φ LD ''A m (x, 'm) < R] = = n Y j=1 xj Y m=1 am, (A.7) with am , P['m|hm(j)|2 < 2R '' 1] =

= 1 '' e'' 2R ''1 'm . Given the ordering (A.1) assumed for the BSs, it holds that 0 ' a1 ' . . . ' aM ' 1 . (A.8) A.3.2 Local Decoding with HARQ-CC (LD-HARQ-CC) In LD-HARQ-CC the outage probability is the product of the partial outage

probabilities at each BS. The spectral efficiency at BS m is the random

variable ΦLD''CC m (x, 'm) , log2 1 + n X j=1 ym(j)'m |hm(j)| 2 ! , 142 A. Efficient Base Station Selection in Uplink Cellular Networks while the probability of being in outage after n frames is pLD''CC n+1 (x, ', R) = M Y m=1 P ΦLD''CC m (x, 'm) < R . (A.9) Probability (A.9) can be rearranged as M Y m=1 P " n X j=1 ym(j)'m |hm(j)| 2 < 2R '' 1 # = M Y m=1 FC Ym(n), 2R '' 1 'm where FC(k, z) is the cumulative distribution function (cdf) of the Gamma

random variable P k

''=1 |X''| 2 (with X'' i.i.d. normalized complex Gaussian random variables) with scale 1 and shape k, being defined as FC(k, z) = 1 (k '' 1)! γ(k, z), z ' 0 , where γ(k, z) = '' X l=0 ( ''1) l l! zk+l k + l , z ' 0 . For the scenario of multi-relay networks, similar results have been ob- tained in [103]. A.3.3 Joint Decoding with HARQ-CC (JD-HARQ-CC) In JD-HARQ-CC the spectral efficiency is given by ΦJD''CC(x, ') , log2 1 + n X j=1 xj X m=1 'm |hm(j)| 2 ! with outage probability after n frames pJD''CC n+1 (x, ', R) = P ΦJD''CC(x, ') < R . (A.10) Even if in Appendix A.A.2 a closed-form expression of probability (A.10) is

derived, it results computationally intractable. To simplify the treatment,

the worst case scenario is assumed in the absence of interference, that is

the MT is assumed at the edge of all surrounding sectors, and 'm = ' for

m = 1, 2, . . . , M. In this case, probability (A.10) is given by the cdf of a

Gamma random variable with scale 1 and shape kn, where kn , P n

j=1 xj , i.e., p JD ''CC n+1 (x, ', R) = P[' khnk 2 < 2R '' 1] = FC kn, 2R ''1 ' . A.3. Outage Probability Expressions 143 In Section A.5, the system performance is studied with MTs dropped ran-

domly in the sector, thus extending the special case 'm = ' through numer-

ical simulation. A.3.4 Local Decoding with HARQ-IR (LD-HARQ-IR) For LD-HARQ-IR, information is accumulated at each BS, frame after frame.

The spectral efficiency therefore results to be ΦLD''IR(x, 'm) , n X j=1 ym(j) log2 1 + 'm|hm(j)| 2 with the outage probability after n frames given by p LD ''IR n+1 (x, ', R) = M Y m=1 P ΦLD''IR(x, 'm) < R . An alternative view is that BS m fails to decode the packet if the sum of

Ym(N) random variables of the kind log2 (1 + 'mZ) is less than R. Fol-

lowing the approach of [104], ΦLD''IR(x, 'm) is approximated with a Gaus-

sian random variable with mean µLD''tot(m) , Ym(N)µ('m) and variance

'2 LD ''tot(m) , Ym(N )' 2('m), where µ('m) , E[log2(1 + 'mZ)] = log2(e)e 1/'m E1(1/'m) and '2('m) , E[(log2(1 + 'mZ) '' µ('m)) 2] = 2 'm log 2

2(e)e 1/'m G4,0 3,4 1/'m

0,0,0

0, ''1,''1,''1 '' µ 2('m), where E1(x) , R '' 1 t''1e''xtdt and Gm,n p,q z

a1,...,ap

b1,...,bq is the Meijer G function. Therefore the approximation pLD''IR n+1 (x, ', R) '' M Y m=1 Q µLD''tot(m) '' R 'LD''tot(m) holds, where Q[ ·] is the tail probability of the standard Gaussian distribution. 144 A. Efficient Base Station Selection in Uplink Cellular Networks A.3.5 Joint Decoding with HARQ-IR (JD-HARQ-IR) For JD-HARQ-IR, the serving BS keeps accumulating information coming

from the cooperating BSs at each frame, so that the spectral efficiency is

given by ΦJD''IR(x, ') , n X j=1 log2 1 + M X m=1 ym(j)'m |hm(j)| 2 ! . (A.11) The expression of the outage probability after n frames is therefore pJD''IR n+1 (x, ', R) = P ΦJD''IR(x, ') < R . (A.12) A closed-form expression of (A.12) cannot be derived, even for the simple case of large n and 'm = ' for all m. Indeed, in Appendix A.A.1 it is

discussed how even a Gaussian approximation of (A.11) cannot be exploited

since a closed-form expression of the variance '2 Φ does not exist. A.4 Base Station Selection Solving problem (A.3) is not at easy task at all. Being an integer pro-

gramming problem, its solution requires an exhaustive search (ES) over the

(M + 1) N configurations of active BSs at each frame. Indeed, as the maxi- mum number of active BSs M and the maximum number of (H)ARQ frames

N increase, the ES becomes computationally unfeasible. By adopting a con-

tinuous relaxation of the problem, when the outage probability has a closed-

form tractable expression a solution can be found, as shown in Section A.4.2

for LD-ARQ and MT positioned at the same distance from all the M sur-

rounding BSs. In the general case, there is the need of a heuristic algorithm

for recursively computing the number of active BSs at each frame. The

approach being introduced in the next section provides a low complexity

suboptimal solution to problem (A.3), whose performance results extremely

close to ES, thus justifying its adoption in real scenarios. A.4.1 The Recursive Search (RS) The recursive search (RS) algorithm iteratively computes the number of ac-

tive BSs at each frame n, by starting from a static feasible solution and then

modifying the xn of a single unit, one at a time in a greedy fashion. In fact,

the recursive procedure generally decreases the number of active BSs at early

frames and decreases it at later frames, trying to reduce the average number A.4. Base Station Selection 145 of active BSs (A.2). The rationale behind this procedure is that if correct

decoding occurs at an early stage, the BSs scheduled for cooperation in the

next frames are not active. On the other hand, to get an early decoding, a

high number of active BSs is necessary at the earliest frames. Therefore, a

trade-off must be designed between probability of early decoding and number

of active BSs at early frames. With such assumptions, RS looks for the best

trade-off by simultaneously ensuring that the QoS constraint on the outage

probability is met. Although RS is generally suboptimal, in Section A.4.2 it

is shown that in a particular case (LD-ARQ and 'm = ' for all m) it leads

to the optimum solution. The starting configuration of RS is a static cooperation (SC), which is a condition in which the number of active BSs is the same at each frame, i.e., x(0) n = xmin , min{1 ' M' ' M : pN+1([M', . . . , M'], ', R) ' θ}. (A.13) The value of x (0) n is obtained via a search procedure. At iteration i + 1, with i = 0, 1, . . ., the BS configuration is modified by selecting two frame indices s and d, with s < d. Then the number of active

BSs at the early frame s is decreased by one and the number of active BSs

at the later frame d is instead increased by the same quantity, i.e., x(i+1) s = x(i) s '' 1 , x (i+1)

d = x (i)

d + 1 . Then if f (x(i+1)) ' f(x (i)) and condition (A.3b) is still satisfied, then the new BS configuration x(i+1)

is saved as the tentative optimum configuration of active BSs ( ¯ x ). This means that at each iteration the total average number of cooperating BSs is

decreased while meeting constraints (A.3b) and (A.3c). The choice of s and

d satisfies s, d '' {1, 2, . . . , N} and s < d , (A.14) starting from s = 1. Moreover, for a given value of s, d is first set to N

and then reduced, thus first activating BSs at the later stages, to have the

sharpest decrease of f (xi+1). The pseudocode of the RS algorithm is reported

in Table A.2. Note that with RS the total number of active BSs is kept fixed, i.e., N X n=1 x(i) n = xminN . (A.15) If the MT is at the same distance from all the surrounding BSs and ARQ is used, RS assures that the outage constraint is always satisfied without the 146 A. Efficient Base Station Selection in Uplink Cellular Networks Table A.2: Recursive Search Algorithm. Set x (0) n = xmin from (A.13), for all n = 1, . . . , N; Set i = 0

for s = 1 to '' N 2 '' for d = N down to s + 1 while x (i) s > 0 x (i+1) s = x (i) s '' 1; if x (i)

d < M then x (i+1)

d ' x (i)

d + 1 otherwise break; end if

if pN+1(x'', ', R) ' θ and if f(x (i+1)) < f(x(i)) then ¯ x ' x (i+1); end if i ' i + 1 end end end

Return ¯ x as the set of active BSs. need of checking it explicitly. In fact, (A.7) assures that in this case the

outage probability only depends on the total number of active BSs across

all frames. However, in general this is not the case, therefore the outage

constraint (A.3b) must be checked at each iteration. Simulation results re-

ported in Section A.5 show that the performance of RS is really close to the

optimum solution provided by ES, in all considered policies. Moreover, RS

represents a low complex solution since it can be easily shown that it con-

verges in O (MN2) iterations, i.e. in polynomial time. On the other hand,

the ES requires the evaluation of (A.2) for each possible configuration of ac-

tive BSs, therefore requiring O (M + 1)N operations, i.e., an exponential number on the parameter N. Genetic approaches as well as sphere decoding

algorithms have been also proposed in the literature [98]: they lead to good

results but at the cost of a higher computational burden. A.4.2 A closed-form expression for LD-ARQ Considering LD-ARQ leads to a closed-form solution of the continuous relax-

ation of problem (A.3) in a specific case. However, before proceeding with

the specific solution, it is worth to notice why the rationale behind the RS

algorithm is justified. A.5. Numerical Results 147 Theorem A.1 Vector x solving the minimization problem (A.3) for LD-

ARQ satisfies xn ' xn+1, n = 1, 2, . . . , N '' 1. Proof: See Appendix A.A.3 If the MT is positioned at the same distance from the M surrounding BSs, i.e. 'm = ' for m = 1, . . . , M, a closed-form solution can be derived,

as it is shown in Appendix (A.A.4). The solution provides xn = ( '' loga(1 '' xn+1 log a) 1 ' n ' N '' 1

'' loga δ

θ n = N , (A.16) where δ is a solution of the implicit equation E N (δ) = 1 , (A.17) with E 1(δ) , δ E 2(δ) , δ 1 + log δ θ E n(δ) = En''1(δ) h 1 + log En''1(δ)

En''2(δ) i n = 3, . . . , N . (A.18) Eq. (A.17) is trascendent, therefore it has to be solved by the use of numerical methods. However, it has a unique solution, as it is proved in

Appendix A.A.4. The last step is to find a solution in ZN , which is done by approximat- ing the solution (A.16) with the vertex in ZN of the hypercube around x

providing the least average number of active BSs and satisfying the outage

constraint. Such discretization process is a source of sub-optimality for the

analytical solution, but in Section A.5 it is shown that it actually provides

performance close to the optimum. A.5 Numerical Results The considered simulative scenario consists of a cellular system in which a

single MT is present in the studied sector. It is surrounded by M = 3 BSs

arranged on three vertices of an hexagon of unitary side, dividing a cell in

three sectors. Antenna attenuation for out of sector signals is assumed to be

20 dB [105], while the maximum number of retransmission frames is set to

N = 3. Rayleigh fading affects the channel while path loss is assumed with 148 A. Efficient Base Station Selection in Uplink Cellular Networks coefficient 3.4, therefore the SNR of the link between the MT and BS m at

distance d is given by 'm = '0 (d) ''3.4 |ξ|2 , where '0 is the average SNR at unitary distance and ξ is a complex zero-

mean Gaussian random variable with unitary variance modelling the Rayleigh

fading. Interference is kept into account by considering the presence of MTs in the neighboring sectors with interference probability q. The condition q = 0

means total absence of interferers, while q = 1 corresponds to the case in

which all the interfering MTs are transmitting. The positions of interfering

MTs is random within their sectors. Both intra-cell and inter-cell interference

is considered by including a ring of surrounding cells around the studied

sector, assuming a MT (active with probability q) for each sector of the ring. In the following subsections, a number of communication policies are com- pared. Within each policy, two macro diversity approaches are considered:

1) flexible cooperation (FC), in which the active BSs are selected according

to the methods proposed in this work, and 2) static cooperation (SC), where

BSs are activated according to the rule provided by (A.13). For FC, results

for both RS and ES (as a benchmark) are reported. The maximum allowed

outage probability is set at θ = 10''2. A.5.1 Outage Probability in the Sector To provide examples of the distribution of the outage probability in the stud-

ied sector, this section reports spatial plots for two specific cases, assuming

R = 2 bit/s/Hz and activating al the M = 3 BSs. Supposing that the serving

BS is located at the point with coordinates (0, 0), Fig. A.2 focuses on the

policy LD-HARQ-IR with q = 0. It can be observed that the outage prob-

ability increases with the distance from the M = 3 BSs, revealing that the

position at the edge of the cell sectors is the worst case. Similar conclusions

can be drawn looking at Fig. A.3 which reports the spatial distribution of

the average (with respect to the interfering nodes'' positions) outage proba-

bility for LD-ARQ and q = 1. The point at the maximum distance from all

the BSs results to be the worst case for the outage probability also using the

other policies listed in Section A.3, the corresponding plots being omitted. A.5.2 Coverage improvement In order to evaluate the cell coverage improvement allowed by the innovative

FC policies introduced so far in a macro diversity approach, in this section A.5. Numerical Results 149 Figure A.2: Final outage probability pN+1 as a function of the MT position

in a sector with LD-HARQ-IR and in the absence of interference (q = 0).

The serving BS is located at coordinates (0,0) in the plot. Figure A.3: Final outage probability pN+1 as a function of the MT position

in a sector with LD-ARQ and in the presence of interference (q = 1). The

serving BS is located at coordinates (0,0) in the plot. 150 A. Efficient Base Station Selection in Uplink Cellular Networks 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 ''20 ''15 ''10 ''5 0 5 10 15 R [bit/s/Hz] ' min [dB] JD

LD

NC

LD''HARQ''IR''Ga Figure A.4: Minimum value of the SNR at the unitary distance vs the spectral

efficiency of the first transmission for various decoding schemes. Solid line:

HARQ-IR; dashed line: HARQ-CC; dotted line: ARQ. the MT is placed at the edge of three sectors, equally distant from the M = 3

surrounding BSs, that is in the worst case in the sector in terms of outage

probability. It is assumed that all the M = 3 BSs are activated during each

frame, i.e., x = [M, M, M]. Interference-free Scenario When interferers are switched off (q = 0), the minimum acceptable value of

the SNR '0 can be evaluated by letting the MT transmit while imposing

that the outage constraint (A.3b) is fulfilled, i.e., that means finding 'min = min '0 {'0 : pN+1(x, ', R) ' θ} . Fig. A.4 shows the value of 'min as a function of the normalized spectral efficiency R using a) macro diversity with JD, b) macro diversity with LD

and c) no cooperation (NC), i.e. only the serving BS is active. Since all BSs

are active in all frames, a unique line is reported for all cooperation and error

protection scheme, since distinction between RS and ES has not meaning at

this point. JD significantly outperforms both macro diversity with LD and A.5. Numerical Results 151 NC by reducing 'min by about 3.5 dB and 9 dB, respectively, for R = 0.4

bit/s/Hz. Moreover, also the Gaussian approximation discussed in Section

A.3.4 (LD-HARQ-IR-Ga) is shown in Fig. A.4, so that it can be observed

that it provides a satisfying approximation of the simulated performance. Fig. A.4 can be read also from other different viewpoints: in fact, results of 'min can be translated in terms of coverage radius. As an example of the

increased coverage properties yielded by FC schemes, it can be observed that

NC achieves a spectral efficiency of 0.4 bit/s/Hz at the edge of a hexagonal

cell with unitary side, while the same spectral efficiency is achieved by JD

and LD on cells with a side of 1.3 and 1.8, respectively, thus substantially

extending the coverage radius of the BSs. Another reading of Fig. A.4 consists of looking at the achieved spectral efficiency R as a function of the SNR, therefore obtaining information about

diversity and multiplexing gains [106]. In particular, note that HARQ-CC

(dashed lines) has a higher multiplexing gain with respect to ARQ (dotted

lines), while HARQ-IR (solid lines) exhibits also a diversity gain, as can be

seen from the slope at high SNRs. Interference Scenario If the interferers are switched on, their transmissions produce interference

to the studied MT. The outage probability pN+1(x, '(q), R) is therefore also

a function of the interference probability q which determines the signal to

interference plus noise ratio (SINR) vector '(q), so that a threshold for the

maximum value of q can be found for having sustainable communications in

terms of outage probability, for each cooperation policy. Fig. A.5 shows the

maximum value of the interference probability, i.e., qmax = max q {q : pN+1(x, '(q), R) ' θ} , as a function of the spectral efficiency R of the first transmission that can

be reached by the MT of interest, despite interference. Also in this case JD

provides the highest immunity to interference for a fixed spectral efficiency or,

equivalently, letting BSs cooperate increases the achievable spectral efficiency

for a given level of interference. A.5.3 Randomly Dropped Users In this section, MTs are randomly dropped in the sector. In each case, the

cooperation and error protection combinations of Table A.1 are considered

in order to determine the number of active BSs for each frame. The cases in 152 A. Efficient Base Station Selection in Uplink Cellular Networks 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 R [bit/s/Hz] q max JD

LD

NC Figure A.5: Maximum allowed interference probability qmax vs the spectral

efficiency of the first transmission for various decoding schemes. Solid line:

HARQ-IR; dashed line: HARQ-CC; dotted line: ARQ. A.5. Numerical Results 153 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.8 2 2.2 2.4 2.6 2.8 3 3.2 3.4 3.6 3.8 E[f] q RS''HARQ''IR

SC''HARQ''IR

ES''HARQ''IR Figure A.6: Total average number of active BSs as a function of q for HARQ-

IR. Solid line: JD; dashed line: LD. which the outage probability constraint is not statistically met are dropped

and the MT transmission re-scheduled. From data on the number of active

BSs, the performance metrics (A.2), (A.4) and (A.5) are computed. In the

following, the spectral efficiency R is set to 0.4 bit/s/Hz. Total Average Number of Active BSs In analogy with the well known soft handover overhead, the function f (x) ''1 can be seen as a macro diversity overhead, which is proportional to the addi-

tional hardware resources required in the cellular system to implement uplink

BS cooperation. Indeed, active BSs must be able to manage additional up-

link signal processing in order to cooperate with the serving BS. If no extra

hardware is provided in the active BSs, cooperation translates into a spectral

efficiency reduction. Moreover, the backhaul network will be partially used

for the required exchange of information between cooperating BSs, deter-

mining an additional overhead, especially for JD, which will be considered in

more details in the following. For HARQ-IR, Fig. A.6 shows E[f (x)] as a function of the interference probability q, where the average is taken over the random MT position and

'0 = 0 dB. JD outperforms LD in the entire range of q (especially for values 154 A. Efficient Base Station Selection in Uplink Cellular Networks 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.5 2 2.5 3 3.5 4 E[f] q RS''HARQ''CC

SC''HARQ''CC

ES''HARQ''CC Figure A.7: Total average number of active BSs as a function of q for HARQ-

CC. Solid line: JD; dashed line: LD. of q close to 1), thus it can be concluded that information sharing among

the BSs significantly reduces the average number of active slots. For both

JD and LD, RS always outperforms SC while at the same time providing

performance really close to the optimum solution: in fact, for q = 0.5, more

than 25% of active BSs are saved in the JD case. Fig. A.7 shows E[f (x)] for HARQ-CC as a function of q. Also in this case RS has only slightly suboptimal performance with respect to the best

configuration (ES) of active BSs at each frame; in particular, for q = 0.5,

ES outperforms RS only by 4% in the JD case, with a consistent reduction

in computational complexity: for example, in the performed simulations, for

RS the search is completed in less than 5 iterations in all cases, instead of

(M + 1)N = 64 iterations for ES. Moreover, RS significantly reduces the

backhaul usage with respect to SC, since for q = 0.5, it requires 24% less

total active BSs on average for the JD case. Fig. A.8 shows E[f (x)] for LD-ARQ as a function of q. Even in this case, RS significantly outperforms SC while only slightly under-performing

ES. For example, the resource saving with respect to SC at q = 0.5 is of

about 20%. A.5. Numerical Results 155 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 2 2.5 3 3.5 4 4.5 5 5.5 E[f] q RS''LD''ARQ

SC''LD''ARQ

ES''LD''ARQ Figure A.8: Total average number of active BSs as a function of q for ARQ. Backhaul Occupation Fig. A.9 shows E[BP ] as a function of q by remarking that RS allows a strong

saving in the resource usage with respect to SC. As an example, for q = 0.5

and JD-HARQ-IR, RS requires 52% less backhaul links usage on average,

while the saving increases up to 64% for JD-HARQ-CC for the same value

of q. Payload Throughput Fig. A.10 shows E['] as a function of q for a system using HARQ-IR. It

can be observed how SC slightly outperforms RS. This result was expected

since RS tends to reduce the number of active BSs during earlier frames while

increasing it in the latter frames. As a consequence, while the average number

of active BSs is reduced, the average decoding delay is increased, since there

is a higher probability of late decoding. On the other hand, to confirm the

close similarity between ES and RS, they have a very close performance also

in terms of payload throughput. Fig. A.11 shows E['] as a function of q for HARQ-CC. Also in this case it can be observed that SC outperforms RS. By comparing Fig.s A.6

and A.7 with Fig.s A.10 and A.11 it can be observed that while requiring a 156 A. Efficient Base Station Selection in Uplink Cellular Networks 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 E[B P ] q RS''JD''HARQ

SC''JD''HARQ

ES''JD''HARQ Figure A.9: Total average of number backhaul transmissions as a function of

q for JD-HARQ. Solid line: HARQ-IR; dashed line: HARQ-CC. 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.36 0.365 0.37 0.375 0.38 0.385 0.39 E[ '] [bit/s/Hz] q RS''HARQ''IR

SC''HARQ''IR

ES''HARQ''IR Figure A.10: Average payload throughput as a function of q for HARQ-IR.

Solid line: JD; dashed line: LD. A.6. Conclusions 157 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.36 0.365 0.37 0.375 0.38 0.385 0.39 E[ '] [bit/s/Hz] q RS''HARQ''CC

SC''HARQ''CC

ES''HARQ''CC Figure A.11: Average payload throughput as a function of q for HARQ-CC.

Solid line: JD; dashed line: LD. very close payload throughput, the various techniques lead to a substantially

different number of active BSs, i.e. different backhaul occupancies. Hence,

1/f (x) from Fig.s A.6 and A.7 can be read as the energy efficiency of the

transmission. A.5.4 LD-ARQ and 'm = ' To compare the performance of the analytical solution (AS) derived in Section

A.4.2 with the optimal solution (ES) of (A.3) for the LD-ARQ case, in Fig.

A.12 the total average number of active BSs is shown in the case in which

the MT is positioned at the edge of the cell and the transmission rate is set

equal to R = 2 bit/s/Hz. It can be seen that the analytical solution is clearly

optimal. Finally, it can be observed that in the case in which the SNR toward

all the BSs is the same, RS is fully optimal in the LD-ARQ case. A.6 Conclusions In uplink cellular communications, a MT in a sector transmits to a set of

surrounding BSs in a multi-cell processing scenario. Error control schemes 158 A. Efficient Base Station Selection in Uplink Cellular Networks 6 7 8 9 10 11 12 13 14 15 1 1.5 2 2.5 3 3.5 4 ' 0 [dB] f AS''LD''ARQ

ES''LD''ARQ

RS''LD''ARQ Figure A.12: Total number of active BSs when the MT is positioned at the

edge of the cell with ARQ. as well as cooperation policies among BSs are allowed in order to select which

BSs should be activated in the retransmission frames with the aim of minimiz-

ing the average number of active BSs during all frames, under a constraint on

the maximum allowed outage probability. The iterative proposed algorithm,

RS, provides an efficient and just slightly suboptimal solution to the problem,

for each selected combination of error control scheme and cooperation policy.

The proposed algorithm permits a strong performance improvement (up to

64%) in the backhaul usage with respect to existing cooperative solutions

(SC), as it is also confirmed by the provided simulation results. It can be

therefore concluded that RS is a computationally cheap but effective solution

to the problem of minimizing the backhaul usage under a QoS constraint. A.A. Appendix 159 A.A Appendix A.A.1 Gaussian Approximation for JD-HARQ-IR It can be noticed that Wj , P M

m=1 ym(j)|hm(j)| 2 is Gamma distributed with scale 1 and shape xj and E[log2(1 + 'Wj)] = 1 (xj '' 1)! Z + '' 0 log2(1 + 'w)w xj ''1e''wdw . It follows [107] that E[log2(1 + 'Wj)] = 1 (xj '' 1)! log 2 ' 1 ' xj ' xj sin(xj') 1F1 xj; xj + 1; 1 ' '' 1 (xj '' 1)! p''xj ' log 1 ' '' Ψ(xj) '' 1 '(1 '' xj) 2F2 1, 1; 2, 2 '' xj; 1 ' , with pFq (r1, . . . , rp; s1, . . . , sq; z) , '' X k=0 Q p

i=1(ri)k Q q

i=1(si)k zk k! Ψ(z) , 1 (z '' 1)! ''(z '' 1)! ''z and (r)k = r(r + 1) . . . (r + k '' 1) is the rising factorial of r. Finally it is µ' = n X j=1 E[log2(1 + 'Wj)]. However, a closed-form expression for E[log 2

2(1 + 'Wj)] does not exist, there- fore '2 ' cannot be determined analytically. 160 A. Efficient Base Station Selection in Uplink Cellular Networks A.A.2 Closed-form Expression of Probability (A.10) and

ARQ Probability (A.10) can be rewritten thanks to [108] as P " n X j=1 xj X m=1 'm |hm(j)| 2 < 2R '' 1 # = 1 '' M Y m=1 'Ym(n) m ! M X m=1 Ym(n) X l=1 'm,l( '''m) ' Ym(n) ''l+1 m ' Ψ 'm(2 R '' 1), Ym(n) '' l , where Ψ(a, b) is the cdf of a Poisson random variable of parameter a, i.e., 'm,l(b) = ( ''1) l ''1 X '(M,m,l) Y j ij + Yj(n) '' 1 ij 'j(b) . The set '(M, m, l) contains partitions of l '' 1 through the positive integers indices ij, such that P M

j=1,j 6=m ij = l '' 1 and 'j (b) = ('j + b) ''(Yj(n)+ij). A.A.3 Proof of Theorem A.1 The proof of theorem A.1 in the case N = 2 is reported here. It can be easily

extended to the general case. Let x+ ' M and x'' ' x+ be two integers. First observe that both configurations of vector x, [x'', x+] and [x+, x''], satisfy

the outage probability constraint since for LD-ARQ the outage probability

is only determined by the set of active BSs within the whole transmission

irrespective of their order, see (A.7). The aim is to show that f (x'', x+) ' f(x+, x'') . From (A.2) and (A.7) it is f (x+, x'') '' f(x'', x+) =x+ + x'' x+ Y m=1 am+ '' " x'' + x+ x'' Y m=1 am # . Using (A.8) it holds f (x+, x'') '' f(x'', x+) '(x+ '' x'')+ + x'' Y m=1 am(x''a x+ ''x'' x'' '' x+) . A.A. Appendix 161 Then it must be observed that for fixed Q x''

m=1 am, ax'' is minimized when a1 = . . . = ax '' = a , x'' pQ x''

m=1 am, therefore defining g(a) , x+ + x''a x+ '' (x'' + x+ax'') it follows that f (x+, x'') '' f(x'', x+) ' g(a) . Now, note that g(0) = x+ '' x'' ' 0 and g(1) = 0 and also ''g

''a = x''x+ a (ax+ '' ax'') ' 0 for 0 ' a ' 1. Since g(a) is a continuous function of a, the minimum of g(a) is assumed for a = 1 and it is equal to g(1) = 0, hence g(a) ' 0. A.A.4 Closed-form solution of (A.3) for 'm = '. The average outage probability after n frames (A.7) can be written as pLD''A n+1 (x, ') = a P n

j =1 xj , with a , 1 '' e'' 2R ''1 ' . With the application of the Lagrange multipliers method, the problem to

solve turns out to be min x ,δ '(x, δ) = min x ,δ " f (x) '' δ N X n=1 xn '' loga θ !# . (A.19) Note that in defining the Lagrangian (A.19) the bounds (A.3c) on xn are

not taken into account. Indeed, from Theorem A.1 if the solution of the

unbounded problem (A.3a) with the constraint (A.3b) yields xn > M for

some n, for sure it will be xi > M, n ' i ' N. In such a case, the values xi = M, n ' i ' N will be set, therefore attacking the problem (A.3a) with the outage constraint (A.3b) on the n '' 1 unknowns x1, . . . , xn''1 (all the others being equal to M). Setting to zero the derivatives of (A.19) ''' ''xn = a P n''1 j =1 xj " 1 + N ''1 X l=n xl+1a P l

j =n xj ! log a # '' δ = 0, (A.20a) with n = 1, 2, . . . , N, δ provided by (A.17) and (A.18). Therefore ''' ''δ = '' N X n=1 xn + loga θ = 0. (A.20b) 162 A. Efficient Base Station Selection in Uplink Cellular Networks By manipulating the N + 1 equations (A.20), (A.16) is obtained. Lastly, notice that (A.17) has a unique solution, since a product of two monotonically increasing functions is a monotone function if either of the two

functions is positive valued everywhere. This can be seen by noticing that

both the feasible values of δ in En(δ) and the positiveness of δ, from (A.16),

imply En(δ) > 0 for 1 ' n ' N '' 1. Hence, EN(δ) is a monotone function, therefore (A.17) has a unique solution. Appendix B List of Publications Part of the work presented in this thesis appeared also in the following arti-

cles: ' A. Ahmad, D. Zennaro, E. Serpedin, and L. Vangelista, ''A Factor Graph Approach to Clock Offset Estimation in Wireless Sensor Net-

works,' submitted to IEEE Trans. Inf. Theory. ' A. Ahmad, D. Zennaro, E. Serpedin, and L. Vangelista, ''Time-Varying Clock Offset Estimation in Two-Way Timing Message Exchange in

Wireless Sensor Networks using Factor Graphs,' accepted for publica-

tion in IEEE International Conference on Acoustics, Speech and Signal

Processing (ICASSP), 2012. ' T. Erseghe, D. Zennaro, E. Dall''Anese, and L. Vangelista, ''Fast Con- sensus by the Alternating Direction Multipliers Method,' IEEE Trans-

actions on Signal Processing, vol.59, no.11, pp.5523-5537, Nov. 2011. ' D. Zennaro, S. Tomasin, and L. Vangelista, ''Base Station Selection in Uplink Macro Diversity Cellular Systems with Hybrid ARQ,' IEEE J.

Sel. Areas Commun., vol.29, no.6, pp.1249-1259, June 2011. ' D. Zennaro, E. Dall''Anese, T. Erseghe, and L. Vangelista, ''Fast Clock Synchronization in Wireless Sensor Networks via ADMM-based Con-

sensus,' International Symposium on Modeling and Optimization in

Mobile, Ad Hoc and Wireless Networks (WiOpt), 2011, Princeton (NJ),

May 9-13, 2011. ' D. Zennaro, S. Tomasin, and L. Vangelista, ''Uplink Cell Selection for Cooperative Multi-Cell Networks with Hybrid ARQ,' Proc. of IEEE 163 164 B. List of Publications Global Telecommunications Conf. (GLOBECOM), Miami (FL), Dec.

6-10, 2010. Bibliography [1] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, ''Wireless sensor networks: a survey,' Computer Networks, vol. 38, no. 4, pp.

393''422, 2002. [2] I. Chlamtac, M. Conti, and J. J.-N. Liu, ''Mobile ad hoc networking: imperatives and challenges,' Ad Hoc Networks, vol. 1, no. 1, pp. 13 '' 64, 2003. [Online]. Available: http: //www.sciencedirect.com/science/article/pii/S1570870503000131 [3] H. Hartenstein and K. Laberteaux, ''A tutorial survey on vehicular ad hoc networks,' Communications Magazine, IEEE, vol. 46, no. 6, pp.

164 ''171, Jun. 2008. [4] M. Lukac, P. Davis, R. Clayton, and D. Estrin, ''Recovering temporal integrity with Data Driven Time Synchronization,' in

Proceedings of the 2009 International Conference on Information

Processing in Sensor Networks, ser. IPSN ''09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 61''72. [Online]. Available:

http://dl.acm.org/citation.cfm'id=1602165.1602173 [5] D. Mills, ''Internet time synchronization: The network time protocol,' Communications, IEEE Transactions on, vol. 39, no. 10, pp. 1482 ''

1493, Oct. 1991. [6] S. Gezici, Z. Tian, G. Giannakis, H. Kobayashi, A. Molisch, H. Poor, and Z. Sahinoglu, ''Localization via ultra-wideband radios: a look at

positioning aspects for future sensor networks,' Signal Processing Mag-

azine, IEEE, vol. 22, no. 4, pp. 70 '' 84, Jul. 2005. [7] B. Sundararaman, U. Buy, and A. D. Kshemkalyani, ''Clock synchronization for wireless sensor networks: a survey,' Ad Hoc Networks, vol. 3, no. 3, pp. 281 '' 323, 2005. [Online]. Available:

http://www.sciencedirect.com/science/article/pii/S1570870505000144 165 166 BIBLIOGRAPHY [8] B. Sadler and A. Swami, ''Synchronization in Sensor Networks: an Overview,' in Military Communications Conference, 2006. MILCOM

2006. IEEE, Oct. 2006, pp. 1 ''6. [9] O. Simeone, U. Spagnolini, Y. Bar-Ness, and S. Strogatz, ''Distributed synchronization in wireless networks,' Signal Processing Magazine,

IEEE, vol. 25, no. 5, pp. 81''97, Sept. 2008. [10] I. K. Rhee, J. Lee, J. Kim, E. Serpedin, and Y. C. Wu, ''Clock syn- chronization in wireless sensor networks: an overview,' Sensors, vol. 9,

no. 1, pp. 56''85, 2009. [11] J. E. Elson, ''Time synchronization in wireless sensor networks,' Ph.D. dissertation, UCLA, Los Angeles, California, 2003. [12] J. Elson, L. Girod, and D. Estrin, ''Fine-grained network time synchro- nization using reference broadcasts,' SIGOPS Oper. Syst. Rev., vol. 36,

pp. 147''163, Dec. 2002. [13] M. Maróti, B. Kusy, G. Simon, and A. Lédeczi, ''The flooding time synchronization protocol,' in Proceedings of the 2nd international con-

ference on Embedded networked sensor systems, 2004, pp. 39''49. [14] R. Kumar, S. Ganeriwal, and M. B. Srivastava, ''Timing-sync proto- col for sensor networks,' in ACM Conf. Embedded Networked Sensor

Systems, 2003. [15] K.-L. Noh, E. Serpedin, and K. Qaraqe, ''A new approach for time syn- chronization in wireless sensor networks: Pairwise broadcast synchro-

nization,' Wireless Communications, IEEE Transactions on, vol. 7,

no. 9, pp. 3318''3322, Sept. 2008. [16] O. Simeone and U. Spagnolini, ''Distributed time synchronization in wireless sensor networks with coupled discrete-time oscillators,' Wire-

less Communications and Networking, EURASIP Journal on, 2007. [17] L. Schenato and G. Gamba, ''A distributed consensus protocol for clock synchronization in wireless sensor network,' in Decision and Control,

46th IEEE Conference on, Dec. 2007, pp. 2289''2294. [18] R. Carli, A. Chiuso, S. Zampieri, and L. Schenato, ''A PI consensus controller for networked clocks synchronization,' in Automatic Control,

IFAC World Congress on, 2008. BIBLIOGRAPHY 167 [19] A. A. Syed and J. Heidemann, ''Time Synchronization for High La- tency Acoustic Networks,' in INFOCOM 2006. 25th IEEE Interna-

tional Conference on Computer Communications. Proceedings, Apr.

2006, pp. 1 ''12. [20] N. Chirdchoo, W. S. Soh, and K. C. Chua, ''MU-Sync: A Time Syn- chronization Protocol for Undewater Mobile Networks,' in Proc. of

ACM WUWNet, San Francisco, CA, 2008. [21] J. Liu, Z. Zhou, Z. Peng, and J. H. Cui, ''Mobi-Sync: Efficient Time Synchronization for Mobile Underwater Sensor Networks,' in Proc. of

ACM WUWNet, Berkeley, CA, Nov. 2009. [22] F. Lu, D. Mirza, and C. Schurgers, ''D-sync: Doppler-based time synchronization for mobile underwater sensor networks,' in Proceedings

of the Fifth ACM International Workshop on UnderWater Networks,

ser. WUWNet ''10. New York, NY, USA: ACM, 2010, pp. 3:1''3:8. [Online]. Available: http://doi.acm.org/10.1145/1868812.1868815 [23] H. Abdel-Ghaffar, ''Analysis of synchronization algorithms with time- out control over networks with exponentially symmetric delays,' Com-

munications, IEEE Transactions on, vol. 50, no. 10, pp. 1652 '' 1661,

Oct. 2002. [24] D. Jeske, ''On maximum-likelihood estimation of clock offset,' Com- munications, IEEE Transactions on, vol. 53, no. 1, pp. 53 '' 54, Jan.

2005. [25] Q. Chaudhari, E. Serpedin, and K. Qaraqe, ''On Minimum Variance Unbiased Estimation of Clock Offset in a Two-Way Message Exchange

Mechanism,' Information Theory, IEEE Transactions on, vol. 56,

no. 6, pp. 2893 ''2904, Jun. 2010. [26] K.-L. Noh, Q. Chaudhari, E. Serpedin, and B. Suter, ''Novel Clock Phase Offset and Skew Estimation Using Two-Way Timing Message

Exchanges for Wireless Sensor Networks,' Communications, IEEE

Transactions on, vol. 55, no. 4, pp. 766 ''777, Apr. 2007. [27] Q. Chaudhari, E. Serpedin, and K. Qaraqe, ''On Maximum Likelihood Estimation of Clock Offset and Skew in Networks With Exponential

Delays,' Signal Processing, IEEE Transactions on, vol. 56, no. 4, pp.

1685 ''1697, Apr. 2008. 168 BIBLIOGRAPHY [28] A. Ahmad, A. Noor, E. Serpedin, H. Nounou, and M. Nounou, ''On Clock Offset Estimation in Wireless Sensor Networks with Weibull Dis-

tributed Network Delays,' in Pattern Recognition (ICPR), 2010 20th

International Conference on, Aug. 2010, pp. 2322 ''2325. [29] J.-S. Kim, J. Lee, E. Serpedin, and K. Qaraqe, ''A robust estimation scheme for clock phase offsets in wireless sensor networks

in the presence of non-gaussian random delays,' Signal Processing, vol. 89, no. 6, pp. 1155 '' 1161, 2009. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0165168409000024 [30] '''', ''Robust Clock Synchronization in Wireless Sensor Networks Through Noise Density Estimation,' Signal Processing, IEEE Trans- actions on, vol. 59, no. 7, pp. 3035 ''3047, Jul. 2011. [31] M. Leng and Y.-C. Wu, ''On clock synchronization algorithms for wireless sensor networks under unknown delay,' Vehicular Technology, IEEE Transactions on, vol. 59, no. 1, pp. 182 ''190, Jan. 2010. [32] ''MSP430 F1611 data sheet,' Texas Instruments. [33] J. Hill and D. Culler, ''A Wireless Embedded Sensor Architecture for System-Level Optimization,' Tech. Rep., 2001. [34] H. Meyr and G. Ascheid, Synchronization in Digital Communications, vol.1. Wiley, 1990. [35] S. Moon, P. Skelly, and D. Towsley, ''Estimation and removal of clock skew from network delay measurements,' in INFOCOM ''99. Eighteenth

Annual Joint Conference of the IEEE Computer and Communications

Societies. Proceedings. IEEE, vol. 1, Mar. 1999, pp. 227 ''234. [36] ''Definitions and terminology for synchronization networks,' ITU-T, Ed., City, State of Publication, 1996, ch. G.810, pp. 400''402. [37] O. Simeone, U. Spagnolini, G. Scutari, and Y. Bar-Ness, ''Physical-layer distributed synchronization in wireless networks and applications,'

Physical Communication, vol. 1, no. 1, pp. 67 '' 83, 2008.

[Online]. Available: http://www.sciencedirect.com/science/article/pii/

S1874490708000049 [38] Q. Chaudhari and E. Serpedin, ''Maximum Likelihood Estimation of clock parameters for synchronization of wireless sensor networks,' in

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE

International Conference on, Apr. 2008, pp. 2517 ''2520. BIBLIOGRAPHY 169 [39] Q. Chaudhari, E. Serpedin, and K. Qaraqe, ''Minimal cost clock syn- chronization using a sender-receiver protocol in Wireless Sensornets,'

in Performance Evaluation of Computer and Telecommunication Sys-

tems, 2008. SPECTS 2008. International Symposium on, Jun. 2008,

pp. 159 ''164. [40] '''', ''Some improved and generalized estimation schemes for clock synchronization of listening nodes in wireless sensor networks,' Com-

munications, IEEE Transactions on, vol. 58, no. 1, pp. 63 ''67, Jan.

2010. [41] Q. Chaudhari and E. Serpedin, ''A new scheme for synchronization of inactive nodes in a sender-receiver protocol,' in Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference

on, Apr. 2009, pp. 2905 ''2908. [42] Q. Chaudhari, E. Serpedin, and J.-S. Kim, ''Energy-Efficient Estima- tion of Clock Offset for Inactive Nodes in Wireless Sensor Networks,'

Information Theory, IEEE Transactions on, vol. 56, no. 1, pp. 582

''596, Jan. 2010. [43] B. Sadler, ''Fundamentals of energy-constrained sensor network sys- tems,' Aerospace and Electronic Systems Magazine, IEEE, vol. 20,

no. 8, pp. 17''35, Aug. 2005. [44] [Online]. Available: http://mail.millennium.berkeley.edu/pipermail/ tinyos-help/2008-October/036812.html [45] [Online]. Available: http://www.tla.co.nz/xtal1.html [46] C. J. Bovy, H. T. Mertodimedjo, G. Hooghiemstra, H. Uijterwaal, and P. V. Mieghem, ''Analysis of end-to-end delay measurements in inter-

net,' in Proc. Passive Active Meas. Workshop, Fort Collins, CO, Mar.

2002, pp. 26''33. [47] S. M. Kay, Fundamentals of statistical signal processing. Estimation theory. Prentice-Hall, 1993. [48] A. Ahmad, D. Zennaro, E. Serpedin, and L. Vangelista, ''A Factor Graph Approach to Clock Offset Estimation in Wireless Sensor Net-

works,' submitted to Information Theory, IEEE Transactions on. [49] M. Wainwright, T. Jaakkola, and A. Willsky, ''A new class of up- per bounds on the log partition function,' Information Theory, IEEE

Transactions on, vol. 51, no. 7, pp. 2313 '' 2335, Jul. 2005. 170 BIBLIOGRAPHY [50] M. J. Wainwright and M. I. Jordan, ''Graphical Models, Exponential Families, and Variational Inference,' Found. Trends Mach. Learn.,

vol. 1, pp. 1''305, Jan. 2008. [Online]. Available: http://dl.acm.org/

citation.cfm'id=1498840.1498841 [51] D. Chapman and H. Robbins, ''Minimum variance estimation without regularity assumptions,' Annals of Math. Statistics, vol. 22, no. 4, pp.

581''586, Dec. 1951. [52] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Uni- versity Press, 2003. [53] A. Ahmad, D. Zennaro, E. Serpedin, and L. Vangelista, ''Time-varying clock offset estimation in two-way timing message exchange in wireless

sensor networks using factor graphs,' in accepted at Acoustics, Speech

and Signal Processing, 2012. ICASSP 2012. IEEE International Con-

ference on, 2012. [54] F. Kschischang, B. Frey, and H.-A. Loeliger, ''Factor graphs and the sum-product algorithm,' Information Theory, IEEE Transactions on,

vol. 47, no. 2, pp. 498 ''519, Feb. 2001. [55] J. Dauwels and H.-A. Loeliger, ''Phase estimation by message passing,' in Communications, 2004 IEEE International Conference on, vol. 1,

Jun. 2004, pp. 523 '' 527 Vol.1. [56] H. V. Trees and K. Bell, Bayesian Bounds for Parameter Estimation and Nonlinear Filtering/Tracking. Wiley, 2007. [57] P. Tichavsky, C. Muravchik, and A. Nehorai, ''Posterior Cramer-Rao bounds for discrete-time nonlinear filtering,' Signal Processing, IEEE

Transactions on, vol. 46, no. 5, pp. 1386 ''1396, May 1998. [58] R. Olfati-Saber, J. Fax, and R. Murray, ''Consensus and cooperation in networked multi-agent systems,' Proceedings of the IEEE, vol. 95,

no. 1, pp. 215''233, Jan. 2007. [59] G. Scutari, S. Barbarossa, and L. Pescosolido, ''Distributed decision through self-synchronizing sensor networks in the presence of propaga-

tion delays and asymmetric channels,' IEEE Transactions on Signal

Processing, vol. 56, no. 4, pp. 1667''1684, Apr. 2008. [60] R. Olfati-Saber and R. Murray, ''Consensus problems in networks of agents with switching topology and time-delays,' Automatic Control,

IEEE Transactions on, vol. 49, no. 9, pp. 1520''1533, Sept. 2004. BIBLIOGRAPHY 171 [61] S. Sardellitti, M. Giona, and S. Barbarossa, ''Fast Distributed Average Consensus Algorithms Based on Advection-Diffusion Processes,' IEEE

Transactions on Signal Processing, vol. 58, no. 2, pp. 826 ''842, Feb.

2010. [62] R. Olfati-Saber and J. Shamma, ''Consensus filters for sensor networks and distributed sensor fusion,' in Decision and Control, 2005 and 2005

European Control Conference. CDC-ECC ''05. 44th IEEE Conference

on, Dec. 2005, pp. 6698''6703. [63] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah, ''Randomized gossip algorithms,' Information Theory, IEEE Transactions on, vol. 52, no. 6,

pp. 2508 '' 2530, Jun. 2006. [64] T. C. Aysal, M. E. Yildiz, A. D. Sarwate, and A. Scaglione, ''Broad- cast Gossip Algorithms for Consensus,' IEEE Transactions on Signal Processing, vol. 57, no. 7, pp. 2748 ''2761, Jul. 2009. [65] D. P. Bersekas and J. N. Tsitsiklis, Parallel and distributed computa- tion: numerical methods. Athena Scientific, 1997. [66] I. Schizas, A. Ribeiro, and G. Giannakis, ''Consensus in ad hoc WSNs with noisy links - part I: Distributed estimation of deterministic sig-

nals,' Signal Processing, IEEE Transactions on, vol. 56, no. 1, pp.

350''364, Jan. 2008. [67] I. D. Schizas, G. B. Giannakis, S. I. Roumeliotis, and A. Ribeiro, ''Con- sensus in Ad Hoc WSNs With Noisy Links - Part II: Distributed Es-

timation and Smoothing of Random Signals,' IEEE Transactions on

Signal Processing, vol. 56, no. 4, pp. 1650 ''1666, Apr. 2008. [68] H. Zhu, G. B. Giannakis, and A. Cano, ''Distributed In-Network Channel Decoding,' IEEE Transactions on Signal Processing., vol. 57,

no. 10, pp. 3970''3983, Oct. 2009. [69] S.-J. Kim, E. Dall''Anese, and G. B. Giannakis, ''Cooperative Spectrum Sensing for Cognitive Radios Using Kriged Kalman Filtering,' IEEE

Journal on Selected Topics in Signal Processing, vol. 5, no. 1, pp. 24

''36, Feb. 2011. [70] S. Boyd, P. Diaconis, and L. Xiao, ''Fastest mixing Markov chain on a graph,' SIAM Rev., vol. 46, no. 4, pp. 667 '' 689, 2004. 172 BIBLIOGRAPHY [71] E. Kokiopoulou and P. Frossard, ''Polynomial Filtering for Fast Con- vergence in Distributed Consensus,' IEEE Transactions on Signal Pro-

cessing, vol. 57, no. 1, pp. 342 ''354, Jan. 2009. [72] M. Cao, D. A. Spielman, and E. M. Yeh, ''Accelerated gossip algorithms for distributed computation,' in 44th Annual Allerton Conference on

Communication, Control, and Computation, Sept. 2006. [73] B. Oreshkin, M. Coates, and M. Rabbat, ''Optimization and analysis of distributed averaging with short node memory,' IEEE Transactions

on Signal Processing, vol. 58, no. 5, pp. 2850 ''2865, May 2010. [74] L. Xiao and S. Boyd, ''Fast linear iterations for distributed averaging,' Systems & Control Letters, vol. 53, no. 1, pp. 65 '' 78, Sept. 2004. [75] D. Jakovetic and, J. Xavier, and J. Moura, ''Weight Optimization for Consensus Algorithms With Correlated Switching Topology,' IEEE

Transactions on Signal Processing, vol. 58, no. 7, pp. 3788 ''3801, Jul.

2010. [76] L. Xiao and S. Boyd, ''Fast linear iterations for distributed averaging,' in Decision and Control, 2003. Proceedings. 42nd IEEE Conference on,

vol. 5, Dec. 2003, pp. 4997 '' 5002. [77] L. Xiao, S. Boyd, and S.-J. Kim, ''Distributed average consensus with least-mean-square deviation,' Journal of Parallel and Distributed Com-

puting, vol. 67, no. 1, pp. 33 '' 46, 2007. [78] T. Erseghe, D. Zennaro, E. Dall''Anese, and L. Vangelista, ''Fast Con- sensus by the Alternating Direction Multipliers Method,' Signal Pro-

cessing, IEEE Transactions on, vol. 59, no. 11, pp. 5523 ''5537, Nov.

2011. [79] F. R. Gantmacher, The Theory of Matrices, Vol. 2. Chelsea Publishing Company, 2000. [80] J. S. Rosenthal, ''Convergence rates of markov chains,' SIAM Review, vol. 37, pp. 387''405, 1995. [81] S. Kar and J. Moura, ''Distributed Consensus Algorithms in Sensor Networks With Imperfect Communication: Link Failures and Channel

Noise,' IEEE Transactions on Signal Processing, vol. 57, no. 1, pp. 355

''369, Jan. 2009. BIBLIOGRAPHY 173 [82] R. Horn and C. Johnson, Matrix analysis. Cambridge University Press, 1985. [83] D. Zennaro, E. Dall''Anese, T. Erseghe, and L. Vangelista, ''Fast clock synchronization in wireless sensor networks via ADMM-based consen-

sus,' in Modeling and Optimization in Mobile, Ad Hoc and Wireless

Networks (WiOpt), 2011 International Symposium on, May 2011, pp.

148 ''153. [84] W. Lindsey, F. Ghazvinian, W. Hagmann, and K. Dessouky, ''Network synchronization,' Proceedings of the IEEE, vol. 73, no. 10, pp. 1445''

1467, Oct. 1985. [85] T. Schmidl and D. Cox, ''Low-overhead, low-complexity [burst] syn- chronization for ofdm,' in Communications, IEEE International Con-

ference on, vol. 3, Jun. 1996, pp. 1301 ''1306. [86] D. Zennaro, S. Tomasin, and L. Vangelista, ''Uplink Cell Selection for Cooperative Multi-Cell Networks with Hybrid ARQ,' in GLOBECOM

2010, 2010 IEEE Global Telecommunications Conference, Dec. 2010,

pp. 1 ''5. [87] '''', ''Base Station Selection in Uplink Macro Diversity Cellular Sys- tems with Hybrid ARQ,' Selected Areas in Communications, IEEE

Journal on, vol. 29, no. 6, pp. 1249 ''1259, Jun. 2011. [88] S. Chatzinotas, M. Imran, and R. Hoshyar, ''On the multicell processing capacity of the cellular MIMO uplink channel in correlated rayleigh

fading environment,' Wireless Communications, IEEE Transactions

on, vol. 8, no. 7, pp. 3704 ''3715, Jul. 2009. [89] S. W. Peters, A. Y. Panah, K. T. Truong, and R. W. Heath, ''Relay architectures for 3GPP LTE-advanced,' EURASIP J. Wirel.

Commun. Netw., vol. 2009, pp. 1:1''1:14, Mar. 2009. [Online].

Available: http://dx.doi.org/10.1155/2009/618787 [90] L. Venturino, N. Prasad, and X. Wang, ''Coordinated Scheduling and Power Allocation in Downlink Multicell OFDMA Networks,' Vehicular

Technology, IEEE Transactions on, vol. 58, no. 6, pp. 2835 ''2848, Jul.

2009. [91] H. Hu, Y. Zhang, and J. Luo, Distributed Antenna Systems: Open Architecture for Future Wireless Communications, 2007. 174 BIBLIOGRAPHY [92] A. Sanderovich, O. Somekh, and S. Shamai, ''Uplink Macro Diversity with Limited Backhaul Capacity,' in Information Theory, 2007. ISIT

2007. IEEE International Symposium on, Jun. 2007, pp. 11 ''15. [93] A. Sanderovich, O. Somekh, H. Poor, and S. Shamai, ''Uplink Macro Diversity of Limited Backhaul Cellular Network,' Information Theory,

IEEE Transactions on, vol. 55, no. 8, pp. 3457 ''3478, Aug. 2009. [94] P. Marsch and G. Fettweis, ''A Framework for Optimizing the Up- link Performance of Distributed Antenna Systems under a Constrained

Backhaul,' in Communications, 2007. ICC ''07. IEEE International Conference on, Jun. 2007, pp. 975 ''979. [95] C. Hoymann, L. Falconetti, and R. Gupta, ''Distributed Uplink Sig- nal Processing of Cooperating Base Stations Based on IQ Sample Ex-

change,' in Communications, 2009. ICC ''09. IEEE International Con-

ference on, Jun. 2009, pp. 1 ''5. [96] L. Falconetti, C. Hoymann, and R. Gupta, ''Distributed Uplink Macro Diversity for Cooperating Base Stations,' in Communications Work-

shops, 2009. ICC Workshops 2009. IEEE International Conference on,

Jun. 2009, pp. 1 ''5. [97] O. Somekh, B. Zaidel, and S. Shamai, ''Spectral Efficiency of Joint Mul- tiple Cell-Site Processors for Randomly Spread DS-CDMA Systems,'

Information Theory, IEEE Transactions on, vol. 53, no. 7, pp. 2625

''2637, Jul. 2007. [98] M. Kamoun and L. Mazet, ''Base-station selection in cooperative single frequency cellular network,' in Signal Processing Advances in Wireless

Communications, 2007. SPAWC 2007. IEEE 8th Workshop on, Jun.

2007, pp. 1 ''5. [99] Y. Wang, X. Shen, and P. Zhang, ''Uplink performance analysis in group cell systems,' in Vehicular Technology Conference, 2005. VTC

2005-Spring. 2005 IEEE 61st, vol. 4, Jun. 2005, pp. 2405 '' 2409. [100] A. Del Coso and S. Simoens, ''Distributed compression for MIMO co- ordinated networks with a backhaul constraint,' Wireless Communi-

cations, IEEE Transactions on, vol. 8, no. 9, pp. 4698 ''4709, Sept.

2009. BIBLIOGRAPHY 175 [101] A. Papadogiannis, D. Gesbert, and E. Hardouin, ''A Dynamic Cluster- ing Approach in Wireless Networks with Multi-Cell Cooperative Pro-

cessing,' in Communications, 2008. ICC ''08. IEEE International Con-

ference on, May 2008, pp. 4033 ''4037. [102] S. Kiani and D. Gesbert, ''Optimal and Distributed Scheduling for Multicell Capacity Maximization,' Wireless Communications, IEEE

Transactions on, vol. 7, no. 1, pp. 288 ''297, Jan. 2008. [103] I. Stanojev, O. Simeone, Y. Bar-Ness, and C. You, ''Performance of multi-relay collaborative hybrid-ARQ protocols over fading channels,'

Communications Letters, IEEE, vol. 10, no. 7, pp. 522 ''524, Jul. 2006. [104] P. Wu and N. Jindal, ''Coding Versus ARQ in Fading Channels: How Reliable Should the PHY Be'' in Global Telecommunications Confer-

ence, 2009. GLOBECOM 2009. IEEE, Dec. 2009, pp. 1 ''6. [105] F. Calabrese, M. Anas, C. Rosa, P. Mogensen, and K. Pedersen, ''Per- formance of a Radio Resource Allocation Algorithm for UTRAN LTE

Uplink,' in Vehicular Technology Conference, 2007. VTC2007-Spring.

IEEE 65th, Apr. 2007, pp. 2895 ''2899. [106] D. Tse and P. Visvanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005. [107] A. Prudnikov, Y. A. Brychkov, and O. Marichev, Integrals and Series, vol. 1. Gordon and Breach Science Publishers, 1986. [108] S. Amari and R. Misra, ''Closed-form expressions for distribution of sum of exponential random variables,' Reliability, IEEE Transactions

on, vol. 46, no. 4, pp. 519 ''522, Dec. 1997.

Grassi Mobil™ - Formulati per fornire elevate prestazioni anche in condizioni operative estreme

© Eiom - All rights Reserved P.IVA 00850640186