MMSE Estimation for Sparse Representation Modeling
Michael Elad
The Computer Science Department, The Technion - Israel Institute of Technology, Haifa 32000, Israel
*Joint work with Irad Yavneh & Matan Protter, The CS Department, The Technion
April 6th, 2009

Noise Removal?
In this talk we focus on signal/image denoising: can we remove additive noise from a measured signal?
- Important: (i) it is a practical application; (ii) it is a convenient platform for testing basic ideas in signal/image processing.
- Many considered directions: partial differential equations, statistical estimators, adaptive filters, inverse problems & regularization, wavelets, example-based techniques, sparse representations, and more.
- Main message today: several sparse representations can be found and used for better denoising performance. We introduce, motivate, discuss, demonstrate, and explain this new idea.

Agenda
1. Background on Denoising with Sparse Representations
2. Using More than One Representation: Intuition
3. Using More than One Representation: Theory
4. A Closer Look at the Unitary Case
5. Summary and Conclusions
Part I: Background on Denoising with Sparse Representations

Denoising by Energy Minimization
Many of the proposed signal denoising algorithms are related to the minimization of an energy function of the form
$$ f(x) = \frac{1}{2}\|x - y\|_2^2 + \Pr(x), $$
where $y$ is the given set of measurements, $x$ is the unknown to be recovered, the first term expresses the relation to the measurements, and the second is the prior (regularization). [Thomas Bayes, 1702-1761]
- This is in fact a Bayesian point of view, adopting Maximum-A-posteriori Probability (MAP) estimation.
- Clearly, the wisdom in such an approach lies in the choice of the prior, that is, in modeling the signals of interest.
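For intuition, here is a toy instance of this viewpoint (our own addition, not from the talk): with the simplest quadratic prior $\Pr(x) = \frac{\lambda}{2}\|x\|_2^2$, the energy is smooth and its minimizer has a closed form,
$$ \nabla f(\hat{x}) = (\hat{x} - y) + \lambda \hat{x} = 0 \quad\Longrightarrow\quad \hat{x} = \frac{y}{1+\lambda}, $$
a plain shrinkage of the measurements toward zero. Richer priors, such as the sparsity prior used next, replace this uniform shrinkage with signal-adaptive behavior.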
Sparse Representation Modeling
We model the signal as $x = D\alpha$, where $D \in \mathbb{R}^{N \times K}$ is a fixed dictionary:
- Every column in $D$ (the dictionary) is a prototype signal (atom).
- The vector $\alpha$ is generated randomly with few (say $L$ for now) non-zeros at random locations and with random values: a sparse and random vector.

Back to Our MAP Energy Function
- The $\ell_0$ "norm" $\|\alpha\|_0$ is effectively counting the number of non-zeros in $\alpha$.
- The vector $\alpha$ is the (sparse/redundant) representation of the signal.
- Bottom line: denoising of $y$ is done by minimizing
$$ \hat{\alpha} = \arg\min_\alpha \|\alpha\|_0 \quad \text{s.t.} \quad \|y - D\alpha\|_2^2 \le \varepsilon^2 $$
(or, equivalently, $\min_\alpha \|y - D\alpha\|_2^2$ s.t. $\|\alpha\|_0 \le L$), and then setting $\hat{x} = D\hat{\alpha}$.
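Here is a minimal Python sketch of this generative model (the names and sizes are our own; they match the experiment in Part II):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, L, sigma = 100, 200, 10, 1.0

# A fixed dictionary: each column is a normalized atom.
D = rng.standard_normal((N, K))
D /= np.linalg.norm(D, axis=0)

# A sparse & random representation: L non-zeros at random locations.
alpha0 = np.zeros(K)
support = rng.choice(K, size=L, replace=False)
alpha0[support] = rng.standard_normal(L)

x = D @ alpha0                            # the ideal signal
y = x + sigma * rng.standard_normal(N)    # the noisy measurement to denoise
```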
The Solver We Use: Greed-Based
We approximate the solution of $\min_\alpha \|\alpha\|_0$ s.t. $\|y - D\alpha\|_2^2 \le \varepsilon^2$ greedily:
- The MP is one of the greedy algorithms that find one atom at a time [Mallat & Zhang ('93)].
- Step 1: find the one atom that best matches the signal.
- Next steps: given the previously found atoms, find the next one that best fits the residual.
- The algorithm stops when the error $\|y - D\alpha\|_2$ drops below the destination threshold.
- The Orthogonal MP (OMP) is an improved version that re-evaluates the coefficients by Least-Squares after each round.

Orthogonal Matching Pursuit
OMP finds one atom at a time for approximating the solution of $\min_\alpha \|\alpha\|_0$ s.t. $\|y - D\alpha\|_2^2 \le \varepsilon^2$.
Initialization: $n = 0$, $\alpha^0 = 0$, $S^0 = \emptyset$, residual $r^0 = y - D\alpha^0 = y$.
Main iteration (increment $n$):
1. Compute $E(i) = \min_{z} \|d_i z - r^{n-1}\|_2^2$ for all $1 \le i \le K$.
2. Choose $i_0$ such that $E(i_0) \le E(i)$ for all $1 \le i \le K$.
3. Update the support: $S^n = S^{n-1} \cup \{i_0\}$.
4. LS: $\alpha^n = \arg\min_\alpha \|y - D\alpha\|_2^2$ s.t. $\operatorname{supp}(\alpha) = S^n$.
5. Update the residual: $r^n = y - D\alpha^n$. If $\|r^n\|_2 \le \varepsilon$, stop; otherwise, iterate.
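A straightforward Python sketch of OMP as listed above (our own transcription of the steps; for unit-norm atoms, minimizing $E(i)$ is the same as maximizing $|d_i^T r|$):

```python
import numpy as np

def omp(D, y, eps):
    """Orthogonal Matching Pursuit: grow the support one atom at a time,
    re-evaluating the coefficients by Least-Squares after each round."""
    N, K = D.shape
    alpha = np.zeros(K)
    S = []                               # current support
    r = y.copy()                         # current residual
    while r @ r > eps**2 and len(S) < N:
        # The atom best matching the residual: for normalized atoms,
        # E(i) = ||r||^2 - (d_i^T r)^2, so argmin E(i) = argmax |d_i^T r|.
        i0 = int(np.argmax(np.abs(D.T @ r)))
        if i0 in S:                      # no further progress is possible
            break
        S.append(i0)
        coef, *_ = np.linalg.lstsq(D[:, S], y, rcond=None)   # LS re-evaluation
        alpha[:] = 0.0
        alpha[S] = coef
        r = y - D @ alpha
    return alpha
```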
Part II: Using More than One Representation: Intuition

Back to the Beginning. What If...
Consider the denoising problem $\min_\alpha \|\alpha\|_0$ s.t. $\|y - D\alpha\|_2^2 \le \varepsilon^2$, and suppose that we can find a group of $J$ candidate solutions $\{\hat{\alpha}_j\}_{j=1}^{J}$ such that $\|y - D\hat{\alpha}_j\|_2^2 \le \varepsilon^2$ and $\|\hat{\alpha}_j\|_0 = N_j$.
Basic questions:
- What could we do with such a set of competing solutions in order to better denoise $y$?
- Why should this help?
- How shall we practically find such a set of solutions?
Relevant work: Larsson & Selen ('07); Schniter et al. ('08); Elad & Yavneh ('08).
Motivation: General
Why bother with such a set?
- Because each representation conveys a different story about the desired signal.
- Because pursuit algorithms are often wrong in finding the sparsest representation, and relying on their single solution is too sensitive.
- Maybe there are "deeper" reasons?

Our Motivation
- There is an intriguing relationship between this idea and the common practice in example-based techniques, where several examples are merged.
- Consider the Non-Local-Means algorithm [Buades, Coll & Morel ('05)]. It uses (i) a local dictionary (the neighborhood patches); (ii) it builds several sparse representations (of cardinality 1); and (iii) it merges them.
- Why not take it further, and use general sparse representations?
Generating Many Representations
Our answer: randomizing the OMP. The algorithm is identical to OMP, except for the atom-selection rule in step 2 of each iteration: instead of deterministically choosing the $i_0$ that minimizes the error $E(i)$, choose $i_0$ at random with probability proportional to $\exp(-c \cdot E(i))$.
*Larsson and Schniter propose a more complicated and deterministic tree-pruning method.
*For now, let us set the parameter $c$ manually for best performance. Later we shall define a way to set it automatically.
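A Python sketch of this Randomized OMP (again our own code; note that for unit-norm atoms $\exp(-c\,E(i)) \propto \exp(c\,(d_i^T r)^2)$, which is the form used below):

```python
import numpy as np

def random_omp(D, y, eps, c, rng):
    """One run of Randomized OMP: identical to omp() above, except that
    the next atom is drawn with probability proportional to exp(-c*E(i))."""
    N, K = D.shape
    alpha = np.zeros(K)
    S = []
    r = y.copy()
    while r @ r > eps**2 and len(S) < N:
        corr2 = (D.T @ r) ** 2                   # (d_i^T r)^2 for every atom
        w = np.exp(c * (corr2 - corr2.max()))    # ∝ exp(-c*E(i)); shifted for stability
        w[S] = 0.0                               # never re-draw a chosen atom
        i0 = int(rng.choice(K, p=w / w.sum()))
        S.append(i0)
        coef, *_ = np.linalg.lstsq(D[:, S], y, rcond=None)
        alpha[:] = 0.0
        alpha[S] = coef
        r = y - D @ alpha
    return alpha
```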
Let's Try
Proposed experiment:
- Form a random dictionary $D$ of size $100 \times 200$.
- Multiply it by a sparse vector $\alpha_0$ with $\|\alpha_0\|_0 = 10$.
- Add Gaussian i.i.d. noise $v$ with $\sigma = 1$ and obtain $y = D\alpha_0 + v$.
- Solve the problem $\min_\alpha \|\alpha\|_0$ s.t. $\|y - D\alpha\|_2^2 \le 100$ using OMP, and obtain $\hat{\alpha}_{\mathrm{OMP}}$.
- Use Random-OMP and obtain $\{\hat{\alpha}_j^{\mathrm{RandOMP}}\}_{j=1}^{1000}$.
- Let us look at the obtained representations.

Some Observations
[Figure: four panels. (i) Histogram of the Random-OMP cardinalities vs. the OMP cardinality; (ii) histogram of the Random-OMP representation errors vs. the OMP error; (iii) histogram of the Random-OMP noise attenuation $\|D\hat{\alpha} - D\alpha_0\|_2^2 / \|y - D\alpha_0\|_2^2$ vs. the OMP result; (iv) noise attenuation as a function of cardinality.]
We see that:
- The OMP gives the sparsest solution.
- Nevertheless, it is not the most effective for denoising.
- The cardinality of a representation does not reveal its efficiency.

The Surprise (at least for us)
Let us propose the average
$$ \hat{\alpha} = \frac{1}{1000}\sum_{j=1}^{1000} \hat{\alpha}_j^{\mathrm{RandOMP}} $$
as our representation. [Figure: the averaged representation vs. the original and the OMP representations, entry by entry.]
This representation IS NOT SPARSE AT ALL, but it gives
$$ \frac{\|D\hat{\alpha} - D\alpha_0\|_2^2}{\|y - D\alpha_0\|_2^2} \approx 0.06. $$

Is It Consistent? Yes!
Here are the results of 1000 trials with the same parameters. [Figure: scatter plot of the OMP denoising factor vs. the RandOMP denoising factor per trial, with the mean point marked; a few outliers correspond to cases of a zero solution.]
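The whole experiment fits in a few lines, reusing omp() and random_omp() from above (the value of c is set manually here, as the slides suggest at this stage; c=0.5 is our guess, not a tuned value):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, L, sigma, J = 100, 200, 10, 1.0, 1000

D = rng.standard_normal((N, K))
D /= np.linalg.norm(D, axis=0)
alpha0 = np.zeros(K)
alpha0[rng.choice(K, size=L, replace=False)] = rng.standard_normal(L)
y = D @ alpha0 + sigma * rng.standard_normal(N)

eps = np.sqrt(N) * sigma     # stopping threshold: ||y - D a||^2 <= N*sigma^2 = 100

def denoising_factor(a):
    return np.sum((D @ a - D @ alpha0) ** 2) / np.sum((y - D @ alpha0) ** 2)

a_omp = omp(D, y, eps)
a_avg = np.mean([random_omp(D, y, eps, c=0.5, rng=rng) for _ in range(J)], axis=0)

print("OMP denoising factor:       ", denoising_factor(a_omp))
print("Averaged Random-OMP factor: ", denoising_factor(a_avg))  # typically smaller
```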
23、S NOT SPARSE AT ALL but it gives06.0y220220DDDMMSE Estimation for Sparse Representation ModelingBy:Michael Elad18 Is It Consistent?Yes!00.10.20.30.40.500.050.10.150.20.250.30.350.40.450.5OMP Denoising FactorRandOMP Denoising Factor OMP versus RandOMP resultsMean PointHere are the results of 1000 tri
24、als with the same parameters?Cases of zero solutionMMSE Estimation for Sparse Representation ModelingBy:Michael Elad19Part III Using More than One Representation:Theory MMSE Estimation for Sparse Representation ModelingBy:Michael Elad20 Our Signal ModelKNDA fixed Dictionaryxq D is fixed and known.q
25、The vector is built by:Choosing the support s with probability P(s)from all the 2K possibilities.For simplicity,assume that|s|=k is fixed and known.Choosing the s coefficients using iid Gaussian entries N(0,x).q The ideal signal is x=D=Dss.The p.d.f.P()and P(x)are clear and knownMMSE Estimation for
26、Sparse Representation ModelingBy:Michael Elad21 Adding NoiseKNDA fixed Dictionaryxyv+Noise Assumed:The noise v is additive white Gaussian vector with probability Pv(v)The conditional p.d.f.s P(y|s),P(s|y),and even also P(x|y)are all clear and well-defined(although they may appear nasty).222yxexpCxyP
The Key: The Posterior P(x|y)
We have access to the posterior $P(x|y)$, and it induces two estimators:
- MAP: $\hat{x}_{\mathrm{MAP}} = \arg\max_x P(x|y)$.
- MMSE: $\hat{x}_{\mathrm{MMSE}} = E[x|y]$.
- Oracle: the estimator $\hat{x}_{\mathrm{oracle}}$ that is given the true support $s$.
The estimation of $\alpha$ followed by multiplication by $D$ is equivalent to the above. The MAP and MMSE estimators are impossible to compute, as we show next.

Let's Start with the Oracle
When $s$ is known,
$$ P(\alpha_s | y, s) \propto P(y | \alpha_s, s)\, P(\alpha_s), \qquad P(y|\alpha_s) \propto \exp\!\left(-\frac{\|D_s \alpha_s - y\|_2^2}{2\sigma^2}\right), \qquad P(\alpha_s) \propto \exp\!\left(-\frac{\|\alpha_s\|_2^2}{2\sigma_x^2}\right), $$
so the posterior of $\alpha_s$ is Gaussian, centered at
$$ \hat{\alpha}_s = Q_s^{-1} h_s, \qquad Q_s = \frac{1}{\sigma^2} D_s^T D_s + \frac{1}{\sigma_x^2} I, \qquad h_s = \frac{1}{\sigma^2} D_s^T y. $$
Comments:
- This estimate is both the MAP and the MMSE (for a Gaussian posterior, the mean and the mode coincide).
- The oracle estimate of $x$ is obtained by multiplication by $D_s$.
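A direct Python sketch of the oracle (our own code, a plain transcription of the formulas above):

```python
import numpy as np

def oracle_estimate(D, y, s, sigma, sigma_x):
    """Oracle estimator for a known support s (a list of atom indices):
    alpha_s = Q_s^{-1} h_s, then x = D_s alpha_s."""
    Ds = D[:, list(s)]
    Qs = Ds.T @ Ds / sigma**2 + np.eye(len(s)) / sigma_x**2
    hs = Ds.T @ y / sigma**2
    alpha_s = np.linalg.solve(Qs, hs)
    return Ds @ alpha_s
```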
The MAP Estimation
We have seen the oracle's ingredients; integrating $\alpha_s$ out gives the probability of the support:
$$ P(s|y) \propto P(y|s)\, P(s) \propto P(s) \exp\!\left(\frac{h_s^T Q_s^{-1} h_s}{2} + \frac{\log\det(Q_s^{-1})}{2}\right). $$
Since $\hat{\alpha}_{\mathrm{MAP}} = \arg\max_\alpha P(\alpha|y)$ is obtained on the most probable support,
$$ \hat{s}_{\mathrm{MAP}} = \arg\max_s P(s|y) = \arg\max_s P(s) \exp\!\left(\frac{h_s^T Q_s^{-1} h_s}{2} + \frac{\log\det(Q_s^{-1})}{2}\right). $$
Implications:
- The MAP estimator requires testing all the possible supports for the maximization. In typical problems this is impossible, as there is a combinatorial set of possibilities.
- This is why we rarely use the exact MAP, and typically replace it with approximation algorithms (e.g., OMP).

The MMSE Estimation
Recall that $E[\alpha_s | y, s] = Q_s^{-1} h_s$; this is the oracle for the support $s$, as we have seen before. Using the same support probability $P(s|y)$ as above,
$$ \hat{\alpha}_{\mathrm{MMSE}} = E[\alpha | y] = \sum_s P(s|y)\, E[\alpha | y, s]. $$
Implications:
- The best estimator (in terms of $L_2$ error) is a weighted average of many sparse representations!
- As in the MAP case, in typical problems one cannot compute this expression, as the summation runs over a combinatorial set of possibilities. We should propose approximations here as well.

The Case of |s| = k = 1
For single-atom supports $s = \{k\}$ (with normalized atoms), the support probability simplifies to
$$ P(s|y) \propto P(s) \exp\!\left(\frac{h_s^T Q_s^{-1} h_s}{2} + \frac{\log\det(Q_s^{-1})}{2}\right) \propto \exp\!\left(\frac{\sigma_x^2}{2\sigma^2(\sigma^2 + \sigma_x^2)}\,(d_k^T y)^2\right), $$
where $d_k$ is the $k$-th atom in $D$. This is our $c$ in the Random-OMP.
- Based on this we can propose a greedy algorithm for both MAP and MMSE:
- MAP: choose the atom with the largest inner product (out of $K$), and do so one at a time, while freezing the previous ones (almost OMP).
- MMSE: draw an atom at random in a greedy algorithm, based on the above probability set, getting close to $P(s|y)$ in the overall draw.
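To see where this simplification comes from, here is the intermediate algebra (our reconstruction, assuming unit-norm atoms and equal priors $P(s)$):
$$ |s| = 1:\qquad Q_s = \frac{d_k^T d_k}{\sigma^2} + \frac{1}{\sigma_x^2} = \frac{\sigma^2 + \sigma_x^2}{\sigma^2 \sigma_x^2}, \qquad h_s = \frac{d_k^T y}{\sigma^2}, $$
$$ \frac{h_s^T Q_s^{-1} h_s}{2} = \frac{\sigma_x^2}{2\sigma^2(\sigma^2 + \sigma_x^2)}\,(d_k^T y)^2, $$
while the $\log\det$ term is identical for every single-atom support and drops from the comparison, leaving exactly the exponent quoted above.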
Bottom Line
- The MMSE estimation we got requires a sweep through all supports (a combinatorial search), and is therefore impractical.
- Similarly, an explicit expression for $P(x|y)$ can be derived and maximized; this is the MAP estimation, and it also requires a sweep through all possible supports, so it is impractical too.
- The OMP is a (good) approximation of the MAP estimate.
- The Random-OMP is a (good) approximation of the Minimum-Mean-Squared-Error (MMSE) estimate. It is close to a Gibbs sampler of the probability $P(s|y)$ from which we should draw the weights.
Back to the beginning: why use several representations? Because their average leads to provably better noise suppression.

Comparative Results
The following results correspond to a small dictionary ($20 \times 30$), where the combinatorial formulas can be evaluated as well. Parameters:
- $N = 20$, $K = 30$
- True support size: 3
- $\sigma_x = 1$
- $J = 10$ (RandOMP averages)
- Averaged over 1000 experiments
[Figure: relative mean-squared-error as a function of $\sigma$ for eight estimators: 1. Emp. Oracle; 2. Theor. Oracle; 3. Emp. MMSE; 4. Theor. MMSE; 5. Emp. MAP; 6. Theor. MAP; 7. OMP; 8. RandOMP. The oracle curves assume a known support.]
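For such small dimensions, the exact MAP and MMSE can indeed be computed by brute force. A Python sketch (our own code; it enumerates all supports of size k, which is feasible only for tiny problems like the 20x30 setting above):

```python
import numpy as np
from itertools import combinations

def exact_map_mmse(D, y, k, sigma, sigma_x):
    """Exact MAP and MMSE estimates of x by sweeping all supports |s| = k,
    assuming equal priors P(s)."""
    N, K = D.shape
    log_w, x_oracle = [], []
    for s in combinations(range(K), k):
        Ds = D[:, list(s)]
        Qs = Ds.T @ Ds / sigma**2 + np.eye(k) / sigma_x**2
        hs = Ds.T @ y / sigma**2
        a = np.linalg.solve(Qs, hs)                 # the oracle on this support
        _, logdetQ = np.linalg.slogdet(Qs)
        log_w.append(0.5 * hs @ a - 0.5 * logdetQ)  # log P(s|y) up to a constant
        x_oracle.append(Ds @ a)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                    # the posterior P(s|y)
    x_map = x_oracle[int(np.argmax(w))]             # oracle on the most probable support
    x_mmse = np.sum(w[:, None] * np.array(x_oracle), axis=0)  # weighted average
    return x_map, x_mmse
```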
Part IV: A Closer Look at the Unitary Case
Here the dictionary is square and unitary: $D^T D = D D^T = I$.

A Few Basic Observations
Let us denote $\beta = D^T y$. Since $D_s^T D_s = I$, the oracle's matrices collapse to scalars:
$$ Q_s = \left(\frac{1}{\sigma^2} + \frac{1}{\sigma_x^2}\right) I = \frac{\sigma^2 + \sigma_x^2}{\sigma^2 \sigma_x^2}\, I, \qquad h_s = \frac{1}{\sigma^2}\,\beta_s, $$
so the oracle becomes a simple shrinkage:
$$ \hat{\alpha}_s = Q_s^{-1} h_s = \frac{\sigma_x^2}{\sigma^2 + \sigma_x^2}\,\beta_s = c^2 \beta_s, \qquad c^2 = \frac{\sigma_x^2}{\sigma^2 + \sigma_x^2}. $$

Back to the MAP Estimation
$$ \hat{s}_{\mathrm{MAP}} = \arg\max_s \exp\!\left(\frac{h_s^T Q_s^{-1} h_s}{2} + \frac{\log\det(Q_s^{-1})}{2}\right). $$
We assume $|s| = k$ fixed, with equal probabilities; the $\log\det$ part then becomes a constant and can thus be discarded, and
$$ h_s^T Q_s^{-1} h_s = \frac{c^2}{\sigma^2} \sum_{i \in s} \beta_i^2. $$
This means that the MAP estimation can be easily evaluated by computing $\beta = D^T y$, sorting its entries by magnitude in descending order, and choosing the $k$ leading ones.

Closed-Form Estimation
- It is well known that MAP enjoys a closed-form and simple solution in the case of a unitary dictionary $D$.
- This closed-form solution takes the structure of thresholding or shrinkage. The specific structure depends on the fine details of the assumed model.
- It is also known that OMP becomes exact in this case.
What about the MMSE? Could it have a simple closed-form solution too?
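The unitary MAP, as just described, is a few lines of Python (our own sketch):

```python
import numpy as np

def unitary_map(D, y, k, sigma, sigma_x):
    """MAP estimate of alpha for a unitary D and fixed |s| = k:
    keep the k entries of beta = D^T y that are largest in magnitude,
    shrunk by c^2 = sigma_x^2 / (sigma^2 + sigma_x^2)."""
    beta = D.T @ y
    c2 = sigma_x**2 / (sigma**2 + sigma_x**2)
    s = np.argsort(-np.abs(beta))[:k]     # the k leading entries of |beta|
    alpha = np.zeros_like(beta)
    alpha[s] = c2 * beta[s]               # oracle shrinkage on the chosen support
    return alpha
```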
The MMSE Again
This is the formula we got:
$$ \hat{\alpha}_{\mathrm{MMSE}} = \sum_s P(s|y)\, c^2 \beta_s. $$
We combine linearly many sparse representations (with proper weights), and the result is one effective representation (not sparse anymore).
We change the above summation over supports into a summation over atoms,
$$ \hat{\alpha}_{\mathrm{MMSE}} = \sum_{j=1}^{K} q_j^k\, c^2 \beta_j\, e_j, $$
where $e_j$ is the $j$-th unit vector, so there are $K$ contributions (one per atom) to be found and used. We have developed a closed-form recursive formula for computing the $q_j^k$ coefficients.

Towards a Recursive Formula
We have seen that the governing probability for the weighted averaging is
$$ P(s|y) \propto \prod_{i \in s} \exp\!\left(\frac{c^2 \beta_i^2}{2\sigma^2}\right) = \prod_{i \in s} q_i, \qquad q_i = \exp\!\left(\frac{c^2 \beta_i^2}{2\sigma^2}\right), $$
so that
$$ \hat{\alpha}_{\mathrm{MMSE}} = \sum_s P(s|y)\, c^2 \beta_s = c^2 \sum_{j=1}^{K} \Big( \sum_s P(s|y)\, I(j \in s) \Big)\, \beta_j\, e_j, $$
where $I(j \in s)$ is an indicator function stating whether $j$ is in $s$. The coefficient $q_j^k = \sum_{|s|=k} P(s|y)\, I(j \in s)$ is thus the posterior probability that atom $j$ participates in the support.
The Recursive Formula
The weights can be computed recursively over the cardinality, starting from the single-atom case:
$$ q_j^1 = \frac{q_j}{\sum_{i=1}^{K} q_i}, \qquad q_j^k = k \cdot \frac{q_j\,(1 - q_j^{k-1})}{\sum_{i=1}^{K} q_i\,(1 - q_i^{k-1})}, $$
so one evaluates $q_j^1 \rightarrow q_j^2 \rightarrow q_j^3 \rightarrow \dots \rightarrow q_j^k$ for all $K$ atoms at once, at a negligible cost.

An Example
This is a synthetic experiment resembling the previous one, but with a few important changes:
- $D$ is unitary.
- The representation's cardinality is now 5 (the higher it is, the weaker the Random-OMP becomes).
- The dimensions are different: $N = K = 64$.
- $J = 20$ (RandOMP runs).
[Figure: relative mean-squared-error as a function of $\sigma$ for the Oracle (known support), Theor. MAP, OMP, Theor. MMSE, the Recursive MMSE, and RandOMP.]
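For a runnable reference, here is our own sketch of the closed-form unitary MMSE. Rather than reproducing the slides' recursion, it computes the inclusion probabilities $q_j^k$ exactly through elementary symmetric polynomials of the weights $q_i$, an equivalent route for fixed $|s| = k$:

```python
import numpy as np

def unitary_mmse(D, y, k, sigma, sigma_x):
    """Closed-form MMSE for unitary D and fixed |s| = k.
    P(j in s | y) = q_j * e_{k-1}(q without j) / e_k(q), where e_m is the
    elementary symmetric polynomial of degree m of the per-atom weights."""
    K = D.shape[1]
    beta = D.T @ y
    c2 = sigma_x**2 / (sigma**2 + sigma_x**2)
    logq = c2 * beta**2 / (2 * sigma**2)
    q = np.exp(logq - logq.max())        # global rescaling cancels in the ratios

    e = np.zeros(k + 1)                  # e[m] = e_m over all atoms
    e[0] = 1.0
    for v in q:
        for m in range(k, 0, -1):
            e[m] += v * e[m - 1]

    p = np.empty(K)                      # p[j] = P(j in s | y)
    for j in range(K):
        f = np.zeros(k)                  # f[m] = e_m over all atoms except j
        f[0] = 1.0
        for m in range(1, k):
            f[m] = e[m] - q[j] * f[m - 1]
        p[j] = q[j] * f[k - 1] / e[k]

    return p * c2 * beta                 # the MMSE representation, one weight per atom
```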
Part V: Summary and Conclusions

Today We Have Seen that...
- Sparsity and redundancy are used for denoising of signals/images.
- How? By finding the sparsest representation and using it to recover the clean signal.
- Can we do better? Today we have shown that averaging several sparse representations of a signal leads to better denoising, as it approximates the MMSE estimator.
More on these topics (including the slides and the relevant papers) can be found at http://www.cs.technion.ac.il/~elad