# 7fjksafn

Non-Local Kernel Regression for Image and Video RestorationHaichao Zhang1,2, Jianchao Yang2, Yanning Zhang1, and Thomas S. Huang21School of Computer Science, Northwestern Polytechnical University, Xi’an, China 2Beckman Institute, University of Illinois at Urbana-Champaign, USA {hczhang, jyang29, huang}ifp.uiuc.edu, Abstract. This paper presents a non-local kernel regression NL-KR for image and video restoration tasks, which exploits both the non-local self-similarity and local structural regularity in natural im- ages. The non-local self-similarity is based on the observation that image patches tend to repeat themselves in natural images and videos; and the local structural regularity reveals that image patches have regular struc- tures where accurate estimation of pixel values via regression is possible. Explicitly unifying both properties, the proposed non-local kernel re- gression framework is robust and applicable to various image and videorestoration tasks. In this work, we are specifically interested in applying the NL-KR model to image and video super-resolution SR reconstruc- tion. Extensive experimental results on both single images and realistic video sequences demonstrate the superiority of the proposed framework for SR tasks over previous works both qualitatively and quantitatively.1IntroductionOne of the recent trends in image processing is to pursue the low-dimensional models for image representation and manipulation. Examples include the lo- cal structure based s [1], sparse representation s [2][3], manifold s [4], etc. The success of such models is guaranteed by the low Degree of Freedom DOF of the local structures in natural images, represented as mean- ingful local structural regularity as well as self-similarity of local patterns. Many conventional image processing algorithms are based on the assumption of local structural regularity, meaning that there are meaningful structures in the spatial space of natural images. Examples are structure tensor based meth-ods [1][5][6]and bilateral filtering [7]. These s utilize the local structural patterns to regularize the image processing procedure and are based on the as- sumption that images are locally smooth except at edges. Another type of s exploiting the self-similarity in natural images are recently emerging. The self-similarity property means that higher level patterns e.g., texton and pixon will repeat themselves in the image possibly in dif- ferent scales. This also indicates the DOF in one image is less than the DOFoffered by the pixel-level representation. A representative work is the popular Non-Local Means NL-Means [8], which takes advantage of the redundancy of2H.Zhang, J.Yang, Y.Zhang and T.Huangsimilar patches existing in the target image for denoising tasks. Later, this idea is generalized to handle multi-frame super-resolution tasks in [9]. Recently, this self-similarity property is thoroughly explored by Glasner et. al in [10] for ad- dressing single image super-resolution problems. Gabriel Peyr´ e et. al proposed a non-local regularization for general inverse problems [11]. We propose in this paper a Non-Local Kernel Regression NL-KR for image and video restoration see Fig. 1 for a graphical illustration of the proposed model. We take advantage of both local structural regularity and non-local similarity in a unified framework for more reliable and robust estimation. The non-local similarity and local structural regularity are intimately related, and are also complimentary in the sense that non-local similar pattern fusion can be regularized by the structural regularity while the redundancy from similar patterns enables more accurate estimation for structural regression.The rest of the paper is organized as follows. We first review and summarize related works in Section 2, then we propose our NL-KR model and discuss its relations to other algorithms in Section 3. The practical algorithm for SR based on NL-KR is described in Section 4. Experiments are carried out in Section 5 on both synthetic and real image sequences, and extensive comparisons are made with both classical as well as state-of-the-art s. Section 6 provides some discussions and concludes our paper.Fig.1. Non-Local Kernel Regression. 1 Similar patch searching different colors indicate the similarity red the highest, green the medium and blue the least; 2 Structural kernel estimation and reweighting estimate a regression kernel adapted to the structure at each position where the similar patches reside and re-weight them according to similarity; 3 Non-local kernel regression estimate the value for the query point with both local structural and non-local similar ination in raster-scan order.2Related WorksIn this work, we are interested in image and video restorations where we desire to estimate the pixel value of a given location in the image plane e.g., image super-resolution, inpainting and denoising. This section presents a brieftechnical review of local structural regression or filtering as well as the non-local similarity-based approach.Non-Local Kernel Regression for Image and Video Restoration32.1Local Structural RegressionTypical image filtering s usually per in a local manner, i.e., thue of the estimated image at a query location is influenced only by the pixels within a small neighborhood of that position. They usually take the ofˆ zxi argminzj∈Nxiyj− z2Kxixj− xi1where Nxi denotes the neighbors of xi, and Kxixj− xi is the spatial ker- nel at location xithat assigns larger weights to nearby similar pixels while smaller weights to farther non-similar pixels. Since the local image structure is not isotropic, local structure aware kernels are developed, with representative examples as Orientated Gaussian kernel [1] and Bilateral kernel [7]. To approx- imate the local structure better, higher order estimation can be usedˆ a argminaRxiY − Φa2WKxi2Here Φ is the polynomial bases given in Eq. 3 developed from Taylor expan- sion3with a the corresponding regression coefficients, and tril· extracts the lower triangular part of a matrix and stack it to a column vector. WKxi diag[Kxix1−xi,Kxix2−xi,··· ,Kxixm−xi] m |Nxi| is the weightmatrix defined by the kernel. Rxitakes a patch centered at xifrom Y and rep- resents it as a vector.Φ ⎡⎢⎢⎢⎣1 x1− xiTtril{x1− xix1− xiT}T··· 1 x2− xiTtril{x2− xix2− xiT}T··· ............ 1 xm− xiTtril{xm− xixm− xiT}T···⎤⎥⎥⎥⎦3Therefore, the first element of the regression coefficient is the desired pixel value estimation at xi.ˆ zxi eT1ΦTWKxiΦ−1ΦTWKxiRxiY4where e1is a vector with the first element equal to one, and the rest zero.2.2Non-Local Similarity-based EstimationLocal image structures tend to repeat themselves across the image and also the image sequence in videos. This property has been explored in many appli- cations such as texture synthesis [12], image inpainting [13], denoising [8] and super-resolution [9] [14]. This self-similarity property provides the redundancy3The regression bases do not have to be polynomial, and other choices are open. For more details about deriving the polynomial bases, one can refer to [6], which gives a nice tutorial on kernel regression.4H.Zhang, J.Yang, Y.Zhang and T.Huangthat is sometimes critical for many ill-posed image processing problems, as sim- ilar structures can be regarded as multiple observations from the same underly- ing ground truth. For instance, the NL-Means algorithm recently introduced by Buades et al. in [8] for image denoising has become very popular, due to its ef- fectiveness despite of its simplicity. The algorithm breaks the locality constraintsof previous conventional filtering s, making use of similar patterns foundin different locations of the image to denoise the image. Specifically, NL-Means algorithm is a weighted averagezxi j∈Pxiwijyj j∈Pxiwij5where Pxi denote the index set for similar pixel observations for xiincludes xiitself. The weight wijreflects the similarity between the observations at xi and xj[8]. Eq. 5, can be reulated into an optimization problemˆ zxi argminzj∈Pxi[yj− zxi]2wij argminzy − 1zxi2Wxi6where y denotes the vector consisting of the pixels at the locations in the similar set Pxi, 1 denotes a vector of all ones, and Wxi diag[wi1,wi2,...,wim] m |Pxi|. Compared with Eq. 2, the NL-Means estimation Eq. 6 can be regarded as a zero-order estimation, and the weight matrix is constructed from the similarity measures instead of the spatial kernel as before.3Non-Local Kernel Regression Model3.1Mathematical ulationWe derive the Non-Local Kernel Regression NL-KR algorithm in this sec- tion. The approach makes full use of both cues from local structural regularity and non-local similarity for image and video restoration. We argue that the pro- posed approach is more reliable and robust for ill-posed inverse problems local structural regression regularize the noisy candidates found by non-local similar- ity search; and non-local similarity provides the redundancy preventing possibleoverfitting of the local structural regression. Instead of using a point prediction model in non-local s, we use the more reliable local structure-based pre- diction. On the other hand, rather than predicting the value with only one local patch, we can try to make use of all the non-local similar patches in natural images. Mathematically, the proposed high-order Non-Local Kernel Regression model is ulated asNon-Local Kernel Regression for Image and Video Restoration5ˆ a argmina1 2local wiiRxiY − Φa2WKxi1 2non−localj∈Pxi\{i}wijRxjY − Φa2WKxj argmina1 2j∈PxiwijRxjY − Φa2WKxj argmina1 2j∈PxiRxjY − Φa2˜Wxj7where WKxjis the weight matrix constructed from kernel Kxj, and Pxi again is the similar index set for xi. wijis calculated between the location xiof interest and any position xjj ∈ Pxi by measuring the similarity of their neighborhoods weighted by a Gaussian kernelwij exp −RxiY − RxjY 2WG 2σ2 8where WGis the weight matrix constructed from a Gaussian kernel, which puts larger weights on the centering pixels of the patch. The proposed NL-KR regres- sion model consists of two parts1. Local regression term the traditional local regression or filtering term, with wiiset to be 1. This term also contributes as a fidelity loss, as the estimation should be close to the observation. 2. Non-local regression term instead of zero-order point estimation as in Non-Local means, higher-order kernel regression is also used to make full use the structural redundancy in the similar patches.The effects of these two parts will be more clear with experimental results inSection 5. To get the regression coefficients, differentiate the right hand side of Eq. 7 with respect to a and set it to zero, we haveˆ a ⎡⎣ΦT⎛⎝j∈PxiwijWKxj⎞⎠Φ⎤⎦−1ΦTj∈PxiwijWKxjRxjY9Then ˆ zxi eT1ˆ a. Examination on Eq. 9, we have the following two comments – The structural kernel is estimated from contaminated observations, and thus is not robust. Compared with Eq. 4, with non-local redundancy, our estima- tion is more stable because of the weighted average of kernel weight matrices inside the inverse, and the weighted average of the structural pixel values. – Compared with Eq. 6, our model can regularize the estimation from the non- local patches by structural higher-order regression, and thus is more robust to outliers.Therefore, the proposed model make full use of both important cues from local structure and non-local similarity, leading to more reliable and robust estimation,which will be verified by experimental results in Sec. 5.6H.Zhang, J.Yang, Y.Zhang and T.Huang3.2Structural Kernel EstimationIt is desirable to use a structure adaptive kernel in estimation. Given the observation Y and a query location xi, we can construct a structure adaptive kernel asKxix − xi 1detTexp −12x − xiTT−1x − xi 10where T is the diffusion tensor at xicontrolling the spatial structure of thekernel. Given two unit vectors u and v defined by the gradient and tangent direction respectively, we can construct T fuuT gvvTand adjust f and g according to the underlying structure, so that the induced kernel is isotropic f ≈ g at almost constant regions and aligned along the image contour g f otherwise. One possible choice4is fα,β βγ αγand gα,β αγ βγγ ≥ 0,where α and β are the eigen values of the structure tensor [1], reflecting the strength of the gradient along each eigenvector directions. α, β, u and v can be calculated from the following relation using singular value decomposition SVD∇Yxi∇YT xi αuuT βvvT11where ∇Yxiis a 2 1 vector, denoting the gradient of Y at xi.3.3Relations to Other WorksTons of works have emerged recently based on non-local redundancy and local regressions for image and video processing. It is worthwhile to talk about the relations of the proposed model to those previously proposed algorithms. The non-local models in [4], [8] and [9] use the redundancy from non-local self- similarity, but do not include the spatial structure explicitly as a regularization. The high order generalization of non-local means in [15] uses the computation ofnon-local similarity to find the local kernel for regression, which actually violates the philosophy of the non-local model. Local structural regression [1][6][7] ex- plicitly employ the spatial kernel for regularization, but neglect the redundancy of similar local patterns useful for robust estimation. The 3D kernel regression [16] exploits the local spatial-temporal structure by extending their 2D spatial kernel regression, also discards the non-local self-similarity. Sparse repre- sentation for denoising [2] and super-resolution [3] do local regression using bases learned from a training database. They per estimation on each individual local patch and discard the patch redundancy. The sparse representation model is later generalized in [17] for image denoising by doing simultaneous sparse cod-ing over similar patches found in different locations of the image. However, the non-local redundancy is used in a hard assignment clustering way instead of a soft way. [10] fully explores the self-similarity property for single image super- resolution, but no spatial structural regularization is applied. To summarize, ourmodel is the first work to explicitly unify the self-similarity and local structural regularization into a single model, allowing more robust estimation.4One can refer to [1] for other choices of diffusion tensor.Non-Local Kernel Regression for Image and Video Restoration7Fig.2. Operator Illustration. RlxjY , the patch of the LR image Y at xj, is ed by downsampling HR patch RhxjZ by factor r 2 as DpRh xjZ, keeping the c