Browsing by Author "Spokoiny, Vladimir"
Now showing 1 - 20 of 24
- Item: Adaptive gradient descent for convex and non-convex stochastic optimization (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2019). Ogaltsov, Aleksandr; Dvinskikh, Darina; Dvurechensky, Pavel; Gasnikov, Alexander; Spokoiny, Vladimir. In this paper we propose several adaptive gradient methods for stochastic optimization. Our methods are based on Armijo-type line search and simultaneously adapt to the unknown Lipschitz constant of the gradient and to the variance of the stochastic approximation of the gradient. We consider an accelerated gradient descent for convex problems and gradient descent for non-convex problems. In the experiments we demonstrate the superiority of our methods over existing adaptive methods such as AdaGrad and Adam.
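A minimal sketch of the Armijo-type line-search idea described above, assuming an exact (non-stochastic) gradient oracle for simplicity; the halve/double update of a local Lipschitz estimate is a common adaptive scheme, not necessarily the exact rule of the paper, and all function names are illustrative:

```python
import numpy as np

def armijo_gradient_descent(f, grad, x0, L0=1.0, n_iters=100, c=0.5):
    """Gradient descent with Armijo-type backtracking: the local smoothness
    estimate L is optimistically halved after every accepted step and
    doubled until the sufficient-decrease condition for a 1/L step holds."""
    x, L = np.asarray(x0, dtype=float), L0
    for _ in range(n_iters):
        g = grad(x)
        L = max(L / 2.0, 1e-12)              # try a larger step first
        while True:
            x_new = x - g / L
            if f(x_new) <= f(x) - c * np.dot(g, g) / L:
                break                         # sufficient decrease reached
            L *= 2.0                          # smoothness was underestimated
        x = x_new
    return x

# toy usage: minimise a simple quadratic
f = lambda x: 0.5 * np.dot(x, x)
grad = lambda x: x
print(armijo_gradient_descent(f, grad, np.ones(5)))
```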
- Item: Adaptive manifold clustering (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2020). Besold, Franz; Spokoiny, Vladimir. Clustering methods seek to partition data such that elements are more similar to elements in the same cluster than to elements in different clusters. The main challenge in this task is the lack of a unified definition of a cluster, especially for high dimensional data. Different methods and approaches have been proposed to address this problem. This paper continues the study originated by [6], where a novel approach to adaptive nonparametric clustering called Adaptive Weights Clustering (AWC) was offered. The method allows analyzing high-dimensional data with an unknown number of unbalanced clusters of arbitrary shape under very weak modeling assumptions. The procedure demonstrates a state-of-the-art performance and is very efficient even for large data dimension D. However, the theoretical study in [6] is very limited and did not really address the question of efficiency. This paper makes a significant step in understanding the remarkable performance of the AWC procedure, particularly in high dimension. The approach is based on combining the ideas of adaptive clustering and manifold learning. The manifold hypothesis means that high dimensional data can be well approximated by a d-dimensional manifold for small d, helping to overcome the curse of dimensionality problem and to get sharp bounds on the cluster separation which only depend on the intrinsic dimension d. We also address the problem of parameter tuning. Our general theoretical results are illustrated by some numerical experiments.
- Item: Bayesian inference for spectral projectors of the covariance matrix (Ithaca, NY : Cornell University Library, 2018). Silin, Igor; Spokoiny, Vladimir. Let X_1, ..., X_n be an i.i.d. sample in R^p with zero mean and covariance matrix Σ*. The classical PCA approach recovers the projector P*_J onto the principal eigenspace of Σ* by its empirical counterpart P̂_J. The recent paper [24] investigated the asymptotic distribution of the Frobenius distance between the projectors ||P̂_J − P*_J||_2, while [27] offered a bootstrap procedure to measure uncertainty in recovering this subspace P*_J even in a finite sample setup. The present paper considers this problem from a Bayesian perspective and suggests using the credible sets of the pseudo-posterior distribution on the space of covariance matrices, induced by the conjugate Inverse Wishart prior, as sharp confidence sets. This yields a numerically efficient procedure. Moreover, we theoretically justify this method and derive finite sample bounds on the corresponding coverage probability. Contrary to [24, 27], the obtained results are valid for non-Gaussian data: the main assumption that we impose is the concentration of the sample covariance Σ̂ in a vicinity of Σ*. Numerical simulations illustrate good performance of the proposed procedure even on non-Gaussian data in a rather challenging regime.
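The Bayesian procedure lends itself to a simple Monte Carlo implementation. Below is a sketch under assumed prior parameters (the paper's calibration and the exact form of the credible sets may differ): draw covariance matrices from a conjugate Inverse Wishart pseudo-posterior, map each draw to the projector onto its leading eigenspace, and return the radius of a (1 − α) credible ball around the empirical projector:

```python
import numpy as np
from scipy.stats import invwishart

def top_projector(S, J):
    """Projector onto the span of the J leading eigenvectors of S."""
    _, vecs = np.linalg.eigh(S)               # eigenvalues in ascending order
    U = vecs[:, -J:]
    return U @ U.T

def credible_radius(X, J, alpha=0.05, n_draws=500, seed=0):
    """Monte Carlo radius of a (1 - alpha) credible set for the spectral
    projector under an Inverse Wishart pseudo-posterior (assumed prior)."""
    n, p = X.shape
    S = X.T @ X                                # zero-mean data assumed
    P_hat = top_projector(S / n, J)
    post = invwishart(df=n + p + 1, scale=S + np.eye(p))
    draws = post.rvs(size=n_draws, random_state=seed)
    dists = [np.linalg.norm(top_projector(Sig, J) - P_hat) for Sig in draws]
    return np.quantile(dists, 1 - alpha)

X = np.random.default_rng(1).normal(size=(200, 5))
print(round(credible_radius(X, J=2), 3))
```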
- Item: Bootstrap confidence sets under a model misspecification (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2014). Spokoiny, Vladimir; Zhilova, Mayya. A multiplier bootstrap procedure for the construction of likelihood-based confidence sets is considered for finite samples and possible model misspecification. Theoretical results justify the bootstrap consistency for small or moderate sample size and allow one to control the impact of the parameter dimension: the bootstrap approximation works if the ratio of the cube of the parameter dimension to the sample size is small. The main result about bootstrap consistency continues to apply even if the underlying parametric model is misspecified, under the so-called Small Modeling Bias condition. In the case when the true model deviates significantly from the considered parametric family, the bootstrap procedure is still applicable but becomes somewhat conservative: the size of the constructed confidence sets is increased by the modeling bias. We illustrate the results with numerical examples of misspecified constant and logistic regressions.
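A toy sketch of a multiplier bootstrap for a likelihood-based confidence set, here with exponential weights and a one-dimensional parameter on a grid; the weight distribution, the model and the calibration are illustrative assumptions rather than the paper's construction:

```python
import numpy as np

def multiplier_bootstrap_quantile(loglik_i, theta_grid, n, alpha=0.05,
                                  n_boot=500, seed=0):
    """Quantile of the bootstrap likelihood-ratio statistic obtained by
    reweighting the per-observation log-likelihood contributions with
    i.i.d. multipliers of mean one and variance one."""
    rng = np.random.default_rng(seed)
    total = lambda th, w: np.dot(w, loglik_i(th))
    theta_hat = max(theta_grid, key=lambda th: total(th, np.ones(n)))
    stats = []
    for _ in range(n_boot):
        w = rng.exponential(size=n)                      # multiplier weights
        sup_w = max(total(th, w) for th in theta_grid)   # weighted maximum
        stats.append(sup_w - total(theta_hat, w))        # bootstrap LR statistic
    return np.quantile(stats, 1 - alpha)

# toy usage: likelihood-based confidence interval for a (possibly misspecified) mean
x = np.random.default_rng(1).standard_normal(100) + 0.3
loglik_i = lambda th: -0.5 * (x - th) ** 2
grid = np.linspace(-1.0, 1.0, 201)
q = multiplier_bootstrap_quantile(loglik_i, grid, n=len(x))
L = np.array([np.sum(loglik_i(th)) for th in grid])
ci = grid[L >= L.max() - q]
print(round(q, 3), round(ci.min(), 3), round(ci.max(), 3))
```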
- Item: Critical dimension in profile semiparametric estimation (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2013). Andresen, Andreas; Spokoiny, Vladimir. This paper revisits the classical inference results for profile quasi maximum likelihood estimators (profile MLE) in the semiparametric estimation problem. We mainly focus on two prominent theorems: the Wilks phenomenon and the Fisher expansion for the profile MLE are stated in a new fashion allowing finite samples and model misspecification. The method of study is also essentially different from the usual analysis of the semiparametric problem based on the notion of the hardest parametric submodel. Instead we apply the local bracketing and upper function devices from Spokoiny (2012). This novel approach in particular allows one to address the important issue of the effective target and nuisance dimension, and it does not involve any pilot estimator of the target parameter. The obtained nonasymptotic results are surprisingly sharp and yield the classical asymptotic statements, including the asymptotic normality and efficiency of the profile MLE. The general results are specified to the important special case of an i.i.d. sample.
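For readers unfamiliar with profile estimation, the object studied here can be summarised in a few lines: the nuisance parameter is maximised out for each value of the target parameter, and the target estimate maximises the resulting profile curve. A minimal sketch with an assumed Gaussian toy model (the paper's setting is far more general):

```python
import numpy as np
from scipy.optimize import minimize

def profile_mle(neg_loglik, theta_grid, eta0):
    """Profile (quasi) maximum likelihood: maximise out the nuisance
    parameter eta for each target value theta, then maximise over theta."""
    profile = []
    for th in theta_grid:
        res = minimize(lambda eta: neg_loglik(th, eta), x0=eta0)
        profile.append(-res.fun)
    profile = np.array(profile)
    return theta_grid[np.argmax(profile)], profile

# toy usage: theta = mean, eta = log-variance of Gaussian data
x = np.random.default_rng(0).normal(1.0, 2.0, size=300)
neg_loglik = lambda th, eta: 0.5 * np.exp(-eta[0]) * np.sum((x - th) ** 2) + 0.5 * len(x) * eta[0]
grid = np.linspace(0.0, 2.0, 101)
theta_hat, _ = profile_mle(neg_loglik, grid, eta0=np.array([0.0]))
print(round(theta_hat, 3))
```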
- Item: Diffusion tensor imaging : structural adaptive smoothing (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2007). Tabelow, Karsten; Polzehl, Jörg; Spokoiny, Vladimir; Voss, Henning U. Diffusion Tensor Imaging (DTI) data is characterized by a high noise level. Thus, estimation errors of quantities like anisotropy indices or the main diffusion direction used for fiber tracking are relatively large and may significantly confound the accuracy of DTI in clinical or neuroscience applications. Besides pulse sequence optimization, noise reduction by smoothing the data can be pursued as a complementary approach to increase the accuracy of DTI. Here, we suggest an anisotropic structural adaptive smoothing procedure, which is based on the Propagation-Separation method and preserves the structures seen in DTI and their different sizes and shapes. It is applied to artificial phantom data and a brain scan. We show that this method significantly improves the quality of the estimate of the diffusion tensor and hence enables one either to reduce the number of scans or to enhance the input for subsequent analysis such as fiber tracking.
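The Propagation-Separation idea can be illustrated on a scalar one-dimensional signal; this is a deliberately simplified sketch (tensor-valued DTI data require a more involved statistical penalty and anisotropic location kernels), and all constants are chosen for illustration only:

```python
import numpy as np

def adaptive_smooth_1d(y, sigma2=0.04, lam=5.0, n_steps=6, h0=1.0):
    """Simplified Propagation-Separation-style smoothing: the bandwidth grows
    geometrically, and weights combine a location kernel with a statistical
    penalty comparing current estimates, so averaging stops at structural
    boundaries instead of blurring them."""
    n = len(y)
    x = np.arange(n, dtype=float)
    theta = y.astype(float).copy()               # current estimates
    N = np.ones(n)                               # effective sample sizes
    h = h0
    for _ in range(n_steps):
        d = (x[:, None] - x[None, :]) / h
        w_loc = np.maximum(1.0 - d ** 2, 0.0)                        # location kernel
        pen = N[:, None] * (theta[:, None] - theta[None, :]) ** 2 / (lam * sigma2)
        w = w_loc * np.maximum(1.0 - pen, 0.0)                       # statistical penalty
        theta = (w @ y) / w.sum(axis=1)
        N = w.sum(axis=1)
        h *= 1.25                                # propagate: enlarge the neighbourhood
    return theta

# toy usage: a noisy step function keeps its edge after smoothing
rng = np.random.default_rng(0)
y = np.concatenate([np.zeros(50), np.ones(50)]) + 0.2 * rng.standard_normal(100)
print(np.round(adaptive_smooth_1d(y)[[25, 75]], 2))
```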
- Item: Effiziente Methoden zur Bestimmung von Risikomaßen : Schlussbericht ; Projekt des BMBF-Förderprogramm "Neue Mathematische Verfahren in Industrie und Dienstleistungen" [Efficient methods for computing risk measures : final report ; project of the BMBF funding programme "New Mathematical Methods in Industry and Services"] (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2004). Schoenmakers, John; Spokoiny, Vladimir; Reiß, Oliver; Zacherias-Langhans, Johan-Hinrich. [no abstract available]
- Item: Exponential bounds for the minimum contrast with some applications (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2007). Golubev, Yuri; Spokoiny, Vladimir. The paper studies parametric minimum contrast estimates under rather general conditions. The quality of estimation is measured by the rate function related to the contrast, which allows for stating the results without specifying the particular parametric structure of the model. This approach also permits going far beyond the classical i.i.d. case and obtaining nonasymptotic upper bounds for the risk. These bounds apply even for small or moderate samples. They also cover the case of misspecified parametric models. Another important feature of the approach is that it works well in the case when the parametric set can be unbounded and non-compact. In the case of a smooth contrast, the obtained exponential bounds do not rely on covering numbers and can be easily computed. We also illustrate how these bounds can be used for statistical inference: bounding the estimation risk, constructing confidence sets for the underlying parameters, and establishing the concentration properties of the minimum contrast estimate. The general results are specified to the case of a Gaussian contrast and of an i.i.d. sample. We also illustrate the approach by several popular examples, including least squares and least absolute deviation contrasts and the problem of estimating the location of a change point. What we obtain in these examples differs slightly from the usual asymptotic results known in the classical literature. This difference is due to the unboundedness of the parameter set and possible model misspecification.
- Item: Forward and reverse representations for Markov chains (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2006). Milstein, Grigori N.; Schoenmakers, John G.M.; Spokoiny, Vladimir. In this paper we carry over the concept of reverse probabilistic representations developed in Milstein, Schoenmakers, Spokoiny (2004) for diffusion processes to discrete time Markov chains. We outline the construction of reverse chains in several situations and apply this to processes which are connected with jump-diffusion models and finite state Markov chains. By combining forward and reverse representations we then construct transition density estimators for chains which have root-N accuracy in any dimension, and consider some applications.
- Item: Gaussian processes with multidimensional distribution inputs via optimal transport and Hilbertian embedding (Ithaca, NY : Cornell University Library, 2020). Bachoc, François; Suvorikova, Alexandra; Ginsbourger, David; Loubes, Jean-Michel; Spokoiny, Vladimir. In this work, we propose a way to construct Gaussian processes indexed by multidimensional distributions. More precisely, we tackle the problem of defining positive definite kernels between multivariate distributions via notions of optimal transport and appealing to Hilbert space embeddings. Besides presenting a characterization of radial positive definite and strictly positive definite kernels on general Hilbert spaces, we investigate the statistical properties of our theoretical and empirical kernels, focusing in particular on consistency as well as the special case of Gaussian distributions. A wide set of applications is presented, both using simulations and implementation with real data.
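The construction is easiest to see in one dimension, where the 2-Wasserstein distance between empirical distributions is the L2 distance between their quantile functions; the kernel, its hyperparameters and the regression task below are illustrative assumptions, and the multivariate case treated in the paper requires the additional Hilbertian embedding:

```python
import numpy as np

def w2_1d(sample_a, sample_b, n_q=100):
    """2-Wasserstein distance between two univariate empirical distributions,
    computed via their quantile functions."""
    q = np.linspace(0.005, 0.995, n_q)
    return np.sqrt(np.mean((np.quantile(sample_a, q) - np.quantile(sample_b, q)) ** 2))

def gp_posterior_mean(train_samples, y, test_samples, ell=1.0, noise=1e-3):
    """GP regression whose inputs are distributions (given as samples), with a
    squared-exponential kernel of the Wasserstein distance."""
    def gram(A, B):
        return np.array([[np.exp(-w2_1d(a, b) ** 2 / ell ** 2) for b in B] for a in A])
    K = gram(train_samples, train_samples) + noise * np.eye(len(train_samples))
    return gram(test_samples, train_samples) @ np.linalg.solve(K, y)

# toy usage: predict the standard deviation of a distribution from a sample of it
rng = np.random.default_rng(0)
train = [rng.normal(0, s, size=200) for s in (0.5, 1.0, 1.5, 2.0)]
y = np.array([0.5, 1.0, 1.5, 2.0])
test = [rng.normal(0, 1.2, size=200)]
print(np.round(gp_posterior_mean(train, y, test), 2))
```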
- Item: In search of non-Gaussian components of a high-dimensional distribution (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2006). Blanchard, Gilles; Kawanabe, Motoaki; Sugiyama, Masashi; Spokoiny, Vladimir; Müller, Klaus-Robert. Finding non-Gaussian components of high-dimensional data is an important preprocessing step for efficient information processing. This article proposes a new linear method to identify the "non-Gaussian subspace" within a very general semi-parametric framework. Our proposed method, called NGCA (Non-Gaussian Component Analysis), is essentially based on the fact that we can construct a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector which belongs to the low dimensional non-Gaussian target subspace up to an estimation error. By applying this operator to a family of different nonlinear functions, one obtains a family of different vectors lying in a vicinity of the target space. As a final step, the target space itself is estimated by applying PCA to this family of vectors. We show that this procedure is consistent in the sense that the estimation error tends to zero at a parametric rate, uniformly over the family. Numerical examples demonstrate the usefulness of our method.
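A compact sketch of the NGCA recipe outlined in the abstract, using the family h(x) = tanh(<omega, x>) with random directions omega as an assumed choice of nonlinear functions; the returned basis lives in the whitened coordinates and the constants are illustrative:

```python
import numpy as np

def ngca(X, d, n_funcs=50, seed=0):
    """Non-Gaussian Component Analysis sketch: for each test function h,
    the vector E[grad h(Z)] - E[Z h(Z)] of the whitened data Z lies near the
    non-Gaussian subspace; PCA over these vectors estimates that subspace."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(Xc, rowvar=False))
    Z = Xc @ np.linalg.inv(L).T                  # whitened data
    omegas = rng.standard_normal((n_funcs, p))
    omegas /= np.linalg.norm(omegas, axis=1, keepdims=True)
    betas = []
    for w in omegas:
        t = Z @ w
        h, dh = np.tanh(t), 1.0 - np.tanh(t) ** 2
        betas.append((dh[:, None] * w).mean(axis=0) - (Z * h[:, None]).mean(axis=0))
    B = np.array(betas)
    _, vecs = np.linalg.eigh(B.T @ B)
    return vecs[:, -d:]                          # estimated subspace (whitened coordinates)

# toy usage: one uniform (non-Gaussian) coordinate hidden among Gaussian ones
rng = np.random.default_rng(1)
X = np.column_stack([rng.uniform(-2, 2, 5000), 2.0 * rng.standard_normal((5000, 3))])
print(np.round(ngca(X, d=1).ravel(), 2))
```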
- Item: Inhomogeneous dependence modelling with time varying copulae (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2007). Giacomini, Enzo; Härdle, Wolfgang; Spokoiny, Vladimir. Measuring dependence in a multivariate time series is tantamount to modelling its dynamic structure in space and time. In the context of a multivariate normally distributed time series, the evolution of the covariance (or correlation) matrix over time describes this dynamic. A wide variety of applications, though, requires a modelling framework different from the multivariate normal. In risk management, the non-normal behaviour of most financial time series calls for non-Gaussian dependence. The correct modelling of non-Gaussian dependence is therefore a key issue in the analysis of multivariate time series. In this paper we use copula functions with adaptively estimated, time varying parameters for modelling the distribution of returns, free from the usual normality assumptions. Further, we apply copulae to the estimation of Value-at-Risk (VaR) of portfolios and show their better performance over the RiskMetrics approach, a widely used methodology for VaR estimation.
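A crude stand-in for copula-based VaR estimation, assuming a Gaussian copula fitted on a single rolling window by rank correlation and empirical marginals; the adaptive, time-varying parameter estimation of the paper is replaced here by a fixed window, and all names and constants are illustrative:

```python
import numpy as np
from scipy import stats

def gaussian_copula_var(returns, weights, window=250, alpha=0.01,
                        n_sim=10000, seed=0):
    """Portfolio VaR from a Gaussian copula with empirical marginals:
    estimate the copula correlation from Kendall's tau on the latest window,
    simulate from the copula, map back through empirical quantiles."""
    rng = np.random.default_rng(seed)
    X = returns[-window:]
    n, d = X.shape
    R = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            tau, _ = stats.kendalltau(X[:, i], X[:, j])
            R[i, j] = R[j, i] = np.sin(np.pi * tau / 2)   # Gaussian copula relation
    Z = rng.multivariate_normal(np.zeros(d), R, size=n_sim)
    U = stats.norm.cdf(Z)
    sims = np.column_stack([np.quantile(X[:, j], U[:, j]) for j in range(d)])
    portfolio = sims @ weights
    return -np.quantile(portfolio, alpha)

# toy usage: two correlated assets, equally weighted
rng = np.random.default_rng(1)
rets = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=500) * 0.01
print(round(gaussian_copula_var(rets, np.array([0.5, 0.5])), 4))
```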
- Item: Locally time homogeneous time series modelling (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2008). Elagin, Mstislav; Spokoiny, Vladimir. In this paper three locally adaptive estimation methods are applied to the problems of variance forecasting, value-at-risk analysis and volatility estimation within the context of nonstationary financial time series. A general procedure for the computation of critical values is given. Numerical results exhibit a very reasonable performance of the methods.
- Item: Modern Nonparametric Statistics: Going Beyond Asymptotic Minimax (Zürich : EMS Publ. House, 2010). Johnstone, Iain M.; Spokoiny, Vladimir. During the years 1975-1990 a major emphasis in nonparametric estimation was put on computing the asymptotic minimax risk for many classes of functions. Modern statistical practice indicates some serious limitations of the asymptotic minimax approach and calls for some new ideas and methods which can cope with the numerous challenges brought to statisticians by modern sets of data.
- Item: New Inference Concepts for Analysing Complex Data (Zürich : EMS Publ. House, 2004). Müller, Klaus-Robert; Spokoiny, Vladimir. The main purpose of this workshop was to assemble international leaders from statistics and machine learning to identify important research problems, to cross-fertilize between the disciplines, and to ultimately start coordinated research efforts toward better solutions. The workshop focused on discussing modern methods for analysing complex high-dimensional data, with applications to econometrics, finance, biomedicine, genomics, etc.
- Item: Optimal stopping via deeply boosted backward regression (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2018). Belomestny, Denis; Schoenmakers, John G.M.; Spokoiny, Vladimir; Tavyrikov, Yuri. In this note we propose a new approach towards numerically solving optimal stopping problems via boosted regression based Monte Carlo algorithms. The main idea of the method is to boost standard linear regression algorithms in each backward induction step by adding new basis functions based on previously estimated continuation values. The proposed methodology is illustrated by several numerical examples from finance.
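A sketch of the boosting idea for a Bermudan put on simulated paths: the regression basis at each backward step is augmented with the continuation value estimated at the later step. The basis functions, in-the-money filtering and the model are illustrative choices, not necessarily those of the note:

```python
import numpy as np

def boosted_backward_pricing(paths, payoff, discount):
    """Backward regression for optimal stopping with a 'boosted' basis:
    polynomials in the state plus the previously estimated continuation value."""
    n_paths, n_steps = paths.shape
    cash = payoff(paths[:, -1])                  # value at the last exercise date
    prev_cont = np.zeros(n_paths)                # extra (boosted) basis function
    for t in range(n_steps - 2, 0, -1):
        x = paths[:, t]
        basis = np.column_stack([np.ones_like(x), x, x ** 2, prev_cont])
        itm = payoff(x) > 0                      # regress on in-the-money paths only
        coef, *_ = np.linalg.lstsq(basis[itm], discount * cash[itm], rcond=None)
        cont = basis @ coef                      # estimated continuation values
        cash = np.where(itm & (payoff(x) >= cont), payoff(x), discount * cash)
        prev_cont = cont
    return discount * cash.mean()

# toy usage: Bermudan put on geometric Brownian motion paths
rng = np.random.default_rng(0)
n_paths, n_steps, dt, sigma, r, s0, K = 20000, 10, 0.1, 0.2, 0.03, 100.0, 100.0
dW = rng.standard_normal((n_paths, n_steps - 1)) * np.sqrt(dt)
logS = np.cumsum((r - 0.5 * sigma ** 2) * dt + sigma * dW, axis=1)
paths = s0 * np.exp(np.column_stack([np.zeros(n_paths), logS]))
payoff = lambda s: np.maximum(K - s, 0.0)
print(round(boosted_backward_pricing(paths, payoff, discount=np.exp(-r * dt)), 3))
```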
- Item: Parameter estimation in time series analysis (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2009). Spokoiny, Vladimir. The paper offers a novel unified approach to studying the accuracy of parameter estimation for a time series. Important features of the approach are: (1) The underlying model is not assumed to be parametric. (2) The imposed conditions on the model are very mild and can be easily checked in specific applications. (3) The considered time series need not be ergodic or stationary. The approach is equally applicable to ergodic, unit root and explosive cases. (4) The parameter set can be unbounded and non-compact. (5) No conditions on parameter identifiability are required. (6) The established risk bounds are nonasymptotic and valid for large, moderate and small samples. (7) The results describe confidence and concentration sets rather than the accuracy of point estimation. The whole approach can be viewed as complementary to the classical one based on the asymptotic expansion of the log-likelihood. In particular, it claims consistency of the considered estimate in a rather general sense, which is usually assumed to be fulfilled in the asymptotic analysis. In standard situations under ergodicity conditions, the usual rate results can be easily obtained as corollaries from the established risk bounds. The approach and the results are illustrated on a number of popular time series models including autoregressive, Generalized Linear time series, ARCH and GARCH models and median/quantile regression.
- Item: Regression methods in pricing American and Bermudan options using consumption processes (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2006). Belomestny, Denis; Milstein, Grigor N.; Spokoiny, Vladimir. Here we develop methods for efficiently pricing multidimensional discrete-time American and Bermudan options by using regression based algorithms together with a new approach towards constructing upper bounds for the price of the option. Applying a sample space with payoffs at the optimal stopping times, we propose sequential estimates for continuation values, values of the consumption process, and stopping times on the sample paths. The approach admits constructing both lower and upper bounds for the price by Monte Carlo simulations. The methods are illustrated by pricing Bermudan swaptions and snowballs in the Libor market model.
- Item: Reinforced optimal control (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2020). Bayer, Christian; Belomestny, Denis; Hager, Paul; Pigato, Paolo; Schoenmakers, John G. M.; Spokoiny, Vladimir. Least squares Monte Carlo methods are a popular numerical approximation method for solving stochastic control problems. Based on dynamic programming, their key feature is the approximation of the conditional expectation of future rewards by linear least squares regression. Hence, the choice of basis functions is crucial for the accuracy of the method. Earlier work by some of us [Belomestny, Schoenmakers, Spokoiny, Zharkynbay, Commun. Math. Sci., 18(1):109–121, 2020] proposes to reinforce the basis functions in the case of optimal stopping problems by already computed value functions for later times, thereby considerably improving the accuracy with limited additional computational cost. We extend the reinforced regression method to a general class of stochastic control problems, while considerably improving the method's efficiency, as demonstrated by substantial numerical examples as well as theoretical analysis.
- Item: Robust risk management : accounting for nonstationarity and heavy tails (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2007). Chen, Ying; Spokoiny, Vladimir. In the ideal Black-Scholes world, financial time series are assumed to be 1) stationary (time homogeneous), or at least globally modellable by a stationary process, and 2) conditionally normally distributed given the past. These two assumptions have been widely used in many methods, such as RiskMetrics, a risk management method considered an industry standard. However, these assumptions are unrealistic. The primary aim of the paper is to account for nonstationarity and heavy tails in time series by presenting a local exponential smoothing approach, in which the smoothing parameter is adaptively selected at every time point and the heavy-tailedness of the process is taken into account. A complete theory addresses both issues. In our study, we demonstrate the implementation of the proposed method in volatility estimation and risk management on simulated and real data. Numerical results show that the proposed method delivers accurate and sensitive estimates.
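A rough sketch of a locally adaptive exponential smoothing rule for volatility: at every time point, EWMA estimates with increasing memory are accepted stepwise as long as they stay within a critical band around the previously accepted estimate. The lambda grid and the band are illustrative and not the calibration developed in the paper:

```python
import numpy as np

def local_ewma_volatility(returns, lambdas=(0.5, 0.8, 0.9, 0.95, 0.99), z=2.5):
    """Locally adaptive EWMA of squared returns with a Lepski-type stepwise
    acceptance rule: stop extending the memory when a longer-memory estimate
    leaves the critical band of the shorter-memory one."""
    r2 = np.asarray(returns) ** 2
    T = len(r2)
    est = np.empty((len(lambdas), T))
    for k, lam in enumerate(lambdas):            # EWMA for every candidate lambda
        v = r2[0]
        for t in range(T):
            v = lam * v + (1 - lam) * r2[t]
            est[k, t] = v
    sigma2 = est[0].copy()
    for t in range(T):                           # adaptive choice at each time point
        accepted = est[0, t]
        for k in range(1, len(lambdas)):
            band = z * accepted * np.sqrt(1 - lambdas[k - 1])   # rough critical band
            if abs(est[k, t] - accepted) > band:
                break                            # change detected: keep shorter memory
            accepted = est[k, t]
        sigma2[t] = accepted
    return np.sqrt(sigma2)

# toy usage: the estimate reacts to a volatility jump halfway through the sample
rng = np.random.default_rng(0)
rets = np.concatenate([rng.normal(0, 0.01, 500), rng.normal(0, 0.03, 500)])
vol = local_ewma_volatility(rets)
print(np.round([vol[400], vol[900]], 4))
```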