CINXE.COM

Dictionary learning and tensor decomposition via the sum-of-squares method (David Steurer — ETH Zurich, Computer Science, Theory)

<!doctype html><html lang="en" class="overflow-y-scroll-ns"><head><meta charSet="utf-8"/><meta name="description" content=" We give a new approach to the dictionary learning (also known as “sparse coding”) problem of recovering an unknown $n\times m$ matrix $A$ (for $m \geq n$) from examples of the form $y = Ax + e,$ where $x$ is a random vector in $\mathbb R^m$ with at most $\tau m$ nonzero coordinates, and $e$ is a random noise vector in $\mathbb R^n$ with bounded magnitude. For the case $m=O(n)$, our algorithm recovers every column of $A$ within arbitrarily good constant accuracy in time $m^{O(\log m/\log(\tau^{-1}))}$, in particular achieving polynomial time if $\tau = m^{-\delta}$ for any $\delta&amp;gt;0$, and time $m^{O(\log m)}$ if $\tau$ is (a sufficiently small) constant. Prior algorithms with comparable assumptions on the distribution required the vector $x$ to be much sparser—at most $\sqrt{n}$ nonzero coordinates—and there were intrinsic barriers preventing these algorithms from applying for denser $x$. We achieve this by designing an algorithm for noisy tensor decomposition that can recover, under quite general conditions, an approximate rank-one decomposition of a tensor $T$, given access to a tensor $T&amp;amp;#39;$ that is $\tau$-close to $T$ in the spectral norm (when considered as a matrix). To our knowledge, this is the first algorithm for tensor decomposition that works in the constant spectral-norm noise regime, where there is no guarantee that the local optima of $T$ and $T&amp;amp;#39;$ have similar structures. Our algorithm is based on a novel approach to using and analyzing the Sum of Squares semidefinite programming hierarchy (Parrilo 2000, Lasserre 2001), and it can be viewed as an indication of the utility of this very general and powerful tool for unsupervised learning problems. "/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="icon" href="/images/eth.jpg"/><title>Dictionary learning and tensor decomposition via the sum-of-squares method (David Steurer — ETH Zurich, Computer Science, Theory)</title><style>@font-face{font-family:KaTeX_AMS;src:url(../../katex/dist/fonts/KaTeX_AMS-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_AMS-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_AMS-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Caligraphic;src:url(../../katex/dist/fonts/KaTeX_Caligraphic-Bold.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Caligraphic-Bold.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Caligraphic-Bold.ttf) format("truetype");font-weight:700;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Caligraphic;src:url(../../katex/dist/fonts/KaTeX_Caligraphic-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Caligraphic-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Caligraphic-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Fraktur;src:url(../../katex/dist/fonts/KaTeX_Fraktur-Bold.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Fraktur-Bold.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Fraktur-Bold.ttf) format("truetype");font-weight:700;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Fraktur;src:url(../../katex/dist/fonts/KaTeX_Fraktur-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Fraktur-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Fraktur-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Main;src:url(../../katex/dist/fonts/KaTeX_Main-Bold.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Main-Bold.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Main-Bold.ttf) format("truetype");font-weight:700;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Main;src:url(../../katex/dist/fonts/KaTeX_Main-BoldItalic.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Main-BoldItalic.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Main-BoldItalic.ttf) format("truetype");font-weight:700;font-style:italic;font-display:swap}@font-face{font-family:KaTeX_Main;src:url(../../katex/dist/fonts/KaTeX_Main-Italic.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Main-Italic.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Main-Italic.ttf) format("truetype");font-weight:400;font-style:italic;font-display:swap}@font-face{font-family:KaTeX_Main;src:url(../../katex/dist/fonts/KaTeX_Main-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Main-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Main-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Math;src:url(../../katex/dist/fonts/KaTeX_Math-BoldItalic.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Math-BoldItalic.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Math-BoldItalic.ttf) format("truetype");font-weight:700;font-style:italic;font-display:swap}@font-face{font-family:KaTeX_Math;src:url(../../katex/dist/fonts/KaTeX_Math-Italic.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Math-Italic.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Math-Italic.ttf) format("truetype");font-weight:400;font-style:italic;font-display:swap}@font-face{font-family:"KaTeX_SansSerif";src:url(../../katex/dist/fonts/KaTeX_SansSerif-Bold.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_SansSerif-Bold.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_SansSerif-Bold.ttf) format("truetype");font-weight:700;font-style:normal;font-display:swap}@font-face{font-family:"KaTeX_SansSerif";src:url(../../katex/dist/fonts/KaTeX_SansSerif-Italic.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_SansSerif-Italic.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_SansSerif-Italic.ttf) format("truetype");font-weight:400;font-style:italic;font-display:swap}@font-face{font-family:"KaTeX_SansSerif";src:url(../../katex/dist/fonts/KaTeX_SansSerif-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_SansSerif-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_SansSerif-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Script;src:url(../../katex/dist/fonts/KaTeX_Script-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Script-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Script-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Size1;src:url(../../katex/dist/fonts/KaTeX_Size1-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Size1-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Size1-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Size2;src:url(../../katex/dist/fonts/KaTeX_Size2-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Size2-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Size2-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Size3;src:url(../../katex/dist/fonts/KaTeX_Size3-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Size3-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Size3-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Size4;src:url(../../katex/dist/fonts/KaTeX_Size4-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Size4-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Size4-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}@font-face{font-family:KaTeX_Typewriter;src:url(../../katex/dist/fonts/KaTeX_Typewriter-Regular.woff2) format("woff2"),url(../../katex/dist/fonts/KaTeX_Typewriter-Regular.woff) format("woff"),url(../../katex/dist/fonts/KaTeX_Typewriter-Regular.ttf) format("truetype");font-weight:400;font-style:normal;font-display:swap}.katex{font:normal 1.21em KaTeX_Main,Times New Roman,serif;line-height:1.2;text-indent:0;text-rendering:auto;border-color:currentColor}.katex *{-ms-high-contrast-adjust:none!important}.katex .katex-mathml{position:absolute;clip:rect(1px,1px,1px,1px);padding:0;border:0;height:1px;width:1px;overflow:hidden}.katex .base{position:relative;white-space:nowrap;width:min-content}.katex .base,.katex .strut{display:inline-block}.katex .mathnormal{font-family:KaTeX_Math;font-style:italic}.katex .mathbb{font-family:KaTeX_AMS}.katex .vlist-t{display:inline-table;table-layout:fixed;border-collapse:collapse}.katex .vlist-r{display:table-row}.katex .vlist{display:table-cell;vertical-align:bottom;position:relative}.katex .vlist>span{display:block;height:0;position:relative}.katex .vlist>span>span{display:inline-block}.katex .vlist>span>.pstrut{overflow:hidden;width:0}.katex .vlist-t2{margin-right:-2px}.katex .vlist-s{display:table-cell;vertical-align:bottom;font-size:1px;width:2px;min-width:2px}.katex .msupsub{text-align:left}.katex .mspace{display:inline-block}.katex .sizing.reset-size3.size1{font-size:.71428571em}.katex .sizing.reset-size6.size3{font-size:.7em}.katex .svg-align{text-align:left}.katex svg{display:block;position:absolute;width:100%;height:inherit;fill:currentColor;stroke:currentColor;fill-rule:nonzero;fill-opacity:1;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1}.katex svg path{stroke:none}.katex .hide-tail{width:100%;position:relative;overflow:hidden}.katex-display{display:block;margin:1em 0;text-align:center}.katex-display>.katex{display:block;text-align:center;white-space:nowrap}.katex-display>.katex>.katex-html{display:block;position:relative} /*! TACHYONS v4.12.0 | http://tachyons.io */ /*! normalize.css v8.0.0 | MIT License | github.com/necolas/normalize.css */html{line-height:1.15;-webkit-text-size-adjust:100%}body{margin:0}h1{font-size:2em;margin:.67em 0}hr{box-sizing:content-box;height:0;overflow:visible}a{background-color:transparent}strong{font-weight:bolder}::-webkit-file-upload-button{-webkit-appearance:button;font:inherit}a,article,body,div,footer,h1,h2,html,li,main,nav,p,ul{box-sizing:border-box}.ba{border-style:solid;border-width:1px}.bb{border-bottom-style:solid;border-bottom-width:1px}.b--mid-gray{border-color:#555}.b--black-10{border-color:rgba(0,0,0,.1)}.cf:after,.cf:before{content:" ";display:table}.cf:after{clear:both}.cf{*zoom:1}.db{display:block}.dib{display:inline-block}.flex{display:flex}.flex-column{flex-direction:column}.justify-center{justify-content:center}.fl{float:left;_display:inline}.sans-serif{font-family:-apple-system,BlinkMacSystemFont,avenir next,avenir,helvetica neue,helvetica,ubuntu,roboto,noto,segoe ui,arial,sans-serif}.b{font-weight:700}.min-vh-100{min-height:100vh}.lh-title{line-height:1.25}.lh-copy{line-height:1.5}.link{text-decoration:none}.link,.link:active,.link:focus,.link:hover,.link:link,.link:visited{transition:color .15s ease-in}.link:focus{outline:1px dotted currentColor}.list{list-style-type:none}.mw7{max-width:48rem}.w-100{width:100%}.dark-gray{color:#333}.mid-gray{color:#555}.dark-blue{color:#00449e}.blue{color:#357edd}.bg-white{background-color:#fff}.hover-dark-blue:focus,.hover-dark-blue:hover{color:#00449e}.pa0{padding:0}.pa2{padding:.5rem}.pl0{padding-left:0}.pt3{padding-top:1rem}.ph2{padding-left:.5rem;padding-right:.5rem}.ma0{margin:0}.mr2{margin-right:.5rem}.mb2{margin-bottom:.5rem}.mb3{margin-bottom:1rem}.mt0{margin-top:0}.mv4{margin-top:2rem;margin-bottom:2rem}.underline{text-decoration:underline}.f3{font-size:1.5rem}.f4{font-size:1.25rem}.f5{font-size:1rem}.center{margin-right:auto;margin-left:auto}@media screen and (min-width:30em){.w-third-ns{width:33.33333%}.w-two-thirds-ns{width:66.66667%}.overflow-y-scroll-ns{overflow-y:scroll}.ph3-ns{padding-left:1rem;padding-right:1rem}}.underline{text-decoration-color:rgba(0,105,180,.5);text-decoration-thickness:.1em}.blue{color:#0069b4}.dark-blue,.hover-dark-blue:hover{color:#1f407a}</style></head><body><div id="main" class="mw7 center ph2 ph3-ns pt3 dark-gray bg-white sans-serif flex flex-column min-vh-100"><nav class="cf mb3 sans-serif lh-copy"><div class="fl w-third-ns w-100"><span class="dark-blue b">David Steurer</span></div><div class="fl w-two-thirds-ns w-100"><ul class="list dib pl0 ma0"><li class="dib mr2"><a href="/" class="link underline hover-dark-blue blue">home</a></li><li class="dib mr2"><a href="/papers/" class="link underline hover-dark-blue blue">papers</a></li><li class="dib mr2"><a href="/talks/" class="link underline hover-dark-blue blue">talks</a></li><li class="dib mr2"><a href="/cv/" class="link underline hover-dark-blue blue">c.v.</a></li><li class="dib mr2"><a href="/courses/" class="link underline hover-dark-blue blue">courses</a></li><li class="dib mr2"><a href="/contact/" class="link underline hover-dark-blue blue">contact</a></li></ul></div></nav><main style="flex:1"><article class="lh-copy"><h1 class="lh-title f3 dark-blue">Dictionary learning and tensor decomposition via the sum-of-squares method</h1><p class="mt0 mb2">with Boaz Barak, Jonathan Kelner. <strong>STOC 2015.</strong></p><p class="mt0 mb2"><a href="/paper/soslearning.pdf" class="link b underline hover-dark-blue blue">PDF</a></p><p class="mt0 mb2"></p><h2 class="f4 lh-title mid-gray">abstract</h2> <p class="mt0 mb2">We give a new approach to the dictionary learning (also known as “sparse coding”) problem of recovering an unknown <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>×</mo><mi>m</mi></mrow><annotation encoding="application/x-tex">n\times m</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.66666em;vertical-align:-0.08333em;"></span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal">m</span></span></span></span></span> matrix <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi></mrow><annotation encoding="application/x-tex">A</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathnormal">A</span></span></span></span></span> (for <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mo>≥</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">m \geq n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7719400000000001em;vertical-align:-0.13597em;"></span><span class="mord mathnormal">m</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">≥</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal">n</span></span></span></span></span>) from examples of the form <span><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mi>y</mi><mo>=</mo><mi>A</mi><mi>x</mi><mo>+</mo><mi>e</mi><mo separator="true">,</mo></mrow><annotation encoding="application/x-tex">y = Ax + e,</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.19444em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">y</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.76666em;vertical-align:-0.08333em;"></span><span class="mord mathnormal">A</span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.19444em;"></span><span class="mord mathnormal">e</span><span class="mpunct">,</span></span></span></span></span></span> where <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal">x</span></span></span></span></span> is a random vector in <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi mathvariant="double-struck">R</mi><mi>m</mi></msup></mrow><annotation encoding="application/x-tex">\mathbb R^m</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68889em;vertical-align:0em;"></span><span class="mord"><span class="mord mathbb">R</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.664392em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span></span></span></span></span></span></span></span> with at most <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>τ</mi><mi>m</mi></mrow><annotation encoding="application/x-tex">\tau m</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span><span class="mord mathnormal">m</span></span></span></span></span> nonzero coordinates, and <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>e</mi></mrow><annotation encoding="application/x-tex">e</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal">e</span></span></span></span></span> is a random noise vector in <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi mathvariant="double-struck">R</mi><mi>n</mi></msup></mrow><annotation encoding="application/x-tex">\mathbb R^n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68889em;vertical-align:0em;"></span><span class="mord"><span class="mord mathbb">R</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.664392em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span></span></span></span></span></span></span></span></span> with bounded magnitude. For the case <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mo>=</mo><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">m=O(n)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal">m</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></span>, our algorithm recovers every column of <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi></mrow><annotation encoding="application/x-tex">A</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathnormal">A</span></span></span></span></span> within arbitrarily good constant accuracy in time <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>m</mi><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>log</mi><mo>⁡</mo><mi>m</mi><mi mathvariant="normal">/</mi><mi>log</mi><mo>⁡</mo><mo stretchy="false">(</mo><msup><mi>τ</mi><mrow><mo>−</mo><mn>1</mn></mrow></msup><mo stretchy="false">)</mo><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">m^{O(\log m/\log(\tau^{-1}))}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9869199999999998em;vertical-align:0em;"></span><span class="mord"><span class="mord mathnormal">m</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.9869199999999998em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.02778em;">O</span><span class="mopen mtight">(</span><span class="mop mtight"><span class="mtight">l</span><span class="mtight">o</span><span class="mtight" style="margin-right:0.01389em;">g</span></span><span class="mspace mtight" style="margin-right:0.19516666666666668em;"></span><span class="mord mathnormal mtight">m</span><span class="mord mtight">/</span><span class="mspace mtight" style="margin-right:0.19516666666666668em;"></span><span class="mop mtight"><span class="mtight">l</span><span class="mtight">o</span><span class="mtight" style="margin-right:0.01389em;">g</span></span><span class="mopen mtight">(</span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1132em;">τ</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913142857142857em;"><span style="top:-2.931em;margin-right:0.07142857142857144em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mtight">−</span><span class="mord mtight">1</span></span></span></span></span></span></span></span></span><span class="mclose mtight">)</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span></span>, in particular achieving polynomial time if <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>τ</mi><mo>=</mo><msup><mi>m</mi><mrow><mo>−</mo><mi>δ</mi></mrow></msup></mrow><annotation encoding="application/x-tex">\tau = m^{-\delta}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.8491079999999999em;vertical-align:0em;"></span><span class="mord"><span class="mord mathnormal">m</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8491079999999999em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">−</span><span class="mord mathnormal mtight" style="margin-right:0.03785em;">δ</span></span></span></span></span></span></span></span></span></span></span></span></span> for any <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>δ</mi><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\delta&gt;0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.73354em;vertical-align:-0.0391em;"></span><span class="mord mathnormal" style="margin-right:0.03785em;">δ</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">&gt;</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord">0</span></span></span></span></span>, and time <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>m</mi><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>log</mi><mo>⁡</mo><mi>m</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">m^{O(\log m)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8879999999999999em;vertical-align:0em;"></span><span class="mord"><span class="mord mathnormal">m</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8879999999999999em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.02778em;">O</span><span class="mopen mtight">(</span><span class="mop mtight"><span class="mtight">l</span><span class="mtight">o</span><span class="mtight" style="margin-right:0.01389em;">g</span></span><span class="mspace mtight" style="margin-right:0.19516666666666668em;"></span><span class="mord mathnormal mtight">m</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span></span> if <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>τ</mi></mrow><annotation encoding="application/x-tex">\tau</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span></span></span></span></span> is (a sufficiently small) constant. Prior algorithms with comparable assumptions on the distribution required the vector <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal">x</span></span></span></span></span> to be much sparser—at most <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msqrt><mi>n</mi></msqrt></mrow><annotation encoding="application/x-tex">\sqrt{n}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.04em;vertical-align:-0.23972em;"></span><span class="mord sqrt"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8002800000000001em;"><span class="svg-align" style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord" style="padding-left:0.833em;"><span class="mord mathnormal">n</span></span></span><span style="top:-2.76028em;"><span class="pstrut" style="height:3em;"></span><span class="hide-tail" style="min-width:0.853em;height:1.08em;"><svg width='400em' height='1.08em' viewBox='0 0 400000 1080' preserveAspectRatio='xMinYMin slice'><path d='M95,702 c-2.7,0,-7.17,-2.7,-13.5,-8c-5.8,-5.3,-9.5,-10,-9.5,-14 c0,-2,0.3,-3.3,1,-4c1.3,-2.7,23.83,-20.7,67.5,-54 c44.2,-33.3,65.8,-50.3,66.5,-51c1.3,-1.3,3,-2,5,-2c4.7,0,8.7,3.3,12,10 s173,378,173,378c0.7,0,35.3,-71,104,-213c68.7,-142,137.5,-285,206.5,-429 c69,-144,104.5,-217.7,106.5,-221 l0 -0 c5.3,-9.3,12,-14,20,-14 H400000v40H845.2724 s-225.272,467,-225.272,467s-235,486,-235,486c-2.7,4.7,-9,7,-19,7 c-6,0,-10,-1,-12,-3s-194,-422,-194,-422s-65,47,-65,47z M834 80h400000v40h-400000z'/></svg></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.23972em;"><span></span></span></span></span></span></span></span></span></span> nonzero coordinates—and there were intrinsic barriers preventing these algorithms from applying for denser <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal">x</span></span></span></span></span>.</p> <p class="mt0 mb2">We achieve this by designing an algorithm for <em>noisy tensor decomposition</em> that can recover, under quite general conditions, an approximate rank-one decomposition of a tensor <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi></mrow><annotation encoding="application/x-tex">T</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.13889em;">T</span></span></span></span></span>, given access to a tensor <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>T</mi><mo mathvariant="normal" lspace="0em" rspace="0em">′</mo></msup></mrow><annotation encoding="application/x-tex">T&#x27;</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.751892em;vertical-align:0em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.13889em;">T</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.751892em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span></span></span></span></span></span></span></span></span> that is <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>τ</mi></mrow><annotation encoding="application/x-tex">\tau</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span></span></span></span></span>-close to <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi></mrow><annotation encoding="application/x-tex">T</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.13889em;">T</span></span></span></span></span> in the spectral norm (when considered as a matrix). To our knowledge, this is the first algorithm for tensor decomposition that works in the constant spectral-norm noise regime, where there is no guarantee that the local optima of <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi></mrow><annotation encoding="application/x-tex">T</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.13889em;">T</span></span></span></span></span> and <span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>T</mi><mo mathvariant="normal" lspace="0em" rspace="0em">′</mo></msup></mrow><annotation encoding="application/x-tex">T&#x27;</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.751892em;vertical-align:0em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.13889em;">T</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.751892em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span></span></span></span></span></span></span></span></span> have similar structures.</p> <p class="mt0 mb2">Our algorithm is based on a novel approach to using and analyzing the <em>Sum of Squares</em> semidefinite programming hierarchy (Parrilo 2000, Lasserre 2001), and it can be viewed as an indication of the utility of this very general and powerful tool for unsupervised learning problems.</p> <h2 class="f4 lh-title mid-gray">keywords</h2><ul class="list pa0"><li class="dib mr2 mb2 ba b--mid-gray pa2">sum-of-squares method</li><li class="dib mr2 mb2 ba b--mid-gray pa2">tensor computations</li><li class="dib mr2 mb2 ba b--mid-gray pa2">semidefinite programming</li><li class="dib mr2 mb2 ba b--mid-gray pa2">machine learning</li></ul></article></main><footer><hr class="mv4 bb b--black-10"/><ul class="list db flex justify-center pl0 ma0 lh-copy sans-serif mv4 f5"><li class="dib mr2"><a href="/" class="link underline hover-dark-blue blue">home</a></li><li class="dib mr2"><a href="/papers/" class="link underline hover-dark-blue blue">papers</a></li><li class="dib mr2"><a href="/talks/" class="link underline hover-dark-blue blue">talks</a></li><li class="dib mr2"><a href="/cv/" class="link underline hover-dark-blue blue">c.v.</a></li><li class="dib mr2"><a href="/courses/" class="link underline hover-dark-blue blue">courses</a></li><li class="dib mr2"><a href="/contact/" class="link underline hover-dark-blue blue">contact</a></li><li class="dib mr2"><a href="/events/" class="link underline hover-dark-blue blue">events</a></li></ul></footer></div></body></html>

Pages: 1 2 3 4 5 6 7 8 9 10