Publications of Shai Shalev-Shwartz


Books

"Understanding Machine Learning: From Theory to Algorithms". Shai Shalev-Shwartz and Shai Ben-David. Cambridge university press. 2014.
[ Book's website, Cambridge, Amazon].
"Online Learning and Online Convex Optimization". Shai Shalev-Shwartz. Foundations and Trends in Machine Learning, Volume 4, Issue 2, DOI: 10.1561/2200000018. [Paper: pdf ]

Dissertations

"Online Learning: Theory, Algorithms, and Applications" Shai Shalev-Shwartz, The Hebrew University of Jerusalem. PH.d. thesis. July 2007. [Paper (corrected): pdf ]
(I'd like to thank Francesco Orabona for pointing out important corrections to Figures 5.2 and 5.4 and Matus Telgarsky for pointing out a typo in the definition of convex functions.)
"Robust Temporal and Spectral Modeling for Query by Melody" Shai Shalev-Shwartz, The Hebrew University of Jerusalem. M.Sc. thesis. Jerusalem 2002 [Paper: pdf ]

Preprints, Reports, Blogs, etc.

AproxiPong: Understanding the Merits and Pitfalls of Reinforcement Learning Algorithms when combined with Deep Learning. Source code is available as well.
On a Formal Model of Safe and Scalable Self-driving Cars. Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua.
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data. Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz.
Weight Sharing is Crucial to Succesful Optimization. Shai Shalev-Shwartz, Ohad Shamir, Shaked Shammah.
On the Sample Complexity of End-to-end Training vs. Semantic Abstraction Training. Shai Shalev-Shwartz and Amnon Shashua.
Distribution Free Learning with Local Queries. Galit Bary-Weisberg, Amit Daniely, Shai Shalev-Shwartz.
Long term planning by short term prediction. Shai Shalev-Shwartz, Nir Ben-Zrihem, Aviad Cohen, Amnon Shashua.
Faster Low-rank Approximation using Adaptive Gap-based Preconditioning. Alon Gonen, Shai Shalev-Shwartz.
Faster SGD Using Sketched Conditioning. Alon Gonen, Shai Shalev-Shwartz.
SelfieBoost: A Boosting Algorithm for Deep Learning. Shai Shalev-Shwartz.

Journal Papers

Average Stability is Invariant to Data Preconditioning. Implications to Exp-concave Empirical Risk Minimization. Alon Gonen, Shai Shalev-Shwartz. Accepted to JMLR. pdf
Near-Optimal Algorithms for Online Matrix Prediction. Elad Hazan, Satyen Kale, Shai Shalev-Shwartz. SIAM J. Comput., 46(2), 744--773, 2017. This is a long version of our COLT 2012 paper.
On Lower and Upper Bounds for Smooth and Strongly Convex Optimization Problems. Yossi Arjevani, Shai Shalev-Shwartz, Ohad Shamir. Journal of Machine Learning Research (JMLR), 17(126):1-51, 2016.
Subspace Learning with Partial Information. Alon Gonen, Dan Rosenbaum, Yonina Eldar, Shai Shalev-Shwartz. Journal of Machine Learning Research (JMLR), 17(52):1-21, 2016.
Multiclass Learnability and the ERM Principle. Amit Daniely, Sivan Sabato, Shai Ben-David, Shai Shalev-Shwartz. Journal of Machine Learning Research, 16(Dec):2377-2404, 2015.
Learning Sparse Low-Threshold Linear Classifiers. Sivan Sabato, Shai Shalev-Shwartz, Nathan Srebro, Daniel Hsu, Tong Zhang. Journal of Machine Learning Research, 16(Jul):1275-1304, 2015.
Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization. Shai Shalev-Shwartz and Tong Zhang. Mathematical Programming, Series A and B (to appear). [pdf on arxiv]
Matrix Completion with the Trace Norm: Learning, Bounding, and Transducing. Ohad Shamir and Shai Shalev-Shwartz. Journal of Machine Learning Research, 15(Oct):3401-3423, 2014.
Efficient Active Learning of Halfspaces: An Aggressive Approach. Alon Gonen, Sivan Sabato, Shai Shalev-Shwartz. Journal of Machine Learning Research, 14(Sep):2583-2615, 2013.
Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization. Shai Shalev-Shwartz and Tong Zhang. Journal of Machine Learning Research, 14(Feb):567-599, 2013.
"Regularization Techniques for Learning with Matrices" Sham Kakade, Shai Shalev-Shwartz, Ambuj Tewari. JMLR 13(Jun):1865-1890, 2012. [Journal paper, Technical Report, Slides of a related talk]
"Online Learning of Noisy Data" Nicolo Cesa-Bianchi, Shai Shalev-Shwartz and Ohad Shamir. To appear in IEEE Transactions on Information Theory, 2011. [Paper: pdf ]
"Efficient Learning with Partially Observed Attributes. Nicolo Cesa-Bianchi, Shai Shalev-Shwartz and Ohad Shamir. JMLR 12(Oct):2857-2878, 2011. [Paper: pdf ]
"Learning Kernel Based Halfspaces with the 0-1 Loss" Shai Shalev-Shwartz, Karthik Sridharan and Ohad Shamir. SIAM Journal on Computing, 2011. DOI: 10.1137/100806126. [Paper: pdf ]
"Stochastic Methods for l1-regularized Loss Minimization" Shai Shalev-Shwartz and Ambuj Tewari. Journal of Machine Learning Research, 12(Jun):1865-1892, 2011. [Paper: pdf ]
"Learnability, Stability and Uniform Convergence" Shai Shalev-Shwartz, Ohad Shamir, Nathan Srebro and Karthik Sridharan. Journal of Machine Learning Research, 11(Oct):2635-2670, 2010. [Paper: pdf ]
"Trading Accuracy for Sparsity in Optimization Problems with Sparsity Constraints" Shai Shalev-Shwartz, Tong Zhang, Nati Srebro, Siam Journal on Optimization. Volume 20, Issue 6, pp. 2807-2832 (2010). DOI 10.1137/090759574. [Paper: pdf ]
"On the Equivalence of Weak Learnability and Linear Separability: New Relaxations and Efficient Boosting Algorithms" Shai Shalev-Shwartz and Yoram Singer, Machine Learning Journal, Volume 80, Issue 2, Pages 141 - 163 (2010). DOI 10.1007/s10994-010-5173-z. [Paper: pdf ](Errata)
"Pegasos: Primal Estimated sub-GrAdient SOlver for SVM" Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, Andrew Cotter." Mathematical Programming, Series B, 127(1):3-30, 2011. [Paper: pdf ]
"Individual Sequence Prediction using Memory-efficient Context Trees" Ofer Dekel, Shai Shalev-Shwartz and Yoram Singer, IEEE Transactions on Information Theory. Volume 55, Issue 11, Pages 5251-5262, 2009. [Paper: pdf ]
"Ranking Categorical Features Using Generalization Properties" Sivan Sabato and Shai Shalev-Shwartz, Journal of Machine Learning Research, 2008. [Paper: pdf ]
"Online Learning of Complex Prediction Problems Using Simultaneous Projections" Yonatan Amit, Shai Shalev-Shwartz and Yoram Siner, Journal of Machine Learning Research, 2008. [Paper: pdf ]
"The Forgetron: A Kernel-Based Perceptron on a Budget" Ofer Dekel, Shai Shalev-Shwartz and Yoram Singer, SIAM Journal of COMPUTING, Vol. 37, Issue 5, Pages 1342-1372, 2007. [Paper: pdf ]
"A Primal-Dual Perspective of Online Learning Algorithms" Shai Shalev-Shwartz and Yoram Singer, Machine Learning Journal, 69:2/3, pages 115 - 142, 2007. [Paper: pdf ]
"A Large Margin Algorithm for Speech-to-Phoneme and Music-to-Score Alignment" Joseph Keshet, Shai Shalev-Shwartz, Yoram Singer and Dan Chazan. IEEE Trans. on Audio, Speech and Language Processing. [Paper: pdf ]
"Efficient Learning of Label Ranking by Soft Projections onto Polyhedra" Shai Shalev-Shwartz and Yoram Singer, Journal of Machine Learning Research 7 (July), pages 1567-1599, 2006. [Paper: pdf ]
"Online Passive-Aggressive Algorithms" Koby Crammer, Ofer Dekel, Joseph Keshet, Shai Shalev-Shwartz and Yoram Singer, Journal of Machine Learning Research 7, pages 551-585, 2006. [Paper: pdf ]
"Smooth Epsilon-Insensitive Regression by Loss Symmetrization" Ofer Dekel, Shai Shalev-Shwartz and Yoram Singer, Journal of Machine Learning Research (JMLR), 6(May):711--741, 2005 [Paper: pdf ]

Conference Papers

Decoupling "when to update" from "how to update". Eran Malach, Shai Shalev-Shwartz. NIPS, 2017.
Failures of Gradient-Based Deep Learning. Shai Shalev-Shwartz, Ohad Shamir, Shaked Shammah. ICML, 2017.
Fast Rates for Empirical Risk Minimization of Strict Saddle Problems. Alon Gonen and Shai Shalev-Shwartz. COLT, 2017.
Effective Semisupervised Learning on Manifolds. Amir Globerson, Roi Livni, Shai Shalev-Shwartz. COLT, 2017.
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving. Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua. Learning, Inference and Control of Multi-Agent Systems Workshop, NIPS, 2016. A related talk can be found here.
Learning a Metric Embedding for Face Recognition using the Multibatch Method. Oren Tadmor, Yonatan Wexler, Tal Rosenwein, Shai Shalev-Shwartz, Amnon Shashua. NIPS, 2016.
Complexity theoretic limitations on learning DNF's. Amit Daniely and Shai Shalev-Shwartz. COLT, 2016.
Minimizing the Maximal Loss: How and Why? Shai Shalev-Shwartz and Yonatan Wexler. ICML, 2016. Talk slides.
SDCA without Duality, Regularization, and Individual Convexity. Shai Shalev-Shwartz. ICML, 2016. A previous version of this paper is: SDCA without Duality. Talk slides.
Solving Ridge Regression using Sketched Preconditioned SVRG. Alon Gonen, Francesco Orabona, Shai Shalev-Shwartz. ICML, 2016.
On Graduated Optimization for Stochastic Non-Convex Problems. Elad Hazan, Kfir Y. Levy, Shai Shalev-Shwartz. ICML, 2016.
Beyond Convexity: Stochastic Quasi-Convex Optimization. Elad Hazan, Kfir Y. Levy, Shai Shalev-Shwartz. NIPS 2015. [arxiv]
Strongly Adaptive Online Learning. Amit Daniely, Alon Gonen, Shai Shalev-Shwartz. ICML 2015. [arxiv]
On the Computational Efficiency of Training Neural Networks. Roi Livni, Shai Shalev-Shwartz, Ohad Shamir. NIPS 2014. [arxiv]
The complexity of learning halfspaces using generalized linear methods. Amit Daniely, Nati Linial, Shai Shalev-Shwartz. COLT 2014. Received best student paper award. [pdf, arxiv]
Optimal Learners for Multiclass Problems. Amit Daniely and Shai Shalev-Shwartz. COLT 2014. [pdf]
From average case complexity to improper learning complexity. Amit Daniely, Nati Linial, Shai Shalev-Shwartz. STOC 2014 [on arxiv. See also a new preprint strengthening the result to rely on a very natural assumption on the complexity of refuting random K-SAT formulas.]
K-means recovers ICA filters when independent components are sparse. Alon Vinnikov and Shai Shalev-Shwartz. ICML 2014 [pdf]
Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization. Shai Shalev-Shwartz and Tong Zhang. ICML 2014. [proceedings, arxiv]
More data speeds up training time in learning halfspaces over sparse vectors. Amit Daniely, Nati Linial, Shai Shalev-Shwartz. NIPS 2013. [Paper pdf]
Accelerated Mini-Batch Stochastic Dual Coordinate Ascent. Shai Shalev-Shwartz and Tong Zhang. NIPS 2013. [nips version, long version]
"Efficient Active Learning of Halfspaces: an Aggressive Approach" Alon Gonen, Sivan Sabato, Shai Shalev-Shwartz. ICML 2013. [Paper pdf. Also on arxiv]
"Learning Optimally Sparse Support Vector Machines" Andrew Cotter, Nati Srebro, Shai Shalev-Shwartz. ICML 2013. [Paper pdf] Source code is available here.
"Vanishing Component Analysis" Roi Livni, David Lehavi, Sagi Schein, Hila Nachlieli, Shai Shalev-Shwartz, Amir Globerson. ICML 2013. Received best paper award. [Paper pdf]
"Multiclass Learning Approaches: A Theoretical Comparison with Implications" Amit Daniely, Sivan Sabato, Shai Shalev-Shwartz. NIPS 2012. [Paper pdf]
"Learning Halfspaces with the Zero-One Loss: Time-Accuracy Tradeoffs" Shai Shalev-Shwartz and Aharon Birnbaum. NIPS 2012. [Paper pdf]
"Near-Optimal Algorithms for Online Matrix Prediction. Elad Hazan, Satyen Kale, Shai Shalev-Shwartz. COLT 2012 [Paper pdf]
"The Kernelized Stochastic Batch Perceptron". Andrew Cotter, Shai Shalev-Shwartz, Nathan Srebro. ICML 2012 [Paper conference version, arxiv]
"Learning the Experts for Online Sequence Prediction". Elad Eban, Amir Globerson, Shai Shalev-Shwartz, Aharon Birnbaum. ICML 2012 [Paper pdf]
"Using More Data to Speed-up Training Time" Shai Shalev-Shwartz, Ohad Shamir, Eran Tromer. AISTATS 2012. [Paper pdf]
"ShareBoost: Efficient multiclass learning with feature sharing" Shai Shalev-Shwartz, Yonatan Wexler, Amnon Shashua. NIPS 2012. [Paper pdf (full version)]
For a possible application of ShareBoost, see how Orcam's glasses are helping the blind to 'see'. See also the NY Times article.
"Large-Scale Convex Minimization with a Low-Rank Constraint" Shai Shalev-Shwartz, Alon Gonen, and Ohad Shamir. ICML, 2011. [Paper pdf (full version)] [Source code and information on how to reproduce the experiments is available here.]
"Access to Unlabeled Data can Speed up Prediction Time" Ruth Urner, Shai Ben-David, Shai Shalev-Shwartz. ICML, 2011. [Paper pdf]
"Multiclass Learnability and the ERM principle" Amit Daniely, Sivan Sabato, Shai Ben-David, Shai Shalev-Shwartz. COLT, 2011. Received best student paper award. [Paper pdf]
"Collaborative Filtering with the Trace Norm: Learning, Bounding, and Transducing" Ohad Shamir and Shai Shalev-Shwartz. COLT, 2011. [Paper pdf]
"Learning Linear and Kernel Predictors with the 0-1 Loss Function" Shai Shalev-Shwartz, Ohad Shamir, Karthik Sridharan. IJCAI (Best paper track), 2011. [Paper pdf]
"Quantity Makes Quality: Learning with Information Constraints" Nicolo Cesa-Bianchi, Shai Shalev-Shwartz and Ohad Shamir. AAAI (Nectar track), 2011. [Paper pdf]
"Learning from Noisy Data under Distributional Assumptions" Nicolo Cesa-Bianchi, Shai Shalev-Shwartz and Ohad Shamir. Robust Statistical Learning Workshop, NIPS 2010 [Paper, Full version ]
"Learning Kernel-Based Halfspaces with the Zero-One Loss" Shai Shalev-Shwartz, Ohad Shamir, Karthik Sridharan. COLT, 2010. Received best paper award. [Paper pdf]
"Efficient Learning with Partially Observed Attributes" Nicolo Cesa-Bianchi, Shai Shalev-Shwartz, Ohad Shamir. ICML, 2010. [Paper with all proofs]
"Some Impossibility Results for Budgeted Learning" Nicolo Cesa-Bianchi, Shai Shalev-Shwartz, Ohad Shamir. Budgeted Learning Workshop, ICML-COLT 2010. [Paper]
"Online Learning of Noisy Data with Kernels" Nicolo Cesa-Bianchi, Shai Shalev-Shwartz, Ohad Shamir. COLT, 2010. [Paper pdf]
"Composite Objective Mirror Descent" John Duchi, Shai Shalev-Shwartz, Yoram Singer, Ambuj Tewari. COLT, 2010.[Paper]
"Stochastic Methods for $\ell_1$ Regularized Loss Minimization" Shai Shalev-Shwartz and Ambuj Tewari. ICML, 2009. [Paper pdf] Errata: Section 3.1 contains errors. Please refer to the long version of the paper [JMLR Paper pdf]
"Stochastic Convex Optimization" Shai Shalev-Shwartz, Ohad Shamir, Karthik Sridharan and Nati Srebro COLT, 2009. [Paper pdf]
"Learnability and Stability in the General Learning Setting" Shai Shalev-Shwartz, Ohad Shamir, Karthik Sridharan and Nati Srebro COLT, 2009. [Paper pdf]
"Agnostic Online Learning" Shai Ben-David, David Pal and Shai Shalev-Shwartz COLT, 2009. [Paper pdf]
"Mind the duality gap: Logarithmic regret algorithms for online optimization" Sham Kakade and Shai Shalev-Shwartz. NIPS, 2008. [Paper pdf]
"Fast Rates for Regularized Objectives" Karthik Sridharan, Shai Shalev-Shwartz, Nathan Srebro. NIPS, 2008. [Paper pdf]
"SVM Optimization: Inverse Dependence on Training Set Size" Shai Shalev-Shwartz and Nathan Srebro. ICML 2008. Received best paper award [Paper (corrected): pdf ],[Errata]
"On the Equivalence of Weak Learnability and Linear Separability: New Relaxations and Efficient Boosting Algorithms" Shai Shalev-Shwartz and Yoram Singer. COLT 2008. [Paper: pdf ][Talk slides]
"Efficient Bandit Algorithms for Online Multiclass Prediction" Ambuj Tewari, Shai Shalev-Shwartz and Sham Kakade. ICML 2008. [Paper: pdf ][Talk slides]
"Efficient Projections onto the $\ell_1$-Ball for Learning in High Dimensions" John Duchi, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra. ICML 2008. [Paper: pdf ]
"Pegasos: Primal Estimated sub-GrAdient SOlver for SVM" Shai Shalev-Shwartz, Yoram Singer, and Nathan Srebro. ICML 2007. [Paper: pdf ] [Talk Slides: ppt ] A source code is available here.
A technical report with a generalized logarithmic regret and detailed proofs:
"Logarithmic Regret Algorithms for Strongly Convex Repeated Games" ,Technical Report [2007-42], The Hebrew University, May 2007. [Paper: pdf]
"Prediction by Categorical Features: Generalization Properties and Application to Feature Ranking" Sivan Sabato and Shai Shalev-Shwartz, COLT 2007. [Paper: pdf ]
A version with all the proofs: [Paper: pdf]
"A Unified Algorithmic Approach for Efficient Online Label Ranking" Shai Shalev-Shwartz and Yoram Singer, AISTAT 2007. [Paper: pdf ]
"Convex Repeated Games and Fenchel Duality" Shai Shalev-Shwartz and Yoram Singer, NIPS 2006. [Paper: pdf ]
A version with all the proofs: [Paper: pdf]
"Online Classification for Complex Problems Using Simultaneous Projections" Yonatan Amit, Shai Shalev-Shwartz, and Yoram Singer, NIPS 2006. [Paper: pdf ]
"Online Learning meets Optimization in the Dual" Shai Shalev-Shwartz and Yoram Singer, COLT 2006. [Paper: pdf ]
A version with all the proofs: Technical Report [2006-2], Leibniz Center, 2006. [Paper: pdf]
"Online Multiclass Learning by Interclass Hypothesis Sharing" Michael Fink, Shai Shalev-Shwartz, Yoram Singer and Shimon Ullman. ICML 2006. [Paper: pdf ]
"The Forgetron: A Kernel-Based Perceptron on a Fixed Budget." Ofer Dekel, Shai Shalev-Shwartz and Yoram Singer, Advances in Neural Information Processing Systems 17, MIT Press, 2005. Received "Outstanding student paper award". [Paper: pdf ] [A journal version with proofs: pdf ] [Talk Slides: ppt ] [Poster: ppt ]
"Phoneme Alignment Based on Discriminative Learning" Joseph Keshet, Shai Shalev-Shwartz, Yoram Singer, and Dan Chazan. Interspeech 2005 [Paper: pdf ]
"A New Perspective on an Old Perceptron Algorithm" Shai Shalev-Shwartz and Yoram Singer, Proceedings of the Sixteenth Annual Conference on Computational Learning Theory, 2005 [Paper: pdf ] [Errata (thanks to Francesco Orabona for pointing out a mistake in the paper)]
"The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees" Ofer Dekel, Shai Shalev-Shwartz and Yoram Singer, Advances in Neural Information Processing Systems 17, MIT Press, 2004. [Paper: pdf ] [Talk Slides: ppt ]
"Learning to Align Polyphonic Music" Shai Shalev-Shwartz, Joseph Keshet and Yoram Singer. ISMIR 2004 Webpage for the paper [Paper: pdf ] [Long version: pdf ] [Talk Slides: ppt ]
"Online and Batch Learning of Pseudo-Metrics" Shai Shalev-Shwartz, Yoram Singer and Andrew Y. Ng. ICML 2004 [Paper: pdf ] [Talk Slides: ppt ]
"Online Passive-Aggressive Algorithms" Koby Crammer, Ofer Dekel, Shai Shalev-Shwartz and Yoram Singer, Advances in Neural Information Processing Systems 16, MIT Press, 2003. [Paper: pdf ] [Talk Slides: ppt ]
"Smooth Epsilon-Insensitive Regression by Loss Symmetrization" Ofer Dekel, Shai Shalev-Shwartz and Yoram Singer, Proceedings of the Sixteenth Annual Conference on Computational Learning Theory, pages 433-447,Springer LNAI 2777, 2003 [Paper: pdf ] [Talk slides: ppt ]
"Robust Temporal and Spectral Modeling for Query by Melody" Shai Shalev-Shwartz, Shlomo Dubnov, Nir Friedman and Yoram Singer, Proceedings of the 25rd Conference on Research and Development in Information Retrieval (SIGIR), 2002. [Paper: pdf ] [Talk slides: ppt mp3 files (tar) ]

Other publications (Tech Reports, Workshops, Demonstrations, ...)

An Algorithm for Training Polynomial Networks. Roi Livni, Shai Shalev-Shwartz and Ohad Shamir.
Proximal Stochastic Dual Coordinate Ascent. Shai Shalev-Shwartz and Tong Zhang.
"Trading Accuracy for Sparsity" Shai Shalev-Shwartz, Nathan Srebro, Tong Zhang. Technical Report TTIC-TR-2009-3, May 2009. [pdf]
"Agnostic Online Learnability" Shai Shalev-Shwartz. Technical Report TTIC-TR-2008-2, October 2008. [Report] A much improved version [appears in COLT], together with Shai Ben-David and David Pal.
"Low \ell_1 Norm and Guarantees on Sparsifiability" Shai Shalev-Shwartz and Nathan Srebro. Sparse Optimization and Variable Selection, Workshop, ICML/COLT/UAI, July, 2008. [Extended abstract, Report] [Talk slides]
"Iterative Loss Minimization with $\ell_1$-Norm Constraint and Guarantees on Sparsity" Shai Shalev-Shwartz and Nathan Srebro. Technical Report, TTI, 2008. [Report]
"A Demonstration of a Query by Melody system" Shai Shalev-Shwartz and Yoram Singer, Presented in NIPS, 2003. [Movie file: mp4 ]