TY - JOUR
TI - Bayesian machine learning via category theory
AU - Culbertson, Jared
AU - Sturtz, Kirk
T2 - arXiv:1312.1445 [math]
AB - From the Bayesian perspective, the category of conditional probabilities (a variant of the Kleisli category of the Giry monad, whose objects are measurable spaces and arrows are Markov kernels) gives a nice framework for conceptualization and analysis of many aspects of machine learning. Using categorical methods, we construct models for parametric and nonparametric Bayesian reasoning on function spaces, thus providing a basis for the supervised learning problem. In particular, stochastic processes are arrows to these function spaces which serve as prior probabilities. The resulting inference maps can often be analytically constructed in this symmetric monoidal weakly closed category. We also show how to view general stochastic processes using functor categories and demonstrate the Kalman filter as an archetype for the hidden Markov model.
DA - 2013/12/05/
PY - 2013
DP - arXiv.org
UR - http://arxiv.org/abs/1312.1445
Y2 - 2019/11/22/17:32:35
KW - Bayesianism
KW - Categorical ML
KW - Categorical probability theory
KW - Purely theoretical
ER -
TY - JOUR
TI - Backprop as Functor: A compositional perspective on supervised learning
AU - Fong, Brendan
AU - Spivak, David I.
AU - Tuyéras, Rémy
T2 - arXiv:1711.10455 [cs, math]
AB - A supervised learning algorithm searches over a set of functions $A \to B$ parametrised by a space $P$ to find the best approximation to some ideal function $f\colon A \to B$. It does this by taking examples $(a,f(a)) \in A\times B$, and updating the parameter according to some rule. We define a category where these update rules may be composed, and show that gradient descent---with respect to a fixed step size and an error function satisfying a certain property---defines a monoidal functor from a category of parametrised functions to this category of update rules. This provides a structural perspective on backpropagation, as well as a broad generalisation of neural networks.
DA - 2019/05/01/
PY - 2019
DP - arXiv.org
ST - Backprop as Functor
UR - http://arxiv.org/abs/1711.10455
Y2 - 2019/11/23/14:42:07
KW - Categorical ML
KW - Machine learning
KW - Purely theoretical
ER -
TY - CONF
TI - Algebraic classifiers: a generic approach to fast cross-validation, online training, and parallel training
AU - Izbicki, Michael
AB - We use abstract algebra to derive new algorithms for fast cross-validation, online learning, and parallel learning. To use these algorithms on a classification model, we must show that the model has appropriate algebraic structure. It is easy to give algebraic structure to some models, and we do this explicitly for Bayesian classifiers and a novel variation of decision stumps called HomStumps. But not all classifiers have an obvious structure, so we introduce the Free HomTrainer. This can be used to give a "generic" algebraic structure to any classifier. We use the Free HomTrainer to give algebraic structure to bagging and boosting. In so doing, we derive novel online and parallel algorithms, and present the first fast cross-validation schemes for these classifiers.
C3 - ICML
DA - 2013///
PY - 2013
DP - Semantic Scholar
ST - Algebraic classifiers
KW - Algebra
KW - Categorical ML
KW - Machine learning
ER -
TY - JOUR
TI - Categorical Aspects of Parameter Learning
AU - Jacobs, Bart
T2 - arXiv:1810.05814 [cs]
AB - Parameter learning is the technique for obtaining the probabilistic parameters in conditional probability tables in Bayesian networks from tables with (observed) data --- where it is assumed that the underlying graphical structure is known. There are basically two ways of doing so, referred to as maximal likelihood estimation (MLE) and as Bayesian learning. This paper provides a categorical analysis of these two techniques and describes them in terms of basic properties of the multiset monad M, the distribution monad D and the Giry monad G. In essence, learning is about the reltionships between multisets (used for counting) on the one hand and probability distributions on the other. These relationsips will be described as suitable natural transformations.
DA - 2018/10/13/
PY - 2018
DP - arXiv.org
UR - http://arxiv.org/abs/1810.05814
Y2 - 2019/11/21/20:38:28
KW - Bayesianism
KW - Categorical ML
KW - Categorical probability theory
KW - Machine learning
ER -
TY - JOUR
TI - Neural Nets via Forward State Transformation and Backward Loss Transformation
AU - Jacobs, Bart
AU - Sprunger, David
T2 - arXiv:1803.09356 [cs]
AB - This article studies (multilayer perceptron) neural networks with an emphasis on the transformations involved --- both forward and backward --- in order to develop a semantical/logical perspective that is in line with standard program semantics. The common two-pass neural network training algorithms make this viewpoint particularly fitting. In the forward direction, neural networks act as state transformers. In the reverse direction, however, neural networks change losses of outputs to losses of inputs, thereby acting like a (real-valued) predicate transformer. In this way, backpropagation is functorial by construction, as shown earlier in recent other work. We illustrate this perspective by training a simple instance of a neural network.
DA - 2018/03/25/
PY - 2018
DP - arXiv.org
UR - http://arxiv.org/abs/1803.09356
Y2 - 2019/11/21/20:40:18
KW - Categorical ML
KW - Effectus theory
KW - Machine learning
ER -
TY - JOUR
TI - A Predicate/State Transformer Semantics for Bayesian Learning
AU - Jacobs, Bart
AU - Zanasi, Fabio
T2 - Electronic Notes in Theoretical Computer Science
T3 - The Thirty-second Conference on the Mathematical Foundations of Programming Semantics (MFPS XXXII)
AB - This paper establishes a link between Bayesian inference (learning) and predicate and state transformer operations from programming semantics and logic. Specifically, a very general definition of backward inference is given via first applying a predicate transformer and then conditioning. Analogously, forward inference involves first conditioning and then applying a state transformer. These definitions are illustrated in many examples in discrete and continuous probability theory and also in quantum theory.
DA - 2016/10/05/
PY - 2016
DO - 10/ggdgbb
DP - ScienceDirect
VL - 325
SP - 185
EP - 200
J2 - Electronic Notes in Theoretical Computer Science
LA - en
SN - 1571-0661
UR - http://www.sciencedirect.com/science/article/pii/S1571066116300883
Y2 - 2019/11/24/12:04:12
KW - Bayesianism
KW - Categorical ML
KW - Categorical probability theory
KW - Effectus theory
KW - Programming language theory
KW - Semantics
ER -
TY - JOUR
TI - What is a statistical model?
AU - McCullagh, Peter
T2 - The Annals of Statistics
DA - 2002/10//
PY - 2002
DO - 10/bkts3m
DP - Crossref
VL - 30
IS - 5
SP - 1225
EP - 1310
LA - en
UR - http://projecteuclid.org/euclid.aos/1035844977
Y2 - 2019/11/22/17:39:10
KW - Bayesianism
KW - Categorical ML
KW - Categorical probability theory
KW - Compendium
KW - Purely theoretical
KW - Statistical learning theory
ER -
TY - COMP
TI - dmurfet/deeplinearlogic
AU - Murfet, Daniel
AB - Deep learning and linear logic. Contribute to dmurfet/deeplinearlogic development by creating an account on GitHub.
DA - 2018/07/14/T01:08:44Z
PY - 2018
DP - GitHub
LA - Jupyter Notebook
UR - https://github.com/dmurfet/deeplinearlogic
Y2 - 2019/11/22/16:44:43
KW - Categorical ML
KW - Implementation
KW - Linear logic
KW - Machine learning
KW - Semantics
ER -
TY - COMP
TI - dmurfet/polysemantics
AU - Murfet, Daniel
AB - Polynomial semantics of linear logic. Contribute to dmurfet/polysemantics development by creating an account on GitHub.
DA - 2018/04/29/T20:41:43Z
PY - 2018
DP - GitHub
LA - Python
UR - https://github.com/dmurfet/polysemantics
Y2 - 2019/11/22/16:45:35
KW - Categorical ML
KW - Implementation
KW - Linear logic
KW - Machine learning
KW - Semantics
ER -
TY - JOUR
TI - Derivatives of Turing machines in Linear Logic
AU - Murfet, Daniel
AU - Clift, James
T2 - arXiv:1805.11813 [math]
AB - We calculate denotations under the Sweedler semantics of the Ehrhard-Regnier derivatives of various encodings of Turing machines into linear logic. We show that these derivatives calculate the rate of change of probabilities naturally arising in the Sweedler semantics of linear logic proofs. The resulting theory is applied to the problem of synthesising Turing machines by gradient descent.
DA - 2019/01/28/
PY - 2019
DP - arXiv.org
UR - http://arxiv.org/abs/1805.11813
Y2 - 2019/11/21/20:33:27
KW - Abstract machines
KW - Categorical ML
KW - Differentiation
KW - Linear logic
KW - Machine learning
ER -
TY - SLIDE
TI - Linear logic and deep learning
A2 - Murfet, Daniel
A2 - Hu, Huiyi
LA - en
KW - Categorical ML
KW - Linear logic
KW - Machine learning
KW - Semantics
ER -
TY - CONF
TI - Differentiable Causal Computations via Delayed Trace
AU - Sprunger, David
AU - Katsumata, Shin-ya
T2 - 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS)
AB - We investigate causal computations taking sequences of inputs to sequences of outputs where the nth output depends on the ﬁrst n inputs only. We model these in category theory via a construction taking a Cartesian category C to another category St(C) with a novel trace-like operation called “delayed trace”, which misses yanking and dinaturality axioms of the usual trace. The delayed trace operation provides a feedback mechanism in St(C) with an implicit guardedness guarantee.
C1 - Vancouver, BC, Canada
C3 - 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS)
DA - 2019/06//
PY - 2019
DO - 10/ggdf98
DP - Crossref
SP - 1
EP - 12
LA - en
PB - IEEE
SN - 978-1-72813-608-0
UR - https://ieeexplore.ieee.org/document/8785670/
Y2 - 2019/11/23/16:57:38
KW - Categorical ML
KW - Differentiation
ER -