Statistical Analysis and Modeling in Python
===========================================
Error Propagation
-----------------
- `astropy.uncertainty `__
- Provides a **Distribution** object to represent statistical
distributions in a form that acts as a drop-in replacement for
**Quantity** or a regular **numpy.ndarray**. Still a work in
progress.
- `uncertainties - Transparent calculations with uncertainties on the
quantities involved `__
- The **uncertainties** package is a free, cross-platform program
that transparently handles calculations with numbers with
uncertainties (like 3.14±0.01). It can also yield the derivatives
of any expression.
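As a rough illustration of what such packages automate, first-order (linear) error propagation for independent inputs combines the partial derivatives of an expression with the standard deviations of its inputs. The sketch below uses only the standard library. The helper name ``propagate`` and the finite-difference step are assumptions made for this example and are not part of either package listed above.

```python
import math


def propagate(func, values, sigmas, rel_step=1e-6):
    """First-order error propagation for independent inputs.

    Returns (f(values), sigma_f), where
    sigma_f**2 = sum_i (df/dx_i)**2 * sigma_i**2.
    Partial derivatives are estimated by central finite differences.
    """
    nominal = func(*values)
    variance = 0.0
    for i, (x, s) in enumerate(zip(values, sigmas)):
        h = rel_step * max(abs(x), 1.0)
        up = list(values)
        up[i] = x + h
        down = list(values)
        down[i] = x - h
        deriv = (func(*up) - func(*down)) / (2.0 * h)
        variance += (deriv * s) ** 2
    return nominal, math.sqrt(variance)


# Area of a rectangle with sides 3.14 +/- 0.01 and 2.00 +/- 0.05.
area, area_err = propagate(lambda a, b: a * b, [3.14, 2.00], [0.01, 0.05])
```

This linear approximation ignores correlations between inputs and breaks down for strongly non-linear expressions, which is one reason to prefer a dedicated package for real work.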
Modeling Tool
-------------
- `spotpy - A Statistical Parameter Optimization
Tool `__
- SPOTPY is a Python framework that enables the use of computational
optimization techniques for calibration, uncertainty, and
sensitivity analysis of almost every (environmental)
model.
- `BayesianOptimization - A Python implementation of global
optimization with gaussian
processes `__
- This is a constrained global optimization package built upon
Bayesian inference and Gaussian processes that attempts to find the
maximum value of an unknown function in as few iterations as
possible.
Sampling Tools and Bayesian Analysis
------------------------------------
- `emcee - The Python ensemble sampling toolkit for affine-invariant
MCMC `__
- By Dan Foreman-Mackey. **emcee** is a stable, well-tested Python
implementation of the affine-invariant ensemble sampler for Markov
chain Monte Carlo (MCMC) proposed by Goodman & Weare (2010).
- `dynesty - Dynamic Nested Sampling package for computing Bayesian
posteriors and evidences `__
- By `Josh Speagle `__. A Dynamic
Nested Sampling package for computing Bayesian posteriors and
evidences. Pure Python.
- `nestle - Pure Python, MIT-licensed implementation of nested sampling
algorithms for evaluating Bayesian
evidence `__
- By `Kyle Barbary `__
- `nnest - Neural network accelerated nested and MCMC
sampling `__
- By Adam Moss. Based on `this
paper `__
- `sampyl - MCMC samplers for Bayesian estimation in Python, including
Metropolis-Hastings, NUTS, and
Slice `__
- **Sampyl** is a package for sampling from probability
distributions using MCMC methods. Whereas **PyMC3** uses
Theano to compute gradients, Sampyl uses autograd.
- `PyMC3 - Probabilistic Programming in Python: Bayesian Modeling and
Probabilistic Machine Learning with
Theano `__
- **PyMC3** is a Python package for Bayesian statistical modeling
and Probabilistic Machine Learning focusing on advanced Markov
chain Monte Carlo (MCMC) and variational inference (VI)
algorithms. Its flexibility and extensibility make it applicable
to a large suite of problems.
- `Getting started with
PyMC3 `__ and the
`Example
Notebooks `__ are
good places to get started.
- `PyMC4 - A high-level probabilistic programming interface for
TensorFlow Probability `__
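The samplers listed above use different algorithms (ensemble, nested, NUTS, slice), but they share one idea: drawing samples whose density follows a target distribution known only up to a normalizing constant. A minimal random-walk Metropolis-Hastings sketch, using only the standard library, is shown below. It is not the affine-invariant ensemble algorithm used by **emcee**, nor nested sampling. The function names, step size, and burn-in length are illustrative assumptions.

```python
import math
import random


def log_prob(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


def metropolis(log_prob, x0, n_steps, step_size=1.0, seed=42):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    x, lp = x0, log_prob(x0)
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        lp_new = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
        # A rejected proposal repeats the current state in the chain.
        chain.append(x)
    return chain


chain = metropolis(log_prob, x0=0.0, n_steps=20000)
samples = chain[2000:]  # discard burn-in
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

For this target the sample mean and variance should approach 0 and 1. Successive samples are correlated, so the effective number of independent samples is smaller than the chain length.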
Gaussian Process
----------------
- A full introduction to the theory of Gaussian Processes is available
for free online: `Rasmussen & Williams
(2006) `__.
- `An Astronomer’s Introduction to Gaussian
Processes `__
- Very good introduction by Dan Foreman-Mackey.
- `sklearn.gaussian_process - The Gaussian Processes module in
scikit-learn `__
- `GPy - Gaussian processes framework in
python `__
- Gaussian processes underpin a range of modern machine learning
algorithms. In `GPy `__, we’ve
used Python to implement a range of machine learning algorithms
based on GPs. `Online documentation is
here `__
- `Jupyter notebooks to introduce
GPy `__
- `gpflow - Gaussian processes in
TensorFlow `__
- **GPflow** is a package for building Gaussian process models in
Python, using **TensorFlow**.
- **GPflow** implements modern Gaussian process inference for
composable kernels and likelihoods.
- **GPflow** uses TensorFlow for running computations, which allows
fast execution on GPUs, and requires Python 3.5 or above.
- `Online documentation is
here `__
- `gpytorch - A highly efficient and modular implementation of Gaussian
Processes in PyTorch `__
- **GPyTorch** is a Gaussian process library implemented using
**PyTorch**. **GPyTorch** is designed for creating scalable,
flexible, and modular Gaussian process models with ease.
- `george - Fast and flexible Gaussian Process regression in
Python `__
- By Dan Foreman-Mackey. **George** is a fast and flexible Python
library for Gaussian Process (GP) Regression.
- Unlike some other GP implementations, **george** is focused on
efficiently evaluating the marginalized likelihood of a dataset
under a GP prior, even as this dataset gets large.
- Example applications:
- `ART - A Reconstruction
Tool `__
- `everest - De-trending of K2 Light
curves `__
- `celerite - Scalable 1D Gaussian Processes in C++, Python, and
Julia `__
- By Dan Foreman-Mackey. `Online documentation is
here `__
- Based on `Fast and scalable Gaussian process modeling with
applications to astronomical time
series `__
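Several of the packages above, **george** in particular, center on evaluating the marginalized likelihood of a dataset under a GP prior. For a zero-mean GP with covariance matrix K (kernel plus measurement variance) and n data points y, this is ln L = -1/2 y^T K^-1 y - 1/2 ln|K| - (n/2) ln(2 pi). The sketch below evaluates it with a squared-exponential kernel and a dense Cholesky factorization, using only the standard library. All function names and hyperparameter values are illustrative assumptions, not the API of any listed package.

```python
import math


def sq_exp_kernel(x1, x2, amp=1.0, scale=1.0):
    """Squared-exponential covariance between two 1D inputs."""
    return amp ** 2 * math.exp(-0.5 * ((x1 - x2) / scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def gp_log_likelihood(x, y, yerr, amp=1.0, scale=1.0):
    """Log marginal likelihood of y under a zero-mean GP prior."""
    n = len(x)
    # Covariance matrix: kernel plus independent measurement variance.
    cov = [[sq_exp_kernel(x[i], x[j], amp, scale) for j in range(n)]
           for i in range(n)]
    for i in range(n):
        cov[i][i] += yerr[i] ** 2
    low = cholesky(cov)
    # Forward substitution: solve low @ z = y, so y^T cov^-1 y = z^T z.
    z = []
    for i in range(n):
        s = sum(low[i][k] * z[k] for k in range(i))
        z.append((y[i] - s) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)


x = [0.0, 1.0, 2.5, 4.0]
y = [0.1, 0.8, 0.3, -0.9]
yerr = [0.1, 0.1, 0.1, 0.1]
lnlike = gp_log_likelihood(x, y, yerr)
```

The dense factorization costs O(n^3) operations, which is the cost that scalable solvers aim to reduce for large datasets.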
Survival Analysis
-----------------
- Traditionally, `survival
analysis `__ was
developed to measure the lifespans of individuals. It can be
applied not only to births and deaths but to any
duration.
- **Survival function**: the survival function defines the probability
that the death event has not occurred yet at time t, or equivalently, the
probability of surviving past time t.
- **Hazard curve**: the instantaneous rate at which the death event occurs at
time t, given that it has not occurred before time t.
The hazard function can be estimated non-parametrically.
- **Kaplan-Meier estimator for survival function**: Survival analysis
assumes that upper limits have the same underlying distribution as
the data, and the Kaplan-Meier estimator further assumes that
detections and upper limits are mutually independent.
- `lifelines - implementation of survival analysis in
Python `__
- Handles right-censored data.
- Example of astrophysical usage: `radio SED of high-z SF
galaxies `__
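The Kaplan-Meier estimator can be written in a few lines: at each distinct event time with d events among n objects still at risk, the running survival probability is multiplied by (1 - d/n). The sketch below handles right-censored data using only the standard library. The function name and the toy data are illustrative assumptions; **lifelines** is the better choice in practice, since it also provides confidence intervals and plotting.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate for right-censored data.

    durations: time to the event or to censoring for each object.
    observed:  True if the event was seen, False if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all objects that share the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored objects both leave the risk set after t.
        at_risk -= removed
    return curve


durations = [2, 3, 3, 5, 8, 8, 9]
observed = [True, True, False, True, True, False, True]
curve = kaplan_meier(durations, observed)
```

Objects censored at the same time as an event are counted as still at risk at that time, which is the usual convention.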