---
title: "Distributions"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Distributions}
%\VignetteEngine{knitr::knitr}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
The implemented distributions are found in `univariateML_models`.
```{r}
library("univariateML")
univariateML_models
```
This package follows a naming convention for the `ml***` functions. To access the
documentation of the distribution associated with an `ml***` function, write `package::d***`.
For instance, to find the documentation for the log-gamma distribution write
```{r docs, eval = FALSE}
?actuar::dlgamma
```
Additional information about the models can found in `univariateML_metadata`.
```{r}
univariateML_metadata[["mllgser"]]
```
From the metadata you can read that
* `mllgser` estimates the parameters `N` and `s`.
* Its a discrete distribution on $1,2,3,...$,
* Its density function is `extraDistr::dlgser`.
## Problematic Distributions
Some estimation procedures will fail under certain circumstances. Sometimes due to numerical problems,
but also because the maximum likelihood estimator fails to exist. Here is a possibly non-exhaustive list of known problematic distributions.
### Discrete distributions
* **Binomial**. The maximum likelihood estimator does not exist for underdispersed data (when $size$ is estimated). There is an increasing sequence of estimates $size$, $p$ so that the binomial likelihood converges to a Poisson, however.
* **Negative binomial.** The same sort of problem occurs with the negative binomial, which converges to a Poisson for some data sets.
* **Lomax.** Here we have convergence to an exponential for certain data sets.
* **Zipf.** The optimal shape parameter may be negative, which still defines a density, but is not supported by `extraDistr`.
* **Logarithic series distribution.** When all observations are $1$ the estimator does not exist, as the "actual" maximum likelihood estimator is the point mass on $0$.
### Continuous distributions
* **Gompertz.** Here we have a similar problem, with some parameters outside the range of the distribution converging to a density function with a different support. When the `b` parameter tends towards 0, the Gompertz tends towards an exponential. A failing estimation indicates the exponential has a better fit.
* **Lomax.** Here we have convergence to an exponential for certain data sets.
* **Burr**. The Burr distribution tends to the Pareto distribution when `shape1*shape2` converges to a constant while `shape2` tends to infinity.