Title: | Determine the Number of States in Hidden Markov Models via Marginal Likelihood |
---|---|
Description: | Provides functions to estimate the number of states for a hidden Markov model (HMM) using the marginal likelihood method proposed by the authors. See the Manual.pdf file for a detailed description of all functions and a detailed tutorial. |
Authors: | Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and S. C. Kou. |
Maintainer: | Chu-Lan Michael Kao <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.6 |
Built: | 2025-02-20 03:58:52 UTC |
Source: | https://github.com/cran/HMMmlselect |
(See the Manual.pdf file in the "inst/extdata" folder for a detailed description of all functions and a detailed tutorial.)
This package provides functions to estimate the number of states in a hidden Markov model (HMM) using the marginal likelihood method proposed by Chen, Fuh, Kao and Kou (2019+), which we denote as HMMml2017 hereafter. HMMmlselect estimates the number of states, and PlotHMM visualizes the results. Other related functions are also provided.
Package: | HMMmlselect |
Type: | Package |
Version: | 1.0 |
Date: | 2019-2-08 |
License: | GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 |
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou
Maintainer: Chu-Lan Michael Kao <[email protected]>
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
library(HMMmlselect)

# simulate an HMM with 25 observations
obs = HMMsim(n = 25)$obs

# perform order selection and estimation
results = HMMmlselect(y = obs, list(Kfits = c(2, 3), boolUseMclust = FALSE))

# visualize the results, see Figure 1
PlotHMM(y = obs, results)
Auxiliary function that checks whether the parameter set is suitable for HMM/Gaussian mixture.
check_para_validity(parameters_in_matrix_form, bool_hmm)
parameters_in_matrix_form | The parameters to be checked.
bool_hmm | 1 for HMM. 0 for Gaussian mixture.
See Manual.pdf in "inst/extdata" folder.
It returns the logical value of whether the parameter set is suitable for HMM/Gaussian mixture.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function that approximates the normalizing constant. Used when approximating the marginal likelihood.
estimateNormalizingConst(SampDens, boolHMM, dft, RIS, IS, NsmpResmp, llUn, Mclust = TRUE)
SampDens | The samples and their densities.
boolHMM | Flag indicating whether the HMM (rather than the Gaussian mixture) marginal likelihood is computed.
dft | Degrees of freedom used in the t-distribution.
RIS | Logical variable indicating whether to use reciprocal importance sampling (RIS).
IS | Logical variable indicating whether to use importance sampling (IS).
NsmpResmp | Number of resamples used in the algorithm.
llUn | Unnormalized log-likelihood.
Mclust | Auxiliary variable. Default is TRUE.
See Manual.pdf in "inst/extdata" folder.
It returns the approximated normalizing constant.
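For context, these are the standard identities behind the RIS and IS options, with \tilde{q} the unnormalized posterior, g the importance density, and Z the normalizing constant; how the package combines them (including the locally restricted variant) is described in the paper and Manual.pdf:

Z = \int \tilde{q}(\theta)\, d\theta, \qquad
\hat{Z}_{\mathrm{IS}} = \frac{1}{N}\sum_{i=1}^{N} \frac{\tilde{q}(\theta_i)}{g(\theta_i)}, \quad \theta_i \sim g, \qquad
\widehat{1/Z}_{\mathrm{RIS}} = \frac{1}{N}\sum_{i=1}^{N} \frac{g(\theta_i)}{\tilde{q}(\theta_i)}, \quad \theta_i \sim \tilde{q}/Z.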
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function that computes the importance weight used in approximating the normalizing constant.
find_importance_function(x, boolUseMclust)
x | The data on which the importance weight is computed.
boolUseMclust | Logical variable indicating whether to use the Mclust package.
See Manual.pdf in "inst/extdata" folder.
It returns the importance weight used in approximating the normalizing constant.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function that identifies the local region for locally restricted importance sampling. Used when approximating the normalizing constant.
find_index_local_region(samples, EM_result, df_t)
samples | The samples used to construct the region.
EM_result | The parameters.
df_t | Degrees of freedom for the t-distribution.
See Manual.pdf in "inst/extdata" folder.
It returns the local region for locally restricted importance sampling.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
The following function performs (a) HMM fitting through the Expectation-Maximization algorithm (METHOD = 1), (b) HMM fitting through the Markov chain Monte Carlo algorithm (METHOD = 2), and (c) Gaussian mixture model fitting through the Markov chain Monte Carlo algorithm (METHOD = 3).
HMMfit(y, K, METHOD, optionalfit = list())
y | The observed data.
K | The specified number of states of the underlying Markov chain.
METHOD | Integer value indicating the method of parameter estimation: (a) HMM fitting through the Expectation-Maximization algorithm (METHOD = 1), (b) HMM fitting through the Markov chain Monte Carlo algorithm (METHOD = 2), or (c) Gaussian mixture model fitting through the Markov chain Monte Carlo algorithm (METHOD = 3).
optionalfit | Optional variables as a list. Possible options include:
Ngibbs: Number of samples when using MCMC. Default is 5000.
Burnin: Length of burnin period when using MCMC. Default is 5000.
Thin: Thinning parameter when using MCMC. Default is 10.
Nstart: Number of starting points. Default is 50.
verbose: Logical variable indicating whether to print details. Default is FALSE.
priors: Priors when using MCMC. Default is a flat prior.
See Manual.pdf in "inst/extdata" folder.
This function returns the fitted parameters of the observed data given the specified number of states.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
library(HMMmlselect)

# Example 1: use HMMfit to fit an HMM given the number of states
obs = HMMsim(n = 200)$obs
Nest = HMMfit(y = obs, K = 3, METHOD = 1)
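As noted above, METHOD = 2 and METHOD = 3 switch to the MCMC-based fits. A minimal sketch reusing obs from the example above (the MCMC defaults Ngibbs, Burnin and Thin listed above apply):

# Example 2 (sketch): HMM fitting through MCMC (METHOD = 2)
fitMCMC = HMMfit(y = obs, K = 3, METHOD = 2)

# Example 3 (sketch): Gaussian mixture fitting through MCMC (METHOD = 3)
fitGM = HMMfit(y = obs, K = 3, METHOD = 3)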
Auxiliary function called from C++.
HMMfitting(tuningparameters)
tuningparameters | Carried values.
Auxiliary function called from C++.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function called from C++.
HMMll(tuningparameters)
tuningparameters | Carried values.
Auxiliary function called from C++.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function that approximates the marginal likelihood.
HMMmlestimate(y, K, optionalfit = list())
y | The data that the marginal likelihood is computed for.
K | Number of states.
optionalfit | Optional variables.
See Manual.pdf in "inst/extdata" folder.
It returns the approximated marginal likelihood.
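A minimal usage sketch with the documented arguments; HMMmlselect presumably calls this for each candidate number of states, but it can also be called directly:

library(HMMmlselect)

# simulate data and approximate the marginal likelihood for a single candidate K
obs = HMMsim(n = 200)$obs
ml3 = HMMmlestimate(y = obs, K = 3)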
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
This function computes the marginal likelihood of the HMM model with the observed data under various numbers of states, and chooses the number with the highest marginal likelihood as the estimated number of states. The method in Chen et al. (2017) is used, which we denote as HMMml2017 hereafter.
HMMmlselect(y, optionalfit = list())
y | The observed data.
optionalfit | Optional variables as a list. Possible options include:
Kfits: Possible numbers of states. The function will compute the marginal likelihood under each of these numbers and choose the one with the highest value as the estimated number of states. Default is {2,3,...,6}.
RIS: Logical variable indicating whether to use the reciprocal importance sampling method in HMMml2017. Default is FALSE.
IS: Logical variable indicating whether to use the importance sampling method in HMMml2017. Default is TRUE.
RunParallel: Logical variable indicating whether to run with parallelization. Default is TRUE.
boolUseMclust: Logical variable indicating whether to use the Mclust package. Default is FALSE.
priors: List of hyperparameters for the prior. Default is a flat prior.
boolHMM: Compute using the marginal likelihood of the HMM. Default is TRUE. If set to FALSE, the program uses the marginal likelihood of the mixture model instead.
See Manual.pdf in "inst/extdata" folder.
It returns (1) the estimated number of hidden states using the marginal likelihood method, (2) the marginal likelihood values corresponding to 2, 3, ... number of hidden states, and (3) the fitted model parameters given the estimated number of hidden states.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
library(HMMmlselect)

# simulate an HMM with 25 observations
obs = HMMsim(n = 25)$obs

# perform order selection and estimation
results = HMMmlselect(y = obs, list(Kfits = c(2, 3), boolUseMclust = FALSE))

# visualize the results, see Figure 1
PlotHMM(y = obs, results)
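The options documented above can also be passed explicitly through optionalfit; a sketch with illustrative values (option names as documented):

results2 = HMMmlselect(y = obs,
                       optionalfit = list(Kfits = 2:4, IS = TRUE, RIS = FALSE,
                                          RunParallel = FALSE, boolUseMclust = FALSE))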
Auxiliary function called from C++.
HMMrepsim(tuningparameters)
tuningparameters | Carried values.
Auxiliary function called from C++.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
This function simulates an HMM whose observations are conditionally Gaussian given the underlying state.
HMMsim(n, optionalsim = list())
n | The length of the HMM to be simulated.
optionalsim | Optional variables as a list. Possible options include:
Ksim: Number of states of the simulated HMM. Default is 3.
P: The transition matrix of the underlying Markov chain. Default is a flat K-by-K matrix.
mu: The mean of the observed data given each underlying state. Default is {1, 2, ..., K}.
sigma: The standard deviation of the observed data given each underlying state. Default is {0.1, 0.1, ..., 0.1}.
pi: The distribution of the initial state. Default is a uniform distribution across all possible states.
BoolWritetoFile: Logical variable indicating whether to write the result to a file. Default is FALSE.
Filenameoutput: The output file name for the simulated HMM. Default is "HMMtrace.txt".
See Manual.pdf in "inst/extdata" folder.
It returns the sample of the simulated HMM.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
library(HMMmlselect)

# simulate an HMM with 25 observations
obs = HMMsim(n = 25)$obs

# perform order selection and estimation
results = HMMmlselect(y = obs, list(Kfits = c(2, 3), boolUseMclust = FALSE))

# visualize the results, see Figure 1
PlotHMM(y = obs, results)
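A sketch of simulating with user-specified parameters instead of the defaults (option names as documented above; the parameter values are illustrative):

# simulate a 2-state HMM with a specified transition matrix, means and standard deviations
sim2 = HMMsim(n = 500, optionalsim = list(
  Ksim  = 2,
  P     = matrix(c(0.9, 0.1,
                   0.2, 0.8), nrow = 2, byrow = TRUE),
  mu    = c(0, 3),
  sigma = c(0.5, 0.5),
  pi    = c(0.5, 0.5)))
obs2 = sim2$obs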
Auxiliary function called from C++.
HMMsimulate(tuningparameters)
tuningparameters | Carried values.
Auxiliary function called from C++.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function that computes the unnormalized posterior density of the HMM or Gaussian mixture.
ll_un_normalized_hmm_gm(paras, yobs, bool_hmm, priors)
paras | The parameters for the HMM.
yobs | The observed data.
bool_hmm | 1 for HMM. 0 for Gaussian mixture.
priors | Prior distribution.
See Manual.pdf in "inst/extdata" folder.
It returns the unnormalized posterior density of the HMM or Gaussian mixture.
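For reference, the returned quantity corresponds (on the log scale) to the standard unnormalized posterior, with \theta the parameters, y the observations, L the likelihood (HMM or Gaussian mixture, depending on bool_hmm), and p(\theta) the prior:

\log \tilde{\pi}(\theta \mid y) = \log L(y \mid \theta) + \log p(\theta).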
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function that computes the log-density of the multivariate normal distribution. Used when approximating the marginal likelihood.
logmvNormdensity(x, mu, sqrt_inv_sigma, lgsqrt_det_sigma, d, logconstnormal)
x | Data to compute the log likelihood.
mu | Mean of the normal distribution.
sqrt_inv_sigma | Corresponds to the standard deviation of the normal distribution.
lgsqrt_det_sigma | Corresponds to the determinant of the variance-covariance matrix.
d | Dimension of the normal distribution.
logconstnormal | Leading constant for the normal density function.
See Manual.pdf in "inst/extdata" folder.
It returns the log-density of the multivariate normal distribution.
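For reference, the standard multivariate normal log-density that these arguments correspond to; exactly how the constant term is split into logconstnormal and the precomputed square-root and determinant factors is an implementation detail, but the density itself is:

\log \phi(x;\mu,\Sigma) = -\frac{d}{2}\log(2\pi) - \frac{1}{2}\log\lvert\Sigma\rvert - \frac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu).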
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function that computes the log-likelihood based on the multivariate t-distribution. Used when approximating the marginal likelihood.
logmvTdensity(x, mu, sqrt_inv_sigma, lgsqrt_det_sigma, df, d, logconstT)
x | Data to compute the log likelihood.
mu | Mean of the t-distribution.
sqrt_inv_sigma | Corresponds to the standard deviation of the t-distribution.
lgsqrt_det_sigma | Corresponds to the determinant of the variance-covariance matrix.
df | Degrees of freedom of the t-distribution.
d | Dimension of the t-distribution.
logconstT | Leading constant for the t-distribution density function.
See Manual.pdf in "inst/extdata" folder.
It returns the log-likelihood of the multivariate student-t distribution.
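Similarly, the standard multivariate t log-density that these arguments correspond to, with \nu = df degrees of freedom (again, how the constant is precomputed into logconstT is an implementation detail):

\log t_{\nu}(x;\mu,\Sigma) = \log\Gamma\!\left(\frac{\nu+d}{2}\right) - \log\Gamma\!\left(\frac{\nu}{2}\right) - \frac{d}{2}\log(\nu\pi) - \frac{1}{2}\log\lvert\Sigma\rvert - \frac{\nu+d}{2}\log\!\left(1 + \frac{(x-\mu)^{\top}\Sigma^{-1}(x-\mu)}{\nu}\right).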
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
Auxiliary function that computes the probability density of a mixture normal/t-distribution. Used when approximating the marginal likelihood.
multivariate_mixture_density(x, EM_result, df_t, logconstnormal, logconstT)
x | Data to compute the log likelihood.
EM_result | Parameters as a list, including $p for the probability weights, $Mu for the mean of each mixture component, and $Sigma for the standard deviation of each mixture component.
df_t | Degrees of freedom of the t-distribution. If df_t == 0, it is treated as a normal distribution.
logconstnormal | Leading constant for the normal density function. Used when df_t == 0.
logconstT | Leading constant for the t-distribution density function. Used when df_t > 0.
See Manual.pdf in "inst/extdata" folder.
It returns the likelihood of the mixture normal/student-t distribution.
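The mixture density being evaluated is the usual weighted sum, where g_k is the normal density when df_t = 0 and the t density (with df_t degrees of freedom) when df_t > 0:

f(x) = \sum_{k} p_k \, g_k\!\left(x;\ \mathrm{Mu}_k,\ \mathrm{Sigma}_k\right).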
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
This function visualizes the state inference results from HMMmlselect. See Manual.pdf in the "inst/extdata" folder for a figure example.
PlotHMM(y, results)
y | The observed data.
results | The output of state inference from HMMmlselect.
See Manual.pdf in "inst/extdata" folder.
It returns a graph with the original data and the inferred states.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
library(HMMmlselect)

# simulate an HMM with 25 observations
obs = HMMsim(n = 25)$obs

# perform order selection and estimation
results = HMMmlselect(y = obs, list(Kfits = c(2, 3), boolUseMclust = FALSE))

# visualize the results, see Figure 1
PlotHMM(y = obs, results)
This function estimates the number of states of the given HMM data using the robust BIC criterion.
RobustBIC(y, optionalbic = list())
y | The observed data.
optionalbic | Optional variables as a list. Possible options include:
Kfits: Possible numbers of states. The function will compute the BIC under each of these numbers and choose the one with the lowest value as the estimated number of states. Default is {2,3,...,6}.
Nstart: The number of starting points for the robust BIC. Default is 50.
verbose: Logical variable indicating whether to print details. Default is FALSE.
See Manual.pdf in "inst/extdata" folder.
This function returns the estimated number of hidden states obtained by minimizing the BIC, the BIC values for all candidate numbers of hidden states, and the fitted model parameters under the estimated number of hidden states.
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.
library(HMMmlselect)

# Example 1: use robust BIC to determine the order of the HMM
obs = HMMsim(n = 200)$obs
resultsBIC = RobustBIC(y = obs)
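Since RobustBIC and HMMmlselect address the same question, a sketch comparing the two estimates on the same simulated data (the Kfits values are illustrative):

# Example 2 (sketch): compare the BIC-based and marginal-likelihood-based estimates
resultsML = HMMmlselect(y = obs, list(Kfits = c(2, 3, 4)))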
Auxiliary function that simulates from a mixture normal/t-distribution. Used when approximating the marginal likelihood.
sample_mixture(N, list_paras, df_t)
N | The number of samples to be drawn.
list_paras | Parameters as a list, including $p for the probability weights, $Mu for the mean of each mixture component, and $Sigma for the standard deviation of each mixture component.
df_t | Degrees of freedom of the t-distribution. If df_t == 0, it is treated as a normal distribution.
See Manual.pdf in "inst/extdata" folder.
It returns the sample of the simulated mixture normal/student-t distribution.
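For intuition, the usual two-step scheme such a sampler follows, written here as a generic base-R sketch (univariate normal case, df_t = 0); this is illustrative only and does not call sample_mixture itself, whose internal representation of list_paras may differ:

# mixture parameters in the documented $p / $Mu / $Sigma roles (illustrative values)
p     = c(0.3, 0.7)   # component weights
Mu    = c(0, 3)       # component means
Sigma = c(1, 0.5)     # component standard deviations

# step 1: draw a component label for each sample; step 2: draw within that component
z = sample(seq_along(p), size = 100, replace = TRUE, prob = p)
x = rnorm(100, mean = Mu[z], sd = Sigma[z])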
Yang Chen, Cheng-Der Fuh, Chu-Lan Kao, and Samuel Kou (2019+) "Determine the number of states in hidden markov models via marginal likelihood." Submitted.