Why I Don’t Trust the Hosmer-Lemeshow Test for Logistic Regression

March 5, 2013

The Hosmer-Lemeshow (HL) test for logistic regression is widely used to answer the question “How well does my model fit the data?” But I’ve found it to be unsatisfactory for several reasons that I’ll explain in this post.

First, some background. Last month I wrote about several R² measures for logistic regression, which is one approach to assessing model fit. R² is a measure of predictive power, that is, how well you can predict the dependent variable based on the independent variables. That may be an important concern, but it doesn’t really address the question of whether the model is consistent with the data.

By contrast, goodness-of-fit (GOF) tests help you decide whether your model is correctly specified. They produce a p-value: if it’s low (say, below .05), you reject the model. If it’s high, then your model passes the test.

In what ways might a model be misspecified? Well, the most important potential problems are interactions and nonlinearities. You can always produce a satisfactory fit by adding enough interactions and nonlinearities. But do you really need them? GOF tests are designed to answer that question. Another issue is whether the “link” function is correct. Is it logit, probit, complementary log-log, or something else entirely?

For both linear and logistic regression, it’s possible to have a low R² and still have a model that is correctly specified in every respect. And vice versa, you can have a very high R² and yet have a model that is grossly inconsistent with the data.

GOF tests are readily available for logistic regression when the data can be aggregated or grouped into unique “profiles”. Profiles are groups of cases that have exactly the same values on the predictors. Suppose, for example, that the model has just two predictor variables, sex (1=male, 0=female) and marital status (1=married, 0=unmarried). There are then four profiles: married males, unmarried males, married females and unmarried females, presumably with many cases in each profile.

Suppose we then fit a logistic regression model with the two predictors, sex and marital status (but not their interaction). For each profile, we can get an observed number of events and an expected number of events based on the model. There are two well-known statistics for comparing the observed number with the expected number: the deviance and Pearson’s chi-square.

The deviance is a likelihood ratio test of the fitted model versus a “saturated” model that perfectly fits the data. In our hypothetical example, a saturated model would include the interaction of sex and marital status. In that case, the deviance is testing the “no interaction” model as the null hypothesis, with the interaction model as the alternative. A low p-value suggests that the simpler model (without the interaction) should be rejected in favor of the more complex one (with the interaction). Pearson’s chi-square is an alternative method for testing the same hypothesis. It’s just the application of Pearson’s familiar formula for comparing observed with expected numbers of events (and non-events).
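
As a rough illustration of the two statistics, here is a minimal Python sketch for the four-profile example. The counts are hypothetical (invented for illustration, not taken from any real data set), and the helper names `solve` and `fit_main_effects` are assumptions of this sketch rather than functions from any package. It fits the main-effects model by Newton-Raphson, then compares observed with expected counts.

```python
import math

# Hypothetical grouped data: (male, married, n_cases, n_events) for each profile.
profiles = [
    (1, 1, 200, 120),
    (1, 0, 150, 60),
    (0, 1, 180, 90),
    (0, 0, 170, 40),
]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_main_effects(profiles, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1*male + b2*married on grouped data."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        score = [0.0] * 3
        info = [[0.0] * 3 for _ in range(3)]
        for male, married, n, events in profiles:
            x = [1.0, male, married]
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(3):
                score[i] += x[i] * (events - n * p)
                for j in range(3):
                    info[i][j] += x[i] * x[j] * n * p * (1.0 - p)
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

beta = fit_main_effects(profiles)

deviance = 0.0
pearson = 0.0
for male, married, n, events in profiles:
    p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * male + beta[2] * married)))
    expected_events = n * p
    expected_nonevents = n * (1.0 - p)
    nonevents = n - events
    # Deviance: 2 * sum(observed * log(observed / expected)) over events and non-events.
    deviance += 2.0 * (events * math.log(events / expected_events)
                       + nonevents * math.log(nonevents / expected_nonevents))
    # Pearson: sum((observed - expected)^2 / expected) over events and non-events.
    pearson += ((events - expected_events) ** 2 / expected_events
                + (nonevents - expected_nonevents) ** 2 / expected_nonevents)

df = len(profiles) - len(beta)  # 4 profiles - 3 parameters = 1
# For 1 df, the chi-square upper-tail probability is erfc(sqrt(x / 2)).
p_deviance = math.erfc(math.sqrt(deviance / 2.0))
p_pearson = math.erfc(math.sqrt(pearson / 2.0))
print(f"deviance = {deviance:.3f}, p = {p_deviance:.4f}")
print(f"Pearson  = {pearson:.3f}, p = {p_pearson:.4f}")
```

With four profiles and three estimated coefficients, both statistics have 1 degree of freedom, which corresponds to the single omitted interaction term.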

Both of these statistics have good properties when the expected number of events in each profile is at least 5. But most contemporary applications of logistic regression use data that do not allow for aggregation into profiles because the model includes one or more continuous (or nearly continuous) predictors. When there is only one case per profile, both the deviance and Pearson chi-square have distributions that depart markedly from a true chi-square distribution, yielding p-values that may be wildly inaccurate.

What to do? Hosmer and Lemeshow (1980) proposed grouping cases together according to their predicted values from the logistic regression model. Specifically, the predicted values are arrayed from lowest to highest, and then separated into several groups of approximately equal size. Ten groups is the standard recommendation.

For each group, we calculate the observed number of events and non-events, as well as the expected number of events and non-events. The expected number of events is just the sum of the predicted probabilities over the individuals in the group. And the expected number of non-events is the group size minus the expected number of events.

Pearson’s chi-square is then applied to compare observed counts with expected counts. The number of degrees of freedom is the number of groups minus 2. As with the classic GOF tests, low p-values suggest rejection of the model.
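
In symbols, the calculation just described can be written as follows. The notation is an assumption chosen for this sketch, not necessarily that of the original paper: with G groups, n_g is the number of cases in group g, O_g is the observed number of events, and E_g is the expected number of events (the sum of the predicted probabilities in the group).

```latex
\hat{C} \;=\; \sum_{g=1}^{G} \left[ \frac{(O_g - E_g)^2}{E_g}
      + \frac{\bigl((n_g - O_g) - (n_g - E_g)\bigr)^2}{n_g - E_g} \right]
\;=\; \sum_{g=1}^{G} \frac{(O_g - E_g)^2}{n_g \,\bar{\pi}_g \,(1 - \bar{\pi}_g)},
\qquad \bar{\pi}_g = \frac{E_g}{n_g}
```

The second form follows because the squared differences for events and non-events are identical. The statistic is referred to a chi-square distribution with G − 2 degrees of freedom.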

It seems like a clever solution, but it turns out to have serious problems. The most troubling problem is that results can depend markedly on the number of groups, and there’s no theory to guide the choice of that number. This problem did not become apparent until software packages started allowing you to specify the number of groups, rather than just using 10.

Here’s an example using Stata with the famous Mroz data set that I used in last month’s post. The sample consists of 753 women, and the dependent variable is whether or not a woman is in the labor force. Here is the Stata code for producing the HL statistic based on 10 groups:

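* Load the Mroz data
use http://www.uam.es/personal_pdi/economicas/rsmanga/docs/mroz.dta, clear

* Fit the logistic regression for labor force participation
logistic inlf kidslt6 age educ huswage city exper

* Hosmer-Lemeshow test based on 10 groups
estat gof, group(10)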

The estat gof command produces a chi-square of 15.52 with 8 df, yielding a p-value of .0499, just barely significant. This suggests that the model is not a satisfactory fit to the data, and that interactions and non-linearities are needed (or maybe a different link function). But if we specify 9 groups using the option group(9), the p-value rises to .11. And with group(11), the p-value is .64. Clearly, it’s not acceptable for the results to depend so greatly on such minor changes to a test characteristic that is completely arbitrary. Examples like this one are easy to come by.
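
For readers who want to experiment with the group count outside Stata, here is a minimal Python sketch. It does not reproduce the Mroz analysis: the data are simulated from a correctly specified one-predictor logistic model, and the function names (`chi2_sf`, `fit_logit`, `hosmer_lemeshow`), the seed, and the true coefficients are all assumptions made for illustration. Groups are formed by cutting the sorted predicted values into blocks of nearly equal size; packages differ in how they handle ties, so exact agreement with estat gof should not be expected.

```python
import math
import random

def chi2_sf(x, df):
    """Upper-tail chi-square probability via the incomplete-gamma recurrence."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)             # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # Q(1/2, z)
    while a < df / 2.0:
        # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

def fit_logit(x, y, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * x for ungrouped data."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def hosmer_lemeshow(y, p_hat, groups=10):
    """Return (chi-square, df, p-value) for the HL test with the given group count."""
    order = sorted(range(len(y)), key=lambda i: p_hat[i])
    n = len(order)
    chi2 = 0.0
    for g in range(groups):
        members = order[g * n // groups:(g + 1) * n // groups]
        size = len(members)
        observed_events = sum(y[i] for i in members)
        expected_events = sum(p_hat[i] for i in members)
        observed_nonevents = size - observed_events
        expected_nonevents = size - expected_events
        chi2 += ((observed_events - expected_events) ** 2 / expected_events
                 + (observed_nonevents - expected_nonevents) ** 2 / expected_nonevents)
    df = groups - 2
    return chi2, df, chi2_sf(chi2, df)

random.seed(2013)  # arbitrary seed so the simulation is reproducible
n = 753            # same size as the Mroz sample, but the data are simulated
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if random.random() < 1.0 / (1.0 + math.exp(-(0.3 + 0.8 * xi))) else 0 for xi in x]

b0, b1 = fit_logit(x, y)
p_hat = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]

results = {g: hosmer_lemeshow(y, p_hat, groups=g) for g in (9, 10, 11)}
for g, (chi2, df, p) in results.items():
    print(f"groups = {g:2d}  chi2 = {chi2:6.2f}  df = {df}  p = {p:.3f}")
```

As a sanity check on the p-value routine, `chi2_sf(15.52, 8)` comes out at roughly .05, in line with the Stata result quoted above. Changing the seed or the tuple of group counts gives a quick sense of how much the p-value can move around even when the model is correct.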

But wait, there’s more. One would hope that adding a statistically significant interaction or non-linearity to a model would improve its fit, as judged by the HL test. But often that doesn’t happen. Suppose, for example, that we add the square of exper (labor force experience) to the model, allowing for non-linearity in the effect of experience. The squared term is highly significant (p=.002). But with 9 groups, the HL chi-square increases from 11.65 (p=.11) in the simpler model to 13.34 (p=.06) in the more complex model. That result suggests that we’d be better off with the model that excludes the squared term.

The reverse can also happen. Quite frequently, adding a non-significant interaction or non-linearity to a model will substantially improve the HL fit. For example, I added the interaction of educ and exper to the basic model above. The product term had a p-value of .68, clearly not statistically significant. But the HL chi-square (based on 10 groups) declined from 15.52 (p=.05) to 9.19 (p=.33). Again, unacceptable behavior.

If the HL test is no good, then how can we assess the fit of the model? It turns out that there’s been quite a bit of recent work on this topic. In next month’s post, I’ll describe some of the newer approaches.

If you want to learn more about logistic regression, check out my book Logistic Regression Using SAS: Theory and Application, Second Edition (2012), or try my seminar on Logistic Regression.

REFERENCE

Hosmer D.W. and Lemeshow S. (1980) “A goodness-of-fit test for the multiple logistic regression model.” Communications in Statistics A10:1043-1069.

Roger Keller says:

March 13, 2013 at 11:40 am

Very good explanation. I have seen this problem in my analyses too and could not find a “right” number of groups for the HL test…just because there isn’t one. Thanks.

Quin says:

March 28, 2013 at 10:11 am

The H-L test fails most of the time in the very large datasets commonly seen in the financial industry. Any better tests for dealing with this situation would be very helpful.

1. Paul Allison says:
  
  April 1, 2013 at 9:30 am
  
  See my reply to Matt Bogard below.
  
Matt Bogard says:

March 28, 2013 at 5:16 pm

I’ve also seen several criticisms that the HL test is too sensitive to large sample sizes. I’m not sure of the validity of this criticism, but I look forward to next month’s article. Maybe the new approaches you are referring to will address this issue if it is valid.

For instance:

JOURNAL OF PALLIATIVE MEDICINE
Volume 12, Number 2, 2009
“The Hosmer-Lemeshow test detected a statistically significant degree of miscalibration in both models, due to the extremely large sample size of the models, as the differences between the observed and expected values within each group are relatively small”

and

SIZE MATTERS TO A MODEL’S FIT (comment in Crit Care Med. 2007: Sep 35(9):2213)

“Caution should be used in interpreting the calibration of predictive models developed using a smaller data set when applied to larger numbers of patients. A significant Hosmer-Lemeshow test does not necessarily mean that a predictive model is not useful or suspect. While decisions concerning a mortality model’s suitability should include the Hosmer-Lemeshow test, additional information needs to be taken into consideration. This includes the overall number of patients, the observed and predicted probabilities within each decile, and adjunct measures of model calibration.”

and from STATA LIST comments:

http://www.stata.com/statalist/archive/2006-09/msg00226.html

“It follows that with large sample sizes any discrepancy between the model and the data will be magnified, resulting in small p-values for a goodness of fit test.”

1. Paul Allison says:
  
  April 1, 2013 at 9:29 am
  
  The large sample size issue is a potential problem with ANY goodness of fit test. With large sample sizes, even trivial departures from the model specification are likely to show up as statistically significant. Actually, simulation results suggest that the HL test has relatively LOW power for detecting certain kinds of model misspecification, especially interactions.
  
Robin says:

May 9, 2013 at 11:59 pm

I look forward to the next post on this topic. I’m dealing with a CPS dataset with nearly 100,000 observations and find the H-L test to be significant, yet looking at the tables the counts in the expected/observed columns are very close, not different enough to warrant changes to a model that is theoretically very sound.

What are your thoughts on the link test (Stata linktest command)?

1. Paul Allison says:
  
  May 28, 2013 at 9:14 am
  
  The link test in Stata is fairly crude, but serviceable.
  
William Chiu says:

June 25, 2013 at 11:09 pm

I propose calculating the HL statistic on the “hold-out sample” rather than the “model development sample”. Assuming you have a lot of data, you can do a 75% development data set, and 25% hold-out data set.

If you don’t have enough data points for a hold-out data set, I recommend the BIC which penalizes for model complexity. http://en.wikipedia.org/wiki/Bayesian_information_criterion

emeryL says:

July 16, 2013 at 8:25 pm

Are you still planning a follow-up article on a good alternative to the HL test? I’d be very interested to read it.

This article was really helpful!

Angelica says:

July 29, 2013 at 7:32 pm

Very helpful. Thank you!

Rogelio says:

October 14, 2013 at 4:06 am

Hello,

I have just read this post and I have found it really interesting. That is the reason I am looking forward to read the post on a good alternative to this HL test (which, in fact, has driven me crazy these last three months). Where can I find the explanations on those good alternatives?

Thank you very much.
Rogelio Pujol
Statistical Researcher

1. Paul Allison says:
  
  November 25, 2013 at 2:13 pm
  
  I’m working on it, but it’s taken longer than expected.
  
Sarah says:

November 26, 2013 at 10:46 am

Is le Cessie and Houwelingen test better?

1. Paul Allison says:
  
  January 22, 2014 at 9:37 am
  
  Not familiar with this test.
  
Neil Shephard says:

January 14, 2014 at 10:03 am

You might be interested in this article from Hosmer & Lemeshow (and a couple of others), who critique the Hosmer-Lemeshow goodness-of-fit test and look at how it and others actually perform (I took away from it that none of them are that great)…

D. W. HOSMER, T. HOSMER, S. LE CESSIE, S. LEMESHOW (1997) A COMPARISON OF GOODNESS-OF-FIT TESTS FOR THE LOGISTIC REGRESSION MODEL Statistics in Medicine Volume 16, Issue 9, pages 965–980

http://onlinelibrary.wiley.com/doi/10.1002/%28SICI%291097-0258%2819970515%2916:9%3C965::AID-SIM509%3E3.0.CO;2-O/abstract

Jean says:

February 4, 2014 at 12:25 pm

A clearer explanation and a very helpful description of the HL test of GOF.
Thank you!

Wei says:

April 8, 2014 at 10:35 pm

Hi Paul,

Have you published a paper on this particular finding? If so, would you please provide me with a link so I can refer to it in my work?

thanks

Wei

1. Paul Allison says:
  
  April 9, 2014 at 10:15 am
  
  Sorry, no publication. But you can refer to my recent paper presented at the SAS Global Forum. Click here to see it.
  
Anjan says:

April 23, 2014 at 3:02 am

I have run the Hosmer and Lemeshow (HL) test for goodness of fit. All the estimates are significant, but the value of Sig. in the HL test is greater than 0.75. Is this correct, or what can be the solution?
Hosmer and Lemeshow Test
Step  Chi-square  df  Sig.
1     2.764       8   .948

1. Paul Allison says:
  
  April 23, 2014 at 10:01 am
  
  For the HL test, higher p-values are better. So 0.75 indicates that the model fits well.
  
Dharmi Kapadia says:

June 20, 2014 at 12:06 pm

Hello Paul,

I was planning on using the HL test of GOF for my analysis because I am using the svy command in Stata and I haven’t been able to find any other appropriate GOF stats. Are you able to recommend an alternative when using the svy command?

Many thanks,
Dharmi

1. Paul Allison says:
  
  June 23, 2014 at 6:55 am
  
  Sorry, I don’t have any recommendations for this situation.
  
2. Puguh Prasetyoputra says:
  
  July 20, 2014 at 11:11 am
  
  Hello Dharmi,
  
  This article might be of interest to you: Archer, K. J., & Lemeshow, S. (2006). Goodness-of-fit test for a logistic regression model fitted using survey sample data. Stata Journal, 6(1), 97-105.
  
  Or this one: Archer, K. J., Lemeshow, S., & Hosmer, D. W. (2007). Goodness-of-fit tests for logistic regression models when data are collected using a complex sampling design. Computational Statistics & Data Analysis, 51(9), 4450-4464. doi: 10.1016/j.csda.2006.07.006
  
  Regards,
  Puguh.
  
Mavarick says:

August 9, 2014 at 12:53 am

very helpful for test of results of Lr. Need more learning!

Tsega says:

January 6, 2015 at 5:21 am

I am applying binary logistic regression and my independent variables are all nominal. In the GOF test, the H-L test is significant (p less than 0.01) and the Nagelkerke R Square is 0.0439. I would like to know what you suggest in this situation. I am looking forward to hearing from you soon.

1. Paul Allison says:
  
  January 15, 2015 at 10:46 am
  
  If your independent variables are all nominal, you should be able to use the deviance or Pearson chi-square to test the fit of the model. These are more trustworthy than the Hosmer-Lemeshow test. If these are significant, it would indicate a need for interactions among your predictors.
  
Dingdang says:

February 13, 2015 at 9:47 am

Hi Paul,

Thanks for the article. I am using the Hosmer-Lemeshow test to see if the observations are random variables whose distribution belongs to a given family of distributions. Do the observations have to be independent of each other? I am assuming that the observations (like defaults, non-defaults) are slightly correlated with each other. Does the test still do its job, or do I need to modify the test statistic? The denominator of the test looks like the variance of a binomial distribution. I am wondering whether I have to modify it with terms for a correlation factor.

Many thanks for sharing your ideas.

BR Ding

1. Paul Allison says:
  
  February 18, 2015 at 1:07 pm
  
  In principle, the observations should be independent. But I haven’t seen any suggestions for how to modify the test if they are not independent.
  
Zacarias Francisco Soquiço says:

June 5, 2015 at 4:58 am

Hi Paul,

There is something strange in R when one uses the package “LogisticDx” to run diagnostic tests for logistic regression models. It’s about the values of the probabilities of covariate patterns. I think they are not correct; if they are, I would like to know how they are calculated. It’s expected that the sum of y=1 observed in each covariate pattern should be approximately equal to the sum of y=1 expected in each covariate pattern when considering the probability of the covariate pattern.

Please see this point if you can and reply on my email.

Thank you!

1. Paul Allison says:
  
  June 17, 2015 at 9:20 am
  
  Sorry but I am not familiar with this package.
  
michael cook says:

September 4, 2015 at 12:58 pm

Is it also true that the number of groups must be greater than the number of predictors+1 ? I have been told that this constraint is in HL’s original paper.

“In a 1980 paper Hosmer-Lemeshow showed by simulation that (provided p+1<g ) their test statistic approximately followed a chi-squared distribution on g−2 degrees of freedom…"

1. Paul Allison says:
  
  September 25, 2015 at 12:28 pm
  
  I don’t have access to the original HL paper, but nothing in their later work (including the 3rd edition of their textbook) says anything about this requirement.
  
Jade says:

September 7, 2015 at 3:19 am

Hi,
Is it possible to have non-significant H&L test indicating the model fits the data, but actually have no significant predictors? I’ve run a logistic regression and found that none of my predictors are significant, yet the H&L test is still indicating a good fit.
I have quite a small sample size – can this affect H & L?

1. Paul Allison says:
  
  September 7, 2015 at 8:50 am
  
  Yes, absolutely. This is more likely to happen when the sample is small, but it can also happen in large samples. Keep in mind that the H&L statistic is not testing whether the predictors affect the outcome. Rather, it’s testing whether there are non-linearities and interactions that are not well approximated by your model.
  
S B Mahadik says:

February 4, 2016 at 1:41 am

Hello Sir,
In HL test, the grouping criterion is the fitted probabilities of the responses. What is the logic behind using this criterion for grouping?

1. Paul Allison says:
  
  February 8, 2016 at 8:30 am
  
  Because the fitted probabilities are deterministic functions of the predictors, similar fitted probabilities are indicative of “similar” values of the predictors, at least with respect to determination of the outcome.
  
Ahmed says:

February 10, 2016 at 12:39 pm

Hi,

Thank you for the great information and discussion on this topic. I am running a logistic model on a large data set consisting of several million observations. Not surprisingly, the HL test was highly significant. However, when I ran the same model on a smaller random sample of the same data set, the GOF (HL) test was not significant and everything else (including the ORs) remained unchanged. Could this be considered evidence that the model was fine and that the lack of fit when using the full data set has more to do with the test’s limitations than with the model specification?

Thank you.

1. Paul Allison says:
  
  February 16, 2016 at 10:54 am
  
  Possibly. But if you’ve read the post, you’ll know that I don’t trust the HL test even in smaller samples.
  
Jessica says:

February 24, 2016 at 9:38 am

Dear Mr. Allison,

For my thesis I have performed a logistic regression. The Hosmer Lemeshow test is significant. Now I try to find out where the test goes wrong. I have used the contingency table to calculate the HL statistic, but I cannot find out in which decile the model predicts poorly. Do you have any recommendations as to how I could try to find the poorly predicting decile using a HL test?

With kind regards,

Jessica

eyasu says:

May 30, 2016 at 12:26 pm

Hi,
I know how to calculate a crude odds ratio in logistic regression, but how can I calculate an adjusted odds ratio?

1. Paul Allison says:
  
  June 5, 2016 at 10:25 am
  
  If you exponentiate the coefficients from a logistic regression (i.e., calculate exp(b)), you get adjusted odds ratios.
  
Jehangir Malik says:

December 10, 2016 at 9:51 am

Very good explanation, thanks

hamzah says:

December 11, 2016 at 2:00 pm

Dear
Prof Allison,

Thank you for the great information on the goodness-of-fit test for svy: logistic.

We have analyzed our large-scale survey using the svy prefix in Stata. We can use estat gof to perform a goodness-of-fit test for this model.

Based on the data analysis, the results for the final multiple logistic regression model are as follows:
. svylogitgof
Number of observations = 259885
F-adjusted test statistic = F(9,4409) = 3.07
Prob F = 0.001
. estat gof
Logistic model for malaria, goodness-of-fit test
F(9,4409) = 3.08
Prob F = 0.0011

The F statistic is significant at the 5% level. Does this indicate that the model is not a good fit for these data?
Reference: http://www.stata.com/manuals13/svy.pdf

Meanwhile, all variables remaining in the model have p = 0.001 and OR > 1.

Do you have any insight about our final model?

Sincerely yours,

Hamzah

1. Paul Allison says:
  
  December 27, 2016 at 7:26 am
  
  svylogitgof is not an official Stata command, and I’m really not sure what it’s doing. However, with a sample this large, almost any reasonably parsimonious model is going to show a significant goodness of fit statistic.
  
Jake says:

March 20, 2017 at 5:44 pm

The issue of the number of groups created with the Hosmer-Lemeshow test notwithstanding, couldn’t you avoid the sample size issue by applying an effect size to the HL chi-square? For example, Cramer’s V could easily be calculated as V = sqrt(HL chi square/(n*2)).

1. Paul Allison says:
  
  March 22, 2017 at 4:06 pm
  
  I don’t think it’s sensible to calculate an effect size for a goodness of fit test. And Cramer’s V definitely does not seem appropriate.
  
Amha says:

June 14, 2017 at 4:52 am

I was using the HL test in SPSS, but in the case of categorical responses, the observed and the expected values are always identical, the chi-square is 0.00, df=0 and the p-value is empty. How does this happen?

1. Paul Allison says:
  
  June 19, 2017 at 1:13 pm
  
  I’m guessing that you are fitting a saturated model. That would happen, for example, if you had a single categorical predictor variable. Or more than one categorical predictor with all possible interactions.
  
Jon says:

October 26, 2017 at 2:48 am

Hi Paul
Thanks for the informative article.
I have a model with 2 categorical explanatory variables (with 2 and 3 levels respectively).
But the H&L test result in SAS shows only 3 groups. What could be the reason for this?

Thank you.

1. Paul Allison says:
  
  November 17, 2017 at 9:08 am
  
  I’m guessing that your problem stems from the fact that there are many ties on the predicted values.
  
mari says:

April 13, 2018 at 1:49 pm

Hi Paul,
Can you help me please?

Comment on the quality of fit of a logistic model corresponding to which the P-value of a Hosmer-Lemeshow test is equal to 0.0003. What is your expectation, if any, regarding the value of Nagelkerke R2 corresponding to this model?

1. Paul Allison says:
  
  May 25, 2018 at 12:38 pm
  
  The two statistics have nothing to do with each other.
  
sonu says:

July 25, 2018 at 5:43 am

Hello sir,

I am using binary logistic regression. My dependent variable is binary (1 and 0), with only one independent variable, which is categorical (for example: calling, texting, music and gaming). When I run the analysis, it shows only:

1. Three of the categories (calling, texting, music); no value is shown for the gaming category.
2. Nagelkerke R2 = 0.054
3. In the Hosmer-Lemeshow test, the GOF significance is 1.000 (> 0.05), and in the same HL test table the chi-square value is 0.000.

Is there any reason why I got a chi-square value of 0.000 in the H&L test?

Thanks in advance

1. Paul Allison says:
  
  October 2, 2018 at 4:13 pm
  
  The chi-square is 0 because you are fitting what’s called a saturated model. If you put your data in the form of a contingency table, the model can perfectly reproduce the cell frequencies. Models will be saturated if all predictors are categorical and the model includes all possible interactions. Any model with a single categorical predictor will be saturated.
  
Lankamo says:

February 10, 2019 at 8:30 am

Thank you very much for your clear and illustrative way of presenting complex issues in simple terms.

teklay says:

March 15, 2019 at 2:16 am

I want to attend any training regarding your model.

1. Paul Allison says:
  
  March 15, 2019 at 8:34 am
  
  Then take my course “Logistic Regression” which is being offered April 5-6, 2019, in Philadelphia.
  
James Dickson says:

June 4, 2019 at 7:24 am

Thanks, Prof. Allison, for your educative articles and explanations. This is very helpful to me as a beginner. I look forward to reading more from you.

Nono says:

July 15, 2020 at 1:28 pm

Kindly help me with the interpretation of these results from the post-estimation output after using a logistic model for data analysis.

number of observations = 23502
number of groups = 5
Hosmer-Lemeshow chi2(3) = 0.63
Prob > chi2 = 0.8905

1. Paul Allison says:
  
  July 21, 2020 at 7:49 am
  
  Well, the p-value indicates that the model is consistent with the data. But the point of the post was that there are good reasons not to trust the H-L test.
  
Debbie says:

October 2, 2020 at 5:12 pm

Hi Professor,

Thank you for your information!

I have run a binary logistic regression with a very high HL test (p < 1.000), but the omnibus model fit test was not significant (p < .104). I am very conflicted as to which test I can trust. Is my model a good fit/ can it still be used?

Thanks!

1. Paul Allison says:
  
  October 5, 2020 at 11:41 am
  
  These two tests are testing very different things.
  - The omnibus model fit test is testing the null hypothesis that all coefficients are 0. It’s answering the question “Is this model better than nothing?”
  - The HL test is testing for whether there are any missing interactions or nonlinearities in your model. It’s answering the question “Given the variables that I have, is there something better than this model?”
  As in your case, it’s quite possible that the model has little predictive power (omnibus test), but no need for interactions or nonlinearities (HL test).
  