What to do when your Hessian matrix goes balmy !!!

So you ran some mixed models and got some balmy messages in return? Do they look like these?

“The Hessian (or G or D) Matrix is not positive definite. Convergence has stopped.”


“The Model has not Converged. Parameter Estimates from the last iteration are displayed.”

Then this post is for you. Let’s start from the basics of matrix algebra. But before getting to the Hessian matrix, let’s take a detour into the murky world of mixed models to see what’s going on there and where this thing called the Hessian matrix comes from!

A linear mixed model looks like this (from Wikipedia):

\boldsymbol{y} = X \boldsymbol{\beta} + Z \boldsymbol{u} + \boldsymbol{\epsilon}


  • \boldsymbol{y} is a known vector of observations, with mean E(\boldsymbol{y}) = X \boldsymbol{\beta};
  • \boldsymbol{\beta} is an unknown vector of fixed effects;
  • \boldsymbol{u} is an unknown vector of random effects, with mean E(\boldsymbol{u})=\boldsymbol{0} and variance-covariance matrix \operatorname{var}(\boldsymbol{u})=G;
  • \boldsymbol{\epsilon} is an unknown vector of random errors, with mean E(\boldsymbol{\epsilon})=\boldsymbol{0} and variance \operatorname{var}(\boldsymbol{\epsilon})=R;
  • X and Z are known design matrices relating the observations \boldsymbol{y} to \boldsymbol{\beta} and \boldsymbol{u}, respectively.

Let’s focus on the variance-covariance matrix G, which some software refers to as D. It is the matrix of the variances and covariances of the random effects: the variances are the diagonal elements and the covariances are the off-diagonal ones. So if you have a mixed model with two random effects, say a random intercept and a random slope, then you have a 2 × 2 G matrix. The variances of the intercept and slope terms sit on the diagonal, and the off-diagonal elements hold the covariance between them.
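For the random intercept and random slope case, the G matrix can be written out explicitly. The symbols below are notation assumed here for illustration (subscript 0 for the intercept, 1 for the slope); they are not taken from any particular software:

```latex
G =
\begin{pmatrix}
\sigma_{0}^{2} & \sigma_{01} \\
\sigma_{01}    & \sigma_{1}^{2}
\end{pmatrix},
\qquad
\rho_{01} = \frac{\sigma_{01}}{\sigma_{0}\,\sigma_{1}},
\qquad
-1 \le \rho_{01} \le 1
```

The last condition says that the covariance cannot be larger in magnitude than the two standard deviations allow. A G matrix that violates it cannot be a real variance-covariance matrix, however sensible its diagonal looks.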

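Remember that G is a variance-covariance matrix, so mathematically speaking it should be positive definite. A symmetric matrix is positive definite when all of its eigenvalues are positive. Positive diagonal elements are necessary for this but not sufficient: the covariances also have to be consistent with the variances, so that no implied correlation falls outside the range -1 to 1. As variances are never negative, the requirement on the diagonal makes sense. A variance estimated at exactly zero puts G on the boundary, where it is no longer positive definite.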
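Whether a given matrix is positive definite is easy to check numerically. Here is a minimal sketch in Python with NumPy that tests the eigenvalues of a symmetric matrix. The function name `is_positive_definite`, the tolerance and the three example G matrices are made up for illustration; they do not come from any fitted model.

```python
import numpy as np


def is_positive_definite(matrix, tol=1e-10):
    """Return True if a symmetric matrix has all eigenvalues above tol."""
    matrix = np.asarray(matrix, dtype=float)
    if not np.allclose(matrix, matrix.T):
        return False  # a variance-covariance matrix must be symmetric
    eigenvalues = np.linalg.eigvalsh(matrix)  # eigenvalues of a symmetric matrix
    return bool(np.all(eigenvalues > tol))


# Hypothetical 2 x 2 G matrices (random intercept, random slope).
G_ok = np.array([[4.0, 1.0],
                 [1.0, 2.0]])

# Positive diagonal, but the implied correlation is 3 / sqrt(4 * 1) = 1.5.
G_bad = np.array([[4.0, 3.0],
                  [3.0, 1.0]])

# Slope variance estimated at exactly zero: on the boundary.
G_boundary = np.array([[4.0, 0.0],
                       [0.0, 0.0]])

for name, G in [("G_ok", G_ok), ("G_bad", G_bad), ("G_boundary", G_boundary)]:
    print(name, is_positive_definite(G))
```

Note that `G_bad` has perfectly respectable positive variances on its diagonal and still fails the check, because its covariance is too large for those variances.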

The Hessian matrix referred to in the warning messages is closely tied to this G matrix. It is the matrix of second derivatives of the log-likelihood with respect to the covariance parameters (the elements of G and R), and it is used to calculate the standard errors of those covariance parameters. So the algorithms that estimate them get stuck and can’t find an optimised solution if the Hessian calculated for the model is not positive definite.
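To see how a Hessian turns into standard errors, here is a sketch on a deliberately simple stand-in: an ordinary normal sample with unknown mean and variance, not a mixed model. The data are simulated, and the helper `numerical_hessian` is a hypothetical finite-difference routine written for this example; real mixed-model software uses its own, more careful derivatives.

```python
import numpy as np


def neg_log_lik(params, y):
    """Negative log-likelihood of an i.i.d. normal sample; params = (mean, variance)."""
    mu, sigma2 = params
    return 0.5 * y.size * np.log(2 * np.pi * sigma2) + np.sum((y - mu) ** 2) / (2 * sigma2)


def numerical_hessian(func, params, step=1e-4):
    """Central finite-difference approximation to the matrix of second derivatives."""
    params = np.asarray(params, dtype=float)
    k = params.size
    hessian = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            e_i = np.zeros(k)
            e_j = np.zeros(k)
            e_i[i] = step
            e_j[j] = step
            hessian[i, j] = (
                func(params + e_i + e_j) - func(params + e_i - e_j)
                - func(params - e_i + e_j) + func(params - e_i - e_j)
            ) / (4 * step ** 2)
    return 0.5 * (hessian + hessian.T)  # remove tiny asymmetries from rounding


rng = np.random.default_rng(0)
y = rng.normal(loc=10.0, scale=2.0, size=200)  # simulated data

mle = np.array([y.mean(), y.var()])  # maximum likelihood estimates
hessian = numerical_hessian(lambda p: neg_log_lik(p, y), mle)

# At a proper minimum of the negative log-likelihood, every eigenvalue is positive.
print("eigenvalues:", np.linalg.eigvalsh(hessian))

# Standard errors: square roots of the diagonal of the inverse Hessian.
standard_errors = np.sqrt(np.diag(np.linalg.inv(hessian)))
print("standard errors:", standard_errors)
```

If one of those eigenvalues were zero or negative, the inverse would either not exist or would produce negative "variances", and the square roots in the last step would fail. That is the situation the warning message is describing.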

So whatever results you get out of the mixed model can’t be trusted. What it means is that the model you specified couldn’t be estimated properly from your data. Some might choose to ignore this warning and move ahead, but my request is: please don’t!!! This warning is indeed important, and NO, the software doesn’t have a vendetta against you or your project.

The next step is obviously to ask what you can do in this circumstance and what the solution might be. One thing to do is check the scaling of the predictor variables in your model. If their scales differ greatly (say, one predictor is measured in the tens and another in the tens of thousands), that can be reason enough for the software to have trouble calculating the variances. In that case, simply rescaling the predictors can solve your problem.
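One common way to put predictors on a comparable scale is to standardise them, i.e. subtract the mean and divide by the standard deviation. The sketch below does this with NumPy; the function name `standardise` and the two predictors are invented for illustration.

```python
import numpy as np


def standardise(column):
    """Centre a predictor at 0 and scale it to a sample standard deviation of 1."""
    column = np.asarray(column, dtype=float)
    return (column - column.mean()) / column.std(ddof=1)


# Hypothetical predictors on wildly different scales.
income = np.array([32000.0, 58000.0, 41000.0, 97000.0, 66000.0])
age_decades = np.array([2.5, 4.1, 3.3, 6.0, 5.2])

print("raw spread:   ", income.std(ddof=1), age_decades.std(ddof=1))
print("scaled spread:", standardise(income).std(ddof=1), standardise(age_decades).std(ddof=1))
```

Rescaling changes the units in which the coefficients are reported, not the fit itself, so remember to interpret the estimates per standard deviation rather than per original unit. Dividing by a constant (income in thousands, say) is a simpler alternative when you want to keep interpretable units.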

Another thing to check is whether some covariance estimates are 0, are missing altogether, or come without standard errors (SPSS usually does this, and produces blank estimates). Don’t just ignore such a parameter, as something is fishy with the model itself. If the best estimate of a variance is zero, it means the data show no variation for the effect under consideration. For example, you may have introduced a random slope for an effect, but in actuality the slopes do not differ across the subjects of your study, and a random intercept alone might well explain all the variation.

So just remember: when something like this happens, the best thing to do is to respecify the random components of your model, which could mean removing a random effect. Sometimes you might feel, or have been told, that a given random effect has to be included because of the design of the study, and yet there is simply no variation in the data for it. Another option is to specify a simpler covariance structure, one with fewer unique parameters to estimate.

Let me give an example to highlight this situation:

A researcher wants to understand the behavioural responses of rats living in cages in a lab building by running standard behavioural tests. Since the cages are located on different floors and in different corners of the building, the researcher wants to see whether, before any experimentation, the rats’ responses to simple behavioural tests differ. Now let’s suppose there are 1000 rats on each floor and 10 floors in the building. That makes 10000 rats, far too many to study individually. So we take a sample of rats from each floor, and the design calls for a random intercept for each floor, to account for the fact that rats on the same floor may be more similar to each other than would be the case in a simple random sample. If this is true, we would want to estimate the variance of behavioural responses among floors.

But we know that modern animal facility guidelines call for rigorous protocols, and because of that rats are kept in similar cages under conditions that are as similar as possible. So we can easily see that there wouldn’t be much variance in the behavioural responses among the floors. This leads to the scenario I described before, i.e., variance for floors = 0: the model is unable to uniquely estimate any variation from floor to floor above and beyond the residual variance from one sampled rat to another.
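The rat example can be mimicked with simulated data. The sketch below does not fit a mixed model; it swaps in the classical one-way ANOVA (method of moments) estimator of the between-floor variance, which is simpler and makes the problem visible. The function name `floor_variance_anova`, the sample of 30 rats per floor and the response scale are all assumptions made for this illustration.

```python
import numpy as np


def floor_variance_anova(responses):
    """ANOVA estimate of between-floor variance for a balanced design.

    responses: 2-D array, one row per floor, one column per sampled rat.
    Returns (raw estimate, estimate truncated at zero, within-floor variance).
    """
    responses = np.asarray(responses, dtype=float)
    n_rats = responses.shape[1]
    floor_means = responses.mean(axis=1)
    ms_between = n_rats * floor_means.var(ddof=1)    # mean square between floors
    ms_within = responses.var(axis=1, ddof=1).mean()  # mean square within floors
    raw = (ms_between - ms_within) / n_rats
    return raw, max(raw, 0.0), ms_within


# Simulate 10 floors x 30 sampled rats with NO true floor-to-floor variation.
rng = np.random.default_rng(42)
responses = rng.normal(loc=50.0, scale=5.0, size=(10, 30))

raw, truncated, residual = floor_variance_anova(responses)
print("raw floor variance estimate:   ", raw)
print("truncated floor variance:      ", truncated)
print("residual (rat-to-rat) variance:", residual)
```

When the floors truly do not differ, the raw estimate hovers around zero and can easily come out negative, which is impossible for a variance. Likelihood-based software cannot report a negative variance, so the estimate ends up pinned at zero, and that is when the warnings start.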

Finally, another option is to use a population-averaged model instead of a linear mixed model. Population-averaged models don’t have any random effects, but they do model the correlation among the multiple responses from each sampled individual.

For more, read these:

  1. West, B. T., Welch, K. B., & Galecki, A. T. (2007). Linear mixed models: A practical guide using statistical software. New York: Chapman & Hall/CRC
  2. Linear mixed models in R- http://www.r-bloggers.com/linear-mixed-models-in-r/
  3. Model Selection in Linear Mixed Models- http://arxiv.org/pdf/1306.2427v1.pdf
  4. Hessian matrix in statistics- http://www.slideshare.net/FerrisJumah/hessin

Is consciousness a hard problem and why hasn’t it been solved yet?

The birth of self-consciousness: ‘Holy smoke, I’m standing here!’

Yeah, that’s probably how many imagine consciousness to have emerged: all in one single stroke. But is consciousness really that tractable? Is it explainable? The question of what consciousness is has dominated science and philosophy for many centuries now. Yet a satisfactory solution to this problem still eludes the best minds amongst us.

At one time, consciousness was considered a question to be pondered only by philosophers. It came into prominence with René Descartes and his theory of Cartesian dualism (though Aristotle and Plato also had their own versions of mind-body dualism). In theory, everything else you think you know about the world could be an elaborate illusion cooked up to deceive you – at this point, present-day writers invariably invoke The Matrix – but your consciousness itself can’t be illusory. On the other hand, this most certain and familiar of phenomena obeys none of the usual rules of science. It doesn’t seem to be physical. It can’t be observed, except from within, by the conscious person. It can’t even really be described. The mind, Descartes concluded, must be made of some special, immaterial stuff that didn’t abide by the laws of nature; it had been bequeathed to us by God. This whole dualist regime persisted until the 18th century, when physicalism came into the uncharted region of neurology.

And yet, even as neuroscience gathered pace in the 20th century, no convincing alternative explanation was forthcoming. So little by little, the topic became taboo. Few people doubted that the brain and mind were very closely linked. But how they were linked – or if they were somehow exactly the same thing – seemed a mystery best left to philosophers in their armchairs. As late as 1989, writing in the International Dictionary of Psychology, the British psychologist Stuart Sutherland could irascibly declare of consciousness that “it is impossible to specify what it is, what it does, or why it evolved. Nothing worth reading has been written on it.”

Then, at a conference held in Arizona in 1994, up came a chap dressed all in jeans, who looked more like he belonged at a rock concert than at the established, grumpy conference where he gave his talk. What he did was introduce consciousness as a hard problem. He readily acknowledged the advances in science that had done so much to explain the inner workings of the brain, but he asked how you explain sensations, such as colours and tastes. Can we scientifically explain how a network of interconnected neurons leads to something as highly subjective as sensation? That chap, David Chalmers, proposed his zombie thought experiment, in which a zombie is a hypothetical being that is indistinguishable from a normal human being except that it lacks conscious experience, qualia, or sentience. For example, a philosophical zombie could be poked with a sharp object and not feel any pain, yet behave exactly as if it did feel pain (it may say “ouch” and recoil from the stimulus, or say that it is in intense pain).

The notion of a philosophical zombie is used mainly in thought experiments intended to support arguments (often called “zombie arguments”) against forms of physicalism such as materialism, behaviorism and functionalism. Physicalism is the idea that all aspects of human nature can be explained by physical means: specifically, all aspects of human nature and perception can be explained from a neurobiological standpoint. Some philosophers, like David Chalmers, argue that since a zombie is defined as physiologically indistinguishable from human beings, even its logical possibility would be a sound refutation of physicalism. However, philosophers like Daniel Dennett counter that Chalmers’s physiological zombies are logically incoherent and thus impossible.

Ever since then, research into this area, long abandoned by mainstream science, has simply exploded. An early convert to the question of consciousness was the Nobel Prize winner Francis Crick.

Upon taking up work in theoretical neuroscience, Crick was struck by several things:

  • there were many isolated subdisciplines within neuroscience with little contact between them
  • many people who were interested in behaviour treated the brain as a black box
  • consciousness was viewed as a taboo subject by many neurobiologists

Crick hoped he might aid progress in neuroscience by promoting constructive interactions between specialists from the many different subdisciplines concerned with consciousness. He even collaborated with neurophilosophers such as Patricia Churchland. In 1983, as a result of their studies of computer models of neural networks, Crick and Mitchison proposed that the function of REM sleep is to remove certain modes of interactions in networks of cells in the mammalian cerebral cortex; they called this hypothetical process ‘reverse learning’ or ‘unlearning’. In the final phase of his career, Crick established a collaboration with Christof Koch that led to the publication of a series of articles on consciousness between 1990 and 2005. Crick made the strategic decision to focus his theoretical investigation of consciousness on how the brain generates visual awareness within a few hundred milliseconds of viewing a scene. Crick and Koch proposed that consciousness seems so mysterious because it involves very short-term memory processes that are as yet poorly understood. Crick also published a book describing how neurobiology had reached a mature enough stage for consciousness to be the subject of a unified effort to study it at the molecular, cellular and behavioural levels. Crick’s book The Astonishing Hypothesis made the argument that neuroscience now had the tools required to begin a scientific study of how brains produce conscious experiences. Crick was skeptical about the value of computational models of mental function that are not based on details about brain structure and function.

But things have now reached a stage where there are two camps: one that agrees with David Chalmers, Christof Koch and their panpsychism or collective consciousness theory, and another, led by Daniel Dennett and Patricia Churchland, which argues that consciousness might just be an emergent property of such an interconnected network and that there is nothing special about it.

Daniel Dennett argues that consciousness, as we think of it, is an illusion: there just isn’t anything in addition to the spongy stuff of the brain, and that spongy stuff doesn’t actually give rise to something called consciousness. Common sense may tell us there’s a subjective world of inner experience – but then common sense told us that the sun orbits the Earth, and that the world was flat. Consciousness, according to Dennett’s theory, is like a conjuring trick: the normal functioning of the brain just makes it look as if there is something non-physical going on. To look for a real, substantive thing called consciousness, Dennett argues, is as silly as insisting that characters in novels, such as Sherlock Holmes or Harry Potter, must be made up of a peculiar substance named “fictoplasm”; the idea is absurd and unnecessary, since the characters do not exist to begin with. It’s this criticism that hits the panpsychism idea.

“The history of science is full of cases where people thought a phenomenon was utterly unique, that there couldn’t be any possible mechanism for it, that we might never solve it, that there was nothing in the universe like it,” said Patricia Churchland of the University of California, a self-described “neurophilosopher” and one of Chalmers’s most forthright critics. Churchland’s opinion of the Hard Problem, which she expresses in caustic vocal italics, is that it is nonsense, kept alive by philosophers who fear that science might be about to eliminate one of the puzzles that has kept them gainfully employed for years. Look at the precedents: in the 17th century, scholars were convinced that light couldn’t possibly be physical – that it had to be something occult, beyond the usual laws of nature. Or take life itself: early scientists were convinced that there had to be some magical spirit – the élan vital – that distinguished living beings from mere machines. But there wasn’t, of course. Light is electromagnetic radiation; life is just the label we give to certain kinds of objects that can grow and reproduce. Eventually, neuroscience will show that consciousness is just brain states. Churchland said: “The history of science really gives you perspective on how easy it is to talk ourselves into this sort of thinking – that if my big, wonderful brain can’t envisage the solution, then it must be a really, really hard problem!”

So, with the big brain initiatives in the US and Europe, can we finally get more answers about what consciousness is? Will it turn out to be nothing more than an emergent property of neurons, or something as fundamental as a property of the universe?

For more, read these:

  1. The Stanford Encyclopedia of Philosophy
  2. Internet Encyclopedia of Philosophy
  3. Four philosophical questions to make your brain hurt