The Hitchhiker’s Guide to Survival Analysis

First published on ReadWrite, by Benedict Timmerman

Survival analysis is the best thing since sliced bread! However, in most machine learning circles, it’s pretty much synonymous with an “#it’scomplicated” relationship status.

Survival Analysis is an Extremely Valuable Branch of Statistics

We want this guide to serve you as a straightforward go-to how-to that eliminates any confusion. It is a resource on how survival analysis works and how it can be applied to, well, almost anything.

However, survival analysis is fraught with misunderstanding and misuse.

What else should I know about survival analysis?

Also referred to as “time-to-event” analysis, it is, simply put, the analysis of the time it takes for something like buying a new home (an “event”) to happen after something like getting a promotion (which we call an “exposure”).

Basically, it’s a set of statistical techniques for modeling the time to an event: literally, how long it takes for something of interest to happen. Whatever you are studying, observing, researching, or just finding interesting, survival analysis lets you estimate how long it takes to happen.

To get started, first and foremost, you need to formulate your research question in a way that suits a survival analysis approach.

Often, researchers simply frame the question in terms of ‘when’ and/or ‘whether’: the goal is a prediction of when and/or whether something will happen.

The ‘whether’ part ends in a yes-or-no determination. The ‘when’ part is an analysis of how long it takes before the event of interest happens.

When you’re analyzing how long it takes for an event to happen and whether it will happen at all, it is imperative that the event you are looking for is the same for all the subjects you examine, and that every one of them could actually experience it.

In other words, you don’t want a sample with elements that have no chance of experiencing the event. It just won’t work.

Exposure is the point when we’re off to the races and start the proverbial research clock in order to analyze any time-to-event.

The event itself, in this case buying a new home, is what follows the exposure (getting that promotion) and marks the moment when we stop the “clock.”

The time elapsed between these two points is the focus of interest which we call “the survival time.”
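As a minimal sketch of the clock described above, the survival time is simply the event date minus the exposure date. The subjects, dates, and the helper name `survival_time_days` are made up for illustration and are not from any real study.

```python
from datetime import date

# Hypothetical records: (subject, promotion date = exposure, home purchase date = event).
records = [
    ("A", date(2020, 1, 15), date(2021, 3, 1)),
    ("B", date(2019, 6, 1), date(2019, 12, 1)),
]


def survival_time_days(exposure, event):
    """Days elapsed between the exposure (clock starts) and the event (clock stops)."""
    if event < exposure:
        raise ValueError("the event cannot happen before the exposure")
    return (event - exposure).days


# Survival time per subject, in days.
survival_times = {
    subject: survival_time_days(exposure, event)
    for subject, exposure, event in records
}

for subject, days in survival_times.items():
    print(f"Subject {subject}: survival time of {days} days")
```

Any consistent unit works (days, months, years), as long as the same unit is used for every subject.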

Survival analysis is a game-changer for a diverse variety of disciplines and areas of research

Many people, however, mistakenly consider survival analysis a tool applied solely to the study of death and disease: a method for measuring the relapse of a medical condition, the potential hospitalization of a patient, or the mortality rate in medicine and biomedical disciplines.

The application of survival analysis has thankfully spread to a variety of fields and disciplines, including engineering, the social and behavioral sciences, and even professional sports.

In engineering, this process is known as “failure-time analysis” and is mostly applied to test the durability and quality of products.

Incorporating survival analysis in engineering is valuable. For example, a manufacturer may want to test how long it takes for light bulbs to burn out, measure how often the company’s computers crash, or even predict when a mechanical part like an engine head gasket will crack.

In the social sciences, survival analysis is known as “time-to-event” analysis. Scientific studies have used it to answer questions such as how long it takes for someone to get married, get a first tattoo, buy a first home, or graduate.

Medicine and biomedical research

In addition to medicine and biomedical research, JADBio can perform survival analysis on a range of unconventional cases, even ones you might consider ‘weird’, including:

Health – Obviously, in health disciplines we can estimate quantities such as the time to: death, failure of a device such as a heart pump, or readmission for a specific subset of patients.

Market – We use survival analysis more and more in market-related areas of research such as manufacturing or sales, when we want to determine the time to: a component failure in machines, a certain device becoming obsolete, or a certain patent being obtained, for example.

Finance – A valuable tool in the ever more elusive waters of finance, survival analysis can be applied to estimate the time until a hospital turns a profit or reports a loss, to calculate costs, and to estimate how long it takes for staff to burn out or become due for promotion.

Social Sciences – Especially helpful in the social sciences, where experts can now analyze the time to: divorce, a new couple’s first or second child, or a new family’s first home purchase.

Government and Social Services – In child welfare, helps determine the time to: matching children with appropriate foster parents. It is also used to optimize the length of stay of children in the program, to estimate participation time in various social programs, and to estimate the time it takes for various policies to take effect.

Law Enforcement – Predicts the time to: recidivism in criminal offenders, and with it the likelihood that they reoffend.

Marketing Operations – Assesses the time to: customers leaving loyalty programs, that is, the length of their participation.

Sports – Sports is a field where survival analysis can really be your golden goose. Sports, that’s right. In professional sports, survival analysis can change the game when it comes to estimating the time to: mechanical failure of race car engines or tires in F1, or the substitution of athletes in team sports like football.

A coach can estimate the best time to switch out a soccer player. Team doctors and government health authorities can better evaluate, and work to limit, the rate of Chronic Traumatic Encephalopathy (CTE), a degenerative brain disease observed in professional athletes, military veterans, and anyone with a history of repetitive brain trauma.

In essence, survival analysis works the same way as a tool whether we consider health disciplines, the global market, social and behavioral issues, or professional sports.

In survival analysis, survival time is the main quantity of interest.

We perform survival analysis on subjects for which the event occurs after some delay, and our goal is to measure that delay: how long it takes for the event to happen.

It is irrelevant whether the event carries a positive or negative connotation. The event may very well be death (negative), yet it can also be a new promotion (positive).

Although initially developed in the biomedical sciences to analyze time to death, either of patients or of laboratory animals, survival analysis is now widely used in engineering, economics, finance, healthcare, marketing, and public policy. It can be used to predict when a patient will die, when a cancer will metastasize, or the timing of anything else you are trying to predict.

Our Secret Special Sauce

At the core of this work is JADBio. JADBio systematically compares the performance and stability of a selection of machine learning algorithms and feature selection methods that are suitable for high-dimensional, heterogeneous, censored clinical data and other forms of data. The goal is to provide specific, accurate, and actionable predictions from that data.

Advances in modern data collection techniques will produce ever-larger clinical and other data sets. It is therefore imperative to identify methods that can analyze high-dimensional, heterogeneous survival data.

JADBio has a world-class team and builds a range of machine learning algorithms capable of analyzing many types of data, giving clients the power to make decisions and steer their objectives toward success.

Definitions of standard terms in survival analysis:

  • Event: Death, disease occurrence, disease recurrence, recovery, or other experience of interest.
  • Time: The time from the beginning of an observation period (such as surgery or beginning treatment) to (i) an event, or (ii) end of the study, or (iii) loss of contact or withdrawal from the study.
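
To make these two definitions concrete, here is a rough sketch of how an (event, time) pair per subject can be turned into a survival curve. It uses the Kaplan-Meier estimator, a standard textbook method that this article does not itself name and that is not presented here as JADBio’s own approach. Subjects whose time ends for reason (ii) or (iii) are treated as censored, meaning the event was not observed for them. The data and the function name `kaplan_meier` are made up for illustration.

```python
from collections import Counter


def kaplan_meier(times, observed):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times    -- follow-up time per subject (same unit for everyone)
    observed -- True if the event happened at that time, False if censored
    """
    events = Counter(t for t, seen in zip(times, observed) if seen)
    exits = Counter(times)  # events and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Fraction of subjects still at risk who get past time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Made-up follow-up times in months; False marks a censored subject.
times = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]

curve = kaplan_meier(times, observed)
for t, s in curve:
    print(f"month {t}: estimated survival probability {s:.2f}")
```

Censored subjects still count toward the number at risk up to the moment they drop out, which is why they should not simply be deleted from the data set.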

Benedict Timmerman is a Senior IT Experience Analyst supporting Digital Giraffe’s clients operating within the AI industry. Benedict covers data and machine learning solutions, providing quantitative and qualitative analysis on the available practices, people and markets. Benedict also spearheads the company’s lead generation process for its clients by designing outreach campaigns.