Methods for Survival and Duration Analysis
statsmodels.duration implements several standard methods for
working with censored data. These methods are most commonly used when
the data consist of durations between an origin time point and the
time at which some event of interest occurred. A typical example is a
medical study in which the origin is the time at which a subject is
diagnosed with some condition, and the event of interest is death (or
disease progression, recovery, etc.).
Currently only right censoring is handled. Right censoring occurs when we know that an event occurred after a given time t, but we do not know the exact event time.
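As a rough illustration of how right-censored data are usually encoded, the sketch below turns true event times and follow-up cutoffs into (observed time, status) pairs. The helper name `right_censor` and the numbers are hypothetical and are not part of statsmodels.

```python
def right_censor(event_times, censor_times):
    """Return (observed_time, status) pairs; status 1 = event seen, 0 = censored."""
    observed = []
    for event, censor in zip(event_times, censor_times):
        if event <= censor:
            # The event happened while the subject was still under observation.
            observed.append((event, 1))
        else:
            # We only know that the event happened some time after `censor`.
            observed.append((censor, 0))
    return observed


# Hypothetical true event times and follow-up cutoffs, in days.
pairs = right_censor([120, 400, 95], [365, 365, 60])
print(pairs)  # [(120, 1), (365, 0), (60, 0)]
```

The time and status columns produced this way correspond to the two arguments passed to SurvfuncRight in the examples on this page.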
Survival function estimation and inference
The statsmodels.api.SurvfuncRight class can be used to
estimate a survival function using data that may be right censored.
SurvfuncRight implements several inference procedures including
confidence intervals for survival distribution quantiles, pointwise
and simultaneous confidence bands for the survival function, and
plotting procedures. The duration.survdiff function provides
testing procedures for comparing survival distributions.
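To make the idea of estimating a survival function from right-censored data concrete, here is a minimal pure-Python sketch of the standard Kaplan-Meier (product-limit) estimator. It is an illustration under textbook assumptions, not the SurvfuncRight implementation; the function name `kaplan_meier` and the toy data are hypothetical.

```python
def kaplan_meier(times, status):
    """Product-limit estimate: list of (event_time, S(t)) at distinct event times."""
    data = sorted(zip(times, status))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Both events and censored subjects leave the risk set after t.
        n_at_risk -= removed
    return curve


# Toy data: status 1 = event observed, 0 = right censored.
for t, s in kaplan_meier([1, 2, 2, 3, 4, 5], [1, 1, 0, 1, 0, 1]):
    print(t, round(s, 4))
```

Censored subjects never lower the curve directly; they only shrink the risk set used at later event times.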
Here we create a SurvfuncRight object using data from the
flchain study, which is available through the R datasets repository.
We fit the survival distribution only for the female subjects.
In [1]: import statsmodels.api as sm
In [2]: data = sm.datasets.get_rdataset("flchain", "survival", cache=True).data
In [3]: df = data.loc[data.sex == "F", :]
In [4]: sf = sm.SurvfuncRight(df["futime"], df["death"])
The main features of the fitted survival distribution can be seen by
calling the summary method:
In [5]: sf.summary().head()
We can obtain point estimates and confidence intervals for quantiles of the survival distribution. Since only around 30% of the subjects died during this study, we can only estimate quantiles below the 0.3 probability point:
In [6]: sf.quantile(0.25)
In [7]: sf.quantile_ci(0.25)
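The restriction to quantiles below the 0.3 probability point can be seen in a small sketch. Here the p-th quantile is taken to be the first event time at which the estimated survival probability falls below 1 - p; this convention, the helper name `km_quantile`, and the toy curve are assumptions for illustration and may differ in detail from what SurvfuncRight.quantile does.

```python
def km_quantile(curve, p):
    """First time at which S(t) falls below 1 - p, or None if it never does.

    `curve` is a list of (time, survival_probability) pairs in time order.
    """
    for t, s in curve:
        if s < 1.0 - p:
            return t
    return None


# Hypothetical survival curve that only drops to 0.72 by the end of follow-up.
curve = [(1, 0.9), (3, 0.8), (7, 0.72)]

print(km_quantile(curve, 0.25))  # 7
print(km_quantile(curve, 0.5))   # None: the curve never falls below 0.5
```

When too few events are observed, the curve never gets low enough and higher quantiles are simply not estimable from the data.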
To plot a single survival function, call the plot method:
In [8]: sf.plot()

Since this is a large dataset with a lot of censoring, we may prefer not to plot the censoring symbols:
In [9]: fig = sf.plot()
In [10]: ax = fig.get_axes()[0]
In [11]: pt = ax.get_lines()[1]
In [12]: pt.set_visible(False)

We can also add a 95% simultaneous confidence band to the plot. Typically these bands are plotted only for the central part of the distribution.
In [13]: fig = sf.plot()
In [14]: lcb, ucb = sf.simultaneous_cb()
In [15]: ax = fig.get_axes()[0]
In [16]: ax.fill_between(sf.surv_times, lcb, ucb, color='lightgrey')
In [17]: ax.set_xlim(365, 365*10)
In [18]: ax.set_ylim(0.7, 1)
In [19]: ax.set_ylabel("Proportion alive")
In [20]: ax.set_xlabel("Days since enrollment")
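For contrast with a simultaneous band, the simpler pointwise interval can be sketched with Greenwood's variance formula. This is a different and narrower construction than the simultaneous band, and it is not the statsmodels code; the function name `greenwood_bands`, the plain (untransformed) normal interval, and the input layout are assumptions made for illustration.

```python
import math


def greenwood_bands(event_table, z=1.96):
    """Pointwise intervals for a Kaplan-Meier curve.

    `event_table` holds (time, deaths, n_at_risk) at distinct event times,
    in time order. Returns (time, lower, estimate, upper) tuples.
    """
    surv = 1.0
    var_sum = 0.0
    bands = []
    for t, d, n in event_table:
        surv *= 1.0 - d / n
        if n > d:
            # Greenwood's running sum: d_i / (n_i * (n_i - d_i)).
            var_sum += d / (n * (n - d))
        se = surv * math.sqrt(var_sum)
        # Clip to [0, 1] because the plain interval can leave that range.
        bands.append((t, max(0.0, surv - z * se), surv, min(1.0, surv + z * se)))
    return bands


for row in greenwood_bands([(1, 1, 10), (2, 2, 8)]):
    print(row)
```

A pointwise interval covers the survival probability at one fixed time with the stated confidence, whereas a simultaneous band is wider because it is meant to cover the whole curve over a range of times.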

Here we plot survival functions for two groups (females and males) on the same axes:
In [21]: import matplotlib.pyplot as plt
In [22]: gb = data.groupby("sex")
In [23]: ax = plt.axes()
In [24]: sexes = []
In [25]: for g in gb:
   ....:     sexes.append(g[0])
   ....:     sf = sm.SurvfuncRight(g[1]["futime"], g[1]["death"])
   ....:     sf.plot(ax)
   ....:
In [26]: li = ax.get_lines()
In [27]: li[1].set_visible(False)
In [28]: li[3].set_visible(False)
In [29]: plt.figlegend((li[0], li[2]), sexes, loc="center right")
In [30]: plt.ylim(0.6, 1)
Out[30]: (0.6, 1.0)
In [31]: ax.set_ylabel("Proportion alive")
Out[31]: Text(0, 0.5, 'Proportion alive')
In [32]: ax.set_xlabel("Days since enrollment")
Out[32]: Text(0.5, 0, 'Days since enrollment')

We can formally compare two survival distributions with survdiff,
which implements several standard nonparametric procedures. The
default procedure is the logrank test:
In [33]: stat, pv = sm.duration.survdiff(data.futime, data.death, data.sex)
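To show what a logrank comparison computes, here is a minimal pure-Python sketch of the textbook two-group statistic: at each event time the observed number of events in one group is compared with the number expected under a common hazard. It is not the survdiff implementation; the function name `logrank` and the toy data are hypothetical.

```python
import math


def logrank(times, status, group):
    """Two-group logrank chi-square statistic and p-value (1 degree of freedom)."""
    labels = sorted(set(group))
    if len(labels) != 2:
        raise ValueError("exactly two groups are required")
    first = labels[0]
    event_times = sorted({t for t, s in zip(times, status) if s == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)  # pooled number at risk
        n1 = sum(1 for u, g in zip(times, group) if u >= t and g == first)
        d = sum(1 for u, s in zip(times, status) if u == t and s == 1)
        d1 = sum(1 for u, s, g in zip(times, status, group)
                 if u == t and s == 1 and g == first)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    stat = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    pvalue = math.erfc(math.sqrt(stat / 2.0))
    return stat, pvalue


# Toy data in which group "A" fails earlier than group "B".
stat, pv = logrank([1, 2, 3, 4, 5, 6], [1] * 6, ["A", "A", "A", "B", "B", "B"])
print(round(stat, 3), round(pv, 3))
```

The sketch assumes at least one event time at which both groups are still at risk; otherwise the variance is zero and the statistic is undefined.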
Here are some of the other testing procedures implemented by survdiff:
# Fleming-Harrington with p=1, i.e. weight by the pooled survival function estimate
In [34]: stat, pv = sm.duration.survdiff(data.futime, data.death, data.sex, weight_type='fh', fh_p=1)
# Gehan-Breslow, weight by number at risk
In [35]: stat, pv = sm.duration.survdiff(data.futime, data.death, data.sex, weight_type='gb')
# Tarone-Ware, weight by the square root of the number at risk
In [36]: stat, pv = sm.duration.survdiff(data.futime, data.death, data.sex, weight_type='tw')
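The weighted tests differ from the logrank test only in how much weight each event time receives. The sketch below computes those weights under the usual textbook definitions; the function name `logrank_weights`, the input layout, and the choice to evaluate the pooled survival estimate just before each event time are assumptions for illustration and may not match survdiff exactly.

```python
import math


def logrank_weights(event_table, weight_type=None, fh_p=1.0):
    """Weights per event time for weighted logrank tests.

    `event_table` holds (n_at_risk, deaths) for the pooled sample at each
    distinct event time, in time order.
    """
    weights = []
    surv_before = 1.0  # pooled Kaplan-Meier estimate just before this time
    for n, d in event_table:
        if weight_type is None:
            w = 1.0  # plain logrank: every event time counts equally
        elif weight_type == "gb":
            w = float(n)  # Gehan-Breslow: number at risk
        elif weight_type == "tw":
            w = math.sqrt(n)  # Tarone-Ware: square root of number at risk
        elif weight_type == "fh":
            w = surv_before ** fh_p  # Fleming-Harrington: pooled survival ** p
        else:
            raise ValueError("unknown weight_type: %r" % (weight_type,))
        weights.append(w)
        surv_before *= 1.0 - d / n
    return weights


table = [(4, 1), (2, 1)]
print(logrank_weights(table, "gb"))  # [4.0, 2.0]
print(logrank_weights(table, "fh"))  # [1.0, 0.75]
```

All three alternatives put more weight on early event times, when the risk set and the pooled survival estimate are largest, so they are more sensitive to early differences between groups.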
Regression methods
Proportional hazards regression models (“Cox models”) are a regression technique for censored data. They allow variation in the time to an event to be explained in terms of covariates, similar to what is done in a linear or generalized linear regression model. These models express the covariate effects in terms of “hazard ratios”, meaning that the hazard (instantaneous event rate) is multiplied by a given factor depending on the values of the covariates.
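The hazard ratio interpretation can be sketched in a few lines. Under the usual Cox form, the hazard is a baseline hazard multiplied by exp(x · beta), so changing the covariates multiplies the hazard by a factor that does not depend on time. The code below is a minimal illustration that also evaluates the log partial likelihood for data without tied event times; the function names and data are hypothetical, and this is not how PHReg is implemented.

```python
import math


def hazard_ratio(beta, x_new, x_ref):
    """Multiplicative change in hazard when covariates move from x_ref to x_new."""
    return math.exp(sum(b * (a - r) for b, a, r in zip(beta, x_new, x_ref)))


def partial_loglike(beta, times, status, covariates):
    """Cox log partial likelihood, assuming no tied event times."""
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    total = 0.0
    for i, (t, s) in enumerate(zip(times, status)):
        if s == 1:
            # Risk set: everyone still under observation at time t.
            risk = sum(math.exp(scores[j]) for j, u in enumerate(times) if u >= t)
            total += scores[i] - math.log(risk)
    return total


# A coefficient of log(2) doubles the hazard per unit increase in the covariate.
print(hazard_ratio([math.log(2)], [1.0], [0.0]))

# Log partial likelihood at beta = 0 for three uncensored subjects.
print(partial_loglike([0.0], [1, 2, 3], [1, 1, 1], [[0.0], [1.0], [0.0]]))
```

Fitting the model amounts to maximizing this function over beta; the baseline hazard cancels out of every term, which is why it never has to be specified.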
In [37]: import statsmodels.api as sm
In [38]: import statsmodels.formula.api as smf
In [39]: data = sm.datasets.get_rdataset("flchain", "survival", cache=True).data
In [40]: del data["chapter"]
In [41]: data = data.dropna()
In [42]: data["lam"] = data["lambda"]
In [43]: data["female"] = (data["sex"] == "F").astype(int)
In [44]: data["year"] = data["sample.yr"] - min(data["sample.yr"])
In [45]: status = data["death"].values
In [46]: mod = smf.phreg("futime ~ 0 + age + female + creatinine + np.sqrt(kappa) + np.sqrt(lam) + year + mgus", data, status=status, ties="efron")
In [47]: rslt = mod.fit()
In [48]: print(rslt.summary())
See Examples for more detailed examples.
There are some notebook examples on the Wiki: Wiki notebooks for PHReg and Survival Analysis
References
References for Cox proportional hazards regression model:
T Therneau (1996). Extending the Cox model. Technical report.
http://www.mayo.edu/research/documents/biostat-58pdf/DOC-10027288
G Rodriguez (2005). Non-parametric estimation in survival models.
http://data.princeton.edu/pop509/NonParametricSurvival.pdf
B Gillespie (2006). Checking the assumptions in the Cox proportional
hazards model.
http://www.mwsug.org/proceedings/2006/stats/MWSUG-2006-SD08.pdf
Module Reference
The class for working with survival distributions is:
Estimation and inference for a survival function.
The proportional hazards regression model class is:
Cox Proportional Hazards Regression Model
The proportional hazards regression result class is:
Class to contain results of fitting a Cox proportional hazards survival model.
The primary helper class is:
A class representing a collection of discrete distributions.