JHU Coursera Rprogramming Course Project 3

医院病例数据分析项目

JHU DataScience Specialization/Cousers Rprogramming/Week3/Course Project 3

作业练习目标:通过分析医院数据,编写函数,通过函数分析各州医院指定的病例排名

数据文档打包

  • [x] 数据源:outcome-of-care-measures.csv

Contains information about THIRTY(30)-day mortality and readmission rates for heart attacks,heart failure, and pneumonia for over FOUR THOUSAND (4,000) hospitals;

  • [x] 说明书:Hospital_Revised_Flatfiles.pdf

30天死亡最率最低的医院

输出指定州(例子德州TX)函数(best)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
best <- function(state, outcome) {

## Read the outcome data
dat <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
## Check that state and outcome are valid 病例分类,判断州名称
if (!state %in% unique(dat[, 7])) {
stop("invalid state")
}
switch(outcome, `heart attack` = {
col = 11
}, `heart failure` = {
col = 17
}, pneumonia = {
col = 23
}, stop("invalid outcome"))
## Return hospital name in that state with lowest 30-day death rate
df = dat[dat$State == state, c(2, col)]
df[which.min(df[, 2]), 1]
}

例子:输出指定州(德州TX)30天死亡最率最低的医院

1
## Warning in which.min(df[, 2]): 强制改变过程中产生了NA
1
## [1] "Hospital with lowest 30-day death rate is: CYPRESS FAIRBANKS MEDICAL CENTER"

心脏病死亡率最低医院

输出前10字母排名州的心脏病死亡最低医院(rankall)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
rankall <- function(outcome="heart attack", num=1) {
## Read the outcome data
dat <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
## Check that state and outcome are valid 病例分类
#"heart attack心脏病""heart failure心衰竭""pneumonia肺炎"
states = unique(dat[, 7])
switch(outcome, `heart attack` = {
col = 11
}, `heart failure` = {
col = 17
}, pneumonia = {
col = 23
}, stop("invalid outcome"))

## Return hospital name in that state with the given rank 30-day death rate
## 在给定等级30天患病死亡率的情况下,返回该州的医院名称
dat[, col] = as.numeric(dat[, col])
dat = dat[, c(2, 7, col)] # leave only name, state, and death rate
dat = na.omit(dat)
# 不清楚可以查看医院dat结构 head(dat)
# 医院在本州的排名
rank_in_state <- function(state) {
df = dat[dat[, 2] == state, ]
nhospital = nrow(df)
switch(num, best = {
num = 1
}, worst = {
num = nhospital
})
if (num > nhospital) {
result = NA
}
o = order(df[, 3], df[, 1])
result = df[o, ][num, 1]
c(result, state)
}
output = do.call(rbind, lapply(states, rank_in_state))
output = output[order(output[, 2]), ]
output = cbind(1:dim(output)[1],output)
colnames(output) = c("No.","hospital", "state")
data.frame(output)
}

例子:二十间心脏病死亡率最低医院A-Z

1
pander(head(rankall("heart attack", 1), 20))
No. hospital state
1 PROVIDENCE ALASKA MEDICAL CENTER AK
2 CRESTWOOD MEDICAL CENTER AL
3 ARKANSAS HEART HOSPITAL AR
4 MAYO CLINIC HOSPITAL AZ
5 GLENDALE ADVENTIST MEDICAL CENTER CA
6 ST MARYS HOSPITAL AND MEDICAL CENTER CO
7 WATERBURY HOSPITAL CT
8 PROVIDENCE HOSPITAL DC
9 BAYHEALTH - KENT GENERAL HOSPITAL DE
10 MOUNT SINAI MEDICAL CENTER FL
11 STEPHENS COUNTY HOSPITAL GA
12 GUAM MEMORIAL HOSPITAL AUTHORITY GU
13 HILO MEDICAL CENTER HI
14 MARY GREELEY MEDICAL CENTER IA
15 PORTNEUF MEDICAL CENTER ID
16 SAINT JOSEPH HOSPITAL IL
17 ST VINCENT HEART CENTER OF INDIANA LLC IN
18 KANSAS HEART HOSPITAL KS
19 ST ELIZABETH MEDICAL CENTER NORTH KY
20 ST FRANCIS MEDICAL CENTER LA

综合函数,指定医院指定,指定病例,指定排位

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
rankhospital <- function(state, outcome, num = "best") {

## Read the outcome data
dat <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
## Check that state and outcome are valid
if (!state %in% unique(dat[, 7])) {
stop("invalid state")
}
switch(outcome, `heart attack` = {
col = 11
}, `heart failure` = {
col = 17
}, pneumonia = {
col = 23
}, stop("invalid outcome"))
dat[, col] = as.numeric(dat[, col])
df = dat[dat[, 7] == state, c(2, col)]
df = na.omit(df)
nhospital = nrow(df)
switch(num, best = {
num = 1
}, worst = {
num = nhospital
})
if (num > nhospital) {
return(NA)
}

o = order(df[, 2], df[, 1])
df[o, ][num, 1]
}

例子:德州心脏病死亡率第四名低的医院

1
## Warning in rankhospital("TX", "heart attack", 4): 强制改变过程中产生了NA
1
## [1] "Texias with Heart Attack and lowest 4th death rate is: PARKLAND HEALTH AND HOSPITAL SYSTEM"
THE END