class: center, middle, inverse, title-slide

# How to Use the `across()` function
## mutate or summarize across many columns at once
### Phanikumar S Tata
### 2021-09-10

---

### How to Use `across()` to Mutate Many Columns at Once

#### This is awfully convenient

Format: <br>
mutate(across(logic.test = which variables, _function to apply_))

(In dplyr these two arguments are named `.cols` and `.fns`.)

Example 1: mutate numeric to factor

Some numeric variables related to tumor volume (`TVol`) and stage (`T.Stage`) should actually be factors.

---
count: false

Mutate across (selected) numeric type to convert to factor type

.panel1-across1-auto[
```r
*data
```
]

.panel2-across1-auto[
```
# A tibble: 316 x 20
   RBC.Age.Group Median.RBC.Age   Age    AA FamHx  PVol  TVol T.Stage   bGS
           <dbl>          <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl> <dbl>
 1             3             25  72.1     0     0  54       3       1     3
 2             3             25  73.6     0     0  43.2     3       2     2
 3             3             25  67.5     0     0 103.      1       1     3
 4             2             15  65.8     0     0  46       1       1     1
 5             2             15  63.2     0     0  60       2       1     2
 6             3             25  65.4     0     0  45.9     2       1     1
 7             3             25  65.5     1     0  42.6     2       1     1
 8             1             10  67.1     0     0  40.7     3       1     1
 9             1             10  63.9     0     0  45       2       1     1
10             2             15  63       1     0  67.6     2       1     2
# ... with 306 more rows, and 11 more variables: BN+ <dbl>,
#   OrganConfined <dbl>, PreopPSA <dbl>, PreopTherapy <dbl>, Units <dbl>,
#   sGS <dbl>, AnyAdjTherapy <dbl>, AdjRadTherapy <dbl>, Recurrence <dbl>,
#   Censor <dbl>, TimeToRecurrence <dbl>
```
]

---
count: false

Mutate across (selected) numeric type to convert to factor type

.panel1-across1-auto[
```r
data %>%
  # select the variables needed
* select(Age, starts_with('T'))
```
]

.panel2-across1-auto[
```
# A tibble: 316 x 4
     Age  TVol T.Stage TimeToRecurrence
   <dbl> <dbl>   <dbl>            <dbl>
 1  72.1     3       1             2.67
 2  73.6     3       2            47.6
 3  67.5     1       1            14.1
 4  65.8     1       1            59.5
 5  63.2     2       1             1.23
 6  65.4     2       1            74.7
 7  65.5     2       1            13.9
 8  67.1     3       1             8.37
 9  63.9     2       1            48.6
10  63       2       1            22.6
# ... with 306 more rows
```
]

---
count: false

Mutate across (selected) numeric type to convert to factor type

.panel1-across1-auto[
```r
data %>%
  # select the variables needed
  select(Age, starts_with('T')) %>%
  # drop TimeToRecurrence, then check out the variable types
* select(-starts_with('Time'))
```
]

.panel2-across1-auto[
```
# A tibble: 316 x 3
     Age  TVol T.Stage
   <dbl> <dbl>   <dbl>
 1  72.1     3       1
 2  73.6     3       2
 3  67.5     1       1
 4  65.8     1       1
 5  63.2     2       1
 6  65.4     2       1
 7  65.5     2       1
 8  67.1     3       1
 9  63.9     2       1
10  63       2       1
# ... with 306 more rows
```
]

---
count: false

Mutate across (selected) numeric type to convert to factor type

.panel1-across1-auto[
```r
data %>%
  # select the variables needed
  select(Age, starts_with('T')) %>%
  # drop TimeToRecurrence, then check out the variable types
  select(-starts_with('Time')) %>%
  # now do mutate across
* mutate(across(starts_with('T'),
*               factor))
```
]

.panel2-across1-auto[
```
# A tibble: 316 x 3
     Age TVol  T.Stage
   <dbl> <fct> <fct>
 1  72.1 3     1
 2  73.6 3     2
 3  67.5 1     1
 4  65.8 1     1
 5  63.2 2     1
 6  65.4 2     1
 7  65.5 2     1
 8  67.1 3     1
 9  63.9 2     1
10  63   2     1
# ... with 306 more rows
```
]

---
count: false

Mutate across (selected) numeric type to convert to factor type

.panel1-across1-auto[
```r
data %>%
  # select the variables needed
  select(Age, starts_with('T')) %>%
  # drop TimeToRecurrence, then check out the variable types
  select(-starts_with('Time')) %>%
  # now do mutate across
  mutate(across(starts_with('T'),
                factor))
# see how variable types have changed
#
# Format:
*# mutate(across(logic.test,function))
# mutate(across(logic.test,function))
```
]

.panel2-across1-auto[
```
# A tibble: 316 x 3
     Age TVol  T.Stage
   <dbl> <fct> <fct>
 1  72.1 3     1
 2  73.6 3     2
 3  67.5 1     1
 4  65.8 1     1
 5  63.2 2     1
 6  65.4 2     1
 7  65.5 2     1
 8  67.1 3     1
 9  63.9 2     1
10  63   2     1
# ... with 306 more rows
```
]

<style>
.panel1-across1-auto {
  color: black;
  width: 38.6060606060606%;
  hight: 32%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
.panel2-across1-auto {
  color: black;
  width: 59.3939393939394%;
  hight: 32%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
.panel3-across1-auto {
  color: black;
  width: NA%;
  hight: 33%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
</style>
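
---

### A sketch: the same pattern with summarise()

The title also mentions summarizing, so here is a minimal sketch of the same `across()` pattern inside `summarise()`. The `toy` tibble is hypothetical: it borrows three column names from the data above, but the four rows are only there to make the example self-contained. Using `where(is.numeric)` as the selector and `mean` as the function is an assumption for illustration, not something taken from the original analysis.

```r
library(dplyr)

# Hypothetical toy data: column names mimic the slides, rows are illustrative only
toy <- tibble(
  Age     = c(72.1, 73.6, 67.5, 65.8),
  TVol    = c(3, 3, 1, 1),
  T.Stage = c(1, 2, 1, 1)
)

# mutate: convert every column whose name starts with "T" to a factor
toy_fct <- toy %>%
  mutate(across(starts_with("T"), factor))

# summarise: take the mean of every column that is still numeric (here only Age)
toy_means <- toy_fct %>%
  summarise(across(where(is.numeric), mean))

toy_means
```

In both verbs the first argument of `across()` picks the columns (a selection helper such as `starts_with()` or a test wrapped in `where()`), and the second is the function to apply to each of them.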