Variable Types

September 23 + 25, 2024

Jo Hardin

Variable Types

Some new variable types:

character strings
factor variables
dates
numeric
logical

A variable’s type determines the values that the variable can take on and the operations that can be performed on it. Specifying variable types helps protect the data’s integrity and can improve performance.
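
As a small illustration of how type controls the available operations, the sketch below builds one value of each type listed above and checks its class. The object names (`x_chr`, `x_fct`, and so on) are made up for this example.

```r
# One value of each variable type (object names are illustrative only)
x_chr  <- "banana"               # character string
x_fct  <- factor("banana")       # factor
x_date <- as.Date("2024-09-23")  # date
x_num  <- 3.14                   # numeric
x_lgl  <- TRUE                   # logical

class(x_chr)   # "character"
class(x_fct)   # "factor"
class(x_date)  # "Date"
class(x_num)   # "numeric"
class(x_lgl)   # "logical"

# Arithmetic is defined for dates ...
x_date + 7     # "2024-09-30"

# ... but not for character strings:
# "banana" + 7 would stop with a "non-numeric argument" error
```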

Agenda 9/23/24

Character strings
str_*() functions
Factor variables

Character strings

When working with character strings, we might want to detect, replace, or extract certain patterns.

Strings are objects of the character class (abbreviated as <chr> in tibbles). When you print out strings, they display with double quotes:

some_string <- "banana"
some_string

[1] "banana"

Creating strings

Creating strings by hand is useful for testing out regular expressions.

To create a string, type any text in either double quotes " or single quotes '. Using double or single quotes doesn’t matter unless your string itself has single or double quotes.

string1 <- "This is a string"
string2 <- 'If I want to include a "quote" inside a string, I use single quotes'

string1

[1] "This is a string"

string2

[1] "If I want to include a \"quote\" inside a string, I use single quotes"
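
The backslashes in the printed output above are only how R displays a quote inside a string; they are not stored in the string. A minimal sketch, using the hypothetical names `string3` and `string4`, shows that escaping with a backslash and switching quote styles give the same string:

```r
# Two ways to put a double quote inside a string
string3 <- "A \"quote\" inside double quotes"  # escape with a backslash
string4 <- 'A "quote" inside double quotes'    # or switch to single quotes

identical(string3, string4)            # TRUE
nchar(string3) == nchar(string4)       # TRUE: the backslash is not a stored character

# writeLines() prints the raw contents, without escapes or outer quotes
writeLines(string3)
```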

`str_view()`

We can view these strings “naturally” (without the opening and closing quotes) with str_view():

str_view(string1)

[1] │ This is a string

str_view(string2)

[1] │ If I want to include a "quote" inside a string, I use single quotes

`str_c()`

Similar to paste0() (gluing strings together), but works well in a tidy pipeline and returns NA whenever one of the pieces is missing.

df <- tibble(name = c("Flora", "David", "Terra", NA))
df |> mutate(greeting = str_c("Hi ", name, "!"))

# A tibble: 4 × 2
  name  greeting 
  <chr> <chr>    
1 Flora Hi Flora!
2 David Hi David!
3 Terra Hi Terra!
4 <NA>  <NA>
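
The missing-value behavior in the last row is the main practical difference from base R. The sketch below compares the two on a short vector (the vector `name` and the replacement text "you" are chosen just for this example):

```r
library(stringr)

name <- c("Flora", NA)

# Base R turns the missing value into the text "NA"
paste0("Hi ", name, "!")   # "Hi Flora!" "Hi NA!"

# str_c() keeps it missing
str_c("Hi ", name, "!")    # "Hi Flora!" NA

# To greet everyone anyway, replace the NA before gluing
str_c("Hi ", str_replace_na(name, "you"), "!")   # "Hi Flora!" "Hi you!"
```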

`str_sub()`

str_sub(string, start, end) will extract parts of a string where start and end are the positions where the substring starts and ends.

fruits <- c("Apple", "Banana", "Pear")
str_sub(fruits, 1, 3)

[1] "App" "Ban" "Pea"

str_sub(fruits, -3, -1)

[1] "ple" "ana" "ear"

str_sub() won’t fail if the string is too short; it returns as much of the string as it can.

str_sub(fruits, 1, 5)

[1] "Apple" "Banan" "Pear"

`str_sub()` in a pipeline

We can use the str_*() functions inside the mutate() function.

titanic |> 
  mutate(class1 = str_sub(Class, 1, 1))

   Class    Sex   Age Survived Freq class1
1    1st   Male Child       No    0      1
2    2nd   Male Child       No    0      2
3    3rd   Male Child       No   35      3
4   Crew   Male Child       No    0      C
5    1st Female Child       No    0      1
6    2nd Female Child       No    0      2
7    3rd Female Child       No   17      3
8   Crew Female Child       No    0      C
9    1st   Male Adult       No  118      1
10   2nd   Male Adult       No  154      2
11   3rd   Male Adult       No  387      3
12  Crew   Male Adult       No  670      C
13   1st Female Adult       No    4      1
14   2nd Female Adult       No   13      2
15   3rd Female Adult       No   89      3
16  Crew Female Adult       No    3      C
17   1st   Male Child      Yes    5      1
18   2nd   Male Child      Yes   11      2
19   3rd   Male Child      Yes   13      3
20  Crew   Male Child      Yes    0      C
21   1st Female Child      Yes    1      1
22   2nd Female Child      Yes   13      2
23   3rd Female Child      Yes   14      3
24  Crew Female Child      Yes    0      C
25   1st   Male Adult      Yes   57      1
26   2nd   Male Adult      Yes   14      2
27   3rd   Male Adult      Yes   75      3
28  Crew   Male Adult      Yes  192      C
29   1st Female Adult      Yes  140      1
30   2nd Female Adult      Yes   80      2
31   3rd Female Adult      Yes   76      3
32  Crew Female Adult      Yes   20      C

`str_replace*()`

str_replace() replaces the first match of a pattern. str_replace_all() replaces all the matches of a pattern.

fruits

[1] "Apple"  "Banana" "Pear"

str_replace(fruits, "a", "x")

[1] "Apple"  "Bxnana" "Pexr"

str_replace_all(fruits, "a", "x")

[1] "Apple"  "Bxnxnx" "Pexr"

`str_detect()`

str_detect(fruits, "a")

[1] FALSE  TRUE  TRUE
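
Note that "Apple" returns FALSE above because matching is case sensitive. A short sketch of a few common variations, reusing the `fruits` vector from earlier slides:

```r
library(stringr)

fruits <- c("Apple", "Banana", "Pear")

# Case-sensitive by default: "Apple" has an "A" but no "a"
str_detect(fruits, "a")                              # FALSE TRUE TRUE

# regex() with ignore_case = TRUE matches either case
str_detect(fruits, regex("a", ignore_case = TRUE))   # TRUE TRUE TRUE

# TRUE/FALSE values can be counted
sum(str_detect(fruits, "a"))                         # 2

# Patterns are regular expressions: ^ anchors to the start of the string
str_detect(fruits, "^P")                             # FALSE FALSE TRUE
```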

`str_detect()` in pipeline

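Because str_detect() returns TRUE/FALSE, it can be used inside filter() to keep only the matching rows. Below, the original data come first and the filtered data second.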

starwars |> 
  select(name, films) |> 
  str()

tibble [87 × 2] (S3: tbl_df/tbl/data.frame)
 $ name : chr [1:87] "Luke Skywalker" "C-3PO" "R2-D2" "Darth Vader" ...
 $ films:List of 87
  ..$ : chr [1:5] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "Revenge of the Sith" ...
  ..$ : chr [1:6] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" ...
  ..$ : chr [1:7] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" ...
  ..$ : chr [1:4] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "Revenge of the Sith"
  ..$ : chr [1:5] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "Revenge of the Sith" ...
  ..$ : chr [1:3] "A New Hope" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:3] "A New Hope" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr "A New Hope"
  ..$ : chr "A New Hope"
  ..$ : chr [1:6] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" ...
  ..$ : chr [1:3] "The Phantom Menace" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:2] "A New Hope" "Revenge of the Sith"
  ..$ : chr [1:5] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "Revenge of the Sith" ...
  ..$ : chr [1:4] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "The Force Awakens"
  ..$ : chr "A New Hope"
  ..$ : chr [1:3] "A New Hope" "Return of the Jedi" "The Phantom Menace"
  ..$ : chr [1:3] "A New Hope" "The Empire Strikes Back" "Return of the Jedi"
  ..$ : chr "A New Hope"
  ..$ : chr [1:5] "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" "Attack of the Clones" ...
  ..$ : chr [1:5] "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" "Attack of the Clones" ...
  ..$ : chr [1:3] "The Empire Strikes Back" "Return of the Jedi" "Attack of the Clones"
  ..$ : chr "The Empire Strikes Back"
  ..$ : chr "The Empire Strikes Back"
  ..$ : chr [1:2] "The Empire Strikes Back" "Return of the Jedi"
  ..$ : chr "The Empire Strikes Back"
  ..$ : chr [1:2] "Return of the Jedi" "The Force Awakens"
  ..$ : chr "Return of the Jedi"
  ..$ : chr "Return of the Jedi"
  ..$ : chr "Return of the Jedi"
  ..$ : chr "Return of the Jedi"
  ..$ : chr "The Phantom Menace"
  ..$ : chr [1:3] "The Phantom Menace" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr "The Phantom Menace"
  ..$ : chr [1:3] "The Phantom Menace" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:2] "The Phantom Menace" "Attack of the Clones"
  ..$ : chr "The Phantom Menace"
  ..$ : chr "The Phantom Menace"
  ..$ : chr "The Phantom Menace"
  ..$ : chr [1:2] "The Phantom Menace" "Attack of the Clones"
  ..$ : chr "The Phantom Menace"
  ..$ : chr "The Phantom Menace"
  ..$ : chr [1:2] "The Phantom Menace" "Attack of the Clones"
  ..$ : chr "The Phantom Menace"
  ..$ : chr "Return of the Jedi"
  ..$ : chr [1:3] "The Phantom Menace" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr "The Phantom Menace"
  ..$ : chr "The Phantom Menace"
  ..$ : chr "The Phantom Menace"
  ..$ : chr "The Phantom Menace"
  ..$ : chr [1:3] "The Phantom Menace" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:3] "The Phantom Menace" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:3] "The Phantom Menace" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:2] "The Phantom Menace" "Revenge of the Sith"
  ..$ : chr [1:2] "The Phantom Menace" "Revenge of the Sith"
  ..$ : chr [1:2] "The Phantom Menace" "Revenge of the Sith"
  ..$ : chr "The Phantom Menace"
  ..$ : chr [1:3] "The Phantom Menace" "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:2] "The Phantom Menace" "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr [1:2] "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:2] "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr [1:2] "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr [1:2] "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr [1:2] "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr "Attack of the Clones"
  ..$ : chr "Attack of the Clones"
  ..$ : chr [1:2] "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr "Revenge of the Sith"
  ..$ : chr "Revenge of the Sith"
  ..$ : chr [1:2] "A New Hope" "Revenge of the Sith"
  ..$ : chr [1:2] "Attack of the Clones" "Revenge of the Sith"
  ..$ : chr "Revenge of the Sith"
  ..$ : chr "The Force Awakens"
  ..$ : chr "The Force Awakens"
  ..$ : chr "The Force Awakens"
  ..$ : chr "The Force Awakens"
  ..$ : chr "The Force Awakens"

starwars |> 
  filter(str_detect(films, "Empire")) |> 
  select(name, films) |> 
  str()

tibble [16 × 2] (S3: tbl_df/tbl/data.frame)
 $ name : chr [1:16] "Luke Skywalker" "C-3PO" "R2-D2" "Darth Vader" ...
 $ films:List of 16
  ..$ : chr [1:5] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "Revenge of the Sith" ...
  ..$ : chr [1:6] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" ...
  ..$ : chr [1:7] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" ...
  ..$ : chr [1:4] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "Revenge of the Sith"
  ..$ : chr [1:5] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "Revenge of the Sith" ...
  ..$ : chr [1:6] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" ...
  ..$ : chr [1:5] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "Revenge of the Sith" ...
  ..$ : chr [1:4] "A New Hope" "The Empire Strikes Back" "Return of the Jedi" "The Force Awakens"
  ..$ : chr [1:3] "A New Hope" "The Empire Strikes Back" "Return of the Jedi"
  ..$ : chr [1:5] "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" "Attack of the Clones" ...
  ..$ : chr [1:5] "The Empire Strikes Back" "Return of the Jedi" "The Phantom Menace" "Attack of the Clones" ...
  ..$ : chr [1:3] "The Empire Strikes Back" "Return of the Jedi" "Attack of the Clones"
  ..$ : chr "The Empire Strikes Back"
  ..$ : chr "The Empire Strikes Back"
  ..$ : chr [1:2] "The Empire Strikes Back" "Return of the Jedi"
  ..$ : chr "The Empire Strikes Back"

stringr functions

The stringr package within tidyverse contains lots of functions to help process strings. Letting x be a string variable…

str function	arguments	returns
`str_replace()`	`x`, `pattern`, `replacement`	a modified string
`str_replace_all()`	`x`, `pattern`, `replacement`	a modified string
`str_to_lower()`	`x`	a modified string
`str_to_upper()`	`x`	a modified string
`str_sub()`	`x`, `start`, `end`	a modified string
`str_length()`	`x`	a number
`str_detect()`	`x`, `pattern`	TRUE/FALSE

Use the stringr cheatsheet.
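
The functions in the table that have not appeared on earlier slides can be tried on the same small vector. This is only a quick sketch; `x` is a stand-in for any character variable.

```r
library(stringr)

x <- c("Apple", "Banana", "Pear")

str_to_lower(x)   # "apple" "banana" "pear"
str_to_upper(x)   # "APPLE" "BANANA" "PEAR"
str_length(x)     # 5 6 4
```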

Agenda 9/25/24

Factor variables
Time and date objects

Factor variables

Factor variables look like character strings, but they represent categorical data. The computer actually stores them as integers (?!?!!?) with a label for each level (abbreviated as <fct> in tibbles).

categorical variable
represented in discrete levels with an ordering
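
To see the integer storage directly, here is a minimal base R sketch. The vector of "low"/"medium"/"high" values is invented for illustration.

```r
# By default, levels are sorted alphabetically
f <- factor(c("low", "high", "medium", "low"))

levels(f)       # "high" "low" "medium"
as.integer(f)   # 2 1 3 2  (position of each value among the levels)
typeof(f)       # "integer"

# Setting the levels explicitly changes the integer codes, not the labels
f2 <- factor(f, levels = c("low", "medium", "high"))

as.integer(f2)    # 1 3 2 1
as.character(f2)  # "low" "high" "medium" "low"
```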

Where do we order?

The ordering of the factor variable comes out in:

plots (e.g., barplots)
tables (e.g., group_by())
modeling (e.g., the baseline level in a linear regression)
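
As a rough sketch of the table case, the level order (not the order of appearance in the data) decides how counts are laid out. The short `ideology` vector here is made up and is not the poll data.

```r
ideology <- c("Moderate", "Liberal", "Conservative", "Moderate")

# Default: alphabetical levels
f_default <- factor(ideology)
names(table(f_default))   # "Conservative" "Liberal" "Moderate"

# Explicit levels: the table follows the order we chose
f_ordered <- factor(ideology, levels = c("Liberal", "Moderate", "Conservative"))
names(table(f_ordered))   # "Liberal" "Moderate" "Conservative"

# The first level is also what a linear regression treats as the baseline
levels(f_ordered)[1]      # "Liberal"
```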

Order matters

SurveyUSA poll from 2012 on views of the DREAM Act.

What is off about the data viz part of the report?


openintro::dream

# A tibble: 910 × 2
   ideology     stance
   <fct>        <fct> 
 1 Conservative Yes   
 2 Conservative Yes   
 3 Conservative Yes   
 4 Conservative Yes   
 5 Conservative Yes   
 6 Conservative Yes   
 7 Conservative Yes   
 8 Conservative Yes   
 9 Conservative Yes   
10 Conservative Yes   
# ℹ 900 more rows

dream |> 
  ggplot(aes(x = ideology, fill = stance)) + 
  geom_bar()

dream |> 
  select(ideology) |> 
  pull() |>  # levels() works only on vectors, not data frames
  levels()

[1] "Conservative" "Liberal"      "Moderate"

Change the order

We can fix the order of the ideology variable.


dream |> 
  mutate(ideology = fct_relevel(ideology, 
                                c("Liberal", "Moderate", "Conservative"))) |> 
  ggplot(aes(x = ideology, fill = stance)) + 
  geom_bar()

Factor and character variables

starbucks |> 
  select(item, type, calories)

# A tibble: 77 × 3
   item                          type   calories
   <chr>                         <fct>     <int>
 1 "8-Grain Roll"                bakery      350
 2 "Apple Bran Muffin"           bakery      350
 3 "Apple Fritter"               bakery      420
 4 "Banana Nut Loaf"             bakery      490
 5 "Birthday Cake Mini Doughnut" bakery      130
 6 "Blueberry Oat Bar"           bakery      370
 7 "Blueberry Scone"             bakery      460
 8 "Bountiful Blueberry Muffin"  bakery      370
 9 "Butter Croissant "           bakery      310
10 "Cheese Danish"               bakery      420
# ℹ 67 more rows

Reorder according to another variable

Let’s say that we want to order the type of food item based on the average number of calories for that type.

starbucks |> 
  mutate(type = fct_reorder(type, calories, .fun = "mean", .desc = TRUE)) |> 
  ggplot(aes(x = type, y = calories)) + 
  geom_point() + 
  labs(x = "type of food",
       y = "",
       title = "Calories for food items at Starbucks")

Change character to factor

The original data come first, followed by the data with item converted to a factor.

starbucks

# A tibble: 77 × 7
   item                          calories   fat  carb fiber protein type  
   <chr>                            <int> <dbl> <int> <int>   <int> <fct> 
 1 "8-Grain Roll"                     350     8    67     5      10 bakery
 2 "Apple Bran Muffin"                350     9    64     7       6 bakery
 3 "Apple Fritter"                    420    20    59     0       5 bakery
 4 "Banana Nut Loaf"                  490    19    75     4       7 bakery
 5 "Birthday Cake Mini Doughnut"      130     6    17     0       0 bakery
 6 "Blueberry Oat Bar"                370    14    47     5       6 bakery
 7 "Blueberry Scone"                  460    22    61     2       7 bakery
 8 "Bountiful Blueberry Muffin"       370    14    55     0       6 bakery
 9 "Butter Croissant "                310    18    32     0       5 bakery
10 "Cheese Danish"                    420    25    39     0       7 bakery
# ℹ 67 more rows

starbucks |> 
  mutate(item = as.factor(item))

# A tibble: 77 × 7
   item                          calories   fat  carb fiber protein type  
   <fct>                            <int> <dbl> <int> <int>   <int> <fct> 
 1 "8-Grain Roll"                     350     8    67     5      10 bakery
 2 "Apple Bran Muffin"                350     9    64     7       6 bakery
 3 "Apple Fritter"                    420    20    59     0       5 bakery
 4 "Banana Nut Loaf"                  490    19    75     4       7 bakery
 5 "Birthday Cake Mini Doughnut"      130     6    17     0       0 bakery
 6 "Blueberry Oat Bar"                370    14    47     5       6 bakery
 7 "Blueberry Scone"                  460    22    61     2       7 bakery
 8 "Bountiful Blueberry Muffin"       370    14    55     0       6 bakery
 9 "Butter Croissant "                310    18    32     0       5 bakery
10 "Cheese Danish"                    420    25    39     0       7 bakery
# ℹ 67 more rows

forcats functions

The forcats package within tidyverse contains lots of functions to help process factor variables. Use the forcats cheatsheet. We’ll focus on the most common functions.

functions for changing the order of factor levels
- fct_relevel() = manually reorder levels
- fct_reorder() = reorder levels according to values of another variable
- fct_infreq() = order levels from highest to lowest frequency
- fct_rev() = reverse the current order
functions for changing the labels or values of factor levels
- fct_recode() = manually change the labels of levels
- fct_lump() = group together least common levels
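
A compact way to see what several of these functions do is to look at the levels before and after. The sketch below uses a toy factor `x` (invented for this example) in which "b" appears three times, "a" twice, and "c" once.

```r
library(forcats)

x <- factor(c("b", "a", "c", "b", "b", "a"))
levels(x)                           # "a" "b" "c"

# Manually move a level to the front
levels(fct_relevel(x, "c"))         # "c" "a" "b"

# Order levels from most to least frequent
levels(fct_infreq(x))               # "b" "a" "c"

# Reverse the current order
levels(fct_rev(x))                  # "c" "b" "a"

# Change a label: new_label = "old_label"
levels(fct_recode(x, bee = "b"))    # "a" "bee" "c"
```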

Time and date objects

If the variable is formatted as a time or date object, you will find that there are very convenient ways to access, wrangle, and plot the information.

There are three types of date/time data that refer to an instant in time:

A date. Tibbles print this as <date>.

A time within a day. Tibbles print this as <time>.

A date-time is a date plus a time: it uniquely identifies an instant in time (typically to the nearest second). Tibbles print this as <dttm>. Base R calls these POSIXct, but that doesn’t exactly trip off the tongue.
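
A quick way to tell a date from a date-time is to check the class. The following sketch uses lubridate parsers that appear later in these slides; `d` and `dt` are names chosen for the example, and the time zone is set to UTC only to keep the example reproducible.

```r
library(lubridate)

d  <- ymd("2024-09-25")                              # a date
dt <- ymd_hms("2024-09-25 11:45:59", tz = "UTC")     # a date-time

class(d)    # "Date"
class(dt)   # "POSIXct" "POSIXt"

# Converting between the two
as_date(dt)       # drops the time of day
as_datetime(d)    # midnight UTC on that date
```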

Formatting time variables

image credit: https://xkcd.com/1179/

What time is it?

today()

[1] "2024-09-25"

now()

[1] "2024-09-25 12:27:16 PDT"

Creating dates

ymd() and friends create dates

ymd("2024-09-25")

[1] "2024-09-25"

mdy("September 25th, 2024")

[1] "2024-09-25"

dmy("25-Sep-2024")

[1] "2024-09-25"

… with times

To create a date-time, add an underscore and one or more of “h”, “m”, and “s” to the name of the parsing function:

ymd_hms("2024-09-25 11:45:59", tz = "America/Los_Angeles")

[1] "2024-09-25 11:45:59 PDT"

mdy_hm("09/25/2024 15:01")  # default is UTC = GMT

[1] "2024-09-25 15:01:00 UTC"

More information about time zones in R.
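
Two lubridate helpers that often come up with time zones are with_tz() and force_tz(). They are not covered elsewhere in these slides, so treat this as a small side sketch: with_tz() shows the same instant on a different clock, while force_tz() keeps the clock reading and changes the instant.

```r
library(lubridate)

x <- ymd_hms("2024-09-25 11:45:59", tz = "America/Los_Angeles")

# Same instant, displayed in UTC (PDT is 7 hours behind UTC)
with_tz(x, "UTC")    # "2024-09-25 18:45:59 UTC"

# Same clock time, relabeled as UTC: a different instant
force_tz(x, "UTC")   # "2024-09-25 11:45:59 UTC"

with_tz(x, "UTC") == x    # TRUE
force_tz(x, "UTC") == x   # FALSE
```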

lubridate

lubridate is another R package meant for data wrangling!

In particular, lubridate makes it very easy to work with days, times, and dates. The base idea is to start with dates in a ymd (year month day) format and transform the information into whatever you want.

Example from the lubridate vignette.

If anyone drove a time machine, they would crash

The lengths of months and years change so often that doing arithmetic with them can be unintuitive.

Consider a simple operation: January 31st + one month.

Should the answer be:

February 31st (which doesn’t exist)?
March 3rd (31 days after January 31, assuming it’s not a leap year)?
February 28th (assuming it’s not a leap year)?

If anyone drove a time machine, they would crash

A basic property of arithmetic is that a + b - b = a. Only solution 1 obeys the mathematical property, but it is an invalid date. Wickham wants to make lubridate as consistent as possible by invoking the following rule: if adding or subtracting a month or a year creates an invalid date, lubridate will return an NA.

If you thought solution 2 or 3 was more useful, no problem. You can still get those results with clever arithmetic, or by using the special %m+% and %m-% operators. %m+% and %m-% automatically roll dates back to the last day of the month, should that be necessary.
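
The a + b - b = a point can be checked directly. In this sketch (using 2024, a leap year, and a `jan31` object defined just for the example), rolling back with %m+% gives a valid date, but the round trip does not return to January 31; adding and subtracting a fixed number of days does.

```r
library(lubridate)

jan31 <- ymd("2024-01-31")

# Plain month arithmetic hits an invalid date (February 31) and returns NA
jan31 + months(1)                         # NA

# %m+% rolls back to the last day of February
jan31 %m+% months(1)                      # "2024-02-29"

# The round trip does not land back on January 31
(jan31 %m+% months(1)) %m-% months(1)     # "2024-01-29"

# Fixed-length day arithmetic does obey a + b - b = a
jan31 + days(31) - days(31)               # "2024-01-31"
```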

basics in `lubridate`

library(lubridate)
rightnow <- now()
rightnow

[1] "2024-09-25 12:27:16 PDT"

day(rightnow)

[1] 25

week(rightnow)

[1] 39

month(rightnow, label=FALSE)

[1] 9

month(rightnow, label=TRUE)

[1] Sep
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

year(rightnow)

[1] 2024

basics in `lubridate`

minute(rightnow)

[1] 27

hour(rightnow)

[1] 12

yday(rightnow)

[1] 269

mday(rightnow)

[1] 25

wday(rightnow, label=FALSE)

[1] 4

wday(rightnow, label=TRUE)

[1] Wed
Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat

Working with a date object

jan31 <- ymd("2024-01-31")
jan31 + months(0:11)

 [1] "2024-01-31" NA           "2024-03-31" NA           "2024-05-31"
 [6] NA           "2024-07-31" "2024-08-31" NA           "2024-10-31"
[11] NA           "2024-12-31"

floor_date(jan31, "month")

[1] "2024-01-01"

floor_date(jan31, "month") + months(0:11) + days(31)

 [1] "2024-02-01" "2024-03-03" "2024-04-01" "2024-05-02" "2024-06-01"
 [6] "2024-07-02" "2024-08-01" "2024-09-01" "2024-10-02" "2024-11-01"
[11] "2024-12-02" "2025-01-01"

jan31 + months(0:11) + days(31)

 [1] "2024-03-02" NA           "2024-05-01" NA           "2024-07-01"
 [6] NA           "2024-08-31" "2024-10-01" NA           "2024-12-01"
[11] NA           "2025-01-31"

jan31 %m+% months(0:11)

 [1] "2024-01-31" "2024-02-29" "2024-03-31" "2024-04-30" "2024-05-31"
 [6] "2024-06-30" "2024-07-31" "2024-08-31" "2024-09-30" "2024-10-31"
[11] "2024-11-30" "2024-12-31"

NYC flights

library(nycflights13)
names(flights)

 [1] "year"           "month"          "day"            "dep_time"      
 [5] "sched_dep_time" "dep_delay"      "arr_time"       "sched_arr_time"
 [9] "arr_delay"      "carrier"        "flight"         "tailnum"       
[13] "origin"         "dest"           "air_time"       "distance"      
[17] "hour"           "minute"         "time_hour"

NYC flights

Creating a date object from variables.

flightsWK <- flights |>  
   mutate(ymdday = ymd(paste(year, month,day, sep="-"))) |> 
   mutate(weekdy = wday(ymdday, label=TRUE), 
          whichweek = week(ymdday)) 

flightsWK |>  select(year, month, day, ymdday, weekdy, whichweek, 
                     dep_time, arr_time, air_time)

# A tibble: 336,776 × 9
    year month   day ymdday     weekdy whichweek dep_time arr_time air_time
   <int> <int> <int> <date>     <ord>      <dbl>    <int>    <int>    <dbl>
 1  2013     1     1 2013-01-01 Tue            1      517      830      227
 2  2013     1     1 2013-01-01 Tue            1      533      850      227
 3  2013     1     1 2013-01-01 Tue            1      542      923      160
 4  2013     1     1 2013-01-01 Tue            1      544     1004      183
 5  2013     1     1 2013-01-01 Tue            1      554      812      116
 6  2013     1     1 2013-01-01 Tue            1      554      740      150
 7  2013     1     1 2013-01-01 Tue            1      555      913      158
 8  2013     1     1 2013-01-01 Tue            1      557      709       53
 9  2013     1     1 2013-01-01 Tue            1      557      838      140
10  2013     1     1 2013-01-01 Tue            1      558      753      138
# ℹ 336,766 more rows
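
An alternative to pasting the pieces together and parsing the result is lubridate's make_date(), which builds the date from the numeric columns directly. The sketch below uses two hand-typed dates rather than the full flights data, so the vectors `year`, `month`, and `day` are stand-ins for the columns of the same names.

```r
library(lubridate)

# Stand-ins for the year, month, and day columns
year  <- c(2013, 2013)
month <- c(1, 12)
day   <- c(1, 31)

d_paste <- ymd(paste(year, month, day, sep = "-"))   # approach used above
d_make  <- make_date(year, month, day)               # no string parsing needed

all(d_paste == d_make)      # TRUE

wday(d_make, label = TRUE)  # both dates fall on a Tuesday
week(d_make)                # 1 53
```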
