Who Voted in Favor of Brexit?
Brexit was a major event for the UK. While Brexit is often said to have been driven by fear of immigration, we will look at a few variables for individual UK parliamentary constituencies: party affiliation, the percentage of residents born in the UK, and the percentage of residents holding a university degree. The data comes from Elliott Morris, who cleaned it and made it available through his DataCamp class on analysing election and polling data in R.
We first read the file and take a glimpse at it.
```r
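library(tidyverse) # readr, tidyr, dplyr, ggplot2 and the %>% pipe
library(httr)      # GET, write_disk
library(knitr)     # kable

url <- "https://assets.datacamp.com/production/repositories/1934/datasets/7c5ad33c949eb0042d50f8c18d538cde0c7bf4e7/brexit_results.csv"
# Download the Brexit results to a temporary file
httr::GET(url, httr::write_disk(brexit.temp <- tempfile(fileext = ".csv")))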
## Response [https://assets.datacamp.com/production/repositories/1934/datasets/7c5ad33c949eb0042d50f8c18d538cde0c7bf4e7/brexit_results.csv]
## Date: 2020-09-20 21:16
## Status: 200
## Content-Type: <unknown>
## Size: 71.4 kB
## <ON DISK> C:\Users\Public\Documents\Wondershare\CreatorTemp\RtmpMbJWbL\fileb72056be1f93.csv
# Use read_csv to read it as a data frame
brexit_results <- read_csv(brexit.temp)
## Parsed with column specification:
## cols(
## Seat = col_character(),
## con_2015 = col_double(),
## lab_2015 = col_double(),
## ld_2015 = col_double(),
## ukip_2015 = col_double(),
## leave_share = col_double(),
## born_in_uk = col_double(),
## male = col_double(),
## unemployed = col_double(),
## degree = col_double(),
## age_18to24 = col_double()
## )
kable(head(brexit_results, 5))
```

Seat | con_2015 | lab_2015 | ld_2015 | ukip_2015 | leave_share | born_in_uk | male | unemployed | degree | age_18to24 |
---|---|---|---|---|---|---|---|---|---|---|
Aldershot | 50.592 | 18.333 | 8.824 | 17.867 | 57.89777 | 83.10464 | 49.89896 | 3.637000 | 13.870661 | 9.406093 |
Aldridge-Brownhills | 52.050 | 22.369 | 3.367 | 19.624 | 67.79635 | 96.12207 | 48.92951 | 4.553607 | 9.974114 | 7.325850 |
Altrincham and Sale West | 52.994 | 26.686 | 8.383 | 8.011 | 38.58780 | 90.48566 | 48.90621 | 3.039963 | 28.600135 | 6.437453 |
Amber Valley | 43.979 | 34.781 | 2.975 | 15.887 | 65.29912 | 97.30437 | 49.21657 | 4.261173 | 9.336294 | 7.747801 |
Arundel and South Downs | 60.788 | 11.197 | 7.192 | 14.438 | 49.70111 | 93.33793 | 48.00189 | 2.468100 | 18.775592 | 5.734730 |
One may observe that the data in the brexit_results file is untidy: each party's vote share sits in its own column. We use pivot_longer to put all the party percentages in a single column called ‘parpercent’, and the name of the party in a column called ‘party’.
```r
# Tidy data
brexit_results1 <- brexit_results %>%
  # Pivot longer: one row per seat and party
  pivot_longer(
    cols = c(con_2015, lab_2015, ld_2015, ukip_2015),
    names_to = "party",
    values_to = "parpercent"
  ) %>%
  # Select variables of interest
  select(leave_share, party, parpercent)

# Show results
kable(head(brexit_results1, 5))
```

leave_share | party | parpercent |
---|---|---|
57.89777 | con_2015 | 50.592 |
57.89777 | lab_2015 | 18.333 |
57.89777 | ld_2015 | 8.824 |
57.89777 | ukip_2015 | 17.867 |
67.79635 | con_2015 | 52.050 |
We color each party with its own hex color, storing each party-to-color correspondence in a named vector called pal.
```r
# Create a named vector mapping each party to its hex color
pal <- c(
  "ld_2015" = "#FDBB30",
  "con_2015" = "#0087dc",
  "lab_2015" = "#d50000",
  "ukip_2015" = "#EFE600"
)

# Create ggplot
ggplot(brexit_results1, aes(x = parpercent, y = leave_share, color = party)) +
  geom_point(shape = 21, size = 0.1, alpha = 0.1) + # faint points at the exact values, colored by party
  geom_jitter(alpha = 0.3) + # jittered, semi-transparent points to reduce overplotting
  scale_color_manual(name = NULL, values = pal,
                     labels = c("Conservative", "Labour", "Lib Dems", "UKIP")) + # party colors and legend labels
  geom_smooth(method = "lm") + # one linear fit per party
  theme_light() +
  theme(legend.position = "bottom") + # move the legend from the right to the bottom
  labs(title = "How political affiliation translated to Brexit voting",
       x = "Party % in the UK general elections",
       y = "Leave % in the 2016 Brexit referendum")
## `geom_smooth()` using formula 'y ~ x'
```

Looking at the graph, one may observe that the higher UKIP's vote share in a constituency, the higher its Leave share tends to be. The UKIP line rises steeply, which suggests that the two variables are positively associated.
Conversely, the higher the Liberal Democrats' vote share in a constituency, the lower its Leave share tends to be.
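
To put a rough number on these two trends, one could compute the correlation between each party's 2015 share and the Leave share. The sketch below is only an illustration: it uses the five seats printed in the first table rather than the full file, and the name `brexit_sample` is made up for this example, so the coefficients should not be read as results for the whole dataset.

```r
# Illustrative subset: the five seats shown by head(brexit_results, 5).
# `brexit_sample` is a hypothetical name, not part of the original analysis.
brexit_sample <- data.frame(
  con_2015    = c(50.592, 52.050, 52.994, 43.979, 60.788),
  lab_2015    = c(18.333, 22.369, 26.686, 34.781, 11.197),
  ld_2015     = c(8.824, 3.367, 8.383, 2.975, 7.192),
  ukip_2015   = c(17.867, 19.624, 8.011, 15.887, 14.438),
  leave_share = c(57.89777, 67.79635, 38.58780, 65.29912, 49.70111)
)

parties <- c("con_2015", "lab_2015", "ld_2015", "ukip_2015")

# Pearson correlation of each party's vote share with the Leave share
party_cor <- sapply(
  parties,
  function(p) cor(brexit_sample[[p]], brexit_sample$leave_share)
)

print(round(party_cor, 2))
```

Running the same `sapply` call on `brexit_results` (adding `use = "complete.obs"` to `cor` if any shares are missing) would give the coefficients for all seats. A positive value for UKIP and a negative one for the Lib Dems would agree with the slopes in the plot.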
Next, we plot the Leave share against the percentage of residents born in the UK.
```r
ggplot(brexit_results, aes(x = born_in_uk, y = leave_share)) +
  geom_point(alpha = 0.25) +
  geom_smooth(method = "lm", col = "#FFC0CB") +
  theme_minimal() +
  labs(title = "People born in UK scared of immigration?",
       x = "Percent of UK residents born in the UK",
       y = "Percent of votes cast in favor of Brexit")
## `geom_smooth()` using formula 'y ~ x'
```

We can notice a high concentration of points in the upper right corner, telling us that constituencies with a high percentage of people born in the UK also tended to have a high percentage of votes in favor of Brexit.
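
The line drawn by `geom_smooth(method = "lm")` is an ordinary linear regression, so its slope can be read off with `lm`. The following is a minimal sketch under the same caveat as before: it fits only the five seats printed in the first table, and `born_sample` is a hypothetical name, so the numbers are illustrative rather than the fit for the full data.

```r
# Illustrative subset: the five seats shown by head(brexit_results, 5).
# `born_sample` is a hypothetical name used only in this sketch.
born_sample <- data.frame(
  born_in_uk  = c(83.10464, 96.12207, 90.48566, 97.30437, 93.33793),
  leave_share = c(57.89777, 67.79635, 38.58780, 65.29912, 49.70111)
)

# Same kind of model that geom_smooth(method = "lm") draws
fit <- lm(leave_share ~ born_in_uk, data = born_sample)

# Change in Leave share for one extra percentage point born in the UK
slope <- coef(fit)[["born_in_uk"]]
# Share of the variance in Leave share explained by the line
r_squared <- summary(fit)$r.squared

print(c(slope = slope, r_squared = r_squared))
```

Passing `data = brexit_results` instead would fit the line shown in the plot. The R squared is worth checking alongside the slope, since a steep line can still fit the points loosely.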
Finally, we turn to education. The degree column has missing values, so we remove those rows before plotting.
```r
sum(is.na(brexit_results$degree)) # check whether there are any missing values
## [1] 59
degree_brex <- brexit_results %>%
  filter(!is.na(degree)) # keep only rows where degree is not missing

degree_plot <- ggplot(degree_brex, aes(x = degree, y = leave_share)) +
  geom_point(alpha = 0.2) +
  # add pink regression line
  geom_smooth(method = "lm", col = "#FF1493") +
  # change theme
  theme_minimal() +
  # add labels; the data source goes in the caption
  labs(title = "More people without a degree voted for Brexit!",
       x = "Percent of UK residents in a constituency having a degree",
       y = "Percent of votes cast in favor of Brexit",
       caption = "Source: https://www.thecrosstab.com/")
degree_plot
## `geom_smooth()` using formula 'y ~ x'
```

The plot suggests that constituencies with a lower percentage of degree holders voted for Brexit more heavily, as the line falls sharply when the percentage of residents with a degree increases. If only people had been more educated, Brexit might not have happened!
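
As a rough check on the direction of this relationship, one could compute the correlation between the degree percentage and the Leave share. This sketch again uses only the five seats printed in the first table, with `degree_sample` as a hypothetical name, so the value is illustrative. The `use = "complete.obs"` argument makes no difference here, but it would drop the 59 seats with a missing degree value if the same call were run on the full data.

```r
# Illustrative subset: the five seats shown by head(brexit_results, 5).
# `degree_sample` is a hypothetical name used only in this sketch.
degree_sample <- data.frame(
  degree      = c(13.870661, 9.974114, 28.600135, 9.336294, 18.775592),
  leave_share = c(57.89777, 67.79635, 38.58780, 65.29912, 49.70111)
)

# use = "complete.obs" ignores rows where either value is missing
degree_cor <- cor(
  degree_sample$degree,
  degree_sample$leave_share,
  use = "complete.obs"
)

print(round(degree_cor, 2))
```

A negative coefficient matches the downward slope of the regression line. Note that it describes seats, not individual voters.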