![](https://static.wixstatic.com/media/1aef70_1bb1a42ca211427588a130a338f5572b~mv2.jpg/v1/fill/w_1600,h_900,al_c,q_85,enc_avif,quality_auto/1aef70_1bb1a42ca211427588a130a338f5572b~mv2.jpg)
![](https://static.wixstatic.com/media/1aef70_eada6a2b18cc4f2b9eca82f09d191a3c~mv2.png/v1/fill/w_980,h_527,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_eada6a2b18cc4f2b9eca82f09d191a3c~mv2.png)
Divvy Bike Share Case Study
This was my capstone project for my Google Data Analytics Certificate. I analyzed how members and casual riders of a bike share service in Chicago used their services differently.
** Although the dataset used in this project is assumed to be real life data and accurate to a bikeshare company named "Divvy" in Chicago, for the sake of the project we will replace the company name with a fictional company, 'Cyclystic', and assume all data is fictional as well
Summary of Task
Cyclistic is a bike-share program that offers a variety of bikes and has riders who primarily use traditional bikes and use them for leisure or commuting. The marketing director aims to increase annual memberships, which are more profitable than casual riders, by analyzing differences in how riders use the bikes and creating a new marketing strategy based on these findings. The junior data analyst is tasked with analyzing the historical ride data to identify trends and differences between riders and presenting these findings with visual representations.
Here is the resulting graphic representing the findings of this case study. For detailed documentation continue below.
(Please click the fullscreen button at the bottom right of the embeded box below for viewing.)
Documentation
Cyclistic is a bike-share program with more than 5,800 bikes and 600 docking stations that also provides reclining bikes, hand tricycles, and cargo bikes. Most riders choose traditional bikes, while 8% prefer assistive options, and approximately 30% use bikes for commuting.
Cyclistic's finance analysts have found that annual memberships are more profitable than casual riders. The marketing director aims to increase annual memberships by understanding how casual riders and members use the bikes differently.
As a Jr Data Analyst, my assignment is to analyze Cyclistics historical ride data for the past year to identify trends and differences in how riders use our service differently and deliver my findings alongside a visual representation that shareholders can easily explore to back up my findings.
First I needed to download the datasets of historical data found here.
![](https://static.wixstatic.com/media/1aef70_912821cd97614a429fb5c3ed03785828~mv2.png/v1/fill/w_980,h_381,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_912821cd97614a429fb5c3ed03785828~mv2.png)
SQL
Data Exploration, Cleaning, and Filtering
(I wanted to work with cloud services and SQL more so I decided to use Google Cloud and create a dataset to work from)
![](https://static.wixstatic.com/media/1aef70_f3fbcf0e62ff43128b2ca6eaeaef921a~mv2.png/v1/fill/w_980,h_454,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_f3fbcf0e62ff43128b2ca6eaeaef921a~mv2.png)
Next I verified the schema and data was matching accross all tables.
![](https://static.wixstatic.com/media/1aef70_2949cfe4d1ec4f8fb30c9a7540feac96~mv2.png/v1/fill/w_980,h_387,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_2949cfe4d1ec4f8fb30c9a7540feac96~mv2.png)
Now I needed to perform some queries to pull the needed data from the dataset. I decided I wanted to compared differences month by month, comparing the number of rides taken, the type of bike ridden, and the time spent riding. I was unable to combine the tables via the FROM statement because BigQuery does not allow external tables to be combined this way, so I had to write a UNION statement to combine the tables in my query into one table for analysis.
SELECT
month,
member_casual,
rideable_type,
SUM(rider_count) AS rider_count,
ROUND(((SUM(time_traveled)/60)/60), 0) AS time_traveled_hrs
FROM (
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-02`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-03`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-04`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-05`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-06`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-07`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-08`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-09`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-10`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-11`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2022-12`
GROUP BY started_at, member_casual, rideable_type
UNION ALL
SELECT
FORMAT_DATETIME("%m-%Y", started_at) AS month,
member_casual,
rideable_type,
COUNT(rideable_type) AS rider_count,
SUM(TIMESTAMP_DIFF(ended_at, started_at, SECOND)) AS time_traveled
FROM `coursera-pro-367212.bike_data.2023-01`
GROUP BY started_at , member_casual, rideable_type
)
GROUP BY month, member_casual, rideable_type
ORDER BY month
Produces this table
![](https://static.wixstatic.com/media/1aef70_fb4e070642b448eaaacf74d267b45956~mv2.png/v1/fill/w_980,h_406,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_fb4e070642b448eaaacf74d267b45956~mv2.png)
Now I needed to analyze the data
Tableau
I connected my dataset to Tableau Desktop via a csv I produced from the SQL query I wrote previously.
![](https://static.wixstatic.com/media/1aef70_2cb8cc459bbb4c9e818bbf7eb9f100f4~mv2.png/v1/fill/w_980,h_511,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_2cb8cc459bbb4c9e818bbf7eb9f100f4~mv2.png)
I wanted to analyze the number of rides taken per month to see if there were any trends.
![](https://static.wixstatic.com/media/1aef70_df80a675cd4d4af5b8f28f2a80a6323c~mv2.png/v1/fill/w_980,h_458,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_df80a675cd4d4af5b8f28f2a80a6323c~mv2.png)
(Percentages represent the percentage of total yearly rides that occurred during that month)
You can see that while there is a clear busy season during the spring and summer months, casual riders take a large majority (60%) of their rides during Q2 of the year, whereas members tend to continue riding into the offseason more frequently than casual riders, boasting less than half (48%) of their rides during the same period.
Next I wanted to explore whether there were differences in time spent riding.
![](https://static.wixstatic.com/media/1aef70_41fb71f50746409a8d6762cbfaa1677e~mv2.png/v1/fill/w_980,h_545,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_41fb71f50746409a8d6762cbfaa1677e~mv2.png)
As we can see, even though membership riders take more overall rides than casual riders, the casual riders seem to take longer rides than members overall.
Let's explore what rideable types members and casual riders prefer.
![](https://static.wixstatic.com/media/1aef70_c78d438ff7f845ffb6ded492adadae24~mv2.png/v1/fill/w_980,h_464,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_c78d438ff7f845ffb6ded492adadae24~mv2.png)
We can see that although members are pretty balanced, they lean slightly toward preferring classic bikes over their electric counterparts, whereas casual members seem to prefer electric bikes more than classic bikes. Now let's explore the length of rides depending on which bike type they choose to ride.
![](https://static.wixstatic.com/media/1aef70_d77f1a1a1e804c1f9e42586f6a111ac1~mv2.png/v1/fill/w_980,h_454,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/1aef70_d77f1a1a1e804c1f9e42586f6a111ac1~mv2.png)
(Each value, or mark, on the chart represents one months data for the corresponding rider and biketype.)
And this confirms that casual riders typically enjoy longer rides, regardless of bike type, than membership riders do. Interestingly we also notice that the docked bikes, although the least used option in number of rides from previous charts, is a contender for time spent riding each month.
Finally I wanted to represent these findings in a simple to understand graphic that would allow shareholders to isolate members and casual riders with highlighting options or completely remove bike types to explore data without the docked category. The overall results of my findings are still accurate when the docked biketype is removed from the graphics.
(Please click the fullscreen button at the bottom right of the embeded box below for viewing.)
In conclusion, it seems casual riders tend to enjoy longer rides than members, typically prefer electric bikes more than members, and don't take as many rides in the off-season as members do.
My recommendation for converting casual riders into membership riders based on this data would be to offer some form of rewards system for longer classic bike rides through memberships, incentivize more electric bike benefits when using a membership, and to target the beginning of Q2 to start the promotional period for membership sign ups.