SQL queries

Questions to answer:

1. Seasonality of cab fares:

a. Create a table that gives the total paid for all fares in the month, average per fare, the standard deviation (std), and the number of fares by month.

b. Which months have the highest total fare for all rides? Does there seem to be any seasonality?

c. Which months have the highest average fare per ride? Does there appear to be any seasonality? As part of answering this question, produce a plot with error bars (+/- 1 standard deviation) of the average fare per month.

d. Which months have the largest variance in cab rides?

2. Fares by day of week:

a. Create a table that gives the total paid for all fares in a given week day, average per fare, the standard deviation (std), and the number of fares by day.

b. Which days have the highest total fare for all rides?

c. Which days have the highest fares per ride? Do weekends differ from weekdays in a statistically significant way? As part of answering this question, produce a plot with error bars (+/- 1 standard deviation) of the average fare by day.

3. Fares by time of day:

a. Create a table that gives the total paid for all fares in a given hour day, average per fare, the standard deviation (std), and the number of fares by hour.

b. Is it better to work the night shift or the day shift in terms of earned income?

4. Payment types:

a. Create a table that shows the average total fare and standard deviation by payment type.

b. Which payment types are the most lucrative?

c. Are disputed payments more likely to occur at one time of day than another?

5. Conclusion:

a. If you were to become a cab driver. When would you want to work? Why?

b. Is there anything else that stands out as worth mentioning?

Extra Credit Questions (worth up to 15 points total):

For 1-4, create a table or plot that helps answer each of the following question:

1. (2 pt) Do people tip better on the weekends or during the week?

2. (2 pt) Do people tip better in the morning or in the evening?

3. (2 pt) Do people take longer trips during the week or on the weekends?

4. (2 pt) Do people take longer trips in the morning or the evening?

5. (2 pt) Write a query that returns a receipt for each row with the following content. Items in bold should be dynamically populated from database columns. You only need to use the taxi_01 table. Remember CHAR(10,13) is used for carriage returns and line feeds.

Thank you for riding taxi #2776 on January 22, 2016 9:30 am.

Your ride lasted 6 minutes and covered a distance 0.7 miles.

The total for this trip is $4.45.

Please remember to tip your driver.

6. (5 pt) Which cab driver (taxi_id) is the best? Explain how you reached that conclusion. This should involve multiple lines of evidence. It is not necessarily the one who makes the most money.