DSS Case Study: Discover a dataset's structure, trends, patterns, and anomalies using Microsoft Excel

Use only the dataset you selected or were assigned. Analyze it in Microsoft Excel to discover the structure of the data and any trends, patterns, or anomalies, guided by your own hypothesis. Perform the following tasks, using visualizations to support your answers.

Your project will include two main parts:

  1. The final project report, which must incorporate all five of the following tasks and be written using the provided template. (10 marks, distributed among the tasks below)
  2. A presentation that illustrates your five tasks. (4 marks)

==========================================================

Task 1: Understand and describe the nature and structure of the selected dataset. (2 marks)

  • Give a brief description of the dataset.
  • Identify the features of the dataset.
  • Propose a hypothesis or assumption (about the relationship between two variables) to validate.

Task 2: Reduce the dimensionality of the dataset to support validation of the hypothesis. If necessary, preprocess the data to handle missing values, duplicate values, etc. You can also generate new features from any of the provided features that may support your hypothesis. If your device has limited processing power, you can reduce your dataset to 1000 tuples. (2 marks)
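Excel remains the required tool for this task (for example, filtering out blank cells and using Data > Remove Duplicates). As an optional cross-check, the preprocessing steps can be sketched in Python. This is only a minimal illustration: the function name `preprocess`, the row format (a list of dictionaries), and the fixed seed are assumptions, not part of the assignment.

```python
import random


def preprocess(rows, limit=1000, seed=42):
    """Drop incomplete and duplicate rows, then keep at most `limit` rows.

    `rows` is assumed to be a list of dicts with hashable values,
    one dict per tuple (record) of the dataset.
    """
    # Keep only rows in which every field has a value
    # (None or an empty string counts as missing here).
    complete = [r for r in rows if all(v not in (None, "") for v in r.values())]

    # Remove exact duplicates while preserving the original order.
    seen = set()
    unique = []
    for r in complete:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # Random sample without replacement; the fixed seed makes it reproducible.
    if len(unique) > limit:
        unique = random.Random(seed).sample(unique, limit)
    return unique
```

Dropping incomplete rows is only one way to treat missing values; filling them with a mean or median is another, and whichever choice is made should be stated in the report.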

Task 3: Provide descriptive statistics for some features, using statistical methods to understand the dataset better, and answer the following analysis questions: (3 marks)

  • Compare different attributes (features). What trend did you find?
  • Include measures of central tendency, such as the mean, median, and mode.
  • Describe the spread of your data. This may include the variance, standard deviation, skewness, and kurtosis.

(You are encouraged to pose other analysis questions based on any trend you notice in the dataset.)
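In Excel, these measures are available through functions such as AVERAGE, MEDIAN, MODE.SNGL, VAR.S, STDEV.S, SKEW, and KURT. To show what is being computed, here is a minimal Python sketch of the same statistics. The function name `describe` is hypothetical, and the skewness and kurtosis lines use the bias-adjusted sample formulas that, to the best of my understanding, SKEW and KURT implement; treat it as a cross-check rather than a replacement for the Excel analysis.

```python
from math import sqrt
from statistics import mean, median, mode


def describe(values):
    """Return descriptive statistics for a list of numbers.

    Needs at least 4 values that are not all identical, otherwise the
    kurtosis formula divides by zero.
    """
    n = len(values)
    avg = mean(values)

    # Sample variance and standard deviation (divide by n - 1).
    variance = sum((v - avg) ** 2 for v in values) / (n - 1)
    std = sqrt(variance)

    # Standardized deviations, reused by skewness and kurtosis.
    z = [(v - avg) / std for v in values]

    # Bias-adjusted sample skewness (0 for symmetric data).
    skewness = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)

    # Bias-adjusted sample excess kurtosis (about 0 for normal data).
    kurtosis = (
        n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    )

    return {
        "mean": avg,
        "median": median(values),
        "mode": mode(values),  # first most-common value (Python 3.8+)
        "variance": variance,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

A positive skewness suggests a longer right tail and a negative one a longer left tail; comparing the mean with the median usually tells the same story.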

Task 4: Validate the hypothesis proposed in Task 1 by investigating the relationship between the two quantitative variables you have chosen, using correlation, regression, and R-squared, and state your possible conclusions. (2 marks)
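In Excel, the corresponding functions are CORREL, SLOPE, INTERCEPT, and RSQ (or a trendline on a scatter chart). The Python sketch below shows how the three quantities relate for a simple linear regression of y on x. The function name `simple_linear_fit` is an illustrative assumption, and the code is meant only as a way to verify figures obtained in Excel.

```python
from math import sqrt


def simple_linear_fit(x, y):
    """Pearson correlation and least-squares line for paired samples x, y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squared deviations and of cross-products.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    # Pearson correlation coefficient, between -1 and 1.
    r = sxy / sqrt(sxx * syy)

    # Least-squares regression line: y = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # With one predictor, R-squared equals the squared correlation.
    return {"r": r, "slope": slope, "intercept": intercept, "r_squared": r * r}
```

An R-squared close to 1 means the line explains most of the variation in y, while a value close to 0 means it explains very little. Neither result on its own shows that one variable causes the other.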

Task 5: Show a visual representation of your analysis (hint: use the right chart or graph for your data analysis). (1 mark)

Project Report Template

  1. Introduction

In this section, provide a brief description of your project and an overview of the data that you are analyzing in this project.

  2. Body section

  • Data

In this section, you should include a description of the data being examined (include the number of samples in the dataset, the features and their types, descriptive statistics of the data, etc.).

  • Methods

In this section, describe the methods you used to obtain the data and perform the analysis.

  • Analysis

In this section, include a written description of your analysis, highlight the most important data you used for the analysis, and include all relevant visualizations.

  • Results

In this section, include a written description of your analysis results. Describe what was examined and the conclusions you drew from the analysis, followed by the relevant visualizations.

  3. Conclusion

In this section, restate the main results of your analysis and provide any recommendations for future work.
