Requirements of Assignment 1
Requirements
Instructions
Part 1
Requirements
a) Describe a big data application that the company is using and use it to illustrate the characteristics of big data analytics. (4)
b) Suggest and describe one new big data application that would help the company improve its business performance. (4)
c) Explain why your suggested application is innovative and useful. Discuss the challenges of implementing the application that you proposed. (4)
Part 2
Numerical Variables
Part A: Model (3 marks)
Requirement
- Remove the ID attribute. Treat all attributes except the class attribute as numeric.
Requirement
- Build a Naïve Bayes model that classifies each record as benign or malignant. Show a screenshot of the model trained on all records.
Part B: Explanation (3 marks)
Requirement
Use one record to explain how the model classifies it.
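As an illustration of the kind of calculation this explanation involves, the sketch below assumes a Gaussian Naïve Bayes model, in which each numeric attribute is modelled by a per-class normal distribution. The attribute names, priors, means and standard deviations are hypothetical and are not taken from the assignment data.

```python
import math

def gaussian_pdf(x, mean, std):
    """Normal density, used as the likelihood of one numeric attribute value."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def posteriors(record, priors, stats):
    """Return normalised P(class | record) under the naive independence assumption."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for attr, value in record.items():
            mean, std = stats[cls][attr]
            score *= gaussian_pdf(value, mean, std)  # multiply per-attribute likelihoods
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}

# Hypothetical parameters, for illustration only.
priors = {"benign": 0.65, "malignant": 0.35}
stats = {
    "benign": {"clump_thickness": (3.0, 1.5), "cell_size": (1.5, 1.0)},
    "malignant": {"clump_thickness": (7.0, 2.5), "cell_size": (6.5, 2.5)},
}
record = {"clump_thickness": 8.0, "cell_size": 7.0}

result = posteriors(record, priors, stats)
predicted = max(result, key=result.get)  # class with the largest posterior
print(predicted, result)
```

The predicted class is the one with the largest product of prior and per-attribute likelihoods. In the actual answer, the priors, means and standard deviations would be read from the trained model.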
Part C: 10-fold cross-validation (3 marks)
Requirement
Show the accuracy of the model under 10-fold cross-validation, together with the confusion matrix. Report the precision and recall for the malignant class and explain what each means.
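For reference, accuracy, precision and recall can all be derived from the four cells of a two-class confusion matrix. The sketch below treats malignant as the positive class; the counts are hypothetical and are not results from the assignment data.

```python
def evaluate(tp, fn, fp, tn):
    """Compute accuracy, precision and recall with malignant as the positive class.

    tp: malignant predicted malignant    fn: malignant predicted benign
    fp: benign predicted malignant       tn: benign predicted benign
    """
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)  # of records predicted malignant, the share that really are
    recall = tp / (tp + fn)     # of truly malignant records, the share the model found
    return accuracy, precision, recall

# Hypothetical confusion-matrix counts, for illustration only.
accuracy, precision, recall = evaluate(tp=90, fn=10, fp=5, tn=195)
print(accuracy, precision, recall)
```

Precision answers "when the model says malignant, how often is it right?", while recall answers "how many of the malignant cases did the model catch?".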
Categorical Variables
Part D: Discretisation (3 marks)
Requirement
Discretise the data set using three bins (equal-frequency).
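Equal-frequency binning places roughly the same number of records in each bin, in contrast to equal-width binning, which splits the value range into intervals of equal size. The following is a minimal rank-based sketch for a single attribute; the helper name and sample values are hypothetical. Note one simplifying assumption: tied values may land in different bins here, whereas data mining tools usually place cut points between distinct values.

```python
def equal_frequency_bins(values, n_bins=3):
    """Assign each value a bin index in 0..n_bins-1 so bins hold (nearly) equal counts."""
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        # The rank position, scaled to the number of bins, gives the bin index.
        bins[idx] = rank * n_bins // len(values)
    return bins

# Hypothetical attribute values, for illustration only.
values = [5, 1, 9, 3, 7, 2]
bins = equal_frequency_bins(values)
print(bins)
```

With six values and three bins, each bin receives two values: the two smallest go to bin 0, the middle two to bin 1, and the two largest to bin 2.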
Part E: Model (3 marks)
- Model and Evaluation
Requirement
Build a Naïve Bayes model that classifies each record as benign or malignant using the discretised dataset. Show the accuracy of the model under 5-fold cross-validation, together with the confusion matrix.
Part F: Explanation (3 marks)
Requirement
Use one record from the discretised dataset to explain how the model classifies it.
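With discretised attributes, the per-attribute likelihoods come from frequency counts rather than from a normal density. The sketch below assumes add-one (Laplace) smoothing, which the tool used for the assignment may or may not apply; the tiny dataset of bin indices is hypothetical.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and, per class and attribute, bin frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> bin counts
    for row, label in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(label, j)][v] += 1
    return class_counts, value_counts

def posteriors(row, class_counts, value_counts, n_bins=3):
    """Return normalised P(class | row) using count-based likelihoods."""
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        score = n / total  # class prior
        for j, v in enumerate(row):
            # Assumed add-one smoothing so an unseen bin does not zero the product.
            score *= (value_counts[(cls, j)][v] + 1) / (n + n_bins)
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}

# Hypothetical discretised records (bin indices 0, 1, 2), for illustration only.
rows = [(0, 0), (0, 1), (1, 0), (2, 2), (2, 1), (1, 2)]
labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

class_counts, value_counts = train(rows, labels)
result = posteriors((2, 2), class_counts, value_counts)
print(result)
```

For the record with both attributes in the top bin, the malignant score is 0.5 × 3/6 × 3/6 and the benign score is 0.5 × 1/6 × 1/6, so the normalised posterior favours malignant.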