Posted: September 23rd, 2022

You will select one of the four options from the data repository in the weekly resources in preparation for inferential data analysis. Note that some repository options include multiple files to work with. In your paper, you will document the entire process used to review, cleanse, and transform the dataset you acquired from the repository.
Follow these six steps to complete this assignment:
1. Apply for the RapidMiner Studio Educational License using the link provided under weekly resources, then download RapidMiner Studio. Note that the software has built-in datasets and a great set of tutorials to get started!
2. Using RapidMiner Studio, determine the initial definition of the selected dataset, document the observations made (anomalies, missing data, outliers, distribution, etc.), and describe the metadata using a table or annotated screenshot in your paper.
3. Cleanse the dataset of all anomalies, providing representative annotated screenshots of the dataset before and after cleansing.
4. Perform at least three transformations (filtering, aggregation, merging, or another suitable transformation) of the dataset and provide annotated screenshots that show each committed transformation.
5. Export the processed (cleansed) dataset as a CSV file.
6. Document each step taken during the data curation process in a clear and concise narrative that describes and justifies the step, and embed annotated screenshots.
Length: 5 to 7 pages, not including the title page, the references page, and the CSV file
References: Include a minimum of 3 scholarly references.
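
The assignment itself is carried out in RapidMiner Studio, but the logic behind the profiling, cleansing, transformation, and export steps can be easier to justify in the narrative once it is seen spelled out. The following is a minimal sketch in plain Python (standard library only), not a RapidMiner process. The sample data, the column names (id, region, sales), the MANAGERS lookup table, the 1.5 * IQR outlier rule, and the filter threshold of 100 are all assumptions made for illustration; none of them come from the repository datasets.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical sample data standing in for a repository file.
RAW = """id,region,sales
1,North,100
2,South,
3,North,120
3,North,120
4,South,90
5,East,5000
6,East,110
"""

# Hypothetical lookup table used for the merge transformation.
MANAGERS = {"North": "Avery", "South": "Blake", "East": "Casey"}


def load(text):
    """Parse CSV text into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def profile(rows, column):
    """Count missing values and flag outliers with the 1.5 * IQR rule."""
    values = [float(r[column]) for r in rows if r[column] != ""]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    return {
        "missing": sum(1 for r in rows if r[column] == ""),
        "outliers": [v for v in values if v < low or v > high],
    }


def cleanse(rows, column, outliers):
    """Drop rows with a missing or outlying value, then exact duplicates."""
    seen, clean = set(), []
    for r in rows:
        key = tuple(r.items())
        if r[column] == "" or float(r[column]) in outliers or key in seen:
            continue
        seen.add(key)
        clean.append(r)
    return clean


def transform(rows, column, minimum):
    """Apply three example transformations: filter, aggregate, merge."""
    kept = [r for r in rows if float(r[column]) >= minimum]  # 1. filter
    totals = defaultdict(float)
    for r in kept:  # 2. aggregate sales per region
        totals[r["region"]] += float(r[column])
    return [  # 3. merge each region total with the lookup table
        {"region": k, "total_sales": v, "manager": MANAGERS.get(k, "")}
        for k, v in sorted(totals.items())
    ]


def to_csv(rows):
    """Serialize rows to CSV text; write this string to a file to export."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    rows = load(RAW)
    report = profile(rows, "sales")
    clean = cleanse(rows, "sales", report["outliers"])
    print(report)
    print(to_csv(transform(clean, "sales", 100)))
```

Dropping rows is only one way to handle missing values and outliers; imputing or capping them may suit a given dataset better, and whichever choice is made is the kind of decision the narrative should justify.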

