Preliminary Information
In both courseworks you will be analysing the same dataset.
Do not model your answer on the workshop material. The objective of the workshops is to introduce
you to different data mining tasks discussed in lectures, and not to give you a roadmap on how to
answer the coursework. Therefore if you simply reproduce the steps in the workshops you are very
likely to make serious mistakes.
In both courseworks you will be assessed on your understanding of the data mining process, your ability
to use correctly the tools that we covered in the course, and the ability to draw correct conclusions
from what you observe. You will not be assessed by your capability to use R or any other software.
Therefore, don’t include information about commands you used, or options you set, or how to draw a
figure etc. You will simply be wasting valuable space.
You are free to use any software to do the coursework. However, you can’t use as an excuse the fact that
you couldn’t do a particular task because the software you chose doesn’t offer a particular capability
which we covered in the workshops.
The page limit for this report is 8 pages using at least 11 point typeface. This limit is strict and it
includes appendices (which I strongly recommend that you don’t use). Standard penalties apply for
exceeding this limit.
Please pay particular attention to the disclaimer at the end of the assignment that gives more details
about the assessment of your report.
This is an individual piece of assessment, and you should ensure that your report reflects your own
work exclusively. All reports go through automated software to detect plagiarism from a variety of
sources (including past and current students’ reports as well as online resources, conference and journal
publications etc.). The consequences of plagiarism are very serious.
Description of the Problem and the Data
A bank wants to develop a credit scoring model to classify applications for mortgages. You are provided
with a sample of 2000 observations (past customers). Table 1 provides a description of the variables at your
disposal. The target variable is named “Good” and indicates whether a customer proved to be a “good”
customer (Good = 1) or a “bad” customer (Good = 0). A bad customer is defined as someone who has missed
three or more payments during the first year of the mortgage.
Tasks
Based on the project description and the distribution of the target variable, what are the implications
for building and assessing classification models for this problem? (10 marks)
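As a rough illustration only (the brief does not prescribe any software or method), the distribution of the target can be summarised by the class proportions and by the accuracy that a trivial "always predict the majority class" rule would achieve. The function name `class_balance` and the toy labels below are hypothetical and are not taken from the coursework dataset.

```python
from collections import Counter


def class_balance(labels):
    """Return class proportions and the accuracy of always predicting the majority class."""
    counts = Counter(labels)
    n = len(labels)
    proportions = {cls: cnt / n for cls, cnt in counts.items()}
    # A classifier that ignores all inputs and predicts the most common class
    # is right exactly as often as that class occurs.
    majority_baseline = max(proportions.values())
    return proportions, majority_baseline


# Hypothetical labels (1 = good customer, 0 = bad customer); NOT the real data.
toy_good = [1] * 17 + [0] * 3
proportions, baseline = class_balance(toy_good)
print(proportions, baseline)
```

If one class dominates, raw accuracy is a weak yardstick, because a model has to beat this baseline before it can be said to have learned anything about the minority class.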
Use visualisation tools and appropriate statistical measures covered in the course (i) to perform a
preliminary data analysis (answering questions about data quality like outliers, missing values, etc)
and (ii) to quantify how relevant each variable is for the classification problem at hand. (30 marks)
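The brief does not say which statistical measures were covered in the course, so the following is only one possible sketch: a Pearson chi-square statistic measuring the association between a categorical variable (for example Payments) and the target. The function name `chi_square` and the toy values are assumptions made for illustration.

```python
from collections import Counter


def chi_square(categories, labels):
    """Pearson chi-square statistic for a categorical variable against a class label."""
    n = len(labels)
    observed = Counter(zip(categories, labels))  # cell counts; absent cells count as 0
    row_totals = Counter(categories)
    col_totals = Counter(labels)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected cell count if the variable and the label were independent.
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


# Hypothetical values; NOT the real data.
toy_payments = ["None"] * 6 + ["Missed"] * 4
toy_good = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
print(chi_square(toy_payments, toy_good))
```

A larger statistic indicates a stronger departure from independence, but it should be read together with the degrees of freedom and the cell counts, since sparse categories make it unreliable.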
Certain variables in the dataset contain missing values. Is this relevant for your task, and how would
you treat these? Use data analysis tools such as visualisation and statistical measures to obtain insights
about the properties of the missing values and about a sensible way to treat them. You can make use of a
logistic regression classifier to answer this question. (Base your conclusions and recommendations on
properties of the data, and more generally your findings, rather than generic arguments.) (30 marks)
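One way to check whether missingness is itself informative, sketched here under the assumption that missing entries are encoded as `None`, is to compare the share of good customers among rows where a variable is missing with the share among rows where it is observed. The log odds ratio returned below equals the maximum-likelihood coefficient of a logistic regression that uses only a missing-value indicator as predictor. The function name and the toy data are hypothetical.

```python
import math


def missingness_summary(values, labels):
    """Compare the target between rows where a variable is missing and where it is observed."""
    missing = [y for v, y in zip(values, labels) if v is None]
    observed = [y for v, y in zip(values, labels) if v is not None]

    def good_rate(ys):
        return sum(ys) / len(ys) if ys else float("nan")

    def log_odds(ys):
        good, bad = sum(ys), len(ys) - sum(ys)
        # Undefined when a group is empty or contains only one class.
        return math.log(good / bad) if good > 0 and bad > 0 else float("nan")

    return {
        "n_missing": len(missing),
        "good_rate_missing": good_rate(missing),
        "good_rate_observed": good_rate(observed),
        "log_odds_ratio": log_odds(missing) - log_odds(observed),
    }


# Hypothetical values (None marks a missing entry); NOT the real data.
toy_income = [30, None, 45, None, 52, None, 38, None]
toy_good = [1, 0, 1, 0, 1, 1, 0, 0]
print(missingness_summary(toy_income, toy_good))
```

A log odds ratio far from zero suggests that the values are not missing completely at random, in which case simply deleting incomplete rows or imputing a mean would discard information about the target.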
Develop a logistic regression classifier that you think is appropriate for this dataset. Your discussion
needs to show evidence of tackling issues such as the indicative questions listed below (this is an
indicative and not an exhaustive list):
- Consider different ways to handle missing values and assess their implications. What seems to be
the better way of handling these and what are the implications? What did you learn through this
process, and how can you relate this to your previous findings?
- Which variables are important for this problem and how do your findings compare with the
expectations you formed during the preliminary data analysis?
- Explain carefully how you assessed models to evaluate their suitability and how this process led
you to revise / improve your recommendations.
- Explain what the final model you develop actually implies for the problem at hand.
(30 marks)
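For the model assessment question, a minimal sketch (not a prescribed procedure) is to turn predicted probabilities from a held-out sample into a confusion matrix at a chosen threshold and to report sensitivity and specificity alongside accuracy. Here the positive class is assumed to be Good = 1, so specificity is the share of bad customers correctly identified. The function name, the threshold values and the toy probabilities are all hypothetical.

```python
def confusion_metrics(probs, labels, threshold=0.5):
    """Confusion-matrix counts and summary rates for predicted probabilities of Good = 1."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1  # bad customer predicted as good
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1  # good customer predicted as bad
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(labels),
    }


# Hypothetical held-out predictions and true labels; NOT the real data.
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_good = [1, 1, 1, 0, 1, 0, 1, 0]
for t in (0.5, 0.65):
    print(t, confusion_metrics(toy_probs, toy_good, threshold=t))
```

Varying the threshold makes the trade-off between accepting bad customers and rejecting good ones explicit, which matters when the two kinds of error carry different costs for the bank.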
Report Assessment
Your coursework will not be evaluated by the quality of the final logistic regression alone, or by whether you
got a particular answer right. You will be primarily assessed by whether you are able to correctly justify
the steps you took to complete the assignment. In other words, your report needs to document that you are
able to intelligently analyse the provided data, that you draw correct conclusions from your observations, and
that these conclusions lead you either to the next logical step of the data mining process, or to the revision
of decisions made in previous steps of the analysis. (Refer to the flowchart of data mining stages we covered
in the first lectures and in particular to the feedback loops.)
Therefore, don’t simply present the conclusions/ results of your analysis and expect to get a high mark.
Reports that don’t document the steps followed and the reasons why these were chosen will receive minimal
marks, even if the final answer is sensible. Explain your reasoning clearly and in good English. Don’t provide
a list of bullet points, or unstructured sentences etc. Similarly, don’t include figures or any other output
from R that you don’t comment/ explain in the text. I will not assume that you know how to interpret these
correctly.
Variable                 Description
Good                     Target variable (1: Good customer, 0: Bad customer)
Income                   Annual gross income
Amount                   Amount of requested loan
Installment Percentage   Installment as percentage of monthly earnings
Applications             Applications for credit over past year
Loans                    Number of existing loans
Credit Cards             Credit cards currently held
Payments                 Missed or delayed payments in last 5 years: None / Delayed / Missed
Age                      (in years)
Marital Status           Married / Single / Divorced
Employment               Other / Self Employment / Part time / Full time
Time at Employment       (in years)
Residential Status       Rent / Own / Other
Time at Address          (in years)
Repayment method         Non-Automated / Automated
Area indicator           Location of branch receiving application
Table 1: Data Description