DSCI6002 Data Exploration
Fall 2019
Meeting Times and Location(s): Tues/Thurs 4:30pm – 5:45pm BCKM 208
Credit Hours: 3
Faculty Contact Information:
Dr. Vahid Behzadan, Assistant Professor
Email: vbehzadan@newhaven.edu
Phone: 203-479-4723
COURSE SYLLABUS
Course Description:
Introduction to the infrastructure and architecture of data warehousing systems, with a focus on querying, exploring, understanding, and transforming data features for statistical and machine learning applications. 3 credits.
Prerequisites: None formally; familiarity with linear algebra, calculus, and an object-oriented programming language is recommended.
Required Text(s):
OpenIntro Statistics by Diez, Barr, and Çetinkaya-Rundel. This book is available for free from OpenIntro.org.
Additional Reading (Not Required):
- The Data Science Design Manual (Texts in Computer Science) 1st ed. 2017 Edition by Steven S. Skiena
- Python for Data Analysis, by Wes McKinney, O’Reilly Media, 2013 — This book is a nuts and bolts guide to data wrangling with Python, including such tools/libraries as Pandas, NumPy, and IPython. You will be expected to use these tools in doing your projects.
- The Signal and the Noise: Why so many predictions fail but some don’t, by Nate Silver, Penguin Press, 2012 — This popular, easy-to-read book focuses on how effectively data can be used to make predictions in domains like sports, science, economics, and politics. This is exactly what we are trying to do in this course, and Silver’s book is an excellent model to build on.
Course Structure/Course Format:
The course combines lectures, labs, assignments, and projects. Active learning will constitute as much as 50% of class time, so attendance and participation are required.
Course Objectives:
By the end of this course, students will be able to:
- Write Basic SQL Queries
- Demonstrate Python Programming Skills for Statistical Data Analysis
- Collect Good Data According to the Purpose of the Analysis
- Explore Data with Exploratory Data Analysis (EDA) techniques
- Calculate Probabilities
- Identify and Manipulate Random Variables / Distributions
- Make Statistical Inference from Sample to Population
- Model Quantitative Response Variable with Linear Regression
- Model Binary Response Variable with Logistic Regression
- Model Time Series Data
Student Learning Outcomes:
Demonstrate achievement of course objectives in class discussion, homework, exams, and the final project.
Course Requirements & Assessment:
Please see official University of New Haven Academic Policies located in the links below:
Assignments/Projects:
- All work must be turned in via Blackboard. Please turn in whatever you have for participation credit, even if incomplete.
- Pen-and-paper assignments and quizzes will also be required. If handwriting is deemed illegible there may be a penalty, or the attempt may be completely rejected.
Examinations:
- The exams will include questions taken directly from the class discussions and exercises. Exams will also require handwritten code. Everything you are told or shown in class is fair game, not just the content of slides.
Participation:
- Active-learning techniques will be used, such as group discussions and “think-pair-share”, requiring students to work individually and/or with other students. Refusal to participate will be treated as absence from class and ultimately lead to dismissal from the class (see University Policies).
Grading:
Grades are based on your performance on labs, quizzes, homework, participation, the final project, the midterm, and the final exam. The weight of each component is outlined below:
Midterm Exam | 25% |
Final Exam | 25% |
Labs/Quizzes/Homeworks/Participation | 30% |
Final Project | 20% |
Total | 100% |
Typical Graduate Scale |
Score Range and Letter Equivalent |
97 to 100 — A+ |
94 to Less than 97 — A |
90 to Less than 94 — A- |
87 to Less than 90 — B+ |
84 to Less than 87 — B |
80 to Less than 84 — B- |
77 to Less than 80 — C+ |
74 to Less than 77 — C |
70 to Less than 74 — C- |
Less than 70 — F |
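As a rough illustration of how the component weights and the letter scale above combine, the following Python sketch computes a weighted course score and maps it to a letter. The function names, the example scores, and the choice not to round the score are assumptions made for illustration only; the instructor's actual calculation may differ.

```python
# Component weights in percent, taken from the grading table above.
WEIGHTS = {
    "midterm": 25,
    "final_exam": 25,
    "labs_quizzes_homework_participation": 30,
    "final_project": 20,
}

# Lower bound of each letter band from the scale above, highest first.
SCALE = [
    (97, "A+"), (94, "A"), (90, "A-"),
    (87, "B+"), (84, "B"), (80, "B-"),
    (77, "C+"), (74, "C"), (70, "C-"),
]


def weighted_score(scores):
    """Combine per-component scores (each 0-100) into one course score."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Return the letter for a course score; no rounding is applied (assumption)."""
    for lower_bound, letter in SCALE:
        if score >= lower_bound:
            return letter
    return "F"


# Hypothetical scores for one student, used only to show the arithmetic.
example_scores = {
    "midterm": 88,
    "final_exam": 92,
    "labs_quizzes_homework_participation": 95,
    "final_project": 90,
}

total = weighted_score(example_scores)
print(total, letter_grade(total))  # 91.5 A-
```

Here the weighted score is (25 × 88 + 25 × 92 + 30 × 95 + 20 × 90) / 100 = 91.5, which falls in the "90 to Less than 94" band.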
Expectations:
Students should expect to spend at least two hours on academic studies outside of, and in addition to, each hour of class time. There will be readings, simple questions/problems, labs, and projects.
Late Work:
Assignments turned in late may be accepted with a grade penalty, if the solutions have not been distributed yet. This is completely at the discretion of the instructor, as the goal is to balance learning and fairness.
Missed Work:
Exams may be made up in only the most unavoidable situations (at the discretion of the instructor). A formal excused absence (such as a note from Health Services or a healthcare provider) will be required before you can make up a missed exam.
Individual Work:
Students must work individually on assignments and projects unless specifically allowed to work in groups by the instructor. Any work taken from the internet must be cited properly (acceptance of code taken from elsewhere is at the discretion of the instructor) or will be considered plagiarism. Failure to adhere to this policy will result in penalties ranging from a zero on the assignment to a zero in the final grade. Students may also be subject to disciplinary action by the University of New Haven (see University Policies).
Course Outline/Schedule:
Day/Date | Topic/Note |
8/27 | Intro to data exploration |
8/29 | Sampling and Metrics |
9/3 | Exploratory Data Analysis |
9/5 | Lab 1 |
9/10 | Mathematical Preliminaries |
9/12 | Correlation |
9/17 | Assembling Data Sets |
9/19 | Data Cleaning – Lab 2 |
9/24 | Scores and Ranking |
9/26 | Statistical Distributions |
10/1 | Statistical Significance |
10/3 | Principles of Data Visualization + Practice of Data Visualization – Lab 3 |
10/8 | Midterm |
10/10 | Building Models |
10/15 | Validating Models |
10/17 | Linear Algebra Review |
10/24 | Linear Regression – Lab 4 |
10/29 | Gradient Descent Search and Regularization |
10/31 | Logistic Regression and Classification – Lab 5 |
11/5 | Nearest Neighbor Methods |
11/7 | Clustering – Lab 6 |
11/12 | Introduction to Machine Learning |
11/14 | Topics in Machine Learning |
11/19 | Lab 7 |
11/21 | Human-centric Data Science |
11/26 | Project Presentations |
12/3 | Final Review |
12/5 | Final Review – Make up |
12/17 3:30pm – 5:30pm | Final Exam |
University Policies:
Adding and Dropping Classes:
Tuesday, September 3, 2019 is the final day to drop this course so that it does not appear on your transcript. After the first week of class, self-service registration will not be enabled for students to directly add or drop classes. Students should contact the registrar’s office directly or the Academic Success Center for assistance with adding and dropping courses during this time.
Attendance Regulations:
Students are expected to attend regularly and promptly all their classes, appointments, and exercises. The instructor has the right to dismiss from class any student who has been absent more than two weeks (pro-rated for terms different from that of the semester). A dismissed student will receive a withdrawal (W) from the course if they are still eligible for a withdrawal per the university “Withdrawal from a Course” policy, or a failure (F) if not.
A student who is not officially registered in the course is not permitted to attend classes or take part in any other course activities.
Students absent from any class meeting are responsible for making up missed assignments and examinations at the discretion of the instructor.
If an instructor is more than 15 minutes late for a class meeting, without providing notification to the students, the students may leave without penalty.
Religious Observance Policy for Students:
The University of New Haven respects the right of its students to observe religious holidays that may necessitate their absence from class or from other required university-sponsored activities.
Students who wish to observe such holidays should not be penalized for their absence, although in academic courses they are responsible for making up missed work.
Note: instructors should try to avoid scheduling exams or quizzes on religious holidays, but where such conflicts occur should provide reasonable accommodations for missed assignment deadlines or exams. If a class, an assignment due date, or exam interferes with the observance of such a religious holiday, it is the student’s responsibility to notify his or her instructor, preferably at the beginning of the term, but otherwise at least two weeks before the holiday.
Students wishing to withdraw from a course MUST officially do so by completing the online form or by submitting a course withdrawal form to the registrar’s office. The final date to request a withdrawal for this term is Tuesday, October 29, 2019. This request must be submitted to the Office of the University Registrar (and signed by the International Services Office if you are an international student). The grade of W will be recorded, but the course will not affect the GPA.
A grade of Incomplete (INC) is given only in special circumstances and indicates that the student has been given permission by the instructor to complete required course work (with the same instructor) after the end of the term. In the absence of the instructor a student should contact the Department Chair.
The University of New Haven expects its students to maintain the highest standards of academic conduct. Academic dishonesty is not tolerated at the University. To know what is expected of them, students are responsible for reading and understanding the statement regarding academic honesty in the Student Handbook. Please ask me about my expectations regarding permissible or encouraged forms of student collaboration if there is any confusion about this topic.
The Dean of Students Office provides support and advocacy for students.
Commitment to Positive Learning Environment:
The University adheres to the philosophy that all community members should enjoy an environment free of any form of harassment, sexual misconduct, discrimination, or intimate partner violence. If you have been the victim of sexual misconduct, we encourage you to report this. If you report this to a faculty/staff member, they must notify our college’s Title IX coordinator about the basic facts of the incident (you may choose to request confidentiality from the University). If you encounter sexual harassment, sexual misconduct, sexual assault, or discrimination based on race, color, religion, age, national origin, ancestry, sex, sexual orientation, gender identity, or disability please contact the Title IX Coordinator, Caroline Koziatek, at (203) 932.7479 or ckoziatek@newhaven.edu.
Reporting Bias Incidents:
At the University of New Haven, there is an expectation that all community members are committed to creating and supporting a climate which promotes civility, mutual respect, and open-mindedness. There also exists an understanding that with the freedom of expression comes the responsibility to support community members’ right to live and work in an environment free from harassment and fear. It is expected that all members of the University community will engage in anti-bias behavior and refrain from actions that intimidate, humiliate, or demean persons or groups or that undermine their security or self-esteem.
University Support Services:
The University recognizes that students can often use some help outside of class and offers academic assistance through several offices.
The Academic Success Center provides a wide range of academic support to day and evening undergraduate students beyond their first year of college.
Center for Learning Resources (CLR):
The Center for Learning Resources (CLR), located in the Peterson Library, provides academic content support to the students of the University of New Haven using metacognitive strategies that help students become aware of and learn to apply optimal learning processes, with the goal of creating independent learners. CLR tutors focus sessions on discussions of concepts and processes and typically use external examples to help students grasp and apply the material.
Writer to Writer is a peer-tutoring program inspired by the belief that all writers struggle and can benefit from talking through their ideas. Tutors are undergraduate students trained to work with you at any stage in the writing process.
Accessibility Resources Center:
Students with disabilities are encouraged to share, in confidence, information about needed specific course accommodations. The Accessibility Resources Center, located in Sheffield Hall, is responsible for and committed to providing services and support that serve to promote educational equity and ensure that students are able to participate in the opportunities available at the University of New Haven. Accommodations cannot be made without written documentation from the Accessibility Resources Center.
Counseling & Psychological Services:
The Counseling Center offers a variety of services aimed at helping students resolve personal difficulties and acquire the balance, skills, and knowledge that will enable them to take full advantage of their experience at the University of New Haven.