Lecturer: Uğur Aytun, Visiting Researcher at METU
Classroom: 01:40 PM-04:30 PM Friday, Computer-Lab
Office hours: 12:00 PM-01:00 PM Friday, A26-B
Course prerequisites: IS 100, ECON 206
Description
Data is a crucial resource for understanding and interpreting the world around us. Effectively harnessing this resource is essential for deriving meaningful insights. In economics, the importance of data has grown significantly with the proliferation of diverse sources, including administrative datasets, large-scale surveys, and social media data.
This course introduces students to the modern data science toolkit. It covers the fundamentals of data manipulation, visualization, and key statistical techniques such as regression analysis. The primary tool for this course is R, an open-source programming language widely used by economists. R is a powerful tool for data analysis and visualization, and it is an essential skill for any economist working with data. It also makes it practical to estimate models with high-dimensional fixed effects on large datasets.
Course objectives
Students will learn how to use R for economic analysis and the basic tools of data science. By using R and RStudio, students will be able to import, clean, manipulate, visualize, report, present, and analyze data. They will also learn how to write reports and create presentations using R Markdown.
Learning outcomes
Basic programming skills in the R programming language
Accessing and importing data from various sources
Manipulating, converting, and storing data using the data.table, collapse, haven, and fst packages
Data visualization using ggplot2
Statistical analysis using regression models with the fixest package
Difference-in-differences and event study analysis
Poisson estimation with high-dimensional fixed effects
Performing reproducible research
Creating reports and presentations using R Markdown
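As a rough illustration of how the outcomes above fit together, the sketch below runs a small manipulate, visualize, and estimate workflow. It assumes the data.table, ggplot2, and fixest packages are installed, and it uses R's built-in mtcars data purely as a stand-in; the variable choices are illustrative and are not taken from the course materials.

```r
# Illustrative sketch only: built-in mtcars data, not course data.
library(data.table)
library(ggplot2)
library(fixest)

# Convert the built-in data frame to a data.table
cars <- as.data.table(mtcars)

# Data manipulation: average fuel economy and group size by cylinder count
avg_mpg <- cars[, .(mean_mpg = mean(mpg), n = .N), by = cyl][order(cyl)]
print(avg_mpg)

# Visualization: weight against fuel economy, colored by cylinder count
p <- ggplot(cars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", color = "Cylinders")
print(p)

# Regression: OLS of mpg on weight with cylinder fixed effects
ols_fit <- feols(mpg ~ wt | cyl, data = cars)

# Poisson regression with the same fixed effects (carb is a count variable)
pois_fit <- fepois(carb ~ wt | cyl, data = cars)

# Side-by-side table of both models
etable(ols_fit, pois_fit)
```

With only one fixed-effect dimension and 32 observations this is far from a high-dimensional setting, but the same `feols` and `fepois` formula syntax (fixed effects after the `|`) carries over to larger problems.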
Grading
The course consists of lectures, midterms, homework, and a project.
Course grades will be based on 2 midterms (30 pts each), 1 project (40 pts), and forum participation (as a bonus, up to 10 pts). There will be no make-up exams.
Project teams will consist of 3 students. Projects will be presented online and must be submitted by midnight on the same day.
Textbooks
Erol Taymaz, Introduction to Data Science, Lecture Notes
Venables, W. N., Smith, D. M. and the R Core Team (2015), An Introduction to R, R Core Team
Wickham, Hadley, and Grolemund, Garrett (2017), R for Data Science, O’Reilly.
Hanck, C., Arnold, M., Gerber, A. and Schmelzer, M. (2020), Introduction to Econometrics with R
Julian Hinz and Irene Iodice, Data Science for Economists, Lecture Notes
Grant R. McDermott, Data Science for Economists, Lecture Notes
Nick Huntington-Klein, The Effect: An Introduction to Research Design and Causality
Getting started with R and RStudio
In the second week, please install R and RStudio on your computer. R is available from CRAN (the Comprehensive R Archive Network) and RStudio from the Posit website.
I strongly recommend using GitHub Copilot for R programming. To use it, you need to sign up for a GitHub account in advance.
To enable Copilot in RStudio: Tools > Global Options > Copilot > Enable GitHub Copilot.
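Once R and RStudio are installed, the packages named in the learning outcomes can be installed from the R console. The snippet below is a minimal sketch, assuming the default CRAN mirror is reachable; the package list is collected from this syllabus (with rmarkdown added for R Markdown reports) and may not match the exact set used in class.

```r
# One-time setup: install the packages named in the learning outcomes.
# The list is an assumption based on the syllabus, not an official requirement.
pkgs <- c("data.table", "collapse", "haven", "fst",
          "ggplot2", "fixest", "rmarkdown")

# Install only the packages that are not already present
to_install <- setdiff(pkgs, rownames(installed.packages()))
if (length(to_install) > 0) install.packages(to_install)
```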
Presentations
Week 1: Getting started
Week 2: Toolkit