ML for Social Science, Fall 2024

Resources
Outline:
Book office hours:

Class files
Session 1: Workflow and ML regression Interactive slides Print slides (pdf)
    R code Python code (View online)
Session 2: Classification Interactive slides Print slides (pdf)
    R code Python code (View online)
Session 3: Ensembling and Clustering Interactive slides Print slides (pdf)
    Example R code for clustering Python code (View online)
Session 4: Causal and non-causal ML Interactive slides Print slides (pdf)
    Python code (View online)  
Session 5: Bias and Fairness Interactive slides Print slides (pdf)
    Python code (View online) Example R code for SHAP
Session 6: Text Analytics Interactive slides Print slides (pdf)
    Python code (View online) Example R code for text analytics
Session 7: Linguistics Interactive slides Print slides (pdf)
    Example R code for text analytics Python code (View online)
    Implementing word2vec in Colab  
Session 8: Topic modeling Interactive slides Print slides (pdf)
    Python code for LDA (View online) R code for STM and ETM (View online). This also includes code to build a word2vec model in R.
Session 9: Transformers Interactive slides Print slides (pdf)
    Python code for BERTopic (View online) BERTopic visualizations: base topics, base heatmap, base time, base class, custom topics, custom heatmap
    Python code for FinBERT usage PyTorch code for FinBERT fine-tuning
    Python code for FinBERT fine-tuning on new data (View online) PyTorch code for FinBERT inference of the fine-tuned model (View online)
Session 10: Understanding LLMs Interactive slides Print slides (pdf)
    Python code for OpenAI API (View online). Includes replications of 2 of the papers.
    Building a tiny GPT from scratch
Session 11: Using LLMs Interactive slides Print slides (pdf)
Session 12: Images as Data Interactive slides Print slides (pdf)
    Python code for CNNs and object detection (View online) (Colab version) Image and text embeddings with CLIP