Advanced Deep Learning
EECS E6691 - Topics in Data-Driven Analysis and Computation
Columbia University Course
Zoran Kostic, Ph.D., Dipl. Ing., Professor of Professional Practice, zk2172(at)columbia.edu
Electrical Engineering Department, Data Sciences Institute, Columbia University in the City of New York
Course in a nutshell:
Advanced theory and practice of Deep Learning. Applications and projects.
Description: Advanced (Second) Course on Deep Learning
Bulletin Description: Regularized autoencoders, sparse coding and predictive sparse decomposition, denoising autoencoders, representation learning, manifold perspective on representation learning, structured probabilistic models for deep learning, Monte Carlo methods, training and evaluating models with intractable partition functions, restricted Boltzmann machines, approximate inference, deep belief networks, deep learning in speech and object recognition.
Detailed Description for Spring 2024
EECS E6691 Advanced Deep Learning (TOPICS DATA-DRIVEN ANAL & COMP)
Spring 2024, 3 credits
Professor Zoran Kostic zk2172 (at) columbia.edu
A second-level seminar-style course in which students study advanced topics in deep learning. Students must have previously taken a first course in deep learning. The course consists of: (i) studying state-of-the-art architectural and modeling concepts, (ii) systematic review of recent literature and reproduction of the results, (iii) pursuing novel research ideas, (iv) participating in local and potentially in public contests on Kaggle or elsewhere, (v) class presentation(s) of paper studies during the semester, (vi) a final project, (vii) quizzes during lecture time. The course will address topics beyond the material covered in the first course on Deep Learning (such as Columbia course ECBM E4040), with applications of interest to students. Example topics are object detection and tracking, smart city and medical applications, use of spectral-domain processing, applications of transformers, and capsule networks.
Students entering the course must have prior experience with deep learning and neural network architectures, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memories (LSTMs), and autoencoders. They need a working knowledge of coding in Python, Python libraries, Jupyter notebooks, and TensorFlow, both on local machines and on a cloud platform (such as Google Cloud Platform, GCP), and of GitHub or similar tools. The framework and associated tools which are the focus of this course are PyTorch and Google Cloud. The course will leverage the infrastructure and coding/Python templates from the ECBM E4040 assignments (the first course in deep learning by Prof. Kostic). Students must be self-sufficient learners and take an active role during classroom activities.
Semester class assignments (paper reviews) will consist of reading, coding, and presentations. Every week, several student groups will make presentations reviewing selected papers from recent conferences such as NeurIPS and ICLR, including the students’ results in reproducing the papers, followed by open discussions. Quizzes will be given during class time.
The final project is a group project (up to 3 students). The topic will be selected by the students or by the instructor. It needs to be documented in a conference-style report, with the code deposited in a GitHub repository. The code needs to be documented and instrumented such that the instructor can run it after downloading it from the repository. A Google Slides presentation of the project, suitable for a poster session, is required. Students will present the project at the end of the semester using the slides.
Prerequisites
(i) Machine Learning (taken previously, or in parallel with this course).
(ii) ECBM E4040 Neural Networks and Deep Learning, or an equivalent neural network/DL university course taken for academic credit. Whereas the quality of online ML and DL courses (Coursera, Udacity, edX) is outstanding, many takers of online courses do the hands-on coding assignments superficially and therefore do not gain the practical coding skills that are essential for this advanced course. Therefore, online courses are not accepted as prerequisites.
(iii) The course requires an excellent theoretical background in probability and statistics, and linear algebra.
Students are strongly advised to drop the class if they do not have adequate theoretical background and/or previous experience with programming deep learning models. It is strongly advised (the instructor’s requirement) that students take no more than 12 credits of any coursework (including this course, and project courses) during the semester while this course is being taken.
Registration
The enrollment is limited to several dozen students. Instructor’s permission is required to register. Students interested in the course need to join the SSOL waitlist and MUST also complete this questionnaire. The instructor will move students off the SSOL waitlist after reviewing the questionnaire.
(Tentative) Grading for Spring 2024
Assignments: 30% (there may be 2-4 assignments per semester)
Topic Papers: student presentation + code + discussions: 30%
Project (proposal presentation + final presentation + final report + code repository): 30%
Quizzes + class contribution: 10%
Assignment submission policy:
A total of four late days is allowed across all assignments combined.
Late days do not apply to the final project report, slides, presentations, or code.
Content
Analytical study and software design.
Several assignments in Python and PyTorch.
Significant project.
Pursuing deeper exploration of deep learning.
Syllabus (2024 Spring)
Attention Fundamentals
Transformer Architecture
Building the Transformer
Vision Transformer
Segmentation Transformer
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
End-to-End Object Detection with Transformers: DETR
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
GPT
Generative Models
Diffusion Models: DDPM and DALL·E 2
Diffusion Models: From LDMs to Stable Diffusion
Scalable Diffusion Models with Transformers
CLIP: Contrastive Language-Image Pre-training - Learning Transferable Visual Models From Natural Language Supervision
Mamba SSM: Linear-Time Sequence Modeling with Selective State Spaces
LLaMA and LLaMA adapter
Unifying LLMs and Knowledge Graphs
Frameworks for Autonomous Language Agents
Stealing Secrets from a Production Language Model
Graph Neural Networks
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
YOLOv3 to YOLOv9
Multiresolution Pyramids: HOG, SIFT, Scattering Networks
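The first two syllabus items (Attention Fundamentals, Transformer Architecture) center on one operation, scaled dot-product attention: softmax(QKᵀ/√d_k)·V. As a quick reference point for incoming students, here is a minimal NumPy sketch of that operation; this is illustrative only, not course material, and all function and variable names are the author's own:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Return (output, attention weights) for one attention head."""
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights that sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors
    return weights @ V, weights

# Toy example: 2 queries attending over 3 key/value pairs, d_k = 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out, w = scaled_dot_product_attention(Q, K, V)  # out: (2, 4), w: (2, 3)
```

In the Transformer papers on the syllabus, Q, K, and V are learned linear projections of token embeddings, and many such heads run in parallel.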
Organization
Lectures:
Presentation of material by instructors and guest lecturers
Student Presentations
Every student contributes several presentations on the subject of interest (can be in groups)
Assignments:
Combination of analytical and programming assignments
Quizzes:
During the class time
Projects:
Team-based
Students with complementary backgrounds
Significant design
Reports and presentations to Columbia and NYC community
Best projects could qualify for publications and/or funding
In-person attendance is mandatory
Prerequisites:
Required: knowledge of linear algebra, probability and statistics, programming, machine learning, first course in deep learning.
Prerequisite courses: ECBM E4040 or similar
Time:
Spring 2024 - https://doc.sis.columbia.edu/#subj/EECS/E6691-20241-001/
Spring 2023
Spring 2022 - Advanced Deep Learning (EECS E6691 TPC - Topics in Data-driven Analysis and Computation)
Spring 2021 - Advanced Deep Learning (EECS E6691 TPC - Topics in Data-driven Analysis and Computation)
Project Areas
Smart cities
Medical Applications
Autonomous vehicles
Environmental
Physical data analytics
Finance
Books, Tools and Resources
BOOKS:
Tools/Software platform:
PyTorch (main framework), TensorFlow, Google Cloud, Python, Bitbucket
2024 Spring Projects
AudioMamba: Mamba Architecture for Audio Classification
Music Generation with Music Transformer and LSTM
COVID-19 Forecasting using Spatio-Temporal Graph Attention Networks
ConMamba: Convolution-augmented Mamba for Speech Recognition
Diminished Reality for Emerging Applications in Medicine through Inpainting
Efficient Deep Learning Investigation
Real-time Automatic Face Anonymization in Video Streams
Development of Discrete Prompt Learning via Evolutionary Search
Heart Disease Detection Using Transformer Models
3D Tumor Segmentation with U-Net: Analyzing MRI Scans for Medical Insights
Low-Light Raw Image Enhancement
Direct Preference Optimization and Proximal Policy Optimization on Small Language Models
3D Scene Reconstruction using Neural Radiance Fields and Structure from Motion
Dynamic Video Generation from Static Comic Panels
Small Object Detection using improved YOLO
Predictive Modeling of Tennis Player Poses and Ball Trajectory
VMamba: Visual State Space Model
2023 Spring Projects
Object Recognition and Seq2Seq Models for Handwritten Equation Recognition and LaTeX Translation
Deep Learning for Soccer Pass Receiver Prediction in Broadcast Images
Automatic Person Removal Pipeline
AI Photographic Assistant: Implementation of Deep Learning Photographic Tools
The Development of Athena: Leveraging GPT-3.5 for Adaptive and Interactive Intelligent Tutoring for Personalized Learning
What’s Happened, Happened: Leveraging Past Decisions for Improved Interactive Image Segmentation
ASL Detection Correction and Completion
A hierarchical attention based model for biopsy classification
Enhancing Indoor Bouldering Experience for Color-Blind Climbers: A Deep Learning Approach for Route Identification on Climbing Walls
SwagGAN (StyleGAN for Fashion)
Learning Multi-scale Visual Representation via Language Description for Segmentation
Deep Learning For Financial Time Series
Classifying neuron cell types within Drosophila melanogaster using graph convolutional networks
SAM Based Cell Blood Classification Model
Instance-Level Image Retrieval While Navigating Interactive 3D Environments
Gaze and head redirection model based on StyleGAN3
2022 Spring Projects
Pix2Pix Image-to-Image Translation with Conditional Adversarial Networks
Jump rope counter
Black Box Adversarial Attack with Style Information
Learning Signed Distance Function for 3D Shape Representation (DeepSDF)
Representation learning without any labeled data
Comparison of Self-Supervised Models for Music Classification
Subcellular localization of proteins using deep learning
Image Descriptions Generator
Predicting remaining surgery duration
Vision Transformer
Adversarial Audio Synthesis
PlaNet - Latent Dynamics from Pixels
RecoNET: Understanding What happens in Videos
2021 Spring Projects
Multi-Graph Graph Attention Network for Music Recommendation
3D Facial Reenactment from 2D Video
Temporal Fusion Transformers for Time Series Forecasting
Deep Reinforcement Learning for Environmental Policy
Stochastic & Split Convolutions for High Resolution Image Classification (Integrated with EfficientNet)
Forecasting Corn Yields with Semiparametric CNNs
Pose Estimation + Instance Segmentation
Speaker Independent Speech Separation
Generalized Autoencoder-based Transfer Learning for Structural Damage Assessment
End-to-end object detection with Transformers
Exploring latent space of InfoGAN
2018-2020 Projects
See list of projects under E6040 link
Course sponsored by equipment and financial contributions of:
NVIDIA GPU Education Center, Google Cloud, IBM Bluemix, AWS Educate, Atmel, Broadcom (WICED platform), Intel (Edison IoT platform), Silicon Labs.
PREVIOUS SEMESTERS
Detailed Description for Spring 2023
Instructor: Dr. Mehmet Kerem Turkcan mkt2126 (at) columbia.edu
This is an advanced-level course in which students study topics in deep learning. Students are required to have previously taken a first course in deep learning. The course consists of: (i) lectures on state-of-the-art architectural and modeling concepts, (ii) assignments, (iii) an exam, and (iv) a final project. The course will address topics beyond the material covered in the first course on Deep Learning (such as ECBM E4040), with applications of interest to students. In 2023, the main subject of the lectures will be object detection.
Students entering the course must have prior experience with deep learning and neural network architectures, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memories (LSTMs), and autoencoders. They need a working knowledge of coding in Python, Python libraries, Jupyter notebooks, and TensorFlow, both on local machines and on Google Cloud, and of GitHub or similar code hosting tools. The framework and associated tools which will be the focus of this course are PyTorch and Google Cloud. Students must be self-sufficient learners and take an active role during classroom activities.
There will be a few (3-4) assignments throughout the semester focusing on coding. In the second half of the course, there will be a midterm exam consisting of multiple-choice questions.
Final projects need to be documented in a conference-style report, with code deposited in a GitHub repository. The code needs to be documented and instrumented such that the instructor can run it after a download from the repository. A Google Slides presentation of the project suitable for a poster presentation is required.
Prerequisites
(i) Machine Learning (taken previously, or in parallel with this course).
(ii) ECBM E4040 Neural Networks and Deep Learning, or an equivalent neural network/DL university course taken for academic credit.
(iii) The course requires an excellent theoretical background in probability and statistics, and linear algebra.
Students are strongly advised to drop the class if they do not have an adequate theoretical background and/or previous experience with programming deep learning models. It is strongly advised (the instructor’s requirement) that students take no more than 12 credits of any coursework (including this course and project courses) during the semester while this course is being taken.
Registration
The enrollment is limited to several dozen students. The instructor’s permission is required to register. Students interested in the course need to join the SSOL waitlist and MUST also complete the questionnaire. The instructor will move students off the SSOL waitlist after reviewing the questionnaire.
(Tentative) Grading for the course (2023 Spring)
Assignments: 30%
Midterm Exam (Delivered at Week 11): 30%
Project (Final report & Code Repository): 40%
(Potential) Class Contribution: x