DS-2002: Data Science Systems

Jon Tupitza
School of Data Science
University of Virginia


Course Description

By definition, "data science" must make meaning out of ever-growing pools of data. But the researcher quickly discovers that the hand examination of any data, while useful for granular analysis, is never adequate for large samples. To produce data science at scale, researchers must make effective use of workflows, pipelines, and processes to ingest, parse, and transform data with tools and automation.

This course will center on exposing students to contemporary pipelines for data analysis through a series of steadily escalating use cases. The course will begin with simple local database construction and evolve to cloud-based infrastructure such as AWS or Google Cloud. This progression will include learning a variety of systems for data collection, orchestration, transformation, consumption, and others as appropriate.

Spring 2023 - This course will be held in person.

Lectures will usually be held on Tuesdays. Occasionally, they may be recorded and made available to watch on your own schedule. Class discussion, further detail on the content, and Q&A will take place on Zoom or in the discussion channel of our Discord server. Thursdays will be used for in-person labs and hands-on application of Tuesday's readings and lecture material.

Office hours are available at your request: drop your question in the #office-hours channel in Discord, or DM a TA (details coming) or the instructor to set up a time to chat. If you would like to use office hours to talk about careers or anything else, please DM me and we can set up a time to meet.


This course will emphasize hands-on experience in the creation, management, and consumption of various computational services that support the practice of data science. For the purposes of getting started I will assume that you fit roughly into at least one of three categories as you approach the subject:

  1. Data Scientist / Researcher
  2. Developer / CS
  3. Data Engineer / Systems

Students will learn how to implement data science systems according to best practices, with an emphasis upon creating reusable and portable environments.


Component                               Weight   Notes / Due
Lectures, readings and other material   -        Weekly before class discussion
Labs                                    30%      Weekly
Engaged Discussions                     5%       -
Quizzes                                 15%      3 or 4 quizzes
Data Projects                           50%      2 projects
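
As a worked illustration of how the weights above combine, here is a minimal Python sketch that computes a weighted course score. The function name `weighted_score` and the example component scores are hypothetical, and the sketch assumes each component is reported as a percentage from 0 to 100. It is not an official grade calculator.

```python
# Weights in percentage points, taken from the grading table (they sum to 100).
WEIGHTS = {
    "labs": 30,
    "discussions": 5,
    "quizzes": 15,
    "projects": 50,
}


def weighted_score(scores):
    """Combine per-component percentages (0-100) into one course percentage.

    `scores` is a dict with one entry per key in WEIGHTS (an assumption of
    this sketch; the syllabus does not prescribe a data format).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return total / 100


# Hypothetical scores, for illustration only.
example = {"labs": 90, "discussions": 100, "quizzes": 80, "projects": 92}
print(weighted_score(example))  # (2700 + 500 + 1200 + 4600) / 100 = 90.0
```

Because the two data projects carry half of the total weight, a change of one point in the project average moves the course score by half a point.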

Grading Scale

Grade   Point Range
A+      98-100
A       94-97
A-      90-93
B+      87-89
B       84-86
B-      80-83
C+      77-79
C       74-76
C-      70-73
D+      67-69
D       64-66
D-      60-63
F       59 and below
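
The scale above can be expressed as a simple lookup. The following Python sketch is illustrative only: the function name `letter_grade` is hypothetical, and it assumes that the lower bound of each range acts as the cutoff, so a fractional score such as 93.5 falls in the A- band. The syllabus does not state how fractional scores are rounded.

```python
# Lower bound of each band from the grading scale, highest first.
CUTOFFS = [
    (98, "A+"), (94, "A"), (90, "A-"),
    (87, "B+"), (84, "B"), (80, "B-"),
    (77, "C+"), (74, "C"), (70, "C-"),
    (67, "D+"), (64, "D"), (60, "D-"),
]


def letter_grade(score):
    """Return the letter grade for a course percentage.

    Assumption: a score belongs to the highest band whose lower bound it
    meets or exceeds; anything below 60 is an F.
    """
    for lower_bound, letter in CUTOFFS:
        if score >= lower_bound:
            return letter
    return "F"


for sample in (98, 90, 86.5, 59):
    print(sample, letter_grade(sample))
```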

Office Hours

Most interactions will occur within the course Discord server, and office hours will be scheduled upon request. Generally, I'll hold them from 3 to 4 at Pav VII. You can also post your questions to the #questions channel, or message the instructor or TA directly. We will do our best to reply to all questions within a few hours.


Students in this course will be expected to use the following pieces of software on a weekly basis:

Student Action Items

Create a free GitHub account
Install git locally
Fork the Course GitHub Repository
Create a free Discord account / join server


Honor Policy

The course will be conducted according to the UVA honor system. Programming assignments and exams are to be completed by the individual (no group collaborations). You will sign an honor pledge for all assignments, quizzes, and exams; more importantly, I expect you to adhere to the intent of the pledge. Cooperative efforts at understanding the material and technologies of the course are encouraged.

All suspected violations will be forwarded to the Honor Committee, and you may, at my discretion, receive an immediate zero on that assignment regardless of any action taken by the Honor Committee.

If you believe you may have committed an Honor Offense, you may wish to file a Conscientious Retraction by calling the Honor Offices at (434)924-7602. For your retraction to be considered valid, it must, among other things, be filed with the Honor Committee before you are aware that the act in question has come under suspicion by anyone. More information can be found at http://honor.virginia.edu. Your Honor representatives can be found at: http://honor.virginia.edu/representatives.

Disabilities or Learning Needs

It is my goal to create a learning experience that is as accessible as possible. If you anticipate any issues related to the format, materials, or requirements of this course, please meet with me outside of class so we can explore potential options. Students with disabilities may also wish to work with the Student Disability Access Center to discuss a range of options for removing barriers in this course, including official accommodations. Please visit their website for information on this process and to apply for services online: sdac.studenthealth.virginia.edu. If you have already been approved for accommodations through SDAC, please send me your accommodation letter and meet with me so we can develop an implementation plan together.

Lectures and other learning material will be made available throughout the semester, and all assignments and exams will be granted ample time for completion. Should you require accommodations through SDAC for extra time, please contact the instructor.

Discrimination and power-based violence

The University of Virginia is dedicated to providing a safe and equitable learning environment for all students. To that end, it is vital that you know two values that I and the University hold as critically important:

  1. Power-based personal violence will not be tolerated.
  2. Everyone has a responsibility to do their part to maintain a safe community on Grounds.

If you or someone you know has been affected by power-based personal violence, more information can be found on the UVA Sexual Violence website that describes reporting options and resources available - www.virginia.edu/sexualviolence.

As your professor and as a person, know that I care about you and your well-being and stand ready to provide support and resources as I can. As a faculty member, I am a responsible employee, which means that I am required by University policy and federal law to report what you tell me to the University's Title IX Coordinator. The Title IX Coordinator's job is to ensure that the reporting student receives the resources and support that they need, while also reviewing the information presented to determine whether further action is necessary to ensure survivor safety and the safety of the University community. If you wish to report something that you have seen, you can do so at the Just Report It portal (http://justreportit.virginia.edu/). The worst possible situation would be for you or your friend to remain silent when there are so many here willing and able to help.

Religious Accommodations

It is the University's long-standing policy and practice to reasonably accommodate students so that they do not experience an adverse academic consequence when sincerely held religious beliefs or observances conflict with academic requirements.

Students who wish to request academic accommodation for a religious observance should submit their request in writing directly to me as far in advance as possible. Students who have questions or concerns about academic accommodations for religious observance or religious beliefs may contact the University’s Office for Equal Opportunity and Civil Rights (EOCR) at UVAEOCR@virginia.edu or 434-924-3200.

Preservation of classroom discussions

I will preserve weekly class discussions in Discord for the duration of the semester. Because these discussions include fellow students, you and they may be personally identifiable in these discussion logs. These logs may only be used for the purpose of individual or group study with other students enrolled in this class during this semester. You may not distribute them in whole or in part through any other platform or to any persons outside of this class, nor may you make your own copies of class discussions unless written permission has been obtained from the Instructor and all participants in the class have been informed. For additional details, please see Provost Policy 008, which is expected to be updated for the Fall 2020 semester.

Other Details

Support or Contact

If you have a question about any aspect of this course - a particular topic, method, concept, etc. - please contact the TAs or me via Discord or email. It is often the case that you're not the only one having trouble understanding it!