Introduction to the Analysis of Electronic Health Records - Online Course
A 3-Day Livestream Seminar Taught by
Jesse Gronsbell
10:00am-12:30pm (convert to your local time)
1:30pm-3:30pm
The widespread adoption of electronic health records (EHR) has generated massive amounts of clinical data with potential to improve healthcare delivery and advance biomedical research. EHRs contain comprehensive patient-level information collected over time, including demographics, disease diagnoses, medical procedures, and vital signs. Large-scale EHR databases are also being increasingly linked across healthcare systems and to biobanks containing detailed genetic data to characterize individual health at unprecedented scale and precision.
However, EHR data is complex and heterogeneous. Effective data analysis requires a deep understanding of the data as well as familiarity with modern statistical and machine learning methods. This course will provide a broad overview of the analysis of EHR data for participants with little or no prior experience with the topic. We will start with the opportunities and challenges associated with the analysis of EHR data. We will then build an understanding of data provenance and structure. Finally, we will cover basic and advanced methods for EHR data analysis and their use in various research applications.
We will cover a full suite of methods for processing EHR data, developing phenotyping models, generating real-world evidence, and developing fair and privacy-preserving predictive models. You will also be introduced to publicly available datasets, software packages for statistical analyses, and tools for clinical natural language processing. The course will be hands-on and use the R and RStudio computing environment. After completing the course, you will be prepared to analyze your own EHR dataset and deepen your knowledge of the topic.
Starting November 7, this seminar will be presented as a 3-day synchronous, livestream workshop via Zoom. Each day will feature two lecture sessions with hands-on exercises, separated by a 1-hour break. Live attendance is recommended for the best experience. But if you can’t join in real time, recordings will be available within 24 hours and can be accessed for four weeks after the seminar.
Closed captioning is available for all live and recorded sessions. Captions can be translated to a variety of languages including Spanish, Korean, and Italian. For more information, click here.
ECTS Equivalent Points: 1
Computing
This seminar will use R as the base software and incorporate publicly available clinical natural language processing software such as MetaMap. All of the datasets used for exercises are openly available and detailed instructions will be provided for additional software.
Basic familiarity with R is highly desirable, but even novice R coders should be able to follow the presentation and do the exercises.
If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics. Here are our recommendations.
Who should register?
This course is for you if you want to learn the fundamentals of EHR data analysis and apply them to your own biomedical research questions. While no prior knowledge of EHR data is necessary, knowledge of linear and logistic regression is required for the course.
Seminar outline
Day 1
1. Introduction to electronic health record (EHR) data
- Types of EHR systems
- EHR terminology
- Data structure and provenance
2. Opportunities and challenges for EHR-based applications
- Opportunities: comparative effectiveness studies, clinical decision support, biobank analyses, etc.
- Challenges: selection bias, missing data, measurement error, etc.
Day 2
3. Curating research-quality data
- Code mapping
- Free-text processing
4. EHR-based phenotyping
- Rule-based algorithms
- Machine learning methods
Day 3
5. Real-world evidence generation with EHRs
6. Predictive modeling with EHRs
Payment information
The fee of $995 USD includes all course materials.
PayPal and all major credit cards are accepted.
Our Tax ID number is 26-4576270.