||Introduction of Python/C, Linux and HPC environment||Running their own jobs on HPC.
||Numerical methods for Partial Differential Equations (PDE)||Model as PDE and solve them using numerical methods.
||Message Passing Interface (MPI)||Write MPI jobs and performance studies.
||Introduction of Data Science||Know basic tasks and techniques of Data Science.
||Basics of Big Data||Understand the basics of Big Data and demo programs.
||Big Data system: Hadoop/Spark||Write Hadoop/Spark jobs and run them on HPC.
||Basics of Machine Learning||Write a machine learning program using Spark MLlib.
||Basics of earth-atmosphere radiative energy balance and global warming||Understand basic concepts and principles of radiative energy balance and global warming.
||Basics of radiative transfer simulation framework||Understand the basic physics underlying the transport of radiation in atmosphere.
||GCM simulation and satellite observations||Understand the importance of GCM and satellite remote sensing.
||Project introduction and assignment||Each interdisciplinary team will be assigned one project.
||Project progress report from each team and feedback||20-minute report from each team + Q&A + rating.
||Final project presentation||Report, software, and a final presentation from each team.
Introduction of Python/C, Linux and HPC environment. The first module explains the overall
structure of the program and the basic knowledge it requires. It briefly goes through a
programming language such as Python or C. It also introduces the hardware architecture,
available software and basic usage of the UMBC HPCF environment.
Numerical Methods for Partial Differential Equations. This module will explain the basics of
partial differential equations (PDEs), which are commonly used in physical models. It will
discuss the use of numerical methods for PDEs, which is a major driving force behind research
in many other fields such as numerical linear algebra, scientific computing, and the
development of parallel computers. It will cover the three basic PDE categories (elliptic,
parabolic and hyperbolic) and their mathematical properties with examples. It will discuss
two large classes of methods: finite difference and finite element methods.
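As an illustration of the finite difference class of methods, the sketch below advances the 1D heat equation u_t = alpha * u_xx (a parabolic PDE) with the explicit forward-time, centered-space scheme. The function name `heat_ftcs`, the grid size, the time step and the sine initial condition are assumptions chosen for this example, not part of the module material.

```python
import math


def heat_ftcs(u0, alpha, dx, dt, steps):
    """Advance u_t = alpha * u_xx with the explicit FTCS scheme.

    The two boundary values are held fixed (Dirichlet conditions).
    """
    r = alpha * dt / dx**2
    if r > 0.5:
        # Standard stability limit of the explicit scheme in 1D.
        raise ValueError("unstable: need alpha * dt / dx**2 <= 0.5")
    u = list(u0)
    for _ in range(steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            # Centered second difference approximates u_xx at grid point i.
            new[i] = u[i] + r * (u[i + 1] - 2 * u[i] + u[i - 1])
        u = new
    return u


# Hypothetical setup: unit interval, 21 grid points, u(x, 0) = sin(pi * x).
n = 21
dx = 1.0 / (n - 1)
dt = 0.001
steps = 100
u0 = [math.sin(math.pi * i * dx) for i in range(n)]
u = heat_ftcs(u0, alpha=1.0, dx=dx, dt=dt, steps=steps)

# For this initial condition the exact solution is exp(-pi^2 * t) * sin(pi * x).
exact_mid = math.exp(-math.pi**2 * dt * steps) * math.sin(math.pi * 0.5)
print(u[n // 2], exact_mid)
```

The two printed values should agree to within the discretization error. Raising `dt` so that `alpha * dt / dx**2` exceeds 0.5 triggers the stability check rather than producing a diverging solution.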
Message Passing Interface (MPI). This module will explain how to write MPI programs, which is
one of the most common approaches to building portable and scalable parallel scientific
applications. It will cover basic MPI commands such as MPI_Send and MPI_Recv, and collective
communication commands such as MPI_Bcast, MPI_Reduce/MPI_Allreduce, and
MPI_Gather/MPI_Scatter. It will also explain how to write MPI programs in both C and Python
(through mpi4py).
Introduction of Data Science. This module will explain the basic concepts of Data Science,
including the generic lifecycle and the different stages of data analytics, such as
acquisition, cleaning/preprocessing, integration/aggregation, analysis/modeling and
interpretation. It
will cover basics of descriptive statistics, graphic displays of data summaries, and basics
of probability theory (including Bayes’ theorem).
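To make the statistics and probability topics concrete, here is a minimal sketch using only the Python standard library. It computes a few descriptive statistics and applies Bayes' theorem to a screening-test scenario. The sample values, the probabilities and the helper name `bayes_posterior` are made up for illustration.

```python
import statistics


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) given P(H), P(E | H) and P(E | not H)."""
    # Total probability of observing the evidence E.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Descriptive statistics on a small, made-up sample.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = {
    "mean": statistics.mean(samples),
    "median": statistics.median(samples),
    "pstdev": statistics.pstdev(samples),  # population standard deviation
}
print(summary)

# Hypothetical test: 1% prevalence, 95% sensitivity, 5% false positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))
```

With these assumed numbers the posterior stays well below one half even though the test is fairly accurate, because the prior is small.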
Basics of Big Data. This module will explain the basics of Big Data, including its 5V
characteristics. It starts with the challenges and bottlenecks that many applications face
when dealing with large volumes of data. Then it will introduce the basics of distributed
file systems and why they are needed. It will cover Big Data concepts/techniques: data
partitioning, data parallelization, key-value pairs, functional programming and MapReduce.
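The concepts listed above can be sketched on a single machine in plain Python. The word-count example below mimics the MapReduce pattern: each partition is mapped to key-value pairs, the pairs are grouped by key, and each group is reduced to one value. It is only a toy model of the idea, not Hadoop or Spark code, and the function names and input lines are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Emit one (word, 1) key-value pair per word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values of one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


# Two hypothetical data partitions, as if stored on two different nodes.
partitions = [["big data big"], ["data systems"]]

mapped = [pair for part in partitions for line in part for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because `map_phase` looks at one line at a time and `reduce_phase` at one key at a time, both phases could run in parallel across partitions, which is the point of the programming model.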
Big Data system: Hadoop/Spark. This module will cover how to use two popular Big Data
systems, namely Hadoop and Spark. It will explain how the Hadoop Distributed File System
(HDFS) achieves data partitioning and fault tolerance, and how cluster management and job
scheduling work in Hadoop/Spark. For Spark, it will explain resilient distributed datasets
(RDDs), RDD transformations (map, join, cogroup, etc.) and actions (count, collect, foreach,
etc.).
Basics of Machine Learning. This module will explain the main lifecycle (training, testing,
applying) and main types of machine learning (supervised and unsupervised learning). Major
techniques to be covered include inferential statistics, feature selection, regression,
correlation, clustering and classification. It will also explain how to build machine
learning programs for Big Data with Spark MLlib.
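The training, testing and applying lifecycle can be illustrated without any Big Data system. The sketch below fits a one-variable least-squares regression in plain Python. It does not use Spark MLlib, and the synthetic data, the train/test split and the helper names `fit_line` and `mse` are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic, noise-free data generated from y = 2x + 1.
data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
train, test = data[:7], data[7:]

model = fit_line(*zip(*train))            # training
test_error = mse(model, *zip(*test))      # testing on held-out points
prediction = model[0] * 20.0 + model[1]   # applying to a new input
print(model, test_error, prediction)
```

Holding out the last three points keeps the evaluation separate from the data used for fitting, which is the same discipline a supervised learning workflow on a cluster would follow.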
Basics of earth-atmosphere radiative energy balance and global warming. This module will
explain the basic concepts and principles that control the radiative energy balance of the
earth-atmosphere system, and its implications for climate. The module will start with the
fundamental physics, such as black-body radiation, followed by the zero-order radiative
energy balance between incoming solar radiation and outgoing terrestrial longwave radiation.
The module will end with a discussion of the roles that greenhouse gases, aerosols and
clouds play in the radiative energy budget.
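The zero-order balance can be written as S0 * (1 - A) / 4 = sigma * T^4: absorbed solar radiation averaged over the sphere equals black-body emission at an effective temperature T. The sketch below solves this for T. The solar constant of 1361 W/m^2 and the planetary albedo of 0.3 are commonly quoted values used here as assumptions, and the function name is hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def effective_temperature(solar_constant, albedo):
    """Solve S0 * (1 - A) / 4 = SIGMA * T**4 for T (in kelvin)."""
    # The factor 1/4 is the ratio of the intercepting disk area to the sphere area.
    absorbed = solar_constant * (1.0 - albedo) / 4.0
    return (absorbed / SIGMA) ** 0.25


# Assumed inputs: solar constant in W/m^2 and a planetary albedo of 0.3.
t_eff = effective_temperature(solar_constant=1361.0, albedo=0.3)
print(f"{t_eff:.1f} K")
```

With these inputs the result is about 255 K, which is colder than the observed global mean surface temperature. The gap is what the discussion of greenhouse gases, aerosols and clouds sets out to explain.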
Basics of radiative transfer simulation framework. Following the previous module, this
module will introduce the fundamental physical principles that control the transport of
radiation (i.e., visible and infrared light) in our atmosphere. The module will also include
an introduction to the Monte Carlo method and its application to radiative transfer.
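A very small Monte Carlo experiment gives a feel for the approach. The sketch below estimates the direct transmittance of a purely absorbing, plane-parallel layer at normal incidence by sampling the optical depth each photon travels before interacting, then compares the estimate with the Beer-Lambert value exp(-tau). Scattering is ignored, and the function name, photon count and seed are assumptions for illustration.

```python
import math
import random


def mc_direct_transmittance(tau, n_photons, seed=0):
    """Estimate the fraction of photons crossing a layer of optical depth tau."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    transmitted = 0
    for _ in range(n_photons):
        # Optical path to the first interaction follows an exponential law.
        # rng.random() lies in [0, 1), so the argument of log is never zero.
        path = -math.log(1.0 - rng.random())
        if path > tau:
            transmitted += 1
    return transmitted / n_photons


tau = 1.0
estimate = mc_direct_transmittance(tau, n_photons=100_000)
print(estimate, math.exp(-tau))
```

The statistical error of the estimate shrinks roughly as one over the square root of the number of photons, which is why realistic simulations with scattering are a natural fit for parallel computing.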
GCM simulation and satellite observations. This module will start with an introduction to
the basic concepts and principles of numerical climate simulations, followed by explaining
the importance of evaluating climate simulations and why satellite remote sensing products
are invaluable for climate model evaluation. Basic concepts and principles underlying
satellite remote sensing will also be introduced in this module.
Project introduction and assignment. This module will explain available research projects to
be conducted in the following five weeks (see below for possible projects). For each
project, it will cover the required techniques, suggested phases and major tasks, expected
outputs, output evaluation metrics and challenges for each discipline. Each team will be
assigned one project to work on.
Project progress report from each team and feedback from instructors. These three modules
will be weekly project progress updates and discussions. Since each team has three members,
every member will serve as a presenter for the reports. All instructors and other teams will
discuss the progress, perform peer review, provide feedback and give ratings.
Final project presentation. The final module will consist of the final project presentation
and the delivery of the final CI software program and technical report. Each team will give
a talk on the problems to be solved, the approaches taken, a demonstration of the developed
software program, the experiments and results, and the contributions of each member. All
instructors and other teams will provide feedback, ratings and suggestions for future work.