Modern applications, including video recommendation systems, video search and retrieval, and large-scale scientific experiments, involve the acquisition and analysis of petabytes of high-dimensional data. Distributed computing refers to the collaboration of many networked processing units that pool their processing capacity to solve a single large problem. Nowadays, many systems and applications are distributed for a variety of reasons: fault tolerance, processing performance, security, and the geographical spread of the data or of the problem requirements.
This course digs into the internals of distributed computing and storage architectures, with particular emphasis on the algorithms and techniques that underlie today’s distributed computing systems. Topics addressed in this course include: modeling of distributed computation, an introduction to clouds (Map Reduce and key-value stores), distributed shared memory, distributed compression algorithms with applications to distributed file synchronization, authentication, distributed process scheduling, and distributed optimization algorithms (e.g., gossip, consensus, pulse-coupled oscillators).
The course will present the Map Reduce programming model for processing large data sets, as well as representative applications. Task scheduling methodologies for the efficient allocation of distributed computing resources will also be presented. Furthermore, we will address distributed file synchronization mechanisms based on distributed compression principles and algorithms. We will also present gossip and consensus algorithms for distributed big data processing, as well as algorithms for distributed synchronization and desynchronization of networked agents’ clocks.
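To give a flavor of the Map Reduce programming model mentioned above, the following is a minimal single-machine sketch of the classic word-count example. It is an illustration only, not course material: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the in-memory `shuffle` step stands in for the grouping that a real framework would perform across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input: each string plays the role of one input split.
counts = word_count(["big data big clusters", "data moves"])
```

In a distributed setting, each call to `map_phase` and `reduce_phase` could run on a different machine, because each map call depends only on its own document and each reduce call only on the values of its own key.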