2018 - 2024: Collective operations for MPI
-
Goals: optimization of collective operations for MPI; a sketch of
the tree-based approach follows this entry.
-
Development platform: a multi-core cluster.
-
Selected publications:
-
A. Margolin and A. Barak.
RDMA-Based Library for Collective Operations in MPI.
2019 IEEE/ACM Workshop on Exascale MPI (ExaMPI).
Int'l Conf. for High Performance Computing, Networking,
Storage and Analysis (SC19), Denver, Nov. 2019.
Also,
IEEE Xplore, Jan. 2020.
-
A. Margolin and A. Barak.
Tree-Based Fault-Tolerant Collective Operations for MPI.
Workshop on Exascale MPI (ExaMPI).
Int'l Conf. for High Performance Computing, Networking,
Storage and Analysis (SC18), Dallas, Nov. 2018.
Also,
Concurrency and Computation: Practice and Experience,
special issue, Wiley Online, June 2020.
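-
The tree-based approach can be illustrated with a minimal
binomial-tree broadcast built from point-to-point messages. This is
a generic textbook sketch in C, not the published fault-tolerant or
RDMA-based implementation; tree_bcast is an illustrative name.

    #include <stdio.h>
    #include <mpi.h>

    /* Minimal binomial-tree broadcast over point-to-point messages.
     * A generic illustration of tree-based collectives; the published
     * work adds fault tolerance and RDMA transports to this idea. */
    static void tree_bcast(void *buf, int count, MPI_Datatype type,
                           int root, MPI_Comm comm)
    {
        int rank, size, mask;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        int rel = (rank - root + size) % size;  /* rank relative to root */

        /* Every non-root rank first receives the data from its parent,
         * which differs from it in the lowest set bit of rel. */
        for (mask = 1; mask < size; mask <<= 1) {
            if (rel & mask) {
                int parent = (rel - mask + root) % size;
                MPI_Recv(buf, count, type, parent, 0, comm,
                         MPI_STATUS_IGNORE);
                break;
            }
        }
        /* Then it forwards the data down its own subtree. */
        for (mask >>= 1; mask > 0; mask >>= 1) {
            if (rel + mask < size) {
                int child = (rel + mask + root) % size;
                MPI_Send(buf, count, type, child, 0, comm);
            }
        }
    }

    int main(int argc, char **argv)
    {
        int rank, value = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            value = 42;                  /* payload to broadcast */
        tree_bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
        printf("rank %d got %d\n", rank, value);
        MPI_Finalize();
        return 0;
    }

Called with the same arguments on every rank, the broadcast completes
in ceil(log2 P) communication rounds, which is where tree-based
schemes win over a linear root-sends-to-all loop.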
2014 - 2021: MOSIX-4
-
Goals: a major redesign of the MOSIX cluster management platform,
so that it does not require a kernel patch.
-
Development platform: our multi-cluster cloud.
2013 - 2019: The FFMK project - A Fast and Fault-tolerant
Microkernel-based System for Exascale Computing
-
Goals: Parallel algorithms for resilient computing in Exascale systems.
-
Joint research with TUDOS - Operating Systems Group, TU Dresden,
Prof. Dr. H. Härtig (PI);
ZIH - Center for Information Services and HPC, TU Dresden,
Prof. Dr. W. E. Nagel (PI);
ZIB - Zuse Institute Berlin,
Prof. Dr. A. Reinefeld (PI).
-
Selected publications:
-
C. Weinhold, A. Lackorzynski, J. Bierbaum, M. Küttler, M. Planeta,
H. Weisbach, M. Hille, H. Härtig, A. Margolin, D. Sharf, E. Levy,
P. Gak, A. Barak, M. Gholami, F. Schintke, T. Schütt, A. Reinefeld,
M. Lieber and W.E. Nagel.
FFMK: A Fast and Fault-Tolerant Microkernel-Based System for Exascale
Computing.
In: H-J. Bungartz, S. Reiz, B. Uekermann, P. Neumann, W.E. Nagel (Eds.),
Software for Exascale Computing - SPPEXA 2016-2019,
LNCSE, Vol. 136, Springer, Cham, July 2020.
-
M. Küttler, M. Planeta, J. Bierbaum, C. Weinhold, H. Härtig,
A. Barak and T. Hoefler.
Corrected Trees for Reliable Group Communication.
Proc. Principles and Practice of Parallel Programming (PPoPP'19),
Washington, DC, Feb. 2019.
-
C. Weinhold, A. Lackorzynski, J. Bierbaum, M. Küttler, M. Planeta,
H. Härtig, A. Shiloh, E. Levy, T. Ben-Nun, A. Barak, T. Steinke,
T. Schütt, J. Fajerski, A. Reinefeld, M. Lieber, W.E. Nagel.
FFMK: A Fast and Fault-tolerant Microkernel-based System for Exascale
Computing.
In: H-J. Bungartz, P. Neumann, W.E. Nagel (Eds.),
Software for Exascale Computing - SPPEXA 2013 - 2015,
LNCSE, Vol. 113, Springer, Oct. 2016.
2012 - 2018: The MAPS framework for optimizing GPU and multi-GPU applications
-
Goals: Develop optimized GPU and multi-GPU applications using Memory
Access Pattern Specification; a sketch of the pattern idea follows
this entry.
-
Joint research with T. Ben-Nun, E. Levy and E. Rubin.
-
Development platform: our GPU clusters.
-
Selected publications:
-
T. Ben-Nun, E. Levy, A. Barak and E. Rubin.
Memory access patterns: the missing piece of the multi-GPU puzzle.
Proc. IEEE/ACM Int'l Conf. for High Performance Computing,
Networking, Storage and Analysis (SC'15), Austin, Nov. 2015.
-
E. Rubin, E. Levy, A. Barak and T. Ben-Nun.
MAPS: Optimizing massively parallel applications using device-level
memory abstraction.
ACM Trans. on Architecture and Code Optimization (TACO),
Vol. 11(4), Article 44, Dec. 2014.
Also,
Proc. High Performance and Embedded Architecture and
Compilation (HiPEAC 2015), Amsterdam, Jan. 2015.
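-
The pattern idea can be sketched in plain C: the computation names
its input pattern (here, a fixed window around each output index)
instead of computing raw offsets, leaving a framework free to choose
layout, staging and device placement. The names below (window1d,
window_read, smooth) are illustrative, not the MAPS API.

    #include <stdio.h>

    typedef struct {
        const float *data;   /* backing array      */
        int          n;      /* number of elements */
    } window1d;

    /* Read element i+off, clamping at the array boundaries. */
    static float window_read(const window1d *w, int i, int off)
    {
        int j = i + off;
        if (j < 0)      j = 0;
        if (j >= w->n)  j = w->n - 1;
        return w->data[j];
    }

    /* A 3-point moving average written only in terms of the pattern. */
    static void smooth(const window1d *in, float *out)
    {
        for (int i = 0; i < in->n; i++)
            out[i] = (window_read(in, i, -1) +
                      window_read(in, i,  0) +
                      window_read(in, i,  1)) / 3.0f;
    }

    int main(void)
    {
        float a[8] = {1, 2, 3, 4, 5, 6, 7, 8}, b[8];
        window1d w = { a, 8 };
        smooth(&w, b);
        for (int i = 0; i < 8; i++)
            printf("%.2f ", b[i]);
        printf("\n");
        return 0;
    }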
2009 - 2018: VirtualCL (VCL) Cluster Platform
-
Goals: allow applications to transparently utilize many OpenCL
devices (CPUs, GPUs, accelerators) in a cluster; a sketch of the
host-side view follows this entry.
-
Subproject:
The Many GPUs Package (MGP), which provides extended OpenMP
and C++ APIs for running OpenCL kernels.
-
Subproject:
SuperCL, an extension of OpenCL that allows OpenCL micro-programs
to run efficiently on accelerator devices in remote nodes.
-
Joint research with T. Ben-Nun, E. Levy, A. Shiloh and J. Smith.
-
Development platform: a cluster of InfiniBand-connected
(4-way Core i7) nodes, each with NVIDIA and AMD GPU devices.
-
Selected publications:
-
A. Barak, T. Ben-Nun, E. Levy and A. Shiloh.
A Package for OpenCL Based Heterogeneous Computing on Clusters with Many
GPU Devices,
Workshop on Parallel Programming and Applications on Accelerator
Clusters (PPAAC), IEEE Cluster 2010, Crete, Sept. 2010.
-
A. Barak and A. Shiloh,
The MOSIX VCL Cluster Platform (abstract),
Proc. Intel European Research & Innovation Conf., p. 196, Leixlip, Oct. 2011.
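-
Because VCL exposes cluster-wide devices through the unmodified
OpenCL platform API, an ordinary host program sees them exactly as it
sees local hardware. A minimal enumeration loop in C (standard
OpenCL calls only) lists every device the platform makes visible,
local or remote:

    #include <stdio.h>
    #include <CL/cl.h>

    int main(void)
    {
        cl_platform_id plats[16];
        cl_uint nplat = 0;

        /* Enumerate platforms; under VCL one platform can span the cluster. */
        if (clGetPlatformIDs(16, plats, &nplat) != CL_SUCCESS)
            return 1;
        if (nplat > 16)
            nplat = 16;

        for (cl_uint p = 0; p < nplat; p++) {
            cl_device_id devs[64];
            cl_uint ndev = 0;
            if (clGetDeviceIDs(plats[p], CL_DEVICE_TYPE_ALL,
                               64, devs, &ndev) != CL_SUCCESS)
                continue;
            if (ndev > 64)
                ndev = 64;
            for (cl_uint d = 0; d < ndev; d++) {
                char name[256] = "";
                clGetDeviceInfo(devs[d], CL_DEVICE_NAME,
                                sizeof(name), name, NULL);
                /* With VCL, entries here may live on remote nodes; the
                 * application cannot tell them apart from local devices. */
                printf("platform %u, device %u: %s\n", p, d, name);
            }
        }
        return 0;
    }

Build with cc -lOpenCL; the program itself needs no VCL-specific code,
which is the point of the transparent design.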
2008 - 2009: MOSIX Reach the Clouds (MRC)
-
Goals: allow applications to start on one node and then run
on clouds, without pre-copying files to the clouds.
2004 - 2008: MOSIX for Linux-2.6 & Linux-3.x
-
Goals: Redesign and upgrade of MOSIX, including R&D of on-line
algorithms for fair-share allocation of cluster resources, data
compression and virtual-machine migration; a toy fair-share sketch
follows this entry.
-
Joint research with A. Shiloh, L. Amar, E. Levy, T. Maoz, E. Meiri and M. Okun.
-
Development platform: a 96-node cluster.
-
Outcome: a production system - in use worldwide.
-
Selected publications:
-
A. Barak, A. Shiloh and L. Amar.
An Organizational Grid of Federated MOSIX Clusters.
Proc. 5th IEEE Int. Symp. on Cluster Computing and the Grid
(CCGrid'05), pp. 350-357, Cardiff, May 2005.
-
L. Amar, A. Barak, E. Levy and M. Okun.
An On-line Algorithm for Fair-Share Node Allocations in a Cluster.
Proc. 7th IEEE Int. Symp. on Cluster Computing
and the Grid (CCGrid'07), pp. 83-91, Rio de Janeiro, May 2007.
-
E. Meiri and A. Barak.
Parallel Compression of Correlated Files,
IEEE Cluster 2007, pp. 285-292, Austin, Sept. 2007.
-
T. Maoz, A. Barak and L. Amar.
Combining Virtual Machine Migration with Process Migration
for HPC on Multi-Clusters and Grids,
IEEE Cluster 2008, pp. 89-98, Tsukuba, Sept. 2008.
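-
The fair-share theme can be illustrated with a toy allocator that
hands out nodes in equal rounds until each demand is met or the
nodes run out. This is a generic max-min style sketch, not the
on-line algorithm of the CCGrid'07 paper; all names are made up.

    #include <stdio.h>

    /* Give `nodes` to `users` in equal rounds; each round splits the
     * remaining nodes evenly among users whose demand is still unmet. */
    static void fair_share(int nodes, int users,
                           const int *demand, int *alloc)
    {
        int unmet = 0;
        for (int i = 0; i < users; i++) {
            alloc[i] = 0;
            if (demand[i] > 0)
                unmet++;
        }
        while (nodes > 0 && unmet > 0) {
            int quantum = nodes / unmet;    /* equal share of the rest */
            if (quantum == 0)
                quantum = 1;
            for (int i = 0; i < users && nodes > 0; i++) {
                int want = demand[i] - alloc[i];
                if (want <= 0)
                    continue;
                int give = want < quantum ? want : quantum;
                if (give > nodes)
                    give = nodes;
                alloc[i] += give;
                nodes -= give;
                if (alloc[i] == demand[i])
                    unmet--;
            }
        }
    }

    int main(void)
    {
        int demand[3] = { 10, 40, 50 }, alloc[3];
        fair_share(96, 3, demand, alloc);   /* 96 nodes, as in the testbed */
        for (int i = 0; i < 3; i++)
            printf("user %d: wants %d, gets %d\n", i, demand[i], alloc[i]);
        return 0;
    }

With 96 nodes and demands 10/40/50 the sketch yields 10/40/46: the
small demand is satisfied in full and the surplus is split evenly
between the two large ones, which is the max-min fairness property.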
2000 - 2003: MOSIX for Linux-2.4 and DFSA/MFS
-
Goals: upgrade MOSIX to Linux kernel 2.4, including R&D of
scalable cluster file systems and parallel I/O.
-
Joint research with A. Shiloh and L. Amar.
-
Development platform: a 72-node cluster.
-
Outcome: a production system - used worldwide.
-
Selected publications:
-
L. Amar, A. Barak and A. Shiloh.
The MOSIX Parallel I/O System for Scalable I/O Performance.
Proc. 14th Int. Conf. on Parallel and Distributed Computing and
Systems (PDCS'02), pp. 495-500, Cambridge, MA, Nov. 2002.
-
L. Amar, A. Barak and A. Shiloh.
The MOSIX Direct File System Access Method for Supporting
Scalable Cluster File Systems.
Cluster Computing, Vol. 7(2):141-150, April 2004.
1998 - 2000: MOSIX for Linux-2.2
-
Goals: major redesign of MOSIX for Linux kernel 2.2 on x86 computers.
-
Joint research with A. Shiloh and O. La'adan.
-
Development platform: a 32-node cluster.
-
Outcome: a production system - used worldwide.
-
Selected publication:
-
A. Barak, O. La'adan and A. Shiloh.
Scalable Cluster Computing with MOSIX for Linux,
Proc. Linux Expo '99, pp. 95-100, Raleigh, N.C., May 1999.
1991 - 1998: MOSIX for the i486/Pentium and BSD/OS
-
Goals: develop MOSIX for the i486/Pentium and BSDI's BSD/OS, then
study the performance of parallel applications and TCP/IP over the Myrinet LAN.
-
Joint research with A. Shiloh, Y. Yarom, O. La'adan, A. Braverman, I. Gilderman
and I. Metrik.
-
Development platforms: initially, a cluster of 8 i486 workstations
and 10 i486 DX-2 (Multibus II) Single Board Computers.
Later, a cluster with 16 Pentium-100 and
64 Pentium II workstations, connected by Myrinet.
-
Outcome: a production system - used in several countries.
-
Selected publications:
-
A. Barak, O. La'adan and Y. Yarom,
The NOW MOSIX and its Preemptive Process Migration Scheme,
Bull. IEEE Tech. Committee on Operating Systems
and Application Environments, Vol. 7(2):5-11, Summer 1995.
-
A. Barak and O. La'adan. Experience with a Scalable PC Cluster for HPC,
Proc. Cluster Computing Conf. (CCC 97), Emory Univ., Atlanta, GA, March
1997.
-
A. Barak and A. Braverman. Memory Ushering in a Scalable Computing Cluster,
Microprocessors and Microsystems, Vol. 22(3-4):175-182, Aug. 1998.
-
A. Barak, I. Gilderman and I. Metrik. Performance of the Communication
Layers of TCP/IP with the Myrinet Gigabit LAN, Computer Communications,
Vol. 22(11), July 1999.
1988 - 1990: MOSIX for the VME532 and SVR2
-
Goals: develop MOSIX for the VME532 using AT&T System V release 2,
including R&D of distributed algorithms, monitoring tools and evaluation
of the network performance.
-
Joint research with A. Shiloh, R. Wheeler and S. Guday.
-
Development platform: a parallel computer made of 4 VME enclosures,
each with 4-6 VME532 (NS32532) computers that communicated via the VME-bus.
The VME enclosures were connected by the ProNET-80 LAN.
-
Outcome: a production system - installed in several sites.
-
Selected publications:
-
A. Barak and R. Wheeler.
MOSIX: An Integrated Multiprocessor UNIX,
Proc. Winter 1989 USENIX Conf., pp. 101-112, San Diego, CA, Feb. 1989.
-
A. Barak, A. Shiloh and R. Wheeler. Flood Prevention in the MOSIX
Load-Balancing Scheme, IEEE-TCOS Newsletter, Vol. 3(1):24-27,
Winter 1989.
-
A. Barak, S. Guday and R. Wheeler.
The MOSIX Distributed Operating System, Load Balancing for UNIX,
Lecture Notes in Computer Science, Vol. 672, Springer-Verlag, May 1993.
1988: MOSIX for the VAX and SVR2
-
Goals: port MOS to the VAX architecture using AT&T System V
release 2, including R&D of communication protocols.
-
Joint research with A. Shiloh and G. Shwed.
-
Development platform: a cluster of a VAX-780 and 4 VAX-750 computers
connected by Ethernet.
-
Outcome: a running system - used internally.
1983 - 1988: MOS for the M68K and Unix Version 7
-
Goals: develop MOS for the M68K and Bell Labs'
Unix Version 7 with some BSD 4.1 extensions, including R&D of
load-balancing algorithms, distributed file systems and scalability.
-
Joint research with A. Shiloh, O.G. Paradise, D. Malki and R. Wheeler.
-
Development platform: a cluster of 7 CADMUS/PCS MC68K computers connected by
the ProNET-10 LAN.
-
Outcome: a running system - used for follow-up R&D projects.
-
Selected publications:
-
A. Barak and O.G. Paradise. MOS - Scaling Up UNIX. Proc. Summer 1986
USENIX Conf., pp. 414-418, Atlanta, GA, June 1986.
-
A. Barak, D. Malki and R. Wheeler. AFS, BFS, CFS ... or Distributed File
Systems for UNIX, Proc. Autumn 86 EUUG Conf. on Dist. Systems, pp. 214-222,
Manchester, Sept. 1986.
-
A. Barak and O.G. Paradise.
MOS - a Load Balancing UNIX,
Proc. Autumn 86 EUUG Conf. on Dist. Systems, pp. 273-280,
Manchester, Sept. 1986.
1981 - 1983: MOS for the PDP-11 and Unix Version 7
-
Goals: design and implementation of a multi-computer
operating system that provides a single-system image.
-
Joint research with A. Litman, R. Hofner, J. Mizel and A. Shiloh.
-
Development platform: a cluster of a PDP-11/45 and 4 PDP-11/23 computers
initially connected by DRV11-J (Parallel Line Interface)
and later by ProNET-10, running Bell Labs' Unix Version 7.
-
Outcome: a working prototype.
1977 - 1980: UNIX with Satellite Processors
-
Goals: study performance gains when
distributing Unix processes to remote computers.
-
Joint research with A. Shapir, G. Steinberg and A.I. Karshmer.
-
Development platform: a "cluster" of a PDP-11/45 and a PDP-11/10,
connected by parallel I/O, running Bell Labs' Unix Version 6.
-
Outcome: a working prototype.
-
Selected publications:
-
A. Barak and A. Shapir.
UNIX with Satellite Processors.
Software: Practice & Experience, Vol. 10(5):383-392, May 1980.
-
A. Barak, A. Shapir, G. Steinberg and A.I. Karshmer.
A Modular, Distributed UNIX.
Proc. 14th Hawaii Int. Conf. on System Science,
pp. 740-747, January 1981.