Overview
The Knowledge Center provides tools and infrastructure to support research on Chip MultiProcessors (CMP) in Israel. It facilitates collaboration among researchers from separate scientific communities, including architecture, circuits, micro-architecture, VLSI, operating systems, parallel and distributed computing, and computer networks. The center thus enables the multi-disciplinary research required for progress in the area of CMP.
The Knowledge Center is funded in part by the Israeli Ministry of Science, Culture and Sport.
The Knowledge Center is managed by:
Outreach
In February 2009, we held the workshop "Multicore Day: The Challenges of Today and Tomorrow", which was attended by over 400 researchers and engineers from Israel's high-tech industry and Israeli universities.
All research supported by the CMP Knowledge Center, as well as the software and tools we develop, is publicly available to the research community at large. The research papers can be obtained from the MATRICS Publications page.
Software Tools
All software tools and benchmarks developed by the group are offered for public use. For more information please follow the links below, or visit the Software page on the MATRICS Webpage.
SMV: Selective Multi-Versioning STM
Selective Multi-Versioning STM (SMV) is a novel STM algorithm that reduces the number of aborts, especially those of long read-only transactions. SMV keeps old object versions as long as they might be useful for some transaction to read. It is able to do so while still allowing reading transactions to be invisible, by relying on automatic garbage collection to dispose of obsolete versions. In order to evaluate SMV's performance, we have designed and implemented a framework for plugging STM algorithms smoothly into Java code. In the package below you can find this framework plugged into the STMBench7 benchmark suite, together with implementations of the SMV, TL2 and LSA algorithms.
the SMV package: zip file
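The multi-versioning idea described above can be illustrated with a minimal, single-threaded sketch. It is not the SMV algorithm and is not taken from the package: the `VersionedObject` and `Clock` names are hypothetical, every version is kept forever (SMV keeps versions selectively and leaves disposal to the garbage collector), and no conflict detection is modelled. It only shows why a read-only transaction that reads an old version does not need to abort when a concurrent update commits.

```python
class Clock:
    """Hypothetical global commit counter shared by all transactions."""

    def __init__(self):
        self._now = 0

    def now(self):
        return self._now

    def tick(self):
        self._now += 1
        return self._now


class VersionedObject:
    """Keeps each committed value together with its commit timestamp."""

    def __init__(self, initial_value):
        # (commit_timestamp, value) pairs in increasing timestamp order.
        self._versions = [(0, initial_value)]

    def write(self, commit_timestamp, value):
        # A committing update appends a new version instead of overwriting.
        self._versions.append((commit_timestamp, value))

    def read_at(self, snapshot_timestamp):
        # Return the newest value committed at or before the snapshot time.
        for commit_timestamp, value in reversed(self._versions):
            if commit_timestamp <= snapshot_timestamp:
                return value
        raise LookupError("no version old enough for this snapshot")


clock = Clock()
balance = VersionedObject(100)

snapshot = clock.now()             # a long read-only transaction starts
balance.write(clock.tick(), 150)   # a concurrent update commits meanwhile

seen_by_reader = balance.read_at(snapshot)    # still the old, consistent value
seen_by_newcomer = balance.read_at(clock.now())
```

In a single-version STM the reader above would typically have to abort once the value it depends on is overwritten; with the old version still available it can simply continue.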
Simics Workload Kits
Virtutech Simics is a full-system simulator platform that is becoming quite popular within the computer architecture research community. Unfortunately, building and setting up benchmarks for the simulator is a time-consuming task with a long ramp-up time. To ease this burden, we aim to supply scripts that (to some extent) automate the build of several applications. For more information, please see the Simics Workload Kits page.
OPNET Models for NoC
In order to examine the effects of various design and implementation options on the performance of the NoC, we use "OPNET Modeler". This commercial, GUI-based environment facilitates the development of a modular, hierarchical description of the NoC architecture and allows rapid evaluation of its components using an event-driven simulation engine. We have implemented router, source and sink modules that enable the modeling of an entire NoC, accounting for its flow control, virtual channels, resource contention, arbitration policy, finite buffers, link capacities, etc. These models are offered for public use; for details, please contact zigi@tx.technion.ac.il. Note that a license is required in order to use OPNET Modeler, as specified at www.opnet.com.
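To give a feel for what an event-driven model of a router with a finite buffer and a limited link capacity looks like, here is a minimal sketch in plain Python. It is unrelated to the OPNET models themselves: `simulate_router` and its parameters are hypothetical, there is a single output port, no virtual channels and no arbitration, and a packet that finds the buffer full is merely recorded as rejected, whereas a real NoC model would apply flow control (for example, by stalling the source).

```python
import heapq
from collections import deque

# Event kinds; DEPART sorts first so a slot freed at time t can be reused at t.
DEPART, ARRIVE = 0, 1


def simulate_router(arrival_times, buffer_slots, link_cycles_per_packet):
    """Simulate one router output port feeding one sink.

    Returns (delivered, rejected): delivered maps packet id to the time it
    reached the sink; rejected lists packet ids that found the buffer full.
    """
    events = [(time, ARRIVE, pid) for pid, time in enumerate(arrival_times)]
    heapq.heapify(events)
    buffer = deque()   # FIFO of buffered packets; the head is on the link
    delivered = {}
    rejected = []

    while events:
        now, kind, pid = heapq.heappop(events)
        if kind == ARRIVE:
            if len(buffer) >= buffer_slots:
                rejected.append(pid)      # contention: no free buffer slot
                continue
            buffer.append(pid)
            if len(buffer) == 1:
                # The link was idle, so transmission starts immediately.
                heapq.heappush(events, (now + link_cycles_per_packet, DEPART, pid))
        else:
            buffer.popleft()              # the head packet has left the link
            delivered[pid] = now
            if buffer:
                # Start transmitting the next buffered packet.
                heapq.heappush(
                    events, (now + link_cycles_per_packet, DEPART, buffer[0])
                )
    return delivered, rejected


delivered, rejected = simulate_router(
    arrival_times=[0, 0, 0, 5], buffer_slots=2, link_cycles_per_packet=2
)
```

With three packets arriving in the same cycle and only two buffer slots, the third packet is rejected, and the second is delayed behind the first by one link transmission time.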
Transactified Apache Kit
Apache is a large-scale industrial multi-process and multi-threaded application, which uses lock-based synchronization. We have experimented with modifying Apache to employ transactional memory instead of locks, a process we refer to as transactification; we are not aware of any previous efforts to transactify legacy software of such a large scale. We have transactified Apache's memory cache module, mod_mem_cache, using Intel's experimental STM C/C++ compiler. For the transactified Apache code, along with detailed instructions on how to install it, test it, and use it as a benchmark, please see the Transactified Apache Kit page.
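The shape of the transformation can be sketched as follows. This is an illustration only, written in Python rather than C, and it does not mirror the mod_mem_cache code or the Intel compiler's syntax: the two cache classes are hypothetical, and `atomic()` is a stand-in that simply takes one global lock so that the example runs. A real STM would instead execute the block optimistically and roll it back on conflict.

```python
import threading
from contextlib import contextmanager

# Stand-in for a transactional runtime (an assumption of this sketch).
_single_global_lock = threading.RLock()


@contextmanager
def atomic():
    """Run the enclosed block atomically with respect to other atomic blocks."""
    with _single_global_lock:
        yield


class LockBasedCache:
    """Before: the code owns a lock and must pair every acquire with a release."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}
        self.total_bytes = 0

    def insert(self, key, body):
        self._lock.acquire()
        try:
            previous = self._entries.get(key, b"")
            self._entries[key] = body
            self.total_bytes += len(body) - len(previous)
        finally:
            self._lock.release()


class TransactifiedCache:
    """After: the same critical section is marked as a transaction."""

    def __init__(self):
        self._entries = {}
        self.total_bytes = 0

    def insert(self, key, body):
        with atomic():
            previous = self._entries.get(key, b"")
            self._entries[key] = body
            self.total_bytes += len(body) - len(previous)
```

The transactified version no longer names a particular lock; which data the block touches, and how conflicts are resolved, becomes the concern of the transactional runtime.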
Infrastructure
The Knowledge Center provides unique multi-processor hardware (a tightly coupled cluster server) for running CMP-related experiments. It maintains two 32-core (8X4) NUMA-based HP servers. More information about the system may be found on the HP site: General Overview, Short Specification, Detailed Specification.
The Simics simulation toolkit is installed on the machines, allowing full-system simulations of CMP systems. The servers may be accessed from outside the Technion using SSH.
The infrastructure supports a broad range of CMP-related research projects. We are always happy to expand this set of projects. The infrastructure may be used, based on availability, for CMP-related research by Israeli Academia, at no cost. Use of the infrastructure is also open to CMP-related R&D in Israeli industries, based on availability, at cost price. We are currently offering two free months of usage for CMP-related R&D to Israeli companies.
Researchers interested in using the infrastructure should send a one-page project proposal to the Knowledge Center Manager, Idit Keidar.
Supported Research
Our vision of future nanoscale integrated systems involves many processors and shared memories, interconnected by on-chip packet-switched networks. We believe that the challenges such systems raise should be tackled by an interdisciplinary team of researchers with diverse expertise in areas such as circuits, microarchitecture, VLSI, networking and concurrent programming.
The Knowledge Center supports the interdisciplinary research of a team comprising the following Principal Investigators at the Technion:
- Yitzhak (Tsahi) Birk
- Israel Cidon
- Ran Ginosar
- Idit Keidar
- Isaac Keslassy
- Avinoam Kolodny
- Avi Mendelson
- Uri Weiser
The group addresses several aspects of such systems, including:
- Networks on Chips (architecture and circuits, quality of service, routers)
- Chip Multi-Processor (CMP) architecture
- Shared memory architectures for CMP systems
- Advanced NoC services for CMP systems
- Software issues in CMP systems
Background and Rationale
Microcomputers are currently undergoing a paradigm shift. Previously, the speed and performance of computer systems improved annually through micro-architecture improvements, which used speculative execution to exploit instruction-level parallelism, as well as through smaller circuits and higher processor frequencies. However, it is now apparent that this trend cannot continue, due to the enormous power requirements of chips that operate at a high frequency. In fact, the main consideration in computer architecture design today is power efficiency, that is, the power spent per unit of performance. Power considerations are ubiquitous: they are essential for mobile computers, due to difficulties in heat dissipation and battery life optimization, as well as for servers in data centers, where air conditioning costs are dominant. At the same time, the opportunities for exploiting instruction-level parallelism have been exhausted, and the high power-per-performance cost of speculative execution is no longer acceptable under today's design considerations.
In order to conserve energy while improving performance, a conceptual change is due. The shrinking feature size of CMOS technology allows multiple processor cores to be placed on a single chip. The trend in computer architecture is therefore shifting from the production of faster uni-processors, as was done until recently, to architectures based on multiple cores on the same chip. Thus, instead of a single energy-inefficient core (processor), the chip will include many slower, and hence more energy-efficient, cores. Computer manufacturers are already producing systems with 2, 4, and 8 processor cores. In the future, such systems are projected to incorporate dozens of cores. This approach is called Chip MultiProcessor (CMP).
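The claim that many slower cores can be more energy-efficient than one fast core can be made concrete with a back-of-the-envelope calculation. The sketch below rests on idealized assumptions that are not part of the text above: dynamic power follows the classic CMOS relation P ∝ C·V²·f, supply voltage can be scaled linearly with frequency (so per-core power grows with the cube of frequency), leakage power is ignored, and the workload parallelizes perfectly across cores. The function names are hypothetical.

```python
def relative_dynamic_power(frequency_scale, cores=1):
    """Total dynamic power relative to one core at full frequency.

    Assumes P ∝ C·V²·f with V ∝ f, hence per-core power ∝ f³.
    """
    return cores * frequency_scale ** 3


def relative_throughput(frequency_scale, cores=1):
    """Throughput relative to one full-speed core, assuming perfect scaling."""
    return cores * frequency_scale


# One fast core versus four cores running at half the frequency.
single_power = relative_dynamic_power(1.0, cores=1)
single_throughput = relative_throughput(1.0, cores=1)
quad_power = relative_dynamic_power(0.5, cores=4)
quad_throughput = relative_throughput(0.5, cores=4)
```

Under these assumptions the four half-speed cores deliver twice the throughput of the single fast core at half its dynamic power. Real designs fall short of this ideal, since voltage cannot be lowered indefinitely and few workloads scale perfectly, but the direction of the trade-off is the same.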
The shift to CMP mandates a shift to exploiting thread-level parallelism, instead of the instruction-level parallelism exploited in traditional systems. New research is needed to design software structures suited to highly parallel multi-threading, and thereby address the needs of software developers for CMP architectures. Communication and synchronization among the parallel computing elements will become a major bottleneck in such systems. It is expected that such future software models will be based on shared memory rather than on direct core-to-core message passing. In this context, the structure, layout, and functionality of the on-chip cache hierarchy will be of utmost importance. It is expected that the on-chip caches will be distributed, and will have non-uniform access times. That is, different processors will incur different access times to different memory units, based on physical distance on the chip and other architectural considerations.
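As a simple illustration of non-uniform access times, the sketch below estimates the latency from a core to a cache bank on a chip. Everything in it is an assumption made for the example: cores and banks sit on a 2D mesh, latency grows linearly with the Manhattan distance between them, and the cycle counts are placeholders rather than measurements of any real design.

```python
def access_latency(core, bank, base_cycles=10, cycles_per_hop=2):
    """Estimated cycles for `core` to access `bank` on a hypothetical 2D mesh.

    `core` and `bank` are (x, y) tile coordinates; the cost is a fixed bank
    access time plus a per-hop network delay along the Manhattan distance.
    """
    (core_x, core_y), (bank_x, bank_y) = core, bank
    hops = abs(core_x - bank_x) + abs(core_y - bank_y)
    return base_cycles + cycles_per_hop * hops


# The same bank is cheap for a nearby core and expensive for a distant one.
bank = (0, 0)
latency_local = access_latency((0, 0), bank)
latency_neighbor = access_latency((1, 0), bank)
latency_far_corner = access_latency((3, 3), bank)
```

Even in this crude model, the far-corner core pays more than twice the latency of the local one, which is why data placement and the layout of the cache hierarchy matter.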
The inevitable shift to CMP architectures raises a multitude of research problems and challenges, whose solution is of critical importance to the advancement of this technology. These problems arise in several different tiers, including:
- micro-architecture of multiple cores and memories on the same chip;
- on-chip communication solutions, both at the circuit level and at the network-on-chip (NoC) design level;
- operating systems;
- compilers; and
- software development tools for massive multi-threaded applications.
Typically, each of these tiers (research topics) is pursued in a separate scientific community, with expertise in its respective area, publishing in its own conferences and journals. Researchers within each community make standard (fixed) assumptions about the operation modes and cost models of other tiers.
Nevertheless, we argue that a dramatic paradigm shift such as the switch to CMP mandates the removal of the traditional barriers between communities, since the standard assumptions reflect the operation modes of the past and are inadequate for the new reality. Scientific progress and technological development in this area must rely on multi-disciplinary research, traversing all the relevant tiers and creating new interactions among them.