Towards a Composable Computer System

Authors:

Paul Crumley

HPCAsia '18: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region

Pages 137 - 147

https://doi.org/10.1145/3149457.3149466

Published: 28 January 2018

Abstract

Recent advances in both software and hardware enable us to revisit the concept of composable architecture in system design. A composable system design provides the flexibility to serve a variety of workloads. Such a system offers a dynamic co-design platform that allows experiments and measurements in a controlled environment, which speeds up system design and software evolution. It also decouples the lifecycles of components. The design considerations include adopting available technology based on an understanding of application characteristics. We show that, with this flexibility, the design has the potential to serve as the infrastructure for both cloud computing and HPC architectures serving a variety of workloads.

Index Terms

Towards a Composable Computer System
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Software and its engineering
  1. Software organization and properties
    1. Software system structures
      1. Distributed systems organizing principles

Index terms have been assigned to the content through auto-classification.

Published In

HPCAsia '18: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region

January 2018

322 pages

ISBN:9781450353724

DOI:10.1145/3149457

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

SIGHPC: ACM Special Interest Group on High Performance Computing
IPSJ: Information Processing Society of Japan
Cybermedia Center, Osaka University

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

HPC Asia 2018

HPC Asia 2018: International Conference on High Performance Computing in Asia-Pacific Region

January 28 - 31, 2018

Tokyo, Chiyoda, Japan

Acceptance Rates

HPCAsia '18 paper acceptance rate: 30 of 67 submissions (45%)

Overall acceptance rate: 69 of 143 submissions (48%)

Bibliometrics

Total citations: 22
Total downloads: 540
Downloads (last 12 months): 41
Downloads (last 6 weeks): 7

Reflects downloads up to 07 Jan 2025

Cited By

Namiki S. (2024). Digital Infrastructure Pivots on Silicon Photonics: An Aspect from the Past, Present, and Future of Optical Communications. 2024 IEEE Silicon Photonics Conference (SiPhotonics), 1-2. Online publication date: 15-Apr-2024. https://doi.org/10.1109/SiPhotonics60897.2024.10544249

Markussen J, Kristiansen L, Kvale Stensland H, Halvorsen P. (2024). Multi-Host Sharing of a Single-Function NVMe Device in a PCIe Cluster. SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, 1638-1645. Online publication date: 17-Nov-2024. https://doi.org/10.1109/SCW63240.2024.00204

Gloukhovtsev M. (2024). Sustainable high-performance computing. Making IT Sustainable, 137-156. Online publication date: 2024. https://doi.org/10.1016/B978-0-443-13597-2.00006-6

He B, Zheng X, Chen Y, Li W, Zhou Y, Long X, Zhang P, Lu X, Jiang L, Liu Q, Cai D, Zhang X. (2023). DxPU: Large-scale Disaggregated GPU Pools in the Datacenter. ACM Transactions on Architecture and Code Optimization 20:4, 1-23. Online publication date: 14-Dec-2023. https://dl.acm.org/doi/10.1145/3617995

He Z, Saluja A, Lawrence R, Chakravorty D, Dang F, Perez L, Liu H. (2023). Performance of Distributed Deep Learning Workloads on a Composable Cyberinfrastructure. Practice and Experience in Advanced Research Computing 2023: Computing for the Common Good, 60-67. Online publication date: 23-Jul-2023. https://dl.acm.org/doi/10.1145/3569951.3593601

Masouros D, Pinto C, Gazzetti M, Xydis S, Soudris D. (2023). Adrias: Interference-Aware Memory Orchestration for Disaggregated Cloud Infrastructures. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 855-869. Online publication date: Feb-2023. https://doi.org/10.1109/HPCA56546.2023.10070939

Ishii K, Matsumoto R, Inoue T, Namiki S. (2022). Disaggregated optical-layer switching for optically composable disaggregated computing [Invited]. Journal of Optical Communications and Networking 15:1, A11. Online publication date: 31-Oct-2022. https://doi.org/10.1364/JOCN.471132

Ma H, Liu S, Wang C, Qiao Y, Bond M, Blackburn S, Kim M, Xu G, Jhala R, Dillig I. (2022). Mako: a low-pause, high-throughput evacuating collector for memory-disaggregated datacenters. Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 92-107. Online publication date: 9-Jun-2022. https://dl.acm.org/doi/10.1145/3519939.3523441

Pornpongtechavanich P, Nilsook P, Wannapiroon P. (2022). Intelligent Composable Education System. 2022 Research, Invention, and Innovation Congress: Innovative Electricals and Electronics (RI2C), 178-184. Online publication date: 4-Aug-2022. https://doi.org/10.1109/RI2C56397.2022.9910265

Gu W, Xie X, Dong D. (2022). LTNoT: Realizing the Trade-Offs Between Latency and Throughput in NVMe over TCP. Algorithms and Architectures for Parallel Processing, 412-432. Online publication date: 10-Oct-2022. https://dl.acm.org/doi/10.1007/978-3-031-22677-9_22