田中 昌宏 (タナカ マサヒロ)

Tanaka, Masahiro


Affiliation (Campus)

Graduate School of Science and Technology (Yagami Campus)

Position

Project Associate Professor (Fixed-term)

 

Books

  • System software for data-intensive science

Tatebe O., Oyama Y., Tanaka M., Ohtsuji H., Takatsu F., Li X., Advanced Software Technologies for Post-Peta Scale Computing: The Japanese Post-Peta CREST Research Project, December 2018

Abstract (an illustrative placement sketch follows):

    © Springer Nature Singapore Pte Ltd. 2019. All rights reserved. The storage performance is an issue for supercomputers to facilitate the data-intensive science. To improve the storage bandwidth according to the number of compute nodes, we assume a node-local scale-out storage architecture. The number of local storages increases according to the number of compute nodes, and the total storage bandwidth increases scalably. Our research target is a distributed file system in the node-local storage architecture, an operating system for compute node, and runtime systems for the distributed file system using node-local storages for workflow systems, MapReduce, MPI-IO, and batch job schedulers.
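    The chapter above centers on the idea that, with node-local scale-out storage, aggregate bandwidth grows with the number of compute nodes, so runtimes (workflow systems, MapReduce, MPI-IO) should place each task on a node that already holds a local copy of its input. The following is a minimal Python sketch of that placement idea; the replica map, node names, and file names are hypothetical sample data, and this is not the Gfarm or Pwrake API.

    # Minimal sketch of locality-aware task placement on node-local storage.
    # The replica map (which compute nodes hold a local copy of each input
    # file) is hypothetical; a real system would query the file system.

    replica_map = {
        "frame-001.fits": {"node01", "node03"},
        "frame-002.fits": {"node02"},
        "frame-003.fits": {"node01", "node04"},
    }

    node_load = {f"node{i:02d}": 0 for i in range(1, 5)}  # tasks assigned per node


    def place_task(input_file: str) -> str:
        """Prefer a node that already stores the input (local read, no network
        transfer); fall back to the least-loaded node if no replica exists."""
        candidates = replica_map.get(input_file) or node_load.keys()
        chosen = min(candidates, key=lambda node: node_load[node])
        node_load[chosen] += 1
        return chosen


    if __name__ == "__main__":
        for f in replica_map:
            print(f, "->", place_task(f))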

Papers

  • Applying Pwrake Workflow System and Gfarm File System to Telescope Data Processing

    M Tanaka, O Tatebe, H Kawashima

2018 IEEE International Conference on Cluster Computing (CLUSTER), pp. 124-133, September 2018

Research paper (international conference proceedings), co-authored, peer-reviewed, ISBN 9781538683194

Abstract (an illustrative task-dependency sketch appears at the end of this publication list):

    © 2018 IEEE. In this paper, we describe a use case applying a scientific workflow system and a distributed file system to improve the performance of telescope data processing. The application is pipeline processing of data generated by Hyper Suprime-Cam (HSC) which is a focal plane camera mounted on the Subaru telescope. In this paper, we focus on the scalability of parallel I/O and core utilization. The IBM Spectrum Scale (GPFS) used for actual operation has a limit on scalability due to the configuration using storage servers. Therefore, we introduce the Gfarm file system which uses the storage of the worker node for parallel I/O performance. To improve core utilization, we introduce the Pwrake workflow system instead of the parallel processing framework developed for the HSC pipeline. Descriptions of task dependencies are necessary to further improve core utilization by overlapping different types of tasks. We discuss the usefulness of the workflow description language with the function of scripting language for defining complex task dependency. In the experiment, the performance of the pipeline is evaluated using a quarter of the observation data per night (input files: 80 GB, output files: 1.2 TB). Measurements on strong scaling from 48 to 576 cores show that the processing with Gfarm file system is more scalable than that with GPFS. Measurement using 576 cores shows that our method improves the processing speed of the pipeline by 2.2 times compared with the method used in actual operation.

  • Design of Fault Tolerant Pwrake Workflow System Supported by Gfarm File System

    M Tanaka, O Tatebe

Proceedings of the 9th Workshop on Many-Task Computing on Clouds, Grids, and …, 2016

Research paper (international conference proceedings), co-authored, peer-reviewed

  • Disk Cache-Aware Task Scheduling For Data-Intensive and Many-Task Workflow

    M Tanaka, O Tatebe

2014 IEEE International Conference on Cluster Computing (IEEE Cluster 2014) …, 2014

Research paper (international conference proceedings), co-authored, peer-reviewed

  • Workflow scheduling to minimize data movement using multi-constraint graph partitioning

    M Tanaka, O Tatebe

Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster …, 2012

Research paper (international conference proceedings), co-authored, peer-reviewed

Large-Scale Data Processing with the Parallel Distributed Workflow System Pwrake

Masahiro Tanaka, Osamu Tatebe

JAXA Research and Development Report: Journal of Space Science Informatics Japan, 1, 67-75, 2012

Research paper (bulletin of university or research institution), co-authored, peer-reviewed
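The CLUSTER 2018 paper above (Pwrake and Gfarm applied to HSC data processing) argues that a workflow description language with full scripting capability makes complex task dependencies easy to define. Pwrake workflows are actually written as Ruby Rakefiles; the Python sketch below only illustrates the same idea, declaring file-based tasks (target, prerequisites, action) in a loop and resolving them in dependency order. The stage names (calibrate, stack) and file names are hypothetical.

    # Sketch of make/Rake-style file-based task dependencies in a scripting
    # language. Pwrake itself uses Ruby's Rake; this Python version only
    # illustrates the concept. Stage and file names are hypothetical.

    import os

    tasks = {}  # target file -> (list of prerequisite files, action)


    def task(target, prereqs, action):
        tasks[target] = (prereqs, action)


    def outdated(target, prereqs):
        """True if the target is missing or older than an existing prerequisite."""
        if not os.path.exists(target):
            return True
        return any(os.path.exists(p) and os.path.getmtime(p) > os.path.getmtime(target)
                   for p in prereqs)


    def build(target):
        """Build prerequisites first, then run the action if the target is stale."""
        if target not in tasks:
            return  # plain input file, nothing to build
        prereqs, action = tasks[target]
        for p in prereqs:
            build(p)
        if outdated(target, prereqs):
            action(target, prereqs)


    def calibrate(target, prereqs):
        print("calibrate", prereqs[0], "->", target)
        open(target, "w").close()


    def stack(target, prereqs):
        print("stack", prereqs, "->", target)
        open(target, "w").close()


    # The scripting language lets a loop generate one calibration task per frame.
    frames = ["frame-001.fits", "frame-002.fits"]
    calibrated = []
    for f in frames:
        out = f.replace(".fits", ".calib.fits")
        task(out, [f], calibrate)
        calibrated.append(out)
    task("coadd.fits", calibrated, stack)

    build("coadd.fits")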
