Summary of Article: Active Flash: Towards Energy-Efficient, In-Situ Data Analytics on Extreme-Scale Machines

Summary: The article discusses the energy and performance inefficiency of high-performance computing (HPC) on large-scale supercomputers, caused by increased data movement between the supercomputers and their storage devices during data analysis. The article suggests using Active Flash, where the data analysis is conducted on the solid-state drive (SSD) itself. Active Flash is observed to consume 5 to 9 times less energy than processing the same data on the host processor.

Strengths: For each 100 MB of input data, Active Flash is observed to reduce data movement and energy consumption compared to other approaches. The modelling also shows that it improves overall application performance. Active Flash is recommended as a cost-effective approach for future in-situ scientific data analysis. Other research has suggested that reconfigured processing systems would need to be added to SSDs for data-intensive jobs, but this study uses an approach that does not require additional hardware changes.

Weaknesses: When the read size used by the host application is poorly aligned, the SSD-to-host I/O time increases. The prototype is tested on small amounts of data, and it is not clear whether the same data movement and energy efficiency will be attained when larger amounts of data are involved. Other research has suggested that reconfigured processing systems will be required on SSDs for data-intensive jobs.

Questions: Will Active Flash remain as energy efficient for large amounts of data as it is for 100 MB data transfers? The study uses OpenSSD-based prototype modelling that incorporates a simple event loop; what would the results be if a prototype with a more complex event loop were used?