From 0e15af6da06027b35736e6dffd9e695dd315c601 Mon Sep 17 00:00:00 2001 From: Lofstead Date: Wed, 21 Aug 2024 22:24:14 -0600 Subject: [PATCH 1/5] wrote article draft --- Articles/Blog/ACM-REP-23-24.md | 45 ++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100644 Articles/Blog/ACM-REP-23-24.md diff --git a/Articles/Blog/ACM-REP-23-24.md b/Articles/Blog/ACM-REP-23-24.md new file mode 100644 index 000000000..33f228fd6 --- /dev/null +++ b/Articles/Blog/ACM-REP-23-24.md @@ -0,0 +1,45 @@ +# The new ACM REP conference + + +#### Contributed by [Jay Lofstead](https://github.com/gflofst "Jay Lofstead GitHub Profile") + +#### Publication date: September 10, 2024 + +The ACM REP conference helps build better software by highlighting developments in reproducibility and replicability for computational science. + +A few years ago, growing out of Carlos Maltzahn, his student Ivo Jimenez, and my work, we saw momentum towards a critical mass of people interested in reproducibility and replicability challenges and successes for computational science. To help drive the community and grow those efforts, the topic of Ivo Jimenez's PhD thesis, we gathered prominent community members and started an Exploratory Interest Group (EIG) with the Association of Computing Machinery (ACM). Our goal was to explore how big this community was and see if we could make a catalyst and meeting point for these researchers and practitioners to help drive improving scientific software quality. + +Within a few years, we felt like we had enough momentum and in Summer 2023, we had the first ACM REP conference. Even with short notice and not much advertising, we managed to get a critical mass of research papers and attendance that proved the community was both large enough and interested in participating. + +For Summer 2023, we met in the Hay Barn at the University of California, Santa Cruz and online. 
With the success of that effort, we took the conference to Rennes, France in Summer 2024 to see what more convenient attendance for the European community would muster. We had slight more attendance and a different mix showing a fragmented community we could help bring together. + +For 2025, we will bring ACM REP to Vancouver, Canada the last week of July examining what attendance may be like for something closer to east Asia, but not with the potential visa issues for something hosted in the USA. + +At ACM REP 23, we had three keynotes each looking at a different aspect of sustainable software. Torsten Hoefler described his path into proper statistics for analyzing data and the impacts that had on quality. Juliana Feire presented on the challenges in many domains for data driven science. The last keynote was from Grigori Fursin laid out the efforts he has been involved with that trying to provide tools to support reproducible science. In all of these cases, the depth and breadth of available research topics were made clear and the challenges in achieving success are shown to be daunting. For us to achieve better scientific software, considerable research will be needed. + +At ACM REP 24, our first keynote was from Anne-Laure Boulesteix talking about the challenges in machine learning and how reproducibility is particularly challenging as ML-infused approaches become an integral part of our computational science. Our second keynote was from Konrad Hinsen talking about the problems with platforms and how GUIX can give a portable, stable, reproducible base upon which to run reproducible computational science. While not a complete solution, it offers stronger infrastructure to make many of the ill-defined elements more about the science rather than the experimental environment. 
+ +At both years, we featured a wide variety of papers ranging from success stories to incremental, but difficult and important improvements to make reproducibility and replicability achievable. The discussions among attendees has also shown this to be a vibrant and collaborative group interested in advancing the field rather than trying to hide their own efforts. + +Science is basically defined as something that can be reproduced in a new environment by someone else and achieve statistically the same results. Sometimes it is exactly and there are some other minor variations. Overall, that ability to recreate the results again is key for something to be considered **science**. With the importance of computational science to our scientific inquiry process. Ensuring it is science must be done or our results are little more than, "I got this result and I trust it. Therefore so should you." + +This conference series was started hoping we could achieve a critical mass to foster community for sustainable computational science. The first two years have shown that this critical mass exists and is ready to grow. + +Our latest effort is to encourage experience papers talking about lessons learned from reproducing other computational research. These experience papers can be from any conference or journal and will help justify the educational and research value in exploring existing work by offering a publication credit for writing up the results. + +As with both previous years, we plan to continue offering tutorials with a stronger emphasis on hands-on tutorials rather than lecture style. This will help people learn how to use new tools. + +If you wish to participate, please visit https://acm-rep.github.io/2025/ to keep abreast of the conference as it comes organizes. + +Together, we can make better scientific software through better support for reproducibility and replicability. 
+ +### Author bio + +Jay Lofstead is a Principal Member of Technical Staff at Sandia National Laboratories. His research interests focus around large scale data management and trusting scientific computing. In particular, he works on storage, IO, metadata, workflows, reproducibility, software engineering, machine learning, and operating system-level support for any of these topics. Broadly across these topics, he is also deeply interested in ethics related to these topics and computing in general and how to drive inclusivity across the computation-related science domains. Dr. Lofstead received his Ph.D. in Computer Science from the Georgia Institute of Technology in 2010. + + + From 3d4ca1ff26c2e3ca71179749e3cfd2af34a55f87 Mon Sep 17 00:00:00 2001 From: "David E. Bernholdt" <426409+bernhold@users.noreply.github.com> Date: Sat, 24 Aug 2024 15:11:59 -0400 Subject: [PATCH 2/5] Initial edits --- Articles/Blog/2024-08-ACM-REP-23-24.md | 50 ++++++++++++++++++++++++++ Articles/Blog/ACM-REP-23-24.md | 45 ----------------------- 2 files changed, 50 insertions(+), 45 deletions(-) create mode 100644 Articles/Blog/2024-08-ACM-REP-23-24.md delete mode 100644 Articles/Blog/ACM-REP-23-24.md diff --git a/Articles/Blog/2024-08-ACM-REP-23-24.md b/Articles/Blog/2024-08-ACM-REP-23-24.md new file mode 100644 index 000000000..7f2132891 --- /dev/null +++ b/Articles/Blog/2024-08-ACM-REP-23-24.md @@ -0,0 +1,50 @@ +# The New ACM Conference on Reproducibility and Replicability (ACM REP) + +#### Contributed by [Jay Lofstead](https://github.com/gflofst) + +#### Publication date: August 28, 2024 + +The ACM REP conference helps build better software by highlighting developments in reproducibility and replicability for computational science. + +A few years ago, growing out of Carlos Maltzahn, his student Ivo Jimenez, and my work, we saw growing momentum of people interested in reproducibility and replicability challenges and successes for computational science. 
To help drive the community and grow those efforts, the topic of Ivo Jimenez's PhD thesis, we gathered prominent community members and started an Exploratory Interest Group (EIG) with the Association for Computing Machinery (ACM). Our goal was to explore how big this community was and see if we could make a catalyst and meeting point for these researchers and practitioners to help drive improving scientific software quality. + +Within a few years, we felt like we had enough momentum and in Summer 2023 we held the first ACM REP conference. Even with short notice and not much advertising, we managed to get a critical mass of research papers and attendance that proved the community was both large enough and interested in participating. + +For ACM REP 23, we met in the Hay Barn at the University of California, Santa Cruz and online. With the success of that effort, we took the conference to Rennes, France in the Summer of 2024 to see what more convenient attendance for the European community would muster. We had slightly more attendance and a different mix showing a varied community we could help bring together. + +For 2025, we will bring ACM REP to Vancouver, Canada the last week of July, exploring what attendance may be like for something closer to east Asia, but without the potential visa issues of a meeting hosted in the USA. + +At [ACM REP 23](https://acm-rep.github.io/2023/), we had three keynotes each looking at a different aspect of sustainable software. Torsten Hoefler described his path into proper statistics for analyzing data and the impacts that had on quality. Juliana Feire presented on the challenges in many domains for data driven science. The last keynote, from Grigori Fursin, laid out the efforts he has been involved in trying to provide tools to support reproducible science. In all of these cases, the depth and breadth of available research topics were made clear and the challenges in achieving success are shown to be daunting. 
For us to achieve better scientific software, considerable research will be needed. + +At [ACM REP 24](https://acm-rep.github.io/2024/), our first keynote was from Anne-Laure Boulesteix talking about the challenges in machine learning and how reproducibility is increasingly challenging as ML-infused approaches become an integral part of our computational science. Our second keynote was from Konrad Hinsen talking about the problems with platforms and how GUIX can give a portable, stable, reproducible base upon which to run reproducible computational science. While not a complete solution, it offers stronger infrastructure to make many of the ill-defined elements more about the science rather than the experimental environment. + +At both years, we featured a wide variety of papers ranging from success stories to incremental, but difficult and important improvements to make reproducibility and replicability achievable. The discussions among attendees has also shown this to be a vibrant and collaborative group interested in advancing the field rather than trying to hide their own efforts. + +Science is basically defined as something that can be reproduced in a new environment by someone else and achieve statistically the same results. Sometimes that reproduction is exact and at other times, there are minor variations. Overall, that ability to recreate the results is key for something to be considered **science**. With the importance of computational science to our modern process of scientific inquiry, ensuring it is **science** is a requirement, or our contributions are little more than, "I got this result and I trust it. Therefore so should you." + +This conference series was started hoping we could achieve a critical mass to foster community for sustainable computational science. The first two years have shown that this critical mass exists and is ready to grow. 
+ +Our latest effort is to encourage experience papers talking about lessons learned from reproducing other computational research. These experience papers can be from any conference or journal and will help justify the educational and research value in exploring existing work by offering a publication credit for writing up the results. + +As with both previous years, we plan to continue offering tutorials with a stronger emphasis on hands-on tutorials rather than lecture style. This will help people learn how to use new tools. + +If you wish to participate, please visit https://acm-rep.github.io/2025/ to keep abreast of the conference as it comes organizes. + +Together, we can make better scientific software through better support for reproducibility and replicability. + +### Further information + +* [ACM REP 23 proceedings](https://dl.acm.org/doi/proceedings/10.1145/3589806) +* [ACM REP 24 proceedings](https://dl.acm.org/doi/proceedings/10.1145/3641525) +* [ACM REP 25 website](https://acm-rep.github.io/2025/) + +### Author bio + +Jay Lofstead is a Principal Member of Technical Staff at Sandia National Laboratories. His research interests focus around large scale data management and trusting scientific computing. In particular, he works on storage, IO, metadata, workflows, reproducibility, software engineering, machine learning, and operating system-level support for any of these topics. Broadly across these topics, he is also deeply interested in ethics related to these topics and computing in general and how to drive inclusivity across the computation-related science domains. Dr. Lofstead received his Ph.D. in Computer Science from the Georgia Institute of Technology in 2010. 
+ + + diff --git a/Articles/Blog/ACM-REP-23-24.md b/Articles/Blog/ACM-REP-23-24.md deleted file mode 100644 index 33f228fd6..000000000 --- a/Articles/Blog/ACM-REP-23-24.md +++ /dev/null @@ -1,45 +0,0 @@ -# The new ACM REP conference - - -#### Contributed by [Jay Lofstead](https://github.com/gflofst "Jay Lofstead GitHub Profile") - -#### Publication date: September 10, 2024 - -The ACM REP conference helps build better software by highlighting developments in reproducibility and replicability for computational science. - -A few years ago, growing out of Carlos Maltzahn, his student Ivo Jimenez, and my work, we saw momentum towards a critical mass of people interested in reproducibility and replicability challenges and successes for computational science. To help drive the community and grow those efforts, the topic of Ivo Jimenez's PhD thesis, we gathered prominent community members and started an Exploratory Interest Group (EIG) with the Association of Computing Machinery (ACM). Our goal was to explore how big this community was and see if we could make a catalyst and meeting point for these researchers and practitioners to help drive improving scientific software quality. - -Within a few years, we felt like we had enough momentum and in Summer 2023, we had the first ACM REP conference. Even with short notice and not much advertising, we managed to get a critical mass of research papers and attendance that proved the community was both large enough and interested in participating. - -For Summer 2023, we met in the Hay Barn at the University of California, Santa Cruz and online. With the success of that effort, we took the conference to Rennes, France in Summer 2024 to see what more convenient attendance for the European community would muster. We had slight more attendance and a different mix showing a fragmented community we could help bring together. 
- -For 2025, we will bring ACM REP to Vancouver, Canada the last week of July examining what attendance may be like for something closer to east Asia, but not with the potential visa issues for something hosted in the USA. - -At ACM REP 23, we had three keynotes each looking at a different aspect of sustainable software. Torsten Hoefler described his path into proper statistics for analyzing data and the impacts that had on quality. Juliana Feire presented on the challenges in many domains for data driven science. The last keynote was from Grigori Fursin laid out the efforts he has been involved with that trying to provide tools to support reproducible science. In all of these cases, the depth and breadth of available research topics were made clear and the challenges in achieving success are shown to be daunting. For us to achieve better scientific software, considerable research will be needed. - -At ACM REP 24, our first keynote was from Anne-Laure Boulesteix talking about the challenges in machine learning and how reproducibility is particularly challenging as ML-infused approaches become an integral part of our computational science. Our second keynote was from Konrad Hinsen talking about the problems with platforms and how GUIX can give a portable, stable, reproducible base upon which to run reproducible computational science. While not a complete solution, it offers stronger infrastructure to make many of the ill-defined elements more about the science rather than the experimental environment. - -At both years, we featured a wide variety of papers ranging from success stories to incremental, but difficult and important improvements to make reproducibility and replicability achievable. The discussions among attendees has also shown this to be a vibrant and collaborative group interested in advancing the field rather than trying to hide their own efforts. 
- -Science is basically defined as something that can be reproduced in a new environment by someone else and achieve statistically the same results. Sometimes it is exactly and there are some other minor variations. Overall, that ability to recreate the results again is key for something to be considered **science**. With the importance of computational science to our scientific inquiry process. Ensuring it is science must be done or our results are little more than, "I got this result and I trust it. Therefore so should you." - -This conference series was started hoping we could achieve a critical mass to foster community for sustainable computational science. The first two years have shown that this critical mass exists and is ready to grow. - -Our latest effort is to encourage experience papers talking about lessons learned from reproducing other computational research. These experience papers can be from any conference or journal and will help justify the educational and research value in exploring existing work by offering a publication credit for writing up the results. - -As with both previous years, we plan to continue offering tutorials with a stronger emphasis on hands-on tutorials rather than lecture style. This will help people learn how to use new tools. - -If you wish to participate, please visit https://acm-rep.github.io/2025/ to keep abreast of the conference as it comes organizes. - -Together, we can make better scientific software through better support for reproducibility and replicability. - -### Author bio - -Jay Lofstead is a Principal Member of Technical Staff at Sandia National Laboratories. His research interests focus around large scale data management and trusting scientific computing. In particular, he works on storage, IO, metadata, workflows, reproducibility, software engineering, machine learning, and operating system-level support for any of these topics. 
Broadly across these topics, he is also deeply interested in ethics related to these topics and computing in general and how to drive inclusivity across the computation-related science domains. Dr. Lofstead received his Ph.D. in Computer Science from the Georgia Institute of Technology in 2010. - - - From ce82d3d449ab2e65f1e5a745a2431990bc904343 Mon Sep 17 00:00:00 2001 From: "David E. Bernholdt" Date: Sat, 24 Aug 2024 15:57:34 -0400 Subject: [PATCH 3/5] A few more edits --- Articles/Blog/2024-08-ACM-REP-23-24.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/Articles/Blog/2024-08-ACM-REP-23-24.md b/Articles/Blog/2024-08-ACM-REP-23-24.md index 7f2132891..321bbe983 100644 --- a/Articles/Blog/2024-08-ACM-REP-23-24.md +++ b/Articles/Blog/2024-08-ACM-REP-23-24.md @@ -14,21 +14,21 @@ For ACM REP 23, we met in the Hay Barn at the University of California, Santa Cr For 2025, we will bring ACM REP to Vancouver, Canada the last week of July, exploring what attendance may be like for something closer to east Asia, but without the potential visa issues of a meeting hosted in the USA. -At [ACM REP 23](https://acm-rep.github.io/2023/), we had three keynotes each looking at a different aspect of sustainable software. Torsten Hoefler described his path into proper statistics for analyzing data and the impacts that had on quality. Juliana Feire presented on the challenges in many domains for data driven science. The last keynote, from Grigori Fursin, laid out the efforts he has been involved in trying to provide tools to support reproducible science. In all of these cases, the depth and breadth of available research topics were made clear and the challenges in achieving success are shown to be daunting. For us to achieve better scientific software, considerable research will be needed. +At [ACM REP 23](https://acm-rep.github.io/2023/), we had three keynotes each looking at a different aspect of sustainable software. 
Torsten Hoefler described his path into proper statistics for analyzing data and the impacts that had on quality. Juliana Freire presented on the challenges in many domains for data-driven science. The last keynote, from Grigori Fursin, laid out the efforts he has been involved in trying to provide tools to support reproducible science. In all of these cases, the depth and breadth of available research topics were made clear and the challenges in achieving success were shown to be daunting. For us to achieve better scientific software, considerable research will be needed. At [ACM REP 24](https://acm-rep.github.io/2024/), our first keynote was from Anne-Laure Boulesteix talking about the challenges in machine learning and how reproducibility is increasingly challenging as ML-infused approaches become an integral part of our computational science. Our second keynote was from Konrad Hinsen talking about the problems with platforms and how GUIX can give a portable, stable, reproducible base upon which to run reproducible computational science. While not a complete solution, it offers stronger infrastructure to make many of the ill-defined elements more about the science rather than the experimental environment. -At both years, we featured a wide variety of papers ranging from success stories to incremental, but difficult and important improvements to make reproducibility and replicability achievable. The discussions among attendees has also shown this to be a vibrant and collaborative group interested in advancing the field rather than trying to hide their own efforts. +In both conferences, we featured a wide variety of papers ranging from success stories to incremental but difficult and important improvements to make reproducibility and replicability achievable. The discussions among attendees have also shown this to be a vibrant and collaborative group interested in advancing the field rather than trying to hide their own efforts. 
-Science is basically defined as something that can be reproduced in a new environment by someone else and achieve statistically the same results. Sometimes that reproduction is exact and at other times, there are minor variations. Overall, that ability to recreate the results is key for something to be considered **science**. With the importance of computational science to our modern process of scientific inquiry, ensuring it is **science** is a requirement, or our contributions are little more than, "I got this result and I trust it. Therefore so should you." +Science is defined as something that can be reproduced in a new environment by someone else, achieving statistically the same results. Sometimes that reproduction is exact; at other times, there are minor variations. Overall, the ability to recreate the results is key for something to be considered **science**. With the importance of computational science to our modern process of scientific inquiry, ensuring it is **science** is a requirement, or our contributions are little more than, "I got this result and I trust it. Therefore, so should you." -This conference series was started hoping we could achieve a critical mass to foster community for sustainable computational science. The first two years have shown that this critical mass exists and is ready to grow. +This conference series was started hoping we could achieve a critical mass to foster a community for sustainable computational science. The first two years have shown that this critical mass exists and is ready to grow. Our latest effort is to encourage experience papers talking about lessons learned from reproducing other computational research. These experience papers can be from any conference or journal and will help justify the educational and research value in exploring existing work by offering a publication credit for writing up the results. 
As with both previous years, we plan to continue offering tutorials with a stronger emphasis on hands-on tutorials rather than lecture style. This will help people learn how to use new tools. -If you wish to participate, please visit https://acm-rep.github.io/2025/ to keep abreast of the conference as it comes organizes. +If you wish to participate, please visit https://acm-rep.github.io/2025/ to keep abreast of the conference as it is organized. Together, we can make better scientific software through better support for reproducibility and replicability. @@ -40,8 +40,7 @@ Together, we can make better scientific software through better support for repr ### Author bio -Jay Lofstead is a Principal Member of Technical Staff at Sandia National Laboratories. His research interests focus around large scale data management and trusting scientific computing. In particular, he works on storage, IO, metadata, workflows, reproducibility, software engineering, machine learning, and operating system-level support for any of these topics. Broadly across these topics, he is also deeply interested in ethics related to these topics and computing in general and how to drive inclusivity across the computation-related science domains. Dr. Lofstead received his Ph.D. in Computer Science from the Georgia Institute of Technology in 2010. - +Jay Lofstead is a Principal Member of Technical Staff at Sandia National Laboratories. His research interests focus on large-scale data management and trusting scientific computing. In particular, he works on storage, IO, metadata, workflows, reproducibility, software engineering, machine learning, and operating system-level support for any of these topics. Broadly across these topics, he is also deeply interested in ethics related to these topics and computing in general and how to drive inclusivity across the computation-related science domains. Dr. Lofstead received his Ph.D. in Computer Science from the Georgia Institute of Technology in 2010.