In the past decade, neuroscience datasets and scientific collaborations have both grown dramatically in size. With the advent of cutting-edge technologies such as Neuropixels probes, large-field-of-view microscopes, multi-omics and advanced video tracking, the amount of data acquired over the course of a project now typically reaches terabytes. These data need to be processed, distributed to multiple laboratories for analysis and shared with the wider community. Against this backdrop, a new type of position has emerged in the field of neuroscience: research software engineers (RSEs), professionals who are trained to develop data infrastructure and processing pipelines.
As RSEs ourselves, leading a large-scale scientific collaboration of 22 laboratories, the International Brain Laboratory, we have witnessed how tailored engineering support can vastly improve the quality and impact of research. Our flagship project, a brain-wide map of neural activity, would not have been possible without our core engineers, who created a common infrastructure to gather, process and redistribute thousands of experimental datasets across dozens of geographically distributed laboratories. Standardization, visualization and professional testing of pipelines were a necessity for an effort at this scale. These solid foundations also accelerated progress on more than 30 smaller-scale projects, as scientists could swiftly reuse methods and dive into our large, well-curated and well-documented datasets. These shared methodologies were especially important for scientists confirming discoveries across different institutions, such as in a study of the neural representations of prior information.
Given the essential role of RSEs, their numbers in neuroscience are growing; Princeton University, Janelia Research Campus and University College London, for example, have launched dedicated software cores. But adopting this approach on a wider scale requires more dedicated funding for these positions, as well as structural changes to how they fit into the broader academic landscape. To be most effective, we believe RSEs should be embedded in institutions rather than in individual labs. RSE appointments also need to be longer-term, rather than tied to the lifespan of specific projects.
RSEs hold a broad set of skills and help support all phases of a scientific project. They manage datasets and develop and operate processing algorithms, such as data compression or spike sorting. They develop new analytical methods, produce or review code that supports journal publications, and help enforce quality control. On the hardware side, engineers assemble components, implement control software and generate documentation that lets users build and debug their systems. They also act as project managers, surveying, prioritizing and translating scientists’ needs, ultimately helping to foster collaborative work. They are often a driving force behind open-science efforts, disseminating tools, methods and datasets to the community, documenting and packaging code, developing courses, delivering lectures and interacting with the researchers who want to use shared data.
RSEs offer clear benefits to the scientific enterprise, but several factors stymie broader adoption of this type of support. For one, the role lacks a well-defined career path within academia. Hemmed in by existing human resources categories, senior RSEs with many years of experience are often appointed as postdoctoral researchers or staff collaborators and do not get the same salary benefits as their equivalent scientific peers in tenure-track positions. It has been challenging to explain to institutional human resources departments that a valuable RSE without a doctorate should be paid more than a master’s graduate, or that an engineer with 15 years of experience should receive benefits comparable to those of a faculty member. We need to create pathways for RSEs to grow; this effort will both help retain talented workers and enable them to apply their knowledge to multiple projects.
At many universities, RSEs must be hired under principal investigators (PIs), the main recipients of grants. This rule means that RSEs are dependent on their home laboratory's budget, in terms of both salary duration and level. That makes it difficult for RSEs to apply their expertise from one project to another. And an engineer formally supervised by a PI cannot hire or manage junior RSE team members. This situation is even more challenging for collaborations among multiple institutions. The lead RSE at one institution has to manage a remote team over which they have no direct authority. If an engineer receives mixed signals regarding work priorities, coming both from the home PI and the lead RSE (a classic case of matrix management), it is difficult to know whose direction to follow.