Our center invites applications for a Data Engineer to lead the curation and management of research data in our department.
The Data Engineer will interact closely with our clinical researchers and support them in various research projects that require aggregation of large data sets from a range of clinical information systems within our institution, including electronic health records, radiologic image archives, and pathology reports. The successful candidate will work with our team of software and machine-learning engineers to develop tools and pipelines for automatic data processing and ensure high availability of such systems.
We are looking for a highly motivated, reliable, and self-driven professional with experience in big-data applications and a passion for working in a dynamic and collaborative research environment focused on improving patients’ lives.
Responsibilities
- Work closely with stakeholders, including clinical researchers and AI scientists, to design and implement data aggregation strategies for a variety of research projects.
- Operate data collection tools for ingestion and fusion of clinical information from a range of data sources within the institution (e.g. electronic health records, radiology picture archive, pathology reports, demographic data).
- Develop and maintain scalable tools and pipelines for automatic data processing tasks and streamline data delivery to the stakeholders.
- Perform basic data analysis to obtain actionable insights into the nature and quality of our data sets.
- Act as departmental point person for questions regarding aggregation of research data.
- Monitor data-collection systems and processes.
- Conduct quality assurance of collected data and work within regulatory constraints.
- Assist with drafting study descriptions for the institutional review board (IRB).
Qualifications
- Bachelor’s degree in computer science or equivalent.
- Excellent knowledge of data structures, common file formats, and storage architectures, as well as experience with large-scale data curation and preprocessing tasks.
- Experience with databases, in particular relational SQL databases, and in-depth knowledge of SQL, including dialects such as Impala, Postgres, and Microsoft SQL Server.
- Experience with Apache Hadoop, Impala, and Hue.
- Strong knowledge of the Linux operating system (Ubuntu, Debian, Red Hat).
- Experience with development of Bash scripts and task automation.
- Experience with development of basic Python programs, including use of libraries for data preprocessing such as pickle, NumPy, SciPy, and SQLAlchemy.
- Experience with business intelligence software, such as Tableau, Qlik Sense, or Redash, and experience with creating reports, dashboards, and data visualizations.
- Excellent communication skills and goal-driven work attitude.
- Prior involvement in the management of research data.
- Experience working with electronic health record (EHR) software (preferably Epic) and PACS software.
- Experience with service monitoring tools, such as Grafana, Graphite, and Prometheus.
- Technical understanding of the DICOM standard (file format and communication).
- Prior exposure to or domain knowledge in medical imaging.
- Ability to develop basic web applications using frameworks such as Flask or Django.
The Center for Advanced Imaging Innovation and Research (CAI2R, pronounced “care”) is a National Center for Biomedical Imaging and Bioengineering supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) and operated by the Department of Radiology at NYU Langone Health.
CAI2R comprises approximately 150 full-time personnel dedicated to imaging research, development, and clinical translation. Our team is diverse and highly collaborative. We work in interdisciplinary groups that include engineers, scientists, clinicians, technologists, and industry experts.
Joining us means becoming part of a diverse community that values cross-pollination of ideas, celebrates creativity, and nurtures an environment conducive to breakthrough innovations.
Timeline, Salary, and Benefits
The anticipated start date for this position is by the end of 2022. The initial appointment is for two years, with the intention to renew by mutual agreement. NYU Langone Health offers a competitive salary and benefits package. We welcome both domestic and international applicants.
We are committed to diversity and inclusion in all aspects of recruiting and employment. All qualified individuals are encouraged to apply and will receive consideration without regard to race, color, gender, gender identity or expression, sexual orientation, national origin, age, religion, creed, or disability.
How to Apply
Email your CV, a list or description of recent ML projects, a cover letter describing your interest in the position, and relevant transcripts (unofficial transcripts are fine) to Sara.Thermer@nyulangone.org. Include “Data Engineer” in the subject line.