Lead Data Scientist - Bioinformatics
The Bioinformatics Group within the Data Science Organization at Spark Therapeutics is seeking an engaged and passionate Lead Data Scientist with a focus on bioinformatics and computational biology to participate in and support projects involving omics and other high-dimensional data across the Technology & Research Organizations. The successful candidate will be responsible for:
- Analysis and interpretation of high-dimensional data, including genomic, transcriptomic, and proteomic datasets
- Engaging in cross-functional discussions, providing conceptual input on experimental and study design, and serving as a subject matter expert in bioinformatics
- Developing custom bioinformatics tools and machine learning models
- Building and validating bioinformatics pipelines using a combination of open source and in-house developed tools
- Summarizing, visualizing, and presenting analyses and findings to key stakeholders
- Supporting evaluation and writing of study reports, scientific presentations, and SOPs
- Working with the rest of the Bioinformatics Group and Data Science Organization to build infrastructure (e.g., data capture and analysis software, SOPs)
% of Time
Job Function and Description
Bioinformatics analysis, interpretation, and communication of genomics data to support various technology development projects
Develop in-house computational tools and machine learning models to support analysis of high-throughput datasets
Generate technical reports, prepare presentation slides, and develop novel concepts
Trainings, lab meetings and administration work
Education and Experience Requirements
- Ph.D. in Computer Science, Computational Biology, Bioinformatics, or a related discipline
- Minimum 5 years of post-graduate experience in genomics and bioinformatics
- Extensive experience in next-generation sequencing (NGS) analysis of DNA-seq and RNA-seq data from short-read and long-read sequencing technologies
- Demonstrable track record in the core competency areas: high-dimensional data analysis, applying machine learning (ML) to biological datasets, and data visualization
- Proficiency in programming using one or more common data science languages such as Python, R, SPARQL, SQL
- Extensive experience using bioinformatics workflow technologies such as WDL, CWL, Cromwell, Docker
- Familiarity with commonly used bioinformatics tools such as BWA, Samtools, BLAST, and the GATK suite
- Strong familiarity with core concepts in molecular biology and related lab technologies
- Familiarity with ML libraries such as scikit-learn, TensorFlow, and Keras
- Track record of following best practices of coding, version control (Git), code documentation, and reproducible research
- Proven ability to work both independently and in a collaborative group setting
- Self-motivated to learn and develop new methodologies, manage multiple analysis pipelines simultaneously, keep accurate records, follow instructions, and comply with company policies
- Excellent communication skills (both oral and written)
- Experience with AAV vectors and gene therapy is preferred, but not required
- Expert ability to critically analyze problems, develop potential solutions, and evaluate impact to Spark
- Demonstrated ability to independently provide effective oversight and management of external collaborations and vendors
- Ability to provide coaching and guidance to more junior team members on technical issues
Please be aware that Spark mandates COVID-19 vaccination of all employees regardless of work location. Accommodations may be made in accordance with applicable law.
Nearest Major Market: Philadelphia
Join the Spark Team
We were born of innovation, springing from the curiosity, imagination and dedication of remarkable scientists and healthcare visionaries. Our shared mission is to challenge the inevitability of genetic disease by discovering, developing, and delivering treatments in ways unimaginable - until now.
We don't follow footsteps. We create the path.