Data Scientist

Second Genome

Brisbane, CA, US
  • Job Type: Full-Time
  • Function: Research Sci/Assoc/Mgr
  • Industry: Healthcare
  • Post Date: 01/11/2022
  • Company Address: 1700 Owens Street, 5th Floor, San Francisco, CA, 94158

About Second Genome

Second Genome is at the cutting edge of microbiome science, translating breakthrough research into medicines and other novel products that help humanity.

Our scientists and staff are true pioneers in this dynamic field. We are advancing a pipeline of novel therapies for serious diseases, while also making new breakthroughs in multiple fields of research using our one-of-a-kind discovery platform. With our impact felt across industries, from healthcare and nutrition to agriculture, we are recognized as global leaders in translating rapidly emerging science into products that help humanity.

Job Description

We are a fast-paced, venture-backed biotechnology company developing breakthrough therapeutics through innovative microbiome science, and we are looking for a Data Scientist to develop and apply our machine learning capabilities to critical problems in microbiome science and human health.


You are not intimidated by difficult machine learning problems with high-dimensional data. You are an independent thinker and you love sharing your knowledge with others. You have a passion for learning and for advancing human health through data-driven therapeutic discovery.


How You’ll Impact the Company:

  • Expand Second Genome’s data science and machine learning capabilities through research and implementation of state-of-the-art methods to solve our domain-specific problems.
  • Design, implement, and execute small- to large-scale data science projects in collaboration with other Second Genome program, project, and function leads, in support of data-driven decisions and/or to fulfill criteria defined in partnership agreements or other externally funded research.
  • Develop and maintain Second Genome’s cloud-configurable machine learning pipeline and operations according to SDLC and MLOps best practices.
  • Collaborate with Platform Product and Engineering functions to define the requirements needed for timely and successful execution of data science deliverables.
  • Collaborate with Product and Omics functions when evaluating new platform capabilities or modifications thereof, including new and alternative software packages and parameterizations.
  • Participate in reciprocal code reviews with other Informatics functions.
  • Contribute to recognition of Second Genome as an industry leader in microbiome data science through conference presentations, patent applications, peer-reviewed publications, and other external communications.


What You Bring to the Role:

  • You have, or are actively pursuing, an advanced degree in computer science, computer engineering, electrical engineering, or equivalent.
  • You have demonstrated experience in developing, implementing, and communicating machine learning algorithms or high-dimensional data analyses.
  • You have experience working with small-n, large-p datasets and managing false discovery.
  • You have hands-on experience with model interpretation and with visualizing feature importances and interactions.
  • You are comfortable reading and presenting technical papers and conference proceedings in machine learning.
  • You are proficient in Python, Python’s scikit-learn library, and Jupyter notebooks. Ideally you have exposure to MLflow, Metaflow, or other experiment tracking systems.
  • You are familiar with SDLC and source code revision control such as Git, SVN, or others. Familiarity with AWS or other cloud service providers is a plus.
