Sr Data Scientist
About the job
Roles & Responsibilities:
• Lead conversations with business collaborators to elucidate semantic models of pharmaceutical business concepts, aligned definitions, and relationships. Negotiate and debate across collaborators to drive alignment and create system-independent information models, taking a data-centric approach aligned with business data domains.
• Develop comprehensive business information models and ontologies that capture industry-specific concepts, including CMC, Clinical, and Operations data.
• Facilitate whiteboarding sessions with business subject matter experts to elicit knowledge, drive interoperability across pharmaceutical domains, and interface between data producers and consumers.
• Educate peers on the practical use and differentiating value of Linked Data and FAIR+ data principles. Champion standards for master data and reference data.
• Formalize data models in RDF as OWL and SHACL ontologies that interoperate with each other and with relevant industry standards such as FHIR and IDMP for healthcare data exchange.
• Build a broad semantic knowledge graph that threads data together across end-to-end business processes and enables the transformation to data-centricity and new ways of working.
• Apply pragmatic semantic abstraction to simplify diverse pharmaceutical and healthcare data patterns effectively.

Basic Qualifications:
• Doctorate degree, OR
• Master's degree and 4 to 6 years of Data Science experience, OR
• Bachelor's degree and 6 to 8 years of Data Science experience, OR
• Diploma and 10 to 12 years of Data Science experience

Preferred Qualifications:

About the role
You will play a key role in a regulatory submission content automation initiative that will modernize and digitize the regulatory submission process, positioning Amgen as a leader in regulatory innovation. The initiative uses state-of-the-art technologies, including Generative AI, Structured Content Management, and integrated data, to automate the creation, review, and approval of regulatory content.

Role Description:
The Sr Data Scientist is responsible for developing interconnected business information models and ontologies that capture the real-world meaning of data by studying the business, our data, and the industry. With a focus on pharmaceutical industry-specific data, including Clinical, Operations, and Chemistry, Manufacturing, and Controls (CMC), this role involves creating robust semantic models based on data-centric principles to realize a connected data ecosystem that empowers consumers. The Information Modeler drives seamless cross-functional data interoperability, enables efficient decision-making, and supports digital transformation in pharmaceutical operations.

Functional Skills:

Must-Have Skills:
• Proven ability to lead and develop successful teams.
• Strong problem-solving, analytical, and critical thinking skills to address complex data challenges.
• Deep understanding of pharmaceutical industry data, including CMC, Process Development, Manufacturing, Engineering Quality, Supply Chain, and Operations.
• Advanced skills in semantic modeling, RDF, OWL, SHACL, and ontology development in TopBraid and/or Protégé.
• Demonstrated experience creating knowledge graphs with semantic RDF technologies (e.g., Stardog, AllegroGraph, GraphDB, Neptune) and testing models with real data.
• Highly proficient with RDF, SPARQL, Linked Data concepts, and interacting with triple stores.
• Highly proficient at facilitating, capturing, and organizing collaborative discussions through tools such as Miro, Lucidspark, Lucidchart, and Confluence.
• Expertise in FAIR data principles and their application in healthcare and pharmaceutical data models.

Good-to-Have Skills:
• Experience in regulatory data modeling and compliance requirements in the pharmaceutical domain.
• Familiarity with pharmaceutical lifecycle data (PLM), including product development and regulatory submissions.
• Knowledge of supply chain and operations data modeling in the pharmaceutical industry.
• Proficiency in integrating data from various sources, such as LIMS, EDC systems, and MES.
• Hands-on data analysis and wrangling experience, including SQL-based data transformation and solving integration challenges arising from differences in data structure, meaning, or terminology.
• Expertise in FHIR data standards and their application in healthcare and pharmaceutical data models.

Soft Skills:
• Exceptional interpersonal, business analysis, facilitation, and communication skills.
• Ability to translate complex regulatory and operational requirements into data models.
• Analytical thinking for problem-solving in a highly regulated environment.
• Adaptability to manage and prioritize multiple projects in a dynamic setting.
• Strong appreciation for customer- and user-centric product design thinking.
Requirements
- Semantic Modeling
- Pharmaceutical Data
- RDF
- OWL
- SHACL
- Generative AI
Qualifications
- Doctorate
- Master's
- Bachelor's
Preferred Technologies
- Semantic Modeling
- Pharmaceutical Data
- RDF
- OWL
- SHACL
- Generative AI