About the job
Are you interested in working with data engineers and analysts to solve problems? Are you interested in bringing and building up your NLP and (gen) AI expertise to projects?
About Our Team
In RDP, we describe the world of scientific research as accurately as possible by profiling a wide range of entities, from People and Organizations to research artefacts such as Publications and Grants, all the way to the topics and concepts that characterize what research is about. Our entity-linking and disambiguation systems integrate data from a diverse set of sources, including millions of research articles, to deliver market-leading performance. The RDP team is large, full of talented people, and highly cross-functional and collaborative: as a data scientist, you will work together with software engineers, analysts, product managers, and architects on selected areas of focus, and be part of the wider RDP Data Science team.
About The Role
As a Data Scientist for the Sources domain, your focus will be on improving the quality of our links between sources (e.g. scientific journals) and works (e.g. articles in scientific journals), as well as on source entity extraction. You will work closely with our engineers, architects, product managers and other data scientists within the Sources domain to continually improve Sources systems, and your base data will be millions of scientific publications. With data quality being vital to Elsevier's success, there is a strong focus on measuring and improving data quality and coverage. At the same time, the increasingly large volumes of data from increasingly diverse sources require you to maintain and develop systems and models that scale well. We tackle some of the hardest data science challenges to solve pressing customer needs, and we are looking for a talented, enthusiastic, and collaborative colleague to join our ranks.
Responsibilities:
- You are passionate about the possibilities of creating value for customers by interconnecting data and improving the quality and richness of data at scale.
- You like to work with others as part of a multidisciplinary team to achieve a common goal, and you are eager to contribute and put your ideas across.
- You are willing to learn new things and come up with practical solutions to technical challenges.
- You have an interest in the data you're working with. Our data is largely text-based.
Requirements:
- Hands-on experience developing (components of) large-scale data science solutions.
- Preferably, a formal qualification in computer science, computational linguistics, data science, machine learning, or other quantitative disciplines.
- Demonstrable coding skills in Python.
- Knowledge of applying machine learning techniques.
- Experience working with data in files (e.g. CSV or Parquet).
- An understanding of statistics for the purpose of quantitative analysis.
- Experience with Databricks and/or Spark or Pandas.
- Familiarity with tools such as Git or Jira.
- Familiarity with agile methodologies.
- Experience with NLP.
- Experience with databases.
Requirements
- Data Science
- NLP
- Python
- Machine Learning
Qualifications
- Formal qualification in computer science, computational linguistics, data science, or related field
Preferred Technologies
- Data Science
- NLP
- Python
- Machine Learning
Benefits
- Health insurance for you and your family
- Flexible working arrangements
- Employee assistance programs
About the company
Elsevier is a global leader in information and analytics, helping researchers and healthcare professionals advance science and improve health outcomes. Building on our publishing heritage, we combine quality information and vast data sets with analytics to support visionary science, research, health education, interactive learning, and exceptional healthcare and clinical practice.