Language Data Engineer (RE1)

Closing Date

Tuesday, 31 March, 2020
Reference: 59_20_LS_TM_RE1
Job title: Language Data Engineer (RE1)

About BSC

The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, and is a hosting member of the PRACE European distributed supercomputing infrastructure. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D into both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 650 staff from 49 countries.

Context And Mission

We are looking for candidates with a background in computational linguistics who will participate in the MT4All project as a data manager and curator and in related activities within the Text Mining Unit.

Key Duties

  • Search the web for language data as required by the projects carried out in the Unit.
  • Clean, preprocess and prepare data. Build language corpora usable by the Unit’s tools, specifically neural architectures.
  • Automatically annotate data using state-of-the-art language processing tools.
  • Manage corpora and language data according to the requirements specified in the Unit’s data management plan.
  • Monitor the application of data protection, licensing and security rules.
  • Support the researchers of the Unit with their data needs.
  • Write technical reports and project documentation in English, Spanish and Catalan.

Requirements

  • Education
    • Degree in Applied Linguistics, Computer Science or related disciplines.
  • Essential Knowledge and Professional Experience
    • Around 3 years of experience in the NLP and/or MT fields.
    • Excellent understanding of data administration and management functions (transfer, storage, analysis, distribution, exploration, etc.).
    • Proven experience in working with large datasets and distributed file systems: SQL, databases and metadata management.
    • Proven experience in UNIX/Linux environments, scripting languages and Python.
  • Competences
    • Fluent written and spoken English, Spanish and Catalan.
    • Ability to work to set deadlines.
    • Ability to work independently and in a team to complete tasks on schedule.

Conditions

  • The position will be located at BSC within the Life Sciences Department
  • We offer a full-time contract, a highly stimulating working environment with state-of-the-art infrastructure, flexible working hours, an extensive training plan, restaurant tickets, private health insurance and full support with relocation procedures
  • Duration: Temporary - 1 year, renewable
  • Salary: we offer a competitive salary commensurate with the qualifications and experience of the candidate and according to the cost of living in Barcelona
  • Starting date: as soon as possible

Application Procedure

All applications must include:

  • A cover letter in English with a statement of interest, including two contacts for further references. Applications without this document will not be considered.

  • A full CV in English, including contact details.


The vacancy will remain open until a suitable candidate has been hired. Applications will be reviewed regularly and potential candidates will be contacted.

Diversity and Equal Opportunity Employment

BSC-CNS is an equal opportunity employer committed to diversity and inclusion. We are pleased to consider all qualified applicants for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or any other basis protected by applicable state or local law.

Application Form

Please upload your CV using the following name structure: Name_Surname_CV
Files must be less than 3 MB.
Allowed file types: txt rtf pdf doc docx.
Please upload your cover letter using the following name structure: Name_Surname_CoverLetter
Files must be less than 3 MB.
Allowed file types: txt rtf pdf doc docx zip.
Please upload any other document using the following name structure: Name_Surname_OtherDocument
Files must be less than 10 MB.
Allowed file types: txt rtf pdf doc docx rar tar zip.
Please note that the information provided in relation to gender and nationality will be used solely for statistical purposes.