What is CV/Resume Parsing?
Resume Parsing (also known as CV Parsing, Resume Extraction or CV Extraction) is the conversion of a free-form CV/Resume document into structured information suitable for storage, reporting and manipulation by a computer.
Recruitment agencies usually work with CV parsing software to automate the storage and analysis of CV/Resume data. This saves recruiters hours of manual work, because they no longer have to process by hand each job application and CV that they receive.
The most common CV/Resume format is MS Word. Although such documents are easy for humans to read and understand, to a computer they are just a long sequence of letters, numbers and punctuation. A CV parser is a program that analyses a document and extracts from it the elements of what the writer actually meant to say, which in the case of a CV are usually the skills, work experience, education, contact details and so on.
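As a rough illustration of what "structured information" can mean here, the following minimal Python sketch pulls contact details out of plain CV text. The `ParsedCV` class, the `extract_contacts` function and both regular expressions are hypothetical and deliberately simplistic. They are assumptions made for this example, not how any particular parsing product works.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical target structure; a real parser would also hold skills,
# work experience, education and so on.
@dataclass
class ParsedCV:
    emails: List[str] = field(default_factory=list)
    phones: List[str] = field(default_factory=list)

# Simplistic patterns, assumed for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def extract_contacts(text: str) -> ParsedCV:
    """Turn free-form CV text into a small structured record."""
    return ParsedCV(
        emails=EMAIL_RE.findall(text),
        phones=[p.strip() for p in PHONE_RE.findall(text)],
    )
```

Contact details are the easy case because email addresses and phone numbers have a fairly regular shape. Skills and work experience do not, which is where the difficulties described next come in.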
This is a surprisingly difficult task for a computer to do because:
- Language is infinitely varied. There are hundreds of ways to write down a date, for example, and countless millions of ways to write what you did in your last job. A resume parsing tool has to capture all these different ways of writing the same thing through complex rules and statistical algorithms.
- Language is ambiguous: the same word or phrase can mean different things in different contexts. For example:
  - "Director" can be a job title in some contexts, or a software package in others.
  - A 4-digit number can be part of a telephone number, a Swiss postal code, a year or the version of a software package.
  - The term "Project Manager" may indicate that the writer was indeed a project manager, but it means something quite different in a sentence like "I used to report to the Project Manager".
  - "Merrill Lynch" may be someone's name, but is more likely to refer to a company.
The only way CV parsing software can resolve these ambiguities is by "understanding" the context in which the words are used. A good CV parser therefore has to be "intelligent".
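The 4-digit number example can be sketched in a few lines of Python. This is only a toy: the rule list, the labels and the function name `classify_four_digit_numbers` are assumptions made for illustration, and a real parser would combine far more rules with statistical models. The idea is simply that the label depends on the text immediately before the number, not on the number itself.

```python
import re

# Hypothetical context rules, checked in order against the text that
# immediately precedes the number.
CONTEXT_RULES = [
    ("software version", re.compile(r"\b(?:version|release)\s+$", re.I)),
    ("phone number",
     re.compile(r"\b(?:tel|phone|mobile)\b[\s.:]*(?:\d+[\s-]+)*$", re.I)),
    ("Swiss postal code", re.compile(r"\bCH-$")),
]

# A run of exactly four digits that is not part of a longer number.
FOUR_DIGITS = re.compile(r"(?<!\d)\d{4}(?!\d)")

def classify_four_digit_numbers(text):
    """Return (number, label) pairs for each standalone 4-digit number."""
    results = []
    for match in FOUR_DIGITS.finditer(text):
        before = text[:match.start()]
        # Fallback guess when no context rule fires.
        label = "year" if 1900 <= int(match.group()) <= 2100 else "unknown"
        for name, pattern in CONTEXT_RULES:
            if pattern.search(before):
                label = name
                break
        results.append((match.group(), label))
    return results
```

For instance, `classify_four_digit_numbers("Used AutoCAD version 2010 since 2012.")` returns `[("2010", "software version"), ("2012", "year")]`: two numbers that look alike get different labels purely because of the words that precede them.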