Web scraping is the process of gathering data from the internet, usually with automated bots or web crawlers. Emsi’s job postings are scraped from company websites and job boards. Job boards work directly with employers, who pay the job board to list their postings. Company websites and job boards are considered primary sources of postings data: the information in each posting comes from the original source of the job vacancy.
Job postings are not collected from aggregator sites for several reasons. First, aggregators are not primary sources of posting data. Aggregators behave more like internet search engines: they scrape the web, find job postings, and re-post them. Second, postings collected from aggregators generally contain less detail and add no information that was not available from the original source. Finally, aggregators are not incentivized to keep their postings up to date. Because companies pay job boards to post their vacancies, a posting on a job board is much less likely to remain available after the vacancy is filled. Including only the most up-to-date sources allows Emsi to provide more accurate data on the expiration dates of job postings.
Users often ask about the absence of postings from Indeed and LinkedIn in Emsi’s job postings. Both sources have asked that their sites not be scraped for job postings; therefore Emsi does not collect or display postings from either source.
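The source-selection rules described above can be illustrated with a short sketch. This is not Emsi's actual implementation. The category labels, the `Source` class, the `is_collectable` function, the example domains, and the way the opt-out list is stored are all hypothetical names chosen for illustration. The sketch only shows the idea of keeping primary sources and dropping aggregators and sites that have asked not to be scraped.

```python
from dataclasses import dataclass

# Hypothetical category labels for primary sources (assumption, not Emsi's schema).
PRIMARY_TYPES = {"company_site", "job_board"}

# Hypothetical opt-out list: sites that have asked not to be scraped.
OPTED_OUT = {"indeed.com", "linkedin.com"}


@dataclass(frozen=True)
class Source:
    domain: str
    source_type: str  # "company_site", "job_board", or "aggregator"


def is_collectable(source: Source) -> bool:
    """Return True only for primary sources that have not opted out."""
    return source.source_type in PRIMARY_TYPES and source.domain not in OPTED_OUT


candidates = [
    Source("careers.example.com", "company_site"),
    Source("jobboard.example.org", "job_board"),
    Source("aggregator.example.net", "aggregator"),
    # The type label here is an assumption; the site is excluded by the opt-out list.
    Source("linkedin.com", "job_board"),
]

collectable = [s.domain for s in candidates if is_collectable(s)]
print(collectable)
```

In this sketch the aggregator is dropped because it is not a primary source, and the opted-out site is dropped regardless of how it is categorized.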