Job Crawler/Scraper/Parser #6

Open
austinoboyle opened this issue Apr 26, 2018 · 2 comments
austinoboyle commented Apr 26, 2018

Scrape jobs by various filters:

  • Location
  • Company
  • Etc

First Use Case: Scrape all jobs in Kingston

Relevant URL
https://www.linkedin.com/jobs/search/?keywords=&location=Kingston%2C%20Ontario%2C%20Canada&sortBy=DD

Process:

  1. Scrape Basic Info for All Jobs
  2. Based on Basic Scrape (job_id), run parallel scrape to get detailed info on all jobs
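The two-step process above could be sketched as follows. This is a minimal sketch, not the repo's implementation: `scrape_basic` and `scrape_detail` are hypothetical placeholders standing in for the actual page-fetch-and-parse logic, and the parallelism in step 2 uses a thread pool keyed on `job_id`.

```python
from concurrent.futures import ThreadPoolExecutor

def scrape_basic(search_url):
    """Placeholder: return basic records (including job_id) from the search page."""
    # A real implementation would fetch search_url and parse the result cards.
    return [{"job_id": "123", "title": "Example Job"},
            {"job_id": "456", "title": "Another Job"}]

def scrape_detail(job_id):
    """Placeholder: fetch /jobs/view/<job_id> and parse the detailed fields."""
    return {"job_id": job_id, "job_description": "..."}

def scrape_jobs(search_url, max_workers=8):
    # Step 1: basic info for all jobs on the search page.
    basic = scrape_basic(search_url)
    # Step 2: detailed scrapes in parallel, driven by the job_ids from step 1.
    # pool.map preserves input order, so zip pairs each basic record correctly.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        details = list(pool.map(scrape_detail, [j["job_id"] for j in basic]))
    return [{**b, **d} for b, d in zip(basic, details)]
```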

Basic Fields

  • title
  • job_id (links are '/jobs/view/ID')
  • location
  • company_name
  • company_id (links are '/company/ID')
  • company_image_link

Detailed Info

  • job_description
  • seniority_level
  • industries
  • employment_type
  • job_functions
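Since `job_id` and `company_id` only appear inside link paths (`/jobs/view/ID` and `/company/ID`), the basic scrape has to pull them out of hrefs. A small sketch of that extraction; the helper names are illustrative, not from the codebase:

```python
import re

# Link shapes described above: '/jobs/view/ID' and '/company/ID'.
JOB_LINK = re.compile(r"/jobs/view/(\d+)")
COMPANY_LINK = re.compile(r"/company/([^/?#]+)")

def parse_job_id(href):
    """Extract the numeric job_id from a '/jobs/view/ID' link, else None."""
    m = JOB_LINK.search(href)
    return m.group(1) if m else None

def parse_company_id(href):
    """Extract the company_id slug from a '/company/ID' link, else None."""
    m = COMPANY_LINK.search(href)
    return m.group(1) if m else None
```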
@austinoboyle austinoboyle added the enhancement New feature or request label Apr 26, 2018
@austinoboyle austinoboyle self-assigned this Apr 26, 2018
@simarpreetsingh-019

@austinoboyle I am working on a similar issue for my project. I have mostly figured out which class I should parse to extract the info, but I got stuck when trying to download the page source. My code:

```python
import requests
from bs4 import BeautifulSoup

r = requests.get('https://linkedin.com/jobs/')
html_content = r.content

print(html_content)

soup = BeautifulSoup(html_content, 'html.parser')
print(soup)
```

which prints only the following script instead of the page markup:

```html
<script type="text/javascript">
window.onload = function() {
  // Parse the tracking code from cookies.
  var trk = "bf";
  var trkInfo = "bf";
  var cookies = document.cookie.split("; ");
  for (var i = 0; i < cookies.length; ++i) {
    if ((cookies[i].indexOf("trkCode=") == 0) && (cookies[i].length > 8)) {
      trk = cookies[i].substring(8);
    } else if ((cookies[i].indexOf("trkInfo=") == 0) && (cookies[i].length > 8)) {
      trkInfo = cookies[i].substring(8);
    }
  }

  if (window.location.protocol == "http:") {
    // If "sl" cookie is set, redirect to https.
    for (var i = 0; i < cookies.length; ++i) {
      if ((cookies[i].indexOf("sl=") == 0) && (cookies[i].length > 3)) {
        window.location.href = "https:" + window.location.href.substring(window.location.protocol.length);
        return;
      }
    }
  }

  // Get the new domain. For international domains such as
  // fr.linkedin.com, we convert it to www.linkedin.com
  var domain = "www.linkedin.com";
  if (domain != location.host) {
    var subdomainIndex = location.host.indexOf(".linkedin");
    if (subdomainIndex != -1) {
      domain = "www" + location.host.substring(subdomainIndex);
    }
  }

  window.location.href = "https://" + domain + "/authwall?trk=" + trk + "&trkInfo=" + trkInfo +
    "&originalReferer=" + document.referrer.substr(0, 200) +
    "&sessionRedirect=" + encodeURIComponent(window.location.href);
}
</script>
```

Can you or anyone else help me figure out how to get the actual page source? It would be helpful for this issue as well.
I know this is an older issue, but I figured there was no point creating a new one when a similar issue already exists here. If needed, I can open a new one.
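For context on the output above: it is LinkedIn's authwall redirect. Unauthenticated requests never receive the job markup, only a script that bounces the browser to `/authwall`, so plain `requests` cannot see the listings; an authenticated session or a headless-browser approach is typically needed. A tiny guard (a hypothetical helper, just for illustration) to fail fast instead of parsing the wrong page:

```python
def looks_like_authwall(html: str) -> bool:
    """Return True if the HTML is LinkedIn's authwall redirect script
    rather than a real page. Hypothetical helper for illustration."""
    return "/authwall" in html
```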

@anilabhadatta

I opened a pull request that adds Jobs and People to CompanyScraper.
If possible, please test it on a temporary LinkedIn account.
