Extracting and monitoring web content with PowerShell #39

utterances-bot opened this issue Jun 14, 2021 · 7 comments

@utterances-bot

Extracting and monitoring web content with PowerShell

FoxDeploy.com, Stephen Owen's technical blog about
PowerShell, Systems Administration, GUI Design and Programming.

https://www.foxdeploy.com/blog/extracting-and-monitoring-web-content-with-powershell.html

Copy link

Hey Stephen,

Great article! I found it useful, and it is similar to something I plan to do. Since I don't have experience in this area, I'd like to ask:
Is there a way to write code like the PowerShell script above that applies a CSS rule?

For example, I want to hide a section of a specific website by applying the CSS rule "display: none;" to the element with a particular class or id.

Thank you and I look forward to hearing from you!

1RedOne (Owner) commented Jun 14, 2021

Happy you liked the blog post! Can you help me understand the full requirement?

Do you control the web page? If you do, you should control which elements appear by altering the CSS on the site, or by using JavaScript to determine when to hide or show an element.

If you don't, tell me what the script would need to do.

Copy link

Hello Stephen,

I apologize for the delay!

No, I don't have control over it. It can be any website on the internet.
My idea is this: I want to customize sections of different websites the way Stylebot (https://chrome.google.com/webstore/detail/stylebot/oiaejidbmkiecgbjeifoejpgmdaleoha?hl=ro)
or ad-blocking plugins do for Chrome. I want to write code similar to what you created in this article, so that whenever I start the system (Windows), the script starts automatically and applies the CSS customizations, without my having to install third-party plugins like Stylebot.

Thank you!

1RedOne (Owner) commented Jun 17, 2021

You could do this by instantiating a WebKit or Internet Explorer object and then manipulating the DOM (Document Object Model, the parsed view of the web page's HTML), but it would be a big ask and not really a good use for PowerShell.

It would be much better, IMHO, to do this as a web browser extension.

Copy link

OK Stephen, thank you for your help! If you know of any references or tutorials that use the method you described above to write such code, that would be great! Thanks!

Copy link

In PowerShell 7, the Invoke-WebRequest cmdlet no longer returns the ParsedHtml property (which is a shame, because I have to parse a web page exactly as you demonstrate).

Is there a way to do the same thing in PowerShell 7?

For example, I'm looking to find the last-modified date shown on https://www.virtualbox.org/ticket/20536.
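
One possible direction, sketched under assumptions: in PowerShell 7 the response still exposes the raw HTML through its Content property, so a value can be pulled out with a regular expression. The helper name Get-FirstMatch, the sample markup, and the pattern below are all hypothetical; the real markup of the ticket page has not been checked here, and a regex is a fragile substitute for a proper HTML parser.

```powershell
# Hypothetical helper (not part of PowerShell): returns the first capture
# group of $Pattern found in $Html, or $null when nothing matches.
function Get-FirstMatch {
    param(
        [Parameter(Mandatory)][string]$Html,
        [Parameter(Mandatory)][string]$Pattern
    )
    # Singleline lets '.' span line breaks, which is common in HTML source.
    $options = [System.Text.RegularExpressions.RegexOptions]'IgnoreCase, Singleline'
    $match = [regex]::Match($Html, $Pattern, $options)
    if ($match.Success) { $match.Groups[1].Value.Trim() } else { $null }
}

# Invented sample markup, used only to show the call shape.
$sample = '<p>Last modified <a title="See timeline">3 years ago</a></p>'
Get-FirstMatch -Html $sample -Pattern 'Last modified\s*<a[^>]*>(.*?)</a>'
```

Against a live page the call would look something like `Get-FirstMatch -Html (Invoke-WebRequest -Uri $url).Content -Pattern $pattern`, with the pattern adjusted to whatever the page source actually contains.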


rosamund commented Oct 4, 2023

Hi Stephen,
I read your post with interest and need your help with my work.
I want to extract the data contained in the "impressum" (legal notice) page of my websites.
For example, the URL "https://kathrein.at/impressum".
I right-clicked and chose Inspect, and the data I want (company name, address, telephone, and email) is contained in div class="content-text".

I entered the following code but it doesn't work:
Invoke-WebRequest -UseBasicParsing https://kathrein.at/impressum $rep.ParsedHtml.body.getElementsByClassName('content-text')| select -expand innertext

Can you help me correct this script, and tell me how to get the same data for a list of URLs in a loop?

Thank you
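
A possible sketch of what the question asks for, under stated assumptions. Two things stand out in the command above: the result of Invoke-WebRequest is never assigned to $rep, and -UseBasicParsing skips DOM parsing, so ParsedHtml is not populated. The version below avoids ParsedHtml and works on the raw Content string instead. The function names Get-TextByClass and Get-ImpressumText are hypothetical, and the regex assumes the target div contains no nested div elements, which has not been verified for the site mentioned.

```powershell
# Hypothetical helper: returns the visible text of every <div> whose class
# attribute contains $ClassName. Assumes no nested <div> inside the target.
function Get-TextByClass {
    param(
        [Parameter(Mandatory)][AllowEmptyString()][string]$Html,
        [Parameter(Mandatory)][string]$ClassName
    )
    $name = [regex]::Escape($ClassName)
    $pattern = '<div[^>]*\bclass="[^"]*(?<![\w-])' + $name + '(?![\w-])[^"]*"[^>]*>(.*?)</div>'
    $options = [System.Text.RegularExpressions.RegexOptions]'IgnoreCase, Singleline'
    foreach ($m in [regex]::Matches($Html, $pattern, $options)) {
        $text = $m.Groups[1].Value -replace '<[^>]+>', ' '   # drop inner tags
        $text = [System.Net.WebUtility]::HtmlDecode($text)   # &amp; -> &
        ($text -replace '\s+', ' ').Trim()                   # collapse whitespace
    }
}

# Hypothetical wrapper: fetches each URL and emits one object per page.
function Get-ImpressumText {
    param([Parameter(Mandatory)][string[]]$Url)
    foreach ($u in $Url) {
        $response = Invoke-WebRequest -Uri $u -UseBasicParsing
        [pscustomobject]@{
            Url  = $u
            Text = @(Get-TextByClass -Html $response.Content -ClassName 'content-text') -join "`n"
        }
    }
}
```

A call such as `Get-ImpressumText -Url 'https://kathrein.at/impressum', $otherUrl` would then return one Url/Text pair per page, where $otherUrl stands for any further address in the list. For pages with nested markup, a real HTML parser would be a more robust choice than the regex used here.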
