Feat/website screenshotting #18

Open
wants to merge 2 commits into
base: master

supercrafter100
Member

This PR takes screenshots of websites linked in a message and runs them through OCR. If people link their website to show that it is "broken", the bot can scan the screenshot automatically and run checks on it.

Technically we could run the checks on the HTML itself (which would definitely be easier), but this gives a better look at the error. Keyword matching could trigger falsely much more easily on a giant HTML document, whereas this only looks at the text that is actually visible on the website.
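
To illustrate the false-trigger concern, here is a minimal sketch in Python (not the bot's actual code or language). The phrase list, the function name `find_error_phrases`, and the sample strings are all hypothetical; the PR does not list the keywords it matches on.

```python
import re

# Hypothetical phrases, chosen only for this illustration.
ERROR_PHRASES = [
    "fatal error",
    "database connection failed",
    "500 internal server error",
]


def find_error_phrases(text: str, phrases=ERROR_PHRASES) -> list[str]:
    """Return the phrases that occur in `text`, ignoring case and extra whitespace."""
    # Collapse runs of whitespace so phrases split across lines still match.
    normalized = re.sub(r"\s+", " ", text).lower()
    return [
        phrase
        for phrase in phrases
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized)
    ]


# Raw HTML can contain error strings that are never shown to the visitor,
# for example inside a script that only runs when something fails.
html_source = '<script>function onFail(){ show("Fatal error"); }</script><h1>Welcome</h1>'
# What OCR on a screenshot of the same page would plausibly return.
visible_text = "Welcome"

print(find_error_phrases(html_source))   # ['fatal error']  (false trigger)
print(find_error_phrases(visible_text))  # []
```

Matching on the visible text avoids this class of false positive, at the cost of depending on OCR quality.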

@Derkades
Member

What about security? For example, will this execute arbitrary JavaScript? That would allow some form of SSRF.

Other than that, it is a pretty cool feature. I can imagine it being useful.

@supercrafter100
Member Author

> What about security? For example, will this execute arbitrary JavaScript? That would allow some form of SSRF.
>
> Other than that, it is a pretty cool feature. I can imagine it being useful.

Would disabling JavaScript on the headless browser fix this?

@Derkades
Member

Not entirely. GET requests are still possible, for example by adding an image to the page.
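
One possible mitigation for the image-request case is to refuse any request whose target is not a public address. The following is a minimal sketch in Python under stated assumptions, not something taken from this PR: the function name `is_internal_target` is hypothetical, and it only inspects literal IP addresses and `localhost`. A real filter would also have to resolve hostnames and check every resolved address, since a public-looking name can point at an internal IP.

```python
import ipaddress
from urllib.parse import urlsplit


def is_internal_target(url: str) -> bool:
    """Return True if the request should be refused before it is sent."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return True  # malformed URL: refuse rather than guess

    # Only plain web requests are allowed; this refuses file:, ftp:, data:, etc.
    if parts.scheme not in ("http", "https"):
        return True

    host = parts.hostname
    if host is None:
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return True

    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. ASSUMPTION: DNS resolution is handled elsewhere;
        # this sketch lets hostnames through unchecked.
        return False

    # is_global is False for loopback, private, link-local and reserved ranges.
    return not address.is_global


for candidate in (
    "http://127.0.0.1/",
    "http://169.254.169.254/latest/meta-data/",
    "http://[::1]:3000/",
    "file:///etc/passwd",
    "https://example.com/logo.png",
):
    print(candidate, is_internal_target(candidate))
```

A check like this would need to run on every request the page makes (images, stylesheets, redirects), not only on the URL posted in the message.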

@Derkades
Member

Derkades commented Jun 10, 2023

With JavaScript disabled, I personally can't think of any security risks apart from, of course, zero-day vulnerabilities in the various renderers and parsers involved. Running the headless browser in a restricted container would eliminate that concern for me.

Can Puppeteer work in a client-server setup where the actual browser runs in a restricted environment? It would be a lot more work if you needed to build a client-server model yourself.

@supercrafter100
Member Author

The entire bot already runs within a Docker container (using Pterodactyl).

@supercrafter100
Member Author

Or do you mean running the browser in a different container, separate from the bot?

@Derkades
Member

> Or do you mean running the browser in a different container, separate from the bot?

This. The bot deals with sensitive information; it has an API token which grants access to staff channels.

2 participants