Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Embed images support in libpurple #102

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Roger
Copy link
Contributor

@Roger Roger commented May 19, 2017

Parse tags in messages, store the image in the filesystem and
replace the tag with a url to the image.

This add two new settings, webserver_root and webserver_url, and only
stores images if they exist.

The user should configure an http server, pointing to the webserver_root
and serving to webserver_url

Example Config:
webserver_url = http://example.com
webserver_root = /srv/bitlbee/

@dequis
Copy link
Member

dequis commented May 21, 2017

I reviewed this thing out-of-band with roger before he opened the PR so the code already looks good to me and I'd merge it. Probably adding some documentation afterwards, like including the new settings in bitlbee.conf

But I would like to hear opinions from the others (@Wilm0r @sm00th @jgeboski @EionRobb) regarding the "web_directory" and "web_url" global settings

(btw @Roger your commit log got the names wrong, fix that)

The idea has been suggested before by seirl in trac ticket 1192 ("Leave the http server responsibility to the user, and display an URL formatted like the user configured it")

And spectrum2 (which is like bitlbee for xmpp) did this before too, see SpectrumIM/spectrum2@65d5b4f

It's disabled by default so that's safe. There's no size limit (spectrum has 1mb limit, which sounds too low - probably worth adding a third setting to do that). There's also no maximum size limit so running out of disk space is a real risk, which we could document and maybe later provide some ways to help keep it limited.

There's no structure under web_directory, just filenames that are "sha256.ext". That's probably ok, not sure if i'd want to add even more settings, and being in a single place helps with deduplication (files sent more than once are not written more than once)

Anyway, just looking for opinions on the general concept before introducing this as a new feature.

regex = g_regex_new("<img id=\"(\\d+)\">", 0, 0, NULL);
}

message = g_regex_replace_eval(regex, message_, -1, 0, 0, replace_images_cb, NULL, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using a regex, use purple_markup_find_tag() which processes xml and handles the case of that string you're trying to match being embedded in other strings

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but that's just the finding part. The replacement part requires doing fiddly pointer arithmetic. I'm not sure it's worth it if this is just for the edge case where someone does <tag attr='<img id="123">'>

path = g_strdup_printf("%s/%s.%s", global.conf->web_directory, hash, ext);

// only save images that don't exist in the filesystem, in protocols like telegram lots of images are duplicated(stickers)
if (access(path, F_OK) == -1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The man page says to not ever use access() in production. Perhaps use g_file_test() instead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The man page says to not ever use access() in production

What version of the man page and where? g_file_test() uses access() (for the unix part, and we don't care about the win32 part here. sorry eion). Probably a good idea to use g_file_test() anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://linux.die.net/man/2/access in the notes "For this reason, the use of this system call should be avoided"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, the time-of-check-time-of-use vuln. That's not about the system call itself, it's about how it's used (g_file_test() can be used incorrectly like that too, the docs mention it). We're not checking for permissions here, just for existence, so the risk is that a file with the same name is created between the check and our overwrite. Non-issue imo.

g_string_append_printf(res, "%s/%s.%s", global.conf->web_url, hash, ext);

g_free(id);
g_free(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've got a memleak on hash, you need to g_free() it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 1purple_imgstore_unref(image) still needed?

@rodneyrod
Copy link

What's the progress on this one?

@Roger
Copy link
Contributor Author

Roger commented Jan 17, 2018

I have some code done.. but had no time to finish it

@dequis dequis added the WIP label Feb 11, 2018
@Roger
Copy link
Contributor Author

Roger commented Feb 12, 2018

This is the last I was working, if someone want to try to finish it

Roger@a04c222

Parse <img> tags in messages, store the image in the filesystem and
replace the tag with a url to the image.

This add two new settings, web_directory and web_url, and only
stores images if they exist.

The user should configure an http server, pointing to the web_directory
and serving to web_url

Example Config:
web_url = http://example.com
web_directory = /srv/bitlbee/

unref images from purple_imgstore after store them in the filesystem

WIP: rewrite replace_images with purple_markup_find_tag
@Thaodan
Copy link
Contributor

Thaodan commented Jun 12, 2019

What about remote access? Do you store image data per ident?

@Roger
Copy link
Contributor Author

Roger commented Jun 13, 2019

What about remote access? Do you store image data per ident?

The remote access should be configured as an external http server serving the images directory, the images are stored in one directory using the hash and the extension only, not per ident

@Thaodan
Copy link
Contributor

Thaodan commented Jun 13, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
5 participants