Proposal: input.rawValue #5257

Closed
josepharhar opened this issue Feb 5, 2020 · 30 comments
Labels
addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest topic: forms

Comments

@josepharhar
Contributor

Currently, there is no way to retrieve the actual text a user has entered into certain types of input elements due to the sanitization that occurs when accessing the value property, particularly on the number and email input types. Without the ability to do this, virtual DOM framework authors are forced to guess what text the user entered to avoid clearing the text or selection when assigning to the value property. This guessing doesn't work perfectly, and they have to chase down bugs by adding more and more logic on top of input elements.

In order to address this problem, I would like to propose adding a read-only property for input elements called "rawValue" (suggestions on naming welcome). input.rawValue returns the actual text inside of textual input elements, as opposed to the sanitized value returned in the value property. For example, if a user enters "1234ee" into a number input, the rawValue property would return "1234ee," as opposed to the value property which returns an empty string.
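
The difference can be modelled in plain JavaScript. This is only a sketch, not real DOM code: `sanitizeNumberValue` and `modelNumberInput` are hypothetical helpers, the regular expression is an approximation of the HTML "valid floating-point number" grammar, and `rawValue` is the proposed property rather than something any browser ships.

```javascript
// Approximation of the HTML "valid floating-point number" grammar.
const VALID_FLOAT = /^-?(?:\d+(?:\.\d+)?|\.\d+)(?:[eE][-+]?\d+)?$/;

// Hypothetical helper: models what reading `value` does for type=number.
// Text that is not a valid floating-point number becomes the empty string.
function sanitizeNumberValue(text) {
  return VALID_FLOAT.test(text) ? text : "";
}

// Hypothetical model of a number input; `rawValue` is the proposed property.
function modelNumberInput(typedText) {
  return {
    rawValue: typedText,
    value: sanitizeNumberValue(typedText),
  };
}

const input = modelNumberInput("1234ee");
console.log(JSON.stringify(input.rawValue)); // "1234ee"
console.log(JSON.stringify(input.value));    // ""
```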

@tkent-google proposed this years ago, and some additional use cases he presented are:

  • Enabling selection APIs for more input types, which is an additional pain point noted by framework authors
  • Allowing JavaScript screenreaders to read the user-entered content in more input types

I created an explainer for the proposed rawValue property here: https://docs.google.com/document/d/1UrwLarU-o_-22OMddCH-fu5YJwmIKHwby3SR3_jQ9YM

cc @tkent-google @bzbarsky @mfreed7

@emilio
Contributor

emilio commented Feb 5, 2020

cc @masayuki-nakano

@bzbarsky
Contributor

bzbarsky commented Feb 6, 2020

@smaug---- as well.

@annevk annevk added addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest topic: forms labels Feb 6, 2020
@annevk
Member

annevk commented Feb 6, 2020

cc @whatwg/forms

@smaug----
Collaborator

The proposed raw values on date/time controls look surprising. What is shown to the user is rather UA dependent, and localization dependent. In theory type=time for example might not even have any textual UI, just some analog-like clock.

And does the proposed type=submit/reset handling work well in practice? The value shown to the user can be after all localized.

@josepharhar
Contributor Author

> The proposed raw values on date/time controls look surprising. What is shown to the user is rather UA dependent, and localization dependent. In theory type=time for example might not even have any textual UI, just some analog-like clock.

> And does the proposed type=submit/reset handling work well in practice? The value shown to the user can be after all localized.

Thanks for your feedback, I really appreciate it!

For the button, submit, and reset types, I just removed rawValue support for them from the explainer since the user doesn't actually type into them like the others. I originally included them simply because they had text, so it seemed like a good idea.

For the date/time input types, I included them because they are text fields the user types into.
If in any browsers they are not text fields, then I agree it would be a good reason to not have rawValue for them. In Chrome on Android and Safari on iOS, they appear as dropdowns rather than actual text fields, but are still clearly backed by text that could be read out. Does anyone know of any other browsers where they are not just showing text?
I agree that they are UA and locale dependent; they definitely have different text in them on mobile browsers at the very least. Does anyone consider this a blocker for having rawValue on date/time input types? If so, how come?
I'm not determined to add rawValue for these date/time input types because the main problems I'm trying to address which I've heard about are with the email and number types, since they perform significant sanitization on the text before returning it in the value property. If I continue to hear concerns instead of support for the date/time input types, I will gladly remove rawValue support for them.

While we are on this topic, does anyone have any feelings about whether or not we should include this for the password input type? In my explainer, I am showing that we should return the dots instead of the underlying text, because that's what appears in the text field. Does this seem helpful to anyone, or just more concerning?

@mfreed7
Collaborator

mfreed7 commented Feb 7, 2020

For date/time controls, or really any control for which the UA doesn’t display something text-based, then perhaps rawValue should default back to returning value. That would seem to be a safe fallback, which won’t allow frameworks to detect changes, but also likely won’t make them do any more work than they already have to do without a rawValue.

Any localization should be included in rawValue, I would think.

I’m not sure what to do about password. I suppose having rawValue return the dots makes the most sense, given that the display shows them. But I don’t really see the use case - I doubt any sanitization could/would ever be done on passwords. Maybe rawValue shouldn’t work on password.

@smaug----
Collaborator

I would be surprised if having localized rawValue wouldn't cause compat issues.
I could easily see some web site relying on browser A's rawValue when using English localization, say date 02/07/2020, but then some other browser B localized to, say, Chinese could have quite a different rawValue, 2020年2月7日.

@bzbarsky
Contributor

bzbarsky commented Feb 7, 2020

> Any localization should be included in rawValue, I would think.

Apart from compat issues, this is a fingerprinting problem, no?

@phistuck

phistuck commented Feb 7, 2020

> Apart from compat issues, this is a fingerprinting problem, no?

Always been there anyway, with (new Date()).toString() as well as (more recently) Intl date formatting, right?

@bzbarsky
Contributor

bzbarsky commented Feb 7, 2020

Those don't expose information about the browser localization (which may not match the OS locale, note), while form control internal values absolutely do.

@mfreed7
Collaborator

mfreed7 commented Feb 7, 2020

Good catch on the fingerprinting concern. Maybe this feature should require user activation? That shouldn't get in the developer's way, since the stated use case is "while the user is typing". I.e. activation is kind of built-in.

Also, that's another good concern about sites trying to interpret the returned value from rawValue, rather than just using it to check for changes, or announce it for the screen reader use case. I don't know how to discourage that, other than perhaps choosing a good name that indicates how it should be used?

@domenic
Member

domenic commented Feb 7, 2020

I'm not sure I understand the screen reader argument. Do there exist JavaScript screenreaders that don't have access to extension APIs?

@josepharhar
Contributor Author

Thanks again for all the comments! All of your insights are very helpful.

> I'm not sure I understand the screen reader argument. Do there exist JavaScript screenreaders that don't have access to extension APIs?

Unfortunately I don't really have much context on this, I just copied it over from Kent's proposal. Maybe @tkent-google could elaborate?

> I would be surprised if having localized rawValue wouldn't cause compat issues.
> I could easily see some web site relying on browser A's rawValue when using English localization, say date 02/07/2020, but then some other browser B localized to, say, Chinese could have quite a different rawValue, 2020年2月7日.

> Apart from compat issues, this is a fingerprinting problem, no?

I am leaning even more towards removing rawValue support for date/time inputs now due to the compat and fingerprinting issues. I haven't heard any interest in those input types from developers yet, so I think this new API would still be great without them.

> For date/time controls, or really any control for which the UA doesn’t display something text-based, then perhaps rawValue should default back to returning value. That would seem to be a safe fallback, which won’t allow frameworks to detect changes, but also likely won’t make them do any more work than they already have to do without a rawValue.

I think returning undefined would be better because it leaves us with more space to change the API later if needed. If there arises a strong use case for rawValue on date/time inputs, and we have a way to address the compat and fingerprinting concerns later, we could change it later. If planning on leaving room to change it later is a bad idea, then I suppose making rawValue reflect value for unsupported input types would be a good idea.

@tkent-google
Collaborator

> I'm not sure I understand the screen reader argument. Do there exist JavaScript screenreaders that don't have access to extension APIs?

> Unfortunately I don't really have much context on this, I just copied it over from Kent's proposal. Maybe @tkent-google could elaborate?

ChromeVox, a screenreader for ChromeOS, was not able to read 'raw value' when I proposed. I don't know the current status. Anyway, it's possible to expose such information via extension API, and we should focus on the virtual DOM usecases in this proposal.

@josepharhar
Contributor Author

> Those don't expose information about the browser localization (which may not match the OS locale, note), while form control internal values absolutely do.

@bzbarsky Are you saying that there is an OS locale which corresponds to (new Date()).toString() and a separate browser locale which corresponds to the text inside input elements? Are these concepts specced anywhere?

@bzbarsky
Contributor

> Are you saying that there is an OS locale which corresponds to (new Date()).toString()

First of all, (new Date()).toString() does not depend on locale very much. It's always going to output something like "Wed Feb 12 2020 20:49:14 GMT" (this is specced in https://tc39.es/ecma262/#sec-date.prototype.tostring which calls https://tc39.es/ecma262/#sec-datestring and then https://tc39.es/ecma262/#sec-timestring), followed by the timezone offset and the timezone name. The timezone offset depends on timezone but not locale. The timezone name is "implementation-dependent" in the spec (see https://tc39.es/ecma262/#sec-timezoneestring) and in practice seems to be somewhat localized. At least in Firefox the locale for this comes from ICU, which ends up returning the C locale, more or less. This is generally what one would think of as the "OS locale", though it doesn't actually have to match it depending on environment variables, etc.

> and a separate browser locale which corresponds to the text inside input elements?

There can be, yes. For example, it's quite possible to install a French-language Firefox on German-language Windows. In that case, I am pretty sure that the timezone stuff in Date will be in German, but the Firefox UI and the text inside input elements will be in French. This part is not specced anywhere, because the text inside inputs, being part of browser UI, is not specced anywhere.
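
The fixed shape of the (new Date()).toString() output can be checked with a short sketch. The regular expression is an approximation written for this example, based on the format described in the linked spec sections; the parenthesised timezone name is deliberately left unmatched because it is the implementation-dependent part.

```javascript
// Approximate pattern: weekday, month, day, year, hh:mm:ss, "GMT", offset.
// The trailing "(timezone name)" is implementation-dependent, so it is not matched.
const DATE_TO_STRING =
  /^[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{4} \d{2}:\d{2}:\d{2} GMT[+-]\d{4}/;

// The exact text depends on the timezone of the machine running this.
const text = new Date(Date.UTC(2020, 1, 12, 20, 49, 14)).toString();
console.log(text);

// Whether the locale-independent prefix has the expected shape.
console.log(DATE_TO_STRING.test(text));
```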

@josepharhar
Contributor Author

Thanks for all the comments everyone. I am planning on removing rawValue support for the date/time input types because it would reveal a lot of OS locale specific text, and I will update the explainer to match.
Number input types allow commas in some OS locales, but it is significantly less revealing than the locale specific text that shows up in date/time input types, requires the user to actually type the comma, and the number input type is a big focus of this proposal.

I am planning on making a TAG review for this feature tomorrow.

@smaug----
Collaborator

That comma vs period difference is not only about revealing privacy sensitive information, but it also increases the likelihood of web pages not working when the UA happens to use a different locale than what the page is expecting. I could easily see some web page starting to use the rawValue as a number (by passing it to parseFloat or some such).
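
A small sketch of that failure mode, assuming the raw text simply contains whatever decimal separator the user typed. The variable names are made up for illustration.

```javascript
// The same quantity as it might be typed under two locale conventions.
const typedWithPeriod = "1.5";
const typedWithComma = "1,5";

console.log(parseFloat(typedWithPeriod)); // 1.5
console.log(parseFloat(typedWithComma));  // 1 (parsing stops at the comma)
console.log(parseFloat(",5"));            // NaN
```

parseFloat silently truncates at the first character it cannot parse, so a page doing this would get a wrong number rather than an error.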

It is still a bit unclear to me what information the use cases for this feature actually need.

@josepharhar
Contributor Author

> That comma vs period difference is not only about revealing privacy sensitive information, but it also increases the likelihood of web pages not working when the UA happens to use a different locale than what the page is expecting. I could easily see some web page starting to use the rawValue as a number (by passing it to parseFloat or some such).

If a page wants to parse a number out of rawValue, then they should just use the value property, right? We could document this concept and make it clear that using parseFloat on rawValue is a bad idea.

> It is still a bit unclear to me what information the use cases for this feature actually need.

Accessing the actual text the user typed into the input element will help to prevent bugs where it gets swapped out with what gets read out of .value like this one: facebook/react#14356

@smaug----
Collaborator

smaug---- commented Mar 14, 2020

That is about email. What about number?

I'm mostly just wondering if we could fulfill the needs for the use cases in some other way.
But before that one needs to understand what the use cases exactly are.

@josepharhar
Contributor Author

For number inputs, rawValue would be used to determine if the value in React's virtual DOM should be assigned to the input's value property, which currently happens here.
@nhunzaker is this right? Do you have anything to add?
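
A minimal sketch of that decision, using plain objects in place of DOM nodes. `syncValue` and the shape of `node` are hypothetical and are not React's actual code; `rawValue` is the proposed property.

```javascript
// Hypothetical helper: assign the virtual-DOM value only when it differs from
// the text the user actually sees, so in-progress typing is not clobbered.
function syncValue(node, nextValue) {
  const next = String(nextValue);
  if (node.rawValue !== next) {
    node.value = next;
    node.rawValue = next; // in a browser the visible text would follow `value`
    return true;
  }
  return false;
}

// The user has typed "1234ee"; `value` reads as "" because of sanitization,
// while the framework's state holds the text it last saw.
const node = { rawValue: "1234ee", value: "" };
console.log(syncValue(node, "1234ee")); // false: visible text already matches
console.log(syncValue(node, "42"));     // true: state changed, so assign
```

Comparing against `value` instead would see "" and reassign on every render, which is the clobbering the proposal is trying to avoid.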

However, number inputs in React don't currently appear to be very broken. After looking for open bugs on React about number inputs, I found several that have already been fixed like this one, which I believe were due to React parsing things into numbers before comparing them.

I found another bug which is still open which alludes more to rawValue, where a user wants to know when the input is empty vs having content which is not a valid number. As the user pointed out, this is possible by looking at the input's validity properties, so maybe that one is more of a React problem than a browser problem. After thinking about this case some more though, I could imagine an alternative where rawValue is also assignable, which would give React the power to have values which aren't valid numbers in its virtual DOM and still stay synced with the input, and would then give that user the power to see whatever text is inside of the input with React's current semantics. However, I'd imagine that making rawValue assignable would pose further issues.
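
The validity-based check mentioned above could look roughly like this. `classifyNumberInput` is a hypothetical helper; `value` and `validity.badInput` are standard HTMLInputElement members, and the plain objects below merely stand in for real elements so the sketch runs outside a browser.

```javascript
// Distinguish "nothing typed" from "typed text that is not a valid number".
// Both cases read as value === "", so badInput is what tells them apart.
function classifyNumberInput(input) {
  if (input.validity.badInput) return "invalid text";
  return input.value === "" ? "empty" : "valid number";
}

console.log(classifyNumberInput({ value: "", validity: { badInput: false } }));   // empty
console.log(classifyNumberInput({ value: "", validity: { badInput: true } }));    // invalid text
console.log(classifyNumberInput({ value: "12", validity: { badInput: false } })); // valid number
```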

@josepharhar
Contributor Author

Update:
I found that many of the React bugs I've been focused on are due to another issue with assigning to the value attribute messing with the visible text, which I opened another issue for here: #5427

There are still some people who really want direct control over the visible text in input elements like these ones, which are really asking for a readable and writable rawValue property:

However, both of these can be worked around by adding event listeners that prevent spaces from being entered into email inputs...
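
One way such a listener could look, as a sketch only. `blockWhitespace` and `fakeEvent` are hypothetical names, and the handler is exercised here with a stand-in event object. It only covers insertions that carry their text in `event.data`, so pasted or autofilled text would need separate handling.

```javascript
// Hypothetical handler for the "beforeinput" event: cancel any insertion
// whose text contains whitespace.
function blockWhitespace(event) {
  if (typeof event.data === "string" && /\s/.test(event.data)) {
    event.preventDefault();
  }
}

// In a page (assuming `emailInput` is an <input type="email"> element):
//   emailInput.addEventListener("beforeinput", blockWhitespace);

// Stand-in for an InputEvent so the sketch runs outside a browser.
function fakeEvent(data) {
  return {
    data,
    defaultPrevented: false,
    preventDefault() { this.defaultPrevented = true; },
  };
}

const space = fakeEvent(" ");
blockWhitespace(space);
console.log(space.defaultPrevented); // true

const letter = fakeEvent("a");
blockWhitespace(letter);
console.log(letter.defaultPrevented); // false
```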

I'm leaning towards dropping input.rawValue, but I'll update again soon.
Thanks for all the feedback and questions!

@nhunzaker

nhunzaker commented Apr 2, 2020

Sorry to just catch up on this.

> I found that many of the React bugs I've been focused on are due to another issue with assigning to the value attribute messing with the visible text, which I opened another issue for here: #5427

Yep, to the point where React avoids assigning the attribute when a user is editing number inputs as a last resort to maintain the behaviour of synchronizing the value attribute (it's synchronized again on blur).

Eliminating this side-effect would be awesome. However, there are other side-effects of React synchronizing the value attribute, such as breaking autofill, that still suggest the synchronization should be removed. The team just hasn't deemed it the right time to make that breaking change.

> However, both of these can be worked around by adding event listeners that prevent spaces from being entered into email inputs

I suppose that's an option. The way controlled inputs work in React definitely runs against the grain. Still I can't help but wish there was a good way to support them without asking users to add workarounds. Still, that might ultimately be unavoidable.

> I'm leaning towards dropping input.rawValue, but I'll update again soon.

Either way, thank you for all of your work on this!

@josepharhar
Contributor Author

The defaultValue clobbering fix is now in stable Chrome, and I am about to remove the rawValue experiment from Chrome.

@rniwa
Collaborator

rniwa commented Sep 8, 2020

Are there any WPT tests added for your change? We never made any change to WebKit so I'd imagine this is still an issue on iOS.

@josepharhar
Contributor Author

@rniwa Yes, I added a WPT for the defaultValue clobbering problem here.

I am pretty sure that the following patch would fix the same problem for WebKit; the code is pretty much the same as what I changed in Chrome. Are you interested in merging it in WebKit?

diff --git a/Source/WebCore/html/HTMLInputElement.cpp b/Source/WebCore/html/HTMLInputElement.cpp
index ccd10619bb44..6d3de6f3502d 100644
--- a/Source/WebCore/html/HTMLInputElement.cpp
+++ b/Source/WebCore/html/HTMLInputElement.cpp
@@ -764,8 +764,8 @@ void HTMLInputElement::parseAttribute(const QualifiedName& name, const AtomStrin
         if (!hasDirtyValue()) {
             updatePlaceholderVisibility();
             invalidateStyleForSubtree();
+            setFormControlValueMatchesRenderer(false);
         }
-        setFormControlValueMatchesRenderer(false);
         updateValidity();
         m_valueAttributeWasUpdatedAfterParsing = !m_parsingInProgress;
     } else if (name == checkedAttr) {

blueboxd pushed a commit to blueboxd/chromium-legacy that referenced this issue Sep 8, 2020
I decided to not move forward with this experiment here:
whatwg/html#5257 (comment)

Bug: 1126053
Change-Id: Ic6dd7446f14d38dc5f9f5b9538a9278c88528ef1
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2399119
Reviewed-by: Mason Freed <masonfreed@chromium.org>
Commit-Queue: Joey Arhar <jarhar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#805053}
@emilio
Contributor

emilio commented Sep 8, 2020

> @rniwa Yes, I added a WPT for the defaultValue clobbering problem here.

That test seems to be failing everywhere? https://wpt.fyi/results/html/semantics/forms/the-input-element/defaultValue-clobbering.html

@josepharhar
Contributor Author

> That test seems to be failing everywhere? https://wpt.fyi/results/html/semantics/forms/the-input-element/defaultValue-clobbering.html

Yikes! The test works correctly with Chromium's command line test runner, but there must be something different about the actual WPT test runner. I'll look into it.
Thanks for pointing this out.

@zcorpan
Member

zcorpan commented Sep 8, 2020

@josepharhar https://web-platform-tests.org/writing-tests/testdriver.html says testdriver.js is only supported for testharness.js tests. cc @gsnedders

@josepharhar
Contributor Author

I was able to test the same behavior without using a reftest, which should make it stop timing out.
I made a PR here: web-platform-tests/wpt#25442

mjfroman pushed a commit to mjfroman/moz-libwebrtc-third-party that referenced this issue Oct 14, 2022
I decided to not move forward with this experiment here:
whatwg/html#5257 (comment)

Bug: 1126053
Change-Id: Ic6dd7446f14d38dc5f9f5b9538a9278c88528ef1
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2399119
Reviewed-by: Mason Freed <masonfreed@chromium.org>
Commit-Queue: Joey Arhar <jarhar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#805053}
GitOrigin-RevId: 153159b0b83f6ef670d2b329e526018043cd430a