1. #1
    Sencha User
    Join Date
    Nov 2011
    Posts
    39
    Answers
    5
    Vote Rating
    0
    Araberen is on a distinguished road

    Unanswered: mp


    Hi,

    Let's say I allow users to enter raw HTML in a web page. Let's say users are stupid and don't know how to write valid HTML. How can I clean up and simplify their code?

    Examples:
    Code:
    <b>lorem</b><b> ipsum</b>
    is simplified to
    Code:
    <b>lorem ipsum</b>
    ---
    Code:
    <b>lorem <i>ipsum</b> dolor</i>
    is cleaned up to
    Code:
    <b>lorem <i>ipsum</i></b> dolor
    (or something similar).

    Is there any Ext JS function or plugin to do that? Or any external JS library?
    I've been trying to make my own algorithm but it's not really trivial...

  2. #2
    Sencha Premium Member skirtle
    Join Date
    Oct 2010
    Location
    UK
    Posts
    3,568
    Answers
    539
    Vote Rating
    307
    skirtle has a brilliant future



    Please give your threads meaningful titles. 'mp' doesn't give much of a hint what the thread is about.

    What about just setting it as the innerHTML of a DOM node then reading it back out again?
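
    A minimal sketch of that round trip, assuming it runs in a browser. `normalizeHtml` is a made-up helper name, not an Ext JS function, and the second parameter only exists so a different `document` can be passed in.

```javascript
// Sketch only: normalizeHtml is a hypothetical helper, not part of Ext JS.
// Assumes a browser-like `document`; another one can be passed as `doc`.
function normalizeHtml(html, doc = globalThis.document) {
  // Detached element: it is never added to the page.
  const container = doc.createElement('div');
  // The browser's HTML parser repairs mis-nested or unclosed tags here.
  container.innerHTML = html;
  // Reading the property back serializes the repaired tree.
  return container.innerHTML;
}
```

    With the second example from the first post, `normalizeHtml('<b>lorem <i>ipsum</b> dolor</i>')` should come back in current browsers as something like `<b>lorem <i>ipsum</i></b><i> dolor</i>`. The tags end up properly nested, but nothing is simplified.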

  3. #3
    Sencha User Araberen


    Sorry... I did enter a proper title, but something probably happened with my keyboard... ^^ (and now I can't edit the post title).

    It seems to work for cleaning up the code. At least, when I type <b>lorem <i>ipsum</b> dolor</i>, the Elements panel of my Chrome inspector shows a well-formed HTML tree. So I guess I could read it back from the DOM and get the tags in the right order.

    But I also (and mostly) want to simplify the code. Stuff like <b><b>OK</b></b> shouldn't exist...

  4. #4
    Sencha Premium Member skirtle


    Technically there's nothing wrong with nested <b> tags. Given suitable CSS, the inner tag could easily be styled differently from the outer tag. While I understand where you're coming from, the requirement not to have nested <b> tags isn't really part of HTML; it's a requirement you've layered on top. As such, I suspect you'll struggle to find a library to do it.

    My first thought for how I'd implement this is also to use DOM nodes. That avoids any issues with invalid HTML and you can then navigate the tree and make your modifications before reading back the contents.
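
    A rough sketch of what that tree walk could look like, using only standard DOM properties and methods (`firstChild`, `nextSibling`, `insertBefore`, `appendChild`, `removeChild`). The function names and the list of tags in `FORMATTING` are assumptions made for illustration. Only attribute-free formatting tags are touched, since a class or style attribute could make two tags genuinely different.

```javascript
// Sketch only: all names here are hypothetical, and FORMATTING is an
// assumed list of tags that are safe to merge or unwrap.
const FORMATTING = new Set(['B', 'I', 'U', 'EM', 'STRONG']);

function isMergeable(node) {
  // Element node, known formatting tag, no attributes.
  return node.nodeType === 1 &&
    FORMATTING.has(node.tagName) &&
    node.attributes.length === 0;
}

// Pass 1: unwrap a tag nested inside the same tag, e.g. <b><b>OK</b></b>.
function unwrapNested(node, openTags = []) {
  let child = node.firstChild;
  while (child) {
    const next = child.nextSibling;
    if (isMergeable(child) && openTags.includes(child.tagName)) {
      const firstMoved = child.firstChild;
      // Hoist the children in front of the wrapper, then drop the wrapper.
      while (child.firstChild) node.insertBefore(child.firstChild, child);
      node.removeChild(child);
      child = firstMoved || next; // revisit the hoisted children
      continue;
    }
    if (child.nodeType === 1) {
      const tags = isMergeable(child) ? openTags.concat(child.tagName) : openTags;
      unwrapNested(child, tags);
    }
    child = next;
  }
}

// Pass 2: merge adjacent siblings with the same tag, e.g. <b>a</b><b>b</b>.
function mergeTwins(node) {
  let child = node.firstChild;
  while (child) {
    const next = child.nextSibling;
    if (isMergeable(child) && next && isMergeable(next) &&
        next.tagName === child.tagName) {
      while (next.firstChild) child.appendChild(next.firstChild);
      node.removeChild(next);
      continue; // child may now sit next to another twin
    }
    if (child.nodeType === 1) mergeTwins(child);
    child = next;
  }
}

function simplifyTree(root) {
  unwrapNested(root);
  mergeTwins(root);
  return root;
}
```

    In a browser, setting `div.innerHTML = '<b>lorem</b><b> ipsum</b>'` on a detached div, calling `simplifyTree(div)` and reading `div.innerHTML` back should give `<b>lorem ipsum</b>`. Tags separated by whitespace text are left alone in this sketch.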

    Given that your markup rules don't appear to match the rules of HTML, you might want to consider an alternative markup format that can be converted to HTML, like the ones used by wikis. Conventions like surrounding bold text with stars avoid the nesting issue, as the opening and closing markers are identical.
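
    To make the star idea concrete, here is a tiny sketch of such a converter. The `*bold*` rule is an assumed convention rather than any particular wiki's syntax, and `starsToHtml` and `escapeHtml` are made-up names.

```javascript
// Sketch only: hypothetical helpers for an assumed *bold* convention.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function starsToHtml(text) {
  // Escape first, so any raw tags the user typed are shown literally.
  // Then turn each *...* pair into <b>...</b>; a lone star is left as is.
  return escapeHtml(text).replace(/\*([^*]+)\*/g, '<b>$1</b>');
}
```

    For example, `starsToHtml('*lorem* ipsum')` returns `<b>lorem</b> ipsum`. Since the user never types tags directly, every tag in the output is opened and closed by the converter itself.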
