Thread: Stripping comments and whitespace from javascript

  1. #1
    Sencha - Community Support Team Condor's Avatar
    Join Date
    Mar 2007
    Location
    The Netherlands
    Posts
    24,246
    Vote Rating
    122
      0  

    Default Stripping comments and whitespace from javascript

    I'm searching for a javascript compressor that can remove all comments, trailing whitespace and multiple newlines from a javascript file.

    You can disable obfuscation in most compressors, but I also don't want to remove newlines (only multiple newlines).

    Does anyone know such a compressor (a modified version of JSMin perhaps)?

    Background:
    The new JSBuilder2 tool creates an ext-all-debug.js file with all comments. This not only makes it load slowly, but also makes it almost impossible to search, because my searches get too many hits in the API docs.

  2. #2
    Sencha - Ext JS Dev Team Animal's Avatar
    Join Date
    Mar 2007
    Location
    Bédoin/Redwood City
    Posts
    30,626
    Vote Rating
    56
      0  

    Default

    Doesn't yuicompressor do all this?
    Longtime Sencha geek. Outspoken advocate of pure Javascript Views. Posts my own opinions.

  3. #3
    Sencha - Ext JS Dev Team Animal's Avatar
    Join Date
    Mar 2007
    Location
    Bédoin/Redwood City
    Posts
    30,626
    Vote Rating
    56
      0  

    Default

    No, looking at its documentation, there is no -preserve-newline switch.

    I just requested one: http://yuilibrary.com/projects/yuico...ticket/2527982

  4. #4
    Sencha - Community Support Team Condor's Avatar
    Join Date
    Mar 2007
    Location
    The Netherlands
    Posts
    24,246
    Vote Rating
    122
      0  

    Default

    No, YUI compressor with --nomunge and --disable-optimizations still removes newlines and leading whitespace.

    JSMin in minimal mode keeps newlines, but still removes leading whitespace.

  5. #5
    Sencha User mystix's Avatar
    Join Date
    Mar 2007
    Location
    Singapore
    Posts
    6,236
    Vote Rating
    5
      0  

    Default

    you can get line breaks back out of yuicompressor with the --line-break 0 option (that's a zero, not the letter O). note that it puts a line break after every semicolon rather than keeping the original newlines.

    which means leading whitespace is the only remaining thing that will need to be taken care of.
    (we should probably put up a request for tab-to-space conversion in yuicompressor too).
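
The options mentioned so far can be combined in one call. A rough Ruby sketch that only builds the command line (it does not run anything); the function name and the file names in the usage note are placeholders, not something from this thread:

```ruby
# Build (but do not execute) a YUI Compressor command line.
# jar_path, input and output are placeholders supplied by the caller.
def yui_command(jar_path, input, output)
  [
    "java", "-jar", jar_path,
    "--type", "js",
    "--nomunge",               # minify only, do not rename local variables
    "--disable-optimizations", # skip the micro optimizations
    "--line-break", "0",       # line break after each semicolon
    "-o", output,
    input
  ]
end
```

To actually run it you could pass the array to system, e.g. system(*yui_command("yuicompressor.jar", "ext-all-debug.js", "ext-all-stripped.js")), assuming java and the jar are available on your machine.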

  6. #6
    Sencha User vmorale4's Avatar
    Join Date
    Mar 2007
    Location
    Chicago, IL
    Posts
    189
    Vote Rating
    1
      0  

    Default Maybe a two step process?

    You could try a two-step approach, i.e.:

    1. Run the file through a minifier, so that it removes comments, whitespace, etc.
    2. Run the resulting file through a code formatter to make it readable again.

    I like to use a code formatter called Polystyle (Windows only), as it has a ton of config options.

  7. #7
    Sencha - Community Support Team Condor's Avatar
    Join Date
    Mar 2007
    Location
    The Netherlands
    Posts
    24,246
    Vote Rating
    122
      0  

    Default

    Running the code through a formatter would probably change the code style significantly.

    I can't use that, since I often create overrides based on the content of ext-all-debug.js (which need to resemble the original code closely).

  8. #8
    Ext JS Premium Member dj's Avatar
    Join Date
    Mar 2007
    Location
    Germany
    Posts
    573
    Vote Rating
    3
      0  

    Default

    Just do some RegEx magic in your favorite scripting language. For example, this is how it could look in Ruby:

    Code:
    #!/usr/bin/env ruby
    #
    # strips comments from JavaScript files while preserving 
    # the overall structure of the file.
    #
    # usage:
    # strip-debug-version.rb inputfile1.js inputfile2.js
    # or
    # strip-debug-version.rb < inputfile.js
    
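    def strip_extjs_debug_version(data)
      strings = []
      # Stash every string literal (including empty ones and ones with escaped
      # quotes) so the comment patterns below cannot touch it. Quotes inside
      # comments or regex literals can still confuse this pattern.
      stripped_data = data.gsub(/(['"])(?:\\.|(?!\1)[^\\\n])*\1/) {
        strings.push $&
        "!temp-string-replacement-#{strings.size - 1}!" # 0-based, matches the lookup below
      }
      stripped_data = stripped_data.gsub(/\/\*.*?\*\//m, "\n") # block comments
      stripped_data = stripped_data.gsub(/\/\/.*$/, "")        # line comments
      stripped_data = stripped_data.gsub(/[ \t]+$/, "")        # trailing whitespace
      stripped_data = stripped_data.gsub(/\n+/, "\n")          # multiple newlines
      # put the stashed strings back
      stripped_data.gsub(/!temp-string-replacement-([0-9]+)!/) { strings[$1.to_i] }
    end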
    
    if __FILE__ == $0
      puts strip_extjs_debug_version(ARGF.read) # reads stdin or arguments 
      #puts strip_extjs_debug_version(DATA.read) # for testing
    end
    
    # test input
    __END__
    /**
     * doc comment
     */
    function test(arg) {
      '// comment or /* comment */ are still there'
      // comment
      alert('arg is \''+arg+"'");    
    
      
    
    }
    Daniel Jagszent
    d??iel@??gsze?t.de <- convert to plain ASCII to get my email address

  9. #9
    Sencha - Community Support Team Condor's Avatar
    Join Date
    Mar 2007
    Location
    The Netherlands
    Posts
    24,246
    Vote Rating
    122
      0  

    Default

    I finally ended up writing it myself (in Java):
    Code:
    return Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL).matcher(script).replaceAll("") // Remove /*comments*/
    		.replaceAll("//.*", "") // Remove //comments
    		.replaceAll("\r\n", "\n") // DOS -> Unix linefeeds
    		.replaceAll("\\s+\n", "\n") // Trim trailing whitespace
    		.replaceAll("\t", "    ") // Tabs to spaces
    		.replaceAll("\n\n", "\n") // Remove duplicate linefeeds
    		.replaceAll("^\n", ""); // Remove leading linefeeds
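
For anyone who wants to try just the whitespace clean-up steps on their own, here is a rough Ruby sketch of them (Ruby only to match the script earlier in the thread). It is not a line-by-line port of the Java above: the function name is made up for illustration, the patterns are slightly different, and comment removal is deliberately left out:

```ruby
# Whitespace clean-up only; comments are not touched here.
def normalize_whitespace(text)
  out = text.gsub("\r\n", "\n")   # DOS -> Unix line endings
  out = out.gsub(/[ \t]+$/, "")   # trailing spaces and tabs on every line
  out = out.gsub("\t", "    ")    # remaining tabs -> four spaces
  out = out.gsub(/\n{2,}/, "\n")  # collapse runs of newlines into one
  out.sub(/\A\n/, "")             # drop a newline at the very start
end
```

Leading indentation is kept (only converted from tabs to spaces), which matters if the result should still resemble the original source.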

  10. #10
    Ext JS Premium Member dj's Avatar
    Join Date
    Mar 2007
    Location
    Germany
    Posts
    573
    Vote Rating
    3
      0  

    Default

    test your code with the test from my script above:
    Code:
    /**
     * doc comment
     */
    function test(arg) {
      '// comment or /* comment */ are still there'
      // comment
      alert('arg is \''+arg+"'");    
    }
    Comment markers inside strings should be preserved. One fairly common case that will break if they are not preserved is
    Code:
    url = 'http://www.example.com';
    Because regular expressions have no awareness of context (apart from assertions like \b), you cannot do it reliably with a plain regular-expression replacement alone.
    In the script above I replaced all non-empty strings with temporary values, did the regular expression magic and reinserted the strings. That's one common way to handle those cases where you need to replace something with context awareness.
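
A minimal Ruby sketch of that replace-then-restore idea, applied to the URL example. The helper names and the __STR…__ placeholder format are made up for illustration, and the sketch assumes the source never contains such a token itself. It only deals with // comments and does not understand regex literals or quotes inside comments:

```ruby
# Matches a single- or double-quoted string literal on one line,
# allowing backslash escapes such as \' inside it.
STRING_LITERAL = /(['"])(?:\\.|(?!\1)[^\\\n])*\1/

# Naive version: everything after "//" is treated as a comment.
def strip_line_comments_naive(js)
  js.gsub(/\/\/.*$/, "")
end

# Context-aware version: hide strings, strip comments, restore strings.
def strip_line_comments_protected(js)
  stash = []
  masked = js.gsub(STRING_LITERAL) {
    stash << $&                   # remember the original literal
    "__STR#{stash.size - 1}__"    # hypothetical placeholder token
  }
  without_comments = masked.gsub(/\/\/.*$/, "")
  without_comments.gsub(/__STR(\d+)__/) { stash[$1.to_i] }
end
```

With the line url = 'http://www.example.com'; the naive version cuts the line off right after http: while the protected version returns it unchanged.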
