Escape path separator in a regular expression

I need to write a regular expression that finds JavaScript files whose paths match

<anypath><slash>js<slash><anything>.js

For example, it should work for both:

  • c:\mysite\js\common.js (Windows)
  • /var/www/mysite/js/common.js (UNIX)

The problem is that the Windows file separator (a backslash) is not being properly escaped:

pattern = Pattern.compile(
     "^(.+?)" + 
     File.separator +
     "js" +
     File.separator +
     "(.+?).js$" );

On Windows this throws:

java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence

Is there any way to use a common regular expression that works on both Windows and UNIX systems?


Asked by: Jack706 | Posted: 21-01-2022






Answer 1

Does Pattern.quote(File.separator) do the trick?

EDIT: Pattern.quote is available in Java 1.5 and later. For 1.4, simply escape the file separator character:

"\\" + File.separator

Escaping punctuation characters will not break anything, but escaping letters or numbers unconditionally will either change them to their special meaning or lead to a PatternSyntaxException. (Thanks Alan M for pointing this out in the comments!)
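A minimal sketch of the Pattern.quote approach, for illustration only. The class name QuotedSeparatorDemo and the jsPattern helper are hypothetical, and the separator is passed in as a parameter (rather than read from File.separator) so that both cases can be tried on a single machine. The dot before "js" is also escaped here, which the question's pattern does not do.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the question.
final class QuotedSeparatorDemo {

    // Builds the question's pattern for the given separator. Pattern.quote
    // makes the separator a literal, so a backslash is no longer read as
    // the start of a regex escape sequence.
    static Pattern jsPattern(String separator) {
        String sep = Pattern.quote(separator);
        return Pattern.compile("^(.+?)" + sep + "js" + sep + "(.+?)\\.js$");
    }

    public static void main(String[] args) {
        // Real code would pass File.separator instead of a hard-coded value.
        System.out.println(
            jsPattern("\\").matcher("c:\\mysite\\js\\common.js").matches());
        System.out.println(
            jsPattern("/").matcher("/var/www/mysite/js/common.js").matches());
    }
}
```

Note that a pattern built this way only accepts the separator it was built with, so it suits paths produced on the same platform the code runs on.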

Answered by: Roland486 | Posted: 22-02-2022



Answer 2

Is there any way to use a common regular expression that works in both Windows and UNIX systems ?

Yes, just use a regex that matches both kinds of separator.

pattern = Pattern.compile(
    "^(.+?)" + 
    "[/\\\\]" +
    "js" +
    "[/\\\\]" +
    "(.+?)\\.js$" );

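It's safe on Windows, which permits neither character in a file or directory name. Unix forbids only "/" in names; a backslash is legal there, though rare, so a Unix name containing one could match unexpectedly.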
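As a rough illustration, the same pattern can be tried against both example paths from the question. The class name EitherSeparatorDemo and the baseName helper are made up for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern itself comes from the answer.
final class EitherSeparatorDemo {

    // In regex terms the character class is [/\\]: it accepts a forward
    // slash or a backslash, regardless of the platform the code runs on.
    static final Pattern JS_FILE =
        Pattern.compile("^(.+?)[/\\\\]js[/\\\\](.+?)\\.js$");

    // Returns the file name without ".js", or null if the path does not match.
    static String baseName(String path) {
        Matcher m = JS_FILE.matcher(path);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(baseName("c:\\mysite\\js\\common.js"));
        System.out.println(baseName("/var/www/mysite/js/common.js"));
        System.out.println(baseName("/var/www/mysite/css/site.css"));
    }
}
```

Because the pattern does not depend on File.separator, it also handles Windows-style paths processed on a Unix machine and vice versa.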

Answered by: Ada622 | Posted: 22-02-2022



Answer 3

Can't you just use a backslash to escape the path separator like so:

pattern = Pattern.compile(
     "^(.+?)\\" + 
     File.separator +
     "js\\" +
     File.separator +
     "(.+?)\\.js$" );

Answered by: John208 | Posted: 22-02-2022



Answer 4

Why don't you escape File.separator:

... +
"\\" + File.separator +
...

to fit Pattern.compile's requirements? I would expect "\/" (the Unix case) to be treated as a single "/".

Answered by: Julian523 | Posted: 22-02-2022



Answer 5

I've tested gimel's answer on a Unix system: putting "\\" + File.separator in the pattern works fine, and the resulting "\/" correctly matches a single "/".
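The behaviour described in these answers can be checked for both separators on one machine with a sketch like the following. EscapedSeparatorDemo and its two helpers are hypothetical names, and the separator is passed in as a parameter instead of being read from File.separator.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the names are illustrative only.
final class EscapedSeparatorDemo {

    // Prefixes each separator with a regex backslash, as the answers suggest.
    // This yields \\ for Windows and \/ for Unix, both of which are literals.
    static Pattern escaped(String separator) {
        return Pattern.compile(
            "^(.+?)\\" + separator + "js\\" + separator + "(.+?)\\.js$");
    }

    // Mirrors the question's approach: the separator is inserted unescaped.
    // Returns false if the resulting regex fails to compile.
    static boolean compilesUnescaped(String separator) {
        try {
            Pattern.compile(
                "^(.+?)" + separator + "js" + separator + "(.+?)\\.js$");
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(
            escaped("\\").matcher("c:\\mysite\\js\\common.js").matches());
        System.out.println(
            escaped("/").matcher("/var/www/mysite/js/common.js").matches());
        // With a bare backslash the regex contains \j, which is not a valid escape.
        System.out.println(compilesUnescaped("\\"));
    }
}
```

This works only because both separators are punctuation; the same trick applied to a letter or digit would change its meaning or fail to compile.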

Answered by: Rebecca846 | Posted: 22-02-2022


