Thread: Regular Expression help needed - URL rewrite

  1. #1
    mmh

    I'm currently migrating servers from IIS 6 to IIS 7.5. On IIS 6 we had an ISAPI plugin that did this for us, and I'm now trying to do the same with Microsoft's IIS URL Rewrite extension. However, I'm struggling with the regular expression needed.

    Basically we have a site that runs and it converts the first / after the file extension (.asp) to ? and all subsequent forward slashes to &, it also changes _ to =

    so a url of someurl.com/gig.asp/venue_101/gig_1011/from_20131001/to_20131002

    would actually be someurl.com/gig.asp?venue=101&gig=1011&from=20131001&to=20131002
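
    The mapping described above can be sketched in JavaScript as plain character substitution, without any back-references. This only illustrates the intended conversion and is not the IIS rule itself; the function name toQueryUrl is made up for the example, and it assumes, as described, that every underscore after the page name becomes an equals sign.

```javascript
// Hypothetical helper: converts the /key_value segments after ".asp" into a query string.
function toQueryUrl(url) {
  const marker = ".asp/";
  const at = url.indexOf(marker);
  if (at === -1) return url; // nothing after .asp to convert
  const page = url.slice(0, at + ".asp".length);
  // Take everything after ".asp/" and drop any trailing slashes.
  const rest = url.slice(at + marker.length).replace(/\/+$/, "");
  if (rest === "") return page;
  // The first "/" has become "?"; every remaining "/" becomes "&" and every "_" becomes "=".
  return page + "?" + rest.replace(/\//g, "&").replace(/_/g, "=");
}

console.log(toQueryUrl("someurl.com/gig.asp/venue_101/gig_1011/from_20131001/to_20131002"));
// someurl.com/gig.asp?venue=101&gig=1011&from=20131001&to=20131002
```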

    How would I write this rule as a regular expression? My mind is boggled.

    If you could write a little about what is doing what, so that I can understand it, I would thoroughly appreciate it, as it's confusing the crap out of me.

    Thanks in advance for any help. Just so you know, I have looked into this on the web before asking, but it's a bit overwhelming jumping in at the deep end.

  2. #2
    TheAnimus

    http://www.cheatography.com/davechil...r-expressions/

    You want to look at the lookahead assertions, because you want to do everything after the first .asp right?

    Try just getting all symbols substituted how you want, then progress to putting in the assertion, shouldn't be too bad.

    That said, my personal hell would involve writing regexes all day.

  4. #3
    mmh

    Thanks for the input.

    I'm still lost, mainly because I don't really understand how the regex is strung together.

    So I'd want to use the lookahead (?=) so, like this? \.asp?=/

    to find the first / after .asp?

    I really don't have the first clue; even with the cheat sheet it's a little overwhelming. I understand certain things, but when you have to put them together, it gets a bit confusing.
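
    On the syntax question above: a lookahead needs its own parentheses, so the attempt would be written \.asp(?=/). A small JavaScript check of the difference, offered as a sketch only (the sample path comes from the first post, the variable names are arbitrary):

```javascript
const samplePath = "gig.asp/venue_101/gig_1011";

// (?=\/) is a lookahead: ".asp" matches only when a "/" follows, and the "/" is not consumed.
const withLookahead = /\.asp(?=\/)/;
const hit = samplePath.match(withLookahead);
console.log(hit[0], hit.index); // .asp 3

// Without the parentheses, "?" just makes the "p" optional and "=" is a literal character.
const withoutParens = /\.asp?=\//;
console.log(withoutParens.test(samplePath)); // false
console.log(withoutParens.test("x.asp=/"));  // true

// Replacing only the first "/" after ".asp" does not need a lookahead at all:
console.log(samplePath.replace(/(\.asp)\//, "$1?")); // gig.asp?venue_101/gig_1011
```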

  5. #4
    Jonatron

    Some stuff for you to try out in a JavaScript console:
    Code:
    // The dots are escaped (\.) so they only match a literal ".".
    // One key_value pair:
    var url = 'someurl.com/gig.asp/venue_101'
    url.replace(/^someurl\.com\/gig\.asp\/([a-z]+)_(\d+)$/, "someurl.com/gig.asp?$1=$2")
     > "someurl.com/gig.asp?venue=101"
    
    // Two pairs. Each extra pair adds three groups: the outer one ($3) plus its key and value ($4, $5).
    var url = 'someurl.com/gig.asp/venue_101/gig_1011'
    url.replace(/^someurl\.com\/gig\.asp\/([a-z]+)_(\d+)(\/([a-z]+)_(\d+))$/, "someurl.com/gig.asp?$1=$2&$4=$5")
     > "someurl.com/gig.asp?venue=101&gig=1011"
    
    // Three pairs:
    var url = 'someurl.com/gig.asp/venue_101/gig_1011/from_20131001'
    url.replace(/^someurl\.com\/gig\.asp\/([a-z]+)_(\d+)(\/([a-z]+)_(\d+))(\/([a-z]+)_(\d+))$/, "someurl.com/gig.asp?$1=$2&$4=$5&$7=$8")
     > "someurl.com/gig.asp?venue=101&gig=1011&from=20131001"
    
    // Four pairs:
    var url = 'someurl.com/gig.asp/venue_101/gig_1011/from_20131001/to_20131002'
    url.replace(/^someurl\.com\/gig\.asp\/([a-z]+)_(\d+)(\/([a-z]+)_(\d+))(\/([a-z]+)_(\d+))(\/([a-z]+)_(\d+))$/, "someurl.com/gig.asp?$1=$2&$4=$5&$7=$8&$10=$11")
     > "someurl.com/gig.asp?venue=101&gig=1011&from=20131001&to=20131002"

    Now, from this http://www.iis.net/learn/extensions/...rewrite-module it looks like the string you'd be matching would start with "gig.asp" and not have the "someurl.com/" before it. And the back-references seem to be in the form {R:N} instead of $N.

    So one of the patterns might be something like:
    Code:
    ^gig\.asp/([a-z]+)_(\d+)$
    and the rewrite rule
    Code:
    gig.asp?{R:1}={R:2}
    But I haven't used IIS before!
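
    To try such a pattern and {R:N} template outside IIS, the substitution can be imitated in a JavaScript console. This is a rough stand-in written for this thread, not how the URL Rewrite module is implemented; applyRule is a made-up name, and the dot in gig.asp is escaped here so that it only matches a literal dot.

```javascript
// Hypothetical stand-in for a rewrite rule: match the path, then fill {R:N} from the capture groups.
function applyRule(pattern, template, path) {
  const match = path.match(pattern);
  if (match === null) return null; // the rule does not apply to this path
  return template.replace(/\{R:(\d+)\}/g, (whole, n) => {
    const group = match[Number(n)];
    return group === undefined ? "" : group;
  });
}

const oneParam = /^gig\.asp\/([a-z]+)_(\d+)$/;

console.log(applyRule(oneParam, "gig.asp?{R:1}={R:2}", "gig.asp/venue_101"));
// gig.asp?venue=101

// A second segment makes the one-pair pattern fail, because of the "$" anchor.
console.log(applyRule(oneParam, "gig.asp?{R:1}={R:2}", "gig.asp/venue_101/gig_1011"));
// null
```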
    Last edited by Jonatron; 09-05-2013 at 05:25 PM. Reason: removed backslash

  7. #5
    finlay666

    The IIS rewrite rules are a bit funky for regex and slightly different to normal regex.

    ^gig\.asp/venue_([0-9]+)/from_([0-9]+)/to_([0-9]+)/?$

    To break it down: the ^ anchors the match to the start of the string being tested, so nothing may come before gig.asp, and the \. matches a literal dot. You can use the same building blocks for generic wildcard matches, like ^(.+)/search/(.+)/$ for making a pretty URL that rewrites to {R:1}?search={R:2}, for instance.

    Then you use [0-9] to match digits, and the + means one or more. You can also use (.+) to match one or more of any character (for, say, search text).

    The /?$ means the URL CAN end in a trailing slash but doesn't have to. If your "to" value could itself include slashes, you would need /$ instead, so that a final slash is never included in the captured value.

    Your pattern then matches requests for gig.asp followed by those segments, and you can rewrite using the format:
    url="gig.asp?venue={R:1}&amp;from={R:2}&amp;to={R:3}"
    You need to encode your ampersands as &amp; in the web.config, but I'm not sure if this is needed when entering the rule in the IIS module directly.

    Used to do this all day in my old position, don't use them as much now.

    Don't forget to stop processing once it's matched, so you don't keep checking any other rewrite rules.
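
    The optional trailing slash and the effect of a greedy (.+) can be checked in a JavaScript console. A sketch only: JavaScript's regex engine stands in for the one IIS uses, and the search path is a made-up example.

```javascript
const gigRule = /^gig\.asp\/venue_([0-9]+)\/from_([0-9]+)\/to_([0-9]+)\/?$/;

// "/?" before "$" accepts the path with or without a trailing slash.
console.log(gigRule.test("gig.asp/venue_101/from_20131001/to_20131002"));  // true
console.log(gigRule.test("gig.asp/venue_101/from_20131001/to_20131002/")); // true

// [0-9]+ cannot match a slash, so the captured value is the same either way.
console.log("gig.asp/venue_101/from_20131001/to_20131002/".match(gigRule)[3]); // 20131002

// (.+) can match a slash, so with "/?$" the trailing slash ends up inside the capture...
console.log("search/live music/".match(/^search\/(.+)\/?$/)[1]); // live music/
// ...whereas "/$" forces the final slash to stay outside it.
console.log("search/live music/".match(/^search\/(.+)\/$/)[1]); // live music
```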

  9. #6
    mmh

    Thanks guys... So finlay, what happens if, say, a lazy coder somewhere in the past had not always put the values in the same place, and perhaps the pages are not always populated with the same query strings? For example:

    I have some pages:
    someurl.com/gig.asp/venue_101

    Showing just the venue

    someurl.com/gig.asp/gig_1011

    Showing just the gig

    someurl.com/gig.asp/gig_1011/from_20131001/to_20131002/venue_101

    Showing the same as the original, but the order of querystrings has changed.

    and someurl.com/venue.asp/gig_1011/email_true

    Where a page might show an email address when this parameter is present. Note that this one is on venue.asp: the pages can be different too.

    Is it possible to trap all these scenarios with one rule?

    I would imagine something like this:

    ^.asp/(.+)_([0-9]+)/(.+)_([0-9]+)/(.+)_([0-9]+)/?$

    asp?{R:1}={R:2}&{R:3}={R:4}&{R:5}={R:6}

    HOWEVER, how would I account for all of the strings? What if another QS were added? Remember email=true? What if that gets added on and it's 4 QSs long? Is there a way to account for this?
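
    Outside of a single fixed pattern, one way to cope with any number of key_value segments in any order is to loop over the matches instead of numbering every group. The JavaScript below is a sketch of that idea only; whether the same can be expressed as one URL Rewrite rule is a separate question, and pathToQuery is a name invented for the example.

```javascript
// Hypothetical helper: turns "page.asp/key_value/key_value/..." into "page.asp?key=value&key=value...".
function pathToQuery(path) {
  const split = path.match(/^([^\/]+\.asp)(?:\/(.*))?$/);
  if (split === null) return path; // not an .asp path, leave it untouched
  const page = split[1];
  const tail = split[2] || "";
  const pairs = [];
  // The "g" flag lets matchAll walk over every key_value segment, however many there are.
  for (const m of tail.matchAll(/([^\/_]+)_([^\/]+)/g)) {
    pairs.push(m[1] + "=" + m[2]);
  }
  return pairs.length > 0 ? page + "?" + pairs.join("&") : page;
}

console.log(pathToQuery("gig.asp/venue_101"));
// gig.asp?venue=101
console.log(pathToQuery("gig.asp/gig_1011/from_20131001/to_20131002/venue_101"));
// gig.asp?gig=1011&from=20131001&to=20131002&venue=101
console.log(pathToQuery("venue.asp/gig_1011/email_true"));
// venue.asp?gig=1011&email=true
```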

    Sincerely, thank you for your help with this; you've made it a lot clearer to me than I understood it before.

  10. #7
    finlay666

    Quote Originally Posted by mmh
    Is it possible to trap all these scenarios with one rule?
    Not as such (the rule would be too vague and nigh on impossible to test). You can, however, create a composition of multiple rules to catch them and process them all.

    The example you gave wouldn't be a lot of use, as it's brittle and wouldn't handle all cases.

    You could catch each parameter individually, though, and let page processing continue after each match:

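    ^venue_([0-9]+)/?$

    ^from_([0-9]+)/?$

    ^to_([0-9]+)/?$

    As written, each of these matches a path made up of just that one segment; drop the ^ and $ anchors to find the segment anywhere in a longer path.

    By setting stopProcessing to false on each, if you had:
    /venue.../to.../from.../

    it would build a query string of ?venue=...&from=...&to=...

    And as the order of the query string doesn't matter when you request values by key, you can put them in whichever order you wish.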
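
    The idea of one small rule per parameter can be imitated in JavaScript to see that the order of the segments stops mattering. This is a sketch under assumptions, not IIS configuration: the key list is hypothetical, the patterns are not anchored to the start of the path so that a segment is found wherever it sits, and the pieces are joined with & as a query string expects.

```javascript
// Hypothetical list of parameter names, one "rule" per name.
const knownKeys = ["venue", "gig", "from", "to"];

function collectQuery(segments) {
  const parts = [];
  for (const key of knownKeys) {
    // Match "key_digits" as a whole segment: preceded by the start or "/", followed by "/" or the end.
    const rule = new RegExp("(?:^|/)" + key + "_([0-9]+)(?=/|$)");
    const m = segments.match(rule);
    // Keep going after a match: no rule stops the processing.
    if (m !== null) parts.push(key + "=" + m[1]);
  }
  return parts.join("&");
}

console.log(collectQuery("venue_101/to_20131002/from_20131001/"));
// venue=101&from=20131001&to=20131002
console.log(collectQuery("gig_1011/from_20131001/to_20131002/venue_101"));
// venue=101&gig=1011&from=20131001&to=20131002
```

    The output follows the order of the key list rather than the order in the URL, which is fine when the page reads its values by key.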

    You could even combine that with a specific ASP page, like:

    ^gig\.asp/(.*)venue_([0-9]+)/?$

    As (.*) matches any character zero to many times, you can use it as a sort of wildcard to pick out your filters regardless of what comes before them. Remember that the parameters in the rewritten URL still need to be joined with &, as in ?a=b&c=d&dog=woof; a second ? is not a separator, so ?a=b?c=d?dog=woof would be read as a single parameter a with the value b?c=d?dog=woof.
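
    How a query string containing repeated ? characters is split can be checked with the standard URLSearchParams class in a browser or Node console. This shows the WHATWG URL parsing rules; that classic ASP's Request.QueryString splits the same way is an assumption worth testing on the server itself.

```javascript
// Pairs are separated by "&"; a second "?" is just part of the value.
const joinedWithQuestionMarks = new URLSearchParams("a=b?c=d?dog=woof");
console.log(joinedWithQuestionMarks.get("a"));   // b?c=d?dog=woof
console.log(joinedWithQuestionMarks.get("dog")); // null

const joinedWithAmpersands = new URLSearchParams("a=b&c=d&dog=woof");
console.log(joinedWithAmpersands.get("a"));   // b
console.log(joinedWithAmpersands.get("dog")); // woof
```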

    The second form also stops the rule running on any page other than gig.asp, so a false match elsewhere is fairly unlikely.

    You can also run the URL through a redirect first and then do your pattern matching properly on the result, if you want. IIRC it's better for SEO, as the URL is in a standardised format and it will send the correct response code to the browser.

    Quote Originally Posted by mmh
    Sincerely, thank you for your help with this; you've made it a lot clearer to me than I understood it before.
    No worries, it never hurts to help someone out with a niche question. I started doing it on SO before work in the morning, if I'm in early, as a bit of a wake-up task, so I might as well do a bit on Hexus while processing OS co-ordinates to lat/long for a pet project.

    http://blogs.iis.net/eokim/archive/2...t-rule-gt.aspx This is probably worth a look, the IIS blogs are pretty good for explaining things well
    Last edited by finlay666; 14-05-2013 at 12:36 AM. Reason: added IIS blog
