I know this was discussed in another thread, but that one's been dead for a while and I thought it seemed peaceful that way. Anyway, I was writing a regex for email addresses and came across a little question:
This page gives the following grammar for domain names:
Quote:
<domain> ::= <subdomain> | " "
<subdomain> ::= <label> | <subdomain> "." <label>
<label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
<ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
<let-dig-hyp> ::= <let-dig> | "-"
<let-dig> ::= <letter> | <digit>
<letter> ::= any one of the 52 alphabetic characters A through Z in
upper case and a through z in lower case
<digit> ::= any one of the ten digits 0 through 9
Does the spec for label mean [letter][ldh-str]?[let-dig], or [letter]([ldh-str]*[let-dig])*, or something else? I went with the second one, but wasn't sure, as I hadn't seen the [] notation used the way it is in the quoted grammar.
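For anyone comparing readings, here is a minimal Python sketch of the label rule, assuming the square brackets mean "optional" (the usual convention in RFC-style BNF, though the quoted text doesn't spell it out). The names LETTER, LABEL, is_label and so on are made up for illustration.

```python
import re

# Building blocks taken directly from the quoted grammar.
LETTER = r"[A-Za-z]"
LET_DIG = r"[A-Za-z0-9]"
LET_DIG_HYP = r"[A-Za-z0-9-]"
LDH_STR = LET_DIG_HYP + "+"  # one or more <let-dig-hyp>

# ASSUMPTION: [ x ] means "x is optional", so
#   <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
# is read as: letter ( (ldh-str)? let-dig )?
LABEL = rf"{LETTER}(?:(?:{LDH_STR})?{LET_DIG})?"

# <subdomain> ::= <label> | <subdomain> "." <label>
SUBDOMAIN = rf"{LABEL}(?:\.{LABEL})*"


def is_label(text: str) -> bool:
    """True if the whole string is a <label> under the reading above."""
    return re.fullmatch(LABEL, text) is not None


def is_subdomain(text: str) -> bool:
    """True if the whole string is a <subdomain> under the reading above."""
    return re.fullmatch(SUBDOMAIN, text) is not None


if __name__ == "__main__":
    for candidate in ["a", "a1", "a-1", "a-", "1a", "example.com"]:
        print(candidate, is_label(candidate), is_subdomain(candidate))
```

Under that assumption a single letter is a valid label, which the first reading above would reject because it makes the trailing let-dig mandatory. The second reading accepts the same strings as this sketch, but its nested * quantifiers give the regex engine more ways to backtrack.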
And for those who care, my regex (which is not perfect yet) is:
^[!#$%\'*+/=?^_`{|}~A-Za-z0-9-]([!#$%\'*+/=?^_`{|}~\.A-Za-z0-9-]*[!#$%\'*+/=?^_`{|}~A-Za-z0-9-])@([A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])*\.)*[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$
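A quick way to poke at it is a small harness like the one below. This is only a sketch: it assumes Python's re module, which may differ from whatever engine the pattern ends up in, and the names EMAIL_RE and matches are hypothetical. The pattern itself is copied unchanged.

```python
import re

# The regex from the post, copied verbatim into a raw string.
EMAIL_RE = re.compile(
    r"^[!#$%\'*+/=?^_`{|}~A-Za-z0-9-]"
    r"([!#$%\'*+/=?^_`{|}~\.A-Za-z0-9-]*[!#$%\'*+/=?^_`{|}~A-Za-z0-9-])"
    r"@([A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])*\.)*"
    r"[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$"
)


def matches(address: str) -> bool:
    """True if the address is accepted by the pattern as written."""
    return EMAIL_RE.match(address) is not None


if __name__ == "__main__":
    samples = [
        "ab@example.com",
        "a.b@example.com",
        "a@example.com",    # one-character local part
        "ab@-example.com",  # label starting with a hyphen
    ]
    for sample in samples:
        print(f"{sample!r}: {matches(sample)}")
```

One thing this turns up: the second group in the local part is not optional, so a one-character local part such as a@example.com is rejected by the pattern as written.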